Pandas Suggestion: documentation examples should use meaningful data where possible

Something like this is a pretty good example. I know something about animals, cats, dogs, and hair, so I can sort of keep the structure of the data in my head and follow along with the transformations without having to scroll up and down to check against the original dataframe.

But a lot of examples just use meaningless column names like A, B, C, D... or foo, bar, baz..., which makes it a lot harder to gain an intuition about what's going on. For example, if you don't know what groupby does, this example:

In [13]: df2 = pd.DataFrame({'X' : ['B', 'B', 'A', 'A'], 'Y' : [1, 2, 3, 4]})

In [14]: df2.groupby(['X']).sum()
Out[14]: 
   Y
X   
A  7
B  3

might be less useful than this version:

In [13]: pets = pd.DataFrame({'animal' : ['dog', 'dog', 'cat', 'cat'], 'weight' : [10, 20, 8, 9]})

In [14]: pets.groupby(['animal']).mean()
Out[14]: 
        weight
animal        
cat        8.5
dog       15.0
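
For anyone who wants to run the comparison rather than read the session output, here is a minimal self-contained sketch of both versions. The variable names `totals` and `mean_weight` are my own and are not taken from the existing docs.

```python
import pandas as pd

# Abstract version: the reader has to remember what 'X' and 'Y' stand for.
df2 = pd.DataFrame({'X': ['B', 'B', 'A', 'A'], 'Y': [1, 2, 3, 4]})
totals = df2.groupby('X').sum()

# Meaningful version: "mean weight per animal" can be checked by eye.
pets = pd.DataFrame({'animal': ['dog', 'dog', 'cat', 'cat'],
                     'weight': [10, 20, 8, 9]})
mean_weight = pets.groupby('animal').mean()

print(totals)
print(mean_weight)
```

In the second version the result can be sanity-checked without scrolling back: two dogs at 10 and 20 average 15, two cats at 8 and 9 average 8.5.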

I realize re-doing all the examples like this would be a significant amount of work, but if there's agreement that this is a desirable thing, I'd be happy to kick things off with a small P.R.

Also, I think it would be good to add this as a guideline to the documentation section of the contributing doc. (Again, if people agree this is worthwhile and not misguided.)

Comment From: jreback

I am -0 on this. I don't think this adds much value, it would be a lot of work to change until it's consistent, and the current examples are a bit shorter.

Comment From: TomAugspurger

https://github.com/pandas-dev/pandas/pull/16520 is starting on something like this.

Comment From: jorisvandenbossche

Closing this in favor of https://github.com/pandas-dev/pandas/issues/19710. @colinmorris I think this is a good idea for certain cases like the groupby example above (not for all docstrings), feel free to comment on https://github.com/pandas-dev/pandas/issues/19710.