Per @jreback: https://github.com/pydata/pandas/pull/8410#issuecomment-57100206
Groupby operations with a large number of groups (ngroups = 10000 in this case) are very slow.
Invoked with:
--ncalls: 3
--repeats: 3
---------------------------------------------------------
Test name                           | #0         |
---------------------------------------------------------
...
groupby_large_ngroups_value_counts  | 23527.0356 |
groupby_large_ngroups_nunique       | 19496.5300 |
groupby_large_ngroups_describe      | 82329.8963 |
groupby_large_ngroups_mad           | 21441.5947 |
groupby_large_ngroups_pct_change    | 18865.5750 |
See also: https://gist.github.com/dlovell/ea3400273314e7612f6e
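For anyone who wants to check this without the vbench harness, here is a minimal standalone sketch of the kind of measurement involved. It is not the code from the gist: the frame layout, the column names `key` and `value`, the number of rows per group, and the helpers `make_frame` and `time_groupby` are all assumptions made for illustration. `mad` is left out because it was removed from pandas in 2.0.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(ngroups=10_000, rows_per_group=2, seed=0):
    """Build a frame with `ngroups` distinct keys (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    keys = np.repeat(np.arange(ngroups), rows_per_group)
    rng.shuffle(keys)  # shuffle in place so the groups are not contiguous
    values = rng.integers(0, 100, size=keys.size)
    return pd.DataFrame({"key": keys, "value": values})


def time_groupby(df, method, number=1):
    """Return the mean wall-clock seconds for one call of a groupby method."""
    grouped = df.groupby("key")["value"]
    func = getattr(grouped, method)
    return timeit.timeit(func, number=number) / number


if __name__ == "__main__":
    df = make_frame()
    for method in ("value_counts", "nunique", "describe", "pct_change"):
        seconds = time_groupby(df, method)
        print(f"groupby_large_ngroups_{method}: {seconds * 1000:.2f} ms")
```

Timings from this script are not directly comparable to the vbench numbers above, since the data shape is a guess; it is only meant to show whether the cost still grows sharply with `ngroups`.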
Comment From: TomAugspurger
Not immediately clear if this is still an issue.
Can reopen with a runnable example (preferably an asv benchmark).