Pandas groupby aggregate: class as arg value

Feature request: it would be nice if aggregate accepted a class as its argument and used the class's constructor as the aggregation function. In the code sample below, aggregate treats frozenset as an iterable of functions, because the class happens to have an __iter__ instance method.

I'd suggest a check like:

>>> inspect.isclass(frozenset)
True
>>> inspect.isclass(frozenset())
False

(I don't think a sane person would make a class with a @staticmethod __iter__ returning the functions to aggregate by)
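To illustrate the suggested check, here is a minimal sketch using only the standard library. The helper name is_function_list is hypothetical and is not part of pandas; it only shows how inspect.isclass could separate a class from an iterable instance. Strings and dicts are left out of the sketch.

```python
import inspect


def is_function_list(arg):
    """Hypothetical check: treat arg as a list of functions only if it is
    an iterable instance, never a class such as frozenset."""
    if inspect.isclass(arg):
        # A class is a single callable (its constructor), even though the
        # class object itself exposes an __iter__ attribute.
        return False
    return hasattr(arg, "__iter__")


# The class has __iter__ as an attribute, which is what a plain
# hasattr-based test would pick up on.
print(hasattr(frozenset, "__iter__"))   # True
print(is_function_list(frozenset))      # False
print(is_function_list([min, max]))     # True
```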

Code Sample, a copy-pastable example if possible

>>> import numpy as np
>>> import pandas as pd
>>> pd.DataFrame(np.ones((2,2)), columns=[0,1]).groupby(1)[[0]].agg(frozenset)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/groupby.py", line 3597, in aggregate
    return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/groupby.py", line 3114, in aggregate
    result, how = self._aggregate(arg, _level=_level, *args, **kwargs)
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/base.py", line 564, in _aggregate
    return self._aggregate_multiple_funcs(arg, _level=_level), None
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/base.py", line 616, in _aggregate_multiple_funcs
    return concat(results, keys=keys, axis=1)
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/tools/merge.py", line 845, in concat
    copy=copy)
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/tools/merge.py", line 878, in __init__
    raise ValueError('No objects to concatenate')
ValueError: No objects to concatenate

It can be worked around by using a lambda:

>>> pd.DataFrame(np.ones((2,2)), columns=[0,1]).groupby(1)[[0]].agg(lambda x: frozenset(x))
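The lambda works because a function object has no __iter__ attribute, so it cannot be mistaken for a list of functions. A reusable version of the same idea might look like the sketch below; as_function is a hypothetical helper name, not a pandas API.

```python
def as_function(cls):
    """Hypothetical helper: wrap a class in a plain function so that the
    wrapper is a callable without an __iter__ attribute."""
    def construct(values):
        return cls(values)
    # Keep the class name so the wrapper is easy to recognise.
    construct.__name__ = cls.__name__
    return construct


to_frozenset = as_function(frozenset)
print(to_frozenset([1.0, 1.0]))            # frozenset({1.0})
print(hasattr(to_frozenset, "__iter__"))   # False
```

Passing as_function(frozenset) to agg should then behave like the lambda above, though this is untested against pandas 0.18.1.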

Expected Output

         0
1         
1.0  (1.0)

output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.6.4-1-ARCH
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.1
setuptools: 20.6.7
Cython: None
numpy: 1.11.1
scipy: 0.18.0
statsmodels: None
xarray: None
IPython: None
sphinx: 1.4.5
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.0.0
tables: None
numexpr: 2.4.6
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.0b2
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None

Comment From: jreback

Use .apply instead; .agg makes some guarantees about inference of the return type.

In [5]: pd.DataFrame(np.ones((2,2)), columns=[0,1]).groupby(1)[[0]].apply(frozenset)
Out[5]: 
1
1.0    (0)
dtype: object

Comment From: timdiels

Thanks, though note that when selecting multiple columns, one still has to use agg(lambda):

>>> df = pd.DataFrame([[1,2,1],[3,4,1],[5,6,2],[7,8,2]])
>>> df['irrelevant'] = [1,2,3,4]
>>> df
   0  1  2  irrelevant
0  1  2  1           1
1  3  4  1           2
2  5  6  2           3
3  7  8  2           4
>>> df.groupby(2)[[0,1]].apply(frozenset)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/groupby.py", line 651, in apply
    return self._python_apply_general(f)
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/groupby.py", line 660, in _python_apply_general
    not_indexed_same=mutated or self.mutated)
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/groupby.py", line 3375, in _wrap_applied_output
    return (Series(values, index=key_index, name=self.name)
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/series.py", line 233, in __init__
    self.name = name
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/generic.py", line 2694, in __setattr__
    object.__setattr__(self, name, value)
  File "/home/limyreth/.local/lib/python3.5/site-packages/pandas/core/series.py", line 309, in name
    raise TypeError('Series.name must be a hashable type')
TypeError: Series.name must be a hashable type
>>> df.groupby(2)[[0,1]].agg(lambda x: frozenset(x))
        0       1
2                
1  (1, 3)  (2, 4)
2  (5, 7)  (8, 6)
