import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 3, 3], [1, 4, 3], [1, 5, 4], [2, 6, 4]])
def onlycol0(group):
    # Return only the first column of the group, as a one-column DataFrame
    return group[[group.columns[0]]]
g = df.groupby(0)
g.transform(onlycol0)
raises an "AssertionError: invalid dtype determination in get_concat_dtype".
def onemorecol(group):
    # Add a column to the group in place, then return the group
    group['new'] = 2
    return group
g.transform(onemorecol)
raises a "ValueError: Length mismatch: Expected axis has 4 elements, new values have 3 elements".
def reversecols(group):
    return group[group.columns[::-1]]
g.transform(reversecols)
appears to work, but the columns of the resulting DataFrame are not reordered.
So the docs should probably state that func must return a DataFrame with the same index and columns as the group it receives. Confusion is also possible because the docs for apply() refer to applying a "transform function", but in a more general sense. I will submit a PR.
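For contrast with the three functions above, here is a minimal sketch of a function that does return a like-indexed result with the same columns. The name `demean` is hypothetical, and the value columns are selected explicitly with `[[1, 2]]` so that the output columns do not depend on how the grouping column is handled.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 3, 3], [1, 4, 3], [1, 5, 4], [2, 6, 4]])

def demean(group):
    # Subtracting the group mean keeps the index (and, for a DataFrame,
    # the columns) of the input unchanged.
    return group - group.mean()

# Select the value columns explicitly; column 0 is only used for grouping.
result = df.groupby(0)[[1, 2]].transform(demean)

print(result.index.equals(df.index))
print(list(result.columns))
```

The result can be aligned back onto `df` because its index matches the original row for row.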
Comment From: jreback
This says that the frame should be like-indexed having the same indexes as the original. Pretty clear to me.
In [77]: g.transform?
Type: instancemethod
String form: <bound method DataFrameGroupBy.transform of <pandas.core.groupby.DataFrameGroupBy object at 0x109fa8d10>>
File: /Users/jreback/pandas/pandas/core/groupby.py
Definition: g.transform(self, func, *args, **kwargs)
Docstring:
Call function producing a like-indexed DataFrame on each group and
return a DataFrame having the same indexes as the original object
filled with the transformed values
Parameters
----------
f : function
Function to apply to each subframe
Notes
-----
Each subframe is endowed the attribute 'name' in case you need to know
which group you are working on.
Examples
--------
>>> grouped = df.groupby(lambda x: mapping[x])
>>> grouped.transform(lambda x: (x - x.mean()) / x.std())
But if you wish to submit a PR that clarifies this, that's fine.
Comment From: toobaz
"indexes" == "index and columns" ?!
(this is not so e.g. in the docstring of merge...)
Comment From: simonjayhawkins
On master we now have:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2, 3], [2, 3, 3], [1, 4, 3], [1, 5, 4], [2, 6, 4]])
>>> def onlycol0(group):
...     return group[[group.columns[0]]]
...
>>> g = df.groupby(0)
>>> g.transform(onlycol0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\groupby\generic.py", line 535, in transform
return self._transform_general(func, *args, **kwargs)
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\groupby\generic.py", line 486, in _transform_general
path, res = self._choose_path(fast_path, slow_path, group)
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\groupby\generic.py", line 583, in _choose_path
res = slow_path(group)
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\groupby\generic.py", line 578, in <lambda>
lambda x: func(x, *args, **kwargs), axis=self.axis)
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\frame.py", line 6516, in apply
return op.get_result()
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\apply.py", line 151, in get_result
return self.apply_standard()
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\apply.py", line 257, in apply_standard
self.apply_series_generator()
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\apply.py", line 286, in apply_series_generator
results[i] = self.f(v)
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\groupby\generic.py", line 578, in <lambda>
lambda x: func(x, *args, **kwargs), axis=self.axis)
File "<stdin>", line 2, in onlycol0
File "C:\Users\simon\OneDrive\code\pandas-simonjayhawkins\pandas\core\generic.py", line 5096, in __getattr__
return object.__getattribute__(self, name)
AttributeError: ("'Series' object has no attribute 'columns'", 'occurred at index 1')
>>>
Is this issue still relevant, or should the title be changed?
Comment From: rhshadrach
The docs state:
if this is a DataFrame, f must support application column-by-column in the subframe.
The function in the OP does not satisfy this. I think this can be closed.
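A small sketch of what "column-by-column" means in practice, assuming the function may be handed either a whole subframe or a single column as a Series. The name `scale` is hypothetical; it uses only operations that exist on both types, whereas `onlycol0` depends on `.columns`, which a Series does not have.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 3, 3], [1, 4, 3], [1, 5, 4], [2, 6, 4]])

def scale(col_or_frame):
    # Division and max() are defined for both Series and DataFrame,
    # so this works whichever one transform passes in.
    return col_or_frame / col_or_frame.max()

result = df.groupby(0)[[1, 2]].transform(scale)

# A single column is a Series, and a Series has no `columns` attribute,
# which matches the AttributeError in the traceback above.
column = df[1]
has_columns = hasattr(column, "columns")
print(has_columns)
```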
Comment From: rhshadrach
Moreover, mutating objects inside transform is not supported now. Closing.
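One way to avoid mutation, sketched under the assumption that the goal of `onemorecol` was simply a frame with an extra constant column: build new objects inside the function, and add columns outside of transform. The name `doubled` is hypothetical, and using `assign` is a suggestion rather than something proposed in this thread.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 3, 3], [1, 4, 3], [1, 5, 4], [2, 6, 4]])
original = df.copy()

def doubled(group):
    # Return a new object; the group passed in is not modified.
    return group * 2

result = df.groupby(0)[[1, 2]].transform(doubled)

# Add the extra column on a new frame instead of inside the function.
out = df.assign(new=2)

print(df.equals(original))
print(list(out.columns))
```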