Feature Type
- [X] Adding new functionality to pandas
- [ ] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
Currently, the transform() method only allows users to normalize across the entire DataFrame. I would like to have a normgroup function that is made to normalize a specific column.
Feature Description
def normgroup(self, by, norm='l2'):
    """
    Normalize each group of a DataFrame by a specified column.

    Parameters
    ----------
    by : string
        The name of the column to group by.
    norm : string or callable
        The normalization method to use (default 'l2').

    Returns
    -------
    pandas.DataFrame
        A DataFrame with each group normalized by the specified column.

    Examples
    --------
    >>> data = {'group': ['A', 'A', 'B', 'B', 'B', 'C', 'C'],
    ...         'value': [1, 2, 3, 4, 5, 6, 7]}
    >>> df = pd.DataFrame(data)
    >>> normalized_df = df.normgroup('group', norm='l2')
    >>> print(normalized_df)
          value
    0  0.447214
    1  0.894427
    2  0.424264
    3  0.565685
    4  0.707107
    5  0.650791
    6  0.759257
    """
    # np.linalg.norm does not accept 'l2' as `ord`, so map norm names to the
    # values it does accept ('l1' is included here as an assumed second option).
    orders = {'l1': 1, 'l2': 2}
    if callable(norm):
        norm_func = norm
    else:
        norm_func = lambda x: np.linalg.norm(x, ord=orders[norm])
    # Divide every non-grouping column by its per-group norm.
    return self.groupby(by).transform(lambda x: x / norm_func(x))
Alternative Solutions
Right now, I believe that to do the same thing you need to use a command like: df[['A', 'B']].groupby(df['group']).transform(lambda x: x.div(np.linalg.norm(x['C'])))
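For reference, here is a minimal sketch of the existing-API route, using the example frame from the docstring in the Feature Description. The column names `group` and `value` come from that example, and `value_l2` is a hypothetical name for the output column. Selecting the column before calling transform means the lambda receives one Series per group, so no column lookup is needed inside it:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "group": ["A", "A", "B", "B", "B", "C", "C"],
    "value": [1, 2, 3, 4, 5, 6, 7],
})

# Divide each value by the L2 norm of the values in its own group.
df["value_l2"] = df.groupby("group")["value"].transform(
    lambda x: x / np.linalg.norm(x)
)
```

After this runs, the squared entries of `value_l2` should sum to 1 within each group, which is a quick way to check the normalization.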
Additional Context
No response
Comment From: attack68
-1. This is too specific a function to be included in the pandas namespace, particularly when the function itself, as you implemented it, requires only a single chained line.
Comment From: mroeschke
Agreed with @attack68. Thanks for the suggestion, but closing.