AFAICT this exists to allow users to apply a different function to each group. I could be convinced that this is worth supporting, but it shouldn't be shoehorned into SeriesGroupBy.aggregate the way it is.

AFAICT this is not documented, supports only one test (parametrized over 4 dtypes):

@pytest.mark.parametrize("dtype", ["int64", "int32", "float64", "float32"])
def test_basic(dtype):

    data = pd.Series(np.arange(9) // 3, index=np.arange(9), dtype=dtype)

    index = np.arange(9)
    np.random.shuffle(index)
    data = data.reindex(index)

    grouped = data.groupby(lambda x: x // 3)
    [...]

    group_constants = {0: 10, 1: 20, 2: 30}
    agged = grouped.agg(lambda x: group_constants[x.name] + x.mean())
    assert agged[1] == 21

Comment From: jbrockmendel

Tracking down all the places where we do this pinning of "name", disabling them all breaks 16 tests

xref https://github.com/pandas-dev/pandas/pull/15062 "I also think that the .name arg is somewhat undocumented when passed to .apply, if you want to add a note there (or small example), might be useful."

Comment From: jbrockmendel

Discussed in April dev meeting, decided this was probably too-breaking

Comment From: jbrockmendel

@rhshadrach id like to revive this and am hoping to get you on board. all the places in the groupby code where we do object.__setattr__(group, "name", name) i want to deprecate. Thoughts?

Comment From: rhshadrach

I could be convinced that this is worth supporting, but it shouldn't be shoehorned into SeriesGroupBy.aggregate the way it is.

I agree with this in its entirety. I personally haven't found a use for the group keys, but don't have a sense how utilized this is. Without this, currently I think users would be left with iterating a groupby object themselves and combining the results (essentially reimplementating apply / agg / transform).

While I'm hesitant to add arguments to apply / agg / transform, it seems to me having a pass_keys or something similar would be a better approach. Defaulting to False, when true, the UDF would be passed two arguments: keys and group. I think this would at least be straightforward to support and test. Thoughts on this alternative?

Comment From: jbrockmendel

would be left with iterating a groupby object themselves and combining the results

I'd be OK with telling users to do this; it's not as if our implementation does anything fancy for performance. And we do have an uncomfortable amount of guessing when it comes time to wrap/concat the results.

I think this would at least be straightforward to support and test. Thoughts on this alternative?

I'd prefer this to the status quo, but share your hesitancy to add keywords. A separate method might be even cleaner? Maybe we just deprecate this behavior entirely (how?) and wait to add an alternative until someone complains?

It's ugly as sin, but we could introspect a UDF for keywords.

Comment From: jbrockmendel

We have 6 places in the groupby code where we do this. Of those, disabling it in DFGB._transform_general (twice) and in filter does not break any tests. The filter docstring does mention this pinning, so that would need a deprecation despite not being tested.

The usage in _aggregate_named will be easy to deprecate bc _aggregate_named is only called from one place and that is only after the non-pinning variant raises.

The usage in SGB._transform_general is needed for test_transform_lambda_with_datetimetz, which actually looks like a pretty legit use case for something like this.

Comment From: rhshadrach

I'd prefer this to the status quo, but share your hesitancy to add keywords. A separate method might be even cleaner? Maybe we just deprecate this behavior entirely (how?) and wait to add an alternative until someone complains?

It's ugly as sin, but we could introspect a UDF for keywords.

With the exception introspecting, I'd be good with any of those options; and I even could probably be dragged into being okay with introspection. A separate method seems a little odd to me because it'd be the same implementation except what is passed to the UDF.

I'm good with deprecating and seeing if users raise any issues. I suspect this is a feature that users aren't aware of let alone use, but could very much be wrong here.

Comment From: jbrockmendel

OK let's start with just deprecating and go from there.

Comment From: jbrockmendel

For the places where we do pinning other than _aggregate_named, we could do the deprecation by adding a _pin_name keyword and doing something like:

def filter(...):

    try:
        return self._filter(..., _pin_name=False)
    except Exception:
        out = self._filter(..., _pin_name=True)
        warnings.warn(...)
        return out

and put the object.__setattr__s behind a if _pin_name: check. The two potential problems are a) unlikely case UDF has a _pin_name keyword, b) case where user is depending on this behavior for their UDF fails to raise with _pin_name=False.

Comment From: jbrockmendel

Dangit. Looks like GroupBy.hist relies on the name being pinned in apply_groupwise, but doesn't raise if that pinning is disabled, just gives silently incorrect results (e.g. test_groupby_hist_series_with_legend)

Comment From: jbrockmendel

Looks like #25457 is an example of a bug caused by this name-pinning