Edit: Instead of implementing the as_index=True case mentioned below, the as_index=False case should be deprecated, and the docs for SeriesGroupBy.agg should be updated. See the discussion below for details.

According to https://github.com/pandas-dev/pandas/pull/15931#issue-220092580, passing a dict to SeriesGroupBy.agg to rename the result was deprecated in 0.20.0, and it raises on main now:

import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": [3, 4, 5]})
gb = df.groupby("a", as_index=True)["b"]
result = gb.agg({"c": "sum"})  # this call raises
print(result)
# pandas.errors.SpecificationError: nested renamer is not supported

However, the docs say SeriesGroupBy.agg supports dict arguments. Also, when as_index=False, it works:

df = pd.DataFrame({"a": [1, 1, 2], "b": [3, 4, 5]})
gb = df.groupby("a", as_index=False)["b"]
result = gb.agg({"c": "sum"})
print(result)
#    a  c
# 0  1  7
# 1  2  5

This is because when as_index=False, using __getitem__ with "b" still returns a DataFrameGroupBy.
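A quick way to check which groupby class the selection produces is to inspect the type directly. This is only a sketch for investigation: the class returned in the as_index=False case depends on the installed pandas version, so no expected output is shown, and the `kinds` dict is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": [3, 4, 5]})

# Record the class name of the object returned by selecting "b"
# for both settings of as_index.
kinds = {}
for as_index in (True, False):
    gb = df.groupby("a", as_index=as_index)["b"]
    kinds[as_index] = type(gb).__name__

# The as_index=False entry may differ between pandas versions.
print(kinds)
```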

Assuming the implementation in this case isn't difficult, I'm thinking the easiest way forward is to support dictionaries in SeriesGroupBy.agg.

cc @jreback, @jorisvandenbossche

Comment From: rhshadrach

Also #9052

Comment From: mroeschke

Hmm, doesn't this mean dictionary inputs have different interpretations depending on whether __getitem__ is called?

I thought normally a dictionary input, e.g. {"b": sum}, would mean "for axis label b apply the sum aggregation", while the above example appears to make {"b": sum} mean "apply sum to the selected Series and name the result b".
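For reference, the column-selecting interpretation of a dict on DataFrameGroupBy.agg can be sketched as follows, reusing the frame from the report. The variable name `per_column` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": [3, 4, 5]})

# On a DataFrameGroupBy, the dict key selects an existing column
# and the value is the aggregation applied to that column.
per_column = df.groupby("a").agg({"b": "sum"})
print(per_column)
#    b
# a
# 1  7
# 2  5
```

Here the key "b" must name a column of the frame; it does not rename anything.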

Comment From: rhshadrach

Yes - DataFrameGroupBy.agg and SeriesGroupBy.agg will have different semantics. I'm also not in favor of it, because users can just rename after the groupby. However, this partially works (with as_index=False), and this inconsistency is holding up further work on #46944.
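As a sketch of what "rename after the groupby" could look like without passing a dict, two alternatives are shown here: renaming the aggregated Series, and keyword-based named aggregation. The variable names are illustrative and not taken from the linked PRs.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": [3, 4, 5]})
gb = df.groupby("a")["b"]

# Option 1: aggregate, then rename the resulting Series.
renamed = gb.sum().rename("c")

# Option 2: named aggregation via keyword arguments,
# which returns a DataFrame with a column "c".
named = gb.agg(c="sum")

# Moving the group keys back into a column gives the same
# layout as the as_index=False output in the report.
flat = renamed.reset_index()
print(flat)
#    a  c
# 0  1  7
# 1  2  5
```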

I think we can resolve the inconsistency by enabling it for as_index=True in 2.0 since that isn't a breaking change, then deprecate passing a dict to SeriesGroupBy properly in 2.0. The other two options I see are less desirable:

  • Deprecate without enabling for as_index=True; this would make work on #46944 harder, to the point where I would probably shelve it until 3.0. It also leaves us with an inconsistent API for 2.x.
  • Raise when as_index=False in 2.0. This would be a backwards incompatible change without deprecation that I'd rather avoid.

Thoughts?

Comment From: mroeschke

I see. I would be okay with enabling it for 2.0 and hopefully deprecating passing dicts to agg on a SeriesGroupBy in the same release, so users know to transition.

Comment From: rhshadrach

Makes sense. I'll add the deprecation to the linked PR.

Comment From: rhshadrach

@mroeschke - I overlooked a better solution. We can maintain the current behavior and move forward with progress toward #46944. This is the approach I took in #50744. I think I overlooked this because it's odd to do if self.as_index: raise ValueError when the code would otherwise work fine.

I've reworked this issue to be about the remaining deprecation.