Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
df = pd.MultiIndex.from_product([[1, 2], [3, 4]], names=['a', 'b']).to_frame()
idx1, _ = next(iter(df.groupby(level=[0])))
idx2, _ = next(iter(df.groupby(level=[0, 1])))
assert type(idx1) == type(idx2) # Error
Issue Description
For consistency, when grouping a DataFrame that has a MultiIndex by a single level that is specified as a list (e.g. level=[0]), the keys returned during iteration should have the same type as when grouping by multiple levels, i.e. a tuple. This is the same idea as the difference between df.iloc[0] and df.iloc[[0]].
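For reference, a minimal sketch of the iloc analogy mentioned above, reusing the frame from the reproducible example. It only shows the existing scalar-versus-list distinction in iloc; the variable names row and frame are chosen here for illustration.

```python
import pandas as pd

# Same frame as in the reproducible example.
df = pd.MultiIndex.from_product([[1, 2], [3, 4]], names=["a", "b"]).to_frame()

row = df.iloc[0]      # scalar indexer: the first row comes back as a Series
frame = df.iloc[[0]]  # list indexer: a one-row DataFrame, dimensionality is kept

print(type(row).__name__)
print(type(frame).__name__)
```

The request is for level=[0] to behave like the list form here: a list-like argument keeps the "list-like" shape of the result, even when it has length 1.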
Expected Behavior
assert idx1 == (1, )
assert idx2 == (1, 3)
Installed Versions
Comment From: phofl
cc @rhshadrach Thoughts?
Comment From: rhshadrach
We recently did this for the by argument, +1 on doing it for level too.
Comment From: august-tengland
Take
Comment From: august-tengland
Hi! New pandas contributor here (apologies for potentially stupid questions): Does this issue only concern the keys that are "generated" when iterating? For iter(), we create them in group_keys_seq@/core/groupby/ops.py:
def group_keys_seq(self):
    if len(self.groupings) == 1:  # This is a problem
        return self.levels[0]
    ...
If I add the condition that self.groupings[0].level is None and omit the following in grouper.py (line 835):
if isinstance(group_axis, MultiIndex):
    if is_list_like(level) and len(level) == 1:
        level = level[0]  # remove this
I get the expected behavior. However, this will not affect any other way of getting the index that doesn't go through group_keys_seq (if any exist).
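To make the intended key shape concrete, here is a minimal sketch in plain Python, independent of pandas internals. Both normalize_group_key and the local is_list_like are hypothetical stand-ins written for this illustration; they are not pandas functions and not the actual change.

```python
from collections.abc import Iterable


def is_list_like(obj):
    """Rough stand-in for a list-like check: iterable, but not str or bytes."""
    return isinstance(obj, Iterable) and not isinstance(obj, (str, bytes))


def normalize_group_key(key, level):
    """Hypothetical helper: shape an iteration key to match how `level` was given."""
    if is_list_like(level):
        # A list-like `level` always yields a tuple key, even for a single level.
        return key if isinstance(key, tuple) else (key,)
    # A scalar `level` keeps the scalar key.
    return key


print(normalize_group_key(1, level=0))
print(normalize_group_key(1, level=[0]))
print(normalize_group_key((1, 3), level=[0, 1]))
```

Under this assumption, only level=[0] changes: it yields (1,) instead of 1, while the scalar and multi-level cases stay as they are.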
Comment From: rhshadrach
Thanks for picking this up!
Does this issue only consider the indexes that are "generated" when iterating?
Yes.
I get the expected behavior, however, it will not affect any other potential ways to get the index which doesn't use group_keys_seq (if such exists).
I'm not 100% sure I follow, but if it's related to the above question, then I think this is okay. We're only looking to impact __iter__ here.
Comment From: august-tengland
Thanks for your response!
I'm not 100% sure I follow, but if it's related to the above question, then I think this is okay. We're only looking to impact __iter__ here.
That's great, I think I can get a PR on this by next week.