Code Sample, a copy-pastable example if possible
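import pandas as pd

lst = [1, 2, 3, 1, 2, 3, 2]
s = pd.Series([1, 2, 3, 10, 20, 30, 80], index=lst)
grouped = s.groupby(level=0)  # group by the (non-unique) index labels
grouped.first()
grouped.last()
grouped.sum()
for x in grouped:
    print(x)  # each x is a (label, sub-Series) tuple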
Problem description
The snippet is taken from the documentation, with slight modifications. While grouped.last() outputs
1 10
2 80
3 30
Length: 3, dtype: int64
the for loop gives
(1, 1 1
1 10
Length: 2, dtype: int64)
(2, 2 2
2 20
2 80
Length: 3, dtype: int64)
(3, 3 3
3 30
Length: 2, dtype: int64)
So, does 80 belong to the 2nd or the 3rd group?
Comment From: gcbeltramini
According to the documentation: "If a non-unique index is used as the group key in a groupby operation, all values for the same index value will be considered to be in one group and thus the output of aggregation functions will only contain unique index values".
In your case, you are grouping by lst, and the value 80 corresponds to index 2. So 80 belongs to the group of index 2, which is the second in the for-loop.
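As a rough illustration of the point above, the following sketch collects the values of each group into a plain dictionary. The name `members` is only an illustrative helper, not part of pandas; the data is the Series from the original report.

```python
import pandas as pd

lst = [1, 2, 3, 1, 2, 3, 2]
s = pd.Series([1, 2, 3, 10, 20, 30, 80], index=lst)
grouped = s.groupby(level=0)

# Map each index label to the values stored under it, in original order.
# `members` is a hypothetical name used only for this sketch.
members = {key: group.tolist() for key, group in grouped}
print(members)
```

Label 2 appears three times in lst, so its group holds three values, and 80 is the last of them.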
Comment From: Yevgnen
@gcbeltramini Thanks for your reply. So .last() returns the last element of each group (and the groups do not necessarily have the same length)?
Comment From: gcbeltramini
Exactly. Just look at the last element for each group in the output of the for-loop.
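One way to check this, sketched under the assumption that the data contains no missing values (as in the example from the report): take the last element of each group while iterating and compare it with grouped.last(). The variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 10, 20, 30, 80], index=[1, 2, 3, 1, 2, 3, 2])
grouped = s.groupby(level=0)

# Last element of each group, taken by position while iterating.
last_by_iteration = {key: int(group.iloc[-1]) for key, group in grouped}

# The same information as reported by the aggregation.
last_by_method = {key: int(value) for key, value in grouped.last().items()}

print(last_by_iteration == last_by_method)
```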
Comment From: Yevgnen
Thank you.