Pandas Logic behind groupby.first and groupby.last

Code Sample, a copy-pastable example if possible

import pandas as pd

lst = [1, 2, 3, 1, 2, 3, 2]
s = pd.Series([1, 2, 3, 10, 20, 30, 80], index=lst)
grouped = s.groupby(level=0)

grouped.first()
grouped.last()
grouped.sum()

for x in grouped:
    print(x)

Problem description

The snippet is taken from the documentation, with slight modifications. While grouped.last() outputs

1    10
2    80
3    30
Length: 3, dtype: int64

the for loop gives

(1, 1     1
1    10
Length: 2, dtype: int64)
(2, 2     2
2    20
2    80
Length: 3, dtype: int64)
(3, 3     3
3    30
Length: 2, dtype: int64)

So, does 80 belong to the 2nd or the 3rd group?

Expected Output

Output of `pd.show_versions()`

/usr/local/lib/python3.6/site-packages/xarray/core/formatting.py:16: FutureWarning: The pandas.tslib module is deprecated and will be removed in a future version.
  from pandas.tslib import OutOfBoundsDatetime

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 16.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: 0.9.5
IPython: 6.0.0
sphinx: 1.5.5
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Comment From: gcbeltramini

According to the documentation: "If a non-unique index is used as the group key in a groupby operation, all values for the same index value will be considered to be in one group and thus the output of aggregation functions will only contain unique index values".

In your case, you are grouping by lst, and the value 80 corresponds to index 2. So 80 belongs to the group of index 2, which is the second in the for-loop.
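A minimal sketch to check this, reusing the series from the original post. The names `group_two`, `sizes`, `last_values` and `manual_last` are only illustrative. It assumes the data has no missing values, since `last()` skips NaN entries and would otherwise not always match the final row of a group.

```python
import pandas as pd

lst = [1, 2, 3, 1, 2, 3, 2]
s = pd.Series([1, 2, 3, 10, 20, 30, 80], index=lst)
grouped = s.groupby(level=0)

# Every row labelled 2 lands in the same group, in original order.
group_two = grouped.get_group(2).tolist()
print(group_two)  # [2, 20, 80]

# Groups may have different sizes: here label 2 has three rows.
sizes = grouped.size().to_dict()
print(sizes)

# With no missing values, last() matches the final element of each group.
last_values = grouped.last().to_dict()
manual_last = {int(key): int(group.iloc[-1]) for key, group in grouped}
print(last_values == manual_last)
```

So 80 is simply the third (and last) row carrying the label 2, which is why `grouped.last()` reports it for that label.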

Comment From: Yevgnen

@gcbeltramini Thanks for your reply. So .last means the last element of each group (and the groups do not necessarily have the same length)?