import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3, 4], 'B': [7, 7, 7, 8], 'C': [5, 6, 7, 8], 'D': [5, 6, 7, 8]})
print(df)
print("")
print(df.groupby(["A", "B"]).rank())
print("")
print(df.groupby(["A", "B"]).sum())
print("")
print(df.groupby(["A", "B"]).count())
print("")

Expected Output

I would expect all three methods to produce similarly structured output. sum and count work as expected, but the result of rank does not show the A/B group keys.

output of pd.show_versions()

   A  B  C  D
0  1  7  5  5
1  1  7  6  6
2  3  7  7  7
3  4  8  8  8

     C    D
0  1.0  1.0
1  2.0  2.0
2  1.0  1.0
3  1.0  1.0

      C   D
A B
1 7  11  11
3 7   7   7
4 8   8   8

     C  D
A B
1 7  2  2
3 7  1  1
4 8  1  1

Comment From: simonm3

INSTALLED VERSIONS

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 20.7.0
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.1
statsmodels: 0.6.1
xarray: None
IPython: 4.1.1
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.38.0
pandas_datareader: None

time: 4.43 s

Comment From: jorisvandenbossche

sum/count on the one hand and rank on the other hand are different kinds of methods. sum and count are 'aggregation' methods (they return one value for each group, and by default they use the group keys as the index of the result), while rank is a 'transform' method (it returns an object of the same shape, with the same index as the original dataframe).

For this reason, it is normal that the results have different indexes.
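To make the distinction concrete, here is a minimal sketch using the dataframe from the report. It checks which index each result carries, and then shows one possible way to see the A/B keys next to the ranks by joining the rank result back onto the key columns. The variable names (`sums`, `ranks`, `labelled`) are purely illustrative, and the join is only one option among several, not a recommended or official approach.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 3, 4], "B": [7, 7, 7, 8],
                   "C": [5, 6, 7, 8], "D": [5, 6, 7, 8]})
grouped = df.groupby(["A", "B"])

# Aggregation: one row per group, indexed by the group keys (A, B).
sums = grouped.sum()
print(sums.index.names)

# Transform: one row per original row, indexed like the original dataframe.
ranks = grouped.rank()
print(ranks.index.equals(df.index))

# Illustrative workaround: because the rank result shares df's index,
# it can be joined back onto the key columns to show A/B beside the ranks.
labelled = df[["A", "B"]].join(ranks)
print(labelled)
```

Because the transform result keeps the original index, the join aligns row by row without any extra bookkeeping.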

Comment From: jorisvandenbossche

@simonm3 I am closing this issue for now, but that certainly does not mean you cannot further comment or clarify your question!