Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame({
    "KEY1": ["KEY1"],
    "KEY2": ["KEY2"],
    "INT64": [1583715738627261039]
})

# Grouping by one column produces correct output
print(df.groupby(["KEY1"]).agg(lambda x: x))

# Grouping by more than one column produces incorrect output
print(df.groupby(["KEY1", "KEY2"]).agg(lambda x: x))

Problem description

The int64 column loses precision when grouping by more than one column: 1583715738627261039 comes back as 1583715738627260928.

Similar behaviour can be seen with sum(), even when grouping by a single column:

>>> df.groupby(["KEY1"]).sum()
                    INT64
KEY1                     
KEY1  1583715738627260928

However, that may be a different issue, where the cythonized group sum does not support int64, according to: https://github.com/pandas-dev/pandas/issues/15027#issuecomment-269902341
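
The lost digits are consistent with the value being converted to float64 somewhere along the way. A minimal check using only built-in Python is sketched below; it shows the arithmetic only and does not show where pandas performs the cast.

```python
# The value from the report and what it becomes after a float64 round trip.
# float64 has a 53-bit significand, so integers near 1.58e18 are only
# representable in steps of 256.
original = 1583715738627261039
round_tripped = int(float(original))

print(round_tripped)             # 1583715738627260928
print(original - round_tripped)  # 111
```

The round-tripped value matches the number shown in the sum() output above.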

Expected Output

      KEY2                INT64
KEY1                           
KEY1  KEY2  1583715738627261039
                         INT64
KEY1 KEY2                     
KEY1 KEY2  1583715738627261039

Actual Output

      KEY2                INT64
KEY1                           
KEY1  KEY2  1583715738627261039
                         INT64
KEY1 KEY2                     
KEY1 KEY2  1583715738627260928

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : None
python : 3.6.5.final.0
python-bits : 64
OS : Linux
OS-release : 3.16.0-77-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_AU.UTF-8
LOCALE : en_AU.UTF-8

pandas : 1.0.3
numpy : 1.18.2
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.1.3
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None

Comment From: simonjayhawkins

This also applies to nullable integers: when grouping by multiple columns, the Int64 column is cast to float64.

>>> import pandas as pd
>>>
>>> pd.__version__
'1.1.0.dev0+1068.g49bc8d8c9'
>>>
>>> df = pd.DataFrame(
...     {
...         "KEY1": ["KEY1"],
...         "KEY2": ["KEY2"],
...         "INT64": pd.array([1583715738627261039], dtype="Int64"),
...     }
... )
>>> df
   KEY1  KEY2                INT64
0  KEY1  KEY2  1583715738627261039
>>>
>>> # Grouping by one column produces correct output
>>> print(df.groupby(["KEY1"]).agg(lambda x: x))
      KEY2                INT64
KEY1
KEY1  KEY2  1583715738627261039
>>>
>>> df.groupby(["KEY1"]).agg(lambda x: x).dtypes
KEY2     object
INT64     Int64
dtype: object
>>>
>>> # Grouping by more than one column produces incorrect output
>>> print(df.groupby(["KEY1", "KEY2"]).agg(lambda x: x))
                  INT64
KEY1 KEY2
KEY1 KEY2  1.583716e+18
>>>
>>> df.groupby(["KEY1", "KEY2"]).agg(lambda x: x).dtypes
INT64    float64
dtype: object
>>>

Comment From: mroeschke

This looks okay on master. It could use a regression test.

In [7]: import pandas as pd
   ...:
   ...: df=pd.DataFrame({
   ...:     "KEY1": ["KEY1"],
   ...:     "KEY2": ["KEY2"],
   ...:     "INT64": [1583715738627261039]
   ...: })
   ...:
   ...: # Grouping by one column produces correct output
   ...: print(df.groupby(["KEY1"]).agg(lambda x: x))
   ...:
   ...: # Grouping by more than one column produces incorrect output
   ...: print(df.groupby(["KEY1", "KEY2"]).agg(lambda x: x))
      KEY2                INT64
KEY1
KEY1  KEY2  1583715738627261039
                         INT64
KEY1 KEY2
KEY1 KEY2  1583715738627261039
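
A regression test along these lines might cover the case. This is only a sketch: the function name is hypothetical, it uses plain asserts rather than pandas' own test helpers, and it assumes a pandas build that contains the fix (it would be expected to fail on 1.0.3, where the bug was reported).

```python
import pandas as pd


def test_groupby_multiple_keys_preserves_int64():
    # Hypothetical test name; the data mirrors the original report.
    value = 1583715738627261039
    df = pd.DataFrame({"KEY1": ["KEY1"], "KEY2": ["KEY2"], "INT64": [value]})

    # Same call that lost precision in the report.
    result = df.groupby(["KEY1", "KEY2"]).agg(lambda x: x)

    # The column should stay int64 and keep every digit.
    assert result["INT64"].dtype == "int64"
    assert result["INT64"].iloc[0] == value
```

The same structure could be repeated with pd.array([...], dtype="Int64") to cover the nullable case reported above.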

Comment From: whitneymichelle

take

Comment From: simonjayhawkins

removing milestone