Pandas BUG: Wrong groupby indices

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame({'a': ['1', '2'], 'b': [None, '20']})
df.groupby(['a', 'b']).indices.keys()

Output is:

dict_keys([('1', '20'), ('2', '20')])

Problem description

The current behavior of .indices creates a group key, ('1', '20'), that does not exist in the DataFrame. This happens because column b contains a missing value (None/NaN).
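One plausible way such a key could arise is sketched below. This is an assumption, not something confirmed in this report: if the missing value is encoded with the sentinel code -1 and that code is then used as a plain positional index into the observed values, it wraps around to the last element. The sketch uses pd.factorize only to illustrate the encoding; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Column b from the report, as an object array.
values = np.array([None, '20'], dtype=object)

# factorize encodes each value as an integer code; missing values get -1.
codes, uniques = pd.factorize(values)

# Naive decoding (assumed failure mode): -1 indexes the LAST unique value,
# so the missing entry is decoded as '20'.
naive = [uniques[c] for c in codes]

# Decoding that treats -1 as "missing" keeps the NaN.
safe = [uniques[c] if c >= 0 else np.nan for c in codes]

print(list(codes), naive, safe)
```

If this is indeed the mechanism, it would explain why the phantom key borrows '20' from the other row rather than some arbitrary value.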

Expected Output

I would expect the output to match the one given by .groups:

In [12]: df.groupby(['a', 'b']).groups.keys()
Out[12]: dict_keys([('1', nan), ('2', '20')])
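To check the keys independently of groupby, a reference mapping can be built by iterating over the rows directly. The following is a minimal sketch; normalize, row_positions and phantom_keys are hypothetical helper names, not part of pandas, and missing values are normalized to None so that None and NaN compare equal.

```python
import pandas as pd


def normalize(key):
    """Return key as a tuple with every missing value replaced by None."""
    key = key if isinstance(key, tuple) else (key,)
    return tuple(None if pd.isnull(v) else v for v in key)


def row_positions(df, cols):
    """Map each key tuple that really occurs in df[cols] to its row positions."""
    positions = {}
    rows = df[cols].itertuples(index=False, name=None)
    for pos, key in enumerate(rows):
        positions.setdefault(normalize(key), []).append(pos)
    return positions


def phantom_keys(df, cols):
    """Keys reported by .indices that match no actual row of df[cols]."""
    reported = {normalize(k) for k in df.groupby(cols).indices}
    return reported - set(row_positions(df, cols))


df = pd.DataFrame({'a': ['1', '2'], 'b': [None, '20']})
reference = row_positions(df, ['a', 'b'])
print(reference)  # {('1', None): [0], ('2', '20'): [1]}
```

On the pandas 0.20.3 setup described in this report, phantom_keys(df, ['a', 'b']) would be expected to return {('1', '20')}. A version in which the issue is fixed should return an empty set.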

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-431.29.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C
LOCALE: None.None

pandas: 0.20.3
pytest: 3.0.7
pip: 9.0.1
setuptools: 36.5.0
Cython: 0.25.2
numpy: 1.13.3
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.3.0
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Comment From: jreback

.indices is internal.

In [5]: df.groupby(['a', 'b']).groups
Out[5]: 
{('1', nan): Int64Index([0], dtype='int64'),
 ('2', '20'): Int64Index([1], dtype='int64')}

Is this what you are looking for? What are you trying to do?

Comment From: jreback

In any event, this is a duplicate: https://github.com/pandas-dev/pandas/issues/9304

If you'd like to take a look there, that would be great.
