Issue description
df.loc[key] raises a KeyError when df has a tuple index and key is a tuple, but no exception is raised when key is an instance of a tuple subclass, in which case df.loc[key] selects a row.
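One possible explanation, which I have not verified against the pandas source here, is that the indexing code tells the two cases apart with an exact type check rather than an isinstance check. The pure-Python sketch below only illustrates that distinction. The names MyTuple, MyNamedTuple, and classify are made up for this illustration and are not pandas API.

```python
from collections import namedtuple

# Hypothetical tuple subclasses, mirroring the ones in the reproducible example.
MyNamedTuple = namedtuple("MyNamedTuple", ["a", "b"])


class MyTuple(tuple):
    pass


keys = [("foo", "bar"), MyTuple(["foo", "bar"]), MyNamedTuple("foo", "bar")]


def classify(key):
    """Return (exact type check, isinstance check) for key."""
    return type(key) is tuple, isinstance(key, tuple)


# Assumption: a check like `type(key) is tuple` would single out plain tuples.
results = {type(k).__name__: classify(k) for k in keys}
for name, (exact, inst) in results.items():
    print(f"{name}: type(key) is tuple -> {exact}, isinstance(key, tuple) -> {inst}")

# All three keys compare equal and hash identically, so once a key is treated
# as a single label it can match an index entry of any of the three types.
all_equal = keys[0] == keys[1] == keys[2]
same_hash = len({hash(k) for k in keys}) == 1
print(all_equal, same_hash)
```

If this is what happens, it would also be consistent with the key type alone, not the type of the index elements, deciding whether KeyError is raised.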
Reproducible example
from collections import namedtuple
import pandas as pd
# tuple subclasses
mynamedtuple = namedtuple("MyNamedTuple", ["a", "b"])
class mytuple(tuple): pass
# keys
tuple_key = ("foo", "bar")
mytuple_key = mytuple(["foo", "bar"])
mynamedtuple_key = mynamedtuple("foo", "bar")
# pandas objects
tuple_list = [("foo", "bar"), ("bar", "baz")]
tuple_index = pd.Index(tuple_list, tupleize_cols=False)
tuple_df = pd.DataFrame([(1,2), (3,4)], index=tuple_index, columns=["A", "B"])
mytuple_list = [mytuple(["foo", "bar"]), mytuple(["bar", "baz"])]
mytuple_index = pd.Index(mytuple_list, tupleize_cols=False)
mytuple_df = pd.DataFrame([(1,2), (3,4)], index=mytuple_index, columns=["A", "B"])
mynamedtuple_list = [mynamedtuple("foo", "bar"), mynamedtuple("bar", "baz")]
mynamedtuple_index = pd.Index(mynamedtuple_list, tupleize_cols=False)
mynamedtuple_df = pd.DataFrame([(1,2), (3,4)], index=mynamedtuple_index, columns=["A", "B"])
# Examples
print(mytuple_df.loc[mytuple_key]) # Does not raise
print(mynamedtuple_df.loc[mynamedtuple_key]) # Does not raise
print(tuple_df.loc[tuple_key]) # Raises KeyError
The following part of the example runs every combination of index and key types. On master, the only cases where an exception is raised are those where key is a plain tuple.
# Every possible example
from itertools import product
keys = [tuple_key, mytuple_key, mynamedtuple_key]
dfs = [tuple_df, mytuple_df, mynamedtuple_df]
for k, df in product(keys, dfs):
    print(f"Key: {type(k)}, Index: {type(df.index[0])}")
    try:
        res = df.loc[k]
    except KeyError:
        res = "**KeyError**"
    print(f"{res}\n")
Expected behavior
Every one of the examples should raise KeyError, just like when key is a plain tuple.
Deprecate or fix bug?
This behavior is tested for the case where key and the index elements are named tuples:
https://github.com/pandas-dev/pandas/blob/6b10bb875add0030e95b33869e974150dec78670/pandas/tests/indexing/test_loc.py#L1581-L1589
I became aware of this issue while writing a patch for issue #48188: the patch caused the above test to fail. Because there is a test for the behavior described in this issue, I think it is more appropriate to give a deprecation warning before changing it.