Pandas best practice for combining partial string indexing + regular mask?

Consider the simple example as follows

import pandas as pd

dataframe = pd.DataFrame({'time' : [pd.to_datetime('2016-06-06'),
                                    pd.to_datetime('2016-06-06'),
                                    pd.to_datetime('2016-06-07'),
                                    pd.to_datetime('2016-06-08')],
                            'value' : [1,2,3,4],
                            'group' : ['A','B','A','B']})

dataframe.set_index('time', inplace = True)

dataframe
Out[13]: 
           group  value
time                   
2016-06-06     A      1
2016-06-06     B      2
2016-06-07     A      3
2016-06-08     B      4

Now I want to use all the cool partial string indexing functions, but also be able to filter on other variables. The solution I come up with is the syntax

dataframe.loc[dataframe['group'] == 'A'].loc['2016-06-06']

which looks really horrible with the two loc combined. Is that the correct pandonic way to do so (while keeping the index?) I have read and read again the documentation but I cannot find the answer.

Thanks!

Comment From: gfyoung

@randomgambit : Thanks for the question! I think you might be better served emailing this question out to the pandas-dev mailing list or reaching out on our Gitter channel.

I will close this issue for now unless there turns out to be a real issue (or enhancement) with the pandas codebase that can be well addressed with the example you have provided.

Comment From: randomgambit

thanks @gfyoung but can you at least tell me if that's the correct syntax here? Its not easy to send en example via email, and there is nobody on gitter

Comment From: jreback

We don't directly have an exposed partial string indexer. Instead .get_loc will do this, but it can return various things (e.g. a single index, array of booleans, or slice), so it is not directly user facing.

In [15]: dfi = dataframe.index

In [16]: dataframe.loc[(dataframe['group'] == 'A') & dfi.isin(dfi[dfi.get_loc('2016-06-06')])]
Out[16]: 
           group  value
time                   
2016-06-06     A      1

Comment From: randomgambit

@jreback thanks a lot. I think this would be a really nice feature to have. partial string indexing is so useful, but 90% of the time I need to combine if with some other logical condition so I end up doing some df.reset_index().loc[] which is super annoying.

When you say its not user facing, you mean I should not use it?

Comment From: jreback

i mean there is no direct way to do this aside from the verbose code that i posted

i suppose u could make an index function to wrap this as a convenience

if u want to post a signature / doc string can take a look

Comment From: gfyoung

Given where the discussion has gone, I'll reopen this.

Comment From: gfyoung

but can you at least tell me if that's the correct syntax here? Its not easy to send en example via email, and there is nobody on gitter

@randomgambit : Sorry about that! I was responding from my phone, so I was hoping email or Gitter could provide you with a more comprehensive response.

Comment From: randomgambit

@gfyoung @jreback I tried to find more conventional solutions and actually I found this funny thing:

dataframe.loc[(dataframe.reset_index()['time'] == pd.to_datetime('2016-06-06')) & (dataframe['group'] == 'A')]

FAILS

dataframe.reset_index(inplace = True)
dataframe.loc[(dataframe['time'] == pd.to_datetime('2016-06-06')) & (dataframe['group'] == 'A')]

WORKS

in purely logical terms, I dont see why 1. does not work here

Comment From: gfyoung

@randomgambit : What indices do you get from each call (i.e. compare what is being passed in-between the outermost brackets)?

Comment From: randomgambit

@gfyoung

dataframe.set_index('time', inplace = True)

dataframe.reset_index()['time']
Out[35]: 
0   2016-06-06
1   2016-06-06
2   2016-06-07
3   2016-06-08
Name: time, dtype: datetime64[ns]

dataframe.reset_index(inplace = True)

dataframe['time']
Out[37]: 
0   2016-06-06
1   2016-06-06
2   2016-06-07
3   2016-06-08
Name: time, dtype: datetime64[ns]

Comment From: gfyoung

And in your first example, what does the index look like? The DataFrame in your two examples are not the same because one had its index reset (second example) but not the other (first example).

Comment From: randomgambit

@gfyoung I am sorry I dont understand what you want me to do here. I get that they are not the same, but I dont see why dataframe.loc[(dataframe.reset_index()['time'] == pd.to_datetime('2016-06-06')) & (dataframe['group'] == 'A')] does not work

(dataframe.reset_index()['time'] == pd.to_datetime('2016-06-06')) is simply a boolean test

Comment From: gfyoung

Right, but you get a Series returned from that boolean test. The indices in that Series won't match those of the dataframe.

Comment From: randomgambit

@gfyoung got it, that works

dataframe.loc[(dataframe.reset_index()['time'] == pd.to_datetime('2016-06-06')).tolist() & (dataframe['group'] == 'A')]

Comment From: gfyoung

Awesome! Glad we could resolve that. Feel free to post a PR to wrap the functionality you shared above so that we don't have to write this out again :smile:

Comment From: mroeschke

Seems like the new API to wrap functionality hasn't gained much traction or interest over the years so going to close for now. Can reopen if there's more interest from the community in the future