import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [-2, -1, 1, 10, 8, 11, -1],
                   'b': list('abdceff'),
                   'c': [1.0, 2.0, 4.0, 3.2, np.nan, 3.0, 4.0]})
df
Out[316]:
    a  b    c
0  -2  a  1.0
1  -1  b  2.0
2   1  d  4.0
3  10  c  3.2
4   8  e  NaN
5  11  f  3.0
6  -1  f  4.0
df.nlargest(5, ['a', 'c'])
Out[317]:
    a  b    c
6  -1  f  4.0
5  11  f  3.0
3  10  c  3.2
4   8  e  NaN
2   1  d  4.0
df.sort_values(by=['a','c'], ascending=False).head(5)
Out[318]:
    a  b    c
5  11  f  3.0
3  10  c  3.2
4   8  e  NaN
2   1  d  4.0
6  -1  f  4.0
I expected these two calls to return the same rows in the same order.
Comment From: TomAugspurger
nlargest doesn't sort the values (which is part of why it's faster). You can sort afterwards.
In [32]: df.sort_values(['a', 'c'], ascending=False).head(5)
Out[32]:
    a  b    c
5  11  f  3.0
3  10  c  3.2
4   8  e  NaN
2   1  d  4.0
6  -1  f  4.0
In [33]: df.nlargest(5, ['a', 'c']).sort_values(['a', 'c'], ascending=False)
Out[33]:
    a  b    c
5  11  f  3.0
3  10  c  3.2
4   8  e  NaN
2   1  d  4.0
6  -1  f  4.0
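For anyone who wants this as a reusable step, here is a minimal sketch of the "select, then sort" workaround from the comment above. The helper name top_n_sorted is hypothetical (it is not part of pandas); it only chains the two calls shown in In [33], so the row order no longer depends on what nlargest happens to return.

```python
import numpy as np
import pandas as pd


def top_n_sorted(frame, n, columns):
    """Pick the n largest rows by `columns`, then sort them explicitly.

    Hypothetical helper: it wraps DataFrame.nlargest followed by
    DataFrame.sort_values, as suggested in the comment above.
    """
    top = frame.nlargest(n, columns)
    # Sort only the n selected rows, which is cheaper than sorting the whole frame.
    return top.sort_values(columns, ascending=False)


df = pd.DataFrame({'a': [-2, -1, 1, 10, 8, 11, -1],
                   'b': list('abdceff'),
                   'c': [1.0, 2.0, 4.0, 3.2, np.nan, 3.0, 4.0]})

result = top_n_sorted(df, 5, ['a', 'c'])
expected = df.sort_values(['a', 'c'], ascending=False).head(5)

# Compare the row labels of both approaches.
print(result.index.tolist() == expected.index.tolist())
```

On a frame this small the difference does not matter; the point of keeping nlargest in the chain is that the explicit sort then only touches n rows.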