In [90]: df
Out[90]:
          A         B         C         D         E
0  0.444939  0.407554  0.460148  0.465239  0.462691
1  0.016545  0.850445  0.817744  0.777962  0.757983
2  0.934829  0.831104  0.879891  0.926879  0.721535
3  0.117642  0.145906  0.199844  0.437564  0.100702
In [91]: df2 = df.apply(lambda row: df.columns[np.argsort(row)], axis=1)
In [92]: df2
Out[92]:
Output
A B C D E
0 B A C E D
1 A E D C B
2 E B C D A
3 E A B C D
Expected Output
1 2 3 4 5
0 B A C E D
1 A E D C B
2 E B C D A
3 B C D E A
The last row of the DataFrame differs from the expected output, but the remaining rows match it.
http://stackoverflow.com/questions/39605512/twist-dataframe-by-rank/39610093#39610093
Comment From: bkandel
I don't see the problem here. np.argsort sorts its arguments in increasing order. In the last row, E has the lowest value (0.1), followed by A (0.118), followed by B (0.1459), etc. So this looks to me like it's performing as expected.
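As a quick check of the point above, here is a minimal sketch that applies np.argsort to the last row alone. The values are copied from the frame in the report; the names `labels`, `row`, and `order` are only for illustration.

```python
import numpy as np

# Last row of the DataFrame from the report (columns A..E).
labels = np.array(["A", "B", "C", "D", "E"])
row = np.array([0.117642, 0.145906, 0.199844, 0.437564, 0.100702])

# argsort returns the positions that would sort the row in increasing order.
order = np.argsort(row)
print(order)          # [4 0 1 2 3]
print(labels[order])  # ['E' 'A' 'B' 'C' 'D']
```

Position 0 of the result holds the label of the smallest value (E), position 1 the label of the next smallest (A), and so on, which is the row shown in the reported output.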
Comment From: harikongu
The code above replaces the data with column names so that the values are in ascending order. In the last row, column E has the lowest value, so it should be replaced with "A". Only the last row comes out in a different order. Expected output for [0.117642 0.145906 0.199844 0.437564 0.100702]: [B C D E A]
Comment From: chris-b1
This is not a bug. np.argsort returns an indexer (i.e., which element should be in that position), not a rank. You probably want something like the rank-based SO answer.
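# 0-based rank of each value within its row (rank() itself is 1-based)
ranks = df.rank(axis=1).astype(int) - 1
# use each rank as a position into the column labels
new_values = df.columns.values.take(ranks)
pd.DataFrame(new_values)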
Out[101]:
0 1 2 3 4
0 B A C E D
1 A E D C B
2 E B C D A
3 B C D E A
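To make the indexer-versus-rank distinction concrete, the sketch below rebuilds the frame from the report and compares the two approaches. It assumes the values contain no ties, and the names `by_argsort` and `by_rank` are illustrative only. It computes the ranks by applying np.argsort twice, which is a substitute for the `df.rank` call above and not the method from the linked answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[0.444939, 0.407554, 0.460148, 0.465239, 0.462691],
     [0.016545, 0.850445, 0.817744, 0.777962, 0.757983],
     [0.934829, 0.831104, 0.879891, 0.926879, 0.721535],
     [0.117642, 0.145906, 0.199844, 0.437564, 0.100702]],
    columns=list("ABCDE"),
)
values = df.to_numpy()
labels = df.columns.to_numpy()

# Indexer: order[i, k] is the column position of the k-th smallest value in row i.
order = np.argsort(values, axis=1)
by_argsort = labels[order]

# Rank: ranks[i, j] is the 0-based rank of the value in column j of row i.
# Applying argsort to a permutation inverts it, which (without ties) gives the ranks.
ranks = np.argsort(order, axis=1)
by_rank = labels[ranks]

print(by_argsort[-1])  # ['E' 'A' 'B' 'C' 'D']
print(by_rank[-1])     # ['B' 'C' 'D' 'E' 'A']
```

The first three rows agree under both approaches because their sort orders happen to be their own inverses. The last row's order, [4 0 1 2 3], is not its own inverse, so it is the only row where the two results differ.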