I often want to remove certain elements from an index while retaining its original ordering. A sort keyword in the index.difference
method would make this possible:
In: index = pd.Index([0, 1, 2, 4, 3])
In: index.difference({1, 2})
Out: Int64Index([0, 3, 4], dtype='int64')
# It would be cool to have instead
In: index.difference({1, 2}, sort=False)
Out: Int64Index([0, 4, 3], dtype='int64')
I think this use case comes up frequently, and setting the default to sort=True
would not affect existing code.
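For reference, a minimal sketch of two ways to get the order-preserving result. It assumes a pandas version in which Index.difference accepts the sort keyword (the closing comment in this thread indicates it eventually did). The isin mask is a workaround that does not depend on the keyword. The names to_drop, kept_mask and kept_kw are illustrative only.

```python
import pandas as pd

index = pd.Index([0, 1, 2, 4, 3])
to_drop = [1, 2]

# Boolean-mask workaround: surviving elements keep their original order
kept_mask = index[~index.isin(to_drop)]

# Keyword form, assuming the installed pandas supports sort=False here
kept_kw = index.difference(to_drop, sort=False)

print(list(kept_mask))  # [0, 4, 3]
print(list(kept_kw))    # [0, 4, 3]
```

The mask form also keeps duplicate labels, whereas difference returns unique labels only, so the two can differ on an index with repeated values.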
Comment From: TomAugspurger
+1 for this (and all the other set-ops).
@mcocdawc are you interested in submitting a pull request?
Comment From: mcocdawc
Interested, definitely, but I don't have time for it now. :(
Comment From: Giftlin
I am interested in submitting a pull request. Can I?
Comment From: jreback
sure
Comment From: jreback
This needs to be done for all Index set methods (union, difference, intersection) and tested systematically. Further, there are several related issues here:
xref #17010, #17378, #17376
Comment From: Licht-T
Oops, I am almost done patching in all Index classes.
difference and intersection are okay, but I have a question: do we really need the sort option in union?
Comment From: Licht-T
If the sort option is needed in union, how do we define the unsorted union order? idx1.join(idx2.difference(idx1))?
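One possible reading of the ordering asked about here, as a hedged sketch: keep idx1 as is, then append the elements of idx2 that idx1 does not already contain, in idx2's own order. It uses Index.append for the concatenation rather than join, which is an assumption about what the expression above intends, and it assumes both indexes hold unique values. The example data and the names new_from_idx2 and unsorted_union are hypothetical.

```python
import pandas as pd

idx1 = pd.Index([3, 1, 2])
idx2 = pd.Index([2, 5, 4])

# Elements of idx2 that are not already in idx1, in idx2's own order
new_from_idx2 = idx2[~idx2.isin(idx1)]

# Candidate "unsorted union": all of idx1 first, then the new elements
unsorted_union = idx1.append(new_from_idx2)

print(list(unsorted_union))  # [3, 1, 2, 5, 4]
```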
Comment From: jbrockmendel
The set methods all have a sort keyword. Closing as complete.