Add notin convenience method to DataFrame/Series classes
This may already have been discussed, but I couldn't find any existing issue/enhancement with this title. It's trivial to implement, but I think it might be syntactically helpful to write
df.notin(df1)
instead of
~df.isin(df1)
especially in longer expressions where the boolean array is being used to slice a dataframe.
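To make the readability point concrete, here is a small sketch of the current idiom inside a longer filter expression. The data, the column names, and the `excluded` list are made up for illustration; since `notin` does not exist in pandas, only the `~isin` form is shown.

```python
import pandas as pd

# Illustrative data; not taken from the issue.
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune", "Kyiv"], "sales": [10, 25, 5, 40]}
)
excluded = ["Lima", "Kyiv"]

# Current idiom: negate isin with the unary ~ operator.
# The comparison needs its own parentheses because & binds tighter than >.
kept = df[~df["city"].isin(excluded) & (df["sales"] > 3)]
print(kept)
```

The `~` sits at the start of the term it negates, which is the "backtracking" cost the proposal is trying to remove.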
Describe the solution you'd like
Add a convenience method that returns the negation of df.isin
API breaking implications
No impact on API.
Additional context
This has already been done for isna/notna, and I find it very helpful; I am just borrowing the same idea.
Copied from frame.py:
@doc(NDFrame.isna, klass=_shared_doc_kwargs["klass"])
def isna(self) -> DataFrame:
    result = self._constructor(self._mgr.isna(func=isna))
    return result.__finalize__(self, method="isna")

@doc(NDFrame.notna, klass=_shared_doc_kwargs["klass"])
def notna(self) -> DataFrame:
    return ~self.isna()
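By analogy with `notna`, the proposed method would amount to a one-line negation of `isin`. The sketch below is a standalone helper, not pandas API: the name `notin` and its signature are assumptions for illustration, and it is written as a plain function so that it runs against a released pandas.

```python
import pandas as pd


def notin(obj, values):
    """Hypothetical helper: element-wise negation of ``obj.isin(values)``.

    Works for both Series and DataFrame, since both define ``isin``
    and support the unary ``~`` operator on boolean results.
    """
    return ~obj.isin(values)


s = pd.Series(["a", "b", "c"])
mask = notin(s, ["b"])
print(mask.tolist())  # [True, False, True]
```

The same helper can be chained with `pipe`, for example `s.pipe(notin, ["b"])`, which reads close to the proposed method call.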
Comment From: jreback
-0.5 on additional API like this.
Would take an update to the isin docs showing how to use invert (~) as an enhancement.
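A sketch of what such a docs example might look like at the DataFrame level; the frame below is illustrative, and the wording of any real docs update is not implied.

```python
import pandas as pd

df = pd.DataFrame(
    {"num_legs": [2, 4], "num_wings": [2, 0]}, index=["falcon", "dog"]
)

# isin marks each cell whose value appears in the list;
# ~ flips the result, giving "not in" for every cell.
not_in = ~df.isin([0, 2])
print(not_in)
```

Only the `dog` row's `num_legs` value (4) is absent from the list, so that is the single `True` cell in the result.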
Comment From: ShivnarenSrinivasan
Sure, I'll do that
Comment From: ms7463
Would we possibly reconsider adding the notin method? I agree with @ShivnarenSrinivasan's opinion that the notin syntax is superior to the ~isin syntax. For one, it nicely emulates Python's x not in y syntax. Personally, I find the current syntax adds just enough cognitive load to break my train of thought when both reading and writing code. For example, I often find myself writing out the isin portion, then backtracking to add the unary operator in the proper place to negate the expression.
Not to mention that all other is* methods have a complementary not* method (which I fully support).
I understand the reluctance to add more methods to these classes, but I think that notin is a natural method to expect, and one that would be used much more than many other methods, to produce code that's easier to read, write and understand.
Comment From: lucas-abbate
I agree with @aa1371. It makes code consistent with code outside of pandas, ~isin makes reading a lot harder, and I use (and think many people use) ~isin all the time. I know it adds more methods, but I think it makes pandas much easier to use (and to learn: all my students always have to google something like "pandas not in"), and I don't think that the 3-4 lines it takes to add this method make any difference to the 12k+ lines of frame.py.
I've never contributed, but I'm willing to learn just to make .notin() work.