I get a NameError exception when I try to filter a DataFrame by selected index values (inside an IPython session). You can see that valid is a numpy.array while lab is a pandas.DataFrame object. Both of them are initialized and accessible. However, I cannot put them together. Here is the error:
In [51]: valid
Out[51]:
array([38661, 44593, 38705, 38918, 38727, 38757, 38751, 38777, 38787,
...,
45328, 45337, 43645, 43694, 43701])
In [52]: lab
Out[52]:
0
39333 -1
39173 -1
42756 -1
39633 -1
38661 -1
44801 81
... ...
39379 -1
39742 -1
44765 108
44279 -1
40584 -1
41047 -1
41833 98
[3299 rows x 1 columns]
In [53]: lab[lab.index.map(lambda x: x in valid)]
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/django/core/management/commands/shell.pyc in <module>()
----> 1 lab[lab.index.map(lambda x: x in valid)]
/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/pandas/core/index.pyc in map(self, mapper)
1558
1559 def map(self, mapper):
-> 1560 return self._arrmap(self.values, mapper)
1561
1562 def isin(self, values, level=None):
/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/pandas/algos.so in pandas.algos.arrmap_int64 (pandas/algos.c:78469)()
/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/django/core/management/commands/shell.pyc in <lambda>(x)
----> 1 lab[lab.index.map(lambda x: x in valid)]
NameError: global name 'valid' is not defined
What's wrong with this code? I use pandas 0.15.1. (This issue duplicates the Stack Overflow question http://stackoverflow.com/questions/27637825/variable-visibility-issue-with-pandas-and-ipython/27643446#27643446)
Comment From: jreback
Looks like it's trying to call a shell command (you usually need ! to do that). The same expression works in a plain IPython session:
In [16]: valid = np.array([100,0,300])
In [17]: lab = Series([-1,2,-1],[100,200,300])
In [18]: lab[lab.index.map(lambda x: x in valid)]
Out[18]:
100 -1
300 -1
dtype: int64
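For what it's worth, one general way a NameError like this can show up (not confirmed as the cause in this thread) is when the interactive code is executed with separate globals and locals dictionaries. Names typed at the prompt land in the locals, while a lambda looks up free names in the globals. A minimal sketch in plain Python, where snippet and run are hypothetical names used only for illustration:

```python
# Hypothetical stand-in for what gets typed at the interactive prompt.
snippet = """
valid = [100, 300]
check = lambda x: x in valid
result = check(100)
"""

def run(globals_ns, locals_ns):
    """Execute the snippet in the given namespaces and report the outcome."""
    try:
        exec(snippet, globals_ns, locals_ns)
        return "ok"
    except NameError as exc:
        return "NameError: %s" % exc

# One shared namespace: the lambda finds 'valid' in its globals.
shared = {}
same_ns = run(shared, shared)

# Separate namespaces: 'valid' is stored in the locals dict,
# but the lambda only searches the globals dict (and builtins).
split_ns = run({}, {})

print(same_ns)
print(split_ns)
```

If the embedding shell behaves like the second call, avoiding the lambda (for example with isin) sidesteps the lookup entirely.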
# you should just do this in any event
In [19]: lab[lab.index.isin(valid)]
Out[19]:
100 -1
300 -1
dtype: int64
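The same isin approach should carry over to a DataFrame like the one in the question. A small sketch, assuming made-up sample data (the index and column values below are illustrative, not the full data from the report):

```python
import numpy as np
import pandas as pd

# Illustrative data only: a one-column DataFrame keyed by integer ids,
# and an array of ids to keep.
valid = np.array([38661, 44801, 41833])
lab = pd.DataFrame({0: [-1, -1, 81, 98]},
                   index=[39333, 38661, 44801, 41833])

# Index.isin returns a boolean mask aligned with the index,
# so no Python-level lambda (and no name lookup inside it) is needed.
mask = lab.index.isin(valid)
filtered = lab[mask]

print(filtered)
```

Besides avoiding the lambda, isin is vectorized, so it tends to be faster than mapping a Python function over each index value.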