Code Sample, a copy-pastable example if possible
v=veh_frame.Accident_Index[(veh_frame.Accident_Index.isin(v2.tolist()))&(veh_frame.Vehicle_Type==11)]
Problem description
v2 is a Series whose values are passed, as a list, to isin() on a column of the DataFrame veh_frame. I recently upgraded from pandas 0.19.1 to the latest stable release, 0.20.3.
In the previous version this call worked without complaint, but it now raises an error with the traceback below.
Output
Traceback (most recent call last):
File "
File "C:\Users\TRLuser\Anaconda3\lib\site-packages\pandas\core\series.py", line 2555, in isin
    result = algorithms.isin(_values_from_object(self), values)
File "C:\Users\TRLuser\Anaconda3\lib\site-packages\pandas\core\algorithms.py", line 426, in isin
    return f(comps, values)
File "C:\Users\TRLuser\Anaconda3\lib\site-packages\pandas\core\algorithms.py", line 406, in
File "C:\Users\TRLuser\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py", line 401, in in1d
    ar2 = np.unique(ar2)
File "C:\Users\TRLuser\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py", line 214, in unique
    ar.sort()
TypeError: '>' not supported between instances of 'int' and 'str'
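The last two frames show the failure happening inside a sort of the values handed to isin(). On Python 3, sorting an object array that mixes int and str raises this kind of TypeError. A minimal sketch with made-up values (the array contents are an assumption, not data from this report):

```python
import numpy as np

# Hypothetical values: an object array that mixes int and str entries.
mixed = np.array([201501, "2015X2", 201503], dtype=object)

try:
    # Mirrors the ar.sort() call in the last traceback frame.
    mixed.sort()
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

The exact wording of the message (which operator, which order of types) can differ between NumPy versions.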
Output of pd.show_versions()
Comment From: TomAugspurger
Could you edit your issue to include a reproducible example?
Comment From: krishnan2107
Unfortunately not; it's part of a bigger database sample. The odd thing is that the same accident index works fine in other frames (shown below) and fails only in this frame (veh_frame).
`v1=acc_frame.Accident_Index[(acc_frame.Latitude>=51.2537)&(acc_frame.Latitude<=51.7181)]`
`v2=acc_frame.Accident_Index[(acc_frame.Accident_Index.isin(v1.tolist()))&((acc_frame.Longitude>=-0.5501)&(acc_frame.Longitude<=0.2644))]`
Also, when I trimmed the list to about 10-20 entries (in order to paste it here), it worked. But when I run it with a list of about 10000 strings, it fails. This was not a problem in the previous version.
Comment From: jreback
Your issue is likely https://github.com/pandas-dev/pandas/issues/16012, which is fixed for 0.21.0 (soon), but without a reproducible example it is impossible to tell.
Comment From: krishnan2107
Some more information: I tested how long a list it can take before throwing an error, and the limit seems to be 26626. Could this be a datatype issue with the value that stores the series length?
Comment From: TomAugspurger
Hard to say without an example. You don't need to share any real data, but you may have to do a bit of work to make something that's small enough to be copy-pastable, but still demonstrates the issue.
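One way to start on such an example is to generate synthetic data with the same column names and run the failing expression against it. This is only a sketch: the identifiers, the size n, and the vehicle type values are all made up, and there is no guarantee it reproduces the error as written.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 30_000  # hypothetical size, chosen only to exceed the length reported above

# Made-up identifiers and vehicle types; no real data involved.
ids = [f"ID{i:07d}" for i in range(n)]
veh_frame = pd.DataFrame({
    "Accident_Index": ids,
    "Vehicle_Type": rng.integers(1, 20, size=n),
})
v2 = pd.Series(ids[::2])  # every second identifier as the lookup series

# Same expression as in the report, run against the synthetic frame.
v = veh_frame.Accident_Index[
    veh_frame.Accident_Index.isin(v2.tolist()) & (veh_frame.Vehicle_Type == 11)
]
print(len(v), veh_frame.dtypes.to_dict())
```

If this runs cleanly, the next step would be to make the synthetic values look more like the real ones (same dtypes, same mix of value types) until the behaviour differs.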
Comment From: krishnan2107
Cracked it: the argument passed to isin() contained both string and int values. This worked fine before but, as noted in #16012, mixed types do make a difference now. I converted the ints in v2 to strings and then it worked fine.
Sorry for the trouble, and thanks all for responding so quickly.
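For anyone landing here with the same error, the check and the workaround described above might look roughly like this. The frame and the values are hypothetical stand-ins, since the real data was never posted:

```python
import pandas as pd

# Hypothetical stand-in for veh_frame; the values are made up.
veh_frame = pd.DataFrame({
    "Accident_Index": ["201501", "201502", "201503", "201504"],
    "Vehicle_Type": [11, 9, 11, 11],
})

# A lookup series that mixes str and int, as described in the thread.
v2 = pd.Series(["201501", 201503, "201599"], dtype=object)

# Count the Python types present, to confirm that the series is mixed.
type_counts = v2.map(lambda x: type(x).__name__).value_counts().to_dict()
print(type_counts)

# Cast every lookup value to str so isin() compares like with like.
lookup = v2.astype(str).tolist()

mask = veh_frame.Accident_Index.isin(lookup) & (veh_frame.Vehicle_Type == 11)
v = veh_frame.Accident_Index[mask]
print(v.tolist())
```

Casting to str is only appropriate when the column being filtered holds strings too; if it holds ints, the conversion would need to go the other way.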