Code Sample, a copy-pastable example if possible
import pandas as pd

data = [{'id': 11, 'text': 'Osv1wbZoL'},
        {'id': 0, 'text': 'KQpPReW3S9nZOS3'},
        {'id': 0, 'text': 'cbqLhjrb0B2Ah6E'},
        {'id': 3, 'text': 'qu1Jlnyba'},
        {'id': 14, 'text': 'aJUv5DBjbcGc3'},
        {'id': 12, 'text': 'Yobf9'},
        {'id': 4, 'text': 'awzZCV'},
        {'id': 4, 'text': '3NvBAVL'},
        {'id': 11, 'text': '80sPCxIf9s5wmEZ1'},
        {'id': 5, 'text': 'afrPD0X6mIzFK'}]
df = pd.DataFrame(data)
df.dtypes
# out:
# id int64
# text object
# dtype: object
type(df[['id', 'text']].to_dict(orient='records')[0]['id'])
# out: int
type(df[['id']].to_dict(orient='records')[0]['id'])
# out: numpy.int64
Problem description
Depending on the number of output columns, numpy integers are either converted to Python integers or left as numpy.int64. In the latter case, both json.dumps and ujson.dumps fail to encode the result.
Expected Output
int for both cases
Output of pd.show_versions()
Comment From: TomAugspurger
Looks like a duplicate of https://github.com/pandas-dev/pandas/issues/13258
@kszucs Feel free to dive in if you're interested.
Comment From: TomAugspurger
"afterwards both json.dumps and ujson.dumps fails to encode"
If you're only dumping to JSON, then df.to_json might work for you. It knows how to serialize numpy types.
Comment From: kszucs
Sadly there is no orient I could use in to_json.
Comment From: makmanalp
@kszucs I think that has been fixed recently FYI: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html
Comment From: 0anton
I still observe the same behavior (numpy.int64 instead of the Python int type) in the current pandas 0.23.4. The related issue #13258, which @TomAugspurger used to close this issue, was supposed to bring the fix in 0.21.0. I vote to re-open.
Comment From: nataschaZima
I am having the same issue. I have to convert CSV output to a dict. Doing it with pandas (df.to_dict('records')) gives me the correct Python types in one case but leaves the types as numpy in the other. Is there any workaround at all? I cannot find proper documentation on how to use pandas.compat with the pandas method to_dict.
Comment From: stevenanton-bc
For posterity, I ended up doing something like json.loads(df.to_json()) to coerce the numpy data types to internal types (e.g. np.int64 to int).
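If the records from to_dict need to be kept as they are, another option that may work is a `default` hook for json.dumps that unwraps numpy scalars through their .item() method. The sketch below is only an illustration: `to_native` is a hypothetical helper name, and `FakeInt64` is a stand-in for numpy.int64 so that the snippet runs without numpy installed.

```python
import json


class FakeInt64:
    """Hypothetical stand-in for numpy.int64 (assumption, not a pandas/numpy class)."""

    def __init__(self, value):
        self._value = value

    def item(self):
        # numpy scalars expose .item(), which returns the equivalent Python scalar
        return self._value


def to_native(obj):
    """Hypothetical json.dumps `default` hook: unwrap scalar-like objects via .item()."""
    if hasattr(obj, "item"):
        return obj.item()
    # Anything else is still rejected, as json.dumps would do on its own
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# One record shaped like the output of df[['id']].to_dict(orient='records')
records = [{"id": FakeInt64(11), "text": "Osv1wbZoL"}]
encoded = json.dumps(records, default=to_native)
print(encoded)
```

json.dumps only calls the hook for objects it cannot encode itself, so plain Python ints and strings pass through untouched.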
Comment From: kumudraj
Use:
from pandas.io.json import dumps
import json
data_dictionary = df.iloc[0].to_dict()
data_dictionary = json.loads(dumps(data_dictionary, double_precision=0))