I have a DataFrame that I need to convert to a dict and add to another dict, which is serialized to JSON at the end. Unfortunately, since .to_dict() does not always return standard Python types, I get the following error:

TypeError: Object of type 'int32' is not JSON serializable

My code is something like the following:

import json

output = {
    'foo': 1,
    'bar': 2
}
...
df = ...
output['baz'] = df.to_dict()

send(json.dumps(output))

Calling json.dumps results in the above error. This is related to https://github.com/pandas-dev/pandas/issues/13258, where an 'easy way' was mentioned, but I'm not sure what it is.

I would appreciate any help.

Comment From: chris-b1

Please provide a reproducible example. In at least some cases we do return Python types (I agree we should in all cases):

In [210]: df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6]})

In [211]: df.to_dict()
Out[211]: {'a': {0: 1, 1: 2, 2: 3}, 'b': {0: 4, 1: 5, 2: 6}}

In [213]: type(df.to_dict()['a'][0])
Out[213]: int
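
To check a whole result rather than a single value, something along these lines could list every NumPy scalar left in a nested dict. This is only a sketch: `find_numpy_values` is a hypothetical helper, not a pandas function, and `sample` is a hand-built stand-in for a `to_dict()` result.

```python
import numpy as np


def find_numpy_values(d, path=()):
    """Yield (key path, type name) for every NumPy scalar in a nested dict."""
    for key, value in d.items():
        if isinstance(value, dict):
            # Recurse into nested dicts such as {column: {index: value}}.
            yield from find_numpy_values(value, path + (key,))
        elif isinstance(value, np.generic):
            # np.generic is the base class of all NumPy scalar types.
            yield path + (key,), type(value).__name__


# Hand-built stand-in for a to_dict() result (an assumption, not real pandas output).
sample = {"a": {0: 1, 1: np.int32(2)}, "b": {0: 4.0}}
offenders = list(find_numpy_values(sample))
print(offenders)
```

An empty list would mean every value is already a plain Python object.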

Comment From: affanshahid

I am loading data from a database. I then call .describe() on a column (a Series) of my df, call .to_dict() on the result, and attempt the JSON conversion. Below is a simplified example that raises the following error: TypeError: Object of type 'int32' is not JSON serializable

import pandas as pd
import json
import datetime

data = [
    datetime.date(1987, 2, 12),
    datetime.date(1987, 2, 12),
    datetime.date(1987, 2, 12),
    None,
    None,
    datetime.date(1989, 6, 15),
    datetime.date(1989, 6, 15),
    datetime.date(1989, 6, 15),
    datetime.date(1989, 6, 15),
    datetime.date(1989, 6, 15),
    datetime.date(1989, 6, 15),
    datetime.date(1989, 6, 15),
    datetime.date(1989, 6, 15),
    datetime.date(1989, 6, 15),
    datetime.date(1989, 6, 15)
]

df = pd.DataFrame(columns=['foo'])
df['foo'] = data
ds = df['foo'].describe()
d = ds.to_dict()
j = json.dumps(d)
print(j)

Comment From: jreback

Show your versions; this is fixed in 0.21, IIRC.

Comment From: chris-b1

Thanks for the example - this is actually a symptom / dupe of #15385 - some of the aggregations used in describe are returning NumPy scalars instead of Python ones, and those scalars get passed along to to_dict().
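
Until that is addressed, one possible workaround is to give json.dumps a `default` hook that converts whatever it cannot serialize. The following is a minimal sketch, not an official pandas recipe: `to_builtin` is a hypothetical name, and the dict `d` is a hand-built stand-in that only resembles the describe() output from the example above. It also converts dates, since `datetime.date` values are not JSON serializable either.

```python
import datetime
import json

import numpy as np


def to_builtin(obj):
    """Fallback for json.dumps: called only for objects it cannot serialize."""
    if isinstance(obj, np.generic):
        # .item() returns the equivalent plain Python scalar (int, float, ...).
        return obj.item()
    if isinstance(obj, datetime.date):
        # Also covers datetime.datetime, which subclasses datetime.date.
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Hand-built stand-in for ds.to_dict() (an assumption, not real pandas output).
d = {
    "count": np.int64(13),
    "unique": np.int32(2),
    "top": datetime.date(1989, 6, 15),
    "freq": np.int32(10),
}

j = json.dumps(d, default=to_builtin)
print(j)
```

Because the hook is only invoked for values the encoder rejects, values that are already plain Python types pass through unchanged.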