Code Sample, a copy-pastable example if possible
I want to convert the following DataFrame to JSON in a dictionary style (one list per column).
   weightedRisk  riskAsset  safetyAsset  weightedAsset  riskRank
0      0.029839   0.029839     0.220161           0.25         2
1      0.053840   0.053840     0.446160           0.50         3
2      0.015485   0.015485     0.234515           0.25         4
Expected Output
The expected output should be:
"groupedWeightedRisk": {
    "safetyAsset": [
        0.2201611228,
        0.4461597981,
        0.2345148013
    ],
    "riskAsset": [
        0.0298388772,
        0.0538402019,
        0.0154851987
    ],
    "riskRank": [
        2,
        3,
        4
    ],
    "weightedRisk": [
        0.0298388772,
        0.0538402019,
        0.0154851987
    ],
    "weightedAsset": [
        0.25,
        0.5,
        0.25
    ]
}
But in pandas, all of the to_json options make the output contain the index.
So I use to_dict(orient='list') and then json.dumps to serialize the data.
This method works well most of the time. However, in the previous case, json.dumps raises an error: TypeError: 2 is not JSON serializable
Then I checked the DataFrame and located the problem:
type(groupedWeightedRisk['riskRank'][0])
<class 'numpy.int64'>
One of the series is made up of numpy.int64 values, and this data type is not directly JSON serializable.
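The failure can be reproduced without a DataFrame. The following is a minimal sketch, assuming Python 3 and NumPy are installed; `to_builtin` is a hypothetical helper name, not part of pandas or the json module. It relies on the `default` hook of json.dumps, which is called for any object the encoder cannot handle.

```python
import json

import numpy as np


def to_builtin(value):
    """Fallback for json.dumps: convert NumPy scalars to Python scalars."""
    if isinstance(value, np.generic):
        return value.item()
    raise TypeError(f"{value!r} is not JSON serializable")


# A column of numpy.int64 values, like the riskRank column in the report.
payload = {"riskRank": [np.int64(2), np.int64(3), np.int64(4)]}

# Without a fallback, the encoder rejects numpy.int64 on Python 3.
try:
    json.dumps(payload)
    raised = False
except TypeError:
    raised = True

# With the fallback, each NumPy scalar is converted before encoding.
encoded = json.dumps(payload, default=to_builtin)
print(raised, encoded)
```

This keeps the to_dict(orient='list') approach and only changes the json.dumps call, so it may be less intrusive than converting every column by hand.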
Finally, I worked around this problem with a clumsy function that serializes the data columns one by one:
import json
import pandas as pd

def dataFrameToDict(df, indexMethod=None):
    result = {}
    # Round-trip each column through to_json so NumPy scalars become plain Python values.
    for i in df.columns:
        result[i] = json.loads(df[i].to_json(orient='records'))
    # Any non-None indexMethod adds the index as an extra key.
    if indexMethod is not None:
        if df.index.name is not None:
            indexName = df.index.name
        else:
            indexName = 'index'
        if isinstance(df.index, pd.DatetimeIndex):
            result[indexName] = list(df.index.strftime("%Y-%m-%d"))
        else:
            result[indexName] = list(df.index)
    return result
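The same column-by-column idea can probably be written without the JSON round trip. This is a hedged sketch, not an existing pandas method: `frame_to_column_lists` and `include_index` are hypothetical names. It assumes that Series.tolist() and Index.tolist() return plain Python scalars for integer and float data, which is what makes the result acceptable to json.dumps.

```python
import json

import pandas as pd


def frame_to_column_lists(df, include_index=False):
    """Return {column name: list of Python scalars} for a DataFrame."""
    # tolist() converts NumPy scalars such as numpy.int64 to Python int/float.
    result = {str(col): df[col].tolist() for col in df.columns}
    if include_index:
        name = df.index.name if df.index.name is not None else "index"
        if isinstance(df.index, pd.DatetimeIndex):
            # Dates are rendered as ISO-style day strings.
            result[name] = df.index.strftime("%Y-%m-%d").tolist()
        else:
            result[name] = df.index.tolist()
    return result


# Two columns from the example in this report.
df = pd.DataFrame({
    "weightedAsset": [0.25, 0.50, 0.25],
    "riskRank": [2, 3, 4],
})
converted = frame_to_column_lists(df)
encoded = json.dumps(converted)
print(encoded)
```

Passing include_index=True adds the index under its own name, or under "index" when it has none, mirroring the function above.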
So, is there any possibility that pandas could add an option to to_json that produces JSON in this dictionary style?
Thanks a lot.
Comment From: sinhrks
Sorry, I can't understand what you mean. Could you attach a copy-pastable script and the output of pd.show_versions(),
as indicated in the GitHub issue template?
The following script works in my environment (Python 2.7, Mac):
import json
from io import StringIO
import pandas as pd

data = u"""weightedRisk,riskAsset,safetyAsset,weightedAsset,riskRank
0,0.029839,0.029839,0.220161,0.25,2
1,0.053840,0.053840,0.446160,0.50,3
2,0.015485,0.015485,0.234515,0.25,4"""
df = pd.read_csv(StringIO(data), sep=',')
json.dumps(df.to_dict(orient='list'))
# '{"weightedAsset": [0.25, 0.5, 0.25],
# "riskRank": [2, 3, 4],
# "riskAsset": [0.029838999999999997, 0.05384, 0.015484999999999999],
# "weightedRisk": [0.029838999999999997, 0.05384, 0.015484999999999999],
# "safetyAsset": [0.22016100000000002, 0.44616000000000006, 0.234515]}'
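Since the script succeeds in one environment and fails in another, it may help to print the element types that to_dict(orient='list') returns. The sketch below is only a diagnostic aid under that assumption; `element_types` is a hypothetical variable name, and the types reported will depend on the Python, NumPy and pandas versions in use.

```python
import io

import pandas as pd

data = """weightedRisk,riskAsset,safetyAsset,weightedAsset,riskRank
0,0.029839,0.029839,0.220161,0.25,2
1,0.053840,0.053840,0.446160,0.50,3
2,0.015485,0.015485,0.234515,0.25,4"""

# The header has one field fewer than the rows, so the first field becomes the index.
df = pd.read_csv(io.StringIO(data))
columns = df.to_dict(orient="list")

# Record the type name of the first element in each column.
element_types = {name: type(values[0]).__name__ for name, values in columns.items()}
print(element_types)
```

A result such as int64 for riskRank would point to the NumPy scalar that json.dumps rejected in the original report.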
Comment From: Wall-ee
Sorry about the late reply. The system info is: Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] on win32
and the OS is Windows 10.
Comment From: Wall-ee
INSTALLED VERSIONS
commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Windows
OS-release: 8
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.0
nose: None
pip: 8.1.0
setuptools: 12.0.5
Cython: None
numpy: 1.10.4
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.5.0
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: 0.7.5.None
psycopg2: None
jinja2: 2.8
boto: None