Loading a JSON file with large integers (greater than 2^63, i.e. beyond int64) results in "Value is too big". I have tried changing the orient to "records" and also passing dtype={'id': numpy.dtype('uint64')}. The error is the same.
import pandas
data = pandas.read_json('''{"id": 10254939386542155531}''')
print(data.describe())
Expected Output
id
count 1
unique 1
top 10254939386542155531
freq 1
Actual Output (even with dtype passed in)
File "./parse_dispatch_table.py", line 34, in <module>
print(pandas.read_json('''{"id": 10254939386542155531}''', dtype=dtype_conversions).describe())
File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 234, in read_json
date_unit).parse()
File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 302, in parse
self._parse_no_numpy()
File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 519, in _parse_no_numpy
loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Value is too big
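A workaround sketch (my own suggestion, not a pandas API): let the stdlib json module do the decoding, since Python ints have arbitrary precision, and only hand the already-parsed data to pandas.

```python
import json

import pandas

# Workaround sketch (assumption: bypass pandas' C JSON parser entirely).
# The stdlib json module decodes integers with arbitrary precision, so the
# oversized id survives exactly; pandas then stores it without re-parsing.
record = json.loads('{"id": 10254939386542155531}')
data = pandas.DataFrame([record])
print(int(data['id'].iloc[0]))  # value preserved exactly
```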
No problem using read_csv:
import pandas
import io
print(pandas.read_csv(io.StringIO('''id\n10254939386542155531''')).describe())
Output using read_csv
id
count 1
unique 1
top 10254939386542155531
freq 1
Output of pd.show_versions()
Comment From: jreback
That's not JSON pandas can parse: your numbers should be quoted if they are in fact identifiers rather than numbers. That value is out of int64 range and should raise.
In [34]: pd.read_json('''{"id": "10254939386542155531"}''', dtype=object, orient='record', typ='series')
Out[34]:
id 10254939386542155531
dtype: object
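Building on that answer, a hedged sketch of getting real integers back from the quoted ids (the io.StringIO wrapper and the apply(int) step are my additions, not part of the original reply):

```python
import io

import pandas as pd

# Sketch (my addition): read the quoted id as an object-dtype string, then
# convert back to a Python int, which has no 64-bit limit.
s = pd.read_json(io.StringIO('{"id": "10254939386542155531"}'),
                 typ='series', dtype=object)
ids = s.apply(int)  # object dtype holding exact Python ints
print(int(ids['id']))
```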
Comment From: jxramos
I'm bumping into this same issue, where a 64-bit integer is being used as an id. Is there any workaround for overriding the inference? It would have been nice if the dtype specification drove an override, but type coercion must occur after the default inferred-type loading.
This comes up when collecting a system log archive on macOS High Sierra from a bash shell, rendered to text with JSON styling:
log collect
log show --style json > ~/syslogarchive.json
python
>>> import pandas
>>> dfSysLog = pandas.read_json('~/syslogarchive.json')
...
ValueError: Value is too big
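For what it's worth, the stdlib-first workaround also applies to record-style data like this. A minimal sketch (the traceID/eventMessage column names and two-record sample are invented stand-ins, not the real `log show --style json` schema):

```python
import json

import pandas as pd

# Invented stand-in for ~/syslogarchive.json; the point is only the
# oversized integer id in one of the records.
sample = ('[{"traceID": 10254939386542155531, "eventMessage": "boot"},'
          ' {"traceID": 1, "eventMessage": "shutdown"}]')

# Decode with the stdlib json module (arbitrary-precision ints), then let
# pandas flatten the already-decoded records.
records = json.loads(sample)
dfSysLog = pd.json_normalize(records)  # pandas >= 1.0
print(dfSysLog['traceID'].tolist())   # ids preserved exactly
```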