Code Sample


import pandas as pd

df = pd.DataFrame({'test': ['34343_43434']})
json = df.to_json(orient='records')
result = pd.read_json(json, orient='records')
print(result)

output: test 0 3434343434

Problem description

Pandas appears to be converting the initial string ("34343_43434") to an integer (3434343434) and removes the underscore to do it.

This only occurs when all characters in the string (besides the underscore) are integers. For example, if the initial value were "34343_43434X" then the output would correctly "34343_43434X". This issue does not occur when dtypes=False.

Expected Output

test

0 34343_43434

Output of pd.show_versions()

[paste the output of ``pd.show_versions()`` here below this line] INSTALLED VERSIONS ------------------ commit: None python: 3.6.1.final.0 python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel byteorder: little LC_ALL: None LANG: None LOCALE: None.None pandas: 0.20.3 pytest: 3.0.7 pip: 9.0.1 setuptools: 27.2.0 Cython: 0.25.2 numpy: 1.12.1 scipy: 0.19.0 xarray: None IPython: 5.3.0 sphinx: 1.5.6 patsy: 0.4.1 dateutil: 2.6.0 pytz: 2017.2 blosc: None bottleneck: 1.2.1 tables: 3.2.2 numexpr: 2.6.2 feather: None matplotlib: 2.0.2 openpyxl: 2.4.7 xlrd: 1.0.0 xlwt: 1.2.0 xlsxwriter: 0.9.6 lxml: 3.7.3 bs4: 4.6.0 html5lib: 0.999 sqlalchemy: 1.1.9 pymysql: None psycopg2: None jinja2: 2.9.6 s3fs: None pandas_gbq: None pandas_datareader: None

Comment From: bobhaffner

I was able to replicate this on python 3.6.2 but not on 3.5.3. Not sure why though

Comment From: bobhaffner

MIght be related to PEP 515

I'm able to do things like this in 3.6, but not in 3.5

In [5]: num = 34343_43434

In [6]: type(num)
Out[6]: int

In [7]: num
Out[7]: 3434343434

Comment From: jreback

this is an unfortunate side effect of the pep, but this looks valid to me.