Discovered by @ArtificialQualia in #31748
Previous behavior on 0.25.3:
>>> pd.Series([1], index=[np.nan]).to_json()
'{"null":1}'
>>> pd.Series([1], index=[pd.NaT]).to_json()
'{"null":1}'
>>> pd.Series([1], index=[None]).to_json()
'{"null":1}'
And now on master:
>>> pd.Series([1], index=[np.nan]).to_json()
'{"nan":1}'
>>> pd.Series([1], index=[pd.NaT]).to_json()
'{"null":1}'
>>> pd.Series([1], index=[None]).to_json()
'{"None":1}'
Note that pd.NA also just has its str representation written out:
>>> pd.Series([1], index=[pd.NA]).to_json()
'{"<NA>":1}'
pd.NaT still works because of this logic:
https://github.com/pandas-dev/pandas/blob/6d3cc14fc52685511329cedf1e11d15651ac6a8e/pandas/_libs/src/ujson/python/objToJSON.c#L1455
So we probably need to generalize that check to cover the other "null" values.
Comment From: jbrockmendel
on master the pd.NA case is now giving '{"null":1}'
@WillAyd what is the desired behavior here? do you have a gameplan for this?
Comment From: WillAyd
I'm not actively working on this but if someone wanted to I think the best solution is to take the NA check while encoding object keys:
https://github.com/pandas-dev/pandas/blob/6d3cc14fc52685511329cedf1e11d15651ac6a8e/pandas/_libs/src/ujson/python/objToJSON.c#L1455
And move it up in the function so it gets hit generally. The get_nat function should probably become a generic null check, importable from the missing cython file.
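To make the idea concrete, here is a rough Python-level sketch of what "a generic null check that runs before any other key handling" could look like. This is not the C code in objToJSON.c; `is_null_label`, `encode_label` and `labels_to_json` are hypothetical names invented for illustration, and the default check only covers `None` and float NaN using the standard library. Something like `pd.isna` could be passed as `is_null` to also cover `pd.NaT` and `pd.NA`, but that is an assumption about how one might wire it up, not the actual fix.

```python
import json
import math


def is_null_label(label):
    """Hypothetical generic null check for index labels (stdlib only)."""
    if label is None:
        return True
    # Only float NaN needs special handling among the stdlib types.
    return isinstance(label, float) and math.isnan(label)


def encode_label(label, is_null=is_null_label):
    """Turn an index label into a JSON object key."""
    # The null check runs first, so every null-like label takes the
    # same path regardless of its type.
    if is_null(label):
        return "null"
    return str(label)


def labels_to_json(labels, values, is_null=is_null_label):
    """Serialize label/value pairs as a compact JSON object."""
    obj = {encode_label(lab, is_null): val for lab, val in zip(labels, values)}
    return json.dumps(obj, separators=(",", ":"))


print(labels_to_json([float("nan")], [1]))
print(labels_to_json([None], [1]))
print(labels_to_json(["a"], [1]))
```

With the check hoisted to the top of `encode_label`, the `np.nan` and `None` cases both produce the `"null"` key that 0.25.3 gave, while ordinary labels are untouched.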
Comment From: simonjayhawkins
Previous behavior on 0.25.3:
```python
>>> pd.Series([1], index=[np.nan]).to_json()
'{"null":1}'
>>> pd.Series([1], index=[pd.NaT]).to_json()
'{"null":1}'
>>> pd.Series([1], index=[None]).to_json()
'{"null":1}'
```
d75ee703efc0d201af2f05bd166b0f58ec5977b5 is the first bad commit
commit d75ee703efc0d201af2f05bd166b0f58ec5977b5
Author: William Ayd william.ayd@gmail.com
Date: Sat Aug 24 00:38:17 2019 +0200
Remove Encoding of values in char** For Labels (#27618)
Comment From: simonjayhawkins
on master the pd.NA case is now giving
'{"null":1}'
this was fixed by #32214
35537dddab75f82b83d4dc05f0f42404f459673a is the first new commit
commit 35537dddab75f82b83d4dc05f0f42404f459673a
Author: Daniel Saxton 2658661+dsaxton@users.noreply.github.com
Date: Wed Feb 26 06:39:55 2020 -0600
BUG: Cast pd.NA to pd.NaT in to_datetime (#32214)
Comment From: simonjayhawkins
pd.NaT still works because of this logic:
https://github.com/pandas-dev/pandas/blob/6d3cc14fc52685511329cedf1e11d15651ac6a8e/pandas/_libs/src/ujson/python/objToJSON.c#L1455
this was also initially broken by #27618
>>> import pandas as pd
>>>
>>> pd.__version__
'0.26.0.dev0+228.gd75ee703e'
>>>
>>> pd.Series([1], index=[pd.NaT]).to_json()
'{"-9223372036854":1}'
>>>
however, this was fixed in #30977
9c33464768071184a6e26dcae79f75b1abf840a0 is the first new commit
commit 9c33464768071184a6e26dcae79f75b1abf840a0
Author: William Ayd will_ayd@innobi.io
Date: Sat Jan 18 07:34:12 2020 -0800
JSON Date Handling 1.0 Regressions (#30977)