Hi - I recently upgraded to pandas version 0.21.0 from version 0.20.3 and got the following error when writing a dataframe with to_json:
OverflowError: Unsupported UTF-8 sequence length when encoding string
The traceback looks like this:
As you can see it fails on converting dates(?). My dataframe has no dates.
Here is how my df looks:
However, to_json does work for these cases:
df = pd.DataFrame(np.random.randint(0,100,size=(5, 4)), columns=list('ABCD'))
and also for a standard string
df['some_string'] = 'blah'
- The same exact dataframe does not throw an error in 0.20.3.
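For reference, the two working cases above can be combined into one self-contained snippet. This is only a sketch: the variable names numeric_json and with_string_json are illustrative, and the data is random, so it does not reproduce the failing dataframe.

```python
import numpy as np
import pandas as pd

# Numeric-only frame: the first case reported as working.
df = pd.DataFrame(np.random.randint(0, 100, size=(5, 4)), columns=list("ABCD"))
numeric_json = df.to_json()

# Adding a plain ASCII string column: the second case reported as working.
df["some_string"] = "blah"
with_string_json = df.to_json()

print(with_string_json)
```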
Comment From: gfyoung
@gryBox : Thanks for reporting this! Could you provide the exact DataFrame that you were using OR provide a smaller one that we can use to confirm the error?
Comment From: gryBox
@gfyoung Yes. Would putting a link to a tiny .csv on my GitHub work?
Comment From: gfyoung
@gryBox : Yes, that would. You can also upload one directly into the issue BTW.
Comment From: gryBox
@gfyoung Here you go. data
Comment From: gfyoung
@gryBox : I can't replicate the error on master using this dataset:
from pandas import read_csv

df = read_csv("<your-data-file>")
df.to_json() # No error
Comment From: gryBox
@gfyoung I was afraid of that. Can you try it via Jupyter notebook 5.2? If not, I will do more digging.
Comment From: gfyoung
"I was afraid of that"
What do you mean? Does this example raise for you when you try to execute it?
As for Jupyter, I was unable to reproduce it there either.
Comment From: gryBox
I meant that it could be my system environment. Let me investigate some more.
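One possible way to narrow this down, sketched under assumptions: serialize each column on its own and record which ones raise. The helper name columns_failing_to_json and the sample frame are hypothetical and not from this thread; on the affected environment the real dataframe would be passed in instead. Printing the pandas version alongside helps compare environments.

```python
import pandas as pd


def columns_failing_to_json(df):
    """Return (column, error) pairs for columns whose to_json call raises.

    Hypothetical helper for illustration; not part of pandas.
    """
    failures = []
    for col in df.columns:
        try:
            # Serialize a single-column frame so a failure points at one column.
            df[[col]].to_json()
        except (OverflowError, ValueError, TypeError) as exc:
            failures.append((col, f"{type(exc).__name__}: {exc}"))
    return failures


# Illustrative stand-in data; replace with the dataframe that fails.
sample = pd.DataFrame({"A": [1, 2, 3], "some_string": ["blah"] * 3})
failures = columns_failing_to_json(sample)

print("pandas version:", pd.__version__)
print("failing columns:", failures)
```

If a single column turns up in the result, inspecting its values (for example, unusual byte sequences in strings) would be a reasonable next step.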
Comment From: gfyoung
Okay, sounds good. Feel free to re-open if something turns up.