I use pandas.read_csv() to read the URL of an uploaded CSV file.
However, read_csv() sometimes returns an empty DataFrame with NaN values.
I think this is caused by the MIME type of the CSV, application/vnd.ms-excel, because CSVs with the MIME type text/csv do not have this issue.
Is this a known issue?
I found this post that seems to describe the same issue.
Is this a read_csv() issue, or something else?
The traceback:
Traceback (most recent call last):
  File "/env/local/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/env/local/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/env/local/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/env/local/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/env/local/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/vmagent/app/main.py", line 144, in visualize_file
    df = process(df)
  File "/home/vmagent/app/main.py", line 122, in process
    result = foo(df)
  File "/home/vmagent/app/main.py", line 85, in foo
    len_ts = int(np.array(data)[:, 0].max())
ValueError: cannot convert float NaN to integer
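The last frame of the traceback can be reproduced in isolation. This is a minimal sketch, assuming the first column of the parsed data came back entirely NaN; the array below is made up for illustration and is not the actual data.

```python
import numpy as np

# Hypothetical stand-in for the parsed CSV: the first column is all NaN.
data = np.array([[np.nan, 1.0], [np.nan, 2.0]])

# max() propagates NaN, so the result is NaN rather than a number.
first_col_max = data[:, 0].max()

try:
    len_ts = int(first_col_max)
except ValueError as exc:
    # int() cannot represent NaN, which is the error seen in foo().
    message = str(exc)
    print(message)
```

If this is what happens in foo(), the exception is only a symptom: the question becomes why the first column contains no numeric values after read_csv().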
And the code:
@app.route('/upload', methods=['GET', 'POST'])
def upload_file():
    # Upload file to Google Cloud Storage
    ......

import numpy as np

def foo(data):
    len_ts = int(np.array(data)[:, 0].max())
    ......
    return df

def process(df):
    df = np.array(df)
    ......
    df = foo(df)
    return bar(df)

@app.route('/graph/<filename>')
def visualize_file(filename):
    import pandas as pd
    # Read the uploaded CSV straight from its storage URL
    df = pd.read_csv('https://storage.cloud.google.com/project-name/' + filename + '.csv')
    df = process(df)
    df = pd.DataFrame(df, columns=['col1', 'data1', 'data2'])
    chart_data = df.to_dict(orient='records')
    chart_data = json.dumps(chart_data, indent=2)
    data = {'chart_data': chart_data}
    return render_template("graph.html", data=data)
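One way to narrow this down might be to validate what was actually downloaded before handing it to process(). The following is a minimal sketch, assuming the raw response body is available as bytes; load_csv_checked and the sample payloads are hypothetical and not part of the original app.

```python
import io

import pandas as pd


def load_csv_checked(raw):
    """Parse CSV bytes and fail early if the result is unusable.

    Hypothetical helper for illustration only.
    """
    # A body that starts with "<" is likely an HTML page, not CSV data.
    if raw.lstrip()[:1] == b"<":
        raise ValueError("response looks like HTML, not CSV")
    df = pd.read_csv(io.BytesIO(raw))
    # Reject frames with no rows or with an all-NaN first column.
    if df.empty or df.iloc[:, 0].isna().all():
        raise ValueError("parsed CSV has no usable values in its first column")
    return df


# Made-up payload with the same column names as the view above.
good = load_csv_checked(b"col1,data1,data2\n1,2.0,3.0\n2,4.0,5.0\n")
print(good.shape)
```

Failing at this point would show whether the problem lies in the downloaded content or in the later NumPy processing.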
Comment From: jreback
you would have to show a reproducible example
Comment From: hangyao
@jreback I can't reliably reproduce the error. Sometimes there was no error for a CSV with application/vnd.ms-excel. I tried renaming the failing CSV by adding or removing _ or - in the filename and re-uploading it, and sometimes that made it work. I don't know whether the MIME type causes the issue.
Comment From: gfyoung
@hangyao : Unfortunately, without a reproducible example (i.e. we need to be able to take whatever code you provide and run it ourselves with no changes), we can't help you out too much here. The SO post is also not particularly useful for us to diagnose any issue. I am inclined to close this if that's okay.
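For illustration, a self-contained report usually inlines the data so that it runs without network access or private files. This is a minimal sketch of that shape; the sample CSV is made up and does not reproduce the reported failure.

```python
import io

import pandas as pd

# Made-up data, inlined so anyone can run the snippet unchanged.
csv_text = "col1,data1,data2\n1,0.5,0.7\n2,0.6,0.8\n"

df = pd.read_csv(io.StringIO(csv_text))

# Include the pandas version and the parsed result in the report.
print(pd.__version__)
print(df)
print(df.dtypes)
```

Replacing csv_text with the first few lines of a failing file, if it can be shared, would turn this into something others can run and compare.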