Feature Type
- [ ] Adding new functionality to pandas
- [ ] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
I often copy-paste code like
pd.read_csv('some/long/file/path/data.csv')
and then just change the file name to a file with a different extension, forgetting to also change pd.read_csv.
pd.read_csv('some/long/file/path/other-data.json') # seems to be stuck in an infinite loop
Currently pd.read_csv seems to run forever trying to read a JSON file, without raising an error or ever returning.
Feature Description
pd.read_csv could issue a warning on suspicious extensions:
pd.read_csv('some/long/file/path/other-data.json')
> Warning: This looks like the wrong reader for `.json` extension
Alternative Solutions
Could also add a flag warnings: 'raise' | 'warn' | 'error' = 'warn' to control this behavior.
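For illustration, the proposed check can be sketched on the user side, outside pandas. Everything here is hypothetical: the function name check_extension, the on_suspicious parameter with the values "ignore", "warn", and "error", and the list of suspicious extensions are assumptions for this sketch, not part of the pandas API.

```python
import warnings
from pathlib import Path

# Hypothetical list of extensions that suggest a non-CSV format.
SUSPICIOUS_EXTENSIONS = {".json", ".xlsx", ".parquet", ".feather"}


def check_extension(path, on_suspicious="warn"):
    """Warn or raise if `path` has an extension that looks wrong for a CSV reader.

    Hypothetical helper, not part of pandas.
    on_suspicious: "ignore" (do nothing), "warn" (UserWarning), or "error" (ValueError).
    Returns the lowercased extension that was inspected.
    """
    if on_suspicious not in {"ignore", "warn", "error"}:
        raise ValueError(f"invalid on_suspicious value: {on_suspicious!r}")

    # Only the last suffix is inspected in this sketch.
    ext = Path(str(path)).suffix.lower()

    if ext in SUSPICIOUS_EXTENSIONS and on_suspicious != "ignore":
        msg = f"This looks like the wrong reader for `{ext}` extension"
        if on_suspicious == "error":
            raise ValueError(msg)
        # stacklevel=2 points the warning at the caller's line.
        warnings.warn(msg, UserWarning, stacklevel=2)
    return ext
```

A caller would run check_extension(path) right before pd.read_csv(path), so a mistyped reader is flagged before any parsing starts.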
Additional Context
No response
Comment From: phofl
Hi, thanks for your report. I don't think we should raise warnings based on the file extension. Basically every file extension could be some sort of CSV file. It would be better to figure out why we are running into an infinite loop here.
Comment From: janosh
I think CSV files with a .json extension are very rare. Seems like a 99+% good heuristic. But not looping forever and throwing some error right away would also be good.
Comment From: phofl
The bigger problem is deciding which extensions we want to allow and which we don't; this is tricky. I don't think we want to start this.
Comment From: janosh
Forgot to mention the JSON file was gzipped (.json.gz extension).
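With a compressed file, the last suffix is .gz rather than .json, so any extension-based heuristic would have to look past the compression suffix first. A minimal sketch of that step follows; the helper name effective_extension and the set of compression suffixes are assumptions for illustration, not pandas internals.

```python
from pathlib import Path

# Assumed, non-exhaustive set of compression suffixes to look past.
COMPRESSION_SUFFIXES = {".gz", ".bz2", ".zip", ".xz", ".zst"}


def effective_extension(path):
    """Return the last non-compression suffix of `path`, lowercased.

    Hypothetical helper, not part of pandas. Returns "" if no such suffix exists.
    """
    suffixes = [s.lower() for s in Path(str(path)).suffixes]
    # Drop trailing compression suffixes such as ".gz".
    while suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        suffixes.pop()
    return suffixes[-1] if suffixes else ""


print(effective_extension("some/long/file/path/other-data.json.gz"))  # .json
print(effective_extension("some/long/file/path/data.csv"))            # .csv
```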