Feature Type

  • [ ] Adding new functionality to pandas

  • [ ] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

I often copy-paste code like

pd.read_csv('some/long/file/path/data.csv')

and then just change the file name to a file with a different extension, forgetting to also change pd.read_csv.

pd.read_csv('some/long/file/path/other-data.json')  # seems to be stuck in an infinite loop

Currently pd.read_csv seems to run forever trying to read a JSON file without raising an error or ever returning.

Feature Description

pd.read_csv could issue a warning on suspicious extensions:

pd.read_csv('some/long/file/path/other-data.json')
> Warning: This looks like the wrong reader for `.json` extension
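Until something like this exists in pandas, the check can be approximated on the caller's side. Below is a minimal sketch, not pandas API: the helper names, the set of "suspicious" extensions, and the list of compression suffixes are all assumptions made for illustration.

```python
import warnings
from pathlib import PurePath

# Hypothetical set, chosen for illustration only; pandas has no such table.
SUSPICIOUS_FOR_CSV = {".json", ".parquet", ".xlsx", ".feather", ".pkl"}
# Compression suffixes to look past, e.g. "data.json.gz" -> ".json".
COMPRESSION_SUFFIXES = {".gz", ".bz2", ".zip", ".xz", ".zst"}


def data_suffix(path):
    """Return the last suffix that is not a compression suffix, lowercased."""
    for suffix in reversed(PurePath(path).suffixes):
        suffix = suffix.lower()
        if suffix not in COMPRESSION_SUFFIXES:
            return suffix
    return ""


def warn_if_suspicious_for_csv(path):
    """Warn (and return True) when `path` looks like it needs another reader."""
    suffix = data_suffix(path)
    if suffix in SUSPICIOUS_FOR_CSV:
        warnings.warn(
            f"This looks like the wrong reader for `{suffix}` extension",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

Calling warn_if_suspicious_for_csv(path) right before pd.read_csv(path) would flag the copy-paste mistake described above while leaving ordinary .csv paths alone.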

Alternative Solutions

pandas could also add a flag, warnings: 'raise' | 'warn' | 'error' = 'warn', to control this behavior.
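A rough sketch of how such a flag might behave, written as a standalone helper rather than as a real read_csv parameter. The function name, the `mode` argument, the single `.json` rule, and the choice of ValueError are all assumptions; 'raise' and 'error' are treated as synonyms here.

```python
import warnings
from pathlib import PurePath


def check_csv_extension(path, mode="warn"):
    """Hypothetical helper: react to a `.json` suffix according to `mode`."""
    if mode not in {"raise", "warn", "error"}:
        raise ValueError(f"unknown mode: {mode!r}")
    suffixes = [s.lower() for s in PurePath(path).suffixes]
    if ".json" not in suffixes:
        return  # nothing suspicious under this sketch's single rule
    message = "This looks like the wrong reader for `.json` extension"
    if mode == "warn":
        warnings.warn(message, UserWarning, stacklevel=2)
    else:  # "raise" and "error" are treated as synonyms in this sketch
        raise ValueError(message)
```

With the default mode the call only warns; passing mode="error" turns the same condition into an exception, which may suit scripts that should stop early.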

Additional Context

No response

Comment From: phofl

Hi, thanks for your report. I don't think we should raise warnings based on the file extension. Basically a file with any extension could be some sort of CSV file. It would be better to figure out why we are running into an infinite loop here.

Comment From: janosh

I think CSV files with a .json extension are very rare, so this seems like a heuristic that would be right 99+% of the time. But raising an error right away instead of looping forever would also be good.

Comment From: phofl

The bigger problem is deciding which extensions we want to allow and which we don't, and that is tricky. I don't think we want to start down this path.

Comment From: janosh

Forgot to mention the JSON file was gzipped (.json.gz extension).
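Since pd.read_csv defaults to compression='infer', a .gz suffix means the file is decompressed before parsing, so the parser sees the raw JSON text. For anyone trying to reproduce this, here is a small stdlib-only sketch that builds a gzipped JSON file and peeks at its first characters. The helper name and the sample contents are made up for illustration, and the check is a crude content sniff, not anything pandas does.

```python
import gzip
import json
import os
import tempfile


def looks_like_json(path):
    """Return True if the first non-whitespace character is '{' or '['."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        head = handle.read(1024)
    return head.lstrip()[:1] in ("{", "[")


# Build a tiny stand-in for the file from the report (contents are made up).
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "other-data.json.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as handle:
    json.dump([{"a": 1, "b": 2}], handle)

print(looks_like_json(sample_path))  # True
```

A content check like this sidesteps the question of which extensions to allow, at the cost of reading the start of the file twice.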