Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas

# You can use any simple csv for demonstration, but this FAILS
df = pandas.read_csv('file://Users/kvelayutham/Documents/testing/yellow_tripdata_2015-01.csv', nrows=10)

# This seems to work with the extra /
df = pandas.read_csv('file:///Users/kvelayutham/Documents/testing/yellow_tripdata_2015-01.csv', nrows=10)

Issue Description

I'm not sure whether this is intended behavior (please close if it is), but when a path is passed with the file:// prefix as is, it isn't parsed correctly and urllib throws an error. My gut says there is a small bug in the order of the if-statement checks in pandas/io/common.py.

Expected Behavior

I would expect relative paths to work with the file:// prefix, but I could be wrong here.

Installed Versions

INSTALLED VERSIONS
------------------
commit : e8093ba372f9adfe79439d90fe74b0b5b6dea9d6
python : 3.9.12.final.0
python-bits : 64
OS : Darwin
OS-release : 21.3.0
Version : Darwin Kernel Version 21.3.0: Wed Jan 5 21:37:58 PST 2022; root:xnu-8019.80.24~20/RELEASE_ARM64_T6000
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.4.3
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 61.2.0
pip : 22.1.2
Cython : None
pytest : 7.1.2
hypothesis : None
sphinx : 5.0.1
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli :
fastparquet : 0.8.1
fsspec : 2022.5.0
gcsfs : None
markupsafe : 2.1.1
matplotlib : 3.5.2
numba : None
numexpr : 2.8.1
odfpy : None
openpyxl : 3.0.10
pandas_gbq : 0.17.6
pyarrow : 8.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2022.5.0
scipy : 1.8.1
snappy : None
sqlalchemy : 1.4.37
tables : 3.7.0
tabulate : None
xarray : 2022.3.0
xlrd : 2.0.1
xlwt : None
zstandard : None

Comment From: pyrito

It looks like this might be expected behavior?

Comment From: mroeschke

I think this follows the file URI scheme behavior: https://en.wikipedia.org/wiki/File_URI_scheme#How_many_slashes?

// should be followed by the hostname while /// implies an empty hostname
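
To illustrate the point about slashes, here is a minimal sketch using only the standard library's urllib.parse.urlparse. It is not pandas code; it just shows how the two URLs from the report are split, and the variable names are made up for the example.

```python
from urllib.parse import urlparse

# The two URLs from the reproducible example in this issue.
two_slashes = urlparse(
    "file://Users/kvelayutham/Documents/testing/yellow_tripdata_2015-01.csv"
)
three_slashes = urlparse(
    "file:///Users/kvelayutham/Documents/testing/yellow_tripdata_2015-01.csv"
)

# With two slashes, "Users" is read as the hostname, not as a directory.
print(two_slashes.netloc)    # Users
print(two_slashes.path)      # /kvelayutham/Documents/testing/yellow_tripdata_2015-01.csv

# With three slashes, the hostname is empty and the full path survives.
print(three_slashes.netloc)  # (empty string)
print(three_slashes.path)    # /Users/kvelayutham/Documents/testing/yellow_tripdata_2015-01.csv
```

So in the two-slash form the first path component is consumed as a host, which would explain why the file is not found where the reporter expects it.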

Comment From: twoertwein

This might be because we try to infer whether we should be using urllib or fsspec to open the file.

It might be nice to remove this urllib code https://github.com/pandas-dev/pandas/blob/9757d1f93faaa517161fd719e884be7344c18b62/pandas/io/common.py#L352 and always use fsspec. But there are several problems with that: 1) fsspec is an optional dependency, and 2) behavior would change (urllib infers some compression, and the two might take different storage_options).
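
A rough sketch of the kind of inference described above, to make the trade-off concrete. This is not the actual logic in pandas/io/common.py: the function name pick_opener, the URLLIB_SCHEMES set, and the fsspec_available parameter are all hypothetical and exist only for this illustration.

```python
from importlib.util import find_spec
from urllib.parse import urlparse

# Assumption for this sketch: schemes that would be handed to urllib.
URLLIB_SCHEMES = {"http", "https", "ftp", "file"}


def pick_opener(path, fsspec_available=None):
    """Return 'local', 'urllib', or 'fsspec' for a path or URL (hypothetical helper)."""
    if fsspec_available is None:
        # fsspec is optional, so detect it instead of importing it.
        fsspec_available = find_spec("fsspec") is not None

    if "://" not in path:
        return "local"  # plain filesystem path, no URL scheme

    scheme = urlparse(path).scheme
    if scheme in URLLIB_SCHEMES:
        return "urllib"
    if fsspec_available:
        return "fsspec"
    raise ImportError(f"fsspec is needed to open {scheme}:// URLs")


print(pick_opener("file:///tmp/data.csv"))                             # urllib
print(pick_opener("s3://bucket/data.csv", fsspec_available=True))      # fsspec
```

In a scheme like this, a file:// URL never reaches fsspec, so whatever urllib makes of the hostname part is what the user ends up seeing.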

Comment From: pyrito

> I think this follows the file URI scheme behavior: https://en.wikipedia.org/wiki/File_URI_scheme#How_many_slashes?
>
> // should be followed by the hostname while /// implies an empty hostname

That makes sense. But as @twoertwein alluded to, maybe the error message should reflect that instead of surfacing the urllib error. This may not even be a big issue, though, so please feel free to close it.
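
One possible shape for a clearer error message, as a sketch only. The helper check_file_uri is hypothetical and not part of pandas; it assumes that a non-empty host other than localhost in a file:// URL is most likely a missing slash.

```python
from urllib.parse import urlparse


def check_file_uri(uri):
    """Raise a descriptive error if a file:// URI appears to name a remote host.

    Hypothetical helper for illustration; returns the URI unchanged if it looks fine.
    """
    parts = urlparse(uri)
    if parts.scheme != "file":
        return uri  # not a file URI, nothing to check

    if parts.netloc not in ("", "localhost"):
        # "file://Users/..." parses with host "Users"; suggest the three-slash form.
        suggestion = f"file:///{parts.netloc}{parts.path}"
        raise ValueError(
            f"{uri!r} is read as a file on host {parts.netloc!r}. "
            f"For a local file, did you mean {suggestion!r}?"
        )
    return uri


try:
    check_file_uri("file://Users/kvelayutham/Documents/testing/yellow_tripdata_2015-01.csv")
except ValueError as err:
    print(err)
```

Calling something like this before handing the URL to urllib would point the user at the extra slash instead of a lower-level urllib failure.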