Pandas read_csv from HTTPs + basic-auth + custom port throws an error (urlopen error)

Code Sample, a copy-pastable example if possible

import pandas as pd
df = pd.read_csv('https://user-name:pwd@my.fqdn.com:8080/get_content.csv')
# fails with urlopen error [Errno 11003] getaddrinfo failed Python2.7 as well as Python 3.x

Problem description

HTTPS basic auth is very common. This URL format works in Excel, other text editors, etc. This url works in requests library. It seems like the scenario doesnt work because the underlying urlopen doesnt work

# I think it fails because the underlying code here fails in pandas/io/common.py:
from urllib2 import urlopen as _urlopen
b = _urlopen('https://un-name:pwd@my.fqdn.com:8080/get_content.csv')
# fails with urlopen error [Errno 11003] getaddrinfo failed

see stackoverflow issue urllib basic auth Only way to overcome this is using requests + StringIO ?

Expected Output

be able to get a CSV loaded dataframe

Output of `pd.show_versions()`

python: 2.7.13.final.0 python-bits: 64 pandas: 0.20.1

Comment From: gfyoung

HTTPS basic auth is very common

Apparently not with pandas :wink:

But joking aside, that's an awkward problem to have (can't replicate since I have no endpoint against which to test this). Also, this is not easy to test without publicly providing credentials to an endpoint.

I read the SO post, and I was wondering: can you confirm that requests was the only way for you to get the request to work? I read a bunch of solutions in that discussion, so I just wanted to check.

If that is the case, I suppose we could implement a wrapper that makes the request and then returns the response content in place of urlopen, though doing it in a clean way will be the challenge. Not to mention, we would need to explicitly add requests as a Python 2.x dependency.

Comment From: skynss

With requests and StringIO ot works well. I am travelling today but over next 1-2 days i can post a reproducable test url or simple reprocode. I suspect it repros with http and no real auth.. because i suspect this is simply a parsing error in hostname and port even before outbound call initiates. Can confirm in next 1-2 days when i am back unless you beat me to it

Comment From: gfyoung

Yeah if you could provide repro code that would be useful since it is hard to reproduce without our own endpoint. However, if you could confirm none of the other solutions proposed in the SO work besides the requests ones, that would also be useful.

Comment From: skynss

It appears to me that an existing url is not needed, because the issue is just in the parsing of the URL. eg:

df = pd.read_csv('http://username:pwd@cnn.com:8080/get_content.csv')
# returns URLError: <urlopen error [Errno 11003] getaddrinfo failed>
df = pd.read_csv('http://username:pwd@cnn.com/get_content.csv')
# returns InvalidURL: nonnumeric port: 'pwd@cnn.com' ( failing to find port number)

However, simplest way to fake it is: Create a text file: aaa.csv and put some csv contents

with open('/path1/aaa.csv', 'w') as f:
  f.write( 'animal,bird\ncat,pigeon\nmonkey,swan')

cd /path1 run

python -m SimpleHTTPServer 8080

Now the url http://localhost:8080/aaa.csv should exist

import requests
import pandas as pd
u1 = 'http://localhost:8080/aaa.csv'
print( requests.get(u1).text) # as expected prints contents of aaa.csv
df = pd.read_csv(u1)  # as expected, loads contents of aaa.csv into df

u2 = 'http://uname:pwd@localhost:8080/aaa.csv'
print( requests.get(u2).text) # as expected prints contents of aaa.csv (ignores unnecessary uname pwd)
df = pd.read_csv(u2)  # URLError: <urlopen error [Errno 11003] getaddrinfo failed>

I'll provide a working positive test case url later tomorrow once I set it up. It would be good to engineer a fix to allow self-signed certs for testing too - eg: requests has verify=False. Helpful for testing.

Comment From: TomAugspurger

So since we can't use requests, we would follow https://stackoverflow.com/a/4188709/1889400

detect and extract the username / password in the url
setup a urllib.Request context for the (stripped) url in https://github.com/pandas-dev/pandas/blob/125c414389320dc67ecaffddc65878c01822064e/pandas/io/common.py#L197
encode the user / password and add to the Authorization header
urlopen the request context

Comment From: gfyoung

@TomAugspurger :

1) Why can't we use requests ?

2) Does this work with the localhost repro example that @skynss provided?

Comment From: TomAugspurger

Why can't we use requests ?

This alone isn't worth adding it as a dependency I think.

Comment From: gfyoung

This alone isn't worth adding it as a dependency I think.

That's very different from "we can't use requests"

For starters, it's builtin with Python 3.x, so we would only be adding it as a dependency for Python 2.x. However, if we could find a way to use just urlopen, that would be nice though.

Comment From: TomAugspurger

Does this work with the localhost repro example that @skynss provided?

Can't really test that since I'd have to write the implementation first :) I think it'll work though.

For starters, it's builtin with Python 3.x,

I don't think it is... It's documented as the recommended way for making high-level http calls though (but still requires a separate install).

Comment From: gfyoung

I don't think it is... It's documented as the recommended way for making high-level http calls though (but still requires a separate install).

Oh right, I stand corrected. 😄

Comment From: skynss

Here are working urls using self signed certs. Would be good to get it to work with self-signed certs too as that can be very useful especially in testing. Please let me know once you are done testing.. so I can shut down the demo because thats all it is up for.

import pandas as pd
import requests
from io import StringIO

# both urls use self signed cert. Both will remain working for few days
u1 = 'https://pandasusr:pandaspwd@pandastest.mooo.com:5000/aaa.csv' # non default ssl port 
u2 = 'https://pandasusr:pandaspwd@pandastest.mooo.com/aaa.csv' # default ssl port


r1 = requests.get(u1, verify=False)
print(r1.text) # prints ok
df1 = pd.read_csv(StringIO(r1.text))  # works

r2 = requests.get(u1, verify=False)
print(r2.text) # prints ok
df2 = pd.read_csv(StringIO(r2.text)) # works

# without requests
df1 = pd.read_csv(u1) # URLError: <urlopen error [Errno 11003] getaddrinfo failed>

df2 = pd.read_csv(u2)  # InvalidURL: nonnumeric port: 'pandaspwd@pandastest.mooo.com'

Comment From: skynss

Is anyone going to use the live endpoints referred above (and below) to repro and test? If not, I will shut down the server in the next 4 days. I havent seen anyone attempt to use it over past 11 days.


u1 = 'https://pandasusr:pandaspwd@pandastest.mooo.com:5000/aaa.csv' # non default ssl port 
u2 = 'https://pandasusr:pandaspwd@pandastest.mooo.com/aaa.csv' # default ssl port

Comment From: gfyoung

@skynss : Sorry about that! I imagine that outside work has caught up with a bunch of us (including myself). I'll see if I can look at it later today.

Comment From: gfyoung

@skynss : Can replicate the issues you were experiencing.

Comment From: skynss

Please run the source code I pasted above.. and I can replicate the issues I am experiencing.

Comment From: gfyoung

Please run the source code I pasted above.. and I can replicate the issues I am experiencing.

Did you read the comment I made above here?

Comment From: skynss

I am not sure which comment you are referring to.. as the link doesnt work. But trying my best to answer..

I read the SO post, and I was wondering: can you confirm that requests was the only way for you to get the request to work? I read a bunch of solutions in that discussion, so I just wanted to check

The only way I know that worked for me is 1) get txt content 2) load it in StringIO 3) give the StringIO buffer to pandas to read. For step 1) to get text content, I imagine any method would work. I used requests and that worked. And the repro code above follows that step. And I just verified that if I copy paste the repro code, it replicates the issue.

If I didnt answer your question, kindly re-state your question.

Comment From: gfyoung

@skynss : Not sure why the link doesn't work. It's just a URL to an earlier comment I made. However, I'm wondering if you can access those files by passing into urlopen authentication as @TomAugspurger mentioned here.

A solution that doesn't use requests would be to do that instead, so that works for you, we could go that way.

Comment From: skynss

The following working code does not depend on requests library. Just copy / paste and run to verify.

import urllib2, base64, ssl
from urlparse import urlparse
#from io import StringIO # python 3.x
from StringIO import StringIO
import pandas as pd

def split_uname_from_url(url_with_uname):
    o = urlparse( url_with_uname)
    uname = o.username
    pwd = o.password
    # create url without username and pwd
        usrch = '{}:{}@{}'.format( uname, pwd, o.hostname)
    url_no_usrpwd = url_with_uname.replace( usrch , o.hostname)
    return uname, pwd, url_no_usrpwd

def get_https_basic_auth_ignore_invalid_cert( url_with_uname, verify_ssl=True):
    uname, pwd, url_no_usrpwd = split_uname_from_url(url_with_uname)
    print('Calling [{}] -- uname:[{}] -- pwd[{}]'.format(url_no_usrpwd, uname, pwd))
    request = urllib2.Request( url_no_usrpwd )
    base64string = base64.encodestring('%s:%s' % (uname, pwd)).replace('\n', '')
    request.add_header("Authorization", "Basic %s" % base64string)   
    # I hope pandas can support self signed certs too 
        # because it is very difficult to get official SSL certs in testing scenarios
    if verify_ssl:
        result = urllib2.urlopen(request) 
    else: # in case of self signed SSL certificates. 
        result = urllib2.urlopen(request, context=ssl._create_unverified_context() )
    txt = result.read()
    return txt


url_with_uname = 'https://pandasusr:pandaspwd@pandastest.mooo.com:5000/aaa.csv'
csv_txt = get_https_basic_auth_ignore_invalid_cert( url_with_uname, verify_ssl=False) 
df = pd.read_csv( StringIO(csv_txt.strip()) ) # forgot to close the StringIO buffer

Comment From: gfyoung

@skynss @gfyoung : This works, but it's very Python 2-oriented and would need to be generalized for Python 3. I think if you can do that, that would make things a lot easier to incorporate.

Comment From: skynss

The following code:

Works on py 2.x py 3.x (verified on 3.6.x)
Is live working demo that you can instantly test by simply copy and paste.
does not depend on requests library
works with both valid signed SSL certs as well as allow bypass of verification for self-signed SSL certs

import sys
import ssl

def split_uname_from_url(url_with_uname):
    try:
        from urlparse import urlparse
    except:
        from urllib.parse import urlparse
    o = urlparse( url_with_uname)
    uname = o.username
    pwd = o.password
    # create url without username and pwd   
    usrch = '{}:{}@{}'.format( uname, pwd, o.hostname)
    url_no_usrpwd = url_with_uname.replace( usrch , o.hostname)
    return uname, pwd, url_no_usrpwd

def get_https_basic_auth_ignore_invalid_cert( url_with_uname, verify_ssl=True):
    uname, pwd, url_no_usrpwd = split_uname_from_url(url_with_uname)
    print('Calling [{}] -- uname:[{}] -- pwd[{}]'.format(url_no_usrpwd, uname, pwd))
    if sys.version_info[0] < 3:
        fn= get_py2_https_basic_auth_ignore_invalid_cert
    else:
        fn = get_py3_https_basic_auth_ignore_invalid_cert
    return fn( uname, pwd, url_no_usrpwd, verify_ssl=verify_ssl)

def get_py2_https_basic_auth_ignore_invalid_cert( uname, pwd, url_no_usrpwd, verify_ssl=True):
    import urllib2, base64
    request = urllib2.Request( url_no_usrpwd )
    base64string = base64.encodestring('%s:%s' % (uname, pwd)).replace('\n', '')
    request.add_header("Authorization", "Basic %s" % base64string)   
    # I hope pandas can support self signed certs too 
    if verify_ssl:
        result = urllib2.urlopen(request) 
    else: # in case of self signed SSL certificates. 
        result = urllib2.urlopen(request, context=ssl._create_unverified_context() )
    return result.read()

def get_py3_https_basic_auth_ignore_invalid_cert( uname, pwd, url_no_usrpwd, verify_ssl=True):
    import urllib.request
    passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passman.add_password(None, url_no_usrpwd, uname, pwd)
    authhandler = urllib.request.HTTPBasicAuthHandler(passman)
    if verify_ssl:
        opener = urllib.request.build_opener(authhandler)
    else:
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        opener = urllib.request.build_opener(authhandler, urllib.request.HTTPSHandler(context=context))
    urllib.request.install_opener(opener)
    res = urllib.request.urlopen(url_no_usrpwd)
    return res.read().decode('utf-8')


def csv_to_df(csv_txt, **kwargs):
    '''
    @param csv_txt: text of csv rows.
    @param kwargs: to pass to pd.read_csv
    @return df
    '''
    import pandas as pd
    try: 
        from StringIO import StringIO #python2.7
    except:  
        from io import StringIO #python3.x. 
    buf = None
    df = None
    try:
        buf = StringIO(csv_txt)
        df = pd.read_csv(buf, **kwargs)
    finally:
        if buf:
            try:
                buf.close()
            except:
                pass
    return df


url_with_uname = 'https://pandasusr:pandaspwd@pandastest.mooo.com:5000/aaa.csv'
csv_txt = get_https_basic_auth_ignore_invalid_cert( url_with_uname, verify_ssl=False) 
df = csv_to_df( csv_txt.strip() ) 
print(df.to_string(index=False))

Comment From: gfyoung

@skynss : Awesome! Thanks for doing this (I can check this later today). Now that we have something that acts as a workaround, I think the next step is seeing whether you can incorporate parts of this into the existing codebase. Want to give that a shot?

Comment From: skynss

@gfyoung I am in midst of travel. Feel free to check in incorporate, i wont get chance to look until couple of days.

Comment From: skynss

The following is updated code which makes it easier to merge into pandas/io/commons.py and call _urlopen with new params I am not sure how to capture verify_ssl=False user input from read_csv to allow user to bypass SSL cert verification.

import sys
is_py3 = sys.version_info[0] >= 3 # replace with 'compat.PY3'

## BEGIN SECTION modifications to pandas/io/common.py
import ssl
import base64


if is_py3:
    from urllib.parse import urlparse as parse_url
    from urllib.request import urlopen, build_opener, install_opener, \
            HTTPPasswordMgrWithDefaultRealm, HTTPBasicAuthHandler, HTTPSHandler
    _urlopen = urlopen  
else:
    from urlparse import urlparse as parse_url
    from urllib2 import urlopen as _urlopen
    from urllib2 import Request
    from contextlib import contextmanager, closing  # noqa
    from functools import wraps  # noqa

    # @wraps(_urlopen)
    @contextmanager
    def urlopen(*args, **kwargs):
        with closing(_urlopen(*args, **kwargs)) as f:
            yield f


def split_uname_from_url(url_with_uname):
    o = parse_url( url_with_uname)
    usrch = '{}:{}@{}'.format( o.username, o.password, o.hostname)
    url_no_usrpwd = url_with_uname.replace( usrch , o.hostname)
    return o.username, o.password, url_no_usrpwd

def get_urlopen_args( url_with_uname, verify_ssl=True):
    uname, pwd, url_no_usrpwd = split_uname_from_url(url_with_uname)
    print('Calling [{}] -- uname:[{}] -- pwd[{}]'.format(url_no_usrpwd, uname, pwd))
    if is_py3:
        fn= get_urlopen_args_py3
    else:
        fn = get_urlopen_args_py2
    req, kwargs = fn( uname, pwd, url_no_usrpwd, verify_ssl=verify_ssl)
    return req, kwargs

def get_urlopen_args_py2( uname, pwd, url_no_usrpwd, verify_ssl=True):
    req = Request( url_no_usrpwd )
    base64string = base64.encodestring('{}:{}'.format(uname, pwd)).replace('\n', '')
    req.add_header("Authorization", "Basic {}".format( base64string) )   
    # I hope pandas can support self signed certs too 
    kwargs = {}
    if not verify_ssl:
        kwargs['context'] = ssl._create_unverified_context()
    return req, kwargs

def get_urlopen_args_py3( uname, pwd, url_no_usrpwd, verify_ssl=True):
    passman = HTTPPasswordMgrWithDefaultRealm()
    passman.add_password(None, url_no_usrpwd, uname, pwd)
    authhandler = HTTPBasicAuthHandler(passman)
    if verify_ssl:
        opener = build_opener(authhandler)
    else:
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        opener = build_opener(authhandler, HTTPSHandler(context=context))
    install_opener(opener)
    return url_no_usrpwd, {}

## END SECTION modifications to pandas/io/common.py

def call_urlopen( url_with_uname, verify_ssl=False):
    # in get_filepath_or_buffer prior to calling _urlopen get params
    # not sure where to obtain verify_ssl from pd.read_csv 
    req, kwargs = get_urlopen_args(url_with_uname, verify_ssl) 
    resp = _urlopen(req , **kwargs)
    return resp_to_csv(resp)


def resp_to_csv(resp):
    csv = resp.read()
    if is_py3:
        csv = csv.decode('utf-8')
    return csv


def csv_to_df(csv_txt, **kwargs):
    '''
    @param csv_txt: text of csv rows.
    @param kwargs: to pass to pd.read_csv
    @return df
    '''
    import pandas as pd
    try: 
        from StringIO import StringIO #python2.7
    except:  
        from io import StringIO #python3.x. 
    buf = None
    df = None
    try:
        buf = StringIO(csv_txt)
        df = pd.read_csv(buf, **kwargs)
    finally:
        if buf:
            try:
                buf.close()
            except:
                pass
    return df

url_with_uname = 'https://pandasusr:pandaspwd@pandastest.mooo.com:5000/aaa.csv'
csv_txt = call_urlopen( url_with_uname, verify_ssl=False) 
df = csv_to_df( csv_txt.strip() ) 
print(df.to_string(index=False))

I am not quiet familiar with process involved around contributing to pandas - so please feel free to take over.

Comment From: jreback

http://pandas.pydata.org/pandas-docs/stable/contributing.html

Comment From: skynss

@jreback Thx. Followed it. @gfyoung I forked and modified the codebase with modifications to the best of my ability. here it is: https://github.com/skynss/pandas/tree/basic-auth-https-self-signed

I don't know how to check in live test scenario so I am going to leave that out. Can one of you folks help verify I am ready? Here is verification that tests read_csv, read_json, read_excel and read_html It supports having username and password either contained within a url or passed separately as params to read_csv.. etc functions. Usually username, pwd will be passed as parameter.. so I decided to add that support.

# pip install --upgrade https://github.com/skynss/pandas/archive/basic-auth-https-self-signed.zip
# live working test that tests both scenarios:
# pd.read_csv('https://uname:pwd@fqdn:<port>/fname.csv', verify_ssl=False)
# pd.read_csv('https://fqdn:<port>/fname.csv', username='uname', password='pwd', verify_ssl=False)

import pandas as pd

uname='pandasusr'
pwd='pandaspwd'
url = 'https://{}pandastest.mooo.com:5000/'
verify_ssl=False

def get_df(url, uname, pwd, verify_ssl, pd_read_fn, fname):
  furl = url + fname
  kwargs = {}
  if uname:
    kwargs['username']=uname
  if pwd:
    kwargs['password']=pwd
  if verify_ssl is not None:
    kwargs['verify_ssl']=verify_ssl
  print('\n' +furl)
  df = pd_read_fn(furl, **kwargs)
  if type(df) is list: # html
    df = df[0]
  print(df.to_string(index=False))
  print(df.to_json())

fparams = [ (pd.read_csv,   'aaa.csv'),
             (pd.read_json,  'jdoc.json'),
             (pd.read_excel, 'ex_doc.xlsx'),
             (pd.read_html,  'html_file.html') ]

for pd_read_fn, fname in fparams:
  u = url.format('{}:{}@'.format(uname, pwd))
  get_df( u, None, None, verify_ssl, pd_read_fn, fname) #1 url with username/pwd as part of url
  u2 = url.format('')
  get_df( u2, uname, pwd, verify_ssl, pd_read_fn, fname) # url with username/pwd  as params

Comment From: gfyoung

@skynss : Thanks for doing this! A couple of points:

1) To truly verify that your code works, I imagine that we would need to set up our own endpoint with credentials that we are comfortable sharing since it will only be storing (hopefully) non-sensitive data.

2) It's unfortunate that you have to duplicate this logic of username, password, and SSL verification across all of these methods. I wonder if there might be an easier way to abstract this similar to what requests does by just accepting tuple instead of separate parameters.

3) You will definitely need to condense your get_urlopen_args functions. Having three functions for one thing will not work for any reviewer. Try coming up with ways to streamline your code and look for places where you truly need to differentiate between Python 2 and 3.

One place to examine: why can't you use the Request class in Python 3? It still exists under urllib.Request. Whether you looked into it or not, an explanation in this issue would be useful.

Comment From: skynss

@gfyoung Thx - implemented the changes. Please view. I had tried the 'Request' from py3 already and it didnt work. But I kept the py3 code the way it is because it seems extensible and correct way long term. I changed auth to match requests lib, and kept verify_ssl separate just like requests. Here is updated test code that uses the auth pair

import pandas as pd

uname='pandasusr'
pwd='pandaspwd'
url = 'https://{}pandastest.mooo.com:5000/'
verify_ssl=False

def get_df(url, uname, pwd, verify_ssl, pd_read_fn, fname):
  furl = url + fname
  kwargs = {}
  if uname or pwd:
    kwargs['auth']=(uname, pwd)
  if verify_ssl is not None:
    kwargs['verify_ssl']=verify_ssl
  print('\n' +furl)
  df = pd_read_fn(furl, **kwargs)
  if type(df) is list: # html
    df = df[0]
  print(df.to_string(index=False))
  print(df.to_json())

fparams = [ (pd.read_csv, 'aaa.csv'), (pd.read_json, 'jdoc.json'), (pd.read_excel, 'ex_doc.xlsx'), (pd.read_html, 'html_file.html') ]

for pd_read_fn, fname in fparams:
  u = url.format('{}:{}@'.format(uname, pwd))
  get_df( u, None, None, verify_ssl, pd_read_fn, fname)
  u2 = url.format('')
  get_df( u2, uname, pwd, verify_ssl, pd_read_fn, fname)

Comment From: gfyoung

@skynss : Cool! Thanks for the auth inputs. I think mirroring the requests library is a good idea because it should be more intuitive for people given that requests is pretty widely used. Nevertheless:

1) What was the reason for removing your test test_split_url_extract_uname_pwd ? I think you should be able to test that functionality.

2) I'm not convinced by your response regarding the Request class for two reasons:

Your comment in the code does not make mention it not working. You don't need to add this in the comments, but what happens when you used the Request object in Python 3? From what I understand, the change from Python 2 to 3 was largely a refactoring. Thus, I would be surprised if that implementation failed for Python 3.
I don't agree with your argument about the Python 3 implementation being more extensible. The main issue I have is the fact you have to install an opener as the global default via install_opener. Why do we need to change the user's Python environment to make this work? The Python 2 implementation is self-contained, and in the interests of modularity, that would be preferred.

3) Could you add your test code that you provide in this issue discussion as a file on your branch? It won't get merged, but as this is not a PR yet, you can just put it anywhere for now. :smile:

Comment From: skynss

@gfyoung 1. I was on the fence if the test was meaningful. I was trying to compact the code into 1 function. But added it back 2. you are right.. Modified. Much simpler and cleaner code now. I thought it didnt work initially with similar code earlier.. but it is now. 3. added test code to test_basic_auth_self_signed.py

Comment From: TomAugspurger

Sorry I haven't been following this closely, but

I don't know how to check in live test scenario so I am going to leave that out.

We don't need to test this against a live server. We just need to ensure that we structure the request properly, and mock the actual GET request is made. Then we can verify that the right bits of information were in the get.

@skynss when you're able, could you submit a pull request? It'll make reviewing this easier.

Comment From: gfyoung

We don't need to test this against a live server. We just need to ensure that we structure the request properly, and mock the actual GET request is made.

@TomAugspurger : Do we mock any requests in our current tests? I think we should still test that it actually works against some endpoint. That's the best way to confirm that it works and is not just a special case for @skynss IMO. We could just decorate such a test with @tm.network.

Comment From: gfyoung

@skynss when you're able, could you submit a pull request? It'll make reviewing this easier.

@skynss : I second @TomAugspurger on this now. The code looks in a lot better shape. We'll definitely make more changes to it, but it's at a stage where I think we can review it now. A couple of things you will need to do:

1) Remove the DO NOT MERGE file you added. Just keep that somewhere out of your branch (e.g. on your Desktop) for future use.

2) One of the major comments is going to be documentation. If you could, please write docstrings similar to the ones you'll find in the codebase with a one-line description, listing of parameters, description of the returned object, and any explicit errors raised if any.

Comment From: TomAugspurger

Do we mock any requests in our current tests?

At least one in https://github.com/pandas-dev/pandas/blob/9d13227345882daaa90f03078c09a9b44a18ce72/pandas/tests/io/formats/test_printing.py#L125 for mocking IPython.

Comment From: gfyoung

@TomAugspurger : I looked at the code. How does it actually use the mock library?

Comment From: TomAugspurger

@TomAugspurger : I looked at the code. How does it actually use the mock library?

Huh, it kind of looks like it's not actually used anymore :)

Comment From: gfyoung

So my question still stands:

Do we mock any requests in our current tests?

A quick GitHub search suggests not AFAICT.

Comment From: skynss

@TomAugspurger just created a pull request @gfyoung Done

Comment From: ron819

There is also same request for read_json https://github.com/pandas-dev/pandas/issues/10526

Comment From: mroeschke

Since https://github.com/pandas-dev/pandas/issues/36688 has been addressed which should address this issue so closing

Pandas read_csv from HTTPs + basic-auth + custom port throws an error (urlopen error)

Code Sample, a copy-pastable example if possible

Problem description

Expected Output

Output of pd.show_versions()

Output of `pd.show_versions()`