A small, complete example of the issue
pd.read_html() fails with an AttributeError raised in bs4's _html5lib.py, line 70, at the definition of class TreeBuilderForHtml5lib.
import pandas as pd
import html5lib
import lxml
data = pd.read_html('https://blockchain.info/blocks')
This gives:
  File "C:\Anaconda2\lib\site-packages\bs4\builder\_html5lib.py", line 70, in <module>
    class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder):
AttributeError: 'module' object has no attribute '_base'
Output of pd.show_versions()
pandas: 0.19.0
nose: 1.3.7
pip: 8.1.2
setuptools: 20.3
Cython: 0.23.4
numpy: 1.11.2
scipy: 0.17.0
statsmodels: 0.6.1
xarray: 0.8.2
IPython: 5.1.0
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: None
html5lib: 0.999999999
httplib2: 0.9.2
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None
Comment From: jorisvandenbossche
Can you include the output of pd.show_versions() in your post?
Comment From: verhalenn
This was a problem in Beautiful Soup that has since been fixed upstream, so updating the beautifulsoup4 package in Anaconda should be enough. The fix is here: http://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/revision/406
Comment From: prakritidev
Thank you @verhalenn
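For anyone hitting the same traceback, a quick way to see whether an environment is affected is to check which tree builder base module the installed html5lib actually provides. The traceback shows bs4 looking for html5lib.treebuilders._base. The sketch below assumes that newer html5lib releases expose the same module as html5lib.treebuilders.base, which the thread itself does not state. The names CANDIDATES and first_importable are made up for illustration. This is a diagnostic sketch only, not the fix applied in Beautiful Soup.

```python
import importlib

# Module paths to probe. The second is the private name from the traceback;
# the first is an ASSUMED newer public name (not confirmed in this thread).
CANDIDATES = ("html5lib.treebuilders.base", "html5lib.treebuilders._base")


def first_importable(names):
    """Return the first module path in `names` that imports, or None."""
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Module (or its parent package) is missing; try the next one.
            continue
        return name
    return None


if __name__ == "__main__":
    found = first_importable(CANDIDATES)
    if found is None:
        print("html5lib is not installed, or neither module path exists")
    else:
        print("html5lib provides:", found)
```

If this reports the name without the underscore while the traceback from bs4 still mentions _base, the installed bs4 predates the fix and updating it, as suggested above, is the likely remedy.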