Pandas read_html incorrectly parses a table from Wikipedia

Code Sample, a copy-pastable example if possible

import pandas as pd
import html5lib
import requests
from lxml import html
from lxml import etree

url = "https://en.wikipedia.org/wiki/Comparison_of_single-board_computers"
xpath = "//*[@id=\"mw-content-text\"]/table"

res = requests.get(url)
tree = html.fromstring(res.content)
tables = tree.xpath(xpath)

# for index, t in enumerate(tables):
#     print("Table index [%s]" % index)
#     print(t.xpath('descendant-or-self::th/text()'))
#     print

headers = ['Name', 'PCIe', 
           'USB 2.0', 'USB 3.0', 'USB devices', 
           'Storage on-board', 'Storage flash slots', 'Storage SATA', 
           'Networking ETH', 'Networking WiFi', 
           'Communication bluetooth', 'Communication I2C', 'Communication SPI', 
           'Generic I/O GPIO', 'Generic I/O Analog', 
           'Other interfaces']
# There are 2 unnecessary rows at the beginning and 2 at the end; the original
# header was too complex for the parser, so it is removed.
dta = pd.read_html(html.tostring(tables[2]), skiprows=2)[0][:-2]
dta.columns = headers
dta.sort_values(by=['Name'])[61:]

Problem description

Starting roughly from id=69, the values appear to be mixed up. This may be caused either by the parsing library or by the specific structure of the Wikipedia table. The values look as if they are shifted to the left.
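One possible explanation, which is an assumption and not confirmed anywhere in this report, is that some cells in the Wikipedia table use `rowspan` or `colspan`. A parser that ignores those attributes would see fewer cells in the affected rows and place the remaining values too far to the left. The sketch below uses only the standard library, with hypothetical board names and a made-up four-column table, to show how spans could be expanded into a rectangular grid before comparing against the parsed result. `CellCollector` and `expand_spans` are illustrative names, not part of pandas or lxml.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect (text, rowspan, colspan) for every <td>/<th>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # [text_parts, rowspan, colspan] while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            self._cell = [[], int(a.get("rowspan") or 1), int(a.get("colspan") or 1)]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, rowspan, colspan = self._cell
            self.rows[-1].append(("".join(parts).strip(), rowspan, colspan))
            self._cell = None


def expand_spans(table_html):
    """Return the table as a list of rows with spanned cells repeated."""
    parser = CellCollector()
    parser.feed(table_html)
    grid = []
    pending = {}  # column index -> (text, number of further rows to fill)
    for row in parser.rows:
        out, queue, col = [], list(row), 0
        while queue or col in pending:
            if col in pending:
                # A cell from an earlier row still covers this column.
                text, left = pending.pop(col)
                out.append(text)
                if left > 1:
                    pending[col] = (text, left - 1)
                col += 1
            else:
                text, rowspan, colspan = queue.pop(0)
                for _ in range(colspan):
                    out.append(text)
                    if rowspan > 1:
                        pending[col] = (text, rowspan - 1)
                    col += 1
        grid.append(out)
    return grid


# Hypothetical miniature table; "Board B" has no own cell for "USB 3.0".
sample = """
<table>
  <tr><th>Name</th><th>USB 2.0</th><th>USB 3.0</th><th>WiFi</th></tr>
  <tr><td>Board A</td><td>2</td><td rowspan="2">No</td><td>b/g/n</td></tr>
  <tr><td>Board B</td><td>3</td><td>b/g/n</td></tr>
</table>
"""

for row in expand_spans(sample):
    print(row)
```

In the sample, the last `<tr>` contains only three cells, so a span-unaware reader would put "b/g/n" under "USB 3.0". If the rows of the real table show the same pattern, that would be consistent with the shift described here. The sketch does not handle nested tables or ragged rows.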

Expected output

For example, "Orange Pi Lite" should have 'USB 3.0' == "No". Instead it contains "b/g/n (RTL8189FTV)", a value that should be shifted 5 columns to the right.

Output of `pd.show_versions()`

html5lib installed using pip

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-28-generic
machine: x86_64
processor:
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.1
nose: None
pip: 9.0.1
setuptools: 32.3.1
Cython: 0.23.5
numpy: 1.10.4
scipy: 0.17.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.1
matplotlib: 1.5.1
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.7.1
bs4: 4.5.1
html5lib: 0.999999999
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: 0.2.1

Comment From: jreback

This is not a pandas issue. You might have better luck on StackOverflow.
