The delim_whitespace option no longer works when specifying a skip_footer
other than zero. I can replicate the behavior in 0.12 with the following example:
import pandas as pd
from StringIO import StringIO
indata = StringIO("""1.2 5.6 8.5
4.5 6.7 6.4
""")
indata.seek(0)
df = pd.read_csv(indata, delim_whitespace=True, header=None, skip_footer=2)
Which returns:
             0
0  1.2 5.6 8.5
1  4.5 6.7 6.4
Note how it's only one column instead of three. Changing skip_footer
to 0 makes it work as expected.
indata.seek(0)
df = pd.read_csv(indata, delim_whitespace=True, header=None, skip_footer=0)
Returns:
     0    1    2
0  1.2  5.6  8.5
1  4.5  6.7  6.4
2  NaN  NaN  NaN
3  NaN  NaN  NaN
If the above example is used with something like names=['a','b','c'],
an (obvious) ValueError exception occurs: Expected 3 fields in line 1, saw 1.
Comment From: jreback
pushing to 0.14
Comment From: jtratner
This combination is just not supported at all anymore, so either close this or change it to a request to support both delim_whitespace and skip_footer in one engine and/or to deal with the bug listed at the end?
In [5]: df = pd.read_csv(indata, delim_whitespace=True, header=None, skip_footer=2)
Traceback (most recent call last)
...
ValueError: Falling back to the 'python' engine because the 'c' engine does not support skip_footer, but this causes 'delim_whitespace' to be ignored as it is not supported by the 'python' engine.
However, it fails badly if you switch to a regex delimiter:
>>> pd.read_csv(indata, sep='\s+', header=None, skip_footer=2)
Empty DataFrame
Columns: [0, 1, 2]
Index: []
Comment From: gfyoung
@jreback : this isn't an issue anymore. You can most certainly specify both together now.
>>> from pandas import read_csv
>>> from pandas.compat import StringIO
>>> data = """1.2 5.6 8.5
4.5 6.7 6.4
"""
>>> read_csv(StringIO(data), delim_whitespace=True, header=None, skip_footer=2,
skip_blank_lines=False, engine='python') # skip_footer not supported in C engine
     0    1    2
0  1.2  5.6  8.5
1  4.5  6.7  6.4
Comment From: jorisvandenbossche
Indeed. Note that the failure in the last comment from @jtratner was due to skip_blank_lines=True
being the default: the trailing blank lines were already dropped, so skip_footer was no longer needed and cut the actual data rows instead.
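To make the ordering issue concrete, here is a minimal pure-Python sketch of the interaction. It is a toy model, not pandas' actual parser: parse_rows is a hypothetical helper, and the input is assumed to end with two blank lines (which is what the NaN rows in the original report suggest). It only illustrates that dropping blank lines before trimming the footer makes the footer count eat real rows.

```python
# Toy model of the interaction described above. This is NOT pandas' parser;
# parse_rows is a hypothetical helper used only for illustration.

# Assumption: the input ends with two blank lines, matching the two NaN rows
# shown in the original report.
DATA = "1.2 5.6 8.5\n4.5 6.7 6.4\n\n\n"


def parse_rows(text, skip_blank_lines, skipfooter):
    """Split text into whitespace-delimited rows, then trim the footer."""
    lines = text.splitlines()
    if skip_blank_lines:
        # Blank lines vanish first, so they no longer count as footer lines.
        lines = [line for line in lines if line.strip()]
    if skipfooter:
        # The footer count now applies to whatever lines are left.
        lines = lines[:-skipfooter]
    return [line.split() for line in lines]


# Blank lines dropped first: skipfooter=2 removes both data rows.
print(parse_rows(DATA, skip_blank_lines=True, skipfooter=2))

# Blank lines kept: skipfooter=2 removes only the two blank lines.
print(parse_rows(DATA, skip_blank_lines=False, skipfooter=2))
```

In this model the first call returns no rows and the second returns the two data rows, which mirrors the Empty DataFrame above versus the result obtained with skip_blank_lines=False.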