Pandas Implement label support for usecols in read_table "c" engine

pd.read_table accepts usecols to reduce the memory footprint. usecols can take a list of labels, but this only works with the "python" engine. Passing labels in usecols with the "c" engine results in the following error:

>>> pd.read_table('/nonsense', sep=':', usecols=['test'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/io/parsers.py", line 646, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/lib/python3/dist-packages/pandas/io/parsers.py", line 389, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/usr/lib/python3/dist-packages/pandas/io/parsers.py", line 730, in __init__
    self._make_engine(self.engine)
  File "/usr/lib/python3/dist-packages/pandas/io/parsers.py", line 923, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/lib/python3/dist-packages/pandas/io/parsers.py", line 1434, in __init__
    raise ValueError("Usecols do not match names.")
ValueError: Usecols do not match names.

usecols does work with labels if you explicitly select the python engine. We should implement label support for usecols in the "c" engine as well. The python engine is much slower than the "c" engine and, because it allocates its buffers in Python, it also uses considerably more memory, even with usecols applied.
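
As a rough illustration of the memory argument above, the following sketch compares the in-memory size of a full read against a read restricted with usecols. The sample data and all variable names are made up for illustration and are not taken from the original report; the exact byte counts will depend on the pandas version and dtypes.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: three integer columns, 1000 rows.
DATA = "a:b:c\n" + "\n".join(f"{i}:{i * 2}:{i * 3}" for i in range(1000))

# Read every column, then only column "b" selected by label.
full = pd.read_table(StringIO(DATA), sep=":")
subset = pd.read_table(StringIO(DATA), sep=":", usecols=["b"])

# Sum the per-column memory usage, ignoring the index.
full_bytes = int(full.memory_usage(index=False, deep=True).sum())
subset_bytes = int(subset.memory_usage(index=False, deep=True).sum())

print("columns kept:", list(subset.columns))
print("bytes (subset vs. full):", subset_bytes, full_bytes)
```

The subset frame holds one column instead of three, so its reported size should be correspondingly smaller.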

Comment From: chris-b1

I think you might have an error in your data? The c engine does support usecols:

from io import StringIO
import pandas as pd

# Select column "b" by label using the c engine
pd.read_table(StringIO("""a:b:c
1:2:3
4:5:6"""), sep=':', usecols=['b'], engine='c')

Out[125]: 
   b
0  2
1  5
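
To check that both engines agree, the example above can be extended into a small comparison. This is only a sketch: the helper name read_with is hypothetical, and it assumes a pandas version in which read_table accepts engine="c" and engine="python".

```python
from io import StringIO

import pandas as pd

DATA = "a:b:c\n1:2:3\n4:5:6"


def read_with(engine):
    """Read the sample data, keeping only column "b", with the given engine."""
    return pd.read_table(StringIO(DATA), sep=":", usecols=["b"], engine=engine)


c_result = read_with("c")
python_result = read_with("python")

# Both engines should yield the same single-column frame.
print(c_result.equals(python_result))
```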

Comment From: wavexx

On Thu, Jun 08 2017, chris-b1 wrote:

> I think you might have an error in your data? The c engine does support usecols:

Hm, after closing my Python session and restarting from scratch, I can no longer reproduce this myself.

Weird, since I spent quite a while debugging this, but I guess the problem was on my side. Sorry for the noise.
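
For anyone landing here with a similar ValueError: one way it can arise is when a label in usecols does not appear in the parsed header. This is an assumption about a possible cause, not a diagnosis of the report above. The sketch below uses made-up data and a hypothetical helper, try_usecols; the exact error message varies between pandas versions, so it is printed rather than matched.

```python
from io import StringIO

import pandas as pd

DATA = "a:b:c\n1:2:3\n4:5:6"


def try_usecols(labels, engine="c"):
    """Return the parsed frame, or the ValueError raised for the given labels."""
    try:
        return pd.read_table(StringIO(DATA), sep=":", usecols=labels, engine=engine)
    except ValueError as exc:
        return exc


ok = try_usecols(["b"])      # "b" is present in the header
bad = try_usecols(["test"])  # "test" is not present in the header

print(type(ok).__name__)
print(type(bad).__name__, bad)
```

If a label really is in the file, it is worth checking that sep matches the file and that the header row is where the parser expects it.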