Code Sample

import pandas as pd

excel = pd.ExcelFile("test.xlsx")

for sheet in excel.sheet_names:
    reader = excel.parse(sheet, chunksize=1000)
    for chunk in reader:
        pass  # process chunk

Problem description

In version 0.16.1 the chunksize argument was available.

See: http://pandas.pydata.org/pandas-docs/version/0.16.1/generated/pandas.ExcelFile.parse.html

But in the latest version it is no longer available.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.ExcelFile.parse.html

What was the reason that it was removed?

Also, how should I process an Excel file in chunks in the latest version?

Comment From: jorisvandenbossche

This was removed in https://github.com/pandas-dev/pandas/pull/11198. The reason stated there is that the argument didn't do anything (xref https://github.com/pandas-dev/pandas/issues/8011).

Comment From: gfyoung

@jorisvandenbossche : Well, it is a functional parameter in read_csv, so potentially it could (or should) be implemented for Excel as well. That being said, I think your explanation should satisfy the initial inquiry, so closing for now.

Comment From: jorisvandenbossche

Yes, that feature request is covered by #8011

Comment From: EugeneKovalev

And what was the reason for the removal? OK, it wasn't working before, but where is the fix? Removing a piece of code that should work is not a fix.

Comment From: jorisvandenbossche

@EugeneKovalev No keyword (for now) is better than a confusing keyword that does not have any effect. The feature request is still open (#8011), and a code contribution to add it is always welcome.

Comment From: swarits

@EugeneKovalev It was removed because Excel files are read into memory as a whole during parsing, owing to the nature of the XLSX file format. Hence, a large file would cause a 'MemoryError'. chunksize wouldn't change this behavior for Excel files, but it works perfectly for CSVs because they can be loaded into memory in parts.
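
For anyone who still needs chunk-wise processing, one possible workaround is to bypass the pandas Excel reader and stream rows with openpyxl's read-only mode, batching them into DataFrames by hand. This is only a sketch under a few assumptions: openpyxl is installed, the first row of each sheet is the header, and the helper names iter_chunks and iter_excel_chunks are made up for this example (they are not part of pandas or openpyxl).

```python
from itertools import islice


def iter_chunks(rows, chunksize):
    """Yield lists of at most `chunksize` items from any iterable of rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunksize))
        if not chunk:
            return
        yield chunk


def iter_excel_chunks(path, sheet_name, chunksize=1000):
    """Hypothetical helper: yield DataFrames of up to `chunksize` rows each.

    Assumes the first row of the sheet holds the column names.
    """
    # Imported here so that iter_chunks above stays dependency-free.
    import pandas as pd
    from openpyxl import load_workbook

    # read_only=True makes openpyxl yield rows lazily instead of
    # building the whole worksheet in memory first.
    wb = load_workbook(path, read_only=True)
    try:
        ws = wb[sheet_name]
        rows = ws.iter_rows(values_only=True)
        header = next(rows, None)
        if header is None:  # empty sheet
            return
        for chunk in iter_chunks(rows, chunksize):
            yield pd.DataFrame(chunk, columns=list(header))
    finally:
        wb.close()
```

Usage would look like `for chunk in iter_excel_chunks("test.xlsx", "Sheet1", chunksize=1000): ...`, where the sheet name is a placeholder. Type inference happens per chunk here, so column dtypes may differ between chunks; that trade-off is part of why this is a sketch and not a drop-in replacement for the removed keyword.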