Would someone please clarify what is meant by the description for the engine arg to DataFrame.to_parquet?
engine : {‘auto’, ‘pyarrow’, ‘fastparquet’}, default ‘auto’
Parquet reader library to use. If ‘auto’, then the option ‘io.parquet.engine’ is used.
If ‘auto’, then the first library to be installed is used.
I think this description refers to the logic given here:
https://github.com/pandas-dev/pandas/blob/a00154dcfe5057cb3fd86653172e74b6893e337d/pandas/io/parquet.py#L11-L28
Based on the code, should it read as follows?
engine : {‘auto’, ‘pyarrow’, ‘fastparquet’}, default ‘auto’
Parquet reader library to use. If ‘auto’, then the option ‘io.parquet.engine’ is used.
If ‘io.parquet.engine’ is not set, then the first library to be installed is used.
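To make the two levels of 'auto' concrete, here is a minimal sketch of the lookup as I understand it from the linked code. It is not the pandas implementation: `resolve_engine`, `CANDIDATES`, `option_value`, and `is_installed` are hypothetical names for illustration, and the try order (pyarrow, then fastparquet) is an assumption based on the order in the docstring.

```python
from importlib.util import find_spec

# Assumed try order when everything is left on 'auto' (hypothetical constant).
CANDIDATES = ("pyarrow", "fastparquet")


def resolve_engine(engine="auto", option_value="auto", is_installed=None):
    """Return the name of the Parquet engine that would be used.

    engine       -- value passed as the `engine` argument
    option_value -- stand-in for the value of the 'io.parquet.engine' option
    is_installed -- callable(name) -> bool; defaults to an importability check
    """
    if is_installed is None:
        is_installed = lambda name: find_spec(name) is not None

    # First 'auto': defer from the argument to the option.
    if engine == "auto":
        engine = option_value

    # Second 'auto': the option is also 'auto', so use the first
    # candidate library that is installed.
    if engine == "auto":
        for name in CANDIDATES:
            if is_installed(name):
                return name
        raise ImportError("no Parquet engine found; install pyarrow or fastparquet")

    if engine not in CANDIDATES:
        raise ValueError("engine must be one of 'auto', 'pyarrow', 'fastparquet'")
    return engine
```

In this sketch an explicit argument always wins, the option is consulted only when the argument is 'auto', and installation order matters only when both are 'auto'.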
Comment From: gfyoung
So I do agree that the docs as they stand are not clear as to which "auto" we are talking about, but I'm uneasy about your proposed change. I personally would say that "If the io.parquet.engine option is also set to 'auto', then..."
Comment From: jorisvandenbossche
The docstring was recently changed to clarify this, see the diff in https://github.com/pandas-dev/pandas/pull/19669/files. I think that solves the raised problem, but please check if that does it for you as well.
Comment From: smsaladi
I think that does it! Thanks for the info and sorry for the bother!
Comment From: jorisvandenbossche
No problem, thanks for raising the issue (in the future, the development version of the docs can always be checked at http://pandas-docs.github.io/pandas-docs-travis/)