Pandas version checks

  • [X] I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html

Documentation problem

The to_csv docs do not explicitly list the valid options for the mode parameter. They link to Python's open() function, which includes some options that aren't relevant to to_csv.

Suggested fix for documentation

It'd be nice to list the mode options right in the to_csv docs. This should be significantly more user-friendly. Suggested text:

mode: str, default 'w'

The file write mode which can be `w`, `x`, or `a`.  `w` will write the file and overwrite another file if it already exists.  `x` will write the file, but error out if a file with the same name already exists.  `a` will append to the existing file.

The available write modes are the same as [open()](https://docs.python.org/3/library/functions.html#open).
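
A minimal sketch of how the three modes behave in practice. The temporary directory, the file name `out.csv`, and the small DataFrame are illustrative assumptions, not part of the proposed docstring text:

```python
import os
import tempfile

import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"a": [1, 2]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.csv")

    # 'w' creates the file, or truncates and overwrites it if it exists.
    df.to_csv(path, mode="w", index=False)

    # 'x' refuses to overwrite: the file exists now, so this raises.
    try:
        df.to_csv(path, mode="x", index=False)
        x_raised = False
    except FileExistsError:
        x_raised = True

    # 'a' appends; header=False avoids writing the column names a second time.
    df.to_csv(path, mode="a", index=False, header=False)

    result = pd.read_csv(path)
```

Note that `a` does not suppress the header on its own, which is why the sketch passes `header=False` for the appended rows.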

Comment From: MrPowers

@datapythonista - Thanks for the help on this one!

Comment From: datapythonista

Thanks for reporting.

When only a subset of the values is expected, instead of using the type str, we can directly use mode : {'w', 'x', 'a'}, default 'w'.

To explain the options, we can also use bullet points. We do that in some docs, and it may make things easier to read.

It'd be good to see which other to_* functions have mode as a parameter, and update them all. Probably better to start by only one, get a code review, and when we're happy with it, apply it to the rest.

Comment From: twoertwein

One important value is also "wb" for binary file handles that do not have the .mode attribute themselves (this is hinted in the documentation for path_or_buf).
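
As a rough illustration, assuming a pandas version that accepts binary file handles in to_csv, an in-memory `io.BytesIO` buffer (which has no `.mode` attribute) can be written to with the binary mode passed explicitly. The DataFrame here is made up for the example:

```python
import io

import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"a": [1, 2]})

# io.BytesIO is a binary handle without a .mode attribute,
# so the binary write mode is stated explicitly.
buf = io.BytesIO()
df.to_csv(buf, mode="wb", index=False)

# The buffer now holds the encoded CSV bytes.
text = buf.getvalue().decode("utf-8")
```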

Comment From: HamidrezaSK

take

Comment From: MrPowers

> It'd be good to see which other to_* functions have mode as a parameter, and update them all.

@datapythonista - strongly agree with this, especially because the different to_* APIs have different write modes that make sense. I'm not sure if to_parquet supports mode because I don't see it in the docs, but appending obviously won't work because Parquet files are immutable.

I don't mean to overcomplicate this too much, but for to_parquet we should also think about how mode and partition_cols interact. "appending" in that case probably means something more like how Spark uses "append" (i.e. adding files to an existing folder).

I mention this because the to_csv writer should probably support partition_cols as well. Just documenting the existing options is a good start, but we should probably do some longer-term planning too.

Comment From: qudus4l

I agree that it would be helpful to have the available modes for to_csv() explicitly listed in the documentation. Thanks for suggesting this improvement! Just a small note: the suggested text says that mode defaults to 'w', but in fact the default value is 'w' only if path_or_buf is a file path (otherwise, it defaults to None). Maybe it would be good to clarify this in the text. Apart from that, the suggested text looks great to me! It would make it much easier for users to understand what the mode parameter does and what the available options are.

Comment From: datapythonista

From a quick look I can only see to_json having a mode parameter. The description of its docstring will have to be extended from a generic one, since mode only makes sense with lines=True, and we need to say that.
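
A small sketch of what that looks like, assuming a pandas version in which to_json accepts mode, and assuming appending is used together with lines=True and orient="records". The file name and data are only for illustration:

```python
import os
import tempfile

import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"a": [1, 2]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.jsonl")

    # First write creates the line-delimited JSON file.
    df.to_json(path, orient="records", lines=True, mode="w")

    # Second write appends more records, one JSON object per line.
    df.to_json(path, orient="records", lines=True, mode="a")

    result = pd.read_json(path, lines=True)
```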