related #7615
Bit of a UI problem here (although it's behaving as the docstring says it does, so it doesn't quite qualify as a bug):
>>> df = pd.DataFrame({"A": [1,2,3], "B": [4,5,6]})
>>> df.to_csv("tmp.csv", sep=";")
>>> !cat tmp.csv
;A;B
0;1;4
1;2;5
2;3;6
>>> df.to_csv("tmp.csv", delimiter=";")
>>> !cat tmp.csv
,A,B
0,1,4
1,2,5
2,3,6
read_csv
accepts both sep
and delimiter
but to_csv
silently ignores delimiter
. Someone was recently tripped up by this on SO. I'm fine with either teaching to_csv
to behave the same way read_csv
does or, alternatively, raising if delimiter
is found as a keyword.
That is, I'm less bothered by the inconsistency than the silent unexpected behaviour.
Comment From: jreback
related to #7615
to_csv allows **kwargs I think we need to validate them (eg provide a nice message that you have passed an invalid parameter )
and then can do things like accept delimiter for sep
(must be a reason to_csv accepts **kwargs but don't remember why/when was changed)
Comment From: indera
👍 Spent about an hour trying to figure out why passing "delimiter" does not affect the "to_csv" call ...
Comment From: indera
Will try to fix this if nobody started.
Relevant code:
to_csv => https://github.com/pandas-dev/pandas/blob/ee374ee3bc49b82754e8269f69cf8f79d488bd5e/pandas/core/frame.py#L1303
read_csv => https://github.com/pandas-dev/pandas/blob/master/pandas/io/parsers.py#L565
Comment From: gfyoung
@indera : Just what are you trying to fix exactly? The **kw
option has been removed in to_csv
, so are you trying to add delimiter
to to_csv
?
Comment From: jreback
@dsm054 @gfyoung is right, this used to be a problem, but no-longer.
Comment From: indera
Yes, this is fixed in pandas==0.19.0
I get an error as expected:
TypeError: to_csv() got an unexpected keyword argument 'delimiter'
Comment From: indera
@gfyoung - we should probably now remove support for "delimiter" in read_csv for consistency ;)
Comment From: jorisvandenbossche
@indera the difference is that for read_csv
the delimeter
keyword actually works, so not sure if deprecating/removing is worth it. For to_csv
it never worked, so that is another situation.
Comment From: gfyoung
@jorisvandenbossche : It wouldn't be that hard to incorporate into the to_csv
signature though, so that's one way to go about it to make things consistent.
Comment From: jorisvandenbossche
Consistency is nice, but the recommended one sep
is already consistent for both, so IMO not needed to make the duplication consistent :-)
Comment From: gfyoung
That's fair, though we could consider deprecating delimiter
at some point, just not now...:smile:
Comment From: indera
@gfyoung deprecating delimiter
sounds good... removing code in general feels good 😄
Comment From: gfyoung
@jorisvandenbossche @jreback : Thoughts?
Comment From: dahlbaek
@gfyoung wrote:
That's fair, though we could consider deprecating delimiter at some point, just not now...
Any update on this? Is 'not now' now?
Comment From: gfyoung
@jorisvandenbossche @jreback : Thoughts?
Comment From: dahlbaek
At the moment, delimiter
seems to be silently overriding sep
if both are provided, so something should be done I guess.
Comment From: jandren
The above pull request was closed due to stalling and never merged. The inconsistency remains, which I just encountered. Used delimiter
on read_csv and then got errors when I later tried to use to_csv with delimiter
again.
Comment From: jreback
@jandren happy to take a new PR to deprecate
Comment From: jandren
Thanks for the opportunity @jreback! My intention was to just give updated feedback and poke the issue as still alive. I have to little knowledge about the pros and cons of sep
vs delimiter
, so I will leave it to people familiar with the code base.