df.to_csv(path_or_buf=f, sep=' ', header=False,
index=True, mode='a', quoting=QUOTE_NONE, escapechar=' ',
date_format='%m/%d/%Y %H:%M:%S')
generates 2 spaces in csv file between date and time. It should only be one space since sep=' '.
Comment From: jorisvandenbossche
@denfromufa can you provide a small reproducible example? (some data that show this) Isn't your date and time in one column?
Comment From: den-run-ai
import pandas as pd
from csv import QUOTE_NONE
import numpy as np
rng = pd.date_range('1/1/2011', periods=6, freq='H')
rng.values[:3]
df1=pd.DataFrame(np.random.rand(rng.shape[0],5),index=rng)
df1.head(3)
df1.to_csv('test.csv', sep=' ', header=False,
index=True, mode='a', quoting=QUOTE_NONE, escapechar=' ',
date_format='%m/%d/%Y %H:%M:%S')
# %load test.csv
01/01/2011 00:00:00 0.8135433263165096 0.32811684099353866 0.4145982626151403 0.20669348747780092 0.40366368662756147
01/01/2011 01:00:00 0.18014728764443488 0.5146938838675809 0.4536598423132532 0.34155733427820656 0.440876368183323
01/01/2011 02:00:00 0.4768181257700802 0.9281149054744247 0.1394599500466026 0.44100722743440857 0.4725515115245551
01/01/2011 03:00:00 0.8827699134085255 0.23117935993894556 0.7247609890740998 0.46107279818544544 0.3034632588430233
01/01/2011 04:00:00 0.7587177485651242 0.774790622810391 0.06699642692286567 0.1384981319339471 0.8932072731692663
01/01/2011 05:00:00 0.10015283348607174 0.38220578967027674 0.4148014776634852 0.11548217031729191 0.04768268242910356
Comment From: jorisvandenbossche
@denfromufa Thanks for the example. I made it a bit smaller:
In [48]: from csv import QUOTE_NONE
In [49]: rng = pd.date_range('1/1/2011', periods=3, freq='H')
In [50]: df = pd.DataFrame([[1,2],[3,4],[5,6]],index=rng)
In [51]: df
Out[51]:
0 1
2011-01-01 00:00:00 1 2
2011-01-01 01:00:00 3 4
2011-01-01 02:00:00 5 6
In [60]: print df.to_csv(sep=' ', quoting=QUOTE_NONE, escapechar=' ', date_format='%m/%d/%Y %H:%M:%S')
0 1
01/01/2011 00:00:00 1 2
01/01/2011 01:00:00 3 4
01/01/2011 02:00:00 5 6
and it has to do wiht your combination of quoting
and escapechar
. If I remove both:
In [61]: print df.to_csv(sep=' ', date_format='%m/%d/%Y %H:%M:%S')
0 1
"01/01/2011 00:00:00" 1 2
"01/01/2011 01:00:00" 3 4
"01/01/2011 02:00:00" 5 6
But I think this is expected behaviour, as escapechar
is meant to escape occurences of the delimiter that are not the delimiter (a space is occuring in the date-time column, while you set the delimiter as a space). Eg I can set it to a backslash:
In [64]: print df.to_csv(sep=' ', quoting=QUOTE_NONE, escapechar='\\', date_format='%m/%d/%Y %H:%M:%S')
0 1
01/01/2011\ 00:00:00 1 2
01/01/2011\ 01:00:00 3 4
01/01/2011\ 02:00:00 5 6
What do you try to achieve exactly? If you want to have it with only one space and without quotes, you will have to split it to two columns I think, as you it is not allowed to give no escapechar with QUOTE_NONE
Comment From: den-run-ai
I want to have it with only one space and without quotes without breaking to 2 columns
Comment From: jorisvandenbossche
If you want that, then it means that it are in practice 2 columns in the csv files, as there is no way to distinguish the space between date and time, and the space as delimiter between the other columns.
As I said above, the only way to do this is splitting it up in to 2 columns (as you cannot generate an invalid csv file).
Comment From: jorisvandenbossche
With my example of above, you can easily do this with:
In [28]: df
Out[28]:
0 1
2011-01-01 00:00:00 1 2
2011-01-01 01:00:00 3 4
2011-01-01 02:00:00 5 6
In [29]: df.index.to_series().apply(str).str.split(' ', return_type='frame')
Out[29]:
0 1
2011-01-01 00:00:00 2011-01-01 00:00:00
2011-01-01 01:00:00 2011-01-01 01:00:00
2011-01-01 02:00:00 2011-01-01 02:00:00
If you put that in the original dataframe, you can now write it as you want.
Comment From: jorisvandenbossche
Closing, as it is more a question as a pandas bug/enhancement (which does not mean you can ask further!)
Comment From: den-run-ai
This approach loses the date formatting date_format='%m/%d/%Y %H:%M:%S'
Comment From: den-run-ai
The final solution:
sep_dates=df.index.to_series().apply(lambda x: x.strftime('%m/%d/%Y %H:%M:%S')).str.split(' ', return_type='frame').values
dfreproc.index=pd.MultiIndex.from_arrays(sep_dates.T)