Pandas to_csv - Nineya|java/go/python

import pandas as pd
text = 'this is "out text"'
df = pd.DataFrame(index=['1'],columns=['1','2'])
df.loc['1','1']=123
df.loc['1','2']=text
df.to_csv('foo.txt',index=False,header=False,quoting=3,sep=' ')

Error Traceback (most recent call last)
/nas02/home/v/a/valenal/scrpts/ in ()
----> 1 df.to_csv('foo.txt',index=False,header=False,quoting=3,sep=' ')

/nas01/depts/ie/cempd/apps/python/python2.6/lib/python2.6/site-packages/pandas-0.16.0-py2.6-linux-x86_64.egg/pandas/core/frame.pyc in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal, **kwds)
   1183 escapechar=escapechar,
   1184 decimal=decimal)
-> 1185 formatter.save()
   1186
   1187 if path_or_buf is None:

/nas01/depts/ie/cempd/apps/python/python2.6/lib/python2.6/site-packages/pandas-0.16.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in save(self)
   1430
   1431 else:
-> 1432 self._save()
   1433
   1434 finally:

/nas01/depts/ie/cempd/apps/python/python2.6/lib/python2.6/site-packages/pandas-0.16.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in _save(self)
   1530 break
   1531
-> 1532 self._save_chunk(start_i, end_i)
   1533
   1534 def _save_chunk(self, start_i, end_i):

/nas01/depts/ie/cempd/apps/python/python2.6/lib/python2.6/site-packages/pandas-0.16.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in _save_chunk(self, start_i, end_i)
   1553 date_format=self.date_format)
   1554
-> 1555 lib.write_csv_rows(self.data, ix, self.nlevels, self.cols, self.writer)
   1556
   1557 # from collections import namedtuple

/nas01/depts/ie/cempd/apps/python/python2.6/lib/python2.6/site-packages/pandas-0.16.0-py2.6-linux-x86_64.egg/pandas/lib.so in pandas.lib.write_csv_rows (pandas/lib.c:16804)()

Error: need to escape, but no escapechar set

INSTALLED VERSIONS

commit: None
python: 2.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.18-238.12.1.el5
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.iso885915

pandas: 0.16.0
nose: 0.11.1
Cython: 0.19.2
numpy: 1.7.0
scipy: 0.14.0
statsmodels: 0.6.0
IPython: 0.12.1
sphinx: 1.0.7
patsy: 0.3.0
dateutil: 2.1
pytz: 2013b
bottleneck: None
tables: None
numexpr: 2.1
matplotlib: 1.3.1
openpyxl: 1.5.8
xlrd: 0.9.3
xlwt: None
xlsxwriter: None
lxml: 3.0.1
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: 0.6.1.None
psycopg2: 2.5.5 (dt dec pq3 ext)

I am trying to output to a text file using to_csv without extra quotes.

Comment From: jorisvandenbossche

What is your question exactly?

The error message says that you have to specify an escapechar (from the docs: escapechar: Character used to escape sep and quotechar when appropriate). This is because the separator you use (a space) is also present in the data, so those occurrences have to be escaped.

E.g.

In [29]: print df.to_csv(index=False,header=False,quoting=3,sep=' ', escapechar='\\')
123 this\ is\ "out\ text"

or use another separator
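
Both options can be reproduced outside pandas with Python's standard csv module, where quoting=3 corresponds to csv.QUOTE_NONE. The following is only a minimal sketch: the helper name write_row is made up for illustration, and quotechar=None is an assumption chosen so that the quotes inside the value are left alone, as in the output shown above.

```python
import csv
import io


def write_row(row, delimiter=' ', **fmt):
    """Hypothetical helper: write one row with quoting disabled and return the text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, quoting=csv.QUOTE_NONE,
                        quotechar=None, lineterminator='\n', **fmt)
    writer.writerow(row)
    return buf.getvalue()


row = [123, 'this is "out text"']

# Without an escapechar the writer refuses to emit the embedded spaces.
try:
    write_row(row)
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

# Option 1: escape every separator that occurs inside a value.
escaped = write_row(row, escapechar='\\')

# Option 2: pick a separator that does not occur in the data.
tabbed = write_row(row, delimiter='\t')

print(error_message)
print(escaped, end='')
print(tabbed, end='')
```

With a tab as the separator nothing needs escaping, so the value is written unchanged.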

Comment From: valenal

Joris, thanks for the help. What I am trying to achieve is to imitate another file's format. This format uses spaces as separators, has specific spacing for each column, and doesn't contain quotes at all. Like so (note the variable spaces between columns):

374140.00000 3756062.00000 9.663200e-03 30.00 30.00

369722.00000 3755236.00000 0.000000e+00 51.86 57.00

However, because each column has specific spacing and the delimiter is also a space, the only way I can achieve this with the to_csv function is to save my output with quotes and then remove the quotes from the created file.

" 374140.00000" 3756062.00000 " 9.663200e-03" " 30.00" " 30.00"

" 369722.00000" 3755236.00000 " 0.000000e+00" " 51.86" " 57.00"

Is there no way in pandas to not output quotes? I thought that was what "quoting=3" was for. Thanks in advance.


Comment From: jorisvandenbossche

It is certainly possible to not have quotes in the csv output; that is indeed what quoting=3 is for. But you have the following problems: 1) your example data included quotes. Do you expect these to go away? Quoting only controls what is put around the values; it does not change anything inside the strings themselves. And 2) pandas does not support using a space as a delimiter if there are also unescaped spaces in the values, because this would lead to a file that cannot be interpreted correctly.

But can you provide a small reproducible example of what you would like to do? (You have now used some other data.)
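
To illustrate point 2, here is a small sketch using Python's standard csv reader rather than pandas itself (the variable names are chosen for illustration only). It shows why a space-delimited line with unescaped spaces inside a value cannot be read back as the original two columns, while the escaped form can.

```python
import csv

# What the file would contain if the spaces inside the value were written as-is.
ambiguous_line = '123 this is "out text"'
# What the file contains when the embedded spaces are escaped with a backslash.
escaped_line = '123 this\\ is\\ "out\\ text"'

# Every space is taken as a column boundary, so two values come back as five.
ambiguous_fields = next(csv.reader([ambiguous_line], delimiter=' ',
                                   quoting=csv.QUOTE_NONE))

# With the escapechar declared, the reader recovers the original two values.
escaped_fields = next(csv.reader([escaped_line], delimiter=' ',
                                 quoting=csv.QUOTE_NONE, escapechar='\\'))

print(ambiguous_fields)
print(escaped_fields)
```

Note that the double quotes survive in both cases, since disabling quoting leaves the contents of the strings untouched.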

Comment From: valenal

Joris, thanks again for the help. What I want to do is something along the lines of "2". This is a space-delimited file where the columns contain spacing as well, but I understand that pandas doesn't allow a space delimiter without the escapechar.

I can achieve what I want by allowing the escapechar (#) in my to_csv output and then removing the "#" after the fact (example shown below), but I could just as well have done that by allowing quotes and then removing them. Ultimately, I want to avoid having to read the file again to remove extra characters.

>>>import numpy as np
>>>import pandas as pd
>>>import os
>>>import subprocess

>>>df = pd.DataFrame(np.random.randn(6,4))
>>>df[4] = 'Test'

>>>c0   = lambda x: '{0:14.5f}'.format(x)
>>>c1   = lambda x: '{0:13.5f}'.format(x)
>>>c2   = lambda x: '{0:13.6e}'.format(x)
>>>c3 = lambda x: '{0:8.2f}'.format(x)
>>>c4   = lambda x: '{0:>7}'.format(x)

>>>for ix,i in enumerate([c0,c1,c2,c3,c4]):
...    df[ix] = df[ix].map(i)

>>>with open('test1.txt', 'w') as outF:
...    df.to_csv(outF, header=False, index=False, sep=' ',quoting=3,quotechar='',escapechar='#')

>>>#What I want my output to look like without having to use SED
>>>with open('test2.txt' ,'w') as f:
...    subprocess.call(['sed','s/#//g','test1.txt'],stdout=f)
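
One possible way to avoid the second pass is to skip to_csv and write the fixed-width lines directly. This is only a sketch, not something proposed in the thread: ROW_FORMAT and write_fixed_width are hypothetical names, the widths are copied from the c0 to c4 lambdas above, and the rows are assumed to still hold the raw numeric values (for a DataFrame they could come from df.itertuples(index=False) before the columns are mapped to strings).

```python
import io

# Same per-column widths as c0..c4, joined by the single-space separator.
ROW_FORMAT = '{0:14.5f} {1:13.5f} {2:13.6e} {3:8.2f} {4:>7}'


def write_fixed_width(rows, out):
    """Hypothetical helper: write each row as one fixed-width, space-separated line."""
    for row in rows:
        out.write(ROW_FORMAT.format(*row) + '\n')


# Illustrative rows only; with pandas these could be df.itertuples(index=False).
rows = [
    (374140.0, 3756062.0, 9.6632e-03, 30.0, 'Test'),
    (369722.0, 3755236.0, 0.0, 51.86, 'Test'),
]

# An in-memory buffer is used here; open('test2.txt', 'w') would work the same way.
buf = io.StringIO()
write_fixed_width(rows, buf)
output = buf.getvalue()
print(output, end='')
```

Because the csv writer is never involved, no quote or escape characters are produced, so nothing has to be stripped afterwards.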