Code Sample, a copy-pastable example if possible
Take a CSV file, quotechartest.csv, with this content:
A B C D
128.0.0.0 "GET /get-stuff-55/static/get-stuff/index.html?la=en HTTP/1.1" 200 7765
128.0.0.0 "GET /get-stuff-55/static/get-stuff/index.html?la=en HTTP/1.1" 200 7765
128.0.0.0 "GET /get-stuff-55/static/get-stuff/index.html?la=en'\" HTTP/1.1" 200 7765
Problem description
Try to read it using:
import pandas as pd
df = pd.read_csv("quotechartest.csv", quotechar="\"", sep=' ')
Expected Output
I expected a DataFrame with four columns, but reading the file raises this error instead:
ParserError: Error tokenizing data. C error: Expected 4 fields in line 4, saw 5
The quotechar " in the last line of the file is escaped with a backslash (\"), but read_csv does not recognize it as escaped, which leads to the error message.
Output of pd.show_versions()
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 79 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None
pandas: 0.20.2
Comment From: chris-b1
You can specify an escape character with the escapechar parameter.
import pandas as pd
from io import StringIO
pd.read_csv(StringIO(r"""A B C D
128.0.0.0 "GET /get-stuff-55/static/get-stuff/index.html?la=en HTTP/1.1" 200 7765
128.0.0.0 "GET /get-stuff-55/static/get-stuff/index.html?la=en HTTP/1.1" 200 7765
128.0.0.0 "GET /get-stuff-55/static/get-stuff/index.html?la=en'\" HTTP/1.1" 200 7765"""), sep=' ', quotechar='"', escapechar='\\')
Out[4]:
A B C D
0 128.0.0.0 GET /get-stuff-55/static/get-stuff/index.html?... 200 7765
1 128.0.0.0 GET /get-stuff-55/static/get-stuff/index.html?... 200 7765
2 128.0.0.0 GET /get-stuff-55/static/get-stuff/index.html?... 200 7765
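To check both behaviours side by side, here is a minimal sketch using the same data as in the report. It reads from a StringIO rather than the file on disk, and it assumes a pandas version that exposes pd.errors.ParserError. The names DATA, df and last_request are only illustrative.

```python
from io import StringIO

import pandas as pd

# Same content as the file in the report; the raw string keeps the
# backslash in the last row exactly as it appears on disk.
DATA = r"""A B C D
128.0.0.0 "GET /get-stuff-55/static/get-stuff/index.html?la=en HTTP/1.1" 200 7765
128.0.0.0 "GET /get-stuff-55/static/get-stuff/index.html?la=en HTTP/1.1" 200 7765
128.0.0.0 "GET /get-stuff-55/static/get-stuff/index.html?la=en'\" HTTP/1.1" 200 7765"""

# Without escapechar the backslash is treated as ordinary data, so the
# quote after it closes the field early and the row no longer has 4 fields.
try:
    pd.read_csv(StringIO(DATA), sep=" ", quotechar='"')
except pd.errors.ParserError as exc:
    print("without escapechar:", exc)

# With escapechar the backslash-quote pair is read as a literal quote.
df = pd.read_csv(StringIO(DATA), sep=" ", quotechar='"', escapechar="\\")
last_request = df["B"].iloc[2]
print(df.shape)
print(last_request)
```

With escapechar set, the quote after the backslash stays inside column B of the last row instead of ending the field.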