Code Sample, a copy-pastable example if possible

import pandas as pd
from bokeh.charts import Bar, Line
from bokeh.plotting import output_file,show

output_file("test.html")

wt = pd.read_csv('test.csv', dtype={
    'word': str,
    'times': int,
})

print(wt['word'])
line = Line(wt,
            x='word',
            y='times'
            )
show(line)

The test.csv file is shown below:

word,times
nan,1

Problem description

read_csv() throws an exception when the CSV file it reads contains the string 'nan'.


I think the string 'nan' causes this because it conflicts with a Python built-in word.

If I remove the 'nan' string, it works fine.

Output


Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-2-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL:
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 32.3.0
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.1
html5lib: 0.999999999
httplib2: 0.9.2
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: None
pandas_datareader: None

Comment From: jreback

are you sure?

In [1]: pd.__version__
Out[1]: '0.19.2'

In [2]: data = """word,times
   ...: nan,1"""

In [3]: pd.read_csv(StringIO(data), dtype={
   ...:     'word': str,
   ...:     'times': int,
   ...: })
Out[3]: 
  word  times
0  NaN      1

Or, if you really want to keep the actual word nan (and not mark it as missing):

In [7]: pd.read_csv(StringIO(data), dtype={
   ...:     'word': str,
   ...:     'times': int,
   ...: }, keep_default_na=False)
Out[7]: 
  word  times
0  nan      1
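
For anyone who wants to check this outside an IPython session, here is a minimal standalone sketch of the same comparison. The extra `foo,2` row and the variable names are illustrative assumptions, not part of the original report; `isnull()` is used because it is available in pandas 0.19.2 as well as in later releases.

```python
from io import StringIO

import pandas as pd

# Same shape as test.csv, plus one ordinary row (assumed for illustration).
data = "word,times\nnan,1\nfoo,2"
dtypes = {"word": str, "times": int}

# Default parsing: the literal text "nan" is treated as a missing value.
default_df = pd.read_csv(StringIO(data), dtype=dtypes)
n_missing_default = int(default_df["word"].isnull().sum())

# keep_default_na=False: "nan" is kept as an ordinary string.
literal_df = pd.read_csv(StringIO(data), dtype=dtypes, keep_default_na=False)
n_missing_literal = int(literal_df["word"].isnull().sum())
words_literal = literal_df["word"].tolist()

print(n_missing_default, n_missing_literal, words_literal)
```

Counting missing entries makes the difference explicit: the printed frames differ only in `NaN` versus `nan`, which is easy to overlook.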

Comment From: netcan

@jreback Sorry, I have updated the code.

Comment From: jreback

@netcan so what exactly is the issue? the read df looks fine.

Comment From: netcan

@jreback The df looks fine, but I am working with bokeh: if the df contains the 'nan' string, it throws an exception, and if I remove 'nan' from the CSV, it works fine. I tested bokeh with lists that contain 'nan' and that works fine too, so I supposed the problem was in pandas. The error is: ValueError: expected an element of either List(String) or List(Int), got [nan]

Comment From: jreback

You can raise this with bokeh on their issue tracker. I don't really know if they support missing values in a graph like that.
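
Until that is settled, one possible workaround on the pandas side is to make sure the `word` column holds no missing values before it is handed to a plotting library. This is only a sketch: the extra `foo,2` row and the variable names are assumptions, and whether bokeh accepts the result has not been verified here.

```python
from io import StringIO

import pandas as pd

# Same shape as test.csv, plus one ordinary row (assumed for illustration).
data = "word,times\nnan,1\nfoo,2"
wt = pd.read_csv(StringIO(data), dtype={"word": str, "times": int})

# Option A: drop the rows whose word was parsed as missing.
dropped = wt.dropna(subset=["word"])
dropped_words = dropped["word"].tolist()

# Option B: put the literal text back so every entry is a plain string.
filled = wt.copy()
filled["word"] = filled["word"].fillna("nan")
filled_words = filled["word"].tolist()

print(dropped_words, filled_words)
```

Option B gives the same column as reading the file with `keep_default_na=False`, but it only touches `word`, so missing-value detection stays on for any other columns.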