Pandas BUG/PY3? astype("string") fails for python 3

From the categorical example docs, with python 2.7:

In [1]: s = pd.Series(["a","b","c","a"])

In [2]: s2 = s.astype('category')

In [3]: s2.astype('string')
Out[3]: 
0    a
1    b
2    c
3    a
dtype: object

However, with python 3, this fails:

In [74]: s = pd.Series(["a","b","c","a"])

In [75]: s2 = s.astype('category')

In [76]: s2.astype('string')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-76-70b55f934dfe> in <module>()
----> 1 s2.astype('string')

/home/joris/scipy/pandas/pandas/core/generic.py in astype(self, dtype, copy, raise_on_error, **kwargs)
   3179             conversion, with unconvertible values becoming NaT.
   3180         convert_numeric : boolean, default False
-> 3181             If True, attempt to coerce to numbers (including strings), with
   3182             unconvertible values becoming NaN.
   3183         convert_timedeltas : boolean, default True

/home/joris/scipy/pandas/pandas/core/internals.py in astype(self, dtype, **kwargs)
   3188 
   3189     def astype(self, dtype, **kwargs):
-> 3190         return self.apply('astype', dtype=dtype, **kwargs)
   3191 
   3192     def convert(self, **kwargs):

/home/joris/scipy/pandas/pandas/core/internals.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   3055 
   3056             kwargs['mgr'] = self
-> 3057             applied = getattr(b, f)(**kwargs)
   3058             result_blocks = _extend_blocks(applied, result_blocks)
   3059 

/home/joris/scipy/pandas/pandas/core/internals.py in astype(self, dtype, copy, raise_on_error, values, **kwargs)
    459                **kwargs):
    460         return self._astype(dtype, copy=copy, raise_on_error=raise_on_error,
--> 461                             values=values, **kwargs)
    462 
    463     def _astype(self, dtype, copy=False, raise_on_error=True, values=None,

/home/joris/scipy/pandas/pandas/core/internals.py in _astype(self, dtype, copy, raise_on_error, values, klass, mgr)
   2158             values = self.values
   2159         else:
-> 2160             values = np.asarray(self.values).astype(dtype, copy=False)
   2161 
   2162         if copy:

TypeError: data type "string" not understood

Should this also work? (Or should we just update the docs?) In any case, consistency would be nice here.

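The root cause is probably that the same difference exists in numpy's own astype: the traceback ends in `np.asarray(self.values).astype(dtype, copy=False)`, so the "string" alias is passed straight through to numpy.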
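
A possible workaround in the meantime, as a minimal sketch: assuming the goal is just to get the categorical values back as plain strings, passing the builtin `str` instead of the "string" alias avoids the name that fails on python 3. The variable name `as_text` is only for illustration.

```python
import pandas as pd

s = pd.Series(["a", "b", "c", "a"])
s2 = s.astype("category")

# Use the builtin `str` type rather than the "string" alias,
# so the conversion does not depend on how the alias is resolved.
as_text = s2.astype(str)

print(as_text.tolist())
```

The resulting dtype may differ between pandas versions, so the sketch only relies on the values being Python strings.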

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-53-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.0+270.gc72f297
nose: 1.3.7
pip: 8.1.2
setuptools: 23.0.0
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.0
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.0.0
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0rc2
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
s3fs: 0.0.7
pandas_datareader: None

Comment From: chris-b1

xref https://github.com/numpy/numpy/issues/6023 (may be other open issues)

Comment From: jorisvandenbossche

OK, maybe we should just leave it as a numpy issue then. In any case, I am updating the docs to make them python 3 compatible for now.

Comment From: jorisvandenbossche

Doc issue is addressed in https://github.com/pandas-dev/pandas/pull/15011

Comment From: TomAugspurger

Looks like we're good here.
