Pandas NotImplementedError: > 1 ndim Categorical raised when array is read-only

Code Sample, a copy-pastable example if possible

>>> import struct
>>> struct.pack('qq', 1, 2)
'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00'
>>> buf = struct.pack('qq', 1, 2)
>>> import numpy
>>> ar = numpy.frombuffer(buf, dtype=numpy.int64)
>>> ar
array([1, 2])
>>> import pandas
>>> pandas.MultiIndex.from_arrays([ar, numpy.array([10, 20])])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../.local/lib/python2.7/site-packages/pandas/indexes/multi.py", line 935, in from_arrays
    labels, levels = _factorize_from_iterables(arrays)
  File ".../.local/lib/python2.7/site-packages/pandas/core/categorical.py", line 2068, in _factorize_from_iterables
    return map(list, lzip(*[_factorize_from_iterable(it) for it in iterables]))
  File ".../.local/lib/python2.7/site-packages/pandas/core/categorical.py", line 2040, in _factorize_from_iterable
    cat = Categorical(values, ordered=True)
  File ".../.local/lib/python2.7/site-packages/pandas/core/categorical.py", line 300, in __init__
    raise NotImplementedError("> 1 ndim Categorical are not "
NotImplementedError: > 1 ndim Categorical are not supported at this time

In-depth look:

>>> pandas.core.categorical.factorize(ar, sort = True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../.local/lib/python2.7/site-packages/pandas/core/algorithms.py", line 313, in factorize
    labels = table.get_labels(vals, uniques, 0, na_sentinel, True)
  File "pandas/src/hashtable_class_helper.pxi", line 485, in pandas.hashtable.Int64HashTable.get_labels (pandas/hashtable.c:9966)
  File "stringsource", line 644, in View.MemoryView.memoryview_cwrapper (pandas/hashtable.c:32223)
  File "stringsource", line 345, in View.MemoryView.memoryview.__cinit__ (pandas/hashtable.c:28458)
ValueError: buffer source array is read-only

Problem description

If a NumPy array backed by a read-only buffer is passed to MultiIndex.from_arrays, a confusing NotImplementedError: > 1 ndim Categorical are not supported at this time exception is raised.

A closer look at the code shows that, when given a NumPy array backed by a read-only buffer, the Categorical constructor (which raises the NotImplementedError) calls factorize, which in turn raises ValueError: buffer source array is read-only.

Perhaps the read-only error raised by factorize should propagate to the user so that the real cause of the failure is known. As currently implemented, the Categorical constructor reinterprets the exception and raises a confusing NotImplementedError: > 1 ndim Categorical are not supported at this time.
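Until the underlying error is surfaced, one possible workaround is to pass a writable copy of the array. The sketch below is an assumption on my part rather than something confirmed by the pandas developers: it relies only on the fact that `ndarray.copy()` returns a new, writable array, and the name `writable` is purely illustrative. It has not been verified against 0.19.2 specifically.

```python
import struct

import numpy
import pandas

# Same setup as the report: frombuffer over an immutable bytes object
# yields an array that NumPy marks as read-only.
buf = struct.pack('qq', 1, 2)
ar = numpy.frombuffer(buf, dtype=numpy.int64)
print(ar.flags.writeable)

# Workaround (an assumption, not taken from the issue): copy() allocates
# fresh memory owned by NumPy, so the resulting array is writable.
writable = ar.copy()
print(writable.flags.writeable)

index = pandas.MultiIndex.from_arrays([writable, numpy.array([10, 20])])
print(index.tolist())
```

Checking `ar.flags.writeable` before building the index is also a quick way to tell whether this read-only case is what triggered the NotImplementedError.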

Expected Output

ValueError: buffer source array is read-only

Output of `pd.show_versions()`

>>> pandas.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.6-100.fc24.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 20.1.1
Cython: 0.25.2
numpy: 1.11.0
scipy: 0.16.1
statsmodels: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 0.6.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.2rc2
openpyxl: None
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: None
bs4: 4.4.0
html5lib: 1.0b7
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: 2.45.0
pandas_datareader: 0.3.0.post

Comment From: jreback

Duplicate of #15286, with the cause in #12813. It's actually not that hard to fix. Pull requests are welcome, though I will get to it after 0.20.0 is released.
