Code Sample, a copy-pastable example if possible
>>> import struct
>>> struct.pack('qq', 1, 2)
'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00'
>>> buf = struct.pack('qq', 1, 2)
>>> import numpy
>>> ar = numpy.frombuffer(buf, dtype=numpy.int64)
>>> ar
array([1, 2])
>>> import pandas
>>> pandas.MultiIndex.from_arrays([ar, numpy.array([10, 20])])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../.local/lib/python2.7/site-packages/pandas/indexes/multi.py", line 935, in from_arrays
    labels, levels = _factorize_from_iterables(arrays)
  File ".../.local/lib/python2.7/site-packages/pandas/core/categorical.py", line 2068, in _factorize_from_iterables
    return map(list, lzip(*[_factorize_from_iterable(it) for it in iterables]))
  File ".../.local/lib/python2.7/site-packages/pandas/core/categorical.py", line 2040, in _factorize_from_iterable
    cat = Categorical(values, ordered=True)
  File ".../.local/lib/python2.7/site-packages/pandas/core/categorical.py", line 300, in __init__
    raise NotImplementedError("> 1 ndim Categorical are not "
NotImplementedError: > 1 ndim Categorical are not supported at this time
In-depth look:
>>> pandas.core.categorical.factorize(ar, sort = True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../.local/lib/python2.7/site-packages/pandas/core/algorithms.py", line 313, in factorize
    labels = table.get_labels(vals, uniques, 0, na_sentinel, True)
  File "pandas/src/hashtable_class_helper.pxi", line 485, in pandas.hashtable.Int64HashTable.get_labels (pandas/hashtable.c:9966)
  File "stringsource", line 644, in View.MemoryView.memoryview_cwrapper (pandas/hashtable.c:32223)
  File "stringsource", line 345, in View.MemoryView.memoryview.__cinit__ (pandas/hashtable.c:28458)
ValueError: buffer source array is read-only
Problem description
If a NumPy array backed by a read-only buffer is passed to MultiIndex.from_arrays, a confusing NotImplementedError: > 1 ndim Categorical are not supported at this time exception is raised.
A more in-depth look at the code shows that, when using a NumPy array backed by a read-only buffer, the Categorical constructor (which raises the NotImplementedError exception) calls factorize, which raises ValueError: buffer source array is read-only.
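For anyone hitting this, the read-only state can be checked directly on the array. The following is a minimal sketch using only NumPy; the variable names are illustrative, and whether passing the copy to pandas avoids the error on a given pandas version is an assumption that is not verified here.

```python
import struct

import numpy

# struct.pack returns an immutable bytes object, so the array that
# numpy.frombuffer builds on top of it is flagged as read-only.
buf = struct.pack('qq', 1, 2)
ar = numpy.frombuffer(buf, dtype=numpy.int64)
print(ar.flags.writeable)

# Copying allocates fresh memory owned by NumPy, which is writable.
ar_copy = ar.copy()
print(ar_copy.flags.writeable)
```

The copy holds the same values as the original array, so it can stand in for it wherever a writable buffer is required.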
Maybe the read-only error raised by factorize should propagate to the user so that the real cause of the exception is known. As currently implemented, the Categorical constructor reinterprets the exception and throws a confusing NotImplementedError: > 1 ndim Categorical are not supported at this time.
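The difference between the two behaviours can be sketched in plain Python. This is not the pandas source: factorize_stub, build_masking and build_propagating are hypothetical stand-ins that only illustrate the pattern described above, where one error is caught and replaced by an unrelated one.

```python
def factorize_stub(values):
    # Hypothetical stand-in for the low-level step that rejects read-only input.
    raise ValueError("buffer source array is read-only")


def build_masking(values):
    # Current behaviour as described above: the original error is caught
    # and replaced by an unrelated message.
    try:
        return factorize_stub(values)
    except ValueError:
        raise NotImplementedError(
            "> 1 ndim Categorical are not supported at this time")


def build_propagating(values):
    # Suggested behaviour: let the original ValueError reach the caller.
    return factorize_stub(values)


results = []
for build in (build_masking, build_propagating):
    try:
        build([1, 2])
    except (ValueError, NotImplementedError) as exc:
        results.append((type(exc).__name__, str(exc)))

for name, message in results:
    print(name + ": " + message)
```

With the propagating variant, the caller sees the read-only message directly instead of the dimensionality one.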
Expected Output
ValueError: buffer source array is read-only
Output of pd.show_versions()
Comment From: jreback
Duplicate of #15286, with the cause in #12813. It's actually not that hard to fix. Pull requests are welcome, though I will get to it after 0.20.0 is released.