Currently, DataFrames that include a categorical can be written to an HDF store and other on-disk storage formats. For formats which include factors/categoricals, also ensure that such data is read back into a categorical.

- [ ] CSV #7217, #7444 (just testing, not a completely fungible format)
- [x] HDF5 #7217, #7444 / #8793
- [x] msgpack #8632 (closed by #12573)
- [x] stata #8633 / #8767

Comment From: jankatins

For discussion on this, see #7217 and #7444 and also the tests in test_categorical (test_io_hdf, test_io_csv)

Comment From: fkaufer

Stata Categorical export could be implemented using value labels; see http://www.stata.com/help.cgi?dta. As the name already indicates, Stata's value labels are not as generic as pandas' categoricals but are limited to strings (similar to R).

Support for categoricals-to-value-labels conversion would be fantastic, but I would already be happy if the category dtype were captured (pandas/io/stata.py#L1262) and coerced to string. I guess this also requires either fixing to_records for categoricals (pandas/io/stata.py#L1633, #8626) or explicitly decoding (#8628) on the fly.

Comment From: jreback

cc @bashtage, interested?

Comment From: bashtage

I'll have to take a look at how hard it will be - haven't used Categoricals, so extra hurdle.

Comment From: jreback

Closing: you cannot directly serialize a Categorical in CSV, but you can specify the dtype when reading.
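
For illustration, a minimal sketch of that read-side approach, assuming a pandas version in which `read_csv` accepts `"category"` as a `dtype`. The column names `grade` and `score` are made up for the example and do not come from the linked issues.

```python
import io

import pandas as pd

# Hypothetical frame with one categorical column.
df = pd.DataFrame(
    {
        "grade": pd.Categorical(["a", "b", "a", "c"]),
        "score": [1, 2, 3, 4],
    }
)

# CSV stores only the values as text; the categorical dtype is not recorded.
buf = io.StringIO()
df.to_csv(buf, index=False)

# A plain read therefore does not give back a categorical column.
buf.seek(0)
plain = pd.read_csv(buf)

# Naming the dtype on read rebuilds the categorical from the values found.
buf.seek(0)
restored = pd.read_csv(buf, dtype={"grade": "category"})

print(plain["grade"].dtype.name)
print(restored["grade"].dtype.name)
```

Since the categories are inferred from the values present in the file, a custom category order or an `ordered=True` flag would probably need to be supplied again on read, for example by passing a `pandas.api.types.CategoricalDtype` instead of the string `"category"`.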