Pandas version checks

  • [X] I have checked that this issue has not already been reported.
  • [X] I have confirmed this bug exists on the latest version of pandas.
  • [X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame(
    {
        "A": pd.Series(["a", "b", "a", "b", "c", "a", "a"], dtype="category"),
    }
)
dummies = pd.get_dummies(df, drop_first=True)
pd.from_dummies(dummies, sep="_", default_category="a")

raises

TypeError: Passed DataFrame contains non-dummy data

Issue Description

from_dummies can't reverse get_dummies when the dummies are sparse.

Expected Behavior

from_dummies should work with both sparse and non-sparse dummies.
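
The example above does not pass sparse=True, so a sparse variant may be what the description refers to. The sketch below is a hedged comparison under that assumption: roundtrip is a hypothetical helper (not part of pandas) that runs the same data through get_dummies and from_dummies for both settings and records any exception instead of stopping. Whether the sparse case fails depends on the installed pandas version.

```python
import pandas as pd

values = ["a", "b", "a", "b", "c", "a", "a"]
df = pd.DataFrame({"A": pd.Series(values, dtype="category")})


def roundtrip(sparse):
    """Hypothetical helper: encode with get_dummies, decode with from_dummies.

    Returns the decoded DataFrame, or the exception if decoding fails.
    """
    dummies = pd.get_dummies(df, drop_first=True, sparse=sparse)
    try:
        return pd.from_dummies(dummies, sep="_", default_category="a")
    except Exception as exc:  # broad on purpose: the error type may vary by version
        return exc


dense_result = roundtrip(sparse=False)
sparse_result = roundtrip(sparse=True)

# Show what each path produced: a DataFrame on success, an exception otherwise.
print("dense :", type(dense_result).__name__)
print("sparse:", type(sparse_result).__name__, "-", sparse_result)
```

If both lines report DataFrame, the round trip works for sparse and non-sparse input on that install.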

Installed Versions

INSTALLED VERSIONS
------------------
commit           : 87cfe4e38bafe7300a6003a1d18bd80f3f77c763
python           : 3.9.12.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
Version          : 10.0.22621
machine          : AMD64
processor        : AMD64 Family 23 Model 113 Stepping 0, AuthenticAMD
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : English_United States.1252

pandas           : 1.5.0
numpy            : 1.23.3
pytz             : 2022.2.1
dateutil         : 2.8.2
setuptools       : 63.3.0
pip              : 22.2.2
Cython           : None
pytest           : 7.1.3
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.8.0
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 3.1.2
IPython          : 8.5.0
pandas_datareader: None
bs4              : None
bottleneck       : None
brotli           : None
fastparquet      : None
fsspec           : None
gcsfs            : None
matplotlib       : 3.5.3
numba            : None
numexpr          : 2.8.3
odfpy            : None
openpyxl         : 3.0.10
pandas_gbq       : None
pyarrow          : 9.0.0
pyreadstat       : 1.1.9
pyxlsb           : 1.0.9
s3fs             : None
scipy            : 1.9.1
snappy           : None
sqlalchemy       : 1.4.41
tables           : None
tabulate         : 0.8.10
xarray           : 2022.6.0
xlrd             : 2.0.1
xlwt             : None
zstandard        : None
tzdata           : None

Comment From: Pvlb1998

Take

Comment From: kostyafarber

I can't reproduce this with the latest version of pandas on main. Can someone confirm? If so, we can close this.

Comment From: ghost

It looks like you are using pandas 1.5.0, which may be causing the issue. from_dummies may have been updated or fixed in a later release.

I recommend updating to the latest version of pandas, currently 1.5.2, and then trying to reproduce the issue again. If you can no longer reproduce it, it has likely been fixed in a later version, and you can close this issue. It is also worth checking the official pandas documentation for from_dummies for any changes or any mention of this bug.

Comment From: bashtage

Was it fixed? Possibly by the switch to bool in dummies from int8.
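
A quick way to test the dtype theory is to look at what get_dummies produces on a given install and to force each dtype explicitly. This is only a sketch: as far as I know the pre-2.0 default was uint8 rather than int8, and the variable names below are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": pd.Series(["a", "b", "a", "b", "c", "a", "a"], dtype="category")}
)

# Default dtype of the dummy columns depends on the installed pandas version.
default_dummies = pd.get_dummies(df, drop_first=True)
print(pd.__version__)
print(default_dummies.dtypes)

# Forcing the dtype makes the comparison independent of that default.
bool_dummies = pd.get_dummies(df, drop_first=True, dtype=bool)
uint8_dummies = pd.get_dummies(df, drop_first=True, dtype="uint8")

restored_bool = pd.from_dummies(bool_dummies, sep="_", default_category="a")
restored_uint8 = pd.from_dummies(uint8_dummies, sep="_", default_category="a")

# Both dense variants should decode to the same categories.
print(restored_bool["A"].tolist() == restored_uint8["A"].tolist())
```

If both dense variants decode cleanly, the dtype of the dummy columns alone is unlikely to explain the error.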

Comment From: kostyafarber

> Was it fixed? Possibly by the switch to bool in dummies from int8.

I've pulled the latest version of pandas from main, tested this, and do not get any errors.

Could one of the maintainers confirm? If so, we can close this one.

Comment From: bashtage

Does appear to be fixed in 1.5.3.

Comment From: MarcoGorelli

tbh I can't even reproduce this on 1.5.0:

In [1]: pd.__version__
Out[1]: '1.5.0'

In [2]: import pandas as pd
   ...:
   ...: df = pd.DataFrame(
   ...:         {
   ...:             "A": pd.Series(["a", "b", "a", "b", "c", "a", "a"], dtype="category"),
   ...:         }
   ...: )
   ...: dummies = pd.get_dummies(df, drop_first=True)
   ...: pd.from_dummies(dummies, sep="_", default_category="a")
Out[2]:
   A
0  a
1  b
2  a
3  b
4  c
5  a
6  a