Research

  • [X] I have searched the [pandas] tag on StackOverflow for similar questions.

  • [X] I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

https://stackoverflow.com/questions/74830329/pandas-crosstab-dos-not-support-float-with-capital-f-number-formats

Question about pandas

Hello guys, I've noticed that pandas crosstab does not support the Float dtypes (capital F, as opposed to float). What I would really like is to understand why, and whether the error message could be made clearer. Best regards.

I am working on a sample transaction DataFrame. It contains client ID, transaction gross value (GMV) and revenue. Take this example as df:

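from datetime import datetime

import pandas as pd

df = pd.DataFrame({
    'id' :  [1,2,3,4,5],
    'date' : [datetime(2022,6,1),datetime(2022,12,31),datetime(2022,12,31),datetime(2022,12,31),datetime(2022,12,31)],
    'revenue' : [10.1, 10.1, 10.1, 10.1, 10.1]})
# Cast to the nullable extension dtype (capital F); this is what triggers the error
df = df.astype({'revenue':'Float64'})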

Now I create a crosstab:

CrossTab = pd.crosstab(df['id'], df['date'], values=df['revenue'], rownames=None, colnames=None, aggfunc='sum', margins=True, margins_name='All', dropna=False, normalize=False)

The output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 CrossTab = pd.crosstab(df['id'], df['date'], values=df['revenue'], rownames=None, colnames=None, aggfunc='sum', margins=True, margins_name='All', dropna=False, normalize=False)

File c:\Users\fabio\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\reshape\pivot.py:691, in crosstab(index, columns, values, rownames, colnames, aggfunc, margins, margins_name, dropna, normalize)
    688     df["__dummy__"] = values
    689     kwargs = {"aggfunc": aggfunc}
--> 691 table = df.pivot_table(
    692     "__dummy__",
    693     index=unique_rownames,
    694     columns=unique_colnames,
    695     margins=margins,
    696     margins_name=margins_name,
    697     dropna=dropna,
    698     **kwargs,
    699 )
    701 # Post-process
    702 if normalize is not False:

File c:\Users\fabio\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\frame.py:8728, in DataFrame.pivot_table(self, values, index, columns, aggfunc, fill_value, margins, dropna, margins_name, observed, sort)
   8711 @Substitution("")
   8712 @Appender(_shared_docs["pivot_table"])
   8713 def pivot_table(
   (...)
...
--> 292     raise TypeError(dtype)  # pragma: no cover
    294 converted = maybe_downcast_numeric(result, dtype, do_round)
    295 if converted is not result:

TypeError: Float64

Comment From: phofl

Hi, thanks for your report.

Can you please provide a minimal reproducible example? see https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports

The groupby ops etc don't seem to be relevant

Comment From: frbelotto

Thanks. I've edited the post to make it smaller and more straightforward.

Comment From: phofl

This doesn't work; clients is not defined.

Comment From: frbelotto

Sorry, I simplified the code and forgot to change the source of the crosstab data. I've updated the post.

Just substitute df for "clients".

Comment From: phofl

This works on main, so it may just need tests.

Comment From: frbelotto

Sorry for my ignorance. What do you mean when you say it "works on main"?

Comment From: phofl

That it works on the pandas main branch

Comment From: frbelotto

I am running it in VS Code with the latest released versions. These are my current versions:

pandas 1.5.2, Python 3.11.1

The crosstab itself runs normally if I just remove the "df = df.astype({'revenue':'Float64'})" line (so the DataFrame is created with float64 instead of Float64).

The other capital-F float dtypes generate the same error (Float32, for example).
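As a possible stopgap on a release where the error occurs, the values can be cast back to NumPy float64 just for the crosstab call. This is only a minimal sketch based on the observation above that plain float64 works; the name revenue_np is made up for illustration, and it assumes that losing the nullable dtype is acceptable for the data at hand.

```python
from datetime import datetime

import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "date": [datetime(2022, 6, 1)] + [datetime(2022, 12, 31)] * 4,
    "revenue": [10.1] * 5,
}).astype({"revenue": "Float64"})

# Assumption: a plain NumPy float64 copy is good enough for the aggregation.
# revenue_np is an illustrative name, not something from the report.
revenue_np = df["revenue"].astype("float64")

table = pd.crosstab(
    df["id"],
    df["date"],
    values=revenue_np,
    aggfunc="sum",
    margins=True,
    margins_name="All",
)

# With margins=True the bottom-right cell holds the grand total.
total = table.iloc[-1, -1]
print(table)
```

The original column in df keeps its Float64 dtype; only the copy passed to crosstab is converted.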

Comment From: phofl

The pandas main branch is our development branch. I am not talking about anything that is released yet

Comment From: frbelotto

OOOwww, sorry. Lol. Finally got it.

Comment From: phofl

No worries

Comment From: natmokval

take

Comment From: frbelotto

Thanks @natmokval. Just one more question.

As I've seen, the issue occurs with Float32 too, for example, and you said Float64.

Does it fix all of them in one shot?
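One way to check this on whichever pandas version is installed is to run the same crosstab once per dtype and record whether it raises TypeError. The helper name crosstab_works is hypothetical, and the result depends on the installed version, so no expected output is given here.

```python
from datetime import datetime

import pandas as pd


def crosstab_works(dtype: str) -> bool:
    """Return True if the crosstab from the report runs for `dtype`.

    Hypothetical helper for illustration; it only catches the TypeError
    shown in the traceback above.
    """
    df = pd.DataFrame({
        "id": [1, 2, 3, 4, 5],
        "date": [datetime(2022, 6, 1)] + [datetime(2022, 12, 31)] * 4,
        "revenue": [10.1] * 5,
    }).astype({"revenue": dtype})
    try:
        pd.crosstab(df["id"], df["date"], values=df["revenue"],
                    aggfunc="sum", margins=True)
    except TypeError:
        return False
    return True


# Compare the NumPy dtype against the two nullable float dtypes mentioned here.
results = {dtype: crosstab_works(dtype)
           for dtype in ("float64", "Float32", "Float64")}
print(pd.__version__, results)
```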