Feature Type

  • [X] Adding new functionality to pandas

  • [ ] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

After I saw the representation of a group by in Airtable, I wanted something similar in pandas. It always bothered me that there was no HTML representation for a group by.

Feature Description

I have already written a solution, which can be found here: https://github.com/pandas-dev/pandas/compare/main...JulianWgs:pandas:main

It displays the groups of the group by (screenshot: "Pandas ENH: HTML repr for groupby dataframe and series").

With that you can debug and inspect what you are doing.

It skips groups if there are more than 10 of them (the limit is user configurable) (screenshot: "Pandas ENH: HTML repr for groupby dataframe and series").
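To make the idea concrete, here is a minimal sketch of what such a representation could do. It is not the code from the branch linked above: `groupby_to_html` and its `max_groups` parameter are hypothetical names, and showing only the first groups followed by a count of the skipped ones is an assumption about the truncation behaviour.

```python
import html

import pandas as pd


def groupby_to_html(grouped, max_groups=10):
    """Hypothetical helper: render at most `max_groups` groups as HTML tables."""
    parts = []
    # Iterating over a DataFrameGroupBy yields (group name, sub-DataFrame) pairs.
    for i, (name, group) in enumerate(grouped):
        if i >= max_groups:
            break
        parts.append(f"<h4>{html.escape(str(name))}</h4>")
        parts.append(group.to_html())
    # Assumption: groups beyond the limit are summarised, not rendered.
    skipped = grouped.ngroups - max_groups
    if skipped > 0:
        parts.append(f"<p>... {skipped} more groups not shown</p>")
    return "\n".join(parts)


df = pd.DataFrame({"key": ["a", "b", "a", "c"], "value": [1, 2, 3, 4]})
html_out = groupby_to_html(df.groupby("key"), max_groups=2)
```

In a notebook, a method like `_repr_html_` on the group by object could return a string built this way, so the groups would render without any extra call.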

Alternative Solutions

@jorisvandenbossche proposed a different representation which would only show information that is already available (the number of groups and the group names). The reasons for this are that get_group can be quite expensive for large groups (although it is called at most 10 times) and that the output is quite verbose. Additionally, pandas could display the size of each group, which would be cheaper to calculate than calling get_group.
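The lighter-weight alternative could be sketched roughly as follows, using only the group count, the group names, and the group sizes. The example data and the variable names are illustrative assumptions, and rendering the sizes through `to_html` is just one possible way to present them.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a", "c"], "value": [1, 2, 3, 4]})
grouped = df.groupby("key")

n_groups = grouped.ngroups               # number of groups
group_sizes = grouped.size()             # Series: number of rows per group
group_names = group_sizes.index.tolist() # group labels, taken from the size index

# One compact table instead of one table per group.
summary_html = group_sizes.rename("size").to_frame().to_html()
```

This avoids materialising any group with get_group, at the cost of not showing the actual rows.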

Additional Context

I have already written the code for my proposed solution and created a pull request for it. There was one issue left, but unfortunately I didn't quite understand what @jreback meant, and the pull request was closed due to inactivity. I'm now at PyConDE in the pandas workshop, and @jorisvandenbossche recommended that I open an issue.