Problem description
I can't find whether pandas has a count unique / count distinct function. I know I can call unique on a Series / DataFrame and take the length of the result, but that's quite slow. It would be neat to have something that uses HyperLogLog, since that's fast and memory efficient. It has the tradeoff of being probabilistic, though, so I feel that would have to be configurable.
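For readers unfamiliar with the idea, here is a minimal pure-Python sketch of a HyperLogLog-style estimator. It is not part of pandas: the class name `HyperLogLog`, the parameter `p`, and the choice of SHA-1 over `repr(value)` as the hash are assumptions made for illustration only. The sketch shows the accuracy/memory tradeoff and will not be faster than a hash-table count in plain Python.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative estimator (hypothetical helper, not a pandas API)."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant; this form is the usual one for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value):
        # Assumption: hash the repr of the value and keep 64 bits of SHA-1.
        digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits choose the register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        z = sum(2.0 ** -r for r in self.registers)
        raw = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if raw <= 2.5 * self.m and zeros:
            raw = self.m * math.log(self.m / zeros)
        return int(round(raw))


hll = HyperLogLog(p=12)
for i in range(30000):
    hll.add(i % 10000)                  # 10000 distinct values, each seen 3 times
approx = hll.estimate()                 # close to 10000, but not exact
```

With `p=12` the sketch keeps 4096 small registers regardless of input size, and the typical relative error is on the order of a couple of percent. Making `p` configurable is one way the probabilistic tradeoff mentioned above could be exposed.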
Expected Output
I would expect something like
import pandas as pd
s = pd.Series([0, 1, 2])
s.count_distinct()  # => 3 (proposed method name)
Comment From: jorisvandenbossche
There is a nunique method, which I think returns what you are looking for.
However, I think under the hood it just computes the unique values and counts them.
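A short example of how nunique behaves, for reference. The sample data is made up for illustration; the calls themselves (Series.nunique, Series.unique, DataFrame.nunique, and the dropna parameter) are existing pandas API.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 1, 2, 2, np.nan])

n_exact = s.nunique()                  # NaN is excluded by default (dropna=True)
n_with_nan = s.nunique(dropna=False)   # NaN counted as its own value
n_manual = len(s.unique())             # unique() keeps NaN, so this matches dropna=False

df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "y", "z"]})
per_column = df.nunique()              # Series of distinct counts, one per column
```

Note that the result is an exact count, not a probabilistic estimate, and that the default drops NaN, so it can differ by one from len(s.unique()).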
Comment From: hexgnu
Awesome, thanks @jorisvandenbossche