For my own purposes, I am considering writing classes to represent PMFs and CDFs using Pandas Series as the implementation, and an API similar to what I did in the thinkstats2 library (but made more stylistically consistent with Pandas).

Has there been any discussion of adding something like this to Pandas? If I develop it, would you be interested in seeing a proposal to include it? (I ask now because it might influence some design decisions if I am targeting inclusion in Pandas).

Comment From: TomAugspurger

I suspect it would be out of scope for pandas, since it's pretty stats-specific.

(Aside: statsmodels has an empirical CDF implementation at http://www.statsmodels.org/dev/generated/statsmodels.distributions.empirical_distribution.ECDF.html. Not sure if that suits your needs or not.)

Comment From: AllenDowney

Thanks for the quick reply, Tom. Understood.

The ECDF in StatsModels is along the lines of what I have in mind, but there are a few more methods I'd like to provide (like random sampling).

Comment From: chris-b1

See #14781 (the idea comes from xarray). This feels like a use case that might be better suited to that accessor idea than to subclassing. The idea would be to be able to write something like:

@pd.register_series_accessor('stat')
class StatAccessor:
    def custom_function1(self, ...):
        pass


s = pd.Series([1,2,3,4])

s.stat.custom_function1(...)

I do agree with @TomAugspurger that this is probably out of scope to build into pandas itself.
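
For illustration, here is a minimal runnable sketch of that accessor idea applied to the PMF/CDF use case. It assumes a pandas version that exposes the decorator as `pd.api.extensions.register_series_accessor` (the spelling above was only a proposal). The accessor name `stat` and the methods `pmf`, `cdf`, and `sample` are hypothetical names chosen for this example, not an existing pandas or thinkstats2 API.

```python
import numpy as np
import pandas as pd


@pd.api.extensions.register_series_accessor("stat")
class StatAccessor:
    """Hypothetical accessor; the name and its methods are illustrative only."""

    def __init__(self, series):
        # Drop missing values once so every method sees clean data.
        self._values = series.dropna()

    def pmf(self):
        """Empirical PMF: each distinct value mapped to its relative frequency."""
        return self._values.value_counts(normalize=True).sort_index()

    def cdf(self):
        """Empirical CDF: cumulative sum of the PMF over the sorted values."""
        return self.pmf().cumsum()

    def sample(self, n, seed=None):
        """Draw n values with replacement, weighted by the empirical PMF."""
        pmf = self.pmf()
        rng = np.random.default_rng(seed)
        draws = rng.choice(pmf.index.to_numpy(), size=n, p=pmf.to_numpy())
        return pd.Series(draws)


s = pd.Series([1, 2, 2, 3])
pmf = s.stat.pmf()    # relative frequency of each distinct value
cdf = s.stat.cdf()    # running total of those frequencies
draws = s.stat.sample(5, seed=0)
```

Because the accessor wraps an ordinary Series rather than subclassing it, the results stay plain Series objects and keep working with the rest of pandas.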

Comment From: jreback

Closing. As indicated, this is out of scope. We recently removed all non-basic stats functionality. This should be done in statsmodels or in a pandas plug-in package (just came up with that!).