Pandas BUG: Rolling skew error

Pandas version checks

[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
data =[-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,7729964027,7729965641,7729965103,7729966179,7729968330,7729969406,7729971020,7729969406,7729969944,7729969406,7729969406,7729969944,7729971558,7729972633,7729973171,7729973171,7729973171,7729969406]

df = pd.DataFrame(data, columns=['data'])
df.data.rolling(10).skew()
43   -1.778781e+00
44   -3.162278e+00
45   -1.311680e+03
46    2.681674e+04
47    5.244346e+03
48    5.565501e+03
49    4.143785e+04
50    1.921654e+04
51   -7.723187e+03
52    1.171920e+04
53    1.171920e+04

df.data.tail(10).skew()
0.16889762763904356

# Same problem for kurt()!
df.data.rolling(10).kurt()[-1:]
53   -4.518855e+10

df.data.tail(10).kurt()
-2.198110595961745

Issue Description

When the values in the data span very different orders of magnitude, rolling(10).skew() and tail(10).skew() give different results for the same final window. kurt() shows the same discrepancy.
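A rough way to see why magnitude matters, offered as a sketch and not as a description of pandas internals: skewness depends on third powers, and near the cube of values around 7.7e9 adjacent doubles are far apart. The snippet below uses only the last ten values from the example above and the standard-library `math.ulp` (Python 3.9+). The variable names are illustrative.

```python
import math

# Last ten values of the reproducible example (the window tail(10) sees).
tail = [7729969944.0, 7729969406.0, 7729969406.0, 7729969944.0, 7729971558.0,
        7729972633.0, 7729973171.0, 7729973171.0, 7729973171.0, 7729969406.0]

# Gap between adjacent doubles near the largest cubed value.
spacing = math.ulp(max(tail) ** 3)

# The window itself only spans a few thousand units.
spread = max(tail) - min(tail)

print("spacing of doubles near x**3:", spacing)
print("cube of the window spread:   ", spread ** 3)
```

The cubed spread is an upper bound on any cubed deviation from the window mean, and it is smaller than the spacing. Any formula that works from sums of raw cubes therefore cannot resolve the quantity the skewness depends on.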

Expected Behavior

The last value of rolling(10).skew() should equal tail(10).skew(), up to floating-point rounding, and likewise for kurt().

Installed Versions

INSTALLED VERSIONS
------------------
commit : 06d230151e6f18fdb8139d09abf539867a8cd481
python : 3.9.7.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19044
machine : AMD64
processor : AMD64 Family 23 Model 1 Stepping 1, AuthenticAMD
byteorder : little
LC_ALL : None
LANG : None
LOCALE : Russian_Russia.1251
pandas : 1.4.1
numpy : 1.22.2
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.4
setuptools : 62.4.0
Cython : 0.29.27
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.7.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 8.0.1
pandas_datareader: None
bs4 : 4.10.0
bottleneck : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.1
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None

Comment From: phofl

Hi, thanks for your report.

Probably an overflow. Could you trim your example to use only as many values as necessary? This looks a bit large for a minimal example.

Comment From: alex7088

I reduced the data in the example to 54 elements and also found the same error with kurt().

Comment From: GYHHAHA

A refined example:

>>> s = pd.Series([-2., 200000., 200001., 200005.])
>>> s.tail(3).skew()
1.457862967321305
>>> s.rolling(3).skew()
 0         NaN
 1         NaN
 2   -1.732051
 3    2.071599
 dtype: float64

When the last three numbers grow larger, the last element of the rolling result deviates from the correct skew value (1.457863, as returned by tail(3).skew()).
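The effect can be reproduced without pandas. The sketch below compares two hypothetical helpers: `skew_raw_moments`, which derives the skewness from raw power sums, and `skew_centered`, which subtracts the mean first. Neither is pandas code. In particular, pandas updates its rolling sums incrementally, which this sketch does not model. It assumes at least three values per window and, for `skew_centered`, a non-constant window.

```python
import math

def skew_raw_moments(values):
    """Adjusted sample skewness from raw power sums (sum x, sum x^2, sum x^3)."""
    n = float(len(values))
    x = sum(values)
    xx = sum(v * v for v in values)
    xxx = sum(v * v * v for v in values)
    a = x / n
    b = xx / n - a * a                   # variance as a difference of large numbers
    c = xxx / n - a * a * a - 3 * a * b  # third central moment, same cancellation
    if b <= 0:
        return math.nan                  # rounding can push the variance to <= 0
    return math.sqrt(n * (n - 1)) * c / ((n - 2) * b ** 1.5)

def skew_centered(values):
    """Same statistic, computed from deviations around the mean."""
    n = float(len(values))
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return math.sqrt(n * (n - 1)) * m3 / ((n - 2) * m2 ** 1.5)

# Same shape of window (0, 1, 5) moved to larger and larger offsets.
for offset in (2e5, 2e7, 2e9):
    window = [offset + v for v in (0.0, 1.0, 5.0)]
    print(offset, skew_raw_moments(window), skew_centered(window))
```

Skewness does not change when a constant is added to every value, so the centered column should stay at about 1.457863 for every offset. The raw-moment column depends on rounding and is expected to drift as the offset grows.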

Comment From: GYHHAHA

The problem is caused by limited double precision, not by overflow. I have fixed the refined example by using the C long double type and by rearranging the computation of the moments so that the multiplications come before the divisions, in the following lines.

# original
A = x / dnobs
B = xx / dnobs - A * A
C = xxx / dnobs - A * A * A - 3 * A * B
# new
B = (xx * dnobs - x * x) / dnobs / dnobs
C = (xxx * dnobs * dnobs - 3. * x * xx * dnobs + 2. * x * x * x) / dnobs / dnobs / dnobs
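As a sanity check on the rearrangement, the two sets of formulas can be compared in exact rational arithmetic, where they should agree, and then in ordinary doubles, where they may not. This is only a sketch: the function names are made up for illustration, and since Python floats are doubles it does not reproduce the long double part of the change.

```python
from fractions import Fraction

def central_moments_original(x, xx, xxx, n):
    a = x / n
    b = xx / n - a * a
    c = xxx / n - a * a * a - 3 * a * b
    return b, c

def central_moments_rewritten(x, xx, xxx, n):
    b = (xx * n - x * x) / n / n
    c = (xxx * n * n - 3 * x * xx * n + 2 * x * x * x) / n / n / n
    return b, c

def power_sums(values):
    return sum(values), sum(v * v for v in values), sum(v ** 3 for v in values)

window = [200000, 200001, 200005]

# Exact arithmetic: both versions give the same B and C.
exact = [Fraction(v) for v in window]
n_exact = Fraction(len(window))
exact_original = central_moments_original(*power_sums(exact), n_exact)
exact_rewritten = central_moments_rewritten(*power_sums(exact), n_exact)
print("exact:    ", exact_rewritten)

# Double precision: the two versions round differently.
floats = [float(v) for v in window]
n_float = float(len(window))
print("original: ", central_moments_original(*power_sums(floats), n_float))
print("rewritten:", central_moments_rewritten(*power_sums(floats), n_float))
```

For this window the exact values are B = 14/3 and C = 6, so the float outputs can be read directly as a measure of how much each ordering loses.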

But for @alex7088's example, the result is still not accurate, though numerically much better. So I'm not sure whether I should open a PR. @phofl
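For comparison, here is a sketch of a different approach from the one proposed above: because skewness is unchanged by adding a constant, each window can be shifted by one of its own values before any powers are taken. The helpers `window_skew` and `rolling_skew_shifted` are hypothetical, recompute every window from scratch (O(n * window)), and are not how pandas implements rolling skew.

```python
import math

def window_skew(values):
    """Adjusted sample skewness of one window, shifted by its first value."""
    n = len(values)
    if n < 3:
        return math.nan
    ref = values[0]
    shifted = [v - ref for v in values]  # constant shift leaves skewness unchanged
    mean = sum(shifted) / n
    m2 = sum((v - mean) ** 2 for v in shifted) / n
    m3 = sum((v - mean) ** 3 for v in shifted) / n
    if m2 == 0:
        return math.nan                  # constant window: skewness undefined
    return math.sqrt(n * (n - 1)) * m3 / ((n - 2) * m2 ** 1.5)

def rolling_skew_shifted(data, window):
    """Recompute window_skew for every full window; NaN before the first one."""
    out = []
    for end in range(1, len(data) + 1):
        if end < window:
            out.append(math.nan)
        else:
            out.append(window_skew(data[end - window:end]))
    return out

# Refined example from the thread.
refined = rolling_skew_shifted([-2.0, 200000.0, 200001.0, 200005.0], 3)
print(refined)

# Last ten values of the original report; tail(10).skew() was 0.16889762763904356.
tail = [7729969944.0, 7729969406.0, 7729969406.0, 7729969944.0, 7729971558.0,
        7729972633.0, 7729973171.0, 7729973171.0, 7729973171.0, 7729969406.0]
tail_skew = window_skew(tail)
print(tail_skew)
```

On these inputs the shifted computation agrees with the tail().skew() values quoted in the thread, at the cost of losing the incremental update that makes the rolling version fast.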