Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
In [28]: import pandas as pd
In [29]: p = pd.Series([123123000123123, 1323123132, 23123123123], dtype='datetime64[ns]')
In [30]: p
Out[30]:
0 1970-01-02 10:12:03.000123123
1 1970-01-01 00:00:01.323123132
2 1970-01-01 00:00:23.123123123
dtype: datetime64[ns]
In [31]: p.dt.second
Out[31]:
0 3
1 1
2 23
dtype: int64
In [32]: p.dt.nanosecond
Out[32]:
0 123
1 132
2 123
dtype: int64
In [33]: p.dt.microsecond
Out[33]:
0 123
1 323123
2 123123
dtype: int64
Issue Description
When second and nanosecond are accessed, the results are consistent with expectations, i.e., only the second and nanosecond components of the times are returned. But when microsecond is accessed, the result combines the millisecond and microsecond components (for example, 323123 instead of 123 for the second row). This doesn't seem consistent with the way the other properties return their results.
This is also a feature request to have a millisecond property added to DatetimeProperties.
Expected Behavior
In [33]: p.dt.microsecond
Out[33]:
0 123
1 123
2 123
dtype: int64
In [37]: p.dt.millisecond
Out[37]:
0 0
1 323
2 123
dtype: int16
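A possible workaround with the current API, shown only as a sketch: the two components requested above can be derived from the existing microsecond accessor with integer division and modulo. The names `millisecond` and `microsecond_only` are illustrative variables, not pandas attributes.

```python
import pandas as pd

# Same data as in the reproducible example above.
p = pd.Series([123123000123123, 1323123132, 23123123123], dtype="datetime64[ns]")

# .dt.microsecond returns the full sub-second fraction in microseconds (0..999999).
us = p.dt.microsecond

# Hypothetical split into a millisecond part and a microsecond-within-millisecond part.
millisecond = us // 1000       # values: 0, 323, 123
microsecond_only = us % 1000   # values: 123, 123, 123

print(millisecond.tolist())
print(microsecond_only.tolist())
```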
Installed Versions
Comment From: mroeschke
Going to restructure this issue as an enhancement request, since this stems from millisecond not being supported. The return value of microsecond is consistent with datetime.datetime, which also doesn't support millisecond.
Also, if pandas supports millisecond, I think .microsecond should still match the datetime.datetime behavior, since the "base" Timestamp tries to be fully compatible with it.
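For reference, a small standard-library-only sketch of the datetime.datetime behavior being matched: microsecond carries the whole sub-second fraction, and there is no millisecond attribute. The variable names are illustrative.

```python
import datetime

# A datetime with 123456 microseconds past the second.
d = datetime.datetime(2000, 1, 1, 1, 1, 1, 123456)

# microsecond is the full fraction of the second, in range(1_000_000),
# so it includes what one might call the "millisecond" digits.
print(d.microsecond)  # 123456

# The standard library exposes no separate millisecond field.
has_millisecond = hasattr(d, "millisecond")
print(has_millisecond)  # False
```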
Comment From: PedroPUCRIO
take
Comment From: MarcoGorelli
If I've understood correctly, I think I'd be -1 on adding millisecond.
Suppose one had:
In [30]: ts
Out[30]: Timestamp('2000-01-01 01:01:01.123456789')
In [31]: ts.millisecond
Out[31]: 123
Then one might well expect to get 123456789 from the following, instead of 246456789:
In [32]: ts.millisecond * 1_000_000 + ts.microsecond * 1_000 + ts.nanosecond
Out[32]: 246456789
This could be resolved by making ts.microsecond return 456 instead, but that'd be a breaking change from the Python standard library.
To minimise surprises to users, I'd advocate for staying consistent with the Python standard library:
- no millisecond
- microsecond stays as-is
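The arithmetic behind this objection can be sketched in plain Python. This is an illustration only: `split_stdlib_style` is a hypothetical helper, and `millisecond` here stands in for the proposed field, which does not exist on Timestamp.

```python
def split_stdlib_style(ns_of_second):
    """Split nanoseconds within a second into (microsecond, nanosecond),
    where microsecond is in range(1_000_000) as in the standard library."""
    microsecond, nanosecond = divmod(ns_of_second, 1_000)
    return microsecond, nanosecond


frac = 123_456_789  # the .123456789 fraction from the example above
microsecond, nanosecond = split_stdlib_style(frac)

# Hypothetical extra field, if millisecond were added alongside the others.
millisecond = microsecond // 1_000

# With microsecond left as-is, two fields already recompose the fraction.
recomposed = microsecond * 1_000 + nanosecond

# Adding the millisecond term counts those digits twice.
naive = millisecond * 1_000_000 + microsecond * 1_000 + nanosecond

print(recomposed)  # 123456789
print(naive)       # 246456789
```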
Comment From: MarcoGorelli
closing for now then, but thanks for the suggestion!