Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

>>> x = 340.3333333333333
>>> pd.Timestamp(x, unit='us')
Timestamp('1970-01-01 00:00:00.000340333')
>>> pd.Timestamp(x, unit='us').unit
'ns'
>>> pd.Timestamp(x, unit='us').as_unit('us')
Timestamp('1970-01-01 00:00:00.000340')
>>> pd.Timestamp(x, unit='us').as_unit('us').unit
'us'

Issue Description

When creating a pd.Timestamp or pd.Timedelta with a specific unit, the result always seems to have unit ns. We have to call as_unit('us') to get the desired result.

Expected Behavior

>>> pd.Timestamp(x, unit='us')
Timestamp('1970-01-01 00:00:00.000340')

Installed Versions

INSTALLED VERSIONS
------------------
commit : c2a7f1ae753737e589617ebaaff673070036d653
python : 3.10.10.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-76-generic
Version : #86-Ubuntu SMP Fri Jan 17 17:24:28 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.0.0rc1
numpy : 1.23.5
pytz : 2023.2
dateutil : 2.8.2
setuptools : 67.6.0
pip : 23.0.1
Cython : 0.29.33
pytest : 7.2.2
hypothesis : 6.70.1
sphinx : 5.3.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.11.0
pandas_datareader: None
bs4 : 4.12.0
bottleneck : None
brotli :
fastparquet : None
fsspec : 2023.3.0
gcsfs : None
matplotlib : None
numba : 0.56.4
numexpr : None
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : 11.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2023.3.0
scipy : 1.10.1
snappy :
sqlalchemy : 1.4.46
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
zstandard : None
tzdata : None
qtpy : None
pyqt5 : None

Comment From: jbrockmendel

The "unit" keyword describes how the input is interpreted, not how the result is stored internally (which the .unit attribute represents). The overlapping naming is unfortunate.
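To make the distinction concrete, here is a small sketch (the variable names are illustrative only, not from the report). It assumes pandas 2.x, where Timestamp.as_unit is available, and uses integer inputs so no rounding is involved. The same number means a different instant depending on the unit keyword, while as_unit changes only the storage resolution.

```python
import pandas as pd

# The `unit` keyword tells pandas how to interpret the number 1.
one_second = pd.Timestamp(1, unit="s")        # 1 second after the epoch
one_millisecond = pd.Timestamp(1, unit="ms")  # 1 millisecond after the epoch
print(one_second)
print(one_millisecond)

# `.unit` / `as_unit` concern how the value is stored, not which instant it is.
stored_ms = one_second.as_unit("ms")
print(stored_ms.unit)            # resolution of the stored value
print(stored_ms == one_second)   # same instant, different storage resolution
```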

cc @MarcoGorelli

Comment From: mroeschke

I forgot if this was discussed, but is it possible if a user passes a numeric + unit to automatically also store the representation in that unit?

Comment From: jbrockmendel

is it possible if a user passes a numeric + unit to automatically also store the representation in that unit?

For units that we support, yes. For minute and lower we'd default back to seconds. I'd be on board with this.

Comment From: mroeschke

For units that we support, yes. For minute and lower we'd default back to seconds. I'd be on board with this.

Likewise, makes sense to me

Comment From: jbrockmendel

OK with putting this in 2.0?

Comment From: mroeschke

Yes definitely

Comment From: MarcoGorelli

OK with putting this in 2.0?

totally - marking as blocker then if that's OK?

Comment From: jbrockmendel

Implementing this I now remember why we didn't do it already: it introduces possibly-unwanted rounding in float cases, i.e. Timestamp(1.5, unit="s") drops the half-second. This breaks 6 tests in TestTimestamp::test_unit.
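The rounding concern can be reproduced with as_unit alone. This is only a sketch, assuming pandas 2.x: it converts explicitly rather than relying on whatever resolution the constructor picks, and it does not assert the exact rounded value.

```python
import pandas as pd

ts = pd.Timestamp(1.5, unit="s")  # 1.5 seconds after the epoch

in_ms = ts.as_unit("ms")  # 1500 ms fits exactly at millisecond resolution
in_s = ts.as_unit("s")    # second resolution cannot hold the half-second

print(in_ms == ts)  # True: nothing was lost
print(in_s == ts)   # False: the fractional part was dropped
```

Storing the result directly in the input unit would have the same effect as the second conversion, which is the behavior the failing tests guard against.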

Comment From: mroeschke

it introduces possibly-unwanted rounding in float cases

Ah okay. Yeah, I don't think specifying unit should potentially round the input. Maybe for this particular case we should document that the unit keyword does not set the .unit of the result, and that users should call as_unit afterwards.
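One way users could apply as_unit afterwards without risking silent rounding is sketched below. The helper timestamp_with_unit is hypothetical, not a pandas API. It assumes pandas 2.x, that unit is one of the resolutions as_unit accepts ("s", "ms", "us", "ns"), and that as_unit(..., round_ok=False) raises ValueError when the conversion would lose precision.

```python
import pandas as pd


def timestamp_with_unit(value, unit):
    """Hypothetical helper: store the result in `unit` only if that is lossless."""
    ts = pd.Timestamp(value, unit=unit)
    try:
        # round_ok=False refuses conversions that would drop precision.
        return ts.as_unit(unit, round_ok=False)
    except ValueError:
        # Keep the original, finer-resolution timestamp instead of rounding.
        return ts


exact = timestamp_with_unit(340, "us")  # integer input converts losslessly
kept = timestamp_with_unit(1.5, "s")    # the half-second is preserved
print(exact.unit)
print(kept)
```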

Comment From: mroeschke

totally - marking as blocker then if that's OK?

IMO this shouldn't be a strict blocker for 2.0, but it would be nice to fix/document.

Comment From: MarcoGorelli

sure, sounds good!

separate issue, but float inputs to to_datetime feel like something we should consider deprecating anyway... removing the 'blocker' label