>>> import pandas as pd
>>> import numpy as np
>>> dt = np.datetime64("2016-01-01", "ms")
>>> ts = pd.Timestamp(dt)
>>> ts.value
1451606400000        # <- 2.0/main
1451606400000000000  # <- 1.x

In previous versions, .value has always been in nanoseconds. By changing the resolution we infer for some Timestamp/Timedelta inputs, we also change what .value returns, and .value is technically public API.

One option would be just to document the change in .value behavior along with the other breaking changes docs.

Another would be to use e.g. ._value internally and keep .value as representing nanos.
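A minimal sketch of that second option (purely illustrative names, not the actual pandas internals): store the raw integer in the object's native unit as _value, and have the public value property always convert to nanoseconds.

```python
# Illustrative sketch only -- not the actual pandas implementation.
UNIT_TO_NS = {"s": 10**9, "ms": 10**6, "us": 10**3, "ns": 1}

class TimestampLike:
    def __init__(self, raw: int, unit: str = "ns"):
        self._value = raw  # raw integer in the object's own unit
        self.unit = unit

    @property
    def value(self) -> int:
        # Public API: always nanoseconds since the epoch.
        return self._value * UNIT_TO_NS[self.unit]

ts = TimestampLike(1451606400000, unit="ms")
print(ts._value)  # 1451606400000 (internal, milliseconds)
print(ts.value)   # 1451606400000000000 (public, nanoseconds)
```

The trade-off is the same one discussed below: the conversion to nanoseconds can overflow for dates far from the epoch, while the internal _value cannot.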

Comment From: mroeschke

Probably least disruptive to have .value continue to represent nanos.

Personally, I was never fond of the vague meaning of .value and could use this as an opportunity to deprecate it in favor of Timestamp.to_epoch (#14772) and Timedelta.to_something

Comment From: jorisvandenbossche

The Arrow CI was failing because of this (in combination with the changed parsing of strings in Timestamp(..), https://github.com/pandas-dev/pandas/issues/50704). The failure itself was only because a test used .value to construct the expected result, and that is easy to update. However, I also looked into our actual conversion code, and it seems we do use .value to access the integer value, and we currently assume that this is always in nanoseconds (https://github.com/apache/arrow/blob/2b50694c10e09e4a1343b62c6b5f44ad4403d0e1/python/pyarrow/src/arrow/python/python_to_arrow.cc#L360-L365)

It's a bit of a corner case (and so apparently also not covered by a test, since we didn't get a failure for this), but it can be triggered by converting a list of Timestamp objects (or an object-dtype array) while explicitly passing a nanosecond timestamp type:

# using current pandas main branch
>>> pa.array([pd.Timestamp("2012-01-01 09:01:02")], type=pa.timestamp("ns"))
<pyarrow.lib.TimestampArray object at 0x7f46ce55ec80>
[
  1970-01-01 00:00:01.325408462
]

# triggering the pd.Timestamp to be nanosecond resolution -> correct result
>>> pa.array([pd.Timestamp("2012-01-01 09:01:02.000000000")], type=pa.timestamp("ns"))
<pyarrow.lib.TimestampArray object at 0x7f46ac585480>
[
  2012-01-01 09:01:02.000000000
]

This could be fixed on the pyarrow side by checking the unit of the Timestamp and only using .value when it is actually nanoseconds, otherwise falling back to interpreting it as a datetime.datetime and getting the components that way.

But it would break current pyarrow releases, so if we want to change .value, ideally we would wait a bit longer to give pyarrow time to update for this.

So to summarize, I agree with the above that the easiest would be to keep .value as returning nanoseconds, always, regardless of the actual unit.

But long term it would be good to have some way to get the raw underlying integer (one that is more efficient than building up this value from the components, as we do for datetime.datetime objects). This could indeed also be a method, as suggested by @mroeschke

Comment From: MarcoGorelli

+1 on @mroeschke 's suggestion

I'll work on .value today

Comment From: jbrockmendel

Two bikeshed thoughts on what to use instead of .value internally: 1) using asi8 would allow some code sharing with DTA/DTI/TDA/TDI; 2) having something distinctive would make it easier to grep for the places where we use it.

Comment From: MarcoGorelli

For now, in https://github.com/pandas-dev/pandas/pull/50891, I just made it _value

I wasn't quite prepared for how many files it would involve changing

Comment From: MarcoGorelli

@jorisvandenbossche @jbrockmendel @mroeschke are we sure this is a good idea?

Because then the public-facing .value would be unavailable for old timestamps, e.g. Timestamp('300-01-01 00:00:00'), because the conversion to nanoseconds would overflow (and that is exactly what motivated the introduction of non-nanosecond resolution)
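For context, int64 nanoseconds can only span roughly 584 years around the epoch, which is why year 300 overflows:

```python
import numpy as np

# The representable range of int64 nanoseconds since the epoch
# (the minimum int64 is reserved as the NaT sentinel, hence the +1):
i64 = np.iinfo(np.int64)
print(np.datetime64(i64.min + 1, "ns"))  # 1677-09-21T00:12:43.145224193
print(np.datetime64(i64.max, "ns"))      # 2262-04-11T23:47:16.854775807
```

These bounds are also exposed as pd.Timestamp.min and pd.Timestamp.max.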

Comment From: jbrockmendel

i guess we'd document that it is for nanos-only and direct users to use e.g. .asm8.view('i8') more generally
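Illustrating that suggestion (behavior as on pandas 2.x, where asm8 preserves the Timestamp's unit; on 1.x asm8 is always nanoseconds):

```python
import numpy as np
import pandas as pd

# .asm8 returns the underlying numpy.datetime64 in the Timestamp's own
# resolution, so viewing it as int64 gives the raw integer without any
# conversion to nanoseconds (and therefore without overflow risk):
ts = pd.Timestamp(np.datetime64("2016-01-01", "ms"))
print(ts.asm8.view("i8"))  # 1451606400000 on pandas 2.x (milliseconds)
```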

Comment From: MarcoGorelli

sure, I'll add that to the error message

Timedelta doesn't support non-nano resolution yet, right? If so, perhaps I'll hold off on the Timedelta changes for now; that'll reduce the diff and make review a bit easier

Comment From: jbrockmendel

Timedelta supports non-nano, but doesn't do inference on strings

Comment From: MarcoGorelli

thanks - how do I set it? I'm seeing

In [6]: Timedelta(days=1, unit='s').unit
Out[6]: 'ns'

Comment From: MarcoGorelli

ah got it, nvm

In [2]: Timedelta(days=1, unit='s').as_unit('s').unit
Out[2]: 's'
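To spell out what the exchange above shows (assuming pandas 2.x behavior): the unit keyword of the Timedelta constructor interprets a numeric input, it does not set the stored resolution; as_unit is what converts the underlying integer, for both Timedelta and Timestamp:

```python
import pandas as pd

# as_unit changes the stored resolution (assuming pandas 2.x):
td = pd.Timedelta(days=1).as_unit("s")
print(td.unit)             # 's'
print(td.asm8.view("i8"))  # 86400 -- one day as whole seconds

ts = pd.Timestamp("2016-01-01").as_unit("ms")
print(ts.unit)             # 'ms'
```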