- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd

pd.to_datetime(1, unit='s')
# Timestamp('1970-01-01 00:00:01')
pd.to_datetime(1, unit='s', origin=1)
# Current --> Timestamp('1970-01-01 00:00:01')
# Expected -> Timestamp('1970-01-01 00:00:02')
pd.to_datetime(1, unit='s', origin=1000000000)
# Current --> Timestamp('1970-01-01 00:00:02')
# Expected -> Timestamp('2001-09-09 01:46:41')
Problem description
The numeric value of origin should be parsed as a number of units defined by unit.
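A minimal sketch of the expected interpretation (the helper to_datetime_with_unit_origin is hypothetical and only illustrates the semantics; it is not current pandas behavior):

import pandas as pd

def to_datetime_with_unit_origin(value, unit, origin):
    # Treat a numeric origin as an offset of `unit` units from the Unix epoch,
    # then add the value (also in `unit` units) on top of that.
    base = pd.Timestamp("1970-01-01") + pd.to_timedelta(origin, unit=unit)
    return base + pd.to_timedelta(value, unit=unit)

to_datetime_with_unit_origin(1, 's', 1)
# Timestamp('1970-01-01 00:00:02')
to_datetime_with_unit_origin(1, 's', 1000000000)
# Timestamp('2001-09-09 01:46:41')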
Output of pd.show_versions()
Comment From: stephan-hesselmann-by
This should be easy to fix, but I'm not sure what the correct expected behavior actually is. According to the docs (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html), origin may be given as "unix" (or POSIX) time, and POSIX time is defined in seconds. So should the correct behavior be to use seconds as the unit for a numeric origin (potentially different from the unit of arg)? Or should origin be in the unit of arg? The current behavior also makes sense, since it corresponds to the default Timestamp(origin) conversion.
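For reference, the default numeric Timestamp conversion interprets the value as nanoseconds since the epoch, which is why the current output above comes out the way it does (a quick check, assuming a recent pandas version):

import pandas as pd

pd.Timestamp(1)
# Timestamp('1970-01-01 00:00:00.000000001')  -> 1 nanosecond
pd.Timestamp(1000000000)
# Timestamp('1970-01-01 00:00:01')            -> 1e9 ns = 1 second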
Comment From: ytchanglp
The doc is very clear:
The numeric values would be parsed as number of units (defined by unit) since this reference date.
Comment From: Dinkarkumar
Can I work on this issue?
Comment From: stephan-hesselmann-by
> The doc is very clear: The numeric values would be parsed as number of units (defined by unit) since this reference date.

At least to me this does not clarify whether the reference date itself should also be parsed as a number of units. Especially with Unix time this would not really make sense, because Unix time is by definition always in seconds.
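To make the two readings concrete, here is a sketch of the "POSIX seconds" interpretation, under the assumption that a numeric origin is always a number of seconds since the epoch regardless of unit (the variable names are illustrative only):

import pandas as pd

# Numeric origin interpreted as POSIX seconds, independent of `unit`.
origin_seconds = 1000000000
base = pd.Timestamp("1970-01-01") + pd.Timedelta(seconds=origin_seconds)
base + pd.to_timedelta(1, unit='s')
# Timestamp('2001-09-09 01:46:41')
# With unit='s' this matches the unit-of-arg interpretation; the two readings
# only diverge when unit is something other than seconds.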