We currently expose all of the following items from pandas.core.api via the API:
from pandas.core.api import (
    # dtype
    Int8Dtype, Int16Dtype, Int32Dtype, Int64Dtype, UInt8Dtype,
    UInt16Dtype, UInt32Dtype, UInt64Dtype, CategoricalDtype,
    PeriodDtype, IntervalDtype, DatetimeTZDtype,
    # missing
    isna, isnull, notna, notnull,
    # indexes
    Index, CategoricalIndex, Int64Index, UInt64Index, RangeIndex,
    Float64Index, MultiIndex, IntervalIndex, TimedeltaIndex,
    DatetimeIndex, PeriodIndex, IndexSlice,
    # tseries
    NaT, Period, period_range, Timedelta, timedelta_range,
    Timestamp, date_range, bdate_range, Interval, interval_range,
    DateOffset,
    # conversion
    to_numeric, to_datetime, to_timedelta,
    # misc
    np, Grouper, factorize, unique, value_counts, NamedAgg,
    array, Categorical, set_eng_float_format, Series, DataFrame,
    Panel)
Below is a pseudo-prioritized list of the annotations I think we need out of this. I'm open to suggestions on how to prioritize, and community PRs are very welcome!
- [ ] DataFrame
- [ ] Series
- [ ] Index
- [ ] MultiIndex
- [ ] Categorical
- [ ] CategoricalIndex
- [ ] Datetimelike indices
- [ ] Numeric indices
...
These don't necessarily need to be completed in order. I will continue to expand the checklist as we tackle more items, so if you see something you'd like to tackle, feel free to call it out.
Comment From: WillAyd
@vaibhavhrt @gwrome
Comment From: vaibhavhrt
@WillAyd I am not sure what exactly needs to be done. Do we need to annotate every attribute/method of classes like DataFrame, Index, etc.? Can you provide an example (mypy docs or an SO link, maybe)?
Comment From: WillAyd
Yes, that's correct. We would want to add annotations to the methods of these objects (and to attributes where inference may not work).
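As a rough illustration of what such annotations look like, here is a minimal sketch on a toy class. `ToyFrame` and all of its members are hypothetical stand-ins, not pandas code. It only shows the shape of the work: method signatures get parameter and return types, and an attribute initialized to an empty container gets an explicit annotation, since a checker such as mypy cannot infer the element types from `{}` alone.

```python
from typing import Dict, List, Optional


class ToyFrame:
    """Hypothetical stand-in for a pandas object; not the real DataFrame."""

    def __init__(self, columns: Optional[List[str]] = None) -> None:
        # Explicit attribute annotations: an empty list or dict gives the
        # type checker nothing to infer the element types from.
        self.columns: List[str] = list(columns or [])
        self._positions: Dict[str, int] = {}
        # No annotation needed here: the checker infers `str` from the literal.
        self.kind = "toy"

    def rename(self, mapping: Dict[str, str]) -> "ToyFrame":
        """Return a new frame with columns renamed according to `mapping`."""
        return ToyFrame([mapping.get(c, c) for c in self.columns])

    def position(self, column: str) -> Optional[int]:
        """Return the index of `column`, or None if it is not present."""
        if column not in self._positions:
            if column not in self.columns:
                return None
            self._positions[column] = self.columns.index(column)
        return self._positions[column]


frame = ToyFrame(["a", "b"])
renamed = frame.rename({"a": "x"})
```

With signatures like these in place, a type checker can flag calls such as `frame.position(0)` or code that treats the `Optional[int]` result as a plain `int`.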
Comment From: vaibhavhrt
Alright, I will start with DataFrame, since it's one of the few things in pandas I use in my projects. I will open a new issue for it.
Comment From: TomAugspurger
I'm unpinning this to make room for https://github.com/pandas-dev/pandas/issues/31879. Will re-pin tomorrow.
Comment From: finete
@WillAyd I am not quite sure that I understand the meaning of this issue. Is it about creating something like this?
import datetime
import numpy as np
import pandas as pd

class MyDataFrame(pd.DataFrame):
    col_foo: datetime.datetime

def func(df: MyDataFrame):
    df['col_foo'].dt

func(MyDataFrame())  # mypy passes
func(pd.DataFrame(columns=['col_foo'], dtype=np.datetime64))  # mypy passes?
func(pd.DataFrame(columns=['col_foo']))  # mypy raises error?
I was looking for something that imitates the dataclass / NamedTuple usage API.
Comment From: WillAyd
We can close this issue; I haven't tracked it in quite some time
Comment From: finete
@WillAyd which issue tracks development of something similar to the type annotations I have mentioned above?
Comment From: karlicoss
I'd be interested in something like this. Obviously it would be very hard to type everything properly and conveniently (considering how dynamic pandas can be), but for simple use cases, having some helpers which exploit Literal and TypedDict could cover many cases where typing is desirable.
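To make the idea concrete, here is a minimal sketch of such a helper using only the standard `typing` module (Python 3.8+). The names `Row`, `ColumnName`, and `select` are hypothetical and exist only for this illustration; nothing here is part of pandas. The records are plain dicts standing in for table rows.

```python
from typing import List, Literal, TypedDict


class Row(TypedDict):
    """Hypothetical schema: one record with two known columns."""

    col_foo: str
    col_bar: int


# Only these column names are accepted by the helper below.
ColumnName = Literal["col_foo", "col_bar"]


def select(rows: List[Row], column: ColumnName) -> List[object]:
    """Return the values of one column from a list of typed records."""
    return [row[column] for row in rows]


rows: List[Row] = [
    {"col_foo": "a", "col_bar": 1},
    {"col_foo": "b", "col_bar": 2},
]
bar_values = select(rows, "col_bar")
```

A checker such as mypy should reject `select(rows, "col_baz")` because the name is not one of the allowed literals, while the code behaves like ordinary dict access at runtime. The return type is deliberately loose; narrowing it per column would need overloads, which this sketch leaves out.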
Comment From: WillAyd
Sounds good. Feel free to submit PRs to improve annotations - they are always welcome
Comment From: jbrockmendel
I think this has served its purpose. Closing.