It would be great if a Series could be annotated with its index and value types, for example:

def process_series(ser: pd.Series[int, str]) -> pd.Series[int, int]:

to indicate that the function accepts a string-valued series with an integer index and returns an int-valued series with an integer index. It would be even better to build on this so that the types of a series' members could be inferred using TypeVars (for example, pd.Series[int, str].iloc would accept an integer and return a string), but that's not necessary; it could be a bonus or a later milestone.
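As a rough illustration of the idea, here is a minimal sketch using only the standard library. `TypedSeries`, `IndexT`, and `ValueT` are hypothetical names invented for this example and are not part of pandas; the class is a toy stand-in that only shows how two type parameters would let a type checker infer the result of positional and label-based access.

```python
from typing import Generic, Iterable, List, TypeVar

IndexT = TypeVar("IndexT")  # hypothetical: type of the index labels
ValueT = TypeVar("ValueT")  # hypothetical: type of the stored values


class TypedSeries(Generic[IndexT, ValueT]):
    """Toy stand-in (not pandas) for a series generic in index and value types."""

    def __init__(self, index: Iterable[IndexT], values: Iterable[ValueT]) -> None:
        self.index: List[IndexT] = list(index)
        self.values: List[ValueT] = list(values)

    def iloc(self, position: int) -> ValueT:
        # Positional access: takes an int, returns whatever ValueT is bound to.
        return self.values[position]

    def loc(self, label: IndexT) -> ValueT:
        # Label access: takes an IndexT, returns a ValueT.
        return self.values[self.index.index(label)]


def process_series(ser: TypedSeries[int, str]) -> TypedSeries[int, int]:
    # Replace each string value by its length, keeping the integer index.
    return TypedSeries(ser.index, [len(v) for v in ser.values])


names = TypedSeries([10, 20], ["ab", "cde"])
lengths = process_series(names)
```

With these annotations, a type checker can treat `names.iloc(0)` as a `str` and `lengths.loc(10)` as an `int` without any runtime checks being involved.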

To do that, you could add a new module (maybe "pandas.typing"?) that would contain these type hints and would require minimal integration (if any) into the pandas infrastructure. There's a similar package for NumPy, external to it, called nptyping, which could be used as a reference.

Comment From: simonjayhawkins

Thanks @erezinman for the report. Adding this would certainly be easier with stubs, but we have not yet reached a decision on stubs, xref #28142.

To do this with type annotations in the code, we would need to use typing.Generic.

Thanks for the link. I suspect we would follow NumPy conventions for the type parameters. https://github.com/numpy/numpy-stubs

This might be as simple as writing np.ndarray[np.float64], but will need a decision about appropriate syntax for shape typing to ensure that this is forwards compatible with typing shapes.

In pandas we would also need to allow Series to be backed by pandas extension arrays for the values and index, so I suspect the type parameters would be NumPy/pandas extension dtypes rather than Python types, so maybe something like pd.Series[pd.Int64Dtype(), pd.StringDtype()].

This would be cumbersome, so having pd.Series[int, str] represent the same thing would certainly be welcome from a user perspective.

Further investigation and PRs welcome.

Comment From: simonjayhawkins

Fixed in https://github.com/pandas-dev/pandas-stubs/commit/ba7aa5f0d5f629904c3a6d4030fd484b0cdb8047

Comment From: erezinman

Hi @simonjayhawkins, could you please elaborate on how this commit solves the issue? I looked at the referenced commit and couldn't see how. Thanks.

Comment From: simonjayhawkins

IIUC Series and Index are now generic wrt dtype in https://github.com/pandas-dev/pandas-stubs

@Dr-Irv is this documented anywhere how to use these?

For DataFrame (not explicitly mentioned in this issue), there is an open issue https://github.com/pandas-dev/pandas-stubs/issues/295

Comment From: Dr-Irv

> IIUC Series and Index are now generic wrt dtype in https://github.com/pandas-dev/pandas-stubs

Right now, only Series is generic. We have open issues to investigate making Index generic.

> @Dr-Irv is this documented anywhere how to use these?

What we have written up is here: https://github.com/pandas-dev/pandas-stubs/blob/main/docs/philosophy.md#use-of-generic-types

Note that if you want to specify an annotation like def f(x: Series[int]) -> None, you probably have to surround the type declaration with quotes, i.e. def f(x: "Series[int]") -> None, because the generic declaration is unknown at runtime.
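The effect of the quotes can be sketched without pandas installed. In the snippet below, `Series` is a hypothetical stand-in class (not the pandas one) that, like a class made generic only in its stubs, cannot be subscripted at runtime; `total` is likewise an illustrative name.

```python
class Series:
    """Hypothetical stand-in for a class that is generic only in its type stubs."""

    def __init__(self, data):
        self.data = list(data)


# Quoted annotation: stored as a plain string and not evaluated at runtime,
# so the missing runtime support for Series[...] does not matter here.
def total(x: "Series[int]") -> int:
    return sum(x.data)


# Evaluating the subscript directly fails, because this stand-in class
# defines no __class_getitem__. An unquoted annotation triggers the same
# evaluation whenever the interpreter evaluates annotations eagerly.
try:
    Series[int]
    subscript_ok = True
except TypeError:
    subscript_ok = False
```

A type checker still reads the quoted string as `Series[int]`, so static checking is unaffected. Placing `from __future__ import annotations` at the top of a module is another way to keep annotations from being evaluated when functions are defined.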

It's worth mentioning that the pandera project supports generic types both at the typing level and at runtime. See https://pandera.readthedocs.io/en/stable/schema_models.html. I haven't used this, but was made aware of it by the pandera authors when they reported some issues with pandas-stubs.