While working on https://github.com/pandas-dev/pandas/pull/29799, it was quite hard to add a new return value from lib.infer_dtype. There are many places where we do something like
    inferred_dtype = lib.infer_dtype(values)
    if inferred_dtype == "string":
        ...
    if inferred_dtype in {"datetime", "integer"}:
        ...
It was hard to grep for these. Could we instead define an enum
    class InferredType(enum.Enum):
        STRING = "string"
        INTEGER = "integer"
        MIXED_INTEGER = "mixed-integer"
        ...
And then return InferredType.INTEGER, and update all our checks to
    if inferred_dtype == InferredType.INTEGER:
I think this should be entirely backwards compatible, since we're still just returning the string.
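One caveat on the compatibility point, sketched below under assumptions: a plain enum.Enum member does not compare equal to its string value, so existing checks against "integer" would only keep working if the enum also subclasses str (or if callers compare against .value). The class names PlainInferredType and StrInferredType are hypothetical and exist only for this illustration; this is not what pandas does today.

```python
import enum


# Hypothetical names, for illustration only; not part of pandas.
class PlainInferredType(enum.Enum):
    INTEGER = "integer"


class StrInferredType(str, enum.Enum):
    INTEGER = "integer"


# A plain Enum member is not equal to the raw string it wraps...
plain_equals_str = PlainInferredType.INTEGER == "integer"  # False
# ...so callers would have to compare against .value instead.
plain_value_equals_str = PlainInferredType.INTEGER.value == "integer"  # True

# A str-mixin Enum member is itself a str, so existing checks still pass.
str_equals_str = StrInferredType.INTEGER == "integer"  # True
str_in_set = StrInferredType.INTEGER in {"datetime", "integer"}  # True
```

With the str mixin, old call sites such as inferred_dtype == "integer" would keep working while new code can use the enum member, which is easier to grep for.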
Comment From: jreback
yep this is generally a good idea
Comment From: alimcmaster1
Sounds good this would be neat.
I guess the check would be
if inferred_dtype == InferredType.INTEGER.value
I can potentially take a look at this if we feel it's worth doing.
Comment From: theoniko
Hello @TomAugspurger, @mroeschke, @alimcmaster1, @jreback. Could you please have a look at https://github.com/pandas-dev/pandas/pull/52517 and give me feedback on whether the PR is heading in the right direction? I am also not sure about https://github.com/pandas-dev/pandas/blob/b1dbbe08fd958afd8df4bbec437215daf2773c06/pandas/_libs/lib.pyx#L1522?
Comment From: theoniko
Could somebody have a look at my PR and give feedback?