Pandas version checks
- [X] I have checked that the issue still exists on the latest versions of the docs on main here
Location of the documentation
https://pandas.pydata.org/docs/user_guide/basics.html#dtypes
Documentation problem
In the dtypes section, a list of integer types is given, but not floating-point types. According to pandas.to_numeric() under the downcast argument, it appears float32 is the smallest floating-point type, which implies float16 is not documented.
In addition, the integer types are listed with capitalization, but it appears that capitalization does not matter, which is not documented.
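The downcast behaviour referred to above can be checked with a short sketch like the one below. The sample values and variable names are illustrative assumptions, not taken from the docs; the only claim relied on is the one in the to_numeric documentation, that downcast="float" goes no smaller than float32.

```python
import numpy as np
import pandas as pd

# Illustrative data; pandas stores these as float64 by default.
values = pd.Series([1.0, 2.5, 3.0])

# downcast="float" asks for the smallest float dtype that can hold the data.
# The to_numeric docs state np.float32 as the minimum for this option.
downcast = pd.to_numeric(values, downcast="float")

print(values.dtype)    # dtype before downcasting
print(downcast.dtype)  # dtype after downcasting
```

The values chosen here are exactly representable in float32, so the downcast is expected to go through.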
Suggested fix for documentation
- List supported floating point types.
- Mention float16 not supported. (Could be implicitly solved by point 1.)
- Mention entries are not case-sensitive.
- Link to NumPy types.
Resulting from https://github.com/pandas-dev/pandas/issues/50213.
Comment From: phofl
Capitalization matters. Int64 is an extension dtype while int64 is the NumPy dtype. But this is documented.
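A minimal sketch of that difference, assuming a reasonably recent pandas (the variable names are only for illustration):

```python
import numpy as np
import pandas as pd

# Lowercase alias: plain NumPy int64, which cannot hold missing values.
numpy_backed = pd.Series([1, 2, 3], dtype="int64")

# Capitalized alias: pandas' nullable integer extension dtype.
nullable = pd.Series([1, 2, None], dtype="Int64")

print(numpy_backed.dtype)     # NumPy dtype
print(nullable.dtype)         # extension dtype
print(nullable.isna().sum())  # number of missing entries kept as <NA>
```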
Comment From: mcp292
Thank you for clarifying. Can you please link to documentation?
Comment From: phofl
The table listing all int dtypes in your link already has a link to nullable integer. One of the first sentences there clarifies the capital I at the beginning
Comment From: MarcoGorelli
Do we just need to add a row with FloatArray then? I think this'd be fine - do you want to open a pull request? Here's the contributing guide https://pandas.pydata.org/docs/dev/development/contributing.html
Comment From: mcp292
The table listing all int dtypes in your link already has a link to nullable integer. One of the first sentences there clarifies the capital I at the beginning
I'm still not seeing the sentence you refer to. Can you please quote it?
Comment From: mcp292
Do we just need to add a row with FloatArray then? I think this'd be fine - do you want to open a pull request? Here's the contributing guide https://pandas.pydata.org/docs/dev/development/contributing.html
Sure! It'll be a few days before I can make time for this. Thank you for the link.
Comment From: MarcoGorelli
I'm still not seeing the sentence you refer to. Can you please quote it?
It's probably this one
Or the string alias "Int64" (note the capital "I", to differentiate from NumPy’s 'int64' dtype:
(btw, looks like there's a missing closed bracket)
Comment From: mcp292
Thank you! Would it be acceptable to add the closing parenthesis in the same PR?
Comment From: MarcoGorelli
yup 😄
Comment From: mcp292
Looks like there's no entry for FloatingArray.
I'm trying to track down a list of the available FloatingArray types. I'm looking first to track down the IntegerArray types listed on the dtypes page, so that I may find the same for FloatingArray.
Do you know where I can find this documentation?
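For reference while looking, here is a small sketch of what a FloatingArray looks like in practice. It assumes pandas 1.2 or later, where the nullable float dtypes with the string aliases "Float32" and "Float64" are available; the sample values are made up.

```python
import pandas as pd

# Capitalized aliases select the nullable float extension dtypes.
arr64 = pd.array([1.5, None, 3.0], dtype="Float64")
arr32 = pd.array([1.5, None], dtype="Float32")

print(type(arr64).__name__)  # array class backing the data
print(arr64.dtype)           # nullable 64-bit float dtype
print(arr32.dtype)           # nullable 32-bit float dtype
```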
Comment From: mcp292
I noticed that UInt8 warns against information loss due to conversion to a shorter type, but uint8 does not:
import pandas as pd
data = [255, 256, 257, 258, 259]
df = pd.DataFrame(data)
df.astype("uint8")
df.astype("UInt8")
This advantage is not documented on the nullable integer page. The only advantage listed there is the preservation of integer type in the presence of NaNs.
Is this worth a separate issue/PR?
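To make the difference easier to see, here is a hedged variant of the snippet above that captures both outcomes. It assumes the nullable cast is rejected with an exception (a TypeError in the versions I have seen, though the exact type and message may vary); the flag name is hypothetical.

```python
import pandas as pd

df = pd.DataFrame([255, 256, 257, 258, 259])

# NumPy dtype: the cast goes through without complaint, even though
# values above 255 do not fit in 8 unsigned bits.
wrapped = df.astype("uint8")
print(wrapped[0].tolist())

# Nullable extension dtype: the lossy cast is expected to be refused.
try:
    df.astype("UInt8")
    nullable_cast_rejected = False
except (TypeError, ValueError) as exc:
    nullable_cast_rejected = True
    print(f"UInt8 cast rejected: {exc}")
```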
Comment From: natmokval
take
Comment From: mcp292
take
Thank you, I've been too busy to address this.
Comment From: vijaybirju
Is this issue fixed?
Comment From: MarcoGorelli
hey - there's already a PR open, could you find another issue to work on please?
Comment From: vijaybirju
Ok, thanks. So for one issue, only one person can work on a pull request at a time?