Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from io import StringIO

import pandas as pd

# One whitespace-separated row of weather data
bug = """13-04-2023 16.2 16.7 16.8 3.9 4.5 6:25"""

dtypes = {
    'Datum': "string",
    'Temp_max': 'float32',
    'Temp_min': 'float32',
    'Temp_gem': 'float32',
    'Neerslag': 'float32',
    'Wind': 'float32',
    'Zonneschijn': "string",
}

data = pd.read_csv(
    StringIO(bug),
    names=dtypes.keys(),
    decimal=".",
    delimiter=r"\s+",
    dtype=dtypes,
)

# display() is available in Jupyter/IPython; use print(data) elsewhere
display(data)
Returns:
        Datum   Temp_max   Temp_min   Temp_gem  Neerslag  Wind  Zonneschijn
0  13-04-2023  16.200001  16.700001  16.799999       3.9   4.5         6:25
Issue Description
Why is 16.2 converted to 16.200001, or 16.7 to 16.700001?
The problem occurs with "float32" and "Float32"; "float64" and "Float64" are fine.
It occurs whether:
- we pass the dtype via the dtype argument of pd.read_csv
- or convert afterwards with pd.DataFrame.astype
Expected Behavior
No additional decimals should be added beyond trailing zeros.
I noticed the same problem for other numbers in the range 16 to 17.
Installed Versions
Comment From: atharvaagate
take
Comment From: ljmc-github
@MCRE-BE this looks like a generic floating-point arithmetic problem.
See https://0.30000000000000004.com/
Comment From: MCRE-BE
Hmmm indeed, never noticed it before. I don't know if the maintainers want to or can fix it, but it's confusing anyway. 🫣
Comment From: rhshadrach
Agreed @ljmc-github - for numbers in [16, 32), float32 has a precision of 2^-19 (about 0.0000019), which is greater than 0.000001, so the rounding error shows up in the sixth decimal place. If you want more precision you need to use float64.
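
To make the precision argument above concrete, here is a minimal sketch that uses only the Python standard library to imitate float32 storage. The helper names `to_float32` and `float32_spacing` are made up for this illustration and are not part of pandas or NumPy. It assumes the values are in the normal float32 range, and that the six-decimal formatting roughly matches what the displayed DataFrame shows.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32, returned as a float64."""
    # Packing with format "f" stores the value as an IEEE 754 32-bit float.
    return struct.unpack("f", struct.pack("f", x))[0]


def float32_spacing(x: float) -> float:
    """Gap between adjacent float32 values near x (hypothetical helper, normal range only)."""
    _, exponent = math.frexp(x)      # x = m * 2**exponent with 0.5 <= |m| < 1
    return 2.0 ** (exponent - 24)    # float32 carries a 24-bit significand


for value in (16.2, 16.7, 16.8, 3.9, 4.5):
    stored = to_float32(value)
    # Show the stored value, a six-decimal rendering, and the local spacing.
    print(f"{value} -> {stored!r} -> {stored:.6f} (spacing {float32_spacing(value)!r})")
```

For 16.2 the spacing is 2^-19, so the nearest float32 is 16.200000762939453, which rounds to 16.200001 at six decimals. For 3.9 the spacing is 2^-22, so the error is too small to show at six decimals, and 4.5 is exactly representable. If NumPy is available, `np.spacing(np.float32(16.2))` should report the same gap.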