Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas
df = pandas.DataFrame(data=[[1234567890123456789]], columns=['test'])
df.style.to_html()
df.style.to_latex()

Issue Description

If the DataFrame contains integers too large to be represented exactly in double-precision floating point, the default Styler (DataFrame.style) will display the wrong value:

html: <style type="text/css">\n</style>\n<table id="T_2d9c5">\n <thead>\n <tr>\n <th class="blank level0" >&nbsp;</th>\n <th id="T_2d9c5_level0_col0" class="col_heading level0 col0" >test</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th id="T_2d9c5_level0_row0" class="row_heading level0 row0" >0</th>\n <td id="T_2d9c5_row0_col0" class="data row0 col0" >1234567890123456768</td>\n </tr>\n </tbody>\n</table>\n

latex: \\begin{tabular}{lr}\n & test \\\\\n0 & 1234567890123456768 \\\\\n\\end{tabular}\n

In both cases, the displayed value is 1234567890123456768 instead of 1234567890123456789. I suspect that there is an internal lossy conversion to floating point.

It does not seem to matter whether the dtype is int64, uint64, Int64, UInt64, or object.
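
The suspected conversion can be checked in plain Python, without pandas. This is a minimal sketch, assuming the value passes through a float format spec such as `.0f` somewhere along the way; the variable names are illustrative only.

```python
x = 1234567890123456789  # value from the reproducible example

# A double has a 53-bit significand, so not every integer above 2**53
# is representable; float(x) rounds to the nearest representable value.
as_float = float(x)
round_trip = int(as_float)

# The "f" presentation type formats a float, so an int is converted first.
lossy = f"{x:.0f}"

# str() and the "d" presentation type operate on the int directly.
exact_str = str(x)
exact_d = f"{x:d}"

print(round_trip)  # 1234567890123456768
print(lossy)       # 1234567890123456768
print(exact_str)   # 1234567890123456789
print(exact_d)     # 1234567890123456789
```

The rounded value matches the number shown in the HTML output above, which is consistent with a float conversion rather than, say, truncation of digits.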

Expected Behavior

The integers should be displayed correctly. In pandas 1.1.4, this does work as expected:

<style type="text/css" >\n</style><table id="T_100b2b29_ce31_11ed_9d71_00505692da1d" ><thead> <tr> <th class="blank level0" ></th> <th class="col_heading level0 col0" >test</th> </tr></thead><tbody>\n <tr>\n <th id="T_100b2b29_ce31_11ed_9d71_00505692da1dlevel0_row0" class="row_heading level0 row0" >0</th>\n <td id="T_100b2b29_ce31_11ed_9d71_00505692da1drow0_col0" class="data row0 col0" >1234567890123456789</td>\n </tr>\n </tbody></table>

Installed Versions

tested versions (broken): pandas 1.5.3, pandas 2.1.0.dev0+337.g8c7b8a4f3e

tested versions (fine): pandas 1.1.4

INSTALLED VERSIONS
------------------
commit : 8dab54d6573f7186ff0c3b6364d5e4dd635ff3e7
python : 3.10.8.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-132-generic
Version : #148-Ubuntu SMP Mon Oct 17 16:02:06 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.2
numpy : 1.24.1
pytz : 2022.7.1
dateutil : 2.8.2
setuptools : 66.0.0
pip : 22.3.1
Cython : None
pytest : 7.2.1
hypothesis : 6.62.1
sphinx : 6.1.3
blosc : None
feather : None
xlsxwriter : 3.0.7
lxml.etree : 4.9.2
html5lib : 1.1
pymysql : None
psycopg2 : 2.9.3
jinja2 : 3.1.2
IPython : 8.8.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli :
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.6.2
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.10.0
snappy : None
sqlalchemy : 1.4.46
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : 2022.7

Comment From: hxy450

take

Comment From: ZackOudeif

Take

Comment From: attack68

The suspicion of lossy conversion to float is correct.

The current _default_formatter of pandas\io\formats\style_render.py (approx line 1720) has the following:

    elif is_integer(x):
        return f"{x:,.0f}" if thousands else f"{x:.0f}"

So the work is done by a Python f-string, and the .0f format spec formats the integer as a float. A fix would be to change this to:

    elif is_integer(x):
        return f"{x:,.0f}" if thousands else str(x)

The str function does not lose precision.
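
As a rough illustration of what this change does and does not fix, here is a standalone sketch of just the integer branch. `format_integer_before` and `format_integer_after` are hypothetical names, not pandas functions, and they take a plain Python int rather than going through pandas' `is_integer` check.

```python
def format_integer_before(x: int, thousands: bool = False) -> str:
    # Mirrors the current branch: both paths use a float format spec.
    return f"{x:,.0f}" if thousands else f"{x:.0f}"


def format_integer_after(x: int, thousands: bool = False) -> str:
    # Mirrors the proposed branch: str(x) when no separator is requested.
    return f"{x:,.0f}" if thousands else str(x)


x = 1234567890123456789

print(format_integer_before(x))                 # 1234567890123456768
print(format_integer_after(x))                  # 1234567890123456789
# The thousands path still goes through a float and stays lossy.
print(format_integer_after(x, thousands=True))  # 1,234,567,890,123,456,768
```

With this change alone, output without a thousands separator is exact, while output with a separator still shows the rounded value.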

Comment From: attack68

(And an even better fix would be to correct the thousands-separator case so that it does not lose precision either, but this is much more complicated, and a separate PR for that would be recommended.)

Comment From: asishm-wk

> (And an even better fix would be to correct the thousands-separator case so that it does not lose precision either, but this is much more complicated, and a separate PR for that would be recommended.)

In [18]: a = 1234567890123456789

In [19]: f"{a:,}"
Out[19]: '1,234,567,890,123,456,789'

Wouldn't this work?

Comment From: attack68

Apparently so.

:-) Just testing.
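
The `,` format option used above can be wrapped so that both paths stay exact. This is a minimal sketch under stated assumptions: `format_integer_exact` is a hypothetical helper, not part of pandas, and the custom separator is applied by a simple string replacement, which may differ from how pandas handles it.

```python
from typing import Optional


def format_integer_exact(x: int, thousands: Optional[str] = None) -> str:
    """Format an int without going through float (hypothetical helper)."""
    if thousands is None:
        return str(x)
    # "," with no "f" type keeps integer formatting, so no precision is lost.
    return f"{x:,}".replace(",", thousands)


a = 1234567890123456789

print(format_integer_exact(a))                 # 1234567890123456789
print(format_integer_exact(a, thousands=","))  # 1,234,567,890,123,456,789
print(format_integer_exact(a, thousands="."))  # 1.234.567.890.123.456.789
```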

Comment From: lusolorz

take