Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

def main():

    group = pd.DataFrame({'code': [ 'group_value' ]})

    test = pd.read_csv('./test_csv_2.csv')

    # If this line is used instead of read_csv, no issues
    # test = pd.DataFrame({'run_number': [20], 'group': ['group_code'], 'test': ['NaN'], 'test1': ['NaN'], 'test2': ['NaN'], 'test3': ['NaN'], 'test4': ['NaN'], 'test5': ['NaN']})

    newDf = pd.DataFrame().reindex_like(test)

    newDf['group'] = group['code']

    newDf.columns.values[5:7] = ['curr', 'comp']

    print(newDf)

main()

Issue Description

When running the given code, I sometimes run into a bus error:

See:

Bus Error
Fatal Python error: Bus error

Current thread 0x00000001e26c8100 (most recent call first):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 5010 in __contains__
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 907 in _get_formatter
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 946 in <listcomp>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 945 in _get_formatted_column_labels
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 864 in _get_strcols_without_index
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 611 in get_strcols
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/string.py", line 31 in _get_strcols
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/string.py", line 40 in _get_string_representation
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/string.py", line 25 in to_string
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 1128 in to_string
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py", line 1192 in to_string
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py", line 1011 in __repr__
  File "/Users/dovikaplan/pandas_crash_test/Basys Files/Remaining CSV files.py", line 21 in main
  File "/Users/dovikaplan/pandas_crash_test/Basys Files/Remaining CSV files.py", line 23 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.strptime, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.tslib, pandas._libs.lib, pandas._libs.hashing, pandas._libs.ops, pandas._libs.arrays, pandas._libs.index, pandas._libs.join, pandas._libs.sparse, pandas._libs.reduction, pandas._libs.indexing, pandas._libs.internals, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.testing, pandas._libs.parsers, pandas._libs.json (total: 54)
zsh: bus error  python3.10 Remaining\ CSV\ files.py

and

Seg Fault
Fatal Python error: Segmentation fault

Current thread 0x00000001e26c8100 (most recent call first):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 5291 in __contains__
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 915 in _get_formatter
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 954 in <listcomp>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 953 in _get_formatted_column_labels
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 872 in _get_strcols_without_index
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 617 in get_strcols
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/string.py", line 36 in _get_strcols
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/string.py", line 45 in _get_string_representation
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/string.py", line 30 in to_string
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/io/formats/format.py", line 1136 in to_string
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py", line 1245 in to_string
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py", line 1064 in __repr__
  File "/Users/dovikaplan/pandas_crash_test/Basys Files/Remaining CSV files.py", line 21 in main
  File "/Users/dovikaplan/pandas_crash_test/Basys Files/Remaining CSV files.py", line 23 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.tslib, pandas._libs.lib, pandas._libs.hashing, pandas._libs.ops, pandas._libs.arrays, pandas._libs.index, pandas._libs.join, pandas._libs.sparse, pandas._libs.reduction, pandas._libs.indexing, pandas._libs.internals, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.tslibs.strptime, pandas._libs.groupby, pandas._libs.testing, pandas._libs.parsers, pandas._libs.json (total: 54)
zsh: segmentation fault  python3.10 Remaining\ CSV\ files.py

Occasionally, however, it does work, but other times I get a segfault.

The contents of the csv file I import is:

run_number,group,test,test2,test3,test4,test5
20,group_value,,,,,

If I uncomment the line that manually creates the df, instead of pulling from csv, I get no issues, so I'm thinking that where the issue comes from.

Expected Behavior

No segfault or bus errors

Installed Versions

commit : 2e218d10984e9919f0296931d92ea851c6a6faf5 python : 3.10.4.final.0 python-bits : 64 OS : Darwin OS-release : 22.3.0 Version : Darwin Kernel Version 22.3.0: Thu Jan 5 20:48:54 PST 2023; root:xnu-8792.81.2~2/RELEASE_ARM64_T6000 machine : arm64 processor : arm byteorder : little LC_ALL : None LANG : en_GB.UTF-8 LOCALE : None.UTF-8 pandas : 1.5.3 numpy : 1.23.1 pytz : 2022.1 dateutil : 2.8.2 setuptools : 58.1.0 pip : 22.3.1 Cython : None pytest : None hypothesis : None sphinx : None blosc : None feather : None xlsxwriter : None lxml.etree : None html5lib : None pymysql : None psycopg2 : None jinja2 : 3.1.2 IPython : 8.5.0 pandas_datareader: None bs4 : 4.11.1 bottleneck : None brotli : None fastparquet : None fsspec : None gcsfs : None matplotlib : 3.6.2 numba : None numexpr : None odfpy : None openpyxl : 3.0.10 pandas_gbq : None pyarrow : None pyreadstat : None pyxlsb : None s3fs : None scipy : 1.9.3 snappy : None sqlalchemy : None tables : None tabulate : 0.8.10 xarray : None xlrd : None xlwt : None zstandard : None tzdata : None

Comment From: phofl

Hi, thanks for your report. Can reproduce on 1.5.3 but not on main. Could you try on the rc for 2.0 as well?

Comment From: Droidking18

Thank you, tested on 2.0.0 rc as well as 2.1.0.dev0+119.gb3913977ee. Can reproduce on both my end

Comment From: Droidking18

Just to add in, sometimes I can run it quite a few times before it actually segfaults/bus errors, other times it errors out right away

Comment From: phofl

Running it a couple of times reproduces for me as well. In general, what you are doing with your columns isn't recommended. An Index is considered immutable, hence this can get weird, meaning updating the underlying array