### Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

### Reproducible Example

I want to use a binary blob data type (just an array of bytes) in a Series. I'm converting a Python list to `bytes` and using the pyarrow `pa.binary()` data type (the Series/DataFrame will later be handed over to pyarrow).

It seems that if the bytes string ends with zero bytes, they get chopped off after the Series is created. Maybe there is a better way of doing this.

```python
>>> import pandas as pd
>>> import pyarrow as pa
>>> import array as ar

>>> pd.ArrowDtype(pa.binary())
binary[pyarrow]

>>> ints3 = [123, 445, 0, 0]
>>> a = ar.array('l', ints3)

>>> binary = memoryview(a).tobytes()
>>> binary
b'{\x00\x00\x00\x00\x00\x00\x00\xbd\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

>>> ps = pd.Series(binary, dtype='binary[pyarrow]')
>>> ps
0    b'{\x00\x00\x00\x00\x00\x00\x00\xbd\x01'
dtype: binary[pyarrow]
```

Going directly to pyarrow seems to work as expected:

```python
>>> pa.array([binary], type=pa.binary())
<pyarrow.lib.BinaryArray object at 0x7f883c84eb20>
[
  7B00000000000000BD0100000000000000000000000000000000000000000000
]
```
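
As a side note on building the blob itself: `array('l', ...)` uses the platform's C `long`, which is 8 bytes on the Linux machine above but only 4 bytes on Windows, so the byte string can differ between systems. A minimal sketch, assuming fixed 64-bit little-endian integers are what is wanted, packs the same values with `struct` and checks that the trailing zeros survive a round trip (the variable `decoded` is only illustrative):

```python
import struct

ints3 = [123, 445, 0, 0]

# "<4q" means four little-endian signed 64-bit integers,
# so the result is 32 bytes regardless of platform.
binary = struct.pack("<4q", *ints3)

# Decode again to confirm the trailing zero values are still present.
decoded = list(struct.unpack("<4q", binary))

print(len(binary))
print(binary.hex().upper())
print(decoded)
```

The hex string printed here should match the one pyarrow shows above.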

### Issue Description

Data lost due to trimming.

### Expected Behavior

No trimming, no data loss.

### Installed Versions

<details>

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : 2e218d10984e9919f0296931d92ea851c6a6faf5
python           : 3.8.10.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.4.0-136-generic
Version          : #153-Ubuntu SMP Thu Nov 24 15:56:58 UTC 2022
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.5.3
numpy            : 1.24.2
pytz             : 2022.7.1
dateutil         : 2.8.2
setuptools       : 44.0.0
pip              : 20.0.2
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : None
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : None
brotli           : None
fastparquet      : None
fsspec           : None
gcsfs            : None
matplotlib       : None
numba            : None
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : 11.0.0
pyreadstat       : None
pyxlsb           : None
s3fs             : None
scipy            : None
snappy           : None
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
zstandard        : None
tzdata           : None

</details>


**Comment From: mroeschke**

Thanks for the report, but this is just the `repr` being truncated. The data is still there:

```python
In [1]: import pandas as pd
   ...: import pyarrow as pa
   ...: import array as ar

In [2]: pd.ArrowDtype(pa.binary())
Out[2]: binary[pyarrow]

In [3]: ints3 = [123, 445, 0, 0]

In [4]: a = ar.array('l', ints3)

In [5]: binary = memoryview(a).tobytes()

In [6]: ps = pd.Series(binary, dtype='binary[pyarrow]')

In [7]: pd.options.display.max_seq_items = None

In [8]: pd.options.display.width = 999

In [9]: ps
Out[9]:
0    b'{\x00\x00\x00\x00\x00\x00\x00\xbd\x01\x00\x0...
dtype: binary[pyarrow]

In [10]: ps.array._data
Out[10]:
[
  [
    7B00000000000000BD0100000000000000000000000000000000000000000000
  ]
]
```

Closing as a repr usage question