Installation check

Platform

Linux-5.10.0-11-amd64-x86_64-with-glibc2.31

Installation Method

pip install

pandas Version

1.4.0

Python Version

3.9

Installation Logs

$ pip install --upgrade pandas
Requirement already satisfied: pandas in /home/work/.local/lib/python3.9/site-packages (1.3.5)
Collecting pandas
  Downloading pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.7 MB)
     |████████████████████████████████| 11.7 MB 8.5 MB/s 
Requirement already satisfied: python-dateutil>=2.8.1 in /usr/lib/python3/dist-packages (from pandas) (2.8.1)
Requirement already satisfied: pytz>=2020.1 in /usr/lib/python3/dist-packages (from pandas) (2021.1)
Requirement already satisfied: numpy>=1.18.5 in /usr/lib/python3/dist-packages (from pandas) (1.19.5)
Installing collected packages: pandas
  Attempting uninstall: pandas
    Found existing installation: pandas 1.3.5
    Uninstalling pandas-1.3.5:
      Successfully uninstalled pandas-1.3.5
Successfully installed pandas-1.4.0
$ ipython3 
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.20.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import pandas as pd
/home/work/.local/lib/python3.9/site-packages/pandas/compat/_optional.py:149: UserWarning: Pandas requires version '1.3.1' or newer of 'bottleneck' (version '1.2.1' currently installed).
  warnings.warn(msg, UserWarning)

Description

pandas 1.4 requires bottleneck >= 1.3.1, but this requirement is not declared in the package metadata, so pip neither upgrades bottleneck nor reports a conflict. As a result, importing pandas emits a UserWarning, which causes many CI pipelines to fail in my projects that use pandas.

I believe that the dependency should be reflected in the setup scripts as well.
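As a stopgap on the CI side, one option might be to filter this specific warning before pandas is imported. The sketch below assumes the pipeline turns warnings into errors (which may not match every setup) and uses a stand-in function instead of a real pandas import so that it runs anywhere. The names BOTTLENECK_WARNING, ignore_bottleneck_warning, fake_pandas_import and survives_strict_mode are hypothetical; the message text is copied from the log above.

```python
import warnings

# Regex matched against the start of the warning message (taken from the log above).
BOTTLENECK_WARNING = r"Pandas requires version '.*' or newer of 'bottleneck'"


def ignore_bottleneck_warning():
    """Install an 'ignore' filter for the bottleneck version warning only."""
    warnings.filterwarnings(
        "ignore", message=BOTTLENECK_WARNING, category=UserWarning
    )


def fake_pandas_import():
    """Stand-in for `import pandas`: emits the same message as in the log."""
    warnings.warn(
        "Pandas requires version '1.3.1' or newer of 'bottleneck' "
        "(version '1.2.1' currently installed).",
        UserWarning,
    )


def survives_strict_mode():
    """Return True if the stand-in import passes when warnings are errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")   # assumed CI setting: warnings are errors
        ignore_bottleneck_warning()      # added later, so it is checked first
        try:
            fake_pandas_import()
        except UserWarning:
            return False
        return True


print(survives_strict_mode())
```

In a real project the filter would have to be installed before the first `import pandas`. Upgrading bottleneck to the required version avoids the warning entirely and is probably the cleaner fix.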

Comment From: roib20

bottleneck is not the only optional dependency that pandas has. The installation guide you linked shows that it is recommended but still optional: in the Dependencies section, bottleneck is listed under Recommended dependencies, not among the three dependencies that are strictly required (NumPy, python-dateutil and pytz). Most pandas functionality can be used with only those three libraries. I believe the aim of pandas is to install with a minimal set of libraries and let users install additional ones if they want the extra functionality.

Comment From: mroeschke

@roib20 is correct. bottleneck is an optional dependency, so the user is responsible for installing the minimum suggested version of the optional libraries.
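For anyone who wants to check this themselves, a minimal sketch of such a check follows, assuming only the standard library. The helpers parse_version, meets_minimum and check_optional are hypothetical names, and the version parsing is deliberately naive (numeric components only), so it is not a substitute for a full version parser.

```python
from importlib import metadata


def parse_version(text):
    """Naive parse: '1.3.1' -> (1, 3, 1); ignores suffixes such as 'rc1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to equal length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_optional(dist_name, minimum):
    """Return None if not installed, else whether the minimum is met."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return meets_minimum(installed, minimum)


# Minimum taken from the warning in the issue; the result depends on the environment.
print(check_optional("bottleneck", "1.3.1"))
```

With the versions from the log, meets_minimum("1.2.1", "1.3.1") is False, which corresponds to the situation the warning reports.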

Comment From: roib20

This issue is closed but for future reference - pandas 2.0.0 will include this enhancement: "Installing optional dependencies with pip extras".

For example: pip install "pandas[performance]" will install pandas together with three performance dependencies: numexpr, bottleneck and numba.
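To see which extras an installed distribution declares, something like the following sketch should work, assuming the extras are recorded in the standard Provides-Extra metadata field. The function name declared_extras is hypothetical, and the output depends on which pandas version, if any, is installed.

```python
from importlib import metadata


def declared_extras(dist_name):
    """Return the sorted extras a distribution declares, or [] if not installed."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # get_all returns None when the field is absent.
    return sorted(meta.get_all("Provides-Extra") or [])


print(declared_extras("pandas"))
```

On a pandas release that ships the extras, "performance" would be expected to appear in this list; on older versions the list may be empty.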