Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import numpy as np
data = {'A': [-2, -1, 0, 1, 2], 'B': [-2, 1, 0, 1, 2]}
df = pd.DataFrame(data)
df
A B
0 -2 -2
1 -1 1
2 0 0
3 1 1
4 2 2
df.clip(lower=10, upper=[1,1])
A B
0 10 10
1 10 10
2 10 10
3 10 10
4 1 1
Issue Description
This behavior is undocumented: when upper < lower and upper is not a scalar, clip does not produce consistent results. If lower is an array and upper is a scalar, every value is fixed at the scalar upper.
I'm not sure whether pandas should simply throw an error when the bounds are inverted, but some consistency would be helpful.
Expected Behavior
Either throw an error when upper < lower, or handle scalar and list-like bounds in a consistent way.
Installed Versions
Comment From: m-ganko
take
Comment From: m-ganko
OK, I tested the clip method and it is quite a mess.
>>> import pandas as pd
>>> import numpy as np
>>> data = {'A': range(10), 'B': range(10)}
>>> df = pd.DataFrame(data)
>>> df
A B
0 0 0
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
When lower < upper it works as expected:
>>> expected = df.clip(lower=3, upper=7)
>>> expected
A B
0 3 3
1 3 3
2 3 3
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 7 7
9 7 7
>>> (expected == df.clip(lower=3, upper=[7, 7])).all().all()
True
>>> (expected == df.clip(lower=[3, 3], upper=7)).all().all()
True
When we swap lower and upper, we get 3 different results.
>>> df.clip(lower=7, upper=[3, 3]) # 1
A B
0 7 7
1 7 7
2 7 7
3 7 7
4 3 3
5 3 3
6 3 3
7 3 3
8 3 3
9 3 3
>>> df.clip(lower=[7, 7], upper=3) # 2
A B
0 3 3
1 3 3
2 3 3
3 3 3
4 3 3
5 3 3
6 3 3
7 3 3
8 3 3
9 3 3
>>> df.clip(lower=7, upper=3) # 3
A B
0 3 3
1 3 3
2 3 3
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 7 7
9 7 7
- Case # 1 - I don't have an explanation for this one; the result is wrong.
- Case # 2 - this one works the same as numpy.clip: when lower > upper, all values are equal to upper.
- Case # 3 - here lower and upper are swapped, but this behavior is not documented.
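As a point of reference for case # 2, here is a minimal sketch of how numpy.clip treats inverted bounds. It uses only numpy on a plain array (no DataFrame involved) and relies on the documented equivalence between np.clip and a min/max composition; the variable names are illustrative.

```python
import numpy as np

values = np.arange(10)

# Bounds in the usual order: values are limited to the range [3, 7].
in_order = np.clip(values, 3, 7)

# Inverted bounds (a_min > a_max): every element ends up equal to a_max.
inverted = np.clip(values, 7, 3)

# The same result written as the min/max composition that np.clip is
# documented to be equivalent to.
composed = np.minimum(3, np.maximum(values, 7))

print(in_order)
print(inverted)
print(composed)
```

Because the upper bound is applied last, it always wins when the two bounds conflict, which is why case # 2 returns a frame filled with upper.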
In my opinion, there are two reasonable approaches:
1. Follow the numpy approach (# 2).
2. Raise an exception when lower > upper.
I will start coding the first one, to be consistent with numpy. If you have another idea, discussion is welcome.
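For comparison, the second approach (raising on inverted bounds) could look roughly like the sketch below. The helper name `clip_strict` is hypothetical and not part of pandas; it is a user-level wrapper that assumes the bounds are scalars or list-likes numpy can broadcast against each other, and it is not how pandas itself would implement the check.

```python
import numpy as np
import pandas as pd


def clip_strict(df, lower, upper):
    """Hypothetical helper: clip a DataFrame, rejecting inverted bounds."""
    lo = np.asarray(lower)
    hi = np.asarray(upper)
    # Broadcasting lets scalar and list-like bounds be compared uniformly.
    if np.any(lo > hi):
        raise ValueError("lower must not be greater than upper")
    return df.clip(lower=lower, upper=upper)


df = pd.DataFrame({"A": range(10), "B": range(10)})

# Bounds in the usual order pass straight through to DataFrame.clip.
ok = clip_strict(df, 3, 7)

# Inverted bounds are rejected whether they are scalar or list-like.
try:
    clip_strict(df, 7, [3, 3])
    raised = False
except ValueError:
    raised = True

print(ok)
print(raised)
```

Checking the bounds up front makes all three inverted cases fail the same way, at the cost of diverging from numpy.clip, which accepts inverted bounds silently.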