Pandas PERF: x.loc[x > 0.5] = 2 * x is 2x slower than x[x > 0.5] = 2 * x

Pandas version checks

[X] I have checked that this issue has not already been reported.
[X] I have confirmed this issue exists on the latest version of pandas.
[X] I have confirmed this issue exists on the main branch of pandas.

Reproducible Example

pandas v1.4.2, NumPy v1.22.3

import numpy as np
import pandas as pd

# Masking by a vector
v = np.random.rand(100000)
# Need to reset `pd.Series` after each run (%timeit requires IPython/Jupyter)
%timeit x = pd.Series(v, copy=True); x[x > 0.5] = 2 * x
%timeit x = pd.Series(v, copy=True); x.loc[x > 0.5] = 2 * x

1.18 ms ± 24 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
3.21 ms ± 42.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Usually, x.loc[...] is faster than x[...] when subsetting or masking by a scalar, so I wonder whether a fast path is missing for this case?
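
For anyone who wants to reproduce this outside IPython, here is a minimal sketch using the standard-library timeit module instead of the %timeit magic. The helper names assign_getitem and assign_loc are illustrative only, and the absolute numbers will vary with the machine and the pandas version, so treat the output as a rough comparison rather than a definitive benchmark.

```python
import timeit

import numpy as np
import pandas as pd

v = np.random.rand(100_000)


def assign_getitem():
    # Fresh copy on every call, because the assignment mutates the Series.
    x = pd.Series(v, copy=True)
    x[x > 0.5] = 2 * x
    return x


def assign_loc():
    x = pd.Series(v, copy=True)
    x.loc[x > 0.5] = 2 * x
    return x


# Both spellings are expected to produce the same values; only timing differs.
pd.testing.assert_series_equal(assign_getitem(), assign_loc())

n = 20
# Take the best of a few repeats to reduce noise from other processes.
t_getitem = min(timeit.repeat(assign_getitem, number=n, repeat=3)) / n
t_loc = min(timeit.repeat(assign_loc, number=n, repeat=3)) / n

print(f"x[mask] = ...     : {t_getitem * 1e3:.2f} ms per call")
print(f"x.loc[mask] = ... : {t_loc * 1e3:.2f} ms per call")
```

Note that each timed call also includes the cost of constructing the copied Series, just as in the %timeit lines above, so the gap between the two assignment paths is slightly understated in relative terms.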

Installed Versions

Replace this line with the output of pd.show_versions()

Prior Performance

No response

Comment From: MarcoGorelli

Thanks for the report - seems like quite a bit of time is spent in

https://github.com/pandas-dev/pandas/blob/ddf2541df866e89150210d41c22e45eb2cf83e91/pandas/core/indexes/range.py#L399-L427

which can probably be improved
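
One way to check where the time goes on a given install is to profile the .loc assignment with cProfile. This is only a sketch: the helper name assign_loc is illustrative, and which functions appear near the top of the report depends on the pandas version being profiled.

```python
import cProfile
import io
import pstats

import numpy as np
import pandas as pd

v = np.random.rand(100_000)


def assign_loc():
    # Fresh copy on every call, because the assignment mutates the Series.
    x = pd.Series(v, copy=True)
    x.loc[x > 0.5] = 2 * x
    return x


profiler = cProfile.Profile()
profiler.enable()
for _ in range(20):
    assign_loc()
profiler.disable()

# Collect the report in a string so it can be printed or inspected.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("cumulative").print_stats(15)  # top 15 entries by cumulative time
report = buf.getvalue()
print(report)
```

Sorting by cumulative time makes it easier to follow the call chain down from the .loc setter into the index machinery.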

Comment From: MarcoGorelli

Had a look at this, and it's hard to get a consistent improvement without regressing some other benchmarks.

Do you have a use case where this gap is noticeable in practice? If not, I'd be tempted to close; I'm not sure it's worth optimising something that already takes only a few milliseconds.

Comment From: phofl

This is consistent on main, so closing