Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [x] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import numpy as np
import pandas as pd
df = pd.DataFrame({1: [1, 2, 3], 2: [np.zeros(3), np.ones(3), np.zeros(0)]})
print(df.dtypes)
df.loc[1, 2] = np.zeros(3)
Issue Description
The loc assignment raises "ValueError: Must have equal len keys and value when setting with an iterable" even though column 2 has dtype object. This started when I updated pandas from 1.4.4 to 2.2.1; in 1.4.4 this syntax worked. The same error occurs when assigning multiple columns at once as soon as one of the new values is an iterable.
Expected Behavior
The assignment should set the cell in the second row of column 2 to an np.array of zeros.
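To compare the expected and actual behavior on a given installation, the reproducible example can be wrapped so that it reports the outcome instead of stopping at the exception. This is only a sketch: try_loc_assign is a hypothetical helper name, not a pandas API, and the result depends on the installed pandas version.

```python
import numpy as np
import pandas as pd


def try_loc_assign(df, row, col, value):
    """Attempt df.loc[row, col] = value.

    Returns None on success, or the ValueError message on failure.
    (Hypothetical helper for illustration, not part of pandas.)
    """
    try:
        df.loc[row, col] = value
    except ValueError as exc:
        return str(exc)
    return None


df = pd.DataFrame({1: [1, 2, 3], 2: [np.zeros(3), np.ones(3), np.zeros(0)]})
error = try_loc_assign(df, 1, 2, np.zeros(3))

print("pandas", pd.__version__)
if error is None:
    print("assignment succeeded:", df.loc[1, 2])
else:
    print("assignment failed:", error)
```

Running this on both the old and the new environment makes it easy to attach the exact version and message to a report.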
Installed Versions
Comment From: dimitsev
@isabelladegen The exact same problem appears on the main branch of pandas 2.2.2 if you do df.loc[1, 2] = list(np.zeros(3)). Please update your title to say that the problem happens with any iterable, not just a numpy array. Please also tick the checkbox that "this bug exists on the main branch of pandas".
Comment From: onejgordon
Just ran into this today with 2.2.2. Is anyone aware of a workaround?
Comment From: isabelladegen
@onejgordon This is how I worked around it. Warning: it is ugly. For the above toy example, instead of indexing the row and column(s) to update the cell like this:
df.loc[1, 2] = np.zeros(3)
I update the whole row like this:
df.loc[1] = {1:2, 2:np.zeros(3)}
Comment From: onejgordon
Thanks for your response @isabelladegen.
It appears one of the factors influencing my case was a non-unique index. Dropping duplicates from the index column resolved the error and I was able to set both arrays and individual cells with:
df.loc[3, 'arr'] = np.array([1, 2])
and
df.loc[[2, 4, 6], 'arr'] = np.array([1, 2])
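A rough sketch of the index check described above, assuming a hypothetical frame with an 'arr' column and one duplicated label. Keeping the first row per label is just one possible policy; whether the subsequent loc assignment then succeeds may still depend on the pandas version, so it is not shown here.

```python
import pandas as pd

# Hypothetical frame: the index label 3 appears twice.
df = pd.DataFrame({"arr": [None, None, None, None]}, index=[2, 3, 3, 4])

# Check whether the index has duplicate labels before assigning.
print("unique index:", df.index.is_unique)

# Keep only the first row for each index label (an assumed policy;
# keep="last" or an explicit reset_index() are alternatives).
deduped = df[~df.index.duplicated(keep="first")]
print("unique after dedupe:", deduped.index.is_unique)
```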
Comment From: saeub
Using df.at instead of df.loc seems to do the job if you only want to set a single value:
df.at[1, 2] = np.zeros(3)
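For reference, here is the same workaround as a self-contained snippet on the toy frame from the report, with a check of what ends up in the cell. Others in this thread report that .at did not help in their setups, so treat this as a sketch for the unique-index, object-dtype case only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({1: [1, 2, 3], 2: [np.zeros(3), np.ones(3), np.zeros(0)]})

# .at addresses exactly one cell, so the whole array is meant to be
# stored as a single object rather than spread over several cells.
df.at[1, 2] = np.zeros(3)

stored = df.at[1, 2]
print(type(stored), stored)
```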
Comment From: AlastairKelly
I'm having the same problem with df.at, unfortunately.
df.at[row.Index,"pmids_old_before"] = results.get('idlist')
is generating the same error as
df.loc[row.Index,"pmids_old_before"] = results.get('idlist')
Oddly, this was working fine for me until today, and I haven't changed anything in my coding environment, so I'm very puzzled about the break. Also, it still works on two dataframes before suddenly generating the error on the third that uses this function. I can't figure out any salient differences. The function creates and initializes this column with None values before trying to make this assignment, so it should be identical conditions for each dataframe.
Comment From: saeub
@AlastairKelly does your column pmids_old_before have dtype=object?
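One way to check this, sketched under assumptions: the frame below is hypothetical, the column is initialised with NaN (which gives it a float dtype), and the ID strings are placeholders. The idea is to cast the column to object before storing a list in a single cell.

```python
import numpy as np
import pandas as pd

# Hypothetical column initialised with NaN, which gives a float dtype.
df = pd.DataFrame({"pmids_old_before": [np.nan, np.nan]})
print("dtype before:", df["pmids_old_before"].dtype)

# Cast to object so that a cell can hold an arbitrary Python object.
if df["pmids_old_before"].dtype != object:
    df["pmids_old_before"] = df["pmids_old_before"].astype(object)
print("dtype after:", df["pmids_old_before"].dtype)

# Placeholder list standing in for results.get('idlist').
df.at[0, "pmids_old_before"] = ["111", "222"]
print(df.at[0, "pmids_old_before"])
```

If the dtype is already object and the error persists, the cause is likely elsewhere, for example a non-unique index as mentioned earlier in the thread.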
Comment From: JakeHightower
Using df.at instead of df.loc seems to do the job if you only want to set a single value:
df.at[1, 2] = np.zeros(3)
@saeub's method worked for me; I had the identical issue described here. Thanks.
Comment From: tcourat
Same issue in pandas 2.2.2, and nothing works (neither .at nor wrapping the value in a list).