Assigning a value to a single location in a DataFrame (using .loc with scalar indexers) started to fail for values that have a length, such as tuples.
Consider the following example:
In [1]: df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (1, 2, 3), (3, 4)]})
In [2]: df
Out[2]:
a b
0 1 (1, 2)
1 2 (1, 2, 3)
2 3 (3, 4)
In [3]: df.loc[0, 'b'] = (7, 8, 9)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-a2d59e11519a> in <module>
----> 1 df.loc[0, 'b'] = (7, 8, 9)
~/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
187 key = com._apply_if_callable(key, self.obj)
188 indexer = self._get_setitem_indexer(key)
--> 189 self._setitem_with_indexer(indexer, value)
190
191 def _validate_key(self, key, axis):
~/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
604
605 if len(labels) != len(value):
--> 606 raise ValueError('Must have equal len keys and value '
607 'when setting with an iterable')
608
ValueError: Must have equal len keys and value when setting with an iterable
This raises on 0.23.4 - master, but worked in 0.20.3 - 0.22.0 (the ones I tested):
In [1]: pd.__version__
Out[1]: '0.22.0'
In [2]: df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (1, 2, 3), (3, 4)]})
In [3]: df
Out[3]:
a b
0 1 (1, 2)
1 2 (1, 2, 3)
2 3 (3, 4)
In [4]: df.loc[0, 'b'] = (7, 8, 9)
In [5]: df
Out[5]:
a b
0 1 (7, 8, 9)
1 2 (1, 2, 3)
2 3 (3, 4)
Related to https://github.com/pandas-dev/pandas/issues/25806
We don't have very robust support in general for list-like values, but for the specific case above of updating a single value, I don't think there is anything ambiguous about it. You are updating a single value, so the passed value should simply be put in that place.
Note that the above uses tuples, but there are also custom objects, such as MultiPolygons, that represent a single object yet define a __len__.
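A minimal sketch of why such objects trip the check shown in the traceback. This is a simplified model in plain Python, not the pandas source: `FakeMultiGeometry` and `naive_set` are hypothetical names made up for illustration, and the only thing borrowed from pandas is the `len(labels) != len(value)` comparison and its error message.

```python
class FakeMultiGeometry:
    """Hypothetical stand-in for a single object that also defines __len__."""

    def __init__(self, parts):
        self.parts = list(parts)

    def __len__(self):
        return len(self.parts)


def naive_set(labels, value):
    """Simplified model of the length check; NOT the actual pandas code."""
    # Anything with a __len__ is treated as a collection of values here,
    # even when it is conceptually one object.
    if hasattr(value, "__len__") and len(labels) != len(value):
        raise ValueError(
            "Must have equal len keys and value when setting with an iterable"
        )
    return value


# One target label, one object, but the object reports a length of 3.
try:
    naive_set(["b"], FakeMultiGeometry(["p1", "p2", "p3"]))
    raised = False
except ValueError:
    raised = True
```

With a single target label, any value whose length is not 1 fails the comparison, even though the caller meant it as one value.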
Comment From: jorisvandenbossche
There are several potential issues with assigning list-like values, see e.g. the issues linked here: https://github.com/pandas-dev/pandas/issues/19590#issuecomment-364079529
However, most of those cases occur when assigning to multiple elements at once (should the list be unpacked?), while here the arguments to loc are both scalars.
Probably caused by https://github.com/pandas-dev/pandas/pull/20732
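To make the scalar versus multi-element distinction concrete, here is a minimal sketch in plain pandas. It reuses the frame from the original example. Using `.at` for the scalar write is an assumption of this sketch, chosen because it is the scalar accessor; it is not a fix for the `.loc` behaviour discussed here, and results may differ across pandas versions or with extension dtypes.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [(1, 2), (1, 2, 3), (3, 4)]})

# Multi-element assignment: two rows are targeted, so the list is
# unpacked and each row receives one element.
df.loc[[0, 1], "a"] = [10, 20]

# Scalar assignment: exactly one cell is targeted, so the tuple is
# stored whole in that cell (object-dtype column assumed).
df.at[0, "b"] = (7, 8, 9)
```

In the first case the length of the value has to match the number of targeted rows; in the second there is only one target, so there is nothing to unpack.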
Comment From: elfmanryan
I am getting the same issue, as the MultiPolygon is seen as equivalent to a list of polygons. My workaround (to assign a MultiPolygon to a single row) is to wrap it in a list first. I am running a try/except to catch the error, something like this:
from shapely.geometry import MultiPolygon

# multipolygon is an existing shapely MultiPolygon instance
try:
    geopandas_dataframe_one_row['geometry'] = multipolygon
except ValueError:
    # wrap the object in a list so it is treated as a single value
    geopandas_dataframe_one_row['geometry'] = [multipolygon]
Hope that's helpful to someone.
Comment From: jklatt
Hi! Any solution in sight? Also encountering the issue when wanting to assign MultiPolygons and the list work-around as well as ".values" work-around do not seem to help... Thankful for any hint!
Comment From: koshy1123
@jklatt I was able to resolve this by round-tripping the shapely geometry through geopandas:
gdf.loc[scalar_index_loc, 'geometry'] = geopandas.GeoDataFrame(geometry=[shapely_geo]).geometry.values
This seems to avoid the ValueError
Comment From: jklatt
Works like a charm! Thank you Thomas :)
Comment From: elizabethswkim
Thanks, @koshy1123 ! Your roundtripping shapely geo via geopandas worked for me!
Comment From: bramson
I've tried various versions of the workaround for this problem, but I still can't get anything to work.
What should work:
someData.at[index,'geometry'] = thisGeom
Some workarounds attempted:
someData.loc[index, 'geometry'] = geopandas.GeoDataFrame(geometry=[thisGeom]).geometry.values
someData.loc[index, 'geometry'] = geopandas.GeoDataFrame(geometry=[thisGeom]).geometry.values[0]
someData.loc[index, 'geometry'] = [geopandas.GeoDataFrame(geometry=[thisGeom]).geometry.values[0]]
someData.loc[[index],'geometry'] = geopandas.GeoSeries([thisGeom]).values
All give
ValueError: Must have equal len keys and value when setting with an iterable
or
ValueError: Must have equal len keys and value when setting with an ndarray
Is there any updated solution or working workaround?
Comment From: bennlich
@bramson I think the one that worked for me is not in your list:
someData.loc[[index], 'geometry'] = geopandas.GeoDataFrame(geometry=[thisGeom]).geometry.values
:disappointed:
Comment From: hadim
I am having the same issue as well. So far the only solution for me was to build a complex .apply(lambda x: xxx) based hack as a workaround.
Comment From: oboklob
I found that if the geopandas DataFrame contains only a geometry column, the problem goes away. As soon as an additional column has values, the above issue arises.
So my solution was to split the data into two DataFrames while populating geometry: one with only the geometry and a second with the additional data (using the same index). Then I combine the two DataFrames afterwards. Not ideal, but a workaround if you are struggling.
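A rough sketch of that split-and-recombine pattern, using plain pandas with tuples standing in for geometry objects. The frame and column names are made up for illustration, `.at` is used for the per-cell write as an assumption of this sketch, and it has not been checked against geopandas, so treat it as an outline of the idea only.

```python
import pandas as pd

# Attribute columns and the geometry column live in separate frames
# that share the same index. Tuples stand in for geometry objects.
attrs = pd.DataFrame({"a": [1, 2, 3]})
geom = pd.DataFrame(
    {"geometry": [(1, 2), (1, 2, 3), (3, 4)]}, index=attrs.index
)

# Populate the geometry-only frame one cell at a time.
geom.at[0, "geometry"] = (7, 8, 9)

# Recombine on the shared index once all values are in place.
combined = attrs.join(geom)
```

Because both frames use the same index, the join lines the rows up without any extra key column.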