Code Sample, a copy-pastable example if possible
Both of the examples below fail with the same error:
import pandas as pd
df = pd.DataFrame(index=[0, 1, 2], columns=['a', 'b'])
df.loc[0, 'a'] = dict(x=2)
df.iloc[0, 0] = dict(x=2)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-282-62f3ee5ff885> in <module>()
1 # file_map.loc[file_no, 'Q_step_length'] = dict(a=1)
2 df = pd.DataFrame(index=[0, 1, 2], columns=['a', 'b'])
----> 3 df.iloc[0, 0] = dict(x=2)
4 df['a'] = df['a'].apply(lambda x: x[0] if not pd.isnull(x) else x)
5 df
...\lib\site-packages\pandas\core\indexing.py in __setitem__(self, key, value)
177 key = com._apply_if_callable(key, self.obj)
178 indexer = self._get_setitem_indexer(key)
--> 179 self._setitem_with_indexer(indexer, value)
180
181 def _has_valid_type(self, k, axis):
...\lib\site-packages\pandas\core\indexing.py in _setitem_with_indexer(self, indexer, value)
603
604 if isinstance(value, (ABCSeries, dict)):
--> 605 value = self._align_series(indexer, Series(value))
606
607 elif isinstance(value, ABCDataFrame):
...\lib\site-packages\pandas\core\indexing.py in _align_series(self, indexer, ser, multiindex_indexer)
743 return ser.reindex(ax)._values
744
--> 745 raise ValueError('Incompatible indexer with Series')
746
747 def _align_frame(self, indexer, df):
ValueError: Incompatible indexer with Series
This works, but it places a list into the dataframe:
df.loc[0, 'a'] = [dict(x=2)]
It is possible to get the dict directly in the dataframe by using a very inelegant construct like this:
df['a'] = df['a'].apply(lambda x: x[0] if not pd.isnull(x) else x)
Problem description
Since it is possible to store a dict in a dataframe, an assignment like the ones above should not fail. I am aware that df.loc[...] = dict(...) will assign the values in the dict to the corresponding columns if present (is that documented?) and has its own issues, but this behaviour should not apply when accessing a single location of the dataframe.
Expected Output
A dataframe with a dict inside the specified location.
Output of pd.show_versions()
Comment From: jreback
This is pretty non-idiomatic, and you are pretty much on your own here. You could do it by just wrapping the dict in a list/tuple:
In [14]: df.loc[0, 'a'] = [dict(x=2)]
In [15]: df
Out[15]:
            a    b
0  [{'x': 2}]  NaN
1         NaN  NaN
2         NaN  NaN
Comment From: aaclayton
Encountered the same issue, had two thoughts:
1. Storing a dict within a DataFrame is unusual, but there are valid cases where software may be using Pandas as a way to represent and manipulate arbitrary key/value style data where the data is indexed in a way that makes sense for panel representation.
2. The behavior that location based indexing will update columns based on the keys/values of a provided dictionary was a surprise to me. This is a cool convenience feature that makes sense when an explicit column is not referenced. For example, when providing:
df.loc[row, :] = dict(key1=value1, key2=value2)
It makes sense that the keys of the dictionary might be written as columns and that df.loc[row, key1] == value1. However, when providing an explicit column index, inferring the target columns from a provided dictionary is (to me) counter-intuitive. If I instead supply:
df.loc[row, col] = dict(key=value)
I am explicitly denoting that I want to store the entire value in the col column, and I would expect the dictionary to be inserted as-is.
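The traceback in the original report shows the dict being wrapped with Series(value) and then aligned with ser.reindex(ax). The following is a minimal sketch of what that conversion does on its own, using only public pandas calls. The variable names are illustrative, and this does not reproduce the internal indexing code:

```python
import pandas as pd

df = pd.DataFrame(index=[0, 1, 2], columns=['a', 'b'])

# A dict converted to a Series uses its keys as the index labels.
ser = pd.Series({'b': 2, 'a': 1})
print(list(ser.index))  # ['b', 'a']

# Reindexing against the frame's columns matches keys to column labels,
# which is how a dict can be spread across a row.
aligned = ser.reindex(df.columns)
print(aligned.tolist())  # [1, 2]

# A key that is not a column label has nothing to align with.
unmatched = pd.Series({'x': 2}).reindex(df.columns)
print(unmatched.isna().all())  # True
```

This suggests why a dict aimed at a single cell is awkward for the setter: it is treated as labelled data to align, not as one opaque value.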
Anyways, I agree with @jreback that this is somewhat non-idiomatic BUT I am sympathetic to the original issue raised by @andreas-thomik. I encountered a problem where trying to store a dict to an element of a dataframe using this syntax made sense for the particular problem I was facing, so he isn't entirely on his own with this request.
Comment From: jreback
@aaclayton this is related to #18955. We could/should probably support setting dicts (and other iterables) as scalars better. It's a bit tricky though.
Comment From: varadpatil
@jreback, the behaviour is not uniform here: assigning more than one value with dictionaries works, while assigning a single value doesn't.
This works:
In [18]: df = pd.DataFrame(index=[0, 1, 2], columns=['a', 'b'])
In [19]: df
Out[19]:
     a    b
0  NaN  NaN
1  NaN  NaN
2  NaN  NaN
In [20]: df.loc[slice(None),'a']=[{'x':2}]*3
In [21]: df
Out[21]:
          a    b
0  {'x': 2}  NaN
1  {'x': 2}  NaN
2  {'x': 2}  NaN
In [22]: df.loc[0]=[{'y':3}]*2
In [23]: df
Out[23]:
          a         b
0  {'y': 3}  {'y': 3}
1  {'x': 2}       NaN
2  {'x': 2}       NaN
while trying to assign a dict to a single value doesn't work:
In [25]: df.loc[0,'a']={'z':4}
ValueError: Incompatible indexer with Series
This creates confusion. @jreback, could you please consider fixing it in an upcoming version?
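For the single-cell case that fails above, one possible workaround not raised in this thread is the scalar accessor DataFrame.at. The sketch below assumes the target column has object dtype (as it does for the empty frame used throughout the thread) and that .at stores the value as-is instead of aligning it. Behaviour may differ between pandas versions, so treat this as something to verify, not a guaranteed fix:

```python
import pandas as pd

df = pd.DataFrame(index=[0, 1, 2], columns=['a', 'b'])

# .at addresses exactly one cell, so the dict is treated as a single object.
df.at[0, 'a'] = {'z': 4}

print(df.at[0, 'a'])        # {'z': 4}
print(type(df.at[0, 'a']))  # <class 'dict'>
```

Unlike the list-wrapping approach, this leaves the dict itself in the cell, so no unwrapping step with apply is needed afterwards.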