Pandas Adding a new DataFrame row using dict() gives unexpected behaviour

Code Sample, a copy-pastable example if possible

Using pd.series works fine:

df = pd.DataFrame(np.random.randint(1, 10, (3, 3)), index=['one', 'one', 'two'], columns=['col1', 'col2', 'col3'])
new_data = pd.Series({'col1': 'new', 'col2': 'new', 'col3': 'new'})
df.iloc[0] = new_data

# resulting df looks like:

#       col1    col2    col3
#one    new     new     new
#one    9       6       1
#two    8       3       7

But if I try to add a dictionary instead, I get this:

new_data = {'col1': 'new', 'col2': 'new', 'col3': 'new'}
df.iloc[0] = new_data
#
#         col1  col2    col3
#one      col2  col3    col1
#one      2     1       7
#two      5     8       6

Problem description

Considering that a dataframe can be created using a dictionary, it seems odd that adding to a dataframe using a dictionary would result in shuffled columns when they have been explicitly labeled.

This seems like an unexpected behaviour which, although may not be a bug, doesn't seem like the optimal way of making this process function.

Expected Output

Adding a row to a dataframe using a dictionary which gives column headers results in the same thing as adding a row using pd.series

Output of `pd.show_versions()`

INSTALLED VERSIONS ------------------ commit: None python: 2.7.13.final.0 python-bits: 64 OS: Linux OS-release: 4.4.0-83-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: C LANG: en_GB.UTF-8 LOCALE: None.None pandas: 0.20.2 pytest: None pip: 9.0.1 setuptools: 33.1.1.post20170320 Cython: None numpy: 1.12.1 scipy: 0.19.0 xarray: None IPython: 5.3.0 sphinx: 1.6.1 patsy: None dateutil: 2.6.0 pytz: 2017.2 blosc: None bottleneck: None tables: None numexpr: None feather: None matplotlib: 2.0.2 openpyxl: 2.4.7 xlrd: 1.0.0 xlwt: 1.2.0 xlsxwriter: None lxml: 3.7.3 bs4: None html5lib: 0.999 sqlalchemy: 1.1.5 pymysql: None psycopg2: None jinja2: 2.9.5 s3fs: None pandas_gbq: None pandas_datareader: None

Comment From: gfyoung

@mat2py : Thanks for the issue! I tried your example using 0.20.2 on Windows, and I can't seem to reproduce your error (I just copied and pasted your code). If you could reconfirm that you get this error by copying and pasting the code you provided above?

Also, if anyone else can reproduce this (cc @jreback ), that would be great to know.

Comment From: mp-v2

I can replicate it yes. However the issue doesn't occur if you apply the dict to the df after already having applied the pd.series. You have to recreate the dataframe and only try to add the dict.

What you probably did: - create df - test with df.series (works) - test with dict (works)

What doesn't work: - create df - test with dict (doesn't work)

Comment From: mp-v2

Pandas Adding a new DataFrame row using dict() gives unexpected behaviour

Comment From: gfyoung

Not on my computer ATM, but can't argue with pictures. Something funny is going on there. PR to patch this is welcome!

Comment From: chris-b1

Thanks for the example, this is a duplicate of #16724

Pandas Adding a new DataFrame row using dict() gives unexpected behaviour

Code Sample, a copy-pastable example if possible

Problem description

Expected Output

Output of pd.show_versions()

Output of `pd.show_versions()`