Code Sample, a copy-pastable example if possible
import pandas as pd

x = pd.DataFrame([1, 2, 3])
y = x  # binds a second name to the same DataFrame; no copy is made
y["test"] = "test"
print(y)
y
   0  test
0  1  test
1  2  test
2  3  test
print(x)
x
   0  test
0  1  test
1  2  test
2  3  test
Problem description
When creating a copy of a dataframe and then adding a new column of type string to the new dataframe, the original dataframe also has the new column added to it.
Expected Output
The original dataframe remains unaltered.
Output of pd.show_versions()
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.19.0
nose: 1.3.7
pip: 9.0.1
setuptools: 25.1.3
Cython: 0.24.1
numpy: 1.11.2
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.3.3
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.0
matplotlib: 1.5.1
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.3
lxml: None
bs4: 4.5.1
html5lib: None
httplib2: 0.9.2
apiclient: 1.5.1
sqlalchemy: 1.1.2
pymysql: 0.7.4.None
psycopg2: None
jinja2: 2.8
boto: 2.43.0
pandas_datareader: None
Comment From: jreback
You aren't creating a copy here. y = x is just a reference, a second name for the same object.
You would need to do y = x.copy() to actually copy.
Another way is x.assign(test='foo'), which returns a new DataFrame.
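For anyone landing here later, the difference between the three approaches can be sketched as follows. This is a minimal illustration, not part of the original report; the names alias, y and z are made up for the example, and it assumes a reasonably recent pandas on Python 3.

```python
import pandas as pd

x = pd.DataFrame([1, 2, 3])

# Plain assignment only binds another name to the same object.
alias = x
print(alias is x)  # True: both names refer to one DataFrame

# copy() builds an independent DataFrame, so adding a column
# to the copy leaves the original alone.
y = x.copy()
y["test"] = "test"
print(list(x.columns))  # the original still has only column 0
print(list(y.columns))  # the copy has the extra "test" column

# assign() returns a new DataFrame with the extra column and
# does not modify x in place.
z = x.assign(test="foo")
print(list(x.columns))
print(list(z.columns))
```

If the goal is to keep x unaltered, either y = x.copy() followed by in-place edits, or a chain of assign() calls, should behave as the reporter expected.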