Example 1 (.values returning a view)

import pandas as pd

df = pd.DataFrame([["A", "B"],["C", "D"]])

df.values[0][0] = 0

df.values[0][1] = 5
# df
#     0   1
# 0   0   5
# 1   C   D

Example 2 (.values returning a copy) (Taken from COLDSPEED's answer in this question)

In [755]: df = pd.DataFrame({'A' :[1, 2], 'B' : ['ccc', 'ddd']})

In [756]: df
Out[756]: 
   A    B
0  1  ccc
1  2  ddd

In [757]: v = df.values

In [758]: v[0] = 123

In [759]: v[0, 1] = 'zzxxx'

In [760]: df
Out[760]: 
   A    B
0  1  ccc
1  2  ddd

Problem description

This is based on a question I asked on SO, "Will changes in DataFrame.values always modify the values in the data frame?". As the two examples above show, .values sometimes returns a view and sometimes returns a copy.
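
One way to tell which case a given frame falls into is to ask NumPy whether two successive results of .values share memory. This is a minimal sketch, not a guarantee: `values_is_view` is a hypothetical helper (not a pandas API), and what it reports reflects an implementation detail that may differ between pandas versions.

```python
import numpy as np
import pandas as pd


def values_is_view(df):
    """Heuristic: True if two calls to df.values expose the same buffer.

    A fresh copy made on every access never overlaps the previous one,
    while views of the frame's internal storage do.
    """
    return bool(np.shares_memory(df.values, df.values))


# Single dtype (all float64): .values is typically a view.
numeric = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=["a", "b"])

# Mixed dtypes (integers and strings): .values builds a new object array.
mixed = pd.DataFrame({"A": [1, 2], "B": ["ccc", "ddd"]})

print(values_is_view(numeric))
print(values_is_view(mixed))
```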

I wonder whether this is safe.

I would advocate adding an option to force returning a view or a copy. Without one, someone can process df.values and change df by accident, or expect a change to df that never happens because a copy was returned. See this question for example. People recommend modifying df.values directly, e.g. np.fill_diagonal(df.values, 0), to change the diagonal of a data frame.
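
In the meantime, one defensive pattern is to make the copy explicit and build a new frame from it, so the outcome does not depend on whether .values aliased the frame. This is a minimal sketch with made-up data, not a recommendation from the pandas developers:

```python
import numpy as np
import pandas as pd

# Made-up 3x3 frame of ones, purely for illustration.
df = pd.DataFrame(np.ones((3, 3)), columns=list("abc"))

# Explicit copy: this array never aliases df's internal storage.
arr = df.values.copy()

# Modify the copy, then wrap it in a new frame with the same labels.
np.fill_diagonal(arr, 0)
result = pd.DataFrame(arr, index=df.index, columns=df.columns)
```

Here df keeps its ones on the diagonal, and result carries the zeros.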

I think users at least need to be warned. Currently the documentation contains only one line: "Numpy representation of NDFrame".

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: 0.26.1
numpy: 1.14.0
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.0.2
lxml: 4.1.0
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Comment From: jreback

This is an implementation detail. Since you are getting a single-dtyped numpy array, the data is upcast to a compatible dtype. If you have mixed dtypes, then you will almost always get a copy (the exception, I think, is that mixed float dtypes will not copy), but this is a numpy detail.

I agree this is not great, but it has been there from the beginning and will not change in current pandas. If exporting to numpy you need to take care.
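
The upcasting to a single compatible dtype can be observed by inspecting the dtype of the returned array. This is a small sketch with made-up frames; whether a copy is made remains an implementation detail:

```python
import numpy as np
import pandas as pd

# Integer and float columns: the common dtype is float64.
ints_floats = pd.DataFrame({"i": [1, 2], "f": [0.5, 1.5]})
float_arr = ints_floats.values
print(float_arr.dtype)

# Integer and string columns: the only compatible dtype is object.
ints_strs = pd.DataFrame({"i": [1, 2], "s": ["x", "y"]})
object_arr = ints_strs.values
print(object_arr.dtype)
```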

Comment From: jreback

all that said, if you want to update the docs to make this more clear that would be fine.

Comment From: tyrinwu

Thanks for the information. I appreciate the explanation.

Maybe add a line to the documentation such as: "It could return either a view or a copy. Changes to DataFrame.values may or may not change the DataFrame itself"?