Code Sample, a copy-pastable example if possible
import pandas as pd
df = pd.DataFrame({'a': ['Hello', 'World']})
df['a'] = df['a'].astype('S')
df['b'] = df['a'].astype('S')
print(df.dtypes)
Output is:
a    object
b       |S5
dtype: object
Problem description
The new column has the correct dtype, a NumPy byte string. The old column now contains
the new values but still has dtype object.
Expected Output
Both columns should have dtype 'S'
Output of pd.show_versions()
Comment From: jreback
dupe of #12857
An easy fix, actually: these should be converted to object (never to fixed-width strings/bytes, which are not supported).
Comment From: maxnoe
So you say this is intended behaviour and not converting to object on the new assignment is the bug?
Comment From: jreback
yes, this should be object dtype. see the duplicate issue.
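As a rough illustration of the object-dtype route described above, here is a minimal sketch that encodes the strings explicitly rather than relying on astype('S'). It assumes ASCII-only data; the variable name `fixed` is just for illustration, and this is a possible workaround, not the fix discussed in the duplicate issue.

```python
import pandas as pd

df = pd.DataFrame({'a': ['Hello', 'World']})

# Encode explicitly; the column then holds Python bytes objects
# and is stored with object dtype.
df['a'] = df['a'].str.encode('ascii')

print(df['a'].dtype)               # dtype of the column
print(df['a'].map(type).unique())  # element types actually stored

# If a fixed-width NumPy array is really needed, build it outside
# the DataFrame (hypothetical variable name `fixed`).
fixed = df['a'].to_numpy().astype('S')
print(fixed.dtype)
```

The fixed-width array lives alongside the DataFrame here, so nothing depends on how pandas stores 'S' data internally.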
Comment From: maxnoe
I think you should leave it to the user. If the user wants to do astype('S') that's pretty specific. Why force using object?
Comment From: jreback
Fixed-width types for strings are not supported; they don't offer any advantages and only add complexity. This might change at some point, but this is how it is.
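For context, a small NumPy-only sketch of one way fixed-width bytes behave differently from an object array. Treating silent truncation as an example of the "complexity" mentioned above is an assumption made here for illustration; the comment itself does not name a specific reason.

```python
import numpy as np

# astype('S') picks a fixed item size from the longest element (5 bytes here).
fixed = np.array(['Hello', 'World']).astype('S')
print(fixed.dtype)

# Assigning a longer value is silently truncated to the fixed width.
fixed[0] = b'Hello, pandas'
print(fixed[0])

# An object array stores arbitrary Python bytes objects, so nothing is cut.
boxed = fixed.astype(object)
boxed[0] = b'Hello, pandas'
print(boxed[0])
```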