• [x] I have checked that this issue has not already been reported.

• [x] I have confirmed this bug exists on the latest version of pandas.

• [ ] (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import numpy as np
import pandas as pd


def test_astype_string_conversion():
    df = pd.DataFrame([{'n': 0, 's': 'qwerty'}] * 3)

    # Direct column conversion: string to fixed-length bytes works as expected.
    direct_conversion_df = df.copy()
    direct_conversion_df['s'] = direct_conversion_df['s'].astype('|S10')
    assert direct_conversion_df['s'].dtype != np.dtype('O')

    # Whereas with the types specified in a dictionary ...
    dictionary_conversion_df = df.astype({'n': 'float64', 's': '|S10'})

    # ... the float conversion works,
    assert dictionary_conversion_df['n'].dtype == np.dtype('float64')

    # ... but the strings are still objects.
    assert dictionary_conversion_df['s'].dtype != np.dtype('O')  # FAIL

Problem description

The 'dtype' parameter of pandas.DataFrame.astype accepts a dictionary that maps column names to the dtypes they should be converted to. Fixed-length string specifiers such as '|S10' appear to be ignored when they are given in that dictionary: the column keeps its object dtype. Converting the same column directly, with Series.astype('|S10'), works as expected.
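To make the mismatch easier to spot on any installed version, a small diagnostic along these lines might help. It is only a sketch: the helper name columns_not_converted is hypothetical (not part of pandas), and it assumes every requested dtype is something np.dtype() can parse.

```python
import numpy as np
import pandas as pd


def columns_not_converted(frame, dtype_map):
    """List the columns whose dtype after astype(dtype_map) differs from the request.

    Hypothetical helper for illustration; assumes each requested dtype
    can be parsed by np.dtype().
    """
    converted = frame.astype(dtype_map)
    return [
        col
        for col, requested in dtype_map.items()
        if converted[col].dtype != np.dtype(requested)
    ]


df = pd.DataFrame([{'n': 0, 's': 'qwerty'}] * 3)

# Columns for which the dictionary-based conversion did not take effect.
print(columns_not_converted(df, {'n': 'float64', 's': '|S10'}))
```

On the version reported here the printed list would be expected to contain 's'; on a version where the behaviour is fixed it should be empty.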

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : 67a3d4241ab84419856b84fc3ebc9abcbe66c6b3
python : 3.8.6.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.104-microsoft-standard
Version : #1 SMP Wed Feb 19 06:37:35 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.1.4
numpy : 1.19.4
pytz : 2020.4
dateutil : 2.8.1
pip : 20.1.1
setuptools : 47.3.1.post20200622
Cython : 0.29.21
pytest : 5.4.3
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : 2.8.5 (dt dec pq3 ext lo64)
jinja2 : 2.11.2
IPython : 7.19.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : 0.8.4
fastparquet : 0.4.1
gcsfs : None
matplotlib : 3.2.2
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 2.0.0
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.0
sqlalchemy : 1.3.18
tables : 3.6.1
tabulate : None
xarray : 0.16.1
xlrd : None
xlwt : None
numba : 0.51.2

Comment From: mroeschke

This looks to work on main now (and I believe we have testing for this), so closing.