Research

  • [X] I have searched the [pandas] tag on StackOverflow for similar questions.

  • [X] I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

https://stackoverflow.com/questions/21287624/convert-pandas-column-containing-nans-to-dtype-int

Question about pandas

I have a Python list of dicts in which every dict has the same keys. I first convert the list to a DataFrame, then left-join it with another DataFrame, and finally convert the joined DataFrame back to a list of dicts.

import pandas as pd

listdata1 = [{'intc': 1, 'strc': 'a', 'floatc': 1.0}, {'intc': None, 'strc': 'b', 'floatc': 2.0}, {'intc': 3, 'strc': 'c', 'floatc': 3.0}, ...]
listdata2 = [...]
pddata1 = pd.DataFrame(listdata1)
# Turn the float values (1.0, NaN, 3.0) into ints, with pd.NA for missing values
pddata1['intc'] = pddata1['intc'].map(lambda x: pd.NA if pd.isna(x) else int(x))
pddata2 = pd.DataFrame(listdata2)
padata_merge = pd.merge(pddata1, pddata2, how='left', on=['somecolumn', ...])
listdata_merge = padata_merge.to_dict('records')
# Replace pd.NA with None so the records can be inserted into the database
for item in listdata_merge:
    if item['intc'] is pd.NA:
        item['intc'] = None

The inconvenience is with integer data that contains None values, like intc in listdata1. pddata1 will be:

       intc strc  floatc
    0   1.0    a     1.0
    1   NaN    b     2.0
    2   3.0    c     3.0
    ...

So I need to convert the intc values 1.0, NaN, 3.0 to 1, pd.NA, 3 with:

    pddata1['intc'] = pddata1['intc'].map(lambda x: pd.NA if pd.isna(x) else int(x))

Then I need a for loop to change the pd.NA values to None in listdata_merge. With these two operations, the intc data can be inserted into the database successfully, into the corresponding int field.

Is there a better way to process data in this situation? For example, something like pddata1 = pd.DataFrame(listdata1, someintcolumprocesspara=True), so that I directly get data like this:

       intc strc  floatc
    0     1    a     1.0
    1 pd.NA    b     2.0
    2     3    c     3.0
    ...

And likewise something like listdata_merge = padata_merge.to_dict('records', somepanaprocesspara=True), so that I directly get the int data with None instead of pd.NA.

Comment From: phofl

Please ask such questions on Stack Overflow.

You can call convert_dtypes on your object.
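
A minimal sketch of that suggestion, using the three complete records from the question. It calls convert_dtypes on the intc column only, so the other columns keep their inferred dtypes. The helper to_records_with_none is a hypothetical name introduced here for illustration, not a pandas API; it replaces any missing value with None after to_dict('records').

```python
import pandas as pd

listdata1 = [
    {"intc": 1, "strc": "a", "floatc": 1.0},
    {"intc": None, "strc": "b", "floatc": 2.0},
    {"intc": 3, "strc": "c", "floatc": 3.0},
]

pddata1 = pd.DataFrame(listdata1)
# intc is inferred as float64 because of the None entry;
# convert_dtypes turns it into the nullable Int64 dtype (missing -> pd.NA).
pddata1["intc"] = pddata1["intc"].convert_dtypes()


def to_records_with_none(df):
    """Hypothetical helper: list of dicts with missing values as None."""
    return [
        {key: (None if pd.isna(value) else value) for key, value in row.items()}
        for row in df.to_dict("records")
    ]


listdata_out = to_records_with_none(pddata1)
```

Calling convert_dtypes on the whole DataFrame also works, but it converts every column to its best nullable dtype, which may be more than is wanted here.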

Comment From: flyly0755

convert_dtypes

Thank you very much, that is exactly what I want :) But I have a new question. Pandas methods usually support an inplace parameter, which changes the original df directly, such as dfdata.drop_duplicates(keep='first', inplace=True). However, convert_dtypes does not seem to support an inplace parameter; the result is only available through assignment: new_dfdata = dfdata.convert_dtypes()

Comment From: phofl

Inplace is a pattern we are actively trying to get rid of.
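
As a sketch of the alternative, the result can be assigned back to the same name, and calls can be chained. The sample data below is made up for illustration; only drop_duplicates and convert_dtypes come from the discussion above.

```python
import pandas as pd

# Illustrative data: the first and last rows are duplicates.
dfdata = pd.DataFrame({"intc": [1.0, None, 1.0], "strc": ["a", "b", "a"]})

# Rebind the name to the result instead of passing inplace=True.
dfdata = dfdata.drop_duplicates(keep="first").convert_dtypes()
```

Rebinding the name has the same visible effect for the caller as inplace=True, and it works uniformly for methods that never had an inplace parameter.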