Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame({'a': [1], 'val': [1.35]})
# no groupby, result is 'object', correct behavior
print(df['val'].transform(lambda x: x.map('+{}'.format)).dtype)
# result can't be converted to float, result is 'object', correct behavior
print(df.groupby('a')['val'].transform(lambda x: x.map('(+{})'.format)).dtype)
# result is 'float64' and plus sign is lost, INCORRECT behavior (should be 'object')
print(df.groupby('a')['val'].transform(lambda x: x.map('+{}'.format)).dtype)
# convert column to object type before transform:
df['val'] = df['val'].astype(object)
# same code as before, but now result is 'object' and plus sign is shown, correct behavior
print(df.groupby('a')['val'].transform(lambda x: x.map('+{}'.format)).dtype)

Problem description

It is sometimes useful to use DataFrame.groupby()[col].transform() to create a string column from a float64 column. In the case that led to this report, I wanted a string column showing the actual value for each group's minimum and, e.g., "+1.2" for the difference between each other row and that minimum. However, pandas unexpectedly converts this column back to float64, dropping the '+' sign on the non-minimum rows. This happens only with groupby, not when the transformation is applied to the whole column, and only if the original column is a float type and the resulting strings can be parsed back into floats.

I don't think users expect string columns to be converted back to the type of the original column (float in this case), even if that is possible. Pandas also doesn't seem to promise this behavior, since it only occurs with groupby and not when applying transform to the whole column.

It is possible to work around this by converting the original column to object type before applying the groupby and transform, but I don't think that should be necessary.
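The use case and workaround described above can be sketched roughly as follows. The data and the helper name `label_relative_to_min` are made up for illustration. The cast to object before grouping is the workaround from this report, so the strings should survive on affected versions as well.

```python
import pandas as pd

# Hypothetical data: two groups with a float column.
df = pd.DataFrame({"a": [1, 1, 2, 2], "val": [1.0, 2.5, 3.0, 3.5]})


def label_relative_to_min(group):
    """Show the group minimum as-is and every other row as '+<difference>'."""
    low = group.min()
    return group.map(lambda v: str(v) if v == low else "+{}".format(v - low))


# Workaround: cast the column to object before groupby/transform so the
# string result is not converted back to float64.
as_object = df.assign(val=df["val"].astype(object))
labels = as_object.groupby("a")["val"].transform(label_relative_to_min)

print(labels.tolist())
```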

I would recommend dropping this conversion and creating a new Series with the same dtype as the one returned by the transform function.

Expected Output

object
object
object
object

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.20.3
pytest: 2.8.5
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
xarray: None
IPython: 5.7.0
sphinx: 1.6.1
patsy: 0.4.1
dateutil: 2.4.2
pytz: 2017.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.6
feather: None
matplotlib: 2.2.2
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.5.0
bs4: 4.6.0
html5lib: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Comment From: mroeschke

Looks like we get the expected dtypes on master. Could use a test

In [3]: df = pd.DataFrame({'a': [1], 'val': [1.35]})
   ...: # no groupby, result is 'object', correct behavior
   ...: print(df['val'].transform(lambda x: x.map('+{}'.format)).dtype)
   ...: # result can't be converted to float, result is 'object', correct behavior
   ...: print(df.groupby('a')['val'].transform(lambda x: x.map('(+{})'.format)).dtype)
   ...: # result is 'float64' and plus sign is lost, INCORRECT behavior (should be 'object')
   ...: print(df.groupby('a')['val'].transform(lambda x: x.map('+{}'.format)).dtype)
   ...: # convert column to object type before transform:
   ...: df['val']=df['val'].astype(object)
   ...: # same code as before, but now result is 'object' and plus sign is shown, correct behavior
   ...: print(df.groupby('a')['val'].transform(lambda x: x.map('+{}'.format)).dtype)
object
object
object
object
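A regression test for this could look something like the sketch below. The test name is hypothetical, and this is not necessarily the test that ended up in the pandas test suite. It checks the values and only that the dtype is not float, on the assumption that the exact string dtype may differ between pandas versions.

```python
import pandas as pd
from pandas.api.types import is_float_dtype


def test_groupby_transform_keeps_string_result():
    # Hypothetical test name; the data is copied from the original report.
    df = pd.DataFrame({"a": [1], "val": [1.35]})

    result = df.groupby("a")["val"].transform(lambda x: x.map("+{}".format))

    # The strings must not be cast back to the float dtype of the input column.
    assert not is_float_dtype(result.dtype)
    # The plus sign must be preserved.
    assert result.tolist() == ["+1.35"]


# Run directly when not using pytest.
test_groupby_transform_keeps_string_result()
```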

Comment From: xmu23

Hi, I am a newcomer and would like to try writing a test for this good first issue. Can anyone tell me how to figure out which file this test method should go in?

Comment From: jackgoldsmith4

Hey @xmu23, I'm also a newcomer and have been grabbing some of these issues. If you're still interested, this page and this page are really useful for writing tests and figuring out where to put them.

Comment From: andrewchen1216

Does this issue still require a test? If so, can I take it?