import pandas as pd

total = pd.DataFrame({
    "all_eternal": [1, 2, 1, 1, 0, 1],
    "country_name": ["United States", "Germany", "Germany", "Germany", "France", "United States"],
    "amount": [20, 40, 30, 10, 5, 6],
})
When I apply
totalbycountrynew = total.groupby(["country_name"], as_index=False).apply(
    lambda g:
        pd.Series({
            "LTVav": g.amount.mean(),
            "LTVav_without_trials": g.amount[g.all_eternal != 1].mean()
        })
)
I would like to see:
country_name   LTVav  LTVav_without_trials
France           5.0                   5.0
Germany         26.7                  40.0
United States   13.0                   nan
Instead I see:
   LTVav  LTVav_without_trials
0    5.0                   5.0
1   26.7                  40.0
2   13.0                   nan
Without as_index=False it works as expected: the country names are there, but they are in the index. While there is a workaround, I would say this is unexpected (or at least non-intuitive) behaviour.
(sorry for styling, I don't know Markdown)
Comment From: jorisvandenbossche
> (sorry for styling, I don't know Markdown)
https://guides.github.com/features/mastering-markdown/#GitHub-flavored-markdown (the first example shows how to make code blocks)
Comment From: jreback
you don't need ``as_index=False``
In [25]: total.groupby(["country_name"]).apply(
    ...:     lambda g:
    ...:         pd.Series({
    ...:             "LTVav": g.amount.mean(),
    ...:             "LTVav_without_trials": g.amount[g.all_eternal != 1].mean()
    ...:         })
    ...: )
    ...:
Out[25]:
                   LTVav  LTVav_without_trials
country_name
France          5.000000                   5.0
Germany        26.666667                  40.0
United States  13.000000                   NaN
Comment From: cryptotvync
@jreback did you read my whole comment? The countries' names are then in the index, not in a regular column as usual.
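For reference, one way to move the names back into a column is to call `reset_index()` on the result of the plain `groupby`. This is only a minimal sketch of what the workaround might look like; the helper name `summarize` and the explicit column selection before `apply` are assumptions added for illustration, not something from the report itself.

```python
import pandas as pd

total = pd.DataFrame({
    "all_eternal": [1, 2, 1, 1, 0, 1],
    "country_name": ["United States", "Germany", "Germany", "Germany", "France", "United States"],
    "amount": [20, 40, 30, 10, 5, 6],
})


def summarize(g):
    # Hypothetical helper: per-country mean, with and without rows where all_eternal == 1.
    return pd.Series({
        "LTVav": g["amount"].mean(),
        "LTVav_without_trials": g.loc[g["all_eternal"] != 1, "amount"].mean(),
    })


# Group with the default as_index=True, so the keys land in the index,
# then reset_index() turns that index back into a regular column.
result = (
    total.groupby("country_name")[["amount", "all_eternal"]]
    .apply(summarize)
    .reset_index()
)
print(result)
```

With this, `country_name` is an ordinary column again and the rows carry a default integer index, which seems to match the desired output above.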
Comment From: TomAugspurger
@cryptotvync can you take the time to properly format your desired output then? @jreback's suggestion looks correct.