Code Sample

    # imports assumed by this test method
    import numpy as np
    import pandas as pd
    from datetime import datetime

    def test_groupby_aggregate_item_by_item(self):
        def test_df():
            s = pd.DataFrame(np.array([[13, 14, 15, 16]]),
                             index=[0],
                             columns=['b', 'c', 'd', 'e'])
            num = np.array([[s, s, s, datetime.strptime('2016-12-28', "%Y-%m-%d"), 'asdf', 24],
                            [s, s, s, datetime.strptime('2016-12-28', "%Y-%m-%d"), 'asdf', 6]])
            columns = ['a', 'b', 'c', 'd', 'e', 'f']
            idx = [x for x in xrange(0, len(num))]
            return pd.DataFrame(num, index=idx, columns=columns)
        c = [test_df().sort_values(['d', 'e', 'f']),
             test_df().sort_values(['d', 'e', 'f'])]
        df = pd.concat(c)
        df = df[["e", "a"]].copy().reset_index(drop=True)
        df["e_idx"] = df["e"]
        what = [0, 0.5, 0.5, 1]

        def x():
            df.groupby(["e_idx", "e"])["a"].quantile(what)
        self.assertRaisesRegexp(ValueError,
                                "'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'",
                                x)

Problem description

The ValueError raised from the _GroupBy fallback to _aggregate_item_by_item is bare, so the message the user sees is vague:

                except (AttributeError):
>                   raise ValueError
E                   ValueError

core/groupby.py:592: ValueError

The proposed change passes the caught AttributeError's message into the ValueError so the user can see what actually failed.

Expected Output

                except (AttributeError) as e:
>                   raise ValueError(e)
E                   ValueError: 'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'

core/groupby.py:592: ValueError
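For context, a paraphrased sketch of the fallback call site around core/groupby.py:592 with the proposed change applied (the surrounding code is approximated from the traceback, not copied verbatim from the pandas source):

    # inside the groupby wrapper that falls back to item-by-item aggregation
    try:
        return self._aggregate_item_by_item(name, *args, **kwargs)
    except AttributeError as e:
        # forward the original message ("'SeriesGroupBy' object has no
        # attribute '_aggregate_item_by_item'") instead of a bare ValueError
        raise ValueError(e)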

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: b895968a90c918709b01b49db427f9b5ab28c1fe
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.19.0+311.gb895968.dirty
nose: 1.3.7
pip: 9.0.1
setuptools: 32.3.1
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_datareader: None

Comment From: ghost

Possibly related:

- https://github.com/pandas-dev/pandas/issues/5755

Comment From: jreback

This looks like a duplicate of #11759?

Comment From: ghost

I merged that branch and tested again; it seems like the same issue occurs, with a ValueError "'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'", since the test that I made still passes.

Any suggestions?

Also, will do some cleanup on the test case, based on the reviews.

Thanks.

Comment From: jreback

@jmunsch you could have simply updated the original PR, but now that it's closed GH won't let you re-open it, so you can open a new one.

Comment From: ghost

So after doing some more research I tried moving _aggregate_item_by_item into SeriesGroupBy, but then I noticed some TypeErrors like this: https://github.com/pandas-dev/pandas/pull/8472

When trying:

> /Users/jmunsch/Desktop/dev/pandas/pandas/core/groupby.py(2631)_aggregate_item_by_item()
-> cannot_agg.append(item)
(Pdb) item
   num1  num2
0    13    14
(Pdb) e
TypeError('Indexing a Series with DataFrame is not supported, use the appropriate DataFrame column',)
(Pdb) 
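For reference, the TypeError seen in the pdb session above is the one pandas raises when a Series is indexed with a DataFrame; a minimal illustration (the variable names here are made up for the example):

    import pandas as pd

    s = pd.Series([1, 2, 3])
    key = pd.DataFrame({'a': [True, False, True]})

    # raises TypeError: Indexing a Series with DataFrame is not supported,
    # use the appropriate DataFrame column
    s[key]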

I followed the linked issue #11759 more closely and retried with:

    def test_groupby_aggregate_item_by_item(self):
        s = pd.DataFrame(np.array([[13, 14]]),
                         index=[0],
                         columns=['num1', 'num2'])
        c = [pd.DataFrame([[s.copy(), 'zzz']], index=range(2), columns=['nums', 'node']),
             pd.DataFrame([[s.copy(), 'xxx']], index=range(2), columns=['nums', 'node'])]

        df = pd.concat(c)
        df = df[["node", "nums"]].copy().reset_index(drop=True)
        df["node_idx"] = df["node"]

        y = df.set_index(["node_idx", "node"]).groupby("nums").quantile([0.25, 1])

But it returns an empty result, so I must be using this incorrectly. Closing.

Comment From: jreback

@jmunsch you are doing some really odd things in your example and seem to have embedded DataFrames within DataFrames.

Please show the input data construction and what you are after.
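For reference, a flat construction along those lines might look like the sketch below (column names and values are guesses based on the snippets above, not the reporter's actual data):

    import pandas as pd

    # one numeric value per row instead of a DataFrame embedded in each cell
    df = pd.DataFrame({'node': ['zzz', 'zzz', 'xxx', 'xxx'],
                       'num1': [13, 14, 13, 14]})

    # per-group quantiles of the numeric column, using a list of quantiles
    # as in the original report
    result = df.groupby('node')['num1'].quantile([0.25, 1.0])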