Pandas Unclear ValueError on core.groupby

Code Sample

    def test_groupby_aggregate_item_by_item(self):
        def test_df():
            s = pd.DataFrame(np.array([[13, 14, 15, 16]]),
                             index=[0],
                             columns=['b', 'c', 'd', 'e'])
            num = np.array([[s, s, s, datetime.strptime('2016-12-28', "%Y-%m-%d"), 'asdf', 24],
                            [s, s, s, datetime.strptime('2016-12-28', "%Y-%m-%d"), 'asdf', 6]])
            columns = ['a', 'b', 'c', 'd', 'e', 'f']
            idx = [x for x in xrange(0, len(num))]
            return pd.DataFrame(num, index=idx, columns=columns)
        c = [test_df().sort_values(['d', 'e', 'f']),
             test_df().sort_values(['d', 'e', 'f'])]
        df = pd.concat(c)
        df = df[["e", "a"]].copy().reset_index(drop=True)
        df["e_idx"] = df["e"]
        what = [0, 0.5, 0.5, 1]

        def x():
            df.groupby(["e_idx", "e"])["a"].quantile(what)
        self.assertRaisesRegexp(ValueError,
                                "'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'",
                                x)

Problem description

The ValueError raised in _GroupBy._aggregate_item_by_item carries no message, so the user cannot tell what went wrong.

                except (AttributeError):
>                   raise ValueError
E                   ValueError

core/groupby.py:592: ValueError

The proposed change passes the message of the caught AttributeError to the ValueError so the user can see it.

Expected Output

                except (AttributeError) as e:
>                   raise ValueError(e)
E                   ValueError: 'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'

core/groupby.py:592: ValueError
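
To illustrate the proposed change outside of pandas, here is a minimal sketch of the same pattern. The class `FakeGroupBy` and the helper `error_message` are hypothetical names made up for this example and are not part of pandas; only the `except AttributeError as e: raise ValueError(e)` idea comes from the traceback above.

```python
class FakeGroupBy(object):
    """Hypothetical stand-in for a groupby object; this is not pandas code."""

    def aggregate(self, func):
        try:
            # The method below is deliberately not defined on this class,
            # so the lookup fails with AttributeError, as in the traceback.
            return self._aggregate_item_by_item(func)
        except AttributeError as e:
            # Forward the original message instead of raising a bare ValueError.
            raise ValueError(str(e))


def error_message(obj):
    """Return the text of the ValueError raised by obj.aggregate, if any."""
    try:
        obj.aggregate(sum)
    except ValueError as err:
        return str(err)
    return None


message = error_message(FakeGroupBy())
print(message)
```

With a bare `raise ValueError` the printed message would be empty; with the message forwarded, it names the missing attribute.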

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: b895968a90c918709b01b49db427f9b5ab28c1fe
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.19.0+311.gb895968.dirty
nose: 1.3.7
pip: 9.0.1
setuptools: 32.3.1
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_datareader: None

Comment From: ghost

Possibly related: - https://github.com/pandas-dev/pandas/issues/5755

Comment From: jreback

this looks like a duplicate of #11759 ?

Comment From: ghost

I merged that branch and tested again. The same issue still occurs: the test I wrote still passes, raising "ValueError: 'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'".

Any suggestions?

Also, will do some cleanup on the test case, based on the reviews.

Thanks.

Comment From: jreback

@jmunsch you could have simply updated the original PR, but now that it's closed GH won't let you re-open it, so you can open a new one.

Comment From: ghost

So after doing some more research I tried moving _aggregate_item_by_item into series, but then I noticed some TypeErrors like the ones in https://github.com/pandas-dev/pandas/pull/8472

Debugger output when trying that:

> /Users/jmunsch/Desktop/dev/pandas/pandas/core/groupby.py(2631)_aggregate_item_by_item()
-> cannot_agg.append(item)
(Pdb) item
   num1  num2
0    13    14
(Pdb) e
TypeError('Indexing a Series with DataFrame is not supported, use the appropriate DataFrame column',)
(Pdb)

I followed the linked issue 11579 more closely, and retried with:

    def test_groupby_aggregate_item_by_item(self):
        s = pd.DataFrame(np.array([[13, 14]]),
                         index=[0],
                         columns=['num1', 'num2'])
        c = [pd.DataFrame([[s.copy(), 'zzz']], index=range(2), columns=['nums', 'node']),
             pd.DataFrame([[s.copy(), 'xxx']], index=range(2), columns=['nums', 'node'])]

        df = pd.concat(c)
        df = df[["node", "nums"]].copy().reset_index(drop=True)
        df["node_idx"] = df["node"]

        y = df.set_index(["node_idx", "node"]).groupby("nums").quantile([0.25, 1])

But it returns an empty result, so I must be using this incorrectly. Closing.

Comment From: jreback

@jmunsch you are doing some really odd things in your example and seem to have embedded dataframes within dataframes.

please show the input data construction and what you are after.
