Revisiting some 6 month old class that subclasses Series and overrides __finalize__ as:

    def __finalize__(self, other, method=None, **kwargs):
        pd.Series.__finalize__(self, other, method=method, **kwargs)
        self.metadata = other.metadata.copy()
        self.metadata.parent = self
        return self

With 0.21.1 these objects can no longer be rendered:

>>> ser
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/base.py", line 80, in __repr__
    return str(self)
  File "pandas/core/base.py", line 60, in __str__
    return self.__bytes__()
  File "pandas/core/base.py", line 72, in __bytes__
    return self.__unicode__().encode(encoding, 'replace')
  File "pandas/core/series.py", line 1066, in __unicode__
    max_rows=max_rows, length=show_dimensions)
  File "pandas/core/series.py", line 1109, in to_string
    max_rows=max_rows)
  File "pandas/io/formats/format.py", line 176, in __init__
    self._chk_truncate()
  File "pandas/io/formats/format.py", line 190, in _chk_truncate
    series.iloc[-row_num:]))
  File "pandas/core/reshape/concat.py", line 213, in concat
    return op.get_result()
  File "pandas/core/reshape/concat.py", line 377, in get_result
    return cons(mgr, name=name).__finalize__(self, method='concat')
  File "pdsm/core/series.py", line 125, in __finalize__
    self.metadata = other.metadata.copy()
AttributeError: '_Concatenator' object has no attribute 'metadata'

I had assumed that other would always be of the same class as self. Is _Concatenator a recent addition?

Comment From: jreback

can you demonstrate an actual example?

Comment From: jbrockmendel

class Metadata(object):
    def __init__(self, parent):
        self.parent = parent

    def copy(self):
        return Metadata(parent=self.parent)


class MetaSeries(pd.Series):
    _metadata = pd.Series._metadata + ['metadata']

    @property
    def _constructor(self):
        return MetaSeries

    def __init__(self, *args, **kwargs):
        super(MetaSeries, self).__init__(*args, **kwargs)
        md = Metadata(parent=self)
        object.__setattr__(self, 'metadata', md)

    def __finalize__(self, other, method=None, **kwargs):
        pd.Series.__finalize__(self, other, method=method, **kwargs)
        md = other.metadata.copy()
        object.__setattr__(self, 'metadata', md)
        md.parent = self
        return self
>>> ser = pd.Series(range(100))
>>> mser = MetaSeries(ser)
>>> mser[:5]
0    0
1    1
2    2
3    3
4    4
dtype: int64
>>> mser
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pandas/core/base.py", line 80, in __repr__
    return str(self)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/base.py", line 60, in __str__
    return self.__bytes__()
  File "/usr/local/lib/python2.7/site-packages/pandas/core/base.py", line 72, in __bytes__
    return self.__unicode__().encode(encoding, 'replace')
  File "/usr/local/lib/python2.7/site-packages/pandas/core/series.py", line 1066, in __unicode__
    max_rows=max_rows, length=show_dimensions)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/series.py", line 1109, in to_string
    max_rows=max_rows)
  File "/usr/local/lib/python2.7/site-packages/pandas/io/formats/format.py", line 176, in __init__
    self._chk_truncate()
  File "/usr/local/lib/python2.7/site-packages/pandas/io/formats/format.py", line 190, in _chk_truncate
    series.iloc[-row_num:]))
  File "/usr/local/lib/python2.7/site-packages/pandas/core/reshape/concat.py", line 213, in concat
    return op.get_result()
  File "/usr/local/lib/python2.7/site-packages/pandas/core/reshape/concat.py", line 377, in get_result
    return cons(mgr, name=name).__finalize__(self, method='concat')
  File "subclass.py", line 27, in __finalize__
    self.metadata = other.metadata.copy()
AttributeError: '_Concatenator' object has no attribute 'metadata'

Comment From: jorisvandenbossche

@jbrockmendel That is code that I recently rewrote (https://github.com/pandas-dev/pandas/pull/17728), so it is quite possible I introduces a bug.

However, I cannot reproduce the exact error you get. I get:

In [3]: mser
..
ValueError: All objects passed were None

and I already get to for 0.20.3 as well.

Comment From: jorisvandenbossche

Ah, there was a return self missing in the __finalize__ implementation in the code example above. Now I get the error you were showing (and now it also works in older pandas version).

And I suppose that I meant to pass the first object from the list to concat instead of self. Does that make sense?

However, it seems that I just did how it was already done before (using this __finalize__(self) construct where self is the _Concatenator object). So not sure why it started failing.

Comment From: jorisvandenbossche

OK, I understand what changed the behaviour. Before my PR (https://github.com/pandas-dev/pandas/pull/17728), concat of serieses always returned a Series, while my PR changed it to return an instance of the class of the first object. So in practice, the __finalize__ of your MetaSeries is now called instead of Series.__finalize__, and the latter has a check that self is NDFrame like:

https://github.com/pandas-dev/pandas/blob/8acdf801c501a0ce2ac14ddd676cd248f0f32180/pandas/core/generic.py#L4000-L4003

Given that, I would argue the current code might be correct, as passing through the metadata of the first object is maybe not the best default? (Eg that would ignore different metadata in the different objects you are concatting). The subclass can always have specialized code in __finalize__ for checking such metadata

Comment From: jbrockmendel

Ah, there was a return self missing in the finalize implementation in the code example above. Now I get the error you were showing (and now it also works in older pandas version).

Good catch. I just edited the previous comment to fix that mistake.

Given that, I would argue the current code might be correct, as passing through the metadata of the first object is maybe not the best default? (Eg that would ignore different metadata in the different objects you are concatting).

I think you're right that the problem above would be fixed by having a isinstance(other, NDFrame) check in the subclass's __finalize__ method. (In the non-toy version MetaSeries, the metadata object is responsible for figuring out how to propagate itself under various operations.)

concat is well outside the areas that I'm familiar with (and I was surprised that it was relevant for __repr__). For my own understanding, why is _Concatenator a reasonable object to pass to __finalize__? I don't see how it could usefully propagate any metadata.

Comment From: jorisvandenbossche

I was surprised that it was relevant for __repr__

Because it is a truncated repr, concatting first and last part.

For my own understanding, why is _Concatenator a reasonable object to pass to finalize? I don't see how it could usefully propagate any metadata.

That object holds more information, eg the list of all concatenated objects. So for example you could check for a certain metadata field if all objects have the same value, and only in that case set it on the resulting object.

Comment From: jbrockmendel

This is probably closable, possibly worth keeping around on some tracker for "subclassing pitfalls"

Comment From: jorisvandenbossche

@jbrockmendel want to do a PR to mention this in the subclassing docs?