Is your feature request related to a problem?

This might help with two things:

  1. A coordination point for 3rd-party libraries creating objects they'd like to turn into DataFrames, and users of those libraries
  2. Possibly, simplification of DataFrame.__init__

Describe the solution you'd like

A new top-level pd.dataframe function.

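from functools import singledispatch
from typing import Any

import numpy as np
from pandas import Index

@singledispatch
def dataframe(data: Any, index: Index, columns: Index, copy: bool = False):
    """
    Create a pandas DataFrame from data.
    """

@dataframe.register(np.ndarray)  # register on the generic function, not on singledispatch
def _(data, index, columns, copy=False):
    pass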
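
To make the proposal concrete, here is a minimal runnable sketch of how such a function could behave. It is not an actual pandas API: the module-level `dataframe` function, the `_from_ndarray` helper, the optional `index`/`columns` defaults, and the choice to raise `TypeError` for unregistered types are all assumptions made for illustration.

```python
from functools import singledispatch
from typing import Any, Optional

import numpy as np
import pandas as pd


@singledispatch
def dataframe(data: Any, index: Optional[pd.Index] = None,
              columns: Optional[pd.Index] = None,
              copy: bool = False) -> pd.DataFrame:
    """Fallback, used when no implementation is registered for type(data)."""
    # Assumption: unknown types are rejected rather than guessed at.
    raise TypeError(f"cannot build a DataFrame from {type(data).__name__}")


@dataframe.register(np.ndarray)
def _from_ndarray(data, index=None, columns=None, copy=False):
    # Hypothetical ndarray implementation: defer to the existing constructor.
    return pd.DataFrame(data, index=index, columns=columns, copy=copy)


# Dispatch is driven by the type of the first positional argument.
df = dataframe(np.arange(6).reshape(3, 2), columns=["a", "b"])
```

Only the first positional argument takes part in dispatch, so `index`, `columns`, and `copy` are passed through unchanged to whichever implementation is selected.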

API breaking implications

None

Describe alternatives you've considered

xref https://github.com/pandas-dev/pandas/pull/32844, which attempted this for DataFrame.__init__. That was a non-starter since it exposed our internal BlockManager too publicly (see https://github.com/pandas-dev/pandas/pull/32844#issuecomment-601494850), so we'd need to do this on a top-level function instead.

Comment From: simonjayhawkins

xref #32908 for an alternative.

Comment From: jbrockmendel

Just checking if I understand the idea:

Downstream library Foo has a class ModelData with something like a to_frame() method and by writing

@pd.dataframe.register(ModelData)
def _(model_data, *args, **kwargs):
    return model_data.to_frame(*args, **kwargs)

they make pd.dataframe Just Work on ModelData objects?

Comment From: TomAugspurger

Yep. functools.singledispatch looks at the type of the first argument and dispatches on that (with a fallback default if desired). So when pd.dataframe encounters a ModelData instance, it would call the function registered for that type, which would be expected to return an initialized pandas DataFrame.
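
A small sketch of that dispatch behaviour, from the downstream library's point of view. Everything here is a stand-in: `dataframe` plays the role of the proposed pd.dataframe (which does not exist in pandas), and `ModelData`, `SubModelData`, and `to_frame()` are hypothetical downstream names.

```python
from functools import singledispatch

import pandas as pd


@singledispatch
def dataframe(data):
    # Stand-in for the proposed pd.dataframe. The fallback default here
    # simply hands the data to the regular DataFrame constructor.
    return pd.DataFrame(data)


class ModelData:
    """Hypothetical downstream class that knows how to become a DataFrame."""

    def __init__(self, values):
        self.values = values

    def to_frame(self):
        return pd.DataFrame({"value": self.values})


class SubModelData(ModelData):
    """Subclass with no registration of its own."""


@dataframe.register(ModelData)
def _(model_data):
    # The registered function is expected to return an initialized DataFrame.
    return model_data.to_frame()


frame = dataframe(ModelData([1, 2, 3]))   # uses the ModelData registration
sub = dataframe(SubModelData([4]))        # resolved through the class hierarchy
default = dataframe({"x": [1, 2]})        # no registration, so the fallback runs
```

Because singledispatch resolves through the method resolution order, subclasses of a registered type pick up the parent's implementation without registering again.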

Comment From: jbrockmendel

The experience with _constructor has soured me on the cost/benefit tradeoff of adding customization hooks for downstream libraries.