I'm interested in whether there are recommended best practices for extending DataFrames.
I work on a project where we have an API for querying a database that returns DataFrames. I basically want a way to associate extra metadata with each column in the DataFrame. At a minimum, this would allow me to run commands like `df.some_column.canonical_url` and get a link back to a web service.
There are some other things I would like to do, such as giving columns aliases for better legends when plotted.
I know I could return a wrapped DataFrame object from the API, but the DataFrame would lose its wrapper once operations were performed.
Comment From: jreback
see #2485 for a long discussion.
You can do this in >= 0.14.0 by utilizing `__finalize__()`.
see this PR's tests for basically doing this. https://github.com/pydata/pandas/pull/6931/files
It's straightforward, but you have to deal with the propagation issues, e.g. how to decide between different metadata.
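A minimal sketch of the idea, assuming a DataFrame subclass that keeps per-column metadata in a dict. The names `AnnotatedFrame` and `column_urls`, and the example.com URLs, are hypothetical and not taken from the linked PR. The propagation policy here (copy the source frame's mapping and drop entries for columns that are no longer present) is just one possible choice, and it does not try to resolve conflicts when several inputs carry different metadata.

```python
import pandas as pd


class AnnotatedFrame(pd.DataFrame):
    # Hypothetical subclass: names listed in _metadata are treated by pandas
    # as instance attributes rather than as column names.
    _metadata = ["column_urls"]

    @property
    def _constructor(self):
        # Make pandas operations return AnnotatedFrame instead of DataFrame.
        return AnnotatedFrame

    def __finalize__(self, other, method=None, **kwargs):
        # Let pandas do its default propagation first.
        super().__finalize__(other, method=method, **kwargs)
        # Assumed policy: take the mapping from the source object (if it has
        # one) and keep only entries for columns that survive in the result.
        urls = getattr(other, "column_urls", None)
        if isinstance(urls, dict):
            kept = {col: url for col, url in urls.items() if col in self.columns}
            object.__setattr__(self, "column_urls", kept)
        return self


df = AnnotatedFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df.column_urls = {
    "a": "https://example.com/columns/a",  # placeholder URLs
    "b": "https://example.com/columns/b",
}

subset = df[["a"]]          # column selection keeps only the "a" entry
filtered = df[df["a"] > 1]  # row filtering keeps both entries
print(subset.column_urls)
print(filtered.column_urls)
```

This stores the metadata on the frame rather than on each column, so `df.some_column.canonical_url` would still need a Series subclass (or a lookup helper) on top of it.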
Comment From: jreback
closing, as the discussion is in #2485