Hello,
pd.DataFrame.to_dict accepts an orient parameter in ['dict', 'list', 'series', 'split', 'records', 'index'] (default: 'dict').
http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.DataFrame.to_dict.html
from_dict only accepts an orient parameter in ['columns', 'index'] (default: 'columns').
I think that from_dict should accept the same orient values for API consistency.
http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.DataFrame.from_dict.html
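To make the asymmetry concrete, here is a minimal sketch. The sample frame and variable names are made up for illustration, and the behaviour shown is what I would expect from the documented orient values rather than a statement about every pandas version.

```python
import pandas as pd

# Hypothetical sample frame, used only for illustration
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

# to_dict offers several output shapes
as_dict = df.to_dict()                       # column -> {index -> value}
as_index = df.to_dict(orient="index")        # index -> {column -> value}
as_records = df.to_dict(orient="records")    # list of row dicts
as_split = df.to_dict(orient="split")        # keys: 'index', 'columns', 'data'

# from_dict can invert the 'dict' and 'index' shapes
from_columns = pd.DataFrame.from_dict(as_dict, orient="columns")
from_index = pd.DataFrame.from_dict(as_index, orient="index")

# but it rejects orient="split", so that shape has no from_dict inverse
try:
    pd.DataFrame.from_dict(as_split, orient="split")
    split_supported = True
except ValueError:
    split_supported = False
```

Only two of the to_dict shapes can be passed back through from_dict unchanged; the others need a different constructor or manual reshaping.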
The same applies to Panel:
http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.Panel.from_dict.html
to_dict: missing method for Panel
The same applies to Series:
http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.Series.to_dict.html
from_dict: missing method for Series
Kind regards
Comment From: jreback
you would have to show a specific example here of why this API change would actually be useful. Don't just list lots of code references.
Comment From: femtotrader
The use case is to be able to save Series, DataFrame, and Panel objects to MongoDB and also retrieve them.
Another feature to add to to_dict would be to output a dict of NumPy arrays instead of a dict of Python lists.
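As a rough sketch of the second request: a dict of NumPy arrays can already be built by hand with a comprehension. This is not a to_dict option, and the column names below are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, used only for illustration
df = pd.DataFrame({"price": [1.5, 2.5], "volume": [10, 20]})

# What to_dict gives today with orient="list": plain Python lists
as_lists = df.to_dict(orient="list")

# Hand-rolled alternative: one NumPy array per column
as_arrays = {name: df[name].to_numpy() for name in df.columns}
```

The request is essentially for to_dict to offer the second shape directly.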
Comment From: jreback
pls show a specific example and a use case - you are describing a very general case
Comment From: femtotrader
https://bitbucket.org/djcbeach/monary/issues/19/use-pandas-series-dataframe-and-panel-with
Comment From: jreback
still not sure WHY this would be a good thing to add to the API nor what you are actually asking
Comment From: bluenote10
> what you are actually asking
I guess he is referring to the problem that to_dict has more modes than from_dict, which means that some to_dict conversions don't have an inverse operation.
It's an API inconsistency, and a consistent API is a "good thing". Here is an example of a problem that arises from this inconsistency:
When converting a DataFrame to JSON we use the orient="split" option, because it seems to be the only option that preserves a DataFrame entirely (order of columns and index). For the inverse operation we would need the equivalent. We cannot use to_json/read_json directly (both of which support orient="split"), because the DataFrame is only a nested element in a bigger JSON structure, which requires operating on dicts. But pandas lacks the inverse operation for dicts. We tried temporarily converting the dict to a string so that we could call read_json on it with orient="split", but the double conversion makes it prohibitively slow.
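For illustration, here is a minimal sketch of the nested-JSON situation described above, with the split dict rebuilt through the plain DataFrame constructor instead of a second JSON pass. The payload layout and key names ("meta", "frame") are assumptions made up for this example, not our actual structure.

```python
import json

import pandas as pd

# Hypothetical frame with non-sorted columns and index, to show that order survives
df = pd.DataFrame({"b": [1, 2], "a": [3, 4]}, index=["row2", "row1"])

# The frame is only one element of a larger JSON document (layout is illustrative)
payload = {"meta": {"name": "example"}, "frame": df.to_dict(orient="split")}
text = json.dumps(payload)

# On the receiving side we only have dicts after parsing
loaded = json.loads(text)
split = loaded["frame"]

# Rebuild directly from the split dict, without serialising it to a string again
restored = pd.DataFrame(split["data"], index=split["index"], columns=split["columns"])
```

This avoids the dict-to-string-to-read_json detour, although it relies on dtype inference from plain Python values, so columns such as datetimes would need extra handling.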
Comment From: xysscn
> > what you are actually asking
> I guess he is referring to the problem that to_dict has more modes than from_dict, which means that some to_dict conversions don't have an inverse operation.
> It's an API inconsistency, and a consistent API is a "good thing". Here is an example of a problem that arises from this inconsistency:
> When converting a DataFrame to JSON we use the orient="split" option, because it seems to be the only option that preserves a DataFrame entirely (order of columns and index). For the inverse operation we would need the equivalent. We cannot use to_json/read_json directly (both of which support orient="split"), because the DataFrame is only a nested element in a bigger JSON structure, which requires operating on dicts. But pandas lacks the inverse operation for dicts. We tried temporarily converting the dict to a string so that we could call read_json on it with orient="split", but the double conversion makes it prohibitively slow.
I have the same problem as bluenote10: the double conversion is too slow for a frequently called API. Does anyone have a workaround?
Comment From: sjrl
I'm running into the same problem here. My current workaround is to use from_records to reverse to_dict(orient="split"):
import pandas as pd

# Initial DataFrame
df = pd.DataFrame.from_records([{"col1": "text_1", "col2": 1}, {"col1": "text_2", "col2": 2}])
# to_dict(orient="split") returns the keys 'index', 'columns' and 'data',
# which from_records accepts as keyword arguments, so unpacking reverses it
pd.DataFrame.from_records(**df.to_dict(orient="split"))
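A related option, assuming pandas 1.4 or later: as far as I know, both to_dict and from_dict accept orient="tight" there, which is close to "split" but also carries the index and column names. The sketch below uses a made-up named index to show the difference.

```python
import pandas as pd

# Hypothetical frame with a named index, used only for illustration
df = pd.DataFrame(
    {"col1": ["text_1", "text_2"], "col2": [1, 2]},
    index=pd.Index(["r1", "r2"], name="row_id"),
)

# 'tight' adds 'index_names' and 'column_names' to the 'split' keys
tight = df.to_dict(orient="tight")

# from_dict understands the same orient, so this is a direct inverse
restored = pd.DataFrame.from_dict(tight, orient="tight")
```

With orient="split" the index name would be lost, since that shape only holds 'index', 'columns' and 'data'.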