Code Sample, a copy-pastable example if possible
from pandas import to_datetime, period_range, DataFrame
import pandas as pd

print(pd.__version__)

start_of_time = to_datetime('2016-10-17 01:16:39.133000')
end_of_time = to_datetime('2017-01-04 23:58:37.905000')
avs_date_range = period_range(start_of_time, end_of_time, freq='D')
bins = DataFrame(dict(foo=[0] * len(avs_date_range), bar=[0] * len(avs_date_range)),
                 index=avs_date_range)
current = range(10)

for idx, bin in bins.iterrows():
    for i in range(6):
        bins.set_value(idx, 'foo', bin['foo'] + 1)
    f_count = bins.get_value(idx, 'foo')
    bins.set_value(idx, 'bar', len(current) - f_count)
print(bins)
Problem description
If I comment out the setting of the index, this works as expected. With a PeriodIndex defined, it raises a KeyError.
Output of pd.show_versions()
Comment From: jreback
.set_value is a fairly raw, low-level, non-public interface. Use .loc.
Further, what you are doing is quite non-performant; iterating over the rows is not recommended.
In [15]: bins
Out[15]:
bar foo
2016-10-17 0 0
2016-10-18 0 0
2016-10-19 0 0
2016-10-20 0 0
2016-10-21 0 0
... ... ...
2016-12-31 0 0
2017-01-01 0 0
2017-01-02 0 0
2017-01-03 0 0
2017-01-04 0 0
[80 rows x 2 columns]
In [16]: bins.loc[bins.index[-1], 'foo'] = 1
In [17]: bins
Out[17]:
bar foo
2016-10-17 0 0
2016-10-18 0 0
2016-10-19 0 0
2016-10-20 0 0
2016-10-21 0 0
... ... ...
2016-12-31 0 0
2017-01-01 0 0
2017-01-02 0 0
2017-01-03 0 0
2017-01-04 0 1
[80 rows x 2 columns]
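The point about avoiding row iteration can be sketched with whole-column assignment. This is a minimal sketch, assuming the intent of the original loop is to end up with foo equal to 1 in every row and bar equal to len(current) minus foo; that reading of the loop is an assumption, not something stated in the report.

```python
import pandas as pd

start = pd.to_datetime('2016-10-17 01:16:39.133000')
end = pd.to_datetime('2017-01-04 23:58:37.905000')
index = pd.period_range(start, end, freq='D')

# Scalars are broadcast along the PeriodIndex.
bins = pd.DataFrame({'foo': 0, 'bar': 0}, index=index)
current = range(10)

# Assumed intent of the loop: each row's foo becomes 1,
# and bar is derived from foo. Both are whole-column operations.
bins['foo'] = 1
bins['bar'] = len(current) - bins['foo']

print(bins.head())
```

No per-row lookups are involved, so the label type of the index does not matter here.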
Comment From: jorisvandenbossche
It is get_value that raises the error, not set_value.
I agree that you don't need to use this method in the current example, but still, it is a public, documented method that in this case fails to do what its documentation says it should. According to the docstring it takes row and column labels, and that fails:
In [155]: bins.get_value(bins.index[0], bins.columns[0])
...
KeyError: Period('2016-10-17', 'D')
Shouldn't we just fix this? Or update the documentation to discourage its usage? (or both)
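For comparison, scalar access by row and column label is also available through the .at and .loc indexers. A minimal sketch, assuming a pandas version in which these indexers accept Period labels; the value 5 is arbitrary and only for illustration.

```python
import pandas as pd

index = pd.period_range('2016-10-17', '2017-01-04', freq='D')
bins = pd.DataFrame({'foo': 0, 'bar': 0}, index=index)

first = bins.index[0]              # a Period label with freq 'D'

bins.at[first, 'foo'] = 5          # scalar write by row/column label
value = bins.at[first, 'foo']      # scalar read by the same labels
same = bins.loc[first, 'foo']      # .loc gives the same scalar

print(value, same)
```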
Comment From: devanl
Probably not the place, but could you please explain a better way to populate each of the columns based on conditional analysis of external time-stamped data being counted for each of the periods in the PeriodIndex?
Comment From: jorisvandenbossche
@devanl you would be better off asking on StackOverflow (and be sure to give a reproducible example there with a clear expected result, as this is currently not fully clear to me)
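As a rough illustration of one possible reading of the question (counting external time-stamped events per period, split by a condition), here is a sketch. The events frame, its when and ok columns, and the mapping of the condition onto foo and bar are all hypothetical and not taken from the report.

```python
import pandas as pd

index = pd.period_range('2016-10-17', '2016-10-21', freq='D')

# Hypothetical external data: one row per time-stamped event,
# with a boolean flag standing in for the "conditional analysis".
events = pd.DataFrame({
    'when': pd.to_datetime([
        '2016-10-17 01:16:39', '2016-10-17 12:00:00',
        '2016-10-18 08:30:00', '2016-10-20 23:59:59',
    ]),
    'ok': [True, False, True, True],
})

# Map each timestamp to the daily period it falls in.
events['day'] = events['when'].dt.to_period('D')

# Count events per day for each side of the condition.
foo = events[events['ok']].groupby('day').size()
bar = events[~events['ok']].groupby('day').size()

# Align the counts to the full PeriodIndex; days without events get 0.
bins = pd.DataFrame({
    'foo': foo.reindex(index, fill_value=0),
    'bar': bar.reindex(index, fill_value=0),
})

print(bins)
```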
Comment From: jreback
These are effectively internal methods and should actually be deprecated. I thought we did this quite a while back.