Resampling a time series with pandas resample().ohlc() and loading data with Pandas DataReader should produce DataFrames with the same column names, so that candlestick columns are named consistently.
Currently the two are not consistent.
In [1]: import pandas_datareader as pdr
In [2]: df = pdr.DataReader("IBM", "google")
In [2]: df
Out[2]:
Open High Low Close Volume
Date
2010-01-04 131.18 132.97 130.85 132.45 6155846
2010-01-05 131.68 131.85 130.10 130.85 6842471
2010-01-06 130.68 131.49 129.81 130.00 5605290
2010-01-07 129.87 130.25 128.91 129.55 5840569
2010-01-08 129.07 130.92 129.05 130.85 4197105
... ... ... ... ... ...
2016-11-02 152.48 153.34 151.67 151.95 3074438
2016-11-03 152.51 153.74 151.80 152.37 2878803
2016-11-04 152.40 153.64 151.87 152.43 2470368
2016-11-07 153.99 156.11 153.84 155.72 3706214
2016-11-08 154.56 NaN NaN 155.17 3921944
Pandas DataReader uses Open, High, Low, Close as column names for candlesticks.
In [3]: price = df["Close"]
In [4]: price.resample("W").ohlc()
Out[4]:
open high low close
Date
2010-01-10 132.449997 132.449997 129.550003 130.850006
2010-01-17 129.479996 132.309998 129.479996 131.779999
2010-01-24 134.139999 134.139999 125.500000 125.500000
2010-01-31 126.120003 126.330002 122.389999 122.389999
2010-02-07 124.669998 125.660004 123.000000 123.519997
... ... ... ... ...
2016-10-16 157.020004 157.020004 153.720001 154.449997
2016-10-23 154.770004 154.770004 149.630005 149.630005
2016-10-30 150.570007 153.350006 150.570007 152.610001
2016-11-06 153.690002 153.690002 151.949997 152.429993
2016-11-13 155.720001 155.720001 155.720001 155.720001
[358 rows x 4 columns]
Pandas uses open, high, low, close as column names for candlesticks.
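The lower-case names can be reproduced without any network access. The following is a minimal sketch that assumes a synthetic series of made-up closing prices (not real IBM quotes); only the resulting column names matter here.

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices on business days (made-up values,
# used only to show which column names resample().ohlc() produces).
idx = pd.date_range("2010-01-04", periods=10, freq="B")
close = pd.Series(np.linspace(130.0, 139.0, len(idx)), index=idx, name="Close")

# Weekly candlesticks built from the daily closes.
weekly = close.resample("W").ohlc()

weekly_columns = list(weekly.columns)
print(weekly_columns)  # ['open', 'high', 'low', 'close']
```

The input series is named Close in title case, yet the resampled frame comes back with lower-case column names.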
We should change this either on the pandas side or on the pandas-datareader side.
PS: on pandas_datareader side https://github.com/pydata/pandas-datareader/issues/260
Comment From: jreback
Well, pandas has had these names for quite some time, and all other names are already in lower case. It's pretty easy to simply rename the columns if needed.
Comment From: femtotrader
Will use
df.columns = [col.lower() for col in df.columns]
or
df.columns = [col.title() for col in df.columns]
but following the same convention would be easier if we want to use TA-Lib (for example): https://github.com/mrjbq7/ta-lib/issues/66
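As a sketch of that workaround, the renaming could be wrapped in a small helper. The name with_column_case and the sample values are hypothetical, introduced only for illustration; DataFrame.rename with a callable is the only pandas API it relies on.

```python
import pandas as pd


def with_column_case(df, style="lower"):
    """Return a copy of df with renamed columns (hypothetical helper).

    style="lower" gives open/high/low/close,
    style="title" gives Open/High/Low/Close.
    """
    if style == "lower":
        return df.rename(columns=str.lower)
    if style == "title":
        return df.rename(columns=str.title)
    raise ValueError("style must be 'lower' or 'title'")


# Made-up frame shaped like the DataReader output (title-case columns).
daily = pd.DataFrame(
    {"Open": [131.18], "High": [132.97], "Low": [130.85],
     "Close": [132.45], "Volume": [6155846]}
)
lowered = with_column_case(daily, "lower")

# Made-up frame shaped like the resample().ohlc() output (lower-case columns).
weekly = pd.DataFrame(
    {"open": [132.45], "high": [132.45], "low": [129.55], "close": [130.85]}
)
titled = with_column_case(weekly, "title")
```

Unlike assigning to df.columns, rename returns a new DataFrame and leaves the original untouched, which may be preferable when the raw frame is reused elsewhere.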
Comment From: jreback
@femtotrader TA-Lib is downstream of pandas, so its naming convention has no bearing here.