As far as I can see, this is the intended way to append one Series to another:
import pandas
s = pandas.Series([0, 1, 2])
new_s = s.append(s)
Problem description
I find it confusing that the append call does not modify the Series instance s in place, the way append does for Python lists.
Also, what is the intended way for extending a Series, if the data, like some kind of measurements, is accumulated over time?
Comment From: jorisvandenbossche
Almost all methods in pandas return a new object, instead of mutating in place, and this is not something we are going to change
(there are a few where inplace=True, and those typically stem from being consistent with list/array, e.g. Series.sort, but we are rather deprecating those).
> Also, what is the intended way for extending a Series, if the data, like some kind of measurements, is accumulated over time?
You can use append for that, as you did in the example. More general is concat (you can concat multiple parts at once). Typically you first gather all pieces in a list, and only then concatenate them at once using concat.
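A minimal sketch of that gather-then-concat pattern. The batch values and the names batches and pieces are made up for illustration, and ignore_index=True is just one choice, used here to get a fresh 0..n-1 index on the result.

```python
import pandas as pd

# Hypothetical batches of measurements arriving over time
batches = [[0.5, 1.5], [2.5], [3.5, 4.5, 5.5]]

pieces = []
for batch in batches:
    # list.append mutates the list in place; this is the cheap step
    pieces.append(pd.Series(batch))

# A single concatenation at the end returns a new Series
s = pd.concat(pieces, ignore_index=True)
```

Collecting the pieces in a plain list and concatenating once avoids copying the accumulated data on every step.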
Comment From: egorf
Hi and thanks for the fast answer,
What would be the correct approach if I want to accumulate some data over time and, for example, know the standard deviation at each step?
I can't seem to find a way to calculate the std iteratively, as I need to recreate the Series every time new data comes along.
Is this something pandas is not designed for?
Comment From: chris-b1
@egorf - if the data is reasonably sized and you're not super performance bound, you can just keep recreating the Series. Each recreation will take a copy of the underlying data, which, as long as your number of iterations is reasonable, is pretty fast.
If that's an issue, the other option is to create a large, empty array / Series with a single allocation, then place data into it. More complex, but something like this:
import numpy as np
import pandas as pd

# Preallocate once; NaN marks slots that have not been filled yet
data = np.empty(100000)
data[:] = np.nan
s = pd.Series(data)
loc = 0
for new_data in source:  # your iterative data source
    n = len(new_data)
    s.iloc[loc:n + loc] = new_data
    loc += n
    # iterative std
    new_std = s.iloc[:loc].std()
Comment From: egorf
Thanks for the tip, but I decided to go with the online algorithm instead and use pandas for data storage and manipulation, rather than accumulation.
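For anyone landing here later, this is a rough sketch of one such online approach. The thread does not say which algorithm was used; Welford's update is assumed here as a common choice, and the RunningStd class and the sample values are hypothetical. It divides by n - 1 to match the default ddof=1 of Series.std.

```python
import math


class RunningStd:
    """Running sample standard deviation (ddof=1) using Welford's update."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Fold one new value into the running mean and squared deviations
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Sample std is undefined for fewer than two values
        if self.n < 2:
            return float("nan")
        return math.sqrt(self.m2 / (self.n - 1))


rs = RunningStd()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    rs.update(value)
    current_std = rs.std()  # available after every step, no Series rebuild
```

Each update is constant time and needs no stored history, so the raw values can still be collected separately and turned into a Series once at the end.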