This tutorial introduces the basic data structure of trimes. It covers basic concepts like (re-)sampling, interpolation and slicing. Other tutorials cover more advanced applications.
1 Data Structure
First we create a time series with two curves between 0 and 10 seconds. The time samples are randomly varied (sample time is not constant). This could be for example results of simulations with an adaptive (variable) step solver. Note that there is a separate tutorial where more convenient ways to create time series signals are shown, but we want to illustrate the illustrate the data structure here.
import syssys.path.append(r"..\..\src") # local path to trimes (usually not required)import trimesimport numpy as npimport pandas as pdfrom matplotlib import pyplot as pltaverage_sample_time =1e-3time = np.arange(0, 10, average_sample_time)time = time + (np.random.rand(len(time)) -0.5) *1e-5# Make sure that the first time step is zero and the last equals to 10. This is not a generic requirement of trimes, but necessary for this tutorial.time[0] =0time[-1] =10val_a = np.sin(time) + np.sin(time *0.2)val_b = np.cos(time) + np.cos(time *0.7)plt.plot(time, val_a, label="a")plt.plot(time, val_b, label="b")plt.legend(loc="upper center")plt.grid()print("Time samples:")print(time)
Time samples:
[0.00000000e+00 9.99402542e-04 2.00497001e-03 ... 9.99699673e+00
9.99800117e+00 1.00000000e+01]
Let’s create a pandas DataFrame. trimes requires the index to be the time variable. The index can be set at instantiation of the DataFrame or later using set_index.
# Set index at in instantiationdata = {"a": val_a, "b": val_b}df = pd.DataFrame(data, index=time)df.index.name ="time"print("Set index at initialization:")print(df.head())# Set index after instantiationdata = {"time": time, "a": val_a, "b": val_b}df = pd.DataFrame(data)df.set_index("time", inplace=True)print("\nSet index after initialization:")print(df.head())
Set index at initialization:
a b
time
0.000000 0.000000 2.000000
0.000999 0.001199 1.999999
0.002005 0.002406 1.999997
0.003005 0.003606 1.999993
0.003999 0.004799 1.999988
Set index after initialization:
a b
time
0.000000 0.000000 2.000000
0.000999 0.001199 1.999999
0.002005 0.002406 1.999997
0.003005 0.003606 1.999993
0.003999 0.004799 1.999988
2 Getting Data from Transient Time Series
2.1 Get Samples
You can get samples from a DataFrame using the index with the loc method. But the input must be exact. Hence, 0.0 is accepted and returns a pandas Series object because there is a sample at exactly that time. However, 0.55 will throw a key error.
df.loc[0.0] # -> ok, because `0.0` is in df.index# df.loc[0.55] -> KeyError
a 0.0
b 2.0
Name: 0.0, dtype: float64
The get_sample method of trimes returns a pandas Series object with the next sample after the queried time:
trimes.get_sample(df, 0.0) # -> same as df.loc[0.0]trimes.get_sample(df, 0.55)
a 0.633515
b 1.778538
Name: 0.5509987426757798, dtype: float64
get_sample and get_sample_shifted accept multiple samples and then return a DataFrame. get_sample_shifted shifts the returned samples. For example the input -1 returns the samples before the sample time:
trimes.get_sample_shifted(df, [0.55, 2], -1)
a
b
time
0.549997
0.632463
1.779325
1.999001
1.298947
-0.244583
You can also query samples around a point in time:
trimes.get_samples_around(df, 0.55, -1, 2)
a
b
time
0.549997
0.632463
1.779325
0.550999
0.633515
1.778538
0.552004
0.634571
1.777746
This returns the samples from -1 to 2 (relative to first sample after 0.55).
These methods work with DataFrames as well as with Series:
trimes.get_samples_around(df["a"], 0.55, -1, 2)
time
0.549997 0.632463
0.550999 0.633515
0.552004 0.634571
Name: a, dtype: float64
Whereas get_sample returns the values, get_index returns their index (index of first sample after point in time):
index = trimes.get_index(df, 0.55)# Then iloc can be useddf.iloc[index]
a 0.633515
b 1.778538
Name: 0.5509987426757798, dtype: float64
2.2 Interpolation
interp_df returns a DataFrame with interpolated (linear) values:
trimes.interp_df(df, [0.5, 3])
a
b
0.5
0.579259
1.816955
3.0
0.705762
-1.494839
interp_df returns a DataFrame even if there is only one sample. You can use squeeze to get a Series object:
trimes.interp_df(df, [0.5]).squeeze(axis=0)
a 0.579259
b 1.816955
Name: 0.5, dtype: float64
interp_series does the same with a Series as input:
trimes.interp_series(df["b"], [0.5, 0.7])
0.5 1.816955
0.7 1.647175
dtype: float64
2.3 Slicing
The loc method of pandas works with slices even if the input time does not fit with the samples (9.5 is not in time):
df.loc[9.5:10]
a
b
time
9.500002
0.871146
-0.063698
9.501002
0.870085
-0.063874
9.501996
0.869029
-0.064048
9.503005
0.867959
-0.064224
9.504004
0.866898
-0.064398
...
...
...
9.994999
0.369895
-0.085584
9.996001
0.368969
-0.085502
9.996997
0.368049
-0.085420
9.998001
0.367121
-0.085336
10.000000
0.365276
-0.085169
500 rows × 2 columns
The method get_between of trimes works similar and is more performant. Note that one difference between loc and get_between is that get_between returns samples before the last time sample (in this case before 10).
54.5 μs ± 3.14 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
34.8 μs ± 2.36 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
trimes is more performant because it assumes that the index (time) is monotonously increasing and uses numpy’s searchsorted function under the hood.
The function ‘get_between_and_around’ includes the samples before and after the time range (or at the exact points in time) by default. You can use the keyword arguments num_samples_start and num_samples_end to include or exclude further samples before and after the time range.
trimes.get_between_and_around(df, 5, 6)
a
b
time
4.999005
-0.117843
-0.653993
5.000001
-0.117453
-0.652793
5.000995
-0.117063
-0.651596
5.001999
-0.116668
-0.650387
5.003004
-0.116272
-0.649175
...
...
...
5.996997
0.649523
0.467235
5.997996
0.650555
0.468126
5.999001
0.651592
0.469020
5.999999
0.652623
0.469909
6.000999
0.653655
0.470798
1003 rows × 2 columns
trimes.get_between_and_around(df, 5, 6, num_samples_start=-2, num_samples_end=2) # include two samples at beginning and end
a
b
time
4.998004
-0.118233
-0.655199
4.999005
-0.117843
-0.653993
5.000001
-0.117453
-0.652793
5.000995
-0.117063
-0.651596
5.001999
-0.116668
-0.650387
...
...
...
5.997996
0.650555
0.468126
5.999001
0.651592
0.469020
5.999999
0.652623
0.469909
6.000999
0.653655
0.470798
6.001997
0.654687
0.471685
1005 rows × 2 columns
get_between_interp interpolates at tstart and tend:
trimes.get_between_interp(df, 5.02, 6.02)
a
b
5.020000
-0.109434
-0.628671
5.020002
-0.109433
-0.628668
5.020996
-0.109025
-0.627469
5.022003
-0.108611
-0.626251
5.023000
-0.108199
-0.625047
...
...
...
6.015999
0.669175
0.484048
6.017004
0.670217
0.484930
6.018002
0.671251
0.485805
6.018995
0.672282
0.486676
6.020000
0.673324
0.487555
1002 rows × 2 columns
2.4 Delta Between Samples
get_delta returns the difference between two samples (under the hood get_sample is used, so the the next sample after the given time is used).
trimes.get_delta(df, 0, 0.5)
a 0.580338
b -0.183767
dtype: float64
One can also query the delta between interpolated (linear) values: