Getting Started

This tutorial introduces the basic data structure of trimes. It covers basic concepts like (re-)sampling, interpolation and slicing. Other tutorials cover more advanced applications.

1 Data Structure

First, we create a time series with two curves between 0 and 10 seconds. The sample times are randomly jittered, so the sample time is not constant. Such data could come, for example, from a simulation with an adaptive (variable) step solver. Note that a separate tutorial shows more convenient ways to create time series signals, but here we want to illustrate the data structure.

import sys

sys.path.append(r"..\..\src")  # local path to trimes (usually not required)
import trimes
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt


average_sample_time = 1e-3
time = np.arange(0, 10, average_sample_time)
time = time + (np.random.rand(len(time)) - 0.5) * 1e-5
# Make sure that the first time sample is exactly 0 and the last is exactly 10. This is not a general requirement of trimes, but it is necessary for this tutorial.
time[0] = 0
time[-1] = 10

val_a = np.sin(time) + np.sin(time * 0.2)
val_b = np.cos(time) + np.cos(time * 0.7)

plt.plot(time, val_a, label="a")
plt.plot(time, val_b, label="b")
plt.legend(loc="upper center")
plt.grid()

print("Time samples:")
print(time)

Time samples:
[0.00000000e+00 1.00163879e-03 2.00496899e-03 ... 9.99700403e+00
 9.99800440e+00 1.00000000e+01]
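
Before building on such a time vector, it can help to confirm that the jitter did not reorder the samples. The following is a minimal sketch that uses only NumPy and is not part of trimes; the fixed seed and the names `steps` and `is_increasing` are assumptions made for illustration.

```python
import numpy as np

# Rebuild a jittered time vector like the one above (fixed seed for repeatability).
rng = np.random.default_rng(0)
average_sample_time = 1e-3
time = np.arange(0, 10, average_sample_time)
time = time + (rng.random(len(time)) - 0.5) * 1e-5
time[0] = 0
time[-1] = 10

# Differences between consecutive time samples.
steps = np.diff(time)

# A valid time axis is strictly increasing, i.e. every step is positive.
is_increasing = bool(np.all(steps > 0))
print("strictly increasing:", is_increasing)
print("smallest step:", steps.min())
print("largest step:", steps.max())
```

The largest step is roughly twice the average sample time, because the last sample is moved from about 9.999 to exactly 10.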

Let’s create a pandas DataFrame. trimes requires the index to be the time variable. The index can be set at instantiation of the DataFrame or later using set_index.

# Set index at instantiation
data = {"a": val_a, "b": val_b}
df = pd.DataFrame(data, index=time)
df.index.name = "time"
print("Set index at initialization:")
print(df.head())

# Set index after instantiation
data = {"time": time, "a": val_a, "b": val_b}
df = pd.DataFrame(data)
df.set_index("time", inplace=True)
print("\nSet index after initialization:")
print(df.head())

Set index at initialization:
                 a         b
time                        
0.000000  0.000000  2.000000
0.001002  0.001202  1.999999
0.002005  0.002406  1.999997
0.002997  0.003597  1.999993
0.003998  0.004798  1.999988

Set index after initialization:
                 a         b
time                        
0.000000  0.000000  2.000000
0.001002  0.001202  1.999999
0.002005  0.002406  1.999997
0.002997  0.003597  1.999993
0.003998  0.004798  1.999988
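
Since the slicing section of this tutorial notes that trimes assumes a monotonically increasing time index, it may be worth checking a DataFrame before passing it on. The sketch below uses only pandas attributes (`is_monotonic_increasing`, `is_unique`) and `sort_index`; the small example data and the names `index_ok`, `shuffled` and `repaired` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

time = np.array([0.0, 0.0011, 0.0019, 0.0030])
df = pd.DataFrame(
    {"a": np.sin(time), "b": np.cos(time)},
    index=pd.Index(time, name="time"),
)

# The time index should be sorted and free of duplicates.
index_ok = df.index.is_monotonic_increasing and df.index.is_unique
print("valid time index:", index_ok)

# A shuffled copy fails the check and can be repaired with sort_index.
shuffled = df.iloc[[2, 0, 3, 1]]
print("shuffled is sorted:", shuffled.index.is_monotonic_increasing)
repaired = shuffled.sort_index()
print("repaired is sorted:", repaired.index.is_monotonic_increasing)
```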

2 Getting Data from Transient Time Series

2.1 Get Samples

You can get samples from a DataFrame by index label with the loc accessor, but the label must match exactly. Hence, 0.0 is accepted and returns a pandas Series, because there is a sample at exactly that time, whereas 0.55 raises a KeyError.

df.loc[0.0]  # -> ok, because `0.0` is in df.index
# df.loc[0.55] -> KeyError

a    0.0
b    2.0
Name: 0.0, dtype: float64

The get_sample method of trimes returns a pandas Series with the first sample at or after the queried time:

trimes.get_sample(df, 0.0)  # -> same as df.loc[0.0]
trimes.get_sample(df, 0.55)

a    0.633513
b    1.778539
Name: 0.5509968383295476, dtype: float64
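
To make the "first sample at or after a time" lookup concrete, here is a minimal sketch based on numpy's searchsorted. It is not the trimes implementation; `next_sample` is a hypothetical helper, and the four-row DataFrame is made up for illustration.

```python
import numpy as np
import pandas as pd


def next_sample(df, t):
    """Hypothetical helper: row of the first sample at or after time t."""
    # side="left" returns the first position whose index value is >= t.
    pos = np.searchsorted(df.index.to_numpy(), t, side="left")
    # Raises IndexError if t lies after the last sample.
    return df.iloc[pos]


time = np.array([0.0, 0.4, 0.6, 1.0])
df = pd.DataFrame({"a": [0.0, 1.0, 2.0, 3.0]}, index=pd.Index(time, name="time"))

exact = next_sample(df, 0.0)   # a sample exists at exactly 0.0
later = next_sample(df, 0.55)  # no sample at 0.55, so the one at 0.6 is returned
print(exact.name, later.name)
```

The lookup relies on a sorted index; on unsorted data searchsorted returns meaningless positions without raising an error.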

get_sample and get_sample_shifted also accept multiple query times and then return a DataFrame. get_sample_shifted shifts the returned samples by a number of samples. For example, a shift of -1 returns the sample before the one that get_sample would return:

trimes.get_sample_shifted(df, [0.55, 2], -1)

	a	b
time
0.549995	0.632461	1.779327
1.999997	1.298717	-0.246175

You can also query samples around a point in time:

trimes.get_samples_around(df, 0.55, -1, 2)

	a	b
time
0.549995	0.632461	1.779327
0.550997	0.633513	1.778539
0.551999	0.634567	1.777750

This returns the samples at offsets -1 up to, but not including, 2, relative to the first sample after 0.55.

These methods work with DataFrames as well as with Series:

trimes.get_samples_around(df["a"], 0.55, -1, 2)

time
0.549995    0.632461
0.550997    0.633513
0.551999    0.634567
Name: a, dtype: float64

Whereas get_sample returns the values, get_index returns the integer position of the first sample after the given point in time:

index = trimes.get_index(df, 0.55)
# Then iloc can be used
df.iloc[index]

a    0.633513
b    1.778539
Name: 0.5509968383295476, dtype: float64
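
Once an integer position is known, a window of neighbouring samples is just an iloc slice. The sketch below finds the position with numpy's searchsorted rather than with trimes, so that it runs on its own; the example data and the names `pos` and `window` are assumptions.

```python
import numpy as np
import pandas as pd

time = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
df = pd.DataFrame({"a": 10.0 * time}, index=pd.Index(time, name="time"))

# Position of the first sample at or after t = 0.55 (the sample at 0.6).
pos = int(np.searchsorted(df.index.to_numpy(), 0.55, side="left"))

# One sample before that position up to one sample after it.
# The stop of an iloc slice is exclusive, hence pos + 2.
window = df.iloc[pos - 1 : pos + 2]
print(window)
```

Note that a negative start (for example when pos is 0) would be interpreted by iloc as counting from the end, so real code should clamp the start to 0.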

2.2 Interpolation

interp_df returns a DataFrame with interpolated (linear) values:

trimes.interp_df(df, [0.5, 3])

	a	b
0.5	0.579259	1.816955
3.0	0.705762	-1.494839
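
For readers who want to see what linear interpolation of a DataFrame amounts to, the following sketch applies numpy's interp column by column. It illustrates the technique only and makes no claim about how trimes implements interp_df; `interp_columns` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def interp_columns(df, new_times):
    """Hypothetical helper: linearly interpolate every column at new_times."""
    x = df.index.to_numpy()
    data = {col: np.interp(new_times, x, df[col].to_numpy()) for col in df.columns}
    return pd.DataFrame(data, index=new_times)


df = pd.DataFrame(
    {"a": [0.0, 2.0, 4.0], "b": [1.0, 0.0, -1.0]},
    index=[0.0, 1.0, 2.0],
)

# Halfway between samples, the result is the mean of the two neighbours.
result = interp_columns(df, [0.5, 1.5])
print(result)
```

Keep in mind that np.interp does not extrapolate: query times outside the index range return the first or last value.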

interp_df returns a DataFrame even if there is only one sample. You can use squeeze to get a Series object:

trimes.interp_df(df, [0.5]).squeeze(axis=0)

a    0.579259
b    1.816955
Name: 0.5, dtype: float64

interp_series does the same for a Series input and returns a Series:

trimes.interp_series(df["b"], [0.5, 0.7])

0.5    1.816955
0.7    1.647175
dtype: float64

2.3 Slicing

The loc accessor of pandas works with slices even if the boundary times do not coincide with samples (9.5 is not in time):

df.loc[9.5:10]

	a	b
time
9.501003	0.870084	-0.063874
9.501996	0.869029	-0.064048
9.502996	0.867968	-0.064222
9.503996	0.866906	-0.064396
9.505000	0.865841	-0.064570
...	...	...
9.994996	0.369898	-0.085584
9.996004	0.368966	-0.085502
9.997004	0.368042	-0.085419
9.998004	0.367118	-0.085336
10.000000	0.365276	-0.085169

499 rows × 2 columns

The get_between method of trimes works similarly and is faster. One difference is that loc includes the end label, whereas get_between only returns samples before the end time (in this case before 10).

trimes.get_between(df, 9.5, 10)

	a	b
time
9.501003	0.870084	-0.063874
9.501996	0.869029	-0.064048
9.502996	0.867968	-0.064222
9.503996	0.866906	-0.064396
9.505000	0.865841	-0.064570
...	...	...
9.993996	0.370823	-0.085666
9.994996	0.369898	-0.085584
9.996004	0.368966	-0.085502
9.997004	0.368042	-0.085419
9.998004	0.367118	-0.085336

498 rows × 2 columns

%timeit df.loc[9.5:10]
%timeit trimes.get_between(df, 9.5, 10)

23.1 μs ± 1.02 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
15.8 μs ± 714 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

trimes is faster because it assumes that the index (time) is monotonically increasing and uses numpy’s searchsorted function under the hood.
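
The difference between an end-inclusive label slice and a searchsorted-based slice can be sketched as follows. This is an illustration under assumptions, not the trimes source: the example data is made up, and the treatment of the start boundary (samples at or after the start time) is a guess that the tutorial does not confirm.

```python
import numpy as np
import pandas as pd

time = np.array([0.0, 0.9, 2.1, 3.0, 4.0])
df = pd.DataFrame({"a": np.arange(5.0)}, index=pd.Index(time, name="time"))

# Label-based slicing with loc includes the end label 4.0.
inclusive = df.loc[1.0:4.0]

# Binary search on the sorted index gives integer positions.
t = df.index.to_numpy()
start = np.searchsorted(t, 1.0, side="left")  # first sample >= 1.0
stop = np.searchsorted(t, 4.0, side="left")   # first sample >= 4.0, excluded below
half_open = df.iloc[start:stop]

print(list(inclusive.index))
print(list(half_open.index))
```

Each searchsorted call is a binary search, so the cost grows with the logarithm of the number of samples.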

The get_between_and_around function also includes the samples just before and just after the time range (or the samples at exactly the boundary times, if they exist).

trimes.get_between_and_around(df, 9.5, 10)

	a	b
time
9.499997	0.871153	-0.063697
9.501003	0.870084	-0.063874
9.501996	0.869029	-0.064048
9.502996	0.867968	-0.064222
9.503996	0.866906	-0.064396
...	...	...
9.994996	0.369898	-0.085584
9.996004	0.368966	-0.085502
9.997004	0.368042	-0.085419
9.998004	0.367118	-0.085336
10.000000	0.365276	-0.085169

500 rows × 2 columns
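
One possible way to widen a time range to its neighbouring samples is to search from the right for the start and from the left for the stop. The sketch below is an assumption about how such a lookup could be written, not the trimes implementation; `between_and_around` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def between_and_around(df, t_start, t_stop):
    """Hypothetical helper: rows from the last sample at or before t_start
    to the first sample at or after t_stop."""
    t = df.index.to_numpy()
    # Last sample at or before t_start (clamped to the first row).
    start = max(int(np.searchsorted(t, t_start, side="right")) - 1, 0)
    # First sample at or after t_stop; +1 because the iloc stop is exclusive.
    stop = int(np.searchsorted(t, t_stop, side="left")) + 1
    return df.iloc[start:stop]


time = np.array([0.0, 0.9, 2.1, 3.0, 4.0])
df = pd.DataFrame({"a": np.arange(5.0)}, index=pd.Index(time, name="time"))

widened = between_and_around(df, 1.0, 2.5)  # neighbours 0.9 and 3.0 are added
exact = between_and_around(df, 0.9, 3.0)    # boundaries are samples themselves
print(list(widened.index))
print(list(exact.index))
```

Such a widened range is useful when the boundary values are to be interpolated afterwards, because both neighbours of each boundary are then available.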

2.4 Delta Between Samples

get_delta returns the difference between two samples (under the hood get_sample is used, so the next sample after each given time is used).

trimes.get_delta(df, 0, 0.5)

a    0.579259
b   -0.183045
dtype: float64

One can also query the delta between interpolated (linear) values:

delta = trimes.get_delta_interp_df(df, 0, 0.5)
print(delta)
trimes.get_delta_interp_series(df["a"], 0, 0.5)

a    0.579259
b   -0.183045
dtype: float64

np.float64(0.5792589551981927)
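
With densely sampled data the two kinds of delta are nearly identical, as the outputs above show. On a coarse grid the difference becomes visible. The following sketch uses plain NumPy with made-up values and is not the trimes implementation.

```python
import numpy as np

time = np.array([0.0, 1.0, 2.0, 3.0])
values = np.array([0.0, 10.0, 20.0, 40.0])
t0, t1 = 0.5, 2.5

# Delta between linearly interpolated values at t0 and t1.
v0, v1 = np.interp([t0, t1], time, values)  # 5.0 and 30.0
delta_interp = v1 - v0

# Delta between the next samples at or after t0 and t1.
i0 = np.searchsorted(time, t0, side="left")  # position 1 (t = 1.0)
i1 = np.searchsorted(time, t1, side="left")  # position 3 (t = 3.0)
delta_samples = values[i1] - values[i0]

print(delta_interp, delta_samples)
```

Here the interpolated delta is 25 while the sample-based delta is 30, because the sample-based variant effectively measures from 1.0 to 3.0.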

get_delta_shift returns the delta between samples around a point in time, in this case between the second sample before 0.1 and the first sample after 0.1:

trimes.get_delta_shift(df, 0.1, -2, 0)

a    0.002383
b   -0.000297
dtype: float64

3 Resampling

The resample method returns the time series at new sample times, here one sample per second:

df_resampled = trimes.resample(df, np.arange(0, 10.001, 1))
print(df_resampled)
df["a"].plot(label="a original")
df["b"].plot(label="b original")
df_resampled["a"].plot(label="a resampled")
df_resampled["b"].plot(label="b resampled")
plt.legend(loc="upper center")
plt.grid()

             a         b
0.0   0.000000  2.000000
1.0   1.040140  1.305144
2.0   1.298716 -0.246180
3.0   0.705762 -1.494839
4.0  -0.039446 -1.595866
5.0  -0.117453 -0.652795
6.0   0.652624  0.469909
7.0   1.642436  0.940415
8.0   1.988932  0.630066
9.0   1.385966  0.088728
10.0  0.365276 -0.085169
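
Resampling onto a uniform grid can be sketched with numpy's interp. This assumes linear interpolation and is not taken from the trimes source; the example data is made up. The sketch also builds the grid with np.linspace, which includes the end point by construction, as an alternative to padding the stop value of np.arange.

```python
import numpy as np
import pandas as pd

# Non-uniform samples of a straight line, so the exact result is known.
time = np.array([0.0, 0.9, 2.2, 3.0])
df = pd.DataFrame({"a": 2.0 * time}, index=pd.Index(time, name="time"))

# Uniform grid from 0 to 3 with 4 points: 0, 1, 2, 3.
new_time = np.linspace(0.0, 3.0, 4)

# Interpolate every column onto the new grid.
resampled = pd.DataFrame(
    {col: np.interp(new_time, df.index.to_numpy(), df[col].to_numpy())
     for col in df.columns},
    index=pd.Index(new_time, name="time"),
)
print(resampled)
```

Because the input is a straight line, linear interpolation reproduces it exactly at the new sample times; for curved signals the result depends on how densely the original data is sampled.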