edvart.report_sections.timeseries_analysis package

Submodules

edvart.report_sections.timeseries_analysis.autocorrelation module

class edvart.report_sections.timeseries_analysis.autocorrelation.Autocorrelation(verbosity: Verbosity = Verbosity.LOW, columns: List[str] | None = None)

Bases: Section

Generates autocorrelation (ACF) and partial autocorrelation function (PACF) subsection.

Parameters:
  • verbosity (Verbosity (default = Verbosity.LOW)) – Verbosity of the generated code in the exported notebook.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

add_cells(cells: List[Dict[str, Any]], df: DataFrame) -> None

Adds cells to the list of cells.

Cells can be either code cells or markdown cells.

Parameters:
  • cells (List[Dict[str, Any]]) – List of generated notebook cells which are represented as dictionaries

  • df (pd.DataFrame) – Data for which to add the cells.

property name: str

Name of the section.

Returns:

Name of the section.

Return type:

str

required_imports() -> List[str]

Returns a list of imports to be put at the top of a generated notebook.

Returns:

List of import strings to be added at the top of the generated notebook, e.g. [“import pandas as pd”, “import numpy as np”].

Return type:

List[str]

show(df: DataFrame) -> None

Generates ACF and PACF plots in the calling notebook.

Parameters:

df (pd.DataFrame) – Data based on which to generate the cell output

edvart.report_sections.timeseries_analysis.autocorrelation.plot_acf(df: DataFrame, columns: List[str] | None = None, lags: List[int] | None = None, figsize: Tuple[float, float] = (15, 6), partial: bool = False) -> None

Plot ACF or PACF.

Autocorrelation function (ACF) returns, for a given lag, correlation between the timeseries and itself shifted by this lag.

Partial autocorrelation function (PACF) returns conditional autocorrelation given all smaller lag values up to the given lag.

Parameters:
  • df (pd.DataFrame) – Data to analyze.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • lags (List[int], optional) – List of lag values to plot the ACF (or the PACF, if partial is True) for.

  • figsize (Tuple[float, float] (default = (15, 6))) – Size of figure of (partial) autocorrelation plot.

  • partial (bool (default = False)) – If True, PACF will be plotted, otherwise, ACF will be plotted.
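The ACF definition above can be illustrated with a few lines of plain Python. This is a minimal sketch of the usual sample autocorrelation estimator; the helper name sample_autocorrelation is hypothetical, and the estimator is not necessarily the one plot_acf uses internally.

```python
from statistics import fmean


def sample_autocorrelation(series, lag):
    """Correlation between `series` and itself shifted by `lag` steps."""
    n = len(series)
    mean = fmean(series)
    # Sum of squared deviations over the whole series (the lag-0 term).
    denominator = sum((x - mean) ** 2 for x in series)
    # Sum of products of deviations that are `lag` steps apart.
    numerator = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numerator / denominator


# A series that flips sign at every step: strongly negative at lag 1,
# strongly positive at lag 2.
alternating = [1.0, -1.0] * 4
acf_values = [sample_autocorrelation(alternating, lag) for lag in range(3)]
print(acf_values)
```

The lag-0 value is always 1, which is why ACF plots start at 1 and are read from lag 1 onwards.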

edvart.report_sections.timeseries_analysis.autocorrelation.plot_pacf(df: DataFrame, columns: List[str] | None = None, lags: List[int] | None = None, figsize: Tuple[float, float] = (15, 6)) -> None

Plot PACF.

Partial autocorrelation function (PACF) returns conditional autocorrelation given all smaller lag values up to the given lag.

Parameters:
  • df (pd.DataFrame) – Data to analyze.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • lags (List[int], optional) – List of lag values to plot the PACF for.

  • figsize (Tuple[float, float] (default = (15, 6))) – Size of figure of (partial) autocorrelation plot.

edvart.report_sections.timeseries_analysis.autocorrelation.show_autocorrelation(df: DataFrame, columns: List[str] | None = None, lags: List[int] | None = None, figsize: Tuple[float, float] = (15, 6)) -> None

Generate autocorrelation (ACF) and partial autocorrelation function (PACF) plots.

ACF returns, for a given lag, correlation between a given timeseries and itself shifted by this lag.

Partial autocorrelation function returns conditional autocorrelation given all smaller lag values up to the given lag.

Parameters:
  • df (pd.DataFrame) – Data to analyze.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • lags (List[int], optional) – List of lag values to plot the ACF and PACF for.

  • figsize (Tuple[float, float] (default = (15, 6))) – Size of figure of (partial) autocorrelation plot.

Raises:

ValueError – If the input data is not indexed by time in ascending order.

edvart.report_sections.timeseries_analysis.boxplots_over_time module

class edvart.report_sections.timeseries_analysis.boxplots_over_time.BoxplotsOverTime(verbosity: Verbosity = Verbosity.LOW, columns: List[str] | None = None, grouping_function: Callable[[Any], str] | None = None, grouping_function_imports: List[str] | None = None, grouping_name: str | None = None, default_nunique_max: int = 80)

Bases: Section

Generates the boxplots over time intervals section.

For each column, generates a series of boxes, each box representing distribution of values in the given column during a time interval.

Parameters:
  • verbosity (Verbosity (default = Verbosity.LOW)) – Verbosity of the generated code in the exported notebook.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • grouping_function (Callable[[Any], str], optional) – Function to group the data into intervals. Cannot pass an anonymous function, i.e. the function must be assigned an identifier. To pass a lambda, simply assign it to a variable and pass the variable. If None is passed, a default grouping will be selected (see default_nunique_max).

  • grouping_function_imports (List[str], optional) – Additional imports required for the grouping function.

  • grouping_name (str, optional) – Name of grouping, will be displayed as title of the horizontal axis.

  • default_nunique_max (int (default = 80)) – If no grouping function is passed, the most granular grouping which produces at most default_nunique_max unique values is selected from the following: Hour, Day, Week, Month, Quarter, Year, Decade. If a default grouping is selected, a corresponding name is displayed on the horizontal axis by default
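As an illustration of the grouping_function requirement, the sketch below defines two custom groupings: a regular named function and a lambda bound to a variable. Both helper names (half_year, year_month) are hypothetical. They rely only on the .year and .month attributes, which pandas timestamps share with datetime.datetime.

```python
from datetime import datetime


def half_year(timestamp) -> str:
    """Map a timestamp to a label such as '2021 H1' (hypothetical grouping)."""
    half = 1 if timestamp.month <= 6 else 2
    return f"{timestamp.year} H{half}"


# A lambda is acceptable only once it is bound to a name.
year_month = lambda timestamp: f"{timestamp.year}-{timestamp.month:02d}"

print(half_year(datetime(2021, 3, 5)))
print(year_month(datetime(2021, 11, 30)))

# Either callable could then be passed by name, for example:
#   BoxplotsOverTime(grouping_function=half_year, grouping_name="Half-year")
```

If a grouping function needs extra modules, the corresponding import strings would go into grouping_function_imports.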

add_cells(cells: List[Dict[str, Any]], df: DataFrame) -> None

Adds cells to the list of cells.

Cells can be either code cells or markdown cells.

Parameters:
  • cells (List[Dict[str, Any]]) – List of generated notebook cells which are represented as dictionaries

  • df (pd.DataFrame) – Data for which to add the cells.

property name: str

Name of the section.

Returns:

Name of the section.

Return type:

str

required_imports() -> List[str]

Returns a list of imports to be put at the top of a generated notebook.

Returns:

List of import strings to be added at the top of the generated notebook, e.g. [“import pandas as pd”, “import numpy as np”].

Return type:

List[str]

show(df: DataFrame) -> None

Generates boxplots grouped over time intervals in the calling notebook.

Parameters:

df (pd.DataFrame) – Data based on which to generate the cell output

edvart.report_sections.timeseries_analysis.boxplots_over_time.default_grouping_functions() -> Dict[str, Callable[[Timestamp], str]]

Return a dictionary of function names and functions.

Each function takes a pandas Timestamp and represents it as a coarser-grained string (in terms of time), which can be used for grouping. Available groupings are: Hour, Day, Week, Month, Quarter, Year, Decade.

Returns:

Dictionary from grouping function names to grouping functions.

Return type:

Dict[str, Callable[[pandas.Timestamp], str]]

edvart.report_sections.timeseries_analysis.boxplots_over_time.get_default_grouping_func(df: DataFrame, nunique_max: int = 80) -> Tuple[str, Callable]

Return the most granular function to group df.index into at most nunique_max intervals.

Uses grouping functions from default_grouping_functions.

Parameters:
  • df (pd.DataFrame) – Dataframe for whose index a suitable grouping function is selected.

  • nunique_max (int (default = 80)) – Maximum number of intervals to group to.

Returns:

Name of selected grouping function and the grouping function itself.

Return type:

Tuple[str, Callable]
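The selection rule described above (try groupings from most to least granular and keep the first one that yields at most nunique_max distinct labels) can be sketched as follows. The candidate list, the label formats, the pick_grouping helper and the fallback to the coarsest grouping are all assumptions made for illustration; the actual groupings come from default_grouping_functions.

```python
from datetime import datetime, timedelta

# Hypothetical stand-ins for a few of the default groupings, ordered from
# most to least granular.
CANDIDATE_GROUPINGS = [
    ("Hour", lambda ts: ts.strftime("%Y-%m-%d %H")),
    ("Day", lambda ts: ts.strftime("%Y-%m-%d")),
    ("Month", lambda ts: ts.strftime("%Y-%m")),
    ("Year", lambda ts: ts.strftime("%Y")),
]


def pick_grouping(index, nunique_max=80):
    """Return the first (most granular) grouping with at most nunique_max labels."""
    for name, func in CANDIDATE_GROUPINGS:
        if len({func(ts) for ts in index}) <= nunique_max:
            return name, func
    # Assumed fallback: use the coarsest grouping if none satisfies the limit.
    return CANDIDATE_GROUPINGS[-1]


# 200 consecutive days: too many hours and days, but only 7 distinct months.
daily_index = [datetime(2021, 1, 1) + timedelta(days=i) for i in range(200)]
name, func = pick_grouping(daily_index)
print(name)
```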

edvart.report_sections.timeseries_analysis.boxplots_over_time.show_boxplots_over_time(df: DataFrame, columns: List[str] | None = None, grouping_function: Callable[[Any], str] | None = None, grouping_name: str | None = None, default_nunique_max: int = 80, figsize: Tuple[float, float] = (20, 7), color: Any = None) -> None

Generate boxplots over time intervals.

For each column, generates a series of boxes, each box representing distribution of values in the given column during a time interval.

Parameters:
  • df (pd.DataFrame) – Data to analyze.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • grouping_function (Callable[[Any], str], optional) – Function to group the data into intervals. Cannot pass an anonymous function, i.e. the function must be assigned an identifier. To pass a lambda, simply assign it to a variable and pass the variable. If None is passed, a default grouping will be selected (see default_nunique_max).

  • grouping_name (str, optional) – Name of grouping, will be displayed as title of the horizontal axis.

  • default_nunique_max (int (default = 80)) – If no grouping function is passed, the most granular grouping which produces at most default_nunique_max unique values is selected from the following: Hour, Day, Week, Month, Quarter, Year, Decade. If a default grouping is selected, a corresponding name is displayed on the horizontal axis by default.

  • figsize (Tuple[float, float] (default = (20, 7))) – Size of boxplot series figure for each column.

  • color (Any, optional) – Color or color map compatible with matplotlib/seaborn. By default, a “rainbow” color map is used, so the color of individual boxes changes over time.

Raises:

ValueError – If the input data is not indexed by time in ascending order.

edvart.report_sections.timeseries_analysis.fourier_transform module

class edvart.report_sections.timeseries_analysis.fourier_transform.FourierTransform(sampling_rate: int, verbosity: Verbosity = Verbosity.LOW, columns: List[str] | None = None)

Bases: Section

Generates the Discrete Fourier Transform spectrum plot subsection.

Parameters:
  • sampling_rate (int) – The time series is treated as a signal sampled at this rate, i.e. frequencies in multiples of (1 / sampling_rate) will be analyzed.

  • verbosity (Verbosity (default = Verbosity.LOW)) – Verbosity of the generated code in the exported notebook.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

add_cells(cells: List[Dict[str, Any]], df: DataFrame) -> None

Adds cells to the list of cells.

Cells can be either code cells or markdown cells.

Parameters:
  • cells (List[Dict[str, Any]]) – List of generated notebook cells which are represented as dictionaries

  • df (pd.DataFrame) – Data for which to add the cells.

property name: str

Name of the section.

Returns:

Name of the section.

Return type:

str

required_imports() -> List[str]

Returns a list of imports to be put at the top of a generated notebook.

Returns:

List of import strings to be added at the top of the generated notebook, e.g. [“import pandas as pd”, “import numpy as np”].

Return type:

List[str]

show(df: DataFrame) -> None

Generates Fourier transform spectrum plot(s) in the calling notebook.

Parameters:

df (pd.DataFrame) – Data based on which to generate the cell output

edvart.report_sections.timeseries_analysis.fourier_transform.show_fourier_transform(df: DataFrame, sampling_rate: int, columns: List[str] | None = None, figsize: Tuple[float, float] = (15, 6), log: bool = False, freq_min: float | None = None, freq_max: float | None = None) -> None

Generate Discrete Fourier Transform frequency vs amplitude plot.

Parameters:
  • df (pd.DataFrame) – Data to analyze.

  • sampling_rate (int) – The time series is treated as a signal sampled at this rate, i.e. frequencies in multiples of (1 / sampling_rate) will be analyzed.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • figsize (Tuple[float, float] (default = (15, 6))) – Size of frequency-amplitude plot.

  • log (bool (default = False)) – Whether to plot the amplitude on a logarithmic scale (in decibels).

  • freq_min (float, optional) – Lowest frequency to show in the plot. All computed frequencies are shown by default.

  • freq_max (float, optional) – Highest frequency to show in the plot. All computed frequencies are shown by default.
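To make the role of sampling_rate concrete, here is a minimal pure-Python sketch of a discrete Fourier transform whose frequency axis is scaled by the sampling rate. It assumes that sampling_rate means "samples per unit of interest" (for example 24 for hourly data when daily cycles are of interest). The helper dft_amplitudes is hypothetical and is not the library's implementation.

```python
import cmath
import math


def dft_amplitudes(samples, sampling_rate):
    """Return (frequency, amplitude) pairs for the non-negative frequencies."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)
        )
        # Bin k corresponds to k * sampling_rate / n cycles per unit of time.
        spectrum.append((k * sampling_rate / n, abs(coefficient)))
    return spectrum


# Four days of hourly samples containing exactly one cycle per day.
hourly = [math.sin(2 * math.pi * t / 24) for t in range(96)]
spectrum = dft_amplitudes(hourly, sampling_rate=24)
peak_frequency, peak_amplitude = max(spectrum, key=lambda pair: pair[1])
print(peak_frequency)
```

With sampling_rate=24 the peak lands at a frequency of one cycle per day, which is the kind of reading the frequency-amplitude plot is meant to support.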

Raises:

ValueError – If the input data is not indexed by time in ascending order.

edvart.report_sections.timeseries_analysis.rolling_statistics module

class edvart.report_sections.timeseries_analysis.rolling_statistics.RollingStatistics(verbosity: Verbosity = Verbosity.LOW, columns: List[str] | None = None, window_size: int = 20)

Bases: Section

Generates the rolling statistics interactive plot subsection.

Parameters:
  • verbosity (Verbosity (default = Verbosity.LOW)) – Verbosity of the generated code in the exported notebook.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • window_size (int (default = 20)) – Size of the rolling window to use when computing rolling statistics.

add_cells(cells: List[Dict[str, Any]], df: DataFrame) -> None

Adds cells to the list of cells.

Cells can be either code cells or markdown cells.

Parameters:
  • cells (List[Dict[str, Any]]) – List of generated notebook cells which are represented as dictionaries

  • df (pd.DataFrame) – Data for which to add the cells.

property name: str

Name of the section.

Returns:

Name of the section.

Return type:

str

required_imports() -> List[str]

Returns a list of imports to be put at the top of a generated notebook.

Returns:

List of import strings to be added at the top of the generated notebook, e.g. [“import pandas as pd”, “import numpy as np”].

Return type:

List[str]

show(df: DataFrame) -> None

Generates rolling statistics interactive plot(s) in the calling notebook.

Parameters:

df (pd.DataFrame) – Data based on which to generate the cell output

edvart.report_sections.timeseries_analysis.rolling_statistics.show_rolling_statistics(df: DataFrame, columns: List[str] | None = None, window_size: int = 20, show_bands: bool = True, band_width: float = 1.0, show_std_dev: bool = True, color_mean: str = '#2040FF', color_band: str = '#90E0FF', color_std: str = '#CD5C5C') -> None

Display rolling statistics interactive plot.

Displays a separate plot for each column of df.

Parameters:
  • df (pd.DataFrame) – Data to analyze.

  • columns (List[str], optional) – Columns to analyze. Only numeric columns can be analyzed. All numeric columns are used by default.

  • window_size (int (default = 20)) – Size of the rolling window to use when computing rolling statistics.

  • show_bands (bool (default = True)) – Whether to show lines delimiting the range [rolling_mean - band_width * rolling_std, rolling_mean + band_width * rolling_std]

  • band_width (float (default = 1.)) – Multiple of standard deviation from mean to show bands at. Ignored if not showing bands.

  • show_std_dev (bool (default = True)) – Whether to plot rolling standard deviation.

  • color_mean (str (default = "#2040FF")) – Color of the line showing rolling mean.

  • color_band (str (default = "#90E0FF")) – Color of the lines showing bands around rolling mean. Ignored if not showing bands.

  • color_std (str (default = "#CD5C5C")) – Color of the line showing standard deviation. Ignored if not showing standard deviation.
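The band range given for show_bands can be worked through on a small example. The sketch below computes the rolling mean and the band limits for every full window. The helper rolling_bands is hypothetical, and the use of the sample standard deviation is an assumption that may differ from what the library computes.

```python
from statistics import fmean, stdev


def rolling_bands(values, window_size, band_width=1.0):
    """Return (mean, lower, upper) for every full window of `values`."""
    result = []
    for end in range(window_size, len(values) + 1):
        window = values[end - window_size:end]
        mean = fmean(window)
        # Sample standard deviation; the library may use a different estimator.
        std = stdev(window)
        result.append((mean, mean - band_width * std, mean + band_width * std))
    return result


bands = rolling_bands([1.0, 2.0, 3.0, 4.0, 5.0], window_size=3, band_width=2.0)
print(bands[0])
```

For the first window [1, 2, 3] the mean is 2 and the standard deviation is 1, so with band_width=2.0 the band spans 0 to 4.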

Raises:

ValueError – If the input data is not indexed by time in ascending order.

edvart.report_sections.timeseries_analysis.seasonal_decomposition module

class edvart.report_sections.timeseries_analysis.seasonal_decomposition.SeasonalDecomposition(verbosity: Verbosity = Verbosity.LOW, columns: List[str] | None = None, period: int | None = None, model: str = 'additive')

Bases: Section

Generates seasonal decomposition subsection.

Each timeseries represented by one column is decomposed into trend, seasonal and residual (noise) components. This is a primitive decomposition. The trend is first estimated by applying a convolution filter to the data. The trend is then removed from the series, and the average of the detrended series for each period is returned as the seasonal component.

Parameters:
  • verbosity (Verbosity (default = Verbosity.LOW)) – Verbosity of the generated code in the exported notebook.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • period (int, optional) – Period to use when modelling seasonal component. If None, period is inferred from frequency of df.index, provided pd.infer_freq is able to infer the frequency. Otherwise, this parameter has to be set manually.

  • model (str (default = "additive")) – Can be either “multiplicative” or “additive”. If “additive”, the series is modelled as series = trend + seasonal + noise. If “multiplicative”, the series is modelled as series = trend * seasonal * noise.

add_cells(cells: List[Dict[str, Any]], df: DataFrame) -> None

Adds cells to the list of cells.

Cells can be either code cells or markdown cells.

Parameters:
  • cells (List[Dict[str, Any]]) – List of generated notebook cells which are represented as dictionaries

  • df (pd.DataFrame) – Data for which to add the cells.

property name: str

Name of the section.

Returns:

Name of the section.

Return type:

str

required_imports() -> List[str]

Returns a list of imports to be put at the top of a generated notebook.

Returns:

List of import strings to be added at the top of the generated notebook, e.g. [“import pandas as pd”, “import numpy as np”].

Return type:

List[str]

show(df: DataFrame) -> None

Generates seasonal decomposition plot(s) in the calling notebook.

Parameters:

df (pd.DataFrame) – Data based on which to generate the cell output

edvart.report_sections.timeseries_analysis.seasonal_decomposition.show_seasonal_decomposition(df: DataFrame, columns: List[str] | None = None, period: int | None = None, model: str = 'additive', figsize: Tuple[float, float] = (20, 10)) -> None

Generate the seasonal decomposition plot.

Parameters:
  • df (pd.DataFrame) – Data to analyze.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • period (int, optional) – Period to use when modelling seasonal component. If None, period is inferred from frequency of df.index, provided pd.infer_freq is able to infer the frequency. Otherwise, this parameter has to be set manually.

  • model (str (default = "additive")) – Can be either “multiplicative” or “additive”. If “additive”, the series is modelled as series = trend + seasonal + noise. If “multiplicative”, the series is modelled as series = trend * seasonal * noise.

  • figsize (Tuple[float, float] (default = (20, 10))) – Size of the whole figure for one column (i.e. includes plots of all components).
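The additive model series = trend + seasonal + noise can be illustrated with a classical moving-average decomposition. The sketch below is an assumption about the general approach, not the library's code: the helper naive_additive_decomposition is hypothetical and, to stay short, only supports odd periods.

```python
def naive_additive_decomposition(series, period):
    """Split `series` into trend, seasonal and residual parts (odd `period` only)."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    n, half = len(series), period // 2
    # Trend: centred moving average; undefined near both ends.
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = sum(series[t - half:t + half + 1]) / period
    # Seasonal: average detrended value at each position within the period.
    phase_means = []
    for phase in range(period):
        detrended = [
            series[t] - trend[t] for t in range(phase, n, period) if trend[t] is not None
        ]
        phase_means.append(sum(detrended) / len(detrended))
    offset = sum(phase_means) / period  # centre the seasonal component on zero
    seasonal = [phase_means[t % period] - offset for t in range(n)]
    # Residual: whatever the trend and seasonal parts do not explain.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual


# Linear trend plus a repeating pattern of period 3 and no noise.
pattern = [1.0, 0.0, -1.0]
series = [t + pattern[t % 3] for t in range(12)]
trend, seasonal, residual = naive_additive_decomposition(series, period=3)
print(seasonal[:3])
```

On this noise-free input the recovered seasonal component equals the pattern and the residual is zero wherever the trend is defined. A multiplicative variant would divide by the trend and seasonal parts instead of subtracting them.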

Raises:

ValueError – If the input data is not indexed by time in ascending order.

edvart.report_sections.timeseries_analysis.short_time_ft module

class edvart.report_sections.timeseries_analysis.short_time_ft.ShortTimeFT(sampling_rate: int, window_size: int, verbosity: Verbosity = Verbosity.LOW, columns: List[str] | None = None)

Bases: Section

Generates Short-time discrete Fourier transform spectrogram plot subsection.

Parameters:
  • sampling_rate (int) – The time series is treated as a signal sampled at this rate, i.e. frequencies in multiples of (1 / sampling_rate) will be analyzed.

  • window_size (int) – Size of window to perform DFT on to obtain Short-time Fourier transform.

  • verbosity (Verbosity (default = Verbosity.LOW)) – Verbosity of the generated code in the exported notebook.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

add_cells(cells: List[Dict[str, Any]], df: DataFrame) -> None

Adds cells to the list of cells.

Cells can be either code cells or markdown cells.

Parameters:
  • cells (List[Dict[str, Any]]) – List of generated notebook cells which are represented as dictionaries

  • df (pd.DataFrame) – Data for which to add the cells.

property name: str

Name of the section.

Returns:

Name of the section.

Return type:

str

required_imports() -> List[str]

Returns a list of imports to be put at the top of a generated notebook.

Returns:

List of import strings to be added at the top of the generated notebook, e.g. [“import pandas as pd”, “import numpy as np”].

Return type:

List[str]

show(df: DataFrame) -> None

Generates Short-time Fourier transform spectrogram in the calling notebook.

Parameters:

df (pd.DataFrame) – Data based on which to generate the cell output

edvart.report_sections.timeseries_analysis.short_time_ft.show_short_time_ft(df: DataFrame, sampling_rate: int, window_size: int, columns: List[str] | None = None, overlap: int | None = None, log: bool = True, window: str | Tuple | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = 'hamming', scaling: str = 'spectrum', figsize: Tuple[float, float] = (20, 7), colormap: Any = 'viridis', freq_min: float | None = None, freq_max: float | None = None) -> None

Generates Short-time discrete Fourier transform spectrogram plot.

Parameters:
  • df (pd.DataFrame) – Data to analyze.

  • sampling_rate (int) – The time series is treated as a signal sampled at this rate, i.e. frequencies in multiples of (1 / sampling_rate) will be analyzed.

  • window_size (int) – Size of window to perform DFT on to obtain Short-time Fourier transform.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • overlap (int, optional) – Number of samples by which adjacent windows overlap. Defaults to window_size // 8.

  • log (bool (default = True)) – Whether to color the plot by log-scale amplitude (in decibels) rather than by linear-scale amplitude.

  • window (str, tuple or array-like (default = "hamming")) – Type of weighting of individual samples in a window. If a string or tuple, it is passed to scipy.signal.get_window. If array-like, each element is the weight of the corresponding sample within the window.

  • scaling (str (default = "spectrum")) – Selects between computing the power spectral density (“density”) with units of V**2/Hz and computing the power spectrum (“spectrum”) with units of V**2, if input values are measured in V and sampling_rate is measured in Hz.

  • figsize (Tuple[float, float] (default = (20, 7))) – Size of generated spectral plot figure.

  • colormap (Any (default = "viridis")) – Any seaborn-compatible colormap.

  • freq_min (float, optional) – Lowest frequency to show in the plot. All computed frequencies are shown by default.

  • freq_max (float, optional) – Highest frequency to show in the plot. All computed frequencies are shown by default.
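How window_size and overlap together determine which samples each DFT window covers can be sketched as below. The helper window_bounds is hypothetical, and dropping an incomplete trailing window is an assumption about the behaviour rather than something stated above.

```python
def window_bounds(n_samples, window_size, overlap=None):
    """Return (start, stop) index pairs of the successive windows."""
    if overlap is None:
        overlap = window_size // 8  # documented default
    step = window_size - overlap  # how far each window advances
    return [
        (start, start + window_size)
        for start in range(0, n_samples - window_size + 1, step)
    ]


# 20 samples, windows of 8 samples, default overlap of 8 // 8 = 1 sample.
print(window_bounds(20, window_size=8))
# Half-overlapping windows over 16 samples.
print(window_bounds(16, window_size=8, overlap=4))
```

A larger overlap gives more windows and therefore a finer time resolution in the spectrogram, at the cost of more computation.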

Raises:

ValueError – If the input data is not indexed by time in ascending order.

edvart.report_sections.timeseries_analysis.stationarity_tests module

class edvart.report_sections.timeseries_analysis.stationarity_tests.StationarityTests(verbosity: Verbosity = Verbosity.LOW, columns: List[str] | None = None)

Bases: Section

Generates the stationarity tests subsection.

Parameters:
  • verbosity (Verbosity (default = Verbosity.LOW)) – Verbosity of the generated code in the exported notebook.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

add_cells(cells: List[Dict[str, Any]], df: DataFrame) -> None

Adds cells to the list of cells.

Cells can be either code cells or markdown cells.

Parameters:
  • cells (List[Dict[str, Any]]) – List of generated notebook cells which are represented as dictionaries

  • df (pd.DataFrame) – Data for which to add the cells.

property name: str

Name of the section.

Returns:

Name of the section.

Return type:

str

required_imports() -> List[str]

Returns a list of imports to be put at the top of a generated notebook.

Returns:

List of import strings to be added at the top of the generated notebook, e.g. [“import pandas as pd”, “import numpy as np”].

Return type:

List[str]

show(df: DataFrame) -> None

Generates stationarity test results in the calling notebook.

Parameters:

df (pd.DataFrame) – Data based on which to generate the cell output

edvart.report_sections.timeseries_analysis.stationarity_tests.default_stationarity_tests() -> Dict[str, Callable[[Series], Tuple]]

Return a dictionary of stationarity test names and functions.

Stationarity tests are:

KPSS (constant)

which has a null hypothesis that a given series is stationary around a constant value

KPSS (trend)

which has a null hypothesis that a given series is stationary around a constant-slope trend, i.e. a linear function

Augmented Dickey-Fuller

which has a null hypothesis of a unit root, i.e. non-stationarity.

Returns:

A dictionary from test name to function.

Return type:

Dict[str, Callable]
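Because the KPSS tests and the Augmented Dickey-Fuller test have opposite null hypotheses, the same small p-value points in opposite directions. The sketch below makes that explicit; the interpret helper, its wording and the 0.05 threshold are hypothetical and not part of the library.

```python
def interpret(test_name, p_value, alpha=0.05):
    """Turn a p-value into a verdict, respecting each test's null hypothesis."""
    rejected = p_value < alpha
    if test_name.startswith("KPSS"):
        # Null hypothesis: the series is stationary (around a constant or a trend).
        return "evidence of non-stationarity" if rejected else "stationarity not rejected"
    # Augmented Dickey-Fuller null hypothesis: the series has a unit root.
    return "evidence of stationarity" if rejected else "unit root not rejected"


for test_name in ("KPSS (constant)", "KPSS (trend)", "Augmented Dickey-Fuller"):
    print(test_name, "->", interpret(test_name, p_value=0.01))
```

A series is most convincingly stationary when the KPSS tests fail to reject their null hypothesis and the Augmented Dickey-Fuller test rejects its own.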

edvart.report_sections.timeseries_analysis.stationarity_tests.show_stationarity_tests(df: DataFrame, columns: List[str] | None = None, kpss_const: bool = True, kpss_trend: bool = True, adfuller: bool = True) -> None

Show stationarity test results for each numeric column.

Parameters:
  • df (pd.DataFrame) – Data to test.

  • columns (List[str], optional) – List of columns to test. Only numeric columns can be used. All numeric columns are used by default.

  • kpss_const (bool (default = True)) – Whether to perform KPSS (constant) test.

  • kpss_trend (bool (default = True)) – Whether to perform KPSS (trend) test.

  • adfuller (bool (default = True)) – Whether to perform Augmented Dickey-Fuller test.

Raises:

ValueError – If the input data is not indexed by time in ascending order.

edvart.report_sections.timeseries_analysis.time_series_line_plot module

class edvart.report_sections.timeseries_analysis.time_series_line_plot.TimeSeriesLinePlot(verbosity: Verbosity = Verbosity.LOW, columns: List[str] | None = None, separate_plots: bool = False, color_col: str | None = None)

Bases: Section

Generates the time series line plot section.

Parameters:
  • verbosity (Verbosity (default = Verbosity.LOW)) – Verbosity of the code generated in the exported notebook.

  • columns (List[str], optional) – List of columns to analyze. Only numeric columns can be analyzed. All numeric columns are analyzed by default.

  • separate_plots (bool (default = False)) – Whether to plot each column in a separate plot. All columns are plotted in a single plot by default.

  • color_col (str, optional) – Which column to use for coloring of the lines. Each segment of the line will be colored according to the value in this column at the given time point. If this parameter is set, each column will be plotted in a separate plot (the separate_plots parameter is ignored).

add_cells(cells: List[Dict[str, Any]], df: DataFrame) -> None

Adds cells to the list of cells.

Cells can be either code cells or markdown cells.

Parameters:
  • cells (List[Dict[str, Any]]) – List of generated notebook cells which are represented as dictionaries

  • df (pd.DataFrame) – Data for which to add the cells.

property name: str

Name of the section.

Returns:

Name of the section.

Return type:

str

required_imports() -> List[str]

Returns a list of imports to be put at the top of a generated notebook.

Returns:

List of import strings to be added at the top of the generated notebook, e.g. [“import pandas as pd”, “import numpy as np”].

Return type:

List[str]

show(df: DataFrame) -> None

Generates time series line plot(s) in the calling notebook.

Parameters:

df (pd.DataFrame) – Data based on which to generate the cell output

edvart.report_sections.timeseries_analysis.time_series_line_plot.show_time_series_line_plot(df, columns: List[str] | None = None, separate_plots: bool = False, color_col: str | None = None) -> None

Display time series line plot.

Parameters:
  • df (pd.DataFrame) – Data to analyze.

  • columns (List[str], optional) – Columns to analyze. Only numeric columns can be analyzed. All numeric columns are used by default.

  • separate_plots (bool (default = False)) – Whether to plot each column in a separate plot. All columns are plotted in a single plot by default.

  • color_col (str, optional) – Name of column to use for coloring of the lines. Each segment of the line will be colored according to the value in this column at the given time point. If this parameter is set, each column will be plotted in a separate plot (the separate_plots parameter is ignored).

Raises:

ValueError – If the input data is not indexed by time in ascending order.

edvart.report_sections.timeseries_analysis.timeseries_analysis module

class edvart.report_sections.timeseries_analysis.timeseries_analysis.TimeseriesAnalysis(subsections: List[TimeseriesAnalysisSubsection] | None = None, verbosity: Verbosity = Verbosity.LOW, columns: List[str] | None = None, verbosity_time_series_line_plot: Verbosity | None = None, verbosity_rolling_statistics: Verbosity | None = None, verbosity_boxplots_over_time: Verbosity | None = None, verbosity_seasonal_decomposition: Verbosity | None = None, verbosity_stationarity_tests: Verbosity | None = None, verbosity_autocorrelation: Verbosity | None = None, verbosity_fourier_transform: Verbosity | None = None, verbosity_short_time_ft: Verbosity | None = None, sampling_rate: int | None = None, stft_window_size: int | None = None)

Bases: ReportSection

Generates the Timeseries analysis section of the report.

Contains an enum TimeseriesAnalysisSubsection of possible subsections.

Parameters:
  • subsections (List[TimeseriesAnalysisSubsection], optional) – List of subsections to include. All subsections in TimeseriesAnalysisSubsection are included by default, except for FourierTransform, which is only included if sampling_rate is set, and ShortTimeFT, which is only included if sampling_rate and stft_window_size are both set.

  • verbosity (Verbosity (default = Verbosity.LOW)) – Generated code verbosity global to the Timeseries analysis section. Subsection verbosities that are None fall back to this value.

  • columns (List[str], optional) – Columns to include in timeseries analysis. Each column is treated as a separate time series. All columns are used by default.

  • verbosity_time_series_line_plot (Verbosity, optional) – Time series line plot subsection code verbosity.

  • verbosity_rolling_statistics (Verbosity, optional) – Rolling statistics interactive plot subsection code verbosity.

  • verbosity_boxplots_over_time (Verbosity, optional) – Boxplots grouped over time intervals subsection code verbosity.

  • verbosity_seasonal_decomposition (Verbosity, optional) – Seasonal decomposition subsection code verbosity.

  • verbosity_stationarity_tests (Verbosity, optional) – Stationarity tests subsection code verbosity.

  • verbosity_autocorrelation (Verbosity, optional) – Autocorrelation and partial autocorrelation plot subsection code verbosity.

  • verbosity_fourier_transform (Verbosity, optional) – Discrete Fourier transform plot subsection code verbosity.

  • verbosity_short_time_ft (Verbosity, optional) – Short-time discrete Fourier transform plot subsection code verbosity.

  • sampling_rate (int, optional) – Sampling rate of the time-series, i.e., how many samples form one period. For example, if your time-series contains hourly data and you want to investigate daily frequencies, use 24. If not set, Fourier transform and Short-time Fourier transform will not be included.

  • stft_window_size (int, optional) – Window size for Short-time Fourier transform, which will not be included if this parameter is not set.

add_cells(cells: List[Dict[str, Any]], df: DataFrame) -> None

Add cells to the list of cells.

Cells can be either code cells or markdown cells.

Parameters:
  • cells (List[Dict[str, Any]]) – List of generated notebook cells which are represented as dictionaries

  • df (pd.DataFrame) – Data for which to add the cells

property name: str

Name of the section.

Returns:

Name of the section.

Return type:

str

required_imports() -> List[str]

Returns a list of imports to be put at the top of a generated notebook.

Returns:

List of import strings to be added at the top of the generated notebook, e.g. [“import pandas as pd”, “import numpy as np”]

Return type:

List[str]

show(df: DataFrame) -> None

Generates cell output of this section in the calling notebook.

Parameters:

df (pd.DataFrame) – Data based on which to generate the cell output.

class edvart.report_sections.timeseries_analysis.timeseries_analysis.TimeseriesAnalysisSubsection(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: IntEnum

Enum of all implemented timeseries analysis subsections.

Autocorrelation = 5
BoxplotsOverTime = 2
FourierTransform = 6
RollingStatistics = 1
SeasonalDecomposition = 3
ShortTimeFT = 7
StationarityTests = 4
TimeSeriesLinePlot = 0

edvart.report_sections.timeseries_analysis.timeseries_analysis.show_timeseries_analysis(df: DataFrame, subsections: List[TimeseriesAnalysisSubsection] | None = None, columns: List[str] | None = None, sampling_rate: int | None = None, stft_window_size: int | None = None) -> None

Generate timeseries analysis for df.

Parameters:
  • df (pd.DataFrame) – Data to be analyzed.

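  • subsections (List[TimeseriesAnalysisSubsection], optional) – Subsections to include in the analysis. All subsections are included by default, except the Fourier transform and Short-time Fourier transform subsections, which depend on sampling_rate and stft_window_size as described below.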

  • columns (List[str], optional) – Subset of columns of df to consider in timeseries analysis. All columns are used by default.

  • sampling_rate (int, optional) – Sampling rate of the time-series, i.e., how many samples form one period. For example, if your timeseries contains hourly data and you want to investigate daily frequencies, use 24. If not set, Fourier transform and Short-time Fourier transform will not be included.

  • stft_window_size (int, optional) – Window size for Short-time Fourier transform. Short-time Fourier transform will not be included if this parameter is not set.
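The rule for which subsections are included by default can be sketched as follows. The Subsection enum is a stand-in that mirrors the documented TimeseriesAnalysisSubsection values so that the example runs without the library. The default_subsections helper is hypothetical and only restates the rule described in the parameters above.

```python
from enum import IntEnum


class Subsection(IntEnum):
    """Stand-in mirroring the documented TimeseriesAnalysisSubsection values."""

    TimeSeriesLinePlot = 0
    RollingStatistics = 1
    BoxplotsOverTime = 2
    SeasonalDecomposition = 3
    StationarityTests = 4
    Autocorrelation = 5
    FourierTransform = 6
    ShortTimeFT = 7


def default_subsections(sampling_rate=None, stft_window_size=None):
    """Return the subsections that would be included when none are requested."""
    selected = list(Subsection)
    if sampling_rate is None:
        # The Fourier transform needs a sampling rate.
        selected.remove(Subsection.FourierTransform)
    if sampling_rate is None or stft_window_size is None:
        # The short-time Fourier transform needs both parameters.
        selected.remove(Subsection.ShortTimeFT)
    return selected


print([s.name for s in default_subsections()])
print([s.name for s in default_subsections(sampling_rate=24)])
print([s.name for s in default_subsections(sampling_rate=24, stft_window_size=16)])
```

With the real enum, an explicit selection would be passed through the subsections parameter instead of relying on these defaults.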

Raises:

ValueError – If the input data is not indexed by time in ascending order.

Module contents