epysurv.data package

Submodules

epysurv.data.disease_loader module

epysurv.data.disease_loader.load_diseases(path)[source]

epysurv.data.filter_combination module

class epysurv.data.filter_combination.FilterCombination(disease: str, county: str, pathogen: str, data: pandas.core.frame.DataFrame)[source]

Bases: object

Representation of case records filtered by a combination of county and pathogen.

disease

The disease from which the cases suffer.

county

The county in which the cases were reported.

pathogen

The pathogen subtype.

data

The case records.

expanding_windows(min_len_in_weeks: int, split_years: epysurv.data.filter_combination.SplitYears) → epysurv.data.filter_combination.TimeseriesClassificationData[source]

Transform case records into expanding time series.

Parameters
  • min_len_in_weeks – The minimum length of each time series.

  • split_years – The years at which to split the data into train and test data.

Returns

Compound object of train and test data as generators and dataframes.
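
The idea behind expanding windows can be sketched in plain Python. This illustrates the general technique only and is not epysurv's implementation: the helper `expanding_prefixes` and the toy counts are hypothetical, and the real method works on pandas case records and a `SplitYears` object.

```python
from typing import Iterator, List, Sequence


def expanding_prefixes(series: Sequence[int], min_len: int) -> Iterator[List[int]]:
    """Yield growing prefixes of `series`, starting at length `min_len`.

    Hypothetical helper for illustration only; not part of epysurv.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    for end in range(min_len, len(series) + 1):
        yield list(series[:end])


# Toy weekly case counts (made-up numbers).
weekly_counts = [3, 0, 2, 5, 1]
windows = list(expanding_prefixes(weekly_counts, min_len=3))
```

Every window starts at the same week and ends one week later than the previous one, so `windows` holds three series of lengths 3, 4, and 5.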

class epysurv.data.filter_combination.SplitYears(start: pandas._libs.tslibs.timestamps.Timestamp, middle: pandas._libs.tslibs.timestamps.Timestamp, end: pandas._libs.tslibs.timestamps.Timestamp)[source]

Bases: object

Data structure that holds the years at which the data is split into a training set and a test set.

The span from start to middle is the training data. The span from middle to end is the test data.

classmethod from_ts_input(start, middle, end)[source]

Create instance from inputs that are passed through pd.Timestamp.
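
A minimal sketch of the split described above, using only the standard library. `SplitYearsSketch`, `from_iso`, `in_train`, and `in_test` are hypothetical names; `datetime.fromisoformat` stands in for `pd.Timestamp`, and the half-open interval boundaries are an assumption, since the documentation does not say which side `middle` belongs to.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SplitYearsSketch:
    """Hypothetical stand-in for SplitYears, using datetime instead of pd.Timestamp."""

    start: datetime
    middle: datetime
    end: datetime

    @classmethod
    def from_iso(cls, start: str, middle: str, end: str) -> "SplitYearsSketch":
        # Mirrors the idea of from_ts_input: coerce raw inputs to timestamps.
        return cls(*(datetime.fromisoformat(s) for s in (start, middle, end)))

    def in_train(self, when: datetime) -> bool:
        # Assumption: training data covers the half-open interval [start, middle).
        return self.start <= when < self.middle

    def in_test(self, when: datetime) -> bool:
        # Assumption: test data covers the half-open interval [middle, end).
        return self.middle <= when < self.end


split = SplitYearsSketch.from_iso("2010-01-01", "2015-01-01", "2017-01-01")
```

With these assumed boundaries, a record dated exactly at `middle` falls into the test set, and no record can land in both sets.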

class epysurv.data.filter_combination.TimeseriesClassificationData(train_final, test_final, train_gen, test_gen)

Bases: tuple

property test_final

Alias for field number 1

property test_gen

Alias for field number 3

property train_final

Alias for field number 0

property train_gen

Alias for field number 2
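
The "Alias for field number" entries are what Sphinx prints for `collections.namedtuple` fields, so the class presumably behaves like a named tuple. The sketch below re-creates the documented field layout to show the access patterns. The placeholder values are made up, and real code should import the class from epysurv rather than redefine it.

```python
from collections import namedtuple

# Re-creation of the documented field layout, for illustration only.
TimeseriesClassificationData = namedtuple(
    "TimeseriesClassificationData",
    ["train_final", "test_final", "train_gen", "test_gen"],
)

# Placeholder values stand in for the real dataframes and generators.
bundle = TimeseriesClassificationData("train_df", "test_df", iter([]), iter([]))

# Fields can be read by name, by index, or by unpacking.
train_final, test_final, train_gen, test_gen = bundle
```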

epysurv.data.salmonella_data module

class epysurv.data.salmonella_data.TimeseriesClassificationData(train, test, train_gen, test_gen)

Bases: tuple

property test

Alias for field number 1

property test_gen

Alias for field number 3

property train

Alias for field number 0

property train_gen

Alias for field number 2

epysurv.data.salmonella_data.salmonella()[source]

Count data from Salmonella Newport in Germany.

epysurv.data.salmonella_data.timeseries_classifaction_generator(train: pandas.core.frame.DataFrame, test: pandas.core.frame.DataFrame, offset_in_weeks: int) → Tuple[Generator, Generator][source]

Turn a time point classification problem into a time series classification problem.
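
One way to read this transformation: a time point problem labels each week on its own, while a time series problem labels the whole history up to that week. The sketch below shows that reading in plain Python. `labelled_prefixes` is a hypothetical helper, and treating the offset as the index of the first week that produces a sample is an assumption, not documented behaviour.

```python
from typing import Iterator, List, Tuple

Row = Tuple[int, bool]  # (case count, outbreak label) for one week


def labelled_prefixes(rows: List[Row], offset: int) -> Iterator[Tuple[List[int], bool]]:
    """Yield (counts up to week i, label of week i) for every i >= offset.

    Hypothetical helper; the meaning of `offset` here is an assumption.
    """
    for i in range(offset, len(rows)):
        history = rows[: i + 1]
        counts = [count for count, _ in history]
        label = history[-1][1]  # the label of the most recent week
        yield counts, label


# Made-up weekly data: four weeks, one of them flagged as an outbreak.
rows = [(2, False), (1, False), (9, True), (3, False)]
samples = list(labelled_prefixes(rows, offset=2))
```

Each sample pairs a growing history of counts with the label of its final week, which is the shape a time series classifier expects.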

epysurv.data.salmonella_data.timeseries_classifcation(train: pandas.core.frame.DataFrame, test: pandas.core.frame.DataFrame, offset_in_weeks: int) → epysurv.data.salmonella_data.TimeseriesClassificationData[source]

Convert a standard time series for use in time series classification.

epysurv.data.utils module

epysurv.data.utils.timedelta_weeks(weeks: int)[source]

Module contents

Module for handling data transformation and example data.

epysurv.data.load_diseases(path)[source]

class epysurv.data.TimeseriesClassificationData(train, test, train_gen, test_gen)

Bases: tuple

property test

Alias for field number 1

property test_gen

Alias for field number 3

property train

Alias for field number 0

property train_gen

Alias for field number 2

epysurv.data.salmonella()[source]

Count data from Salmonella Newport in Germany.

epysurv.data.timeseries_classifaction_generator(train: pandas.core.frame.DataFrame, test: pandas.core.frame.DataFrame, offset_in_weeks: int) → Tuple[Generator, Generator][source]

Turn a time point classification problem into a time series classification problem.

epysurv.data.timeseries_classifcation(train: pandas.core.frame.DataFrame, test: pandas.core.frame.DataFrame, offset_in_weeks: int) → epysurv.data.salmonella_data.TimeseriesClassificationData[source]

Convert a standard time series for use in time series classification.