epysurv.simulation package
Submodules
epysurv.simulation.naive_poisson module
epysurv.simulation.naive_poisson.get_outbreak_begins(n: int, outbreak_length: int, n_outbreaks: int) → Set[int]
epysurv.simulation.naive_poisson.simulate_outbreaks(n: int = 104, outbreak_length: int = 5, n_outbreaks: int = 3, mu: float = 1, outbreak_mu: float = 10) → pandas.core.frame.DataFrame
Simulate outbreaks based on a Poisson distribution.
- Parameters
n – Number of weeks.
outbreak_length – Length of each outbreak in weeks.
n_outbreaks – Number of outbreaks.
mu – Mean for the baseline.
outbreak_mu – Mean for the outbreaks.
- Returns
Simulated case counts per week, separated into baseline and outbreak cases.
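The idea behind this module can be illustrated with a minimal standard-library sketch. It is not the epysurv implementation: the helper names (poisson_draw, sketch_outbreaks), the uniform sampling of outbreak start weeks without an overlap check, and the returned list of tuples are assumptions made for illustration only.

```python
import math
import random


def poisson_draw(rng: random.Random, mean: float) -> int:
    """Draw one Poisson variate with Knuth's multiplication method (small means only)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sketch_outbreaks(n=104, outbreak_length=5, n_outbreaks=3,
                     mu=1.0, outbreak_mu=10.0, seed=0):
    """Return one (baseline, outbreak) pair of case counts per week."""
    rng = random.Random(seed)
    # Assumption: start weeks are sampled uniformly so that each outbreak
    # fits inside the series; overlapping outbreaks are not prevented.
    begins = rng.sample(range(n - outbreak_length + 1), n_outbreaks)
    outbreak_weeks = {b + k for b in begins for k in range(outbreak_length)}
    rows = []
    for week in range(n):
        baseline = poisson_draw(rng, mu)
        outbreak = poisson_draw(rng, outbreak_mu) if week in outbreak_weeks else 0
        rows.append((baseline, outbreak))
    return rows
```

Calling sketch_outbreaks() yields 104 weekly pairs in which at most 15 weeks carry outbreak cases; the actual function returns a DataFrame instead.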
Module contents
Module for simulating epidemiological data.
class epysurv.simulation.PointSource(alpha: float = 1.0, amplitude: float = 1.0, frequency: int = 1, p: float = 0.99, r: float = 0.01, seasonal_move: int = 0, seed: Optional[int] = None, trend: float = 0.0)
Bases: epysurv.simulation.base.BaseSimulation
Simulation of epidemics which were introduced by point sources.
The simulation combines a hidden Markov model, which generates random time points for outbreaks, with a simple seasonal model (compare epysurv.simulation.SeasonalNoise) that simulates the baseline.
- Parameters
amplitude – Amplitude of the sine. Determines the possible range of simulated seasonal cases.
alpha – Offset that shifts the curve along the y-axis. Must be non-negative and satisfy alpha >= amplitude.
frequency – Factor in the oscillation term. It is multiplied by the annual term \(\omega\) and the current time point.
p – Probability to get a new outbreak at time \(t\) if there was one at time \(t-1\).
r – Probability to get no new outbreak at time \(t\) if there was none at time \(t-1\).
seasonal_move – A term added to time point \(t\) to move the curve along the x-axis.
seed – Seed for the random number generation.
trend – Controls the influence of the current week on \(\mu\).
References
http://surveillance.r-forge.r-project.org/
simulate(length: int, state_weight: float = 0, state: Optional[Sequence[int]] = None) → pandas.core.frame.DataFrame
Simulate outbreaks.
- Parameters
length – Number of weeks to model. length is ignored if state is given. In this case, the length of state is used.
state – Use a state chain to define the status at each time point (outbreak or not). If not given, a Markov chain is generated automatically.
state_weight – Additional weight for an outbreak which influences the distribution parameter \(\mu\).
- Returns
A DataFrame of simulated case counts per week, separated into baseline and outbreak cases.
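A state chain of the kind described for the state parameter can be sketched as a two-state Markov chain. This is a hedged illustration, not the library's code: sketch_state_chain and its parameters p_start and p_continue are hypothetical names, and no claim is made here about how they map onto the constructor arguments p and r.

```python
import random


def sketch_state_chain(length, p_start=0.01, p_continue=0.5, seed=0):
    """Generate a 0/1 outbreak indicator per week from a two-state Markov chain."""
    rng = random.Random(seed)
    states = [0]  # Assumption: the chain starts without an outbreak.
    for _ in range(1, length):
        # Probability of being in the outbreak state next week,
        # conditional on the state of the current week.
        p_outbreak = p_continue if states[-1] == 1 else p_start
        states.append(1 if rng.random() < p_outbreak else 0)
    return states
```

The result is a plain list of zeros and ones with one entry per week, which matches the Sequence[int] type that state expects.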
class epysurv.simulation.SeasonalNoiseNegativeBinomial(baseline_frequency: float = 1.5, dispersion: float = 1.0, seasonality_cos: float = 0.2, seasonality_sin: float = -0.4, seasonality_length: int = 1, seed: Optional[int] = None, trend: float = 0.003)
Bases: epysurv.simulation.base.BaseSimulation
A time series simulation that generates case counts based on a negative binomial model.
The model is described by a mean \(\mu\), variance \(\phi \cdot \mu\), and a linear predictor including trend and seasonality determined by Fourier terms. \(\mu\) of the model depends on the current week and is defined as follows:
\(\mu(t) = \exp \left\{ \theta + \beta t + \sum_{j=1}^{m} \left\{ \gamma_{1} \cos (\frac{2\pi j t}{52}) + \gamma_{2} \sin (\frac{2\pi j t}{52}) \right\} \right\}\)
where \(t\) is the current week, \(m\) the seasonality length, \(\beta\) the trend parameter, \(\gamma_{1}\) and \(\gamma_{2}\) the seasonality parameters, and \(\theta\) the baseline frequency of the cases.
The simulation is then run using \(\mu\) and the dispersion parameter \(\phi\) to specify the negative binomial model we draw case counts from.
- Parameters
baseline_frequency – Baseline frequency of cases.
dispersion – Regulates the overdispersion relative to the Poisson distribution; the variance of the model is \(\phi \cdot \mu\).
seasonality_cos – Seasonality parameter to model \(\cos\) of the Fourier term.
seasonality_sin – Seasonality parameter to model \(\sin\) of the Fourier term.
seasonality_length – Number of Fourier terms used to model seasonality. 0 means no seasonality, 1 annual seasonality, 2 biannual seasonality, and so forth.
seed – A seed for the random number generation.
trend – Controls the influence of the current week on \(\mu\).
References
[1] Noufaily, A., Enki, D.G., Farrington, C.P., Garthwaite, P., Andrews, N.J., Charlett, A. (2012): An improved algorithm for outbreak detection in multiple surveillance systems. Statistics in Medicine, 32 (7), 1206-1222.
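The mean \(\mu(t)\) and the draw with variance \(\phi \cdot \mu\) described for SeasonalNoiseNegativeBinomial can be sketched with the standard library alone. The function names nb_mean and nb_draw are hypothetical, and the gamma-Poisson mixture used for sampling is one standard construction of the negative binomial, chosen here as an assumption; the library may sample differently.

```python
import math
import random


def nb_mean(t, baseline_frequency=1.5, trend=0.003, seasonality_cos=0.2,
            seasonality_sin=-0.4, seasonality_length=1):
    """Evaluate mu(t) = exp(theta + beta * t + sum of Fourier terms)."""
    seasonal = sum(
        seasonality_cos * math.cos(2 * math.pi * j * t / 52)
        + seasonality_sin * math.sin(2 * math.pi * j * t / 52)
        for j in range(1, seasonality_length + 1)
    )
    return math.exp(baseline_frequency + trend * t + seasonal)


def nb_draw(rng, mu, dispersion):
    """Draw a count with mean mu and variance dispersion * mu."""
    if dispersion > 1:
        # Gamma-Poisson mixture: shape * scale = mu and scale = dispersion - 1
        # give a mixed variance of mu + mu * scale = dispersion * mu.
        scale = dispersion - 1
        rate = rng.gammavariate(mu / scale, scale)
    else:
        # A dispersion of 1 (or less, in this sketch) is treated as plain Poisson.
        rate = mu
    # Knuth's multiplication method for a Poisson variate (small rates only).
    threshold, count, product = math.exp(-rate), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count
```

A two-year series could then be drawn with rng = random.Random(0) followed by [nb_draw(rng, nb_mean(t), 2.0) for t in range(104)].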
class epysurv.simulation.SeasonalNoisePoisson(alpha: float = 1.0, amplitude: float = 1.0, frequency: int = 1, seasonal_move: int = 0, seed: Optional[int] = None, trend: float = 0.0)
Bases: epysurv.simulation.base.BaseSimulation
Simulation of an endemic time series based on a Poisson distribution.
The mean of the Poisson distribution is modelled as:
\(\mu(t) = \exp{(A\sin{(frequency \cdot \omega \cdot (t + \phi))} + \alpha + \beta \cdot t + K \cdot state)}\)
with \(\omega = \pi / 52\), \(A\) being the amplitude, \(\beta\) the trend parameter, \(t\) the current week, \(\phi\) the seasonal move, and \(K\) the state weight applied to the outbreak indicator \(state\).
- Parameters
amplitude – Amplitude of the sine. Determines the range of simulated cases.
alpha – Offset that shifts the simulation along the y-axis. Must be non-negative and satisfy alpha >= amplitude.
frequency – Factor in the oscillation term. It is multiplied by the annual term \(\omega\) and the current time point.
seasonal_move – A term added to each time point \(t\) to move the curve along the x-axis.
seed – Seed for the random number generation.
trend – Controls the influence of the current week on \(\mu\).
References
http://surveillance.r-forge.r-project.org/
simulate(length: int, state_weight: Optional[float] = None, state: Optional[Sequence[int]] = None) → pandas.core.frame.DataFrame
Simulate outbreaks.
- Parameters
length – Number of weeks to model. length is ignored if state is given. In this case, the length of state is used.
state – Use a state chain to define the status at each time point (outbreak or not). If not given, a Markov chain is generated automatically.
state_weight – Additional weight for an outbreak which influences the distribution parameter \(\mu\).
- Returns
A DataFrame of an endemic time series in which each row contains the case counts for one week. It also contains the mean case count given by the underlying sine model.
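The Poisson mean given for SeasonalNoisePoisson can be evaluated directly, which helps when choosing parameter values. The sketch below is an illustration under assumptions: seasonal_poisson_mean is a hypothetical helper, and the mapping of \(A\) to amplitude, \(\phi\) to seasonal_move, \(\beta\) to trend, and \(K\) to state_weight is inferred from the parameter descriptions rather than taken from the library source.

```python
import math


def seasonal_poisson_mean(t, amplitude=1.0, alpha=1.0, frequency=1,
                          seasonal_move=0, trend=0.0, state=0, state_weight=0.0):
    """Evaluate mu(t) = exp(A*sin(frequency*omega*(t + phi)) + alpha + beta*t + K*state)."""
    omega = math.pi / 52  # annual term as stated in the class description
    seasonal = amplitude * math.sin(frequency * omega * (t + seasonal_move))
    return math.exp(seasonal + alpha + trend * t + state_weight * state)
```

With the default values the mean equals \(e\) at week 0 and rises to \(e^{2}\) at week 26, where the sine term reaches 1.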