Welcome to Simulation Outbreak


Find the genetic cut-off that groups outbreak-related cases and excludes outsiders, using a Wright-Fisher forward model.
You can also find it in the SimulPopB R package.
Simulation Outbreak also offers the option to estimate the duration of the outbreak or the number of mutations per site per year from a pairwise distance matrix of SNP differences (or allele differences from cgMLST data).



Disclaimer

Please be aware that this is a proof-of-concept website made for academic research purposes only. Do not use it for medical diagnosis. This software comes with absolutely no warranty or dedicated support.

Getting started

Enter the Type of cut-off you want. For a genetic cut-off based on SNP data, enter the Genome size; for a genetic cut-off based on cgMLST data, enter the Average size of genes and the Number of genes. Enter the duration of the outbreak in Duration, then the Number of mutations per site per year.

In the CSV Files box, provide a CSV file with all swab dates to get the cut-off. You can also provide a CSV file with the pairwise distances between isolates, which is used to separate outbreak cases from non-outbreak cases according to the genetic cut-off found.

Epidemiological value

CSV Files

Run Simulation

Download

Distribution

Cluster

Getting started

Enter the Type of cut-off you want. For a genetic cut-off based on SNP data, enter the Genome size; for a genetic cut-off based on cgMLST data, enter the Average size of genes and the Number of genes. Enter the Number of mutations per site per year and the duration unit in Time step.

In the CSV Files box, provide a CSV file with all swab dates to get the cut-off and a CSV file with the pairwise distances between isolates to estimate the duration of the outbreak with MCMC.

Enter the lower and upper limits of the estimated parameter in Min and Max, and a starting value in Middle. To configure the MCMC further, you can set the Number of iteration, the burn-in and the Number of chain; the estimated parameter is reported as the average of the chain.

Epidemiological value

CSV Files

MCMC Parameters

Run Estimation

Download

Distribution

Cluster

Getting started

Enter the Type of cut-off you want. For a genetic cut-off based on SNP data, enter the Genome size; for a genetic cut-off based on cgMLST data, enter the Average size of genes and the Number of genes. Enter the Duration of outbreak and its unit in Time step.

In the CSV Files box, provide a CSV file with all swab dates to get the cut-off and a CSV file with the pairwise distances between isolates to estimate the number of mutations per site per year with MCMC.

Enter the lower and upper limits of the estimated parameter in Min and Max, and a starting value in Middle. To configure the MCMC further, you can set the Number of iteration, the burn-in and the Number of chain; the estimated parameter is reported as the average of the chain.

Epidemiological value

CSV Files

MCMC Parameters

Run Estimation

Download

Distribution

Cluster

FAQ

What is SimulPopB?

SimulPopB is a web tool that allows you to find the genetic cut-off that groups outbreak cases and excludes outsiders. Two genetic features can be used: SNP or cgMLST data. The tool is built around a Wright-Fisher forward model that simulates the evolution of a given bacterium. It starts with a naive bacterium that accumulates mutations, according to the number of mutations per site per year, over the duration of the outbreak. At the end, SimulPopB gives you the distribution of pairwise genetic distances as well as the genetic cut-off. You can also estimate the duration of your outbreak or the number of mutations per site per year with an MCMC method based on the modMCMC function of the FME R package by Soetaert et al.
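
To make the description above more concrete, here is a minimal Python sketch of a Wright-Fisher forward simulation. It is not SimulPopB's code. The constant population size (`pop_size`), the daily time step, the infinite-sites treatment of mutations and the use of the 95% quantile of simulated pairwise distances as the cut-off are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random
from itertools import combinations


def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_outbreak(genome_size, mu_per_site_per_year, duration_days,
                      pop_size=50, seed=0):
    """Return the final population; each genome is a set of mutation ids."""
    rng = random.Random(seed)
    # Expected number of new mutations per genome per daily time step.
    lam = mu_per_site_per_year * genome_size / 365.0
    # Naive ancestor: every genome starts without any mutation.
    population = [frozenset() for _ in range(pop_size)]
    next_mutation_id = 0
    for _day in range(duration_days):
        offspring = []
        for _slot in range(pop_size):
            # Wright-Fisher step: each offspring picks a parent at random.
            genome = rng.choice(population)
            n_new = poisson(lam, rng)
            if n_new:
                # Infinite-sites assumption: every mutation hits a new site.
                new_ids = range(next_mutation_id, next_mutation_id + n_new)
                genome = genome | frozenset(new_ids)
                next_mutation_id += n_new
            offspring.append(genome)
        population = offspring
    return population


def pairwise_distances(population):
    """Number of differing sites for every pair of genomes."""
    return [len(a ^ b) for a, b in combinations(population, 2)]


def genetic_cutoff(distances, quantile=0.95):
    """Assumed cut-off rule: an upper quantile of the simulated distances."""
    ordered = sorted(distances)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


population = simulate_outbreak(genome_size=5_000_000,
                               mu_per_site_per_year=1e-6,
                               duration_days=60)
distances = pairwise_distances(population)
print("simulated cut-off:", genetic_cutoff(distances))
```

The parameter values in the usage lines are placeholders, not recommendations for any particular organism.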


What kind of data do I need?

To use SimulPopB you need a sample dates file (.csv). The first column is “ID” and the second is “dates”, in “dd/mm/YYYY” or “YYYY-mm-dd” format.

ID dates
1 18/08/2020
2 19/08/2020
3 20/08/2020
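
If you want to check a dates file before uploading it, the short Python sketch below reads the two columns and accepts either date format. It assumes a comma-delimited file with the header `ID,dates`; the delimiter is an assumption, and the helper names `parse_date` and `read_sample_dates` are hypothetical and not part of SimulPopB.

```python
import csv
import io
from datetime import datetime

# The two date formats named in the text above.
DATE_FORMATS = ("%d/%m/%Y", "%Y-%m-%d")


def parse_date(text):
    """Parse a date written as dd/mm/YYYY or YYYY-mm-dd."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date: {text!r}")


def read_sample_dates(handle):
    """Map each sample ID to its swab date (comma-delimited file assumed)."""
    reader = csv.DictReader(handle)
    return {row["ID"]: parse_date(row["dates"]) for row in reader}


# In-memory stand-in for a real file; the third row uses the second format.
example = io.StringIO("ID,dates\n1,18/08/2020\n2,19/08/2020\n3,2020-08-20\n")
dates = read_sample_dates(example)
span_days = (max(dates.values()) - min(dates.values())).days
print(span_days)
```

For a real file, replace the `io.StringIO` object with `open("your_file.csv", newline="")`.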


You also need parameters such as the duration of your outbreak, the number of mutations per site per year, and the genome size (for SNP data) or the number of genes with their average size (for cgMLST data). Moreover, if you want to estimate the duration of the outbreak or the number of mutations per site per year, you need a pairwise distance matrix of your genetic data (.csv) whose first row contains the sample IDs.

1 2 3
0 2 1
2 0 2
1 2 0
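
A distance matrix in this layout can be sanity-checked before use. The Python sketch below assumes a comma-delimited file with integer distances and verifies that the matrix is square, symmetric and zero on the diagonal. These checks and the name `read_distance_matrix` are illustrative assumptions, not a description of the validation SimulPopB performs.

```python
import csv
import io


def read_distance_matrix(handle):
    """Read IDs from the first row and integer distances from the rest."""
    rows = [row for row in csv.reader(handle) if row]
    ids = [cell.strip() for cell in rows[0]]
    matrix = [[int(cell) for cell in row] for row in rows[1:]]
    n = len(ids)
    if len(matrix) != n or any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square and match the ID row")
    for i in range(n):
        if matrix[i][i] != 0:
            raise ValueError(f"non-zero diagonal for sample {ids[i]}")
        for j in range(i + 1, n):
            if matrix[i][j] != matrix[j][i]:
                raise ValueError(f"asymmetric entry for {ids[i]}, {ids[j]}")
    return ids, matrix


# Same values as the example matrix above, written with commas.
example = io.StringIO("1,2,3\n0,2,1\n2,0,2\n1,2,0\n")
ids, matrix = read_distance_matrix(example)
print(ids, matrix)
```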


Simulation

This part allows you to find the genetic cut-off. First, enter your type of genetic feature (SNP or cgMLST). Then enter the genome size for “SNP”, or the average size of genes and the number of genes for “cgMLST”. You can also choose the time step (day or month) for the simulation. If your outbreak lasted a long time, a monthly time step gives a quicker answer with less precision. The number of mutations per site per year is also required. Finally, provide your sample dates file in the “Choose CSV Date File” section. If you don't have a pairwise distance matrix, the tool gives you the simulated distribution of pairwise distances and the genetic cut-off in the “Distribution” section. If you have a pairwise distance matrix, the tool also groups your samples according to the genetic cut-off and shows the result in the “Cluster” section. Once the genetic cut-off and the clusters are displayed, you can change the cut-off and see what happens in the “Cluster” section.
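
The grouping step can be pictured with a small Python sketch. The text does not say which clustering rule the tool applies, so this example assumes single linkage: two samples are linked when their distance is less than or equal to the cut-off, and a cluster is a set of samples connected through such links. The function name `cluster_by_cutoff` is hypothetical.

```python
def cluster_by_cutoff(ids, matrix, cutoff):
    """Group samples linked by distances <= cutoff (single linkage assumed)."""
    parent = list(range(len(ids)))

    def find(i):
        # Follow parent links to the cluster representative, compressing paths.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            if matrix[i][j] <= cutoff:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i, sample in enumerate(ids):
        clusters.setdefault(find(i), []).append(sample)
    return sorted(clusters.values())


# Example matrix from the FAQ above.
ids = ["1", "2", "3"]
matrix = [[0, 2, 1],
          [2, 0, 2],
          [1, 2, 0]]
for cutoff in (0, 1, 2):
    print(cutoff, cluster_by_cutoff(ids, matrix, cutoff))
```

Rerunning the function with a different `cutoff` mirrors what changing the cut-off does in the “Cluster” section: a larger value merges more samples into the same group.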


Estimation

In the estimation part, you can estimate the duration of the outbreak or the number of mutations per site per year. The pairwise distance matrix is required. As in the Simulation part, enter your type of genetic feature (SNP or cgMLST). Then enter the genome size for “SNP”, or the average size of genes and the number of genes for “cgMLST”. You can also choose the time step (day or month) for the simulation. Give the number of mutations per site per year if you are estimating the duration, or the duration of your outbreak if you are estimating the number of mutations per site per year. You can also choose the number of iterations for your MCMC, as well as the burn-in and the number of chains. This last setting is very important because the estimated parameter is computed as the average estimate of the MCMC chain. Estimation also needs a minimal and a maximal value, which bound the interval in which you think your parameter lies, and a middle value used as the starting point.
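
The roles of Min, Max, Middle, the number of iterations, the burn-in and the number of chains can be illustrated with a toy Metropolis sampler in Python. This is not the modMCMC routine from FME and not the tool's likelihood. As a stand-in, it assumes each pairwise distance is Poisson distributed with mean 2 x mutation rate x genome size x duration, a flat prior between the limits, and an estimate taken as the mean of the per-chain averages. All names and numeric values are hypothetical.

```python
import math
import random


def log_likelihood(duration_days, distances, genome_size, mu):
    """Poisson log-likelihood of the distances (simplifying assumption)."""
    lam = 2.0 * mu * genome_size * duration_days / 365.0
    if lam <= 0.0:
        return float("-inf")
    return sum(d * math.log(lam) - lam - math.lgamma(d + 1) for d in distances)


def run_chain(distances, genome_size, mu, lower, upper, start,
              n_iter, burn_in, step, rng):
    """Random-walk Metropolis chain; returns the post-burn-in average."""
    current = start
    current_ll = log_likelihood(current, distances, genome_size, mu)
    kept = []
    for iteration in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        if lower <= proposal <= upper:  # flat prior between Min and Max
            proposal_ll = log_likelihood(proposal, distances, genome_size, mu)
            accept_prob = math.exp(min(0.0, proposal_ll - current_ll))
            if rng.random() < accept_prob:
                current, current_ll = proposal, proposal_ll
        if iteration >= burn_in:  # discard the burn-in iterations
            kept.append(current)
    return sum(kept) / len(kept)


def estimate_duration(distances, genome_size, mu, lower, upper, middle,
                      n_iter=5000, burn_in=1000, n_chains=3, step=10.0, seed=0):
    """Average the chain means (assumed way of combining several chains)."""
    rng = random.Random(seed)
    means = [run_chain(distances, genome_size, mu, lower, upper, middle,
                       n_iter, burn_in, step, rng)
             for _chain in range(n_chains)]
    return sum(means) / len(means)


observed = [1, 2, 1, 0, 1, 2, 1, 1, 0, 1]  # made-up pairwise distances
estimate = estimate_duration(observed, genome_size=5_000_000, mu=1e-6,
                             lower=1.0, upper=200.0, middle=50.0)
print(f"estimated duration: {estimate:.1f} days")
```

A narrow interval between the limits and a sensible starting value shorten the time the chain spends far from plausible values, which is why the burn-in iterations are dropped before averaging.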


Saving your findings

All plots can be saved as PDF files with the Download button.

Authors & Funding

SimulPopB was developed by Audrey Duval, Sylvain Brisse and Lulla Opatowski. The original development of this tool was carried out as part of the postdoctoral project of Audrey Duval, supervised by Sylvain Brisse and Lulla Opatowski.


People involved

Sylvain Brisse (Scientific manager, team leader, documentation, initiator)
Lulla Opatowski (Scientific manager, team leader, documentation)
Audrey Duval (coding, testing, documentation, evaluation)


Funding

This work was financially supported by the MedVetKlebs project, a component of the One Health European Joint Programme (One Health EJP), which has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No. 773830.