Example 2: Epidemic model¶
The DynGPT package can be applied to solve a given state transition network and infer the associated model parameters. In this section, we will demonstrate the basic use of DynGPT using the epidemic model as an illustrative example.

The STN of epidemic model can be described as:
The system state is \(s={{\left( {{s}_{\text{S}}},{{s}_{\text{I}}},{{s}_{\text{R}}} \right)}^{\text{T}}}\in {{\mathbb{N}}^{3}},\) where \({{s}_{\text{S}}},\text{ }{{s}_{\text{I}}}\) and \({{s}_{\text{R}}}\) represent susceptible, infected, and recovered individuals, respectively.
Event sets are \(\mathcal{R}=\left\{ \left. {{\mathcal{R}}_{1}},\ldots ,{{\mathcal{R}}_{7}} \right\} \right.\), \({{\mathcal{R}}_{1}}\) represents susceptible individual infection, \({{\mathcal{R}}_{2}}\) represents infected individual recovery, \({{\mathcal{R}}_{3}}\) represents recovered individual lose immunity, \({{\mathcal{R}}_{4}}\) represents susceptible individual generation, \({{\mathcal{R}}_{5}},\text{ }{{\mathcal{R}}_{6}}\) and \({{\mathcal{R}}_{7}}\) represent removal of susceptible, infected, and recovered individuals.
State change vectors of events are: \(\left\{ {{\Delta }^{\left( k \right)}} \right\},k=1,2,\ldots ,7,\) where \({{\Delta }^{\left( k \right)}}=\left( \Delta _{\text{S}}^{\left( k \right)},\Delta _{\text{I}}^{\left( k \right)},\Delta _{\text{R}}^{\left( k \right)} \right)\) represents the state change vector corresponding to \(k\text{-th}\) event with
\[\begin{split}\begin{align} & \Delta _{i}^{\left( 1 \right)}=-{{\delta }_{i,\text{S}}}+{{\delta }_{i,\text{I}}},\text{ } \\ & \Delta _{i}^{\left( 2 \right)}=-{{\delta }_{i,\text{I}}}+{{\delta }_{i,\text{R}}}, \\ & \Delta _{i}^{\left( 3 \right)}=-{{\delta }_{i,\text{R}}}+{{\delta }_{i,\text{S}}}, \\ & \Delta _{i}^{\left( 4 \right)}={{\delta }_{i,\text{S}}},\text{ }\Delta _{i}^{\left( 5 \right)}=-{{\delta }_{i,\text{S}}}, \\ & \Delta _{i}^{\left( 6 \right)}=-{{\delta }_{i,\text{I}}},\text{ }\Delta _{i}^{\left( 7 \right)}=-{{\delta }_{i,\text{R}}},\text{ } \\ \end{align}\end{split}\]where \(i\in \left\{ \text{OFF},\text{ON},\text{M},\text{P} \right\}\) and \({{\delta }_{i,j}}\) is the Kronecker delta.
The propensity functions are: \({{\lambda }_{1}}={{k}_{\text{S}}}{{s}_{\text{S}}}{{s}_{\text{I}}},\) \({{\lambda }_{2}}={{k}_{\text{I}}}{{s}_{\text{I}}},\) \({{\lambda }_{3}}={{k}_{\text{R}}}{{s}_{\text{R}}},\) \({{\lambda }_{4}}=\rho ,\) \({{\lambda }_{5}}={{\delta }_{\text{S}}}{{s}_{\text{S}}},\) \({{\lambda }_{6}}={{\delta }_{\text{I}}}{{s}_{\text{I}}}\) and \({{\lambda }_{7}}={{\delta }_{\text{R}}}{{s}_{\text{R}}}.\) In this STN, the input of the DynGPT-Solver is the parameters \({{\theta }_{\text{dyn}}}=\left\{ {{k}_{\text{S}}},{{k}_{\text{I}}},{{k}_{\text{R}}},\rho ,{{\delta }_{\text{S}}},{{\delta }_{\text{I}}},{{\delta }_{\text{R}}} \right\}\).
1. Construct configuration file¶
To facilitate data generation and subsequent neural network training, we need create a configuration file for the epidemic model that encapsulates its key properties. The key propertie of STN primarily consists of the following components:
State-related configuration: This includes the names of the states. In epidemic model, the state names (
state_names) are: (“S”, “I”, “R”).Event-related configuration: The occurrence rate functions governing event dynamics, and state transition vectors associated with each event’s occurrence. In epidemic model, the state transition vectors (
state_transition) are: \(\Delta _{i}^{\left( 1 \right)}\), \(\Delta _{i}^{\left( 2 \right)}\), \(\Delta _{i}^{\left( 3 \right)}\), \(\Delta _{i}^{\left( 4 \right)}\), \(\Delta _{i}^{\left( 5 \right)}\), \(\Delta _{i}^{\left( 6 \right)}\) and \(\Delta _{i}^{\left( 7 \right)}\). Occurrence rates for each event (event_rates) are: \({{\lambda }_{1}}={{k}_{\text{S}}}{{s}_{\text{S}}}{{s}_{\text{I}}}\), \({{\lambda }_{2}}={{k}_{\text{I}}}{{s}_{\text{I}}}\), \({{\lambda }_{3}}={{k}_{\text{R}}}{{s}_{\text{R}}}\), \({{\lambda }_{4}}=\rho\), \({{\lambda }_{5}}={{\delta }_{\text{S}}}{{s}_{\text{S}}}\), \({{\lambda }_{6}}={{\delta }_{\text{I}}}{{s}_{\text{I}}}\), \({{\lambda }_{7}}={{\delta }_{\text{R}}}{{s}_{\text{R}}}\). Specifically, in certain events, specific states actively participate and drive the occurrence of the event without changing their own identity. Such a state is referred to as andrive statein the context of the event. In \({{R}_{2}}\), “I” can be described as a driving state, as its presence is essential for the transition of “S” to “I”. However, the identity of “I” in the event remains unchanged as “I”.Parameter-related configuration: The parameter names and the parameter ranges. In epidemic model, the parameter names (
param_names) are: [“ρ”, “kS”, “kI”, “kR”, “δS”, “δI”,”δR”]. Upper bounds of parameters (upper_bound_val) are: [10.0, 0.2, 0.1, 0.4, 0.2, 0.2, 0.2]. Lower bounds of parameters (lower_bound_val) are: [0.05, 0.1, 0.05, 0.05, 0.05, 0.05, 0.05]. Additionally, we need to specify several hyperparameters for performing SSA simulations on the STN, including the model name (model_name), the initial values (initial_val), maximum simulation time (t_end), the number of parameter sets to simulate for training (train_data_size) and validation datasets (valid_data_size), and the directory for storing the simulation results (dataset_dir). A subset of state indices corresponding to the state names to extract is defined asminimal_state_indices.
Hence, the following configuration files can be defined for epidemic model. And we use Python’s json package to store this configuration for subsequent simulation and neural network training.
[ ]:
config_file = {
"state_names": ["S", "I", "R"],
"events": [
{"state_transition": {"S": 1}, "event_rates": "ρ"},
{"state_transition": {"S": -1, "I": 1}, "event_rates": "kS*S*I", "drive_state": "I"},
{"state_transition": {"I": -1, "R": 1}, "event_rates": "kI*I"},
{"state_transition": {"R": -1, "S": 1}, "event_rates": "kR*R"},
{"state_transition": {"S": -1}, "event_rates": "δS"},
{"state_transition": {"I": -1}, "event_rates": "δI"},
{"state_transition": {"R": -1}, "event_rates": "δR"}
],
"parameters": {
"param_names": ["ρ", "kS", "kI", "kR", "δS", "δI", "δR"],
"upper_bound_val": [10.0, 0.2, 0.1, 0.4, 0.2, 0.2, 0.2],
"lower_bound_val": [0.05, 0.1, 0.05, 0.05, 0.05, 0.05, 0.05]
},
"model_name": "epidemic_model",
"initial_val": [10, 5, 0],
"t_end": 6000,
"train_data_size": 10000,
"valid_data_size": 1000,
"dataset_dir": "/GPUFS/sysu_jjzhang_1/hzw/academicCode/DynGPTTest/DynGPT/data/epidemic_model/",
"minimal_state_indices": [1, 2, 3],
"state_upper_bound": [242, 190, 130],
"observed_data_indices": [1, 2, 3]
}
with open("config/{}_config.json".format(config_file["model_name"]), "w") as file:
json.dump(config_file, file, indent=4)
2. Generate datasets¶
After generating the configuration file, a state transition network for the epidemic model is constructed and simulated to produce the dataset required for training the neural network that solves the model. To generate the training dataset, execute the following command in the terminal:
[ ]:
julia simulation.jl /path/to/config.json
3. Train DynGPT-Solver¶
[1]:
# import packages
import sys
import os
import numpy as np
import json
import logging
import warnings
root_dir = "/GPUFS/sysu_jjzhang_1/hzw/academicCode/DynGPTTest/DynGPT/"
sys.path.append(root_dir)
os.chdir(root_dir)
import dyngpt as dg
logging.getLogger().setLevel(logging.ERROR)
warnings.filterwarnings("ignore")
To train a neural network for addressing the epidemic model and inferring its parameters, a basic configuration file must first be defined. This file includes four key components: (1) the information of the epidemic model, (2) the neural network architecture, (3) the directory for saving outputs and visualizations, and (4) the default parameter settings for the neural network. Subsequent configurations for training, sampling, and inference build upon this foundation by incorporating additional task-specific parameters as needed. The relationships between the configurations of each subsequent stage are presented as follows:

Firstly, the basic configuration file must include detailed information about the dynamic model, which has already been stored in the previous configuration file.
[2]:
config_basic = {}
config_path_dynmodel = "your_path_config"
model_name ="epidemic_model"
config_path_dynmodel = "config/{}_config.json".format(model_name)
config_basic = dg.tl.update_dynmodel_config(config_basic,config_path_dynmodel)
Next, the architecture of the neural network is defined by specifying the relevant structural details within the configuration.
[3]:
config_basic["num_encoder_layers"] = 1 # transformer n_layer
config_basic["n_head"] = 8 # transformer n_head
config_basic["feed_forward_dimension"] = 1024 # transformer feed_forward_dimension
Additionally, we must designate a directory for saving output files and visualizations.
[4]:
config_basic["result_dir"]="your/path/to/saveresult/"
config_basic["figure_dir"]="your/path/to/savefigure/"
config_basic["result_dir"]="result/{}/".format(config_basic["model"])
config_basic["figure_dir"]="figure/{}/".format(config_basic["model"])
Finally, execute the provided code to append default configuration parameters.
[5]:
config_basic = dg.tl.update_default_config(config_basic)
After defining the basic configuration file, we proceed to train the neural network designed to solve the epidemic model. The training process is divided into two stages: the first stage involves pretraining the neural network, while the second stage focuses on fine-tuning it. Additionally, for each stage, we visualize the corresponding loss curves to assess the training progress.
For model pretraining, the configuration file is extended from the basic configuration. We should specify the number of epochs (epochs_pretrain) and the training dataset path (train_dataset_path) in the configuration file.
[ ]:
config_pretrain = config_basic
config_pretrain.epochs_pretrain = 1
config_pretrain.train_dataset_path = "your/path/to/train_dataset"
config_pretrain.train_dataset_path = "data/sirs/sirs_train_stable.json"
result_pre_train = dg.solver.pre_training(config_pretrain)
Visualize the loss values during the pretraining phase of the model.
[6]:
result_pre_train = np.load("result/epidemic_model/train_loss/epidemic_model_epoch500.npz")
dg.pl.plot_loss(result_pre_train["loss_mean"])
During the fine-tuning phase, the configuration file is also extended from the basic configuration, specifying the number of epochs (epochs_fine_tune), the training dataset path (train_dataset_path), and the pretrained neural network weights (weight_nn_pretrain_path).
[15]:
config_fine_tune = config_basic
config_fine_tune.epochs_fine_tune = 1 # 1000
config_fine_tune.train_dataset_path = "your/path/to/train_dataset"
config_fine_tune.train_dataset_path = "data/sirs/sirs_train_stable.json"
config_fine_tune.weight_nn_pretrain_path = result_pre_train["weight_nn_pretrain_path"]
result_fine_tune = dg.solver.fine_tune_training(config_fine_tune)
|epoch 0|mean loss: 24.1329, std loss: 31.7089, ed: 0.0000, kd: 0.0003, lr: 0.00000000
Visualize the loss values during the fine tunning phase of the model.
[7]:
result_fine_tune = np.load("result/epidemic_model/train_loss/fine_tune_epidemic_model_epoch1500.npz")
dg.pl.plot_loss(result_fine_tune["loss_mean"])
After the neural network training converges, we valid the performance of the neural network model in solving epidemic model from comparative analysis of the samples generated by the neural network and those obtained through simulations in terms of their mean, standard deviation, and overall distribution. When using the trained model for sampling, the configuration file is also extended from the basic configuration, specifying the validation dataset path (valid_dataset_path) and incorporating
the fine-tuned network weights (weight_nn_fine_tune_path).
[ ]:
config_sampling = config_basic
valid_dataset_path = "your/valid/datasetpath/"
config_sampling.weight_nn_fine_tune_path = result_fine_tune["weight_nn_fine_tune_path"]
valid_dataset_path = "data/sirs/sirs_valid_stable.json"
config_sampling.weight_nn_fine_tune_path = "result/epidemic_model/weights/fine_tune_epidemic_model_epoch1600.pt"
config_sampling.valid_dataset_path = valid_dataset_path
result_valid_set = dg.tl.compute_sampling_stats(config_sampling)
We visualize the performance of the neural network model on the validation set. This includes a comparative analysis of the samples generated by the neural network and those obtained through simulations in terms of their mean, standard deviation, and overall distribution.
[7]:
dg.pl.plot_model_comparison_stats(result_valid_set)
[12]:
dg.pl.plot_distribution_comparison_2d(result_valid_set,sub_species_index = [0,1],data_index = [12,15,16],width_mm=150)
[9]:
result_valid_set["state_name"] = ["S","I","R"]
result_valid_set['observed_datas'],result_valid_set['nn_sample_datas'] = result_valid_set["ssa_sample"],result_valid_set["nn_sample"]
dg.pl.plot_distribution_comparison_nd(result_valid_set)
4. Infer the parameters with DynGPT-Inferrer¶
In this section, we employ the trained model to infer the corresponding stochastic dynamics from the observed data.
[6]:
valid_dataset_path = "your/valid/datasetpath/"
valid_dataset_path = "data/sirs/sirs_valid_stable.json"
observed_data,true_params = dg.tl.load_synthetic_data(valid_dataset_path,sample_number=4000)
For the inference process, the configuration file is also extended from the basic configuration, defining the following hyperparameters: the trained neural network parameters path (weight_nn_fine_tune_path), the loss function (inference_loss_func), the learning rate (inference_lr), the maximum number of iterations (iteration_steps), the initial number of parameters sampled from the prior distribution (random_r0_number), and the threshold for parameter acceptance
(param_acceptance_threshold).
[7]:
config_infer = config_basic
# config_infer.weight_nn_fine_tune_path = result_fine_tune["weight_nn_fine_tune_path"]
config_infer.weight_nn_fine_tune_path = "result/epidemic_model/weights/fine_tune_epidemic_model_epoch1600.pt"
config_infer.inference_loss_func = "Likelihood"
config_infer.inference_lr = 0.001
config_infer.iteration_steps = 80
config_infer.random_r0_number = 2
config_infer.param_acceptance_threshold = 0.5
result_infer = dg.inferrer.inferring_dynamics(observed_data,config_infer,new_model=True,synthetic_flag=True)
We visualize the distribution obtained from the observational data with the distribution corresponding to the inferred dynamical parameters.
[14]:
dg.pl.plot_distribution_comparison_nd(result_infer,data_index = [2],species_index=[0,1],width_mm = 120)
We visualize the posterior distribution of the inferred dynamical model parameters.
[18]:
# result_infer["param_names"] = np.array(["$ρ$", "$k_{S}$", "$k_{I}$", "$k_{R}$", "$δ_{S}$", "$δ_{I}$","$δ_{R}$"])
result_infer["param_names"] = np.array(["$k_{S}$", "$k_{I}$", "$k_{R}$", "$δ_{S}$", "$δ_{I}$","$δ_{R}$"])
result_infer["es_params"] = result_infer["es_params"][:,:,1:]
dg.pl.plot_param_posterior_dist(result_infer,param_indexes=[0,2])