using EpiAware, Distributions
= AR(;
ar2 =[truncated(Normal(0.4, 0.2), 0, 1),
damp_priorstruncated(Normal(0.1, 0.05), 0, 1)],
=[Normal(0, 0.2), Normal(0, 0.2)],
init_priors=HierarchicalNormal(std_prior=HalfNormal(0.1))
ϵ_t )
A Prototype for Compositional Probabilistic Infectious Disease Modelling
Recent outbreaks of Ebola, COVID-19 and mpox have demonstrated the value of modelling for synthesising data for rapid evidence to inform decision making. Effective models require integration of expert domain knowledge from clinical medicine, environmental science, behavioural research, and public health to accurately capture transmission processes, yet current modelling approaches create barriers to this integration. Methods used to synthesise available data broadly fall into pipeline approaches that chain separate models together, or joint models that are often monolithic and difficult to adapt. These barriers have prevented advances across multiple settings where models could have provided actionable insights. Composable models where components can be reused across different contexts and combined in various configurations whilst maintaining statistical rigour could address these limitations. Beyond enabling modellers to build models more efficiently, standardised component interfaces provide structured environments for large language models to compose and validate epidemiological models. In this work, we start by outlining the key requirements for a composable infectious disease modelling framework and then present a prototype that addresses these requirements through composable epidemiological components built on Julia’s type system and Turing.jl
. Our approach enables “LEGO-like” model construction where complex models emerge from composing simpler, reusable components. Through three case studies using the prototype, we show how components can be reused across different models whilst maintaining statistical rigour. The first replicates a COVID-19 analysis for South Korea using renewal processes with time-varying reproduction numbers. The second extends these components with reporting delays and day-of-week effects for real-time nowcasting applications. The third integrates ODE solvers for compartmental disease transmission models applied to influenza outbreak data. Across all case studies, the same core components combine differently to address distinct epidemiological questions. We explore other potential options and compare them to our proposed approach. The prototype demonstrates promise but future work is needed to solve remaining composability challenges, expand the component library, integrate bridges to existing epidemiological software ecosystems, and explore opportunities for large language model assisted model construction in resource-constrained settings.
1 Center for Forecasting and Outbreak Analysis; Centers for Disease Control, United States of America
2 Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, United Kingdom
✉ Correspondence: Samuel Abbott <sam.abbott@lshtm.ac.uk>
Introduction
Recent outbreaks of Ebola, COVID-19 and mpox have demonstrated the value of modelling for synthesising data for rapid evidence to inform decision making [1]. Infectious diseases spread through complex interactions between biology, human behaviour, economic factors, and environmental conditions that must all be understood to control transmission effectively. Effective models require integration of expert domain knowledge from clinical medicine, environmental science, behavioural research, and public health to accurately capture these multifaceted transmission processes, yet current modelling frameworks create barriers to this integration. Individual-level data such as viral loads, biomarkers, genomic sequences, and clinical observations are routinely ignored or aggregated when informing population-level models, losing information that could improve decisions. These barriers have prevented advances in settings where models could have provided actionable insights such as early outbreak analysis, wastewater surveillance, phylodynamics, clinical biomarkers, pooled testing, and large-scale clinical datasets. The next infectious disease threat is unpredictable and may be exacerbated by climate change [2], requiring adaptable response capabilities that can rapidly incorporate diverse data sources and domain expertise.
Methods used to synthesise available data in real time broadly fall into two classes: either combining results from multiple, smaller models calibrated in isolation (the pipeline approach, e.g., [3]), or representing a single model tuned to the specific scenario (the joint model approach, e.g., [4,5]). Both approaches have downsides. Pipeline approaches combine results from separate models without uncertainty propagation, leading to loss of detail and statistical rigour [6]. Joint approaches use single monolithic models that, whilst able to integrate multiple processes and data streams, are too complex to enable transfer to other settings or extension with additional model components. To adapt or extend such models, analysts need to fully comprehend all parts of the corresponding model and code, creating barriers to sharing methodology and leading to inefficient re-implementation when parts of a model could, in principle, be re-used. Attempts to include insights from environmental scientists, economists, or behavioural researchers typically result in models that are either too complex to use practically or too simplified to capture valuable expertise. This prevents effective integration of expertise across disciplines. Transfer of methodology between outbreak events has proven difficult, with each posing unanticipated challenges that existing tools cannot accommodate. Whilst large language models offer potential to assist with model construction, current monolithic approaches lack the structured component interfaces and validation frameworks needed for such systems to compose models reliably.
Recent developments in computational statistics and scientific computing demonstrate the potential for composable approaches where components can be reused across different contexts and combined in various configurations whilst maintaining statistical rigour that could address these limitations. Advances in Turing.jl
have introduced submodel interfaces that enable composable probabilistic programming, providing a pathway for epidemiological model composition, though initial epidemiological applications revealed interoperability challenges [7,8]. Category theory provides one mathematical framework for this composability through operadic composition in hierarchical model construction, as applied in the AlgebraicJulia ecosystem [9]. Alternative approaches to compositional modelling include the SciML ecosystem’s [10] symbolic-numeric framework, where ModelingToolkit.jl
[11] and Catalyst.jl
[12] use acausal equation-based modelling with automated symbolic transformations to support mixed equation types including differential-algebraic equations, partial differential equations, and stochastic differential equations through unified interfaces. HydroModels.jl
[13] demonstrates compositional hydrological modelling with differentiable neural-enhanced components, whilst SpeedyWeather.jl
uses an interactive domain-specific language approach for “LEGO-like” atmospheric modelling with modular component assembly [14]. The key insight underlying these approaches is the separation of structural syntax, which defines valid compositions, from computational semantics, enabling modularity and independence of components whilst maintaining mathematical rigour.
Figure 1 demonstrates how these compositional principles could enable component reuse across different epidemiological applications. Three example applications wastewater surveillance, biomarker modelling, and early outbreak analysis each compose models from components of different types. The schematic also highlights two examples of component reuse. The incubation period model appears in all three applications, and the within-host viral kinetics model is shared between biomarker modelling and wastewater surveillance. This highlights how components developed for one application could be incorporated into others when they share common underlying processes.

This paper presents a prototype that combines the modularity of pipeline approaches with the statistical rigour of joint models through composable epidemiological components. Our approach enables “LEGO-like” model decisions through standardised interfaces similar to those used in SpeedyWeather.jl
[14]. The prototype supports composability beyond ordinary differential equations, accommodating mixed equation types and the potential for different computational backends. We implement this as a domain-specific language operating intended to be implemented as optional package extensions. We demonstrate our approach using an autoregressive model example to illustrate the proposed compositional pattern and component swapping capabilities. Through three case studies using the prototype, we show how components can be reused across different models: the first is inspired by [15]; the second reuses components from the first, along with new elements, inspired by [16]; the third is inspired by [17], again reusing components, alongside the use of an ODE. Finally, we discuss alternative design approaches, evaluate the strengths and limitations of our compositional approach, and identify key areas for future development.
Prototype Implementation
Requirements for Composable Infectious Disease Modelling
Modelling infectious disease dynamics requires quantifying uncertainty at every level because decisions must account for incomplete knowledge about individual infection risk, transmission dynamics, and observation processes. A clear separation between distinct model components, infection processes, observation processes, and latent dynamics is also key as these allows reasoning on each of these components separately. Because diseases affect populations heterogeneously across age, location, and risk groups, the framework needs to be able to support arbitrary stratification schemes for all components. These stratified models must also remain data-agnostic as this allows the model to be generalised to different datasets, tested based on just its prior specification, and used for forecasting. Similarly, the book work of supporting multiple strata needs to be abstracted from the user to make the system easier to use but at the same time they need to be able to model relationships between strata to support partial pooling of parameters for sparse data settings. To allow for models to be validated, the framework must support nesting models within models and programming over model structure itself, allowing simple components to compose into sophisticated models while remaining individually interrogable for debugging, validation, and mechanistic understanding. This compositional approach requires a clear, concise modelling language so that it can be used by a wide pool of users and so that model specifications can be written quickly but with clarity. Supporting modern inference methods is important so that complex models can be fit and this necessitates gradient computation throughout via automatic differentiation. It is also important to allow for a wide range of inference methods so that the best approach for a given model/data combination can be used. This means supporting abstract back-ends that seamlessly switch between inference approaches.
We also need to have model components that encapsulate both structure and prior distributions so that domain experts can contribute specialised knowledge: a virologist’s understanding of within-host dynamics, an epidemiologist’s of contact patterns, a clinician’s of disease progression insights without reviewing the entire modelling framework. Standardised interfaces between components are needed to allow individual components to work together, to support handling of multiple strata, and to allow for proper uncertainty propagation. Such interfaces also enable large language models to serve as model construction agents, composing models from component libraries whilst validation methods ensure statistical rigour [18]. As there are a range of different potential ways to express infectious disease models including ordinary differential equations, agent-based models, network models, stochastic processes, and discrete time models these all need to be supported both independently and in combination. Importantly, the design must enable incremental adoption without requiring complete rebuilds of existing models, and components should remain functional as standalone tools outside the compositional framework to maximise their utility and adoption. Finally, we need a framework that can be composed with out of domain approaches and expertise, such as neural networks, Gaussian processes, and other machine learning approaches.
Our approach
Meeting these requirements requires programming with probabilities, for which probabilistic programming languages are designed. We also need a probabilistic programming language that supports automatic differentiation for modern inference, the ability to program over model structure itself to enable model nesting and composition, and access to as wide an ecosystem as possible to avoid lock-in and enable integration with existing scientific computing tools. As far as we are aware, only probabilistic programming languages built in Julia [19] provide the metaprogramming capabilities needed to create domain-specific abstractions that can handle arbitrary stratification, standardised interfaces between components, and programming over the model structure. Among Julia’s options, Turing.jl
[8] best meets our requirements with mature submodel support for nesting models within models, extensive inference algorithm choices, and it’s implementation as a light abstraction layer on top of the wider Julia ecosystem. Additional benefits of Julia include eliminating the two-language problem, leveraging multiple dispatch for clean component composition, and accessing the mature SciML ecosystem [10] for differential equations and other scientific computing tools.
Our approach uses a two-layer architecture with high-level domain-specific language (DSL) for epidemiological modelling and a low-level implementation using Turing.jl
(though importantly we are not locked in to this choice as the DSL is agnostic of the backed used). This separation enables incremental adoption without rebuilding existing models to use our DSL, with all components remaining functional as standalone tools outside the compositional framework. The domain-specific language layer provides clear, concise model specification using epidemiological concepts, enabling domain experts to contribute components encapsulating their specialised knowledge without understanding the full framework. The backend layer aims to handle the automated bookkeeping of stratification, interface validation, and uncertainty propagation whilst supporting multiple inference approaches and auto differentiation options by leveraging the Turing.jl
and wider Julia ecosystems.
Domain-Specific Language Structure
Our prototype domain-specific language builds on Julia’s type system to enable composable epidemiological modelling through two key design patterns. First, abstract types define interfaces that implementations must follow, analogous to contracts specifying what operations a model component must support rather than how it implements them. All model components inherit from a parent AbstractModel
type, establishing a common foundation whilst allowing specialised behaviour through subtypes. Second, structures contain other structures as fields (a struct-in-struct pattern), allowing complex models to be built by nesting simpler components. This pattern enables models to be assembled like building blocks whilst maintaining clear boundaries between different epidemiological processes.
We organise model components into three abstract type hierarchies, each of which inherits from AbstractModel
, corresponding to distinct epidemiological processes. AbstractEpiModel
represents infection generation processes such as renewal models or ordinary differential equation transmission dynamics. AbstractLatentModel
captures time-varying parameters and unobserved processes such as changing reproduction numbers or reporting rates, implemented through structures like autoregressive processes, random walks, or moving averages. AbstractObservationModel
links latent states to observed data by encoding measurement processes such as reporting delays, aggregation over time periods, and observation error distributions. These structures are data-agnostic, specifying what to do when they encounter data rather than containing data themselves, making model definitions reusable across different datasets and scenarios. Each hierarchy supports multiple concrete implementations that can be swapped to compare modelling assumptions whilst keeping other components fixed.
Models can compose across these hierarchies rather than being restricted to combining components within a single type. For example, an observation model can contain a latent process as a field to represent time-varying ascertainment, or wrap another observation model to add reporting delays. This cross-hierarchy composition extends the building block analogy to include more complex models. For instance, we can construct a delay convolution observation model, LatentDelay
, using an underlying observation model and a delay distribution model.
The EpiProblem
structure is a top-level container assembling these components into a complete epidemiological model. It holds an infection process (epi_model
), a latent process (latent_model
), an observation process (observation_model
), and a time span for inference or simulation. Structures can be modified in place using tools like Accessors.jl
, enabling both model iteration and patterns such as partially pooled models that update low-level priors based on grouping structures. The abstract type system enables shared methods across all model components, such as print methods (shown below) for displaying model specifications or functions for visualising model directed acyclic graphs. This approach also allows for the creation of mappings between submodels such as one to one, one to many, and many to many mappings so that, for example, a single infection process can be linked to multiple observations models by specifying a mapping model.
To demonstrate the structure and use of AbstractLatentModels
, we start with an autoregressive order two (AR(2)) process, which mathematically is:
\[Z_t = \rho_1 Z_{t-1} + \rho_2 Z_{t-2} + \epsilon_t, \quad \epsilon_t \sim \text{Normal}(0, \sigma)\]
In our prototype DSL, this is defined using the AR
struct. We use priors based on [20] which we will reuse in Section 3.1. Prior distributions are specified using Distributions.jl
[21], the standard probability distributions package in the Julia ecosystem, which provides a unified interface for probability distributions that is interoperable with both our framework and Turing.jl
.
This constructor has created the following struct definition.
ar2
AR(damp_prior = Product(Truncated(0.4, 0.2, 0.0, 1.0)),
init_prior = TuringScalMvNormal([0.0, 0.0], 0.2),
p = 2,
ϵ_t = HierarchicalNormal(mean = 0.0,
std_prior = HalfNormal(0.1),
add_mean = false))
Another common latent model is the moving average model
\[Z_t = \epsilon_t + \theta \epsilon_{t-1}, \quad \epsilon_t \sim \text{Normal}(0, \sigma)\]
The MA
struct defines this in the same way that the AR
did for the AR process.
= MA(;
ma1 =[truncated(Normal(0.0, 0.2), -1, 1)],
θ_priors=HierarchicalNormal(std_prior=HalfNormal(0.1))
ϵ_t )
A popular combination of these models is the autoregressive moving average (ARMA) model that can have different orders for both the AR and MA components. An ARMA(2,1) can be defined as:
\[Z_t = \rho_1 Z_{t-1} + \rho_2 Z_{t-2} + \epsilon_t + \theta \epsilon_{t-1}\]
In our DSL this can be represented as a composition of the AR and MA structs by updating the AR error term:
using Accessors
= @set ar2.ϵ_t = ma1 arma21
The result of this step has been to update the ar2
struct so that the definition of the moving average model is nested inside it.
arma21
AR(damp_prior = Product(Truncated(0.4, 0.2, 0.0, 1.0)),
init_prior = TuringScalMvNormal([0.0, 0.0], 0.2),
p = 2,
ϵ_t = MA(θ = Product(Truncated(0.0, 0.2, -1.0, 1.0)),
q = 1,
ϵ_t = HierarchicalNormal(mean = 0.0,
std_prior = HalfNormal(0.1),
add_mean = false)))
Similarly, autoregressive integrated moving average (ARIMA) models extend ARMA by adding differencing operations that transform the series
\[\Delta Z_t = Z_{t} - Z_{t-1}\]
So that the ARIMA(2,1,1) model is defined as an ARMA(2,1) model for the first order differences:
\[\Delta Z_t = \rho_1 \Delta Z_{t-1} + \rho_2 \Delta Z_{t-2} + \epsilon_t + \theta \epsilon_{t-1}\]
We compose the ARMA model with a differencing operation using the DiffLatentModel
wrapper:
= DiffLatentModel(arma21, Normal(0, 0.2); d=1) arima211
Alternatively, EpiAware provides an arima()
constructor function that simplifies this specification.
Other latent models extend modelling options through combining models additively, multiplicative scaling, and piecewise processes. This approach enables representation of arbitrary latent processes through composition.
Backend Implementation: Turing Interface
Our DSL translates to Turing models through generate functions that dispatch on abstract type hierarchies. Each abstract type (AbstractLatentModel
, AbstractEpiModel
, AbstractObservationModel
) has a generate function interface that concrete model types must implement a method to dispatch upon. When a generate function receives a model component, Julia’s multiple dispatch automatically selects the appropriate implementation based on the component’s type, producing Turing.jl
[8] code using the @model
macro. Unit tests are used to ensure concrete implementations satisfy interface requirements, enabling all components to interoperate correctly regardless of which specific model types are composed together. This approach means new model types integrate seamlessly without modifying existing code, and users can swap components to compare modelling assumptions whilst keeping other parts of the model fixed.
The generated Turing models serve dual purposes: simulation by sampling from prior distributions, or inference by conditioning on observed data and applying Turing’s suite of algorithms including gradient-based methods like the No U-Turn Sampler (NUTS) [22]. These are standard Turing models with no special constraints beyond those imposed by the Turing.jl
framework itself, supporting all DynamicPPL.jl
operations such as parameter fixing, model conditioning, and posterior predictive sampling. Generated components can be used directly as standalone models or nested within other Turing models using the @submodel
macro, enabling incremental adoption where only parts of a model use the compositional framework whilst other parts use custom Turing.jl
.
Many computational components are backend-agnostic, containing no probabilistic programming constructs and therefore portable to other frameworks. For example, we use a common pattern, accumulate_scan
, which builds on the base accumulate function to model iterative temporal processes through step functions. These step functions are structs built on the AbstractAccumulationStep
interface that implement a callable defining the single-step update rule. Mathematical utilities such as functions for converting between reproduction numbers and growth rates, discretising continuous delay distributions, and reparameterising observation error models are similarly backend-agnostic. This separation enables a package extension pattern where standard Julia packages provide domain-specific functionality whilst the compositional layer is added as an optional extension, allowing users to adopt the framework incrementally without requiring their entire workflow to commit to the compositional approach.
Returning to our example from the DSL section, the autoregressive process can be mapped from its high-level representation to a Turing model using the generate_latent
function and multiple dispatch, which generates the following Turing.jl
:
@model function EpiAwareBase.generate_latent(latent_model::AR, n)
p = latent_model.p
@assert n>p "n must be longer than order of the autoregressive process"
ar_init ~ latent_model.init_prior
damp_AR ~ latent_model.damp_prior
@submodel ϵ_t = generate_latent(latent_model.ϵ_t, n - p)
ar = accumulate_scan(ARStep(damp_AR), ar_init, ϵ_t)
return ar
end
The key line @submodel ε_t = generate_latent(latent_model.ε_t, n - p)
enables composition by delegating to whatever error model was provided. The AR dynamics are implemented through a custom accumulation step that maintains the autoregressive state.
function (ar::ARStep)(state, ϵ)
new_val = dot(ar.damp_AR, state) + ϵ
new_state = vcat(state[2:end], new_val)
return new_state
end
This step function works with accumulate_scan
to build the AR series by applying the autoregressive equation at each time step. The MA model has a similar structure with its own internal step function. The accumulate_scan
pattern enables composable iteration steps, allowing complex processes like renewal models with susceptible depletion to be built by composing simple step operations. This design means we only need to write the single-step operation without worrying about the iteration process, making components more modular and reusable.
The full ARIMA(2,1,1) model we defined in the DSL section can then be generated using the same approach, producing the following Turing.jl
.
@model function EpiAwareBase.generate_latent(latent_model::DiffLatentModel, n)
d = latent_model.d
@assert n>d "n must be longer than d"
latent_init ~ latent_model.init_prior
@submodel diff_latent = generate_latent(latent_model.model, n - d)
return _combine_diff(latent_init, diff_latent, d)
end
The DiffLatentModel’s @submodel diff_latent = generate_latent(latent_model.model, n - d)
calls the ARMA model, which in turn calls its composed AR and MA components, then applies differencing through cumulative summing. This recursion through @submodel
enables arbitrary composition while maintaining separation between components.
To demonstrate fitting these submodels we first create a model that uses our ARIMA(2,1,1) model as submodel combining it with a normal observation error process.
using DynamicPPL, Turing, LinearAlgebra
@model function arima_with_obs(arima_spec, n_timesteps)
@submodel Z_t = generate_latent(arima_spec, n_timesteps)
~ truncated(Normal(0.0, 0.001), 0, Inf)
σ_obs ~ MvNormal(Z_t, σ_obs * I)
y_obs return y_obs
end
We can then generate synthetic data with fixed parameters which we sample from the prior distribution using the rand
, and fix
functions and calling the model to simulate the observations. We first define the model.
= 20
n_timesteps = arima_with_obs(arima211, n_timesteps) gen_model
Then sample from it,
= rand(gen_model) simulated_params
Now we have parameters we can simulated some data using the generative model by fixing the random variables using the sampled parameters and the fix
function. We can then call it, like any normal function, to get simulated observations for y
.
= fix(gen_model, simulated_params)
fixed_model = fixed_model() y_observed
For inference, we condition on the generative model using simulated observations and the condition
function or here the equivalent |
notation. Now we have a model conditioned on data we can fit it using our choice of approach supported by Turing.jl
. Here we decide to use the No-U-turn sampler (a popular variant of MCMC).
= gen_model | (; y_obs = y_observed)
conditioned_model = sample(conditioned_model, NUTS(), MCMCThreads(), 2000, 4) chains
We can then compare our posterior distributions to the true sampled values from our ARIMA(2, 1, 1) model using PairPlots.jl
[23] with the CairoMakie.jl
[24] backend for visualisation (Figure 2). The posterior distributions recover the simulated parameter values.
Case Studies
We demonstrate how our prototype compositional modelling DSL can recreate and extend existing epidemiological models. Each case study shows how complex models are built by composing reusable components, validating their behaviour through prior predictive checks, and fitting them to real data. The examples progress from simple renewal models to more complex observation processes that account for reporting delays and temporal effects.
All code and data for reproducing the analyses in this paper are available at: https://github.com/EpiAware/PrototypeCompositionalProbablisticInfectiousDiseaseModelling.
On the derivation of the renewal equation from an age-dependent branching process: an epidemic modelling perspective
In On the derivation of the renewal equation from an age-dependent branching process: an epidemic modelling perspective, Mishra et al (2020) [20] demonstrate the mathematical correspondance between age-dependent branching processes and time-since-infection epidemiological models, as a renewal model with time-varying reproduction number \(R_t\). Renewal models use the renewal equation to model how new infections arise from previous infections, weighted by the generation time distribution (or serial interval) [25]. This is analogous to an autoregressive process where the autoregressive coefficients have epidemiological meaning rather than being estimated parameters. They show how solutions to the renewal equation, when combined with a negative binomial observation model, define a Bayesian hierarchical framework for inference on reported case time series data, demonstrating this on test-confirmed cases of COVID-19 in South Korea.
Data
Mishra et al used daily reported test-confirmed cases of COVID-19 in South Korea between January to July 2020. This data is curated by the covidregionaldata
package, but we have saved the South Korean data locally.
using Chain, CSV, DataFramesMeta, Dates
= "data/south_korea_data.csv"
datapath = @chain datapath begin
south_korea_data read(DataFrame)
CSV.= _.cases_new, dates = _.date)
(y_t end
Model
Our model is inspired by Mishra et al and uses a log-scale time-varying reproductive number \(\log R_t\) modelled as an AR(2) process, which in turn specifies the latent infections \(I_t\) as a solution to the renewal equation conditional on the trajectory of \(\log R_t\). The latent infection process is then linked directly to reported cases \(C_t\) on matching days using a negative binomial link distribution. The key difference from Mishra et al is in the initialization. They seed the renewal equation with importations (independent daily effects \(\mu_t \sim \text{Exponential}(0.5)\)), whilst we initialize by solving for the growth rate corresponding to the initial reproduction number and extrapolating backwards without allowing for ongoing importations.
\[ \begin{aligned} \rho_1, \rho_2, Z_0, Z_{-1}, I_0, \sigma, \phi &\sim \pi(\cdot), \\ \epsilon_t & \sim \text{Normal}(0, \sigma)~~ \text{i.i.d } \forall t, \\ Z_t &= \rho_1 Z_{t-1} + \rho_2 Z_{t-2} + \epsilon_t,~~ t = 1, 2, 3, \dots\\ R_t & = \exp(Z_t), \\ r & \text{ solves } G(r) = 1 / R_1, \\ I_{t} & = I_0 e^{rt},~~ t = 0, -1, -2, -3,-n+1, \\ I_t &= R_t \sum_{s \geq 1} g_s I_{t-s}, ~~ t = 1, 2, 3, \dots, \\ y_t & \sim \text{NegBin}(I_t, \phi), ~~ t = 1, 2, 3, \dots. \end{aligned} \]
Where \(\pi\) is a prior distribution for the hyperparmeters of the AR(2) model, initial states \(Z_0, Z_{-1}\), the autoregressive coefficients \(\rho_1,\rho_2\) and innovations standard deviation \(\sigma\), along with initial condition value for the latent infections \(I_0\) and observation overdispersion \(\phi\). \(g_t\) is the generation distribution probability mass function (pmf). \(r\) is the growth rate determined by the by \(R_1 = \exp(Z_1)\) using the implicit relationship [26].
\[ G(r) = \sum_{j \geq 1} g_j \exp(- r j) = 1 / R_1. \]
This means that we determine the initial condition of the latent infecteds before \(t=0\), \(I_{-1}, I_{-2}, \dots, I_{-n+1}\) jointly with sampling \(R_1\) where \(n\) is the maximum support value of the \(g_t\) pmf.
Latent Model
We reuse the AR(2) model ar2
defined in Section 2.3, which has the appropriate priors from Mishra et al. Prior predictive samples (Figure 3 A) show that a priori these priors assign a few percent chance of achieving very high \(R_t\) values, i.e. \(R_t \sim 10-1000\) is not excluded.
Infection Generating Process
The renewal equation requires a discrete generation time pmf \(g_t\). Our prototype provides a constructor that converts continuous distributions into the required discrete pmf using double interval censoring [27]. This is called inside the EpiData
struct which we use to define the expected model specific input data. Mishra et al used a \(\text{Gamma}(6.5, 0.62)\) serial interval distribution,
= Gamma(6.5, 0.62)
truth_SI = EpiData(gen_distribution=truth_SI) model_data
Figure 3 D compares the discretized generation interval with the underlying continuous distribution. As expected the observed discrete generation interval differs from the underlying continuous distrbution due to the double censoring process.
As in the earlier example, we define our renewal model using a specialised struct, Renewal
. This requires the discretized generation time (from model_data
) and an initialisation prior. We use a lognormal prior for the initial latent infections as it has a skewed right tail allow some weight on large initial values.
= Normal(log(1.0), 0.1)
log_I0_prior = Renewal(model_data; initialisation_prior=log_I0_prior) renewal
This results in the following model structure.
renewal
Renewal(
data =
EpiAware.EpiInfModels.EpiData{Float64, typeof(exp)}([0.026663134095601056, 0.14059778064943784, 0.2502660305615846, 0.24789569560506844, 0.1731751163417783, 0.09635404000022223, 0.04573437575216367, 0.019313826994143808], 8, exp),
initialisation_prior = Normal(0.0, 0.1),
recurrent_step =
EpiAware.EpiInfModels.ConstantRenewalStep{Float64}([0.019313826994143808, 0.04573437575216367, 0.09635404000022223, 0.1731751163417783, 0.24789569560506844, 0.2502660305615846, 0.14059778064943784, 0.026663134095601056]))
To demonstrate the infection model independently, we define a fixed \(R_t\) trajectory that decreases from 3 to 0.5 over 50 days and pass this to the generate_latent_infs
function which, like the generate_latent
function, relies on multiple dispatch to contruct the model that the struct it is passed defines.
= [0.5 + 2.5 / (1 + exp(t - 15)) for t in 1:50]
Rt = generate_latent_infs(renewal, log.(Rt)) renewal_mdl
The implied distribution of \(I_t\) trajectories conditional on this \(R_t\) trajectory can then be sampled independently of other model components (Figure 3 B). As expected this gives us a “classical” outbreak like dynamic with infection incidence initially increasing exponentially, the rate of growth slowing over time, and then finally the infection incidence “turning over”.
The full infection generating process, that is the model defined in Section 3.1 without the link to case data, can be constructed by passing samples of \(Z_t\) into the renewal model sampler.
Observation Model
In Mishra et al the overdispersion parameter \(\phi\) sets the relationship between the mean and variance of the negative binomial errors. We default to a prior on \(\sqrt{1/\phi}\) (referred to as the cluster factor) because this quantity is approximately the coefficient of variation of the observation noise and, therefore, is easier to reason on a priori beliefs. A prior for \(\phi\) was not specified in Mishra et al, so we use \(\sqrt{1/\phi} \sim \text{HalfNormal}(0.1)\).
= NegativeBinomialError(cluster_factor_prior=HalfNormal(0.1)) negbin
As with the latent model, we can generate a Turing.jl
model conditional on a fixed latent infection trajectory. Here we simulated this to look like an outbreak with a symmetrical growth and decay phase.
= [1000 * exp(-(t - 15)^2 / (2 * 4)) for t in 1:30]
I_t = generate_observations(negbin, missing, I_t) negbin_mdl
Here we use a missing
argument to indicate observed variables that are to be sampled rather than to be used to accumulate log posterior density. Prior predictive samples (Figure 3 C) show the dispersion around the mean implied by our choice of prior.
Fitting to Data
We now compose the three model components into an EpiProblem
, which defines the complete generative model for the time range tspan
.
= (45, 80)
tspan = EpiProblem(epi_model = renewal,
mishra = ar2,
latent_model = negbin,
observation_model = tspan) tspan
We create training data by subsetting the full data to match tspan
.
= @chain south_korea_data begin
training_data @set _.y_t = _.y_t[first(tspan):last(tspan)]
@set _.dates = _.dates[first(tspan):last(tspan)]
end
As a last step before fitting the model, we generate a new Turing.jl
model using generate_epiaware
, the EpiProblem
we just defined, and our training data.
using Turing
= generate_epiaware(mishra, training_data) mishra_mdl
Following the same compositional pattern as our modelling DSL, we also support composing inference approaches using EpiMethod
, which, for example, can combine pre-sampling steps with sampling algorithms. We use the No U-Turns (NUTS) sampler with batched implementation of pathfinder variational inference [28] that returns the single pathfinder route with maximum estimated evidence lower bound to estimate the intialise the sampler.
using ReverseDiff
= EpiMethod(
inference_method =[ManyPathfinder(nruns=5, maxiters=100)],
pre_sampler_steps=NUTSampler(
sampler=0.9,
target_acceptance=AutoReverseDiff(compile=true),
adtype=1000,
ndraws=4,
nchains=MCMCThreads(),
mcmc_parallel=1000)
nadapts )
We now need to combine our inference approach with our generated model. We can do this using the apply_method
.
= apply_method(mishra_mdl,
mishra_results
inference_method,
training_data )
This gives the following posterior estimates for our model parameters.
summarystats(mishra_results.samples)
Summary Statistics parameters mean std mcse ess_bulk ess_tail ⋯ Symbol Float64 Float64 Float64 Float64 Float64 Flo ⋯ latent.ar_init[1] 0.0637 0.2005 0.0064 986.0895 451.4509 1. ⋯ latent.ar_init[2] 0.0307 0.1859 0.0051 1333.7954 870.6752 1. ⋯ latent.damp_AR[1] 0.6414 0.0844 0.0052 269.8427 403.3955 1. ⋯ latent.damp_AR[2] 0.1722 0.0462 0.0016 863.9201 726.0120 0. ⋯ latent.std 0.4962 0.0601 0.0039 229.7914 538.8243 1. ⋯ latent.ϵ_t[1] 0.7201 0.8415 0.0324 693.9246 604.9609 1. ⋯ latent.ϵ_t[2] 1.1493 0.8221 0.0300 747.9757 621.9631 1. ⋯ latent.ϵ_t[3] 1.5750 0.8353 0.0239 1240.9136 849.9880 1. ⋯ latent.ϵ_t[4] 1.2829 0.8452 0.0268 1005.2694 720.9283 1. ⋯ latent.ϵ_t[5] 2.4581 0.7840 0.0306 646.9801 596.0810 1. ⋯ latent.ϵ_t[6] 2.4990 0.7310 0.0280 671.5462 615.6128 0. ⋯ latent.ϵ_t[7] 1.3361 0.6435 0.0213 910.5136 645.6753 1. ⋯ latent.ϵ_t[8] 1.0511 0.6276 0.0259 596.9875 572.8984 1. ⋯ latent.ϵ_t[9] 0.1353 0.5847 0.0221 703.9782 603.8956 1. ⋯ latent.ϵ_t[10] -1.6547 0.6322 0.0306 428.1374 765.7226 1. ⋯ latent.ϵ_t[11] -1.9795 0.6651 0.0393 296.3930 434.6210 1. ⋯ latent.ϵ_t[12] -0.2925 0.5327 0.0179 902.4608 652.9039 1. ⋯ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱ 2 columns and 24 rows omitted
Figure 3 shows that the compositional model defined using our prototype system recovers the main finding in Mishra et al; that the \(R_t\) in South Korea peaked at a very high value (\(R_t \sim 10\) at peak, Figure 3 F) before rapidly dropping below 1 in early March 2020, with the model capturing both the epidemic trajectory and day-to-day variation in case counts (Figure 3 E).
EpiNow2: Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters
In this case study, we replicate a common EpiNow2
configuration using our prototype framework. EpiNow2
[29] is a widely-used R package for estimating the time-varying reproduction number and making forecasts for epidemics in real-time. EpiNow2
is built around three core modeling components that work together to reconstruct epidemic dynamics from delayed and noisy case count observations. These are: a discrete renewal equation to model how new infections arise from previous infections, weighted by the generation time distribution (the time between successive infections in a transmission chain); the delay between infection and case reporting, which typically involves multiple sequential delays including incubation periods and reporting lags; observation error to capture overdispersion and model misspecification, with additional modifiers such as day-of-week effects, and underascertainment to account for biases in reporting patterns. We recreate the core EpiNow2
functionality by reusing model components from Section 2.3 and Section 3.1 and composing them with new delay and temporal effect components.
Data
We use the example dataset included with the EpiNow2
R package, which contains daily confirmed COVID-19 cases from Italy during the first wave in 2020, from 22nd February to 30th June.
using Chain, CSV, DataFramesMeta, Dates
= "data/italy_data.csv"
datapath = @chain datapath begin
italy_data read(DataFrame)
CSV.= _.confirm, dates = Date.(_."date"))
(y_t end
Model
Our model reuses the renewal infection model and ARIMA(2,1,1) latent process from earlier sections, making it piecewise constant by week, whilst building an observation model that accounts for reporting delays and day-of-week effects using discrete convolutions and a partially pooled day of the week effect. Unlike Section 3.1 where the serial interval distribution was used (as in Mishra et al), EpiNow2
uses the generation time distribution, which represents the time between successive infections in a transmission chain. Mathematically, we represent the complete model as:
\[ \begin{aligned} \rho_1, \rho_2, Z_0, Z_{-1}, I_0, \sigma_Z, \sigma_\omega, \phi &\sim \pi(\cdot), \\ \epsilon_w & \sim \text{Normal}(0, \sigma_Z)~~ \text{i.i.d } \forall w, \\ Z_w - Z_{w-1} &= \rho_1 (Z_{w-1} - Z_{w-2}) + \rho_2 (Z_{w-2} - Z_{w-3}) + \epsilon_w,~~ w = 1, 2, 3, \dots\\ R_t &= \exp\left(Z_{\lfloor t/7 \rfloor}\right),~~ \text{Piecewise } R_t \text{ constant by week}\\ I_t &= R_t \sum_{s \geq 1} g_s I_{t-s}, ~~ t = 1, 2, 3, \dots, \\ S_t &= \sum_{s\geq1} I_{t-s} \eta_s,~~ \text{Incubation delay: infections to symptom onset}\\ D_t &= \sum_{s\geq1} S_{t-s} \xi_s,~~ \text{Reporting delay: symptom onset to case reports}\\ \tilde{\omega}_k &\sim \text{N}(0, \sigma^2_\omega),~ k = 0,\dots, 6,~~ \text{Day-of-week modifier} \\ \omega &= 7 \times \text{softmax}(\tilde{\omega}),\\ y_t &\sim \text{NegBin}( D_t \omega_{t~\text{mod}~7}, \phi),~~ \text{Link to data}. \end{aligned} \tag{1}\]
Where \(\pi\) represents prior distributions for model hyperparameters. \(g_t\) is the generation time distribution pmf, \(\eta_t\) is the incubation period distribution pmf, and \(\xi_t\) is the reporting delay distribution pmf (all obtained by double interval censoring continuous distributions [27]). \(S_t\) represents symptom onsets (infections convolved with incubation period) and \(D_t\) represents delayed case reports (symptom onsets convolved with reporting delay). The vector \(\omega = [\omega_0,\dots,\omega_6]\) encodes day-of-week effects on reporting.
Latent Model
We reuse the ARIMA(2,1,1) model arima211
defined in Section 2.3 but modify it to be piecewise constant by week using the broadcast_weekly
modifier.
= broadcast_weekly(arima211) weekly_arima211
This approach models the log reproduction number as constant within each week whilst allowing weekly changes to follow the ARIMA(2,1,1) process. In EpiNow2
, although stationary process representations of \(R_t\) are available, the default latent process is a differenced stationary (Matern) Gaussian Process; that is stationary only on its increments. Similarly, our piecewise constant weekly model applies the ARIMA(2,1,1) model to \(\log R_{w}\) at the weekly scale, then broadcasts this to daily values. The weekly piecewise constant formulation we use is an option in the EpiNow2
package but not part of the default model. Prior predictive samples (Figure 4 A) show the behaviour of the piecewise constant by week process.
Infection Generating Process
Unlike Section 3.1 where the serial interval distribution was used (as in Mishra et al), EpiNow2
uses the generation time distribution, which represents the time between successive infections in a transmission chain. Whilst the serial interval (time between symptom onsets) is often used as a proxy for the generation time because it is more readily observable, using the serial interval can be problematic [30] and is primarily done in practice when generation time estimates are unavailable, particularly early in an outbreak [31].
Here we follow the EpiNow2
getting started vignette to parameterise our model, this has a Gamma distribution with uncertain parameters for the generation time: shape ~ Normal(1.4, 0.48) and rate ~ Normal(0.38, 0.25). Since our prototype does not yet support uncertain delay distributions, we use the mean parameter values, giving a Gamma(1.4, 0.38) distribution with mean 3.68 days
= Gamma(1.4, 1 / 0.38)
gen_time_dist = EpiData(gen_distribution=gen_time_dist) epinow2_data
We use a similar renewal
structure as we did in Section 3.1, updating it to use the generation time distribution rather than the serial interval.
= Normal(log(1.0), 2.0)
initialisation_prior = Renewal(epinow2_data; initialisation_prior=initialisation_prior) renewal_gt
As before, to demonstrate the infection model independently, we supply a fixed \(R_t\) trajectory that decreases from 3 to 0.5 over 50 days. Figure 4 B shows prior samples from the renewal process conditional on this \(R_t\) trajectory.
Observation Model
We build the observation model through the composition of modular components reusing the negative binomial link observation model (negbin
) from Section 3.1, and then layer additional modelling components on top. We start by adding a day of the week ascertainment model using the ascertainment_dayofweek
helper function.
= ascertainment_dayofweek(negbin;
dayofweek_negbin =HierarchicalNormal(std_prior=HalfNormal(1.0))
latent_model )
Unpacking this helper function reveals a nested stack of modelling constructions. First, transformation and broadcasting
function broadcast_dayofweek(model::AbstractTuringLatentModel; link = x -> 7 * softmax(x))
transformed_model = TransformLatentModel(model, link)
return BroadcastLatentModel(transformed_model, 7, RepeatEach())
end
which uses the TransformLatentModel
and BroadcastLatentModel
structs to dispatch the transformation \(\omega = 7 \times \text{softmax}(\tilde{\omega})\) and then broadcast this along the time series.
Second is the linkage of the day of week model to the latent infections time series as a multiplicative ascertainment process, which produces
@model function EpiAwareBase.generate_observations(obs_model::Ascertainment, y_t, Y_t)
@submodel expected_obs_mod = generate_latent(
obs_model.latent_model, length(Y_t)
)
expected_obs = obs_model.transform(Y_t, expected_obs_mod)
@submodel y_t = generate_observations(obs_model.model, y_t, expected_obs)
return y_t
end
In addition to ascertainment, we also need to model the incubation period (infection to symptom onset) and a reporting delay (symptom onset to case reporting). Again following the EpiNow2
getting started vignette, the incubation period uses a LogNormal distribution with uncertain parameters: meanlog ~ Normal(1.6, 0.064) and sdlog ~ Normal(0.42, 0.069), giving a mean of 5.41 days at the parameter means. The reporting delay uses a LogNormal with meanlog = 0.58 and sdlog = 0.47, giving a mean of 1.99 days.
Since our prototype does not yet support uncertain delay distributions, we use the mean parameter values:
= LogNormal(1.6, 0.42)
incubation_distribution = LogNormal(0.58, 0.47) reporting_distribution
The LatentDelay
struct uses double interval censoring [27] to convert continuous delay models into pmfs and stacks with the temporal modifier model. We compose two delay layers sequentially:
= LatentDelay(dayofweek_negbin, incubation_distribution)
incubation_dayofweek_negbin = LatentDelay(incubation_dayofweek_negbin, reporting_distribution) delay_dayofweek_negbin
This code demonstrates component reuse (base negbin
from Section 3.1), layered composition (adding day-of-week effects via ascertainment_dayofweek
), and sequential composition (adding two delay layers via nested LatentDelay
). Note that since these are deterministic delays, we could more efficiently compute this by first convolving the two delay distributions or by automating this using the strata-in-strata approach with multiple dispatch. Figure 4 C demonstrates how these delay and day-of-week effects work together to transform a latent infection signal.
Fitting to Data
We compose these modelling subcomponents into one EpiProblem
model. The delay padding accounts for the combined length of the incubation period and reporting delay pmfs, allowing the model to properly account for infections that have occurred but not yet been observed by the end of the training period.
= length(incubation_dayofweek_negbin.rev_pmf)
incubation_pmf_length = length(delay_dayofweek_negbin.rev_pmf)
reporting_pmf_length = incubation_pmf_length + reporting_pmf_length
delay_padding
= (1, 40 + delay_padding)
tspan = EpiProblem(epi_model=renewal_gt,
epinow2 =weekly_arima211,
latent_model=delay_dayofweek_negbin,
observation_model=tspan) tspan
This EpiProblem
uses the renewal model with generation time data (renewal_gt
) and the piecewise constant by week ARIMA(2,1,1) latent model (weekly_arima211
). The observation model reuses negbin
from Section 3.1, layering on delays and day-of-week effects.
We filter the data as before to the timespan of interest.
= @chain italy_data begin
italy_training_data @set _.y_t = _.y_t[first(tspan):last(tspan)]
@set _.dates = _.dates[first(tspan):last(tspan)]
end
We again construct a Turing model using generate_epiaware
.
= generate_epiaware(epinow2, italy_training_data) epinow2_mdl
We reuse the same inference_method
defined in Section 3.1..
= apply_method(epinow2_mdl,
epinow2_results
inference_method,
italy_training_data )
Here is the summarised posterior:
summarystats(epinow2_results.samples)
Summary Statistics parameters mean std mcse ess_bulk ess_tail ⋯ Symbol Float64 Float64 Float64 Float64 Float64 ⋯ latent.latent_init[1] 0.3094 0.1605 0.0068 556.2173 662.0437 ⋯ latent.ar_init[1] 0.0882 0.1504 0.0057 702.2536 693.7840 ⋯ latent.ar_init[2] 0.0235 0.1496 0.0057 693.3706 574.9652 ⋯ latent.damp_AR[1] 0.3060 0.1662 0.0058 723.8216 376.9081 ⋯ latent.damp_AR[2] 0.1017 0.0504 0.0018 669.8727 357.4515 ⋯ latent.θ[1] 0.0369 0.1860 0.0066 795.1720 711.4211 ⋯ latent.std 0.1926 0.0550 0.0022 572.2796 550.1314 ⋯ latent.ϵ_t[1] -1.5427 0.5405 0.0254 463.7243 652.2106 ⋯ latent.ϵ_t[2] -1.3429 0.5853 0.0220 732.8516 639.1567 ⋯ latent.ϵ_t[3] 0.5980 0.5590 0.0208 726.2907 717.0990 ⋯ latent.ϵ_t[4] 0.5498 0.5383 0.0187 852.8721 671.5520 ⋯ latent.ϵ_t[5] -0.2135 0.7120 0.0223 1029.9857 760.0157 ⋯ latent.ϵ_t[6] -0.0362 1.0194 0.0329 961.9357 688.8595 ⋯ init_incidence 6.2263 0.4632 0.0186 640.2673 639.1262 ⋯ obs.DayofWeek.std 0.1528 0.0635 0.0031 383.5912 604.3721 ⋯ obs.DayofWeek.ϵ_t[1] 0.0557 0.5364 0.0241 498.0382 670.2969 ⋯ obs.DayofWeek.ϵ_t[2] -0.4793 0.5410 0.0255 467.0653 439.5359 ⋯ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱ 2 columns and 6 rows omitted
We see the model has converged and the summary statistics are similar to Section 3.1. Figure 4 shows that the compositional model recovers similar \(R_t\) dynamics to Section 3.1 (Figure 4 F), whilst explicitly accounting for reporting delays (Figure 4 D) and day-of-week effects that help disentangle true transmission changes from reporting artifacts (Figure 4 E).
Contemporary statistical inference for infectious disease models using Stan
In this vignette, we’ll demonstrate how to use EpiAware
in conjunction with the SciML ecosystem [10] to replicate Contemporary statistical inference for infectious disease models using Stan Chatzilena et al. 2019.
This case study is currently being developed separately in case-study-3.qmd
and will be integrated once it follows the standard case study format.
Discussion
This paper has demonstrated that compositional approaches can address barriers in epidemiological modelling. We presented a prototype that enables “LEGO-like” model construction, maintaining the statistical rigour of joint models whilst providing the flexibility of pipeline approaches. The autoregressive model example illustrated how complex models emerge from simple component combinations using the struct-in-struct pattern. Our three case studies, which were based on previous studies [15–17], demonstrate model composability for a range of problems and using different underlying methods.
A strength of our prototype approach is its modular design, which enables faster model development and component reuse, whilst also facilitating comparison of modelling assumptions without large reimplementation efforts. The approach reduces implementation barriers for methodological advances and enables researchers to embed new components within existing models more easily. Standardised interfaces should also allow large language models to compose models programmatically with validation methods ensuring correctness. However, a clear limitation is that this is just a prototype and not designed for ongoing use or adoption. For example, we only partially implemented support for automated mappings between latent and observed states, which remains challenging to do well. Similarly we only added partial support for partial pooling approaches with no infrastructure in place to support users in using this functionality in our DSL layer. The current component library also remains limited to basic epidemiological patterns. Finally, the current struct manipulation tooling cannot coordinate updates of related parameters such as model order and corresponding priors without manual intervention, requiring either avoiding this pattern or developing additional solutions. Whilst the prototype implementation has limitations, the three real-world case studies presented here establish the viability of compositional approaches for practical epidemiological problems. On top of this, we have layed the ground work for future work and given clear requirements that future composable frameworks should seek to meet to address the needs of applied infectious disease modelling. Another limitation is the learning curve required for researchers familiar with monolithic approaches to adopt this compositional paradigm. However, as they can embed elements of our system within their existing models and as they can implement their own compositional elements with minimal understanding of the overall architecture these barriers to use should be mitigated. Computational overhead from abstraction layers is also a concern, the impact of this overhead and potential optimisations remain unclear. However, Turing.jl
[8] and Julia [19] are well positioned for optimisation with new auto differentiation backends like Enzyme [32] and Mooncake [33], which could reduce computational costs in future implementations. Similarly, ongoing work in the Turing.jl
ecosystem should be able to identify and resolve any bottlenecks in performance that do emerge. A strength of the Julia [19] ecosystem for compositional work is that domain experts develop specialised tools that interoperate through shared interfaces, enabling innovation across projects. However, this distributed development also creates challenges as it can be difficult to locate appropriate tools and unclear where responsibility lies when issues emerge. Users may not know whether problems originate in Turing.jl
, our prototype DSL, or elsewhere in the ecosystem and even once located it can be difficult to determine who is responsible for resolving them. There are also additional technical barriers within the Turing.jl
ecosystem. The Turing PPL DSL is not fully stable and since developing this prototype the @submodel
macro has changed syntax and become a function with breaking changes in how it handles prefixes and in how it manages variables within a submodel, meaning our current prototype implementation is not compatible with the latest Turing.jl
release. However, some of these changes do allow for potentially more flexible models as they reduce the difference between specifying something as a distribution and as a submodel. Other recent changes in Turing.jl
such as the primary workflow for observed data being to condition and fix with no argument for observations impact how data is conditioned on the model, which would mean not having to pass observations through layers of the model and so would make building composable models much easier. Turing.jl
also has limited handling of numerical instability and so things like large counts being simulated can cause errors, which makes evaluating and developing models frustrating. An advantage of the Julia [19] ecosystem for a composable modelling tool built on top of a PPL is the access to the full features of a programming language. However, in practice some of these utilities, such as profiling tools can be frustrating and difficult to use with Turing.jl
when compared to similar tools offered in other ecosystem because the probabilistic programming language abstractions and macro expansions obscure the relationship between source code and runtime behaviour, making it challenging to identify performance bottlenecks in model components. The ecosystem needs further development in areas such as debugging tools that can trace through model execution whilst preserving the semantic structure of probabilistic models, better error messages that relate numerical issues to model specification rather than low-level implementation details, and standardised approaches for model validation and testing. The EpiMethod
approach for chaining inference (combining pre-sampling like Pathfinder [28] with sampling like the No U-Turn Sampler (NUTS) [22]) shows promise but should be a Turing.jl
general solution rather than epi-specific because many domains face similar challenges of difficult posterior geometries where adaptive warmup strategies could improve both efficiency and reliability of inference across the broader probabilistic programming ecosystem.
Our prototype could draw more heavily on category theory [fong2018seven?], which provides formal mathematical foundations for composability through operadic structures that guarantee valid model composition whilst maintaining modularity. The AlgebraicJulia ecosystem demonstrates how these theoretical foundations can be operationalised into software, with AlgebraicPetri.jl
[9] and AlgebraicDynamics.jl
[34] implementing operadic composition through structured cospans and wiring diagrams. These provide explicit graphical syntax where boxes represent model components and wires represent connections between them, with the categorical structure ensuring that composed models remain mathematically valid. AlgebraicJulia approaches offer stronger formal guarantees about compositional correctness than our type-based interfaces, with explicit separation of structural syntax from computational semantics enabling multiple interpretations of the same compositional structure. The SciML ecosystem [10] takes a different symbolic-numeric approach, where ModelingToolkit.jl
[11] and Catalyst.jl
[12] use acausal equation-based modelling with automated symbolic transformations to support mixed equation types, achieving substantial performance improvements through automated parallelisation, sparsity detection, and equation simplification. However, both category theory and symbolic-numeric frameworks focus primarily on dynamical systems with no native support for probability distributions as first-class objects, requiring external tools like Turing.jl
for Bayesian inference and additional layers to connect symbolic specifications to observational data with measurement error. Our probabilistic programming foundation offers more direct support for statistical inference and uncertainty quantification, supports a broader range of modelling approaches, but lacks the formal compositional guarantees of category theory approaches and automated optimisations of symbolic-numeric frameworks.
Alternative probabilistic programming languages could support similar compositional approaches. Gen.jl
[35] focuses on programmable inference with explicit control over inference strategies, supporting composition through its combinator-based approach to building generative models. However, Gen operates more independently from the broader Julia ecosystem, maintaining its own distribution types rather than using Distributions.jl
[21], which negates a key strength of Julia approaches where packages interoperate through shared interfaces. Despite this, Gen-based tools demonstrate useful functionality for epidemiological modelling, such as AutoGP.jl
[36], which implements real-time inference and automated composition of Gaussian process kernels that could be useful for modelling time-varying epidemiological parameters. Genify [37] provides an approach that translates Julia code into Gen models, potentially easing adoption by allowing modellers to write in familiar Julia syntax whilst accessing Gen’s programmable inference capabilities. Python-based probabilistic programming languages such as NumPyro [38] built on JAX [39] could enable similar composability [40], with potential advantages from JAX’s focus on efficiency and GPU scaling. However, the smaller JAX-specific ecosystem compared to general Python packages, JAX’s optimisation for neural network tasks, and Python PPLs’ more programmer-oriented syntax that diverges from mathematical notation may create barriers for epidemiological modellers. Julia [19] supports GPU computing with ongoing work to expand integration into Turing.jl
[8]. JuliaBUGS.jl
[41] offers an alternative Julia-based PPL with BUGS syntax that also builds off of Distributions.jl
. Unlike imperative probabilistic programming languages such as Turing.jl
, JuliaBUGS.jl
uses a declarative graph-based approach where users specify directed graphical models by explicitly stating conditional dependencies between variables. This graph-based representation offers several advantages including clearer understanding of dependencies, more transparent model structure and assumptions, and efficient inference through algorithms that leverage the model’s graphical structure to identify conditional independence relationships. However, the declarative approach trades off the flexibility of Turing.jl
’s imperative style, which allows procedural code with loops and conditionals that can be more expressive for certain complex modelling patterns. Building our domain-specific language on Distributions.jl
rather than Turing.jl
could enable compatibility with multiple PPLs including JuliaBUGS.jl
whilst trading off Turing.jl
’s expressive submodel interface.
Areas for future work include expanding the component library to address epidemiological applications across multiple scales and data types. More work is also needed to allow for modifying deeply nested model specifications without the user needing to be aware of the structure of the nested model. Methodological advances are also needed including for the joint estimation of interdependent epidemiological parameters, and then integration of individual and population-level observations. Alternative approaches also need to investigated, particularly more formal category theory based methods that use operadic composition for mathematically rigorous hierarchical model construction, as demonstrated in AlgebraicPetri.jl
and AlgebraicDynamics.jl
[9]. Symbolic-numeric frameworks like ModelingToolkit.jl
[11] and Catalyst.jl
[12] offer potential performance improvements through automatic optimisation and code generation. However, these approaches require generalisation to explicitly model probabilities and support a range of different modelling approaches which are both needed for infectious disease modelling applications. An alternative potential abstraction approach would be to build the domain-specific language on Distributions.jl
rather than Turing.jl
, which would enable compatibility with multiple probabilistic programming languages like JuliaBUGS
whilst trading off the expressiveness of Turing’s submodel interface. Further investigation of Gen.jl
[35], Genify [37], and JuliaBUGS.jl
[41] is needed, as Gen’s programmable inference capabilities and tools like AutoGP.jl
[36] may overcome some limitations of Turing.jl
, Genify’s metaprogramming bridge could ease adoption by allowing modellers to write familiar Julia code, and JuliaBUGS’s graph-based approach could enable more efficient inference through explicit model structure. Performance optimisation through parallelisation and approximate inference methods, along with integration bridges to existing epidemiological software ecosystems, will be needed for practical adoption. The compositional framework creates opportunities for large language model integration. Language models could serve as model construction agents, using a composable framework to construct epidemiological models from component libraries [18]. The explicit structure and validation tools of a composable framework should enable language models to reason about model design and propose structural adaptations more easily and with less room for error. Another advantage of our approach for large language model assisted model construction is that the high-level component interface reduces context and reasoning requirements, enabling deployment of smaller models on local compute in resource-constrained settings.
A complete implementation of the prototype we present here could improve real-time analysis of infectious disease dynamics by enabling “LEGO-like” assembly of epidemiological components. This prototype establishes the feasibility of integrating diverse expertise whilst maintaining statistical rigour, addressing limitations of current modelling approaches. Such frameworks are also likely key for enabling, and increasing the robustness of, large language model assisted model construction, especially in resource-constrained settings where such capabilities could prove most valuable. Given the unpredictable nature of future infectious disease threats such adaptable modelling capabilities that can incorporate diverse data sources and domain expertise are needed for future public health decision making.
Acknowledgements
Poppy the dog for growling at the right times.