A Prototype for Compositional Probabilistic Infectious Disease Modelling
Recent outbreaks of Ebola, COVID-19 and mpox have demonstrated the value of modelling for synthesising data for rapid evidence to inform decision making. Effective models require integration of expert domain knowledge from clinical medicine, environmental science, behavioural research, and public health to accurately capture transmission processes, yet current modelling approaches create barriers to this integration. Methods used to synthesise available data broadly fall into pipeline approaches that chain separate models together, or joint models that are often monolithic and difficult to adapt. These barriers have prevented advances across multiple settings where models could have provided actionable insights. Composable models where components can be reused across different contexts and combined in various configurations whilst maintaining statistical rigour could address these limitations. Beyond enabling modellers to build models more efficiently, standardised component interfaces provide structured environments for large language models to compose and validate epidemiological models. In this work, we start by outlining the key requirements for a composable infectious disease modelling framework and then present a prototype that addresses these requirements through composable epidemiological components built on Julia’s type system and Turing.jl. Our approach enables “LEGO-like” model construction where complex models emerge from composing simpler, reusable components. Through three case studies using the prototype, we show how components can be reused across different models whilst maintaining statistical rigour. The first replicates a COVID-19 analysis for South Korea using renewal processes with time-varying reproduction numbers. The second extends these components with reporting delays and day-of-week effects for real-time nowcasting applications. The third integrates ODE solvers for compartmental disease transmission models applied to influenza outbreak data. We explore other potential options and compare them to our proposed approach. The prototype demonstrates promise but future work is needed to solve remaining composability challenges, expand the component library, integrate bridges to existing epidemiological software ecosystems, and explore opportunities for large language model assisted model construction in resource-constrained settings.
1 Center for Forecasting and Outbreak Analytics; Centers for Disease Control and Prevention, United States of America
2 Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, United Kingdom
✉ Correspondence: Sam Abbott <sam.abbott@lshtm.ac.uk>
Introduction
Modelling infectious disease dynamics can be a valuable tool for synthesising information and developing quantitative evidence for understanding and control [1]. The utility of such models lies in their ability to combine expert knowledge about transmission with the often complex ways in which infections are ultimately observed - from environmental surveillance [e.g. in wastewater, 2] and by-products of clinical management [e.g., hospital admissions and bed occupancy, 3] to dedicated transmission studies [e.g., in households, 4]. However, transmission mechanisms and observation processes vary across pathogens and settings, and, consequently, models are usually tailored to particular diseases and datasets. This hampers transfer of learning. For example, a model developed for understanding the spread of SARS-CoV-2 via observations in wastewater would not have carried over to subsequently studying the 2022 global outbreak of mpox, where contact structure and specific behaviours played a key role and would have needed to be represented explicitly in order to obtain correct insights.
As a consequence, modellers often have to re-implement model components in new models used for different situations, even when underlying concepts (e.g. of how population-level infections manifest as pathogen concentrations in wastewater) are, in principle, the same. This repeated re-implementation is wasteful and time-consuming at moments when evidence might be urgently needed [5]. It also limits opportunities for collaboration and co-development of robust methods that others can re-use, and raises technical barriers to building state-of-the-art models, as individual modellers must understand and implement all parts of the relevant transmission and observation models. As an example, analyses during the COVID-19 pandemic demonstrated that cross-sectional viral-load measurements can inform epidemic dynamics [6], yet this approach has not found its way into most models used to inform policy in real time, even where such data were routinely available [7].
Probabilistic programming languages (PPLs) such as Stan or LibBi [8,9], and modelling frameworks such as pomp and monty [king16_statistical;fitzjohn25_monty], have transformed how infectious disease models are fitted to data. These tools enable statistically principled inference with probabilistic reasoning across a wide range of applications. However, they are designed for implementing complete models and do not provide straightforward ways to share or reuse model components. The practical unit of reuse therefore remains the whole model or its codebase, not the underlying epidemiological concepts. Building on these frameworks, several efforts have aimed for greater generality within specific modelling paradigms such as flexible statistical models for real-time estimation [10,11]. Yet these efforts, too, have encountered clear limits of extensibility: extending them requires embedding new data types or processes within a single, expanding codebase, which becomes harder to maintain and adapt as complexity grows. To work around these constraints, some analyses have adopted multi-stage or “pipeline” approaches, in which separate models are fitted sequentially and their outputs passed as inputs to subsequent stages [12]. This allows different data sources or processes to be analysed independently but prevents proper propagation of uncertainty and can introduce biased estimates compared with fully joint models [13].
Recent developments in computational statistics and scientific computing demonstrate the potential for composable modelling, in which components can be reused across contexts and combined in different configurations whilst maintaining statistical rigour. Advances in Turing.jl have introduced submodel interfaces that enable composable probabilistic programming, offering a pathway for epidemiological model composition, though initial epidemiological applications revealed interoperability challenges [14,15]. Category theory provides one mathematical framework for this composability through operadic composition (a mathematical framework for composing operations hierarchically where operations can have multiple inputs and be combined according to formal rules) in hierarchical model construction, as applied in the AlgebraicJulia ecosystem [16]. Alternative approaches to compositional modelling include the SciML ecosystem’s [17] symbolic-numeric framework, where ModelingToolkit.jl [18] and Catalyst.jl [19] use acausal equation-based modelling with automated symbolic transformations to support mixed equation types including differential-algebraic equations, partial differential equations, and stochastic differential equations through unified interfaces. HydroModels.jl [20] demonstrates compositional hydrological modelling with differentiable neural-enhanced components, whilst SpeedyWeather.jl uses an interactive domain-specific language (a programming language tailored to a specific application domain that uses terminology and concepts familiar to domain experts rather than general programming constructs) approach for “LEGO-like” atmospheric modelling with modular component assembly [21]. The key insight underlying these approaches is the separation of structural syntax, which defines valid compositions, from computational semantics, enabling modularity and independence of components whilst maintaining mathematical rigour.
This paper presents a prototype that combines the modularity of pipeline approaches with the statistical rigour of joint models through composable epidemiological components. Our approach enables “LEGO-like” model construction through standardised interfaces similar to those used in SpeedyWeather.jl [21]. The prototype supports composability beyond ordinary differential equations, accommodating mixed equation types and the potential for different computational backends. We implement this as a domain-specific language designed to be distributed as optional package extensions [22]. We demonstrate our approach using an autoregressive model example to illustrate the proposed compositional pattern and component swapping capabilities. Through three case studies using the prototype, we show how components can be reused across different models: the first is inspired by [23]; the second reuses components from the first, along with new elements, inspired by [11]; the third is inspired by [24], again reusing components, alongside the use of an ODE. Finally, we discuss alternative design approaches, evaluate the strengths and limitations of our compositional approach, and identify key areas for future development.
Prototype Implementation
Requirements for Composable Infectious Disease Modelling
Modelling infectious disease dynamics requires quantifying uncertainty at every level because decisions must account for incomplete knowledge about individual infection risk, transmission dynamics, and observation processes. A clear separation between distinct model components, infection processes, observation processes, and latent dynamics, is also key, as this allows reasoning about each component separately. Because diseases affect populations heterogeneously across age, location, and risk groups, the framework needs to support arbitrary stratification schemes for all components. These stratified models must also remain data-agnostic, as this allows a model to be generalised to different datasets, tested based on just its prior specification, and used for forecasting. Similarly, the bookkeeping of supporting multiple strata needs to be abstracted from the user to make the system easier to use, but at the same time users need to be able to model relationships between strata to support partial pooling of parameters in sparse data settings. To allow for models to be validated, the framework must support nesting models within models and programming over model structure itself, allowing simple components to compose into sophisticated models while remaining individually interrogable for debugging, validation, and mechanistic understanding. Model specifications must serve dual purposes, supporting both forward simulation for prior predictive checks and forecasting, and inference when conditioned on observed data, without requiring separate implementations. This compositional approach requires a clear, concise modelling language so that it can be used by a wide pool of users and so that model specifications can be written quickly but with clarity. Supporting modern inference methods is important so that complex models can be fitted, and this necessitates gradient computation throughout via automatic differentiation. It is also important to allow for a wide range of inference methods so that the best approach for a given model and data combination can be used. This means supporting abstract backends that seamlessly switch between inference approaches.
We also need model components that encapsulate both structure and prior distributions so that domain experts can contribute specialised knowledge, a virologist’s understanding of within-host dynamics, an epidemiologist’s knowledge of contact patterns, a clinician’s insights into disease progression, without reviewing the entire modelling framework. Standardised interfaces between components are needed to allow individual components to work together, to support handling of multiple strata, and to allow for proper uncertainty propagation. Such interfaces also enable large language models to serve as model construction agents, composing models from component libraries whilst validation methods ensure statistical rigour [25]. As there is a range of different potential ways to express infectious disease models, including ordinary differential equations, agent-based models, network models, stochastic processes, and discrete time models, these all need to be supported both independently and in combination. Importantly, the design must enable incremental adoption without requiring complete rebuilds of existing models, and components should remain functional as standalone tools outside the compositional framework to maximise their utility and adoption. Finally, we need a framework that can be composed with out-of-domain approaches and expertise, such as neural networks, Gaussian processes, and other machine learning approaches.
Our approach
Meeting these requirements requires programming with probabilities, for which probabilistic programming languages are designed. We also need a probabilistic programming language that supports automatic differentiation for modern inference, the ability to program over model structure itself to enable model nesting and composition, and access to as wide an ecosystem as possible to avoid lock-in and enable integration with existing scientific computing tools. As far as we are aware, only probabilistic programming languages built in Julia [26] provide the metaprogramming capabilities needed to create domain-specific abstractions that can handle arbitrary stratification, standardised interfaces between components, and programming over the model structure. Metaprogramming is the ability to write code that generates or manipulates other code, enabling the creation of custom domain-specific syntax and automated code transformations. Among Julia’s options, Turing.jl [15] best meets our requirements with mature submodel support for nesting models within models, extensive inference algorithm choices, and its implementation as a light abstraction layer (a simplified interface that hides complex implementation details whilst enabling access to underlying functionality) on top of the wider Julia ecosystem. Additional benefits of Julia include eliminating the two-language problem, leveraging multiple dispatch (programming paradigm where function behaviour is determined by the types of all arguments rather than just the first, enabling automatic selection of the most specific implementation for given inputs) for clean component composition, and accessing the mature SciML ecosystem [17] for differential equations and other scientific computing tools.
Our approach uses a two-layer architecture with a high-level domain-specific language for epidemiological modelling and a low-level implementation using Turing.jl. Importantly, we are not locked into this choice, as the DSL is agnostic to the backend used. This separation enables incremental adoption without rebuilding existing models to use our DSL, with all components remaining functional as standalone tools outside the compositional framework. The domain-specific language layer provides clear, concise model specification using epidemiological concepts, enabling domain experts to contribute components encapsulating their specialised knowledge without understanding the low-level details of all framework elements. The backend layer aims to handle the automated bookkeeping of stratification, interface validation, and uncertainty propagation whilst supporting multiple inference approaches and automatic differentiation options by leveraging the Turing.jl and wider Julia ecosystems. This structure enables integration of diverse data sources by allowing researchers to add a submodel with its own latent and observation process to an existing model, for example adding wastewater measurements to a case-based renewal model or clinical biomarkers to transmission dynamics, without modifying the core infection model.
Figure 1 demonstrates how these compositional principles could enable component reuse across different epidemiological applications. Three example applications, wastewater surveillance, biomarker modelling, and early outbreak analysis, each compose models from components of different types. The schematic also highlights two examples of component reuse. The incubation period model appears in all three applications, and the within-host viral kinetics model is shared between biomarker modelling and wastewater surveillance. This highlights how components developed for one application could be incorporated into others when they share common underlying processes.
Domain-Specific Language Structure
Our prototype domain-specific language builds on Julia’s type system to enable composable epidemiological modelling through two key design patterns. First, abstract types define interfaces that implementations must follow, analogous to contracts specifying what operations a model component must support rather than how it implements them. All model components inherit from a parent AbstractModel type, establishing a common foundation whilst allowing specialised behaviour through subtypes. Second, structures (data containers that group related variables and their types together) contain other structures as fields, a struct-in-struct pattern that allows complex models to be built by nesting simpler components. This pattern enables models to be assembled like building blocks whilst maintaining clear boundaries between different epidemiological processes.
We organise model components into three abstract type hierarchies, each of which inherits from AbstractModel, corresponding to distinct epidemiological processes. AbstractEpiModel represents infection generation processes such as renewal models or ordinary differential equation transmission dynamics. AbstractLatentModel captures time-varying parameters and unobserved processes such as changing reproduction numbers or reporting rates, implemented through structures like autoregressive processes, random walks, or moving averages. AbstractObservationModel links latent states to observed data by encoding measurement processes such as reporting delays, aggregation over time periods, and observation error distributions. These structures are data-agnostic, specifying what to do when they encounter data rather than containing data themselves, making model definitions reusable across different datasets and scenarios. Each hierarchy supports multiple concrete implementations that can be swapped to compare modelling assumptions whilst keeping other components fixed.
Models can compose across these hierarchies rather than being restricted to combining components within a single type. For example, an observation model can contain a latent process as a field to represent time-varying ascertainment, or wrap another observation model to add reporting delays. This cross-hierarchy composition extends the building block analogy to include more complex models. For instance, we can construct a delay convolution observation model, LatentDelay, using an underlying observation model and a delay distribution model.
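As a minimal sketch of this construction (assuming NegativeBinomialError can be constructed with default priors; the delay distribution here is illustrative):

using EpiAware, Distributions
# Sketch: wrap a negative binomial observation model in a LatentDelay so the
# expected signal is convolved with a (discretised) log-normal reporting delay.
delayed_negbin = LatentDelay(NegativeBinomialError(), LogNormal(1.6, 0.42))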
The EpiProblem structure is a top-level container assembling these components into a complete epidemiological model. It holds an infection process (epi_model), a latent process (latent_model), an observation process (observation_model), and a time span for inference or simulation. The latent model generates time-varying epidemiological parameters, the infection model uses these parameters to simulate disease transmission and generate infections, and the observation model links these latent infections to observed data through measurement processes. However, this is just the top-level structure: each submodel can itself contain any of these abstract types, enabling flexible composition such as observation models that include latent processes for time-varying ascertainment. Structures can be modified in place using tools like Accessors.jl, enabling both model iteration and patterns such as partially pooled models that update low-level priors based on grouping structures. The abstract type system enables shared methods across all model components, such as print methods (shown below) for displaying model specifications or functions for visualising model directed acyclic graphs. This approach also allows for the creation of mappings between submodels such as one to one, one to many, and many to many mappings so that, for example, a single infection process can be linked to multiple observations models by specifying a mapping model.
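As a brief sketch of this in-place modification pattern (using the ar2 AR(2) example defined below; the nested field names follow its printed structure):

using Accessors, EpiAware
# Sketch: tighten a prior nested two levels deep without redefining the struct;
# ar2 is the AR(2) example constructed below.
ar2_tight = @set ar2.ϵ_t.std_prior = HalfNormal(0.05)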
To demonstrate the structure and use of AbstractLatentModels, we start with an autoregressive order two (AR(2)) process, which mathematically is:
\[Z_t = \rho_1 Z_{t-1} + \rho_2 Z_{t-2} + \epsilon_t, \quad \epsilon_t \sim \text{Normal}(0, \sigma)\]
In our prototype DSL, this is defined using the AR struct. We use priors based on [27] which we will reuse in Section 3.1. Prior distributions are specified using Distributions.jl [28], the standard probability distributions package in the Julia ecosystem, which provides a unified interface for probability distributions that is interoperable with both our framework and Turing.jl.
using EpiAware, Distributions
ar2 = AR(;
damp_priors=[truncated(Normal(0.2, 0.2), 0, 1),
truncated(Normal(0.1, 0.05), 0, 1)],
init_priors=[Normal(0, 0.2), Normal(0, 0.2)],
ϵ_t=HierarchicalNormal(std_prior=HalfNormal(0.1))
)
This constructor has created the following struct definition.
ar2
AR(damp_prior = Product(Truncated(0.2, 0.2, 0.0, 1.0)),
init_prior = TuringScalMvNormal([0.0, 0.0], 0.2),
p = 2,
ϵ_t = HierarchicalNormal(mean = 0.0,
std_prior = HalfNormal(0.1),
add_mean = false))
Another common latent model is the moving average model
\[Z_t = \epsilon_t + \theta \epsilon_{t-1}, \quad \epsilon_t \sim \text{Normal}(0, \sigma)\]
The MA struct defines this in the same way that the AR struct did for the AR process.
ma1 = MA(;
θ_priors=[truncated(Normal(0.0, 0.2), -1, 1)],
ϵ_t=HierarchicalNormal(std_prior=HalfNormal(0.1))
)
A popular combination of these models is the autoregressive moving average (ARMA) model, which can have different orders for both the AR and MA components. An ARMA(2,1) can be defined as:
\[Z_t = \rho_1 Z_{t-1} + \rho_2 Z_{t-2} + \epsilon_t + \theta \epsilon_{t-1}\]
In our DSL this can be represented as a composition of the AR and MA structs by updating the AR error term:
using Accessors
arma21 = @set ar2.ϵ_t = ma1
The result of this step is an updated copy of the ar2 struct in which the definition of the moving average model is nested inside it.
arma21
AR(damp_prior = Product(Truncated(0.2, 0.2, 0.0, 1.0)),
init_prior = TuringScalMvNormal([0.0, 0.0], 0.2),
p = 2,
ϵ_t = MA(θ = Product(Truncated(0.0, 0.2, -1.0, 1.0)),
q = 1,
ϵ_t = HierarchicalNormal(mean = 0.0,
std_prior = HalfNormal(0.1),
add_mean = false)))
Similarly, autoregressive integrated moving average (ARIMA) models extend ARMA by adding differencing operations that transform the series
\[\Delta Z_t = Z_{t} - Z_{t-1}\]
So that the ARIMA(2,1,1) model is defined as an ARMA(2,1) model for the first order differences:
\[\Delta Z_t = \rho_1 \Delta Z_{t-1} + \rho_2 \Delta Z_{t-2} + \epsilon_t + \theta \epsilon_{t-1}\]
We compose the ARMA model with a differencing operation using the DiffLatentModel wrapper:
arima211 = DiffLatentModel(arma21, Normal(0, 0.2); d=1)
Alternatively, EpiAware provides an arima() constructor function that simplifies this specification.
Other latent models extend the modelling options through additive combination of models, multiplicative scaling, and piecewise processes. This approach enables representation of arbitrary latent processes through composition.
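For example, a hedged sketch of additive combination (CombineLatentModels and RandomWalk are assumed constructors from the prototype’s latent model library, with RandomWalk taking default priors):

# Sketch: an additive combination of a random walk trend and the AR(2) process.
trend_plus_ar2 = CombineLatentModels([RandomWalk(), ar2])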
Backend Implementation: Turing Interface
Our DSL translates to Turing models through generate functions that dispatch on abstract type hierarchies. Each abstract type (AbstractLatentModel, AbstractEpiModel, AbstractObservationModel) has a generate function interface for which concrete model types must implement a method to dispatch upon. When a generate function receives a model component, Julia’s multiple dispatch automatically selects the appropriate implementation based on the component’s type, producing Turing.jl [15] code using the @model macro. Unit tests are used to ensure concrete implementations satisfy interface requirements, enabling all components to interoperate correctly regardless of which specific model types are composed together. This approach means new model types integrate seamlessly without modifying existing code, and users can swap components to compare modelling assumptions whilst keeping other parts of the model fixed.
The generated Turing models serve dual purposes: simulation by sampling from prior distributions, or inference by conditioning on observed data and applying Turing’s suite of algorithms including gradient-based methods like the No U-Turn Sampler (NUTS) [29]. These are standard Turing models with no special constraints beyond those imposed by the Turing.jl framework itself, supporting all DynamicPPL.jl operations such as parameter fixing, model conditioning, and posterior predictive sampling. Generated components can be used directly as standalone models or nested within other Turing models using the @submodel macro, enabling incremental adoption where only parts of a model use the compositional framework whilst other parts use custom Turing.jl.
Many computational components are backend-agnostic, containing no probabilistic programming constructs and therefore portable to other frameworks. For example, we use a common pattern, accumulate_scan, which builds on the base accumulate function to model iterative temporal processes through step functions. These step functions are structs built on the AbstractAccumulationStep interface that implement a callable (a struct that can be invoked like a function) defining the single-step update rule. Mathematical utilities such as functions for converting between reproduction numbers and growth rates, discretising continuous delay distributions, and reparameterising observation error models are similarly backend-agnostic. This separation enables a package extension pattern where standard Julia packages provide domain-specific functionality whilst the compositional layer is added as an optional extension, allowing users to adopt the framework incrementally without requiring their entire workflow to commit to the compositional approach.
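As an illustration of this interface (our own example, not package code; we assume AbstractAccumulationStep is exported and that accumulate_scan accepts scalar state):

using EpiAware
# Sketch: a random walk step expressed as a callable struct implementing the
# AbstractAccumulationStep interface.
struct RWStep <: AbstractAccumulationStep end
(step::RWStep)(state, ϵ) = state + ϵ

# accumulate_scan iterates the step over a noise vector from an initial state.
z = accumulate_scan(RWStep(), 0.0, randn(10))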
Returning to our example from the DSL section, the autoregressive process can be mapped from its high-level representation to a Turing model using the generate_latent function and multiple dispatch, which generates the following Turing.jl model:
@model function EpiAwareBase.generate_latent(latent_model::AR, n)
p = latent_model.p
@assert n>p "n must be longer than order of the autoregressive process"
ar_init ~ latent_model.init_prior
damp_AR ~ latent_model.damp_prior
@submodel ϵ_t = generate_latent(latent_model.ϵ_t, n - p)
ar = accumulate_scan(ARStep(damp_AR), ar_init, ϵ_t)
return ar
end
The key line @submodel ϵ_t = generate_latent(latent_model.ϵ_t, n - p) enables composition by delegating to whatever error model was provided. The AR dynamics are implemented through a custom accumulation step that maintains the autoregressive state.
function (ar::ARStep)(state, ϵ)
new_val = dot(ar.damp_AR, state) + ϵ
new_state = vcat(state[2:end], new_val)
return new_state
end
This step function works with accumulate_scan to build the AR series by applying the autoregressive equation at each time step. The MA model has a similar structure with its own internal step function. The accumulate_scan pattern enables composable iteration steps, allowing complex processes like renewal models with susceptible depletion to be built by composing simple step operations. This design means we only need to write the single-step operation without worrying about the iteration process, making components more modular and reusable.
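For instance, a renewal step with susceptible depletion could be written in the same callable-struct style (an illustrative sketch, not the package implementation):

using LinearAlgebra
# Sketch: one renewal step that scales transmission by the remaining
# susceptible fraction; state carries recent infections and susceptibles.
struct DepletionRenewalStep{T}
    g_rev::Vector{T}  # reversed generation interval pmf
    N::T              # population size
end

function (step::DepletionRenewalStep)(state, logRt)
    infs, S = state.infs, state.S
    new_infs = (S / step.N) * exp(logRt) * dot(step.g_rev, infs)
    (infs = vcat(infs[2:end], new_infs), S = S - new_infs)
end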
The full ARIMA(2,1,1) model we defined in the DSL section can then be generated using the same approach, producing the following Turing.jl model:
@model function EpiAwareBase.generate_latent(latent_model::DiffLatentModel, n)
d = latent_model.d
@assert n>d "n must be longer than d"
latent_init ~ latent_model.init_prior
@submodel diff_latent = generate_latent(latent_model.model, n - d)
return _combine_diff(latent_init, diff_latent, d)
end
The DiffLatentModel’s @submodel diff_latent = generate_latent(latent_model.model, n - d) calls the ARMA model, which in turn calls its composed AR and MA components, then applies differencing through cumulative summing. This recursion through @submodel enables arbitrary composition while maintaining separation between components.
To demonstrate fitting these submodels, we first create a model that uses our ARIMA(2,1,1) model as a submodel, combining it with a normal observation error process.
using DynamicPPL, Turing, LinearAlgebra
@model function arima_with_obs(arima_spec, n_timesteps)
@submodel Z_t = generate_latent(arima_spec, n_timesteps)
σ_obs ~ truncated(Normal(0.0, 0.001), 0, Inf)
y_obs ~ MvNormal(Z_t, σ_obs * I)
return y_obs
end
We can then generate synthetic data by sampling parameters from the prior distribution with rand, fixing them with fix, and calling the model to simulate the observations. We first define the model.
n_timesteps = 20
gen_model = arima_with_obs(arima211, n_timesteps)
Then sample from it:
simulated_params = rand(gen_model)
Now that we have parameters, we can simulate some data using the generative model by fixing the random variables to the sampled parameters with the fix function. We can then call it, like any normal function, to get simulated observations for y.
fixed_model = fix(gen_model, simulated_params)
y_observed = fixed_model()
For inference, we condition the generative model on the simulated observations using the condition function or, as here, the equivalent | notation. Now that we have a model conditioned on data, we can fit it using our choice of approach supported by Turing.jl. Here we use the No-U-Turn Sampler (a popular variant of MCMC).
conditioned_model = gen_model | (; y_obs = y_observed)
chains = sample(conditioned_model, NUTS(), MCMCThreads(), 2000, 4)
We can then compare our posterior distributions to the true sampled values from our ARIMA(2, 1, 1) model using PairPlots.jl [30] with the CairoMakie.jl [31] backend for visualisation (Figure 2). The posterior distributions recover the simulated parameter values.
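As a sketch of this visualisation step (assuming PairPlots.jl’s support for MCMCChains objects):

using CairoMakie, PairPlots
# Sketch: pairwise posterior plots of the fitted chains, as in Figure 2.
fig = pairplot(chains)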
Case Studies
We demonstrate how our prototype compositional modelling DSL can recreate and extend existing epidemiological models. Each case study shows how complex models are built by composing reusable components, validating their behaviour through prior predictive checks, and fitting them to real data. The examples progress from simple renewal models to more complex observation processes that account for reporting delays and temporal effects.
All code and data for reproducing the analyses in this paper are available at: https://github.com/EpiAware/PrototypeCompositionalProbablisticInfectiousDiseaseModelling.
On the derivation of the renewal equation from an age-dependent branching process: an epidemic modelling perspective
In On the derivation of the renewal equation from an age-dependent branching process: an epidemic modelling perspective, Mishra et al (2020) [27] demonstrate the mathematical correspondence between age-dependent branching processes and time-since-infection epidemiological models, expressed as a renewal model with time-varying reproduction number \(R_t\). Renewal models use the renewal equation to model how new infections arise from previous infections, weighted by the generation time distribution (or serial interval) [32]. This is analogous to an autoregressive process where the autoregressive coefficients have epidemiological meaning rather than being estimated parameters. They show how solutions to the renewal equation, when combined with a negative binomial observation model, define a Bayesian hierarchical framework for inference on reported case time series data, demonstrating this on test-confirmed cases of COVID-19 in South Korea.
Data
Mishra et al used daily reported test-confirmed cases of COVID-19 in South Korea between January and July 2020. These data are curated by the covidregionaldata package, but we have saved the South Korean data locally.
using Chain, CSV, DataFramesMeta, Dates
datapath = "data/south_korea_data.csv"
south_korea_data = @chain datapath begin
CSV.read(DataFrame)
(y_t = _.cases_new, dates = _.date)
end
Model
Our model is inspired by Mishra et al and uses a log-scale time-varying reproduction number \(\log R_t\) modelled as an AR(2) process, which in turn specifies the latent infections \(I_t\) as a solution to the renewal equation conditional on the trajectory of \(\log R_t\). The latent infection process is then linked directly to reported cases \(C_t\) on matching days using a negative binomial link distribution. The key difference from Mishra et al is in the initialisation. They seed the renewal equation with importations (independent daily effects \(\mu_t \sim \text{Exponential}(0.5)\)), whilst we initialise by solving for the growth rate corresponding to the initial reproduction number and extrapolating backwards without allowing for ongoing importations.
\[ \begin{aligned} \rho_1, \rho_2, Z_0, Z_{-1}, I_0, \sigma, \phi &\sim \pi(\cdot), \\ \epsilon_t & \sim \text{Normal}(0, \sigma)~~ \text{i.i.d } \forall t, \\ Z_t &= \rho_1 Z_{t-1} + \rho_2 Z_{t-2} + \epsilon_t,~~ t = 1, 2, 3, \dots\\ R_t & = \exp(Z_t), \\ r & \text{ solves } G(r) = 1 / R_1, \\ I_{t} & = I_0 e^{rt},~~ t = 0, -1, -2, \dots, -n+1, \\ I_t &= R_t \sum_{s \geq 1} g_s I_{t-s}, ~~ t = 1, 2, 3, \dots, \\ y_t & \sim \text{NegBin}(I_t, \phi), ~~ t = 1, 2, 3, \dots. \end{aligned} \]
Where \(\pi\) is a prior distribution for the hyperparameters of the AR(2) model: the initial states \(Z_0, Z_{-1}\), the autoregressive coefficients \(\rho_1,\rho_2\) and the innovations standard deviation \(\sigma\), along with the initial condition for the latent infections \(I_0\) and the observation overdispersion \(\phi\). \(g_t\) is the generation distribution probability mass function (pmf). \(r\) is the growth rate determined by \(R_1 = \exp(Z_1)\) through the implicit relationship [33].
\[ G(r) = \sum_{j \geq 1} g_j \exp(- r j) = 1 / R_1. \]
This means that we determine the initial condition of the latent infections before \(t=0\), \(I_{-1}, I_{-2}, \dots, I_{-n+1}\), jointly with sampling \(R_1\), where \(n\) is the maximum support value of the \(g_t\) pmf.
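As a numerical sketch of this solve (the helper below is our own illustration; the prototype provides equivalent utilities for converting between reproduction numbers and growth rates, as noted in Section 2.4):

# Sketch: solve G(r) = 1/R₁ for r by bisection, given a discretised
# generation interval pmf g; assumes the root lies within (lo, hi).
function growth_rate(g, R1; lo=-1.0, hi=1.0, tol=1e-8)
    G(r) = sum(g[j] * exp(-r * j) for j in eachindex(g))
    f(r) = G(r) - 1 / R1
    while hi - lo > tol
        mid = (lo + hi) / 2
        # G is decreasing in r, so f is decreasing: move towards the sign change
        f(mid) > 0 ? (lo = mid) : (hi = mid)
    end
    return (lo + hi) / 2
end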
Latent Model
We reuse the AR(2) model ar2 defined in Section 2.3, which has the appropriate priors from Mishra et al. Prior predictive samples (Figure 3 A) show that a priori these priors assign a few percent chance of achieving very high \(R_t\) values, i.e. \(R_t \sim 10-1000\) is not excluded.
Infection Generating Process
The renewal equation requires a discrete generation time pmf \(g_t\). Our prototype provides a constructor that converts continuous distributions into the required discrete pmf using double interval censoring [34]. This is called inside the EpiData struct which we use to define the expected model specific input data. Mishra et al used a \(\text{Gamma}(6.5, 0.62)\) serial interval distribution,
truth_SI = Gamma(6.5, 0.62)
model_data = EpiData(gen_distribution=truth_SI)
Figure 3 D compares the discretised generation interval with the underlying continuous distribution. As expected, the discrete generation interval differs from the underlying continuous distribution due to the double censoring process.
As in the earlier example, we define our renewal model using a specialised struct, Renewal. This requires the discretised generation time (from model_data) and an initialisation prior. We use a lognormal prior for the initial latent infections as its skewed right tail allows some weight on large initial values.
log_I0_prior = Normal(log(1.0), 0.1)
renewal = Renewal(model_data; initialisation_prior=log_I0_prior)
This results in the following model structure.
renewal
Renewal(
data =
EpiAware.EpiInfModels.EpiData{Float64, typeof(exp)}([0.026663134095601056, 0.14059778064943784, 0.2502660305615846, 0.24789569560506844, 0.1731751163417783, 0.09635404000022223, 0.04573437575216367, 0.019313826994143808], 8, exp),
initialisation_prior = Normal(0.0, 0.1),
recurrent_step =
EpiAware.EpiInfModels.ConstantRenewalStep{Float64}([0.019313826994143808, 0.04573437575216367, 0.09635404000022223, 0.1731751163417783, 0.24789569560506844, 0.2502660305615846, 0.14059778064943784, 0.026663134095601056]))
To demonstrate the infection model independently, we define a fixed \(R_t\) trajectory that decreases from 3 to 0.5 over 50 days and pass this to the generate_latent_infs function which, like the generate_latent function, relies on multiple dispatch to construct the model defined by the struct it is passed.
Rt = [0.5 + 2.5 / (1 + exp(t - 15)) for t in 1:50]
renewal_mdl = generate_latent_infs(renewal, log.(Rt))
The implied distribution of \(I_t\) trajectories conditional on this \(R_t\) trajectory can then be sampled independently of other model components (Figure 3 B). As expected, this gives us a “classical” outbreak-like dynamic, with infection incidence initially increasing exponentially, the rate of growth slowing over time, and finally the infection incidence “turning over”.
The full infection generating process, that is the model defined in Section 3.1 without the link to case data, can be constructed by passing samples of \(Z_t\) into the renewal model sampler.
Observation Model
In Mishra et al the overdispersion parameter \(\phi\) sets the relationship between the mean and variance of the negative binomial errors. We default to a prior on \(\sqrt{1/\phi}\) (referred to as the cluster factor) because this quantity is approximately the coefficient of variation of the observation noise and is therefore easier to reason about a priori. A prior for \(\phi\) was not specified in Mishra et al, so we use \(\sqrt{1/\phi} \sim \text{HalfNormal}(0.1)\).
negbin = NegativeBinomialError(cluster_factor_prior=HalfNormal(0.1))
As with the latent model, we can generate a Turing.jl model conditional on a fixed latent infection trajectory. Here we simulate this to look like an outbreak with a symmetrical growth and decay phase.
I_t = [1000 * exp(-(t - 15)^2 / (2 * 4)) for t in 1:30]
negbin_mdl = generate_observations(negbin, missing, I_t)
Here we use a missing argument to indicate observed variables that are to be sampled rather than used to accumulate the log posterior density. Prior predictive samples (Figure 3 C) show the dispersion around the mean implied by our choice of prior.
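These prior predictive samples can be drawn by calling the generated model directly, a minimal sketch:

# Sketch: calling the Turing model simulates y_t from the prior, because the
# observations were passed as missing.
y_sim = negbin_mdl()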
Fitting to Data
We now compose the three model components into an EpiProblem, which defines the complete generative model for the time range tspan.
tspan = (45, 80)
mishra = EpiProblem(epi_model = renewal,
latent_model = ar2,
observation_model = negbin,
tspan = tspan)
We create training data by subsetting the full data to match tspan.
training_data = @chain south_korea_data begin
@set _.y_t = _.y_t[first(tspan):last(tspan)]
@set _.dates = _.dates[first(tspan):last(tspan)]
end
As a last step before fitting the model, we generate a new Turing.jl model using generate_epiaware, the EpiProblem we just defined, and our training data.
using Turing
mishra_mdl = generate_epiaware(mishra, training_data)
Following the same compositional pattern as our modelling DSL, we also support composing inference approaches using EpiMethod, which, for example, can combine pre-sampling steps with sampling algorithms. We use the No U-Turn Sampler (NUTS) with a batched implementation of pathfinder variational inference [35], which returns the single pathfinder path with the maximum estimated evidence lower bound and uses it to initialise the sampler.
using ReverseDiff
inference_method = EpiMethod(
pre_sampler_steps=[ManyPathfinder(nruns=5, maxiters=100)],
sampler=NUTSampler(
target_acceptance=0.9,
adtype=AutoReverseDiff(compile=true),
ndraws=1000,
nchains=4,
mcmc_parallel=MCMCThreads(),
nadapts=1000)
)
We now need to combine our inference approach with our generated model. We can do this using the apply_method function.
mishra_results = apply_method(mishra_mdl,
inference_method,
training_data
)
This gives the following posterior estimates for our model parameters.
summarystats(mishra_results.samples)
Summary Statistics
          parameters      mean     std    mcse   ess_bulk  ess_tail      ⋯
              Symbol   Float64 Float64 Float64    Float64   Float64   Flo ⋯
   latent.ar_init[1]    0.0476  0.1966  0.0064   940.8929  575.9168    1. ⋯
   latent.ar_init[2]    0.0262  0.1971  0.0063   969.7171  716.3177    1. ⋯
   latent.damp_AR[1]    0.5940  0.0890  0.0064   194.3703  443.1125    1. ⋯
   latent.damp_AR[2]    0.1755  0.0489  0.0020   625.3266  575.7129    1. ⋯
          latent.std    0.5112  0.0587  0.0039   228.5665  284.5891    1. ⋯
       latent.ϵ_t[1]    0.6144  0.8063  0.0393   421.9310  553.6828    1. ⋯
       latent.ϵ_t[2]    1.1076  0.8112  0.0280   841.7300  597.1298    1. ⋯
       latent.ϵ_t[3]    1.5452  0.8268  0.0280   877.0404  604.7987    1. ⋯
       latent.ϵ_t[4]    1.2637  0.8594  0.0250  1174.7336  732.3216    1. ⋯
       latent.ϵ_t[5]    2.5834  0.7666  0.0285   719.1471  627.2643    1. ⋯
       latent.ϵ_t[6]    2.6251  0.7247  0.0281   663.6743  521.7679    1. ⋯
       latent.ϵ_t[7]    1.5282  0.6653  0.0285   586.7519  424.1313    1. ⋯
       latent.ϵ_t[8]    1.2669  0.6133  0.0322   369.8274  594.3730    1. ⋯
       latent.ϵ_t[9]    0.3273  0.5657  0.0242   548.5812  613.5581    1. ⋯
      latent.ϵ_t[10]   -1.4987  0.5726  0.0335   288.9834  516.3740    1. ⋯
      latent.ϵ_t[11]   -1.9111  0.6380  0.0399   254.1133  496.7557    1. ⋯
      latent.ϵ_t[12]   -0.1963  0.4718  0.0206   605.6018  461.7635    1. ⋯
                   ⋮         ⋮       ⋮       ⋮          ⋮         ⋮      ⋱
                                            2 columns and 24 rows omitted
Figure 3 shows that the compositional model defined using our prototype system recovers the main finding in Mishra et al; that the \(R_t\) in South Korea peaked at a very high value (\(R_t \sim 10\) at peak, Figure 3 F) before rapidly dropping below 1 in early March 2020, with the model capturing both the epidemic trajectory and day-to-day variation in case counts (Figure 3 E).
EpiNow2: Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters
In this case study, we replicate a common EpiNow2 configuration using our prototype framework. EpiNow2 [36] is a widely-used R package for estimating the time-varying reproduction number and making forecasts for epidemics in real-time. EpiNow2 is built around three core modelling components that work together to reconstruct epidemic dynamics from delayed and noisy case count observations. These are: a discrete renewal equation to model how new infections arise from previous infections, weighted by the generation time distribution (the time between successive infections in a transmission chain); the delay between infection and case reporting, which typically involves multiple sequential delays including incubation periods and reporting lags; and observation error to capture overdispersion and model misspecification, with additional modifiers such as day-of-week effects and underascertainment to account for biases in reporting patterns. We recreate the core EpiNow2 functionality by reusing model components from Section 2.3 and Section 3.1 and composing them with new delay and temporal effect components.
Data
We use the example dataset included with the EpiNow2 R package, which contains daily confirmed COVID-19 cases from Italy during the first wave in 2020, from 22nd February to 30th June.
using Chain, CSV, DataFramesMeta, Dates
datapath = "data/italy_data.csv"
italy_data = @chain datapath begin
CSV.read(DataFrame)
(y_t = _.confirm, dates = Date.(_."date"))
end
Model
Our model reuses the renewal infection model and ARIMA(2,1,1) latent process from earlier sections, making the latter piecewise constant by week, whilst building an observation model that accounts for reporting delays and day-of-week effects using discrete convolutions and a partially pooled day-of-week effect. Mathematically, we represent the complete model as:
\[ \begin{aligned} \rho_1, \rho_2, Z_0, Z_{-1}, I_0, \sigma_Z, \sigma_\omega, \phi &\sim \pi(\cdot), \\ \epsilon_w & \sim \text{Normal}(0, \sigma_Z)~~ \text{i.i.d } \forall w, \\ Z_w - Z_{w-1} &= \rho_1 (Z_{w-1} - Z_{w-2}) + \rho_2 (Z_{w-2} - Z_{w-3}) + \epsilon_w,~~ w = 1, 2, 3, \dots\\ R_t &= \exp\left(Z_{\lfloor t/7 \rfloor}\right),~~ \text{Piecewise } R_t \text{ constant by week}\\ I_t &= R_t \sum_{s \geq 1} g_s I_{t-s}, ~~ t = 1, 2, 3, \dots, \\ S_t &= \sum_{s\geq1} I_{t-s} \eta_s,~~ \text{Incubation delay: infections to symptom onset}\\ D_t &= \sum_{s\geq1} S_{t-s} \xi_s,~~ \text{Reporting delay: symptom onset to case reports}\\ \tilde{\omega}_k &\sim \text{N}(0, \sigma^2_\omega),~ k = 0,\dots, 6,~~ \text{Day-of-week modifier} \\ \omega &= 7 \times \text{softmax}(\tilde{\omega}),\\ y_t &\sim \text{NegBin}( D_t \omega_{t~\text{mod}~7}, \phi),~~ \text{Link to data}. \end{aligned} \tag{1}\]
Where \(\pi\) represents prior distributions for model hyperparameters. \(g_t\) is the generation time distribution pmf, \(\eta_t\) is the incubation period distribution pmf, and \(\xi_t\) is the reporting delay distribution pmf (all obtained by double interval censoring continuous distributions [34]). \(S_t\) represents symptom onsets (infections convolved with incubation period) and \(D_t\) represents delayed case reports (symptom onsets convolved with reporting delay). The vector \(\omega = [\omega_0,\dots,\omega_6]\) encodes day-of-week effects on reporting.
Latent Model
We reuse the ARIMA(2,1,1) model arima211 defined in Section 2.3 but modify it to be piecewise constant by week using the broadcast_weekly modifier.
weekly_arima211 = broadcast_weekly(arima211)
This approach models the log reproduction number as constant within each week whilst allowing weekly changes to follow the ARIMA(2,1,1) process. In EpiNow2, although stationary process representations of \(R_t\) are available, the default latent process is a differenced stationary (Matern) Gaussian process, that is, one stationary only in its increments. Similarly, our piecewise constant weekly model applies the ARIMA(2,1,1) model to \(\log R_{w}\) at the weekly scale, then broadcasts this to daily values. The weekly piecewise constant formulation we use is an option in the EpiNow2 package but not part of the default model. Prior predictive samples (Figure 4 A) show the behaviour of the piecewise constant by week process.
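A hedged sketch of what we assume the broadcast_weekly helper is shorthand for (RepeatBlock is named by analogy with the RepeatEach broadcasting rule shown in the observation model section):

# Sketch: broadcast each weekly latent value across 7 daily values.
weekly_arima211_explicit = BroadcastLatentModel(arima211, 7, RepeatBlock())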
Infection Generating Process
Unlike Section 3.1 where the serial interval distribution was used (as in Mishra et al), EpiNow2 uses the generation time distribution, which represents the time between successive infections in a transmission chain. Whilst the serial interval (time between symptom onsets) is often used as a proxy for the generation time because it is more readily observable, using the serial interval can be problematic [37] and is primarily done in practice when generation time estimates are unavailable, particularly early in an outbreak [38].
Here we follow the EpiNow2 getting started vignette to parameterise our model, which uses a Gamma distribution with uncertain parameters for the generation time: shape ~ Normal(1.4, 0.48) and rate ~ Normal(0.38, 0.25). Since our prototype does not yet support uncertain delay distributions, we use the mean parameter values, giving a Gamma distribution with shape 1.4 and rate 0.38 and a mean of 3.68 days.
gen_time_dist = Gamma(1.4, 1 / 0.38)
epinow2_data = EpiData(gen_distribution=gen_time_dist)
We use a similar renewal structure to Section 3.1, updating it to use the generation time distribution rather than the serial interval.
initialisation_prior = Normal(log(1.0), 2.0)
renewal_gt = Renewal(epinow2_data; initialisation_prior=initialisation_prior)
As before, to demonstrate the infection model independently, we supply a fixed \(R_t\) trajectory that decreases from 3 to 0.5 over 50 days. Figure 4 B shows prior samples from the renewal process conditional on this \(R_t\) trajectory.
Observation Model
We build the observation model through the composition of modular components, reusing the negative binomial observation model (negbin) from Section 3.1 and layering additional modelling components on top. We start by adding a day-of-week ascertainment model using the ascertainment_dayofweek helper function.
dayofweek_negbin = ascertainment_dayofweek(negbin;
latent_model=HierarchicalNormal(std_prior=HalfNormal(1.0))
)
Unpacking this helper function reveals a nested stack of modelling constructions. First, transformation and broadcasting:
function broadcast_dayofweek(model::AbstractTuringLatentModel; link = x -> 7 * softmax(x))
transformed_model = TransformLatentModel(model, link)
return BroadcastLatentModel(transformed_model, 7, RepeatEach())
end
which uses the TransformLatentModel and BroadcastLatentModel structs to dispatch the transformation \(\omega = 7 \times \text{softmax}(\tilde{\omega})\) and then broadcast this along the time series.
Second is the linkage of the day of week model to the latent infections time series as a multiplicative ascertainment process, which produces
@model function EpiAwareBase.generate_observations(obs_model::Ascertainment, y_t, Y_t)
@submodel expected_obs_mod = generate_latent(
obs_model.latent_model, length(Y_t)
)
expected_obs = obs_model.transform(Y_t, expected_obs_mod)
@submodel y_t = generate_observations(obs_model.model, y_t, expected_obs)
return y_t
end
This day-of-week model assumes a static weekly pattern; to allow temporal evolution, the weekly pattern could be modelled by concatenating latent processes, where each day’s effect becomes a time-varying process, as sketched below.
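A hedged sketch of that extension (ConcatLatentModels and RandomWalk are assumed constructors from the prototype’s latent model library):

# Sketch: seven per-day latent processes concatenated so each weekday effect
# evolves over time rather than staying fixed.
tv_dayofweek = ConcatLatentModels(fill(RandomWalk(), 7))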
In addition to ascertainment, we also need to model the incubation period (infection to symptom onset) and a reporting delay (symptom onset to case reporting). Again following the EpiNow2 getting started vignette, the incubation period uses a LogNormal distribution with uncertain parameters: meanlog ~ Normal(1.6, 0.064) and sdlog ~ Normal(0.42, 0.069), giving a mean of 5.41 days at the parameter means. The reporting delay uses a LogNormal with meanlog = 0.58 and sdlog = 0.47, giving a mean of 1.99 days.
Since our prototype does not yet support uncertain delay distributions, we use the mean parameter values:
incubation_distribution = LogNormal(1.6, 0.42)
reporting_distribution = LogNormal(0.58, 0.47)The LatentDelay struct uses double interval censoring [34] to convert continuous delay models into pmfs and stacks with the temporal modifier model. We compose two delay layers sequentially:
incubation_dayofweek_negbin = LatentDelay(dayofweek_negbin, incubation_distribution)
delay_dayofweek_negbin = LatentDelay(incubation_dayofweek_negbin, reporting_distribution)
This code demonstrates component reuse (the base negbin from Section 3.1), layered composition (adding day-of-week effects via ascertainment_dayofweek), and sequential composition (adding two delay layers via nested LatentDelay). Note that since these are deterministic delays, we could compute this more efficiently by first convolving the two delay distributions, or by automating this using the struct-in-struct approach with multiple dispatch, as sketched below. Figure 4 C demonstrates how these delay and day-of-week effects work together to transform a latent infection signal.
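A sketch of the direct convolution alternative (the helper is our own illustration):

# Sketch: convolve two discretised delay pmfs once up front, rather than
# stacking two LatentDelay layers.
function convolve_pmfs(a, b)
    c = zeros(length(a) + length(b) - 1)
    for i in eachindex(a), j in eachindex(b)
        c[i + j - 1] += a[i] * b[j]
    end
    return c
end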
Fitting to Data
We compose these modelling subcomponents into one EpiProblem model. The delay padding accounts for the combined length of the incubation period and reporting delay pmfs, allowing the model to properly account for infections that have occurred but not yet been observed by the end of the training period.
incubation_pmf_length = length(incubation_dayofweek_negbin.rev_pmf)
reporting_pmf_length = length(delay_dayofweek_negbin.rev_pmf)
delay_padding = incubation_pmf_length + reporting_pmf_length
tspan = (1, 40 + delay_padding)
epinow2 = EpiProblem(epi_model=renewal_gt,
latent_model=weekly_arima211,
observation_model=delay_dayofweek_negbin,
tspan=tspan)
This EpiProblem uses the renewal model with generation time data (renewal_gt) and the piecewise constant by week ARIMA(2,1,1) latent model (weekly_arima211). The observation model reuses negbin from Section 3.1, layering on delays and day-of-week effects.
We filter the data as before to the timespan of interest.
italy_training_data = @chain italy_data begin
@set _.y_t = _.y_t[first(tspan):last(tspan)]
@set _.dates = _.dates[first(tspan):last(tspan)]
end
We again construct a Turing model using generate_epiaware.
epinow2_mdl = generate_epiaware(epinow2, italy_training_data)
We reuse the same inference_method defined in Section 3.1.
epinow2_results = apply_method(epinow2_mdl,
inference_method,
italy_training_data
)
Here is the summarised posterior:
summarystats(epinow2_results.samples)
Summary Statistics
             parameters      mean     std    mcse   ess_bulk  ess_tail  ⋯
                 Symbol   Float64 Float64 Float64    Float64   Float64  ⋯
  latent.latent_init[1]    0.2837  0.1656  0.0071   543.3543  635.9462  ⋯
      latent.ar_init[1]    0.1082  0.1601  0.0064   614.0274  659.7491  ⋯
      latent.ar_init[2]    0.0299  0.1464  0.0066   484.9543  516.4906  ⋯
      latent.damp_AR[1]    0.1969  0.1295  0.0039   906.0275  562.4052  ⋯
      latent.damp_AR[2]    0.1042  0.0479  0.0015   897.5121  427.2224  ⋯
            latent.θ[1]    0.0496  0.1996  0.0072   760.1904  652.6312  ⋯
             latent.std    0.1891  0.0527  0.0024   459.3943  562.6200  ⋯
          latent.ϵ_t[1]   -1.5638  0.5711  0.0235   626.8490  488.6286  ⋯
          latent.ϵ_t[2]   -1.3453  0.6172  0.0273   510.4229  570.5520  ⋯
          latent.ϵ_t[3]    0.4867  0.5509  0.0190   840.9536  670.1245  ⋯
          latent.ϵ_t[4]    0.3820  0.4858  0.0170   819.3578  723.6612  ⋯
          latent.ϵ_t[5]   -0.2293  0.6750  0.0192  1233.4135  973.5783  ⋯
          latent.ϵ_t[6]    0.0111  1.0384  0.0270  1502.0122  796.4518  ⋯
         init_incidence    6.2784  0.4532  0.0207   499.5471  505.4530  ⋯
      obs.DayofWeek.std    0.1576  0.0703  0.0034   393.3732  625.6108  ⋯
   obs.DayofWeek.ϵ_t[1]    0.0545  0.4974  0.0221   509.7989  658.2117  ⋯
   obs.DayofWeek.ϵ_t[2]   -0.4377  0.4916  0.0215   522.1205  471.3773  ⋯
                      ⋮         ⋮       ⋮       ⋮          ⋮         ⋮  ⋱
                                             2 columns and 6 rows omitted
We see the model has converged and the summary statistics are similar to Section 3.1. Figure 4 shows that the compositional model recovers similar \(R_t\) dynamics to Section 3.1 (Figure 4 F), whilst explicitly accounting for reporting delays (Figure 4 D) and day-of-week effects that help disentangle true transmission changes from reporting artifacts (Figure 4 E).
Contemporary statistical inference for infectious disease models using Stan
In this case study, we demonstrate how to use EpiAware in conjunction with the SciML ecosystem [17] to replicate Contemporary statistical inference for infectious disease models using Stan by Chatzilena et al. (2019) [24].
This case study is currently being developed separately in case-study-3.qmd and will be integrated once it follows the standard case study format.
Discussion
This paper has demonstrated that compositional approaches can address barriers in epidemiological modelling. We presented a prototype that enables “LEGO-like” model construction, maintaining the statistical rigour of joint models whilst providing the flexibility of pipeline approaches. The autoregressive model example illustrated how complex models emerge from simple component combinations using the struct-in-struct pattern (where structures, data containers that group related variables and their types together, contain other structures as fields to build complex models by nesting simpler components). Our three case studies, which were based on previous studies [11,23,24], demonstrate model composability for a range of problems and using different underlying methods.
A strength of our prototype approach is its modular design, which enables faster model development and component reuse, whilst also facilitating comparison of modelling assumptions without large reimplementation efforts. The approach reduces implementation barriers for methodological advances and enables researchers to embed new components within existing models more easily. For instance, adding a new data source such as wastewater surveillance to an existing case-based model requires only defining a wastewater submodel with its own latent and observation process, avoiding the need to reimplement the existing infection, latent, and observation models. Standardised interfaces should also allow large language models to compose models programmatically, with validation methods ensuring correctness.
However, a clear limitation is that this is just a prototype and is not designed for ongoing use or adoption. For example, we only partially implemented support for automated mappings between latent and observed states, which remains challenging to do well. Similarly, we added only partial support for partial pooling, with no infrastructure in our DSL layer to help users apply this functionality. The current component library also remains limited to basic epidemiological patterns. Finally, the current struct manipulation tooling cannot coordinate updates of related parameters, such as model order and corresponding priors, without manual intervention, requiring either avoiding this pattern or developing additional solutions. Whilst the prototype implementation has limitations, the three real-world case studies presented here establish the viability of compositional approaches for practical epidemiological problems. On top of this, we have laid the groundwork for future work and given clear requirements that future composable frameworks should seek to meet to address the needs of applied infectious disease modelling.
Another limitation is the learning curve required for researchers familiar with monolithic approaches to adopt this compositional paradigm. However, as they can embed elements of our system within their existing models, and can implement their own compositional elements with minimal understanding of the overall architecture, these barriers to use should be mitigated. Computational overhead from abstraction layers is also a concern; the impact of this overhead and potential optimisations remain unclear. However, Turing.jl [15] and Julia [26] are well-positioned for optimisation with new automatic differentiation backends like Enzyme [39] and Mooncake [40], which could reduce computational costs in future implementations. Similarly, ongoing work in the Turing.jl ecosystem should be able to identify and resolve any performance bottlenecks that do emerge.
A strength of the Julia [26] ecosystem for compositional work is that domain experts develop specialised tools that interoperate through shared interfaces. However, this distributed development also creates challenges, as it can be difficult to locate appropriate tools and unclear where responsibility lies when issues emerge. Users may not know whether problems originate in Turing.jl, our prototype DSL, or elsewhere in the ecosystem, and even once located, it can be difficult to determine who is responsible for resolving them. There are also additional technical barriers within the Turing.jl ecosystem.
There are also additional technical barriers within the Turing.jl ecosystem. The Turing PPL DSL is not fully stable: since we developed this prototype, the @submodel macro has changed syntax and become a function, with breaking changes to how prefixes are handled and how variables are managed within a submodel, meaning our current prototype implementation is not compatible with the latest Turing.jl release. However, some of these changes allow for potentially more flexible models, as they reduce the difference between specifying something as a distribution and as a submodel. Other recent changes in Turing.jl, such as making condition and fix the primary workflow for supplying observed data rather than passing observations as model arguments, change how data are conditioned on the model. Not having to pass observations through layers of the model makes it easier to build composable models (see the sketch below). Turing.jl also has limited handling of numerical instability: simulated extreme values, such as very large counts, can cause errors, which makes evaluating and developing models frustrating. An advantage of the Julia [26] ecosystem for a composable modelling tool built on top of a PPL is access to the full features of a programming language. In practice, however, some of these utilities, such as profiling tools, can be frustrating and difficult to use with Turing.jl compared with similar tools in other ecosystems, because the probabilistic programming abstractions and macro expansions obscure the relationship between source code and runtime behaviour, making it hard to identify performance bottlenecks in model components. The ecosystem needs further development in areas such as debugging tools that can trace through model execution whilst preserving the semantic structure of probabilistic models, better error messages that relate numerical issues to the model specification rather than low-level implementation details, and standardised approaches to model validation and testing. The EpiMethod approach to chaining inference (combining pre-sampling, such as Pathfinder [35], with sampling, such as the No U-Turn Sampler (NUTS) [29]) shows promise, but would be better provided as a general Turing.jl solution rather than an epidemiology-specific one: many domains face similarly difficult posterior geometries, and adaptive warm-up strategies could improve both the efficiency and the reliability of inference across the broader probabilistic programming ecosystem.
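As an illustration of the interface changes discussed above, the sketch below uses the newer Turing.jl syntax, which our prototype predates; the exact API may continue to change.

using Turing

@model function inner()
    x ~ Normal(0, 1)
    return x
end

@model function outer()
    # submodels are now wrapped with to_submodel and bound with ~,
    # replacing the earlier @submodel macro
    a ~ to_submodel(inner())
    y ~ Normal(a, 1)
end

# observations are supplied by conditioning rather than being passed
# through the model's arguments
conditioned = outer() | (y = 2.0,)
chain = sample(conditioned, NUTS(), 100)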
Our prototype could draw more heavily on category theory [41], which provides formal mathematical foundations for composability and, in principle, allows changes to underlying components to propagate automatically through a composed model. The AlgebraicJulia ecosystem demonstrates how category theory can be operationalised in software, with AlgebraicPetri.jl [16] and AlgebraicDynamics.jl [42] implementing operadic composition through structured cospans and wiring diagrams [43]. These provide multiple interchangeable representations of model components as open Petri nets, with connection “ports” allowing composability with other components via “junction” porting [16]. Taking another, symbolic-numeric, approach to defining dynamical systems, ModelingToolkit.jl [18] and Catalyst.jl [19] from the SciML ecosystem [17] allow acausal component-based models that achieve performance improvements through automated parallelisation, sparsity detection, and equation simplification (see the sketch below). AlgebraicPetri.jl uses the direct correspondence between Petri nets and reaction networks [44] to link the graphical/algebraic representation of a model to its computational semantics via ModelingToolkit.jl [18] and Catalyst.jl [19], giving models composed in this way access to the SciML ecosystem. However, all of these tools are currently designed for representing the generators of dynamical systems and do not naturally extend to general probabilistic models. Our probabilistic programming foundation offers more direct support for statistical inference and uncertainty quantification and supports a broader range of modelling approaches, but lacks the formal compositional guarantees of category-theoretic approaches and the automated optimisations of symbolic-numeric frameworks.
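For context, the sketch below shows the symbolic-numeric style: an SIR model written as a Catalyst.jl reaction network and compiled to an ODE problem; the parameter values are illustrative only.

using Catalyst, OrdinaryDiffEq

# SIR transmission as a reaction network; Catalyst compiles this symbolic
# representation into an ODE system via ModelingToolkit
sir = @reaction_network begin
    β, S + I --> 2I   # transmission
    γ, I --> R        # recovery
end

u0 = [:S => 990.0, :I => 10.0, :R => 0.0]
ps = [:β => 0.0005, :γ => 0.25]
prob = ODEProblem(sir, u0, (0.0, 60.0), ps)
sol = solve(prob, Tsit5())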
Alternative probabilistic programming languages could support similar compositional approaches. Gen.jl [45] focuses on lower-level implementation of probabilistic programs than Turing.jl, with composition supported through nested generative function calls and combinators of generative functions. However, Gen.jl operates more independently of the broader Julia ecosystem, maintaining its own distribution types rather than using Distributions.jl [28], which negates a key strength of Julia approaches: packages interoperating through shared interfaces. Despite this, Gen-based tools demonstrate useful functionality for epidemiological modelling, such as AutoGP.jl [46], which implements real-time inference and automated composition of Gaussian process kernels that could be useful for modelling time-varying epidemiological parameters. Genify [47] provides an approach that translates Julia code into Gen models, potentially easing adoption by allowing modellers to write in familiar Julia syntax whilst accessing Gen’s programmable inference capabilities. Python-based probabilistic programming languages such as NumPyro [48] and Oryx [49], built on JAX [50], could enable similar composability [52], with potential advantages from JAX’s focus on efficiency and GPU scaling. However, the smaller JAX-specific ecosystem compared with general Python packages, JAX’s optimisation for neural network tasks, and Python PPLs’ more programmer-oriented syntax, which diverges from mathematical notation, may create barriers for epidemiological modellers. As well as imperative probabilistic programming languages such as Turing.jl, which specify how to perform computations through explicit sequences of commands, there are declarative graph-based approaches such as JuliaBUGS.jl [53] and RxInfer.jl [54], which specify what should be computed by stating relationships and constraints: users define directed graphical models by explicitly stating conditional dependencies between variables. This graph-based representation offers several advantages, including clearer understanding of dependencies, more transparent model structure and assumptions, and efficient inference through algorithms that exploit the graphical structure to identify conditional independence relationships. However, the declarative approach trades off the flexibility of Turing.jl’s imperative style, which allows procedural code, such as a loop over recursive updates for a dynamical system (sketched below), that can be more expressive for certain complex modelling patterns. Building a domain-specific backend on Distributions.jl rather than Turing.jl could enable compatibility with multiple PPLs, including JuliaBUGS.jl, whilst trading off Turing.jl’s expressive submodel interface.
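As a concrete example of this imperative flexibility, the hedged sketch below embeds a deterministic recursive update (simple exponential growth, chosen for brevity) in an ordinary Julia loop inside a Turing.jl model.

using Turing, Distributions

# Procedural dynamics inside a probabilistic model: the latent state is
# updated recursively in a plain Julia loop, a pattern that is awkward to
# express in purely declarative, graph-based specifications.
@model function growth(y)
    r ~ Normal(0.1, 0.05)        # epidemic growth rate
    I0 ~ LogNormal(2.0, 0.5)     # initial incidence
    I = I0
    for t in eachindex(y)
        I = I * exp(r)           # deterministic recursive update
        y[t] ~ Poisson(I)        # observation model at each step
    end
end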
Areas for future work include expanding the component library to address epidemiological applications across multiple scales and data types. More work is also needed to allow deeply nested model specifications to be modified without the user needing to know the structure of the nested model. Methodological advances are also needed, including joint estimation of interdependent epidemiological parameters and integration of individual- and population-level observations. Alternative approaches also need to be investigated, particularly more formal category-theoretic methods that use operadic composition for mathematically rigorous hierarchical model construction, as demonstrated in AlgebraicPetri.jl [16] and AlgebraicDynamics.jl [42]. Symbolic-numeric frameworks like ModelingToolkit.jl [18] and Catalyst.jl [19] offer potential performance improvements through automatic optimisation and code generation. However, these approaches would need generalising to explicitly model probabilities and to support a range of different modelling approaches, both of which are needed for infectious disease modelling applications. As discussed above, building the domain-specific language on Distributions.jl rather than Turing.jl is another abstraction worth exploring, enabling compatibility with multiple probabilistic programming languages such as JuliaBUGS whilst trading off the expressiveness of Turing’s submodel interface. Further investigation of Gen.jl [45], Genify [47], and JuliaBUGS.jl [53] is needed: Gen’s programmable inference capabilities and tools like AutoGP.jl [46] may overcome some limitations of Turing.jl, Genify’s metaprogramming bridge could ease adoption by allowing modellers to write familiar Julia code, and JuliaBUGS’s graph-based approach could enable more efficient inference through explicit model structure. Performance optimisation through parallelisation and approximate inference methods, along with integration bridges to existing epidemiological software ecosystems, will be needed for practical adoption.
The compositional framework creates opportunities for large language model integration. Language models could serve as model construction agents, using a composable framework to build epidemiological models from component libraries [25]. The explicit structure and validation tools of a composable framework should enable language models to reason about model design and propose structural adaptations more easily and with less room for error. The high-level component interface could also reduce context and reasoning requirements by allowing language models to compose models without understanding backend details such as Turing.jl syntax, distribution parameterisations, or inference algorithm specifications, potentially enabling deployment of smaller models on local compute in resource-constrained settings. Further work is needed to realise this potential.
A complete implementation of the prototype we present here could improve real-time analysis of infectious disease dynamics by enabling “LEGO-like” assembly of epidemiological components. This prototype establishes the feasibility of integrating diverse expertise whilst maintaining statistical rigour, addressing limitations of current modelling approaches. Such frameworks are also likely to be key to enabling, and increasing the robustness of, large language model assisted model construction, especially in resource-constrained settings where such capabilities could prove most valuable. Given the unpredictable nature of future infectious disease threats, adaptable modelling capabilities that can incorporate diverse data sources and domain expertise are needed for future public health decision making.
Acknowledgements
Poppy the dog for growling at the right times.