epidWaves: A code for fitting multi-wave epidemic models

The COVID-19 pandemic has created great demand for computational models capable of describing and inferring the short-term evolution of an epidemic outbreak. To this end, we introduce epidWaves, a package that provides a framework for fitting multi-wave epidemic models to data from actual outbreaks of COVID-19 and other infectious diseases.


Introduction
Using computational models to study past epidemics and make predictions for ongoing outbreaks is not new, nor did it become popular only with the COVID-19 pandemic. Mathematical tools such as differential equations, statistical regressors, and complex networks have been present in epidemiology (and in biology more broadly) for many decades [1][2][3][4][5][6][7], being used in the study of diseases such as malaria [8,9], dengue [10,11], and Zika virus and other arboviruses [12][13][14]. However, the global health emergency imposed by the recent COVID-19 pandemic has generated numerous original challenges for computational epidemiology [15]. Among them, we can highlight the development of robust data-driven predictive models. Data-driven models employing phenomenological equations to make predictions or to represent past outbreak data are very useful. Since they neither present excessive complexity nor rely on information that is difficult to obtain during the epidemic, as is typically the case with compartmental models based on differential equations, they are very appealing in this context of real-time (or near-real-time) analysis. Therefore, we introduce epidWaves, a Matlab/Python package for fitting multi-wave epidemic models to data from COVID-19 and other infectious diseases.
The code (and data) in this article has been certified as Reproducible by Code Ocean (https://codeocean.com/). More information on the Reproducibility Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.

Software details
The package epidWaves implements the epidemiological data analysis framework proposed by Gianfelice et al. [16], which is illustrated in Fig. 1. In this methodology, epidemic surveillance data are combined with nonlinear statistical regressors [17,18] and Monte Carlo simulations [19,20] to generate a family of predictive models for the evolution of reported cases and deaths associated with an epidemic outbreak. The available models correspond to logistic curves with multiple modes, describing epidemic dynamics with several waves of contagion (each with three phases: expansion, transition, and exhaustion of the outbreak). The user either selects the desired number of waves or tests different models on the same data. In the latter case, the model that offers the best compromise between adherence to the data and simplicity is chosen based on statistical information criteria.
The selected model can predict reported cases and deaths over a short-term time horizon during the outbreak, or it can be used to extract information about previous outbreaks. For instance, with such a model, it is possible to estimate the start date of each wave of contagion (as done by Gianfelice et al. [16]). One can also infer other quantitative information, such as the "velocity" of expansion of a contagion wave (through the infection rate) or the date of a wave peak (which characterizes the beginning of the outbreak exhaustion phase). Such metrics may allow an epidemiologist to assess the severity of a wave of contagion, compare different waves, and investigate the possible factors that triggered the wave. Matlab and Python codes for each step of the framework are available in the package.
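To make the fitting and model-selection idea concrete, the following is a minimal Python sketch of the general technique; it does not reproduce the epidWaves code or its interface. It assumes that cumulative cases are modeled as a sum of logistic curves, that the fit is an ordinary least-squares fit performed with SciPy's curve_fit, and that models are compared through the least-squares form of the Akaike information criterion. The function names (multi_wave_logistic, fit_waves), the synthetic data, and the initial guesses are all hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import expit  # numerically stable logistic function


def multi_wave_logistic(t, *params):
    """Cumulative cases as a sum of logistic waves.

    params = (K1, r1, tau1, K2, r2, tau2, ...), where, for each wave,
    K is the final size, r the infection rate, and tau the inflection
    (peak-incidence) time. This parametrization is an assumption.
    """
    t = np.asarray(t, dtype=float)
    total = np.zeros_like(t)
    for K, r, tau in zip(params[0::3], params[1::3], params[2::3]):
        total += K * expit(r * (t - tau))
    return total


def fit_waves(t, y, p0):
    """Least-squares fit; returns the fitted parameters and the AIC."""
    popt, _ = curve_fit(multi_wave_logistic, t, y, p0=p0, maxfev=10000)
    rss = float(np.sum((y - multi_wave_logistic(t, *popt)) ** 2))
    n, k = len(y), len(popt)
    aic = n * np.log(rss / n) + 2 * k  # AIC for Gaussian least-squares errors
    return popt, aic


# Synthetic two-wave cumulative data (illustrative only, not real surveillance data)
rng = np.random.default_rng(0)
t = np.arange(200, dtype=float)
true_params = (1000.0, 0.15, 40.0, 2000.0, 0.10, 130.0)
y = multi_wave_logistic(t, *true_params) + rng.normal(0.0, 20.0, t.size)

# Candidate models with one and two waves; initial guesses are hypothetical
popt_one, aic_one = fit_waves(t, y, p0=(3000.0, 0.05, 100.0))
popt_two, aic_two = fit_waves(t, y, p0=(800.0, 0.1, 30.0, 1500.0, 0.1, 120.0))

# The candidate with the lowest AIC is retained
best_waves = 1 if aic_one < aic_two else 2
peak_days = sorted(popt_two[2::3])  # estimated peak day of each wave

print("AIC (1 wave):", aic_one)
print("AIC (2 waves):", aic_two)
print("selected number of waves:", best_waves)
print("estimated peak days (2-wave model):", peak_days)
```

In this sketch, the inflection time of each logistic term plays the role of the wave peak date mentioned above, since it is the instant at which the daily incidence of that wave is maximal. Uncertainty bands, such as those obtained in the framework through Monte Carlo simulation, are not covered here.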

Impact overview
Epidemics are recurrent in human history and, due to globalization, tend to be increasingly frequent and challenging. Making real-time (or near-real-time) decisions driven by computer models in such scenarios has been critical in the COVID-19 pandemic and will be in future large-scale epidemics. The epidWaves package is a valuable tool for such a task and, due to its simplicity, can be used by professionals with relatively modest mathematical training. In addition to building predictive models, the statistical regressors obtained with this package can provide qualitative and quantitative descriptions of past outbreak data, improve understanding of the epidemic dynamics, and extract critical information (e.g., the start date of a contagion wave). Examples of such applications can be found in Refs. [16,21]. In the first work, the authors use the epidWaves framework to estimate the starting dates of COVID-19 contagion waves in the city of Rio de Janeiro between 2020 and 2021. In the second, the authors examine the dynamics of COVID-19 in Portugal, with a focus on details of the 2020 outbreak. In addition, the independent initiative COVID-19: Observatório Fluminense used initial versions of this code during its epidemic surveillance work [22].
Due to its versatility and simplicity, epidWaves also has strong educational appeal. The epidemic curve fitting methodology employed here, together with early versions of the underlying Matlab code, served as a basis for the forecasting module of the educational code EPIDEMIC [23]. Thus, this novel package joins other tools for teaching computational epidemiology, such as EPIDEMIC [23] and ARBO [24], both developed by researchers from Rio de Janeiro State University and collaborators.

Final remarks
epidWaves is a simple and powerful tool to fit statistical multi-wave models to epidemic data from COVID-19 and other infectious diseases. The analyses performed with the statistical models built with this package can be carried out during the epidemic outbreak, to guide immediate decision-making, or a posteriori, to better understand the evolution of the disease. The code also has considerable potential as an educational tool for computational epidemiology and is already being used for both research and teaching activities.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.