EpiRust: Simulating the spread of COVID-19 from lockdown
Published: October 19, 2020
Engineering for Research (E4R) was founded to work exclusively on novel computational problems faced by scientific organizations, especially big science projects such as the Thirty Meter Telescope and the Square Kilometre Array radio telescope, and responses to pandemics such as COVID-19. We interviewed Harshal Hayatnagarkar, Lead Computer Scientist and Research & Development Partner, and Chhaya Yadav, Delivery Partner, to learn more about their team's work simulating the spread of COVID-19 from lockdown.
Q&A with Harshal Hayatnagarkar
What do you work on as a Lead Computer Scientist at Thoughtworks?
As a Lead Computer Scientist, I lead Thoughtworks’ research and development in the Engineering for Research (E4R) practice. One of my primary responsibilities is connecting with scientists and scientific organizations across disciplines, exploring collaboration opportunities, and forging partnerships. I’m also involved in research and development (R&D) projects with E4R, and I publish findings with my colleagues in peer-reviewed conferences and journals. Chhaya Yadav and I work as a pair: she looks after program management and operations, and I take care of scientific community interactions and the technology landscape.
Where did the idea for EpiRust originate? What does it aim to achieve?
Over the last two years in E4R, we came to believe that disaster response, and epidemic response in particular, offered some high-quality computational problems. In early 2019, we started developing a large-scale, agent-based epidemiological simulation so that a researcher could model an infectious disease spreading through a society, along with its mitigation strategies, via what-if and if-what scenarios. An agent-based (person-based) simulation is a virtual society where people go about their daily lives: house chores, the office commute, office hours and so on. A researcher can then introduce an infection and see how it spreads over time and space via person-to-person interactions. However, this first implementation was Java-based and could not scale beyond a population of 10,000 agents, whereas our goal was to simulate the city of Pune with 5+ million agents.
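To make the agent-based idea concrete, here is a minimal sketch in Rust. It is not EpiRust's code: the `Agent`, `Place`, and `Lcg` types, the 9-to-5 schedule, the household and office sizes, and the transmission probability are all assumptions chosen for illustration. Each agent follows a daily routine, and a susceptible agent who shares a place with an infected one may become infected.

```rust
// Minimal agent-based epidemic sketch. Every name, rate, and schedule here is
// an illustrative assumption, not EpiRust's actual design.

#[derive(Clone, Copy, Debug, PartialEq)]
enum Health {
    Susceptible,
    Infected,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Place {
    Home(usize),
    Office(usize),
}

struct Agent {
    home: usize,
    office: usize,
    health: Health,
}

impl Agent {
    /// Daily routine: at the office from 09:00 to 17:00, at home otherwise.
    fn location(&self, hour: u32) -> Place {
        if (9..17).contains(&hour) {
            Place::Office(self.office)
        } else {
            Place::Home(self.home)
        }
    }
}

/// Tiny deterministic pseudo-random generator so the sketch needs no crates.
struct Lcg(u64);

impl Lcg {
    /// Returns a value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Advance one hour: a susceptible agent who shares a place with at least one
/// infected agent becomes infected with probability `transmission_prob`.
fn step(agents: &mut [Agent], hour: u32, transmission_prob: f64, rng: &mut Lcg) {
    // Snapshot of places that currently hold an infected agent.
    let infected_places: Vec<Place> = agents
        .iter()
        .filter(|a| a.health == Health::Infected)
        .map(|a| a.location(hour))
        .collect();
    for agent in agents.iter_mut() {
        if agent.health == Health::Susceptible
            && infected_places.contains(&agent.location(hour))
            && rng.next_f64() < transmission_prob
        {
            agent.health = Health::Infected;
        }
    }
}

/// Run the virtual society and return how many agents end up infected.
/// `population` must be at least 1.
fn simulate(population: usize, days: u32, transmission_prob: f64) -> usize {
    let mut agents: Vec<Agent> = (0..population)
        .map(|i| Agent { home: i / 4, office: i % 10, health: Health::Susceptible })
        .collect();
    agents[0].health = Health::Infected; // the researcher introduces an infection
    let mut rng = Lcg(42);
    for _ in 0..days {
        for hour in 0..24 {
            step(&mut agents, hour, transmission_prob, &mut rng);
        }
    }
    agents.iter().filter(|a| a.health == Health::Infected).count()
}

fn main() {
    let infected = simulate(200, 14, 0.01);
    println!("infected after 14 days: {infected} of 200");
}
```

A what-if scenario then amounts to rerunning `simulate` with a changed schedule or probability and comparing the outcomes. A real framework would also need recovery, richer schedules, and data structures that scale to millions of agents.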
At the end of 2019, we started to develop EpiRust from the ground up, in collaboration with Dr. Gautam Menon of Ashoka University, to study various scenarios such as lockdown and risks to healthcare workers. This collaboration has since grown into another ambitious project with Dr. Menon called BharatSim (Bharat is India’s native name), which will develop India’s first ultra-large-scale agent-based simulation framework for epidemiology, economics and climate change. The project is funded by the Bill and Melinda Gates Foundation.
Can you talk about transitioning EpiRust’s focus from smallpox to COVID-19?
We chose smallpox because of the extensive literature and data that exist around the world, including in India. This also helped us to validate the simulation model and its outcomes. By late 2019, the EpiRust MVP was complete, and that’s when we started noticing stories from China about a mysterious emerging epidemic, now known as COVID-19. We started to read any available literature and began to mold EpiRust to tackle COVID-19. Because the available information was so volatile, the design choices we had made for flexibility in EpiRust started to pay off sooner than anticipated.
What are the technologies that EpiRust uses?
EpiRust is short for Epidemiology framework in the Rust programming language. We chose Rust for its performance and its unique approach to memory management. It’s close to C and C++ in terms of performance, but without their caveats around explicit memory management. Explicit memory management is the source of a significant chunk of defects in the software industry today, and its absence definitely adds to the robustness and safety of software. We believe that choosing Rust has given us the necessary foundation to build a complex yet performant framework. In addition, we have used Kafka as a distributed messaging platform.
Explicit memory management requires developers to take care of allocating and deallocating memory chunks for the data structures used in their algorithms. While prima facie it appears trivial, this is the source of a majority of bugs and defects in everyday software. For example, 'buffer overflow' is one dominant type of defect associated with explicit memory management. It occurs when an algorithm tries to access memory beyond the limits of its allocated chunk, and it is an exploitable defect.
EpiRust is one of the few extreme-scale, agent-based simulation platforms in the world, and perhaps the only one in India. We achieved this feat within six months of development, and Rust has definitely played a unique role in it. To our knowledge, no other programming language today offers the best of both worlds.
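As a small illustration of the two concerns above, the following Rust sketch shows a bounds-checked read and an allocation that is released without an explicit deallocation call. The function names `read_slot` and `sum_of_squares` are hypothetical and are not taken from EpiRust.

```rust
// Illustrative only: how safe Rust handles out-of-bounds access and
// deallocation. Function names are hypothetical.

/// Read one element from a buffer without risking an out-of-bounds access.
fn read_slot(buffer: &[u8], index: usize) -> Option<u8> {
    // `get` checks the index against the slice length and returns `None`
    // instead of reading past the end of the allocation.
    buffer.get(index).copied()
}

/// Allocate a buffer on the heap, use it, and let ownership release it.
fn sum_of_squares(n: u64) -> u64 {
    let squares: Vec<u64> = (1..=n).map(|x| x * x).collect(); // heap allocation
    squares.iter().sum()
    // `squares` goes out of scope here and its memory is freed automatically,
    // so there is no explicit deallocation call to forget or to repeat.
}

fn main() {
    let buffer = vec![10u8, 20, 30];
    println!("{:?}", read_slot(&buffer, 1)); // prints Some(20)
    println!("{:?}", read_slot(&buffer, 7)); // prints None
    println!("{}", sum_of_squares(4)); // prints 30
}
```

Plain indexing such as `buffer[7]` is also checked: in safe Rust it stops the program with a panic rather than reading whatever happens to lie beyond the buffer.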
What sort of challenges did you encounter as you were using EpiRust in the real-life scenario of COVID-19? How did you and your team overcome those challenges?
We’ve observed that COVID-19’s dynamics are not firmly characterized, due to variations of the pathogen and the lack of clinical case data. Secondly, interventions such as lockdown and healthcare system optimization evolved more quickly in society than in our model. We addressed the first challenge via the Mordecai model suggested by our collaborator, Dr. Gautam Menon. The second challenge still requires a lot of work, especially on inference from field data.
COVID-19 raises questions about hospital bed occupancy, ICU occupancy, ventilator support, risk to healthcare staff, transmission by infectious yet asymptomatic people, and the impact of various interventions to improve a situation. The Mordecai model is a nine-compartment SEIR model in which the infection compartment 'I' is divided into sub-compartments such as asymptomatically infected, mildly infected, and severely infected. It adds hospitalized compartments as well. These compartments help track the stages of infection in an individual.
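One way to picture compartment tracking for an individual is as a small state machine. The Rust sketch below models only the stages named above plus the standard S, E and R compartments. It does not reproduce the model's actual nine compartments or parameters: the single `Hospitalized` stage, the transition table, and the names `Stage`, `Person`, and `beds_in_use` are all assumptions made for illustration.

```rust
// Sketch of per-person compartment tracking. The stage list is a simplified
// subset and the allowed transitions are assumptions for illustration, not
// the published model.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Stage {
    Susceptible,
    Exposed,
    Asymptomatic,
    MildlyInfected,
    SeverelyInfected,
    Hospitalized,
    Recovered,
}

impl Stage {
    /// Whether a person in stage `self` may move directly to `next`
    /// (assumed transition table).
    fn can_move_to(self, next: Stage) -> bool {
        use Stage::*;
        matches!(
            (self, next),
            (Susceptible, Exposed)
                | (Exposed, Asymptomatic)
                | (Exposed, MildlyInfected)
                | (Asymptomatic, Recovered)
                | (MildlyInfected, SeverelyInfected)
                | (MildlyInfected, Recovered)
                | (SeverelyInfected, Hospitalized)
                | (Hospitalized, Recovered)
        )
    }
}

struct Person {
    stage: Stage,
    history: Vec<Stage>, // every stage this person has passed through
}

impl Person {
    fn new() -> Self {
        Person { stage: Stage::Susceptible, history: vec![Stage::Susceptible] }
    }

    /// Move to `next` if the transition is allowed; otherwise leave the
    /// person unchanged and report the rejected transition.
    fn advance(&mut self, next: Stage) -> Result<(), String> {
        if self.stage.can_move_to(next) {
            self.stage = next;
            self.history.push(next);
            Ok(())
        } else {
            Err(format!("cannot move from {:?} to {:?}", self.stage, next))
        }
    }
}

/// Count the people currently in hospital, a proxy for bed occupancy.
fn beds_in_use(people: &[Person]) -> usize {
    people.iter().filter(|p| p.stage == Stage::Hospitalized).count()
}

fn main() {
    let mut patient = Person::new();
    for next in [
        Stage::Exposed,
        Stage::MildlyInfected,
        Stage::SeverelyInfected,
        Stage::Hospitalized,
    ] {
        patient.advance(next).expect("transition allowed by the table");
    }
    let people = vec![patient, Person::new()];
    println!("beds in use: {}", beds_in_use(&people));
}
```

Keeping a per-person history is what lets a simulation answer questions such as bed occupancy over time, because each stage change is recorded rather than only aggregate counts.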
Would you consider EpiRust a success?
We have some early results for the cities of Pune and Mumbai. We simulated Pune under both a baseline scenario and a lockdown/intervention scenario (in which only essential services are allowed). We submitted a paper to SIMS 2020 (a prestigious conference on simulation techniques), and I am pleased to share that it has been accepted for publication.
‘Success’ in this case is a relative term. EpiRust does enable epidemiological studies and simulations. However, COVID-19 is still too elusive a disease to be properly modeled by the framework. There's a lot of work to be done, and the good news is that the path in front of us is clear enough to walk on.
How did it feel for you as a technologist to apply your work to something unfolding in real time that was impacting people around the world?
It’s a strange feeling: satisfaction and impatience at the same time. We're excited to watch our ideas materialize and to have a chance to contribute to the development of simulation-based policy making in India, and perhaps beyond.
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.