Phylogenetics And Networks for Generalised HIV Epidemics in Africa




Prevalence of HIV continues to be high in parts of Southern Africa. In the face of vaccine failures and the limited impact of behavioral interventions, much hope is pinned on combination strategies involving antiretroviral therapy. However, a major challenge remains in developing tools that can detect reductions in incidence or provide insight into ongoing transmission, particularly in generalized HIV-1 epidemic settings. Could viral phylogenetics play a role in describing the epidemic and in assessing the impact of interventions?

Phylogenetics is the study of evolutionary relationships among groups of organisms. Phylogenetic analysis has played a crucial role in increasing our understanding of many aspects of HIV biology, including identification of the zoonotic origin of the virus and its migration across the planet. In developed countries, the large-scale sequence datasets generated through universal treatment monitoring have prompted the development of methods for analyzing them, in order to better characterize the transmission of HIV within these epidemics. Different modelling approaches have been adopted in the UK, Europe and the US with the aim of establishing direct links between epidemiological models of infectious disease and the interpretation of viral sequence data, but even in these data-rich settings there is no consensus on the optimum approach.

The greatest challenge to the characterization of HIV transmission is the complexity of the networks along which it spreads. These networks differ between populations, risk groups and communities, and to date only concentrated epidemics have been studied. The current study is the first to develop a coherent approach to extending these analyses to generalized epidemics. This represents a potential major advance in monitoring the impact of new interventions to reduce HIV-1 spread, including the expanded use of antiretrovirals. Further, we are assessing the utility of embedding such approaches in future trials of new interventions.


The PANGEA_HIV project aims to adapt modern molecular epidemiology and phylodynamics of HIV sequence data to generate new insights into HIV transmission dynamics in generalized epidemics in Africa, and to provide a potential new approach for the evaluation of transmission interventions. 


Work-package 1 aims to produce 20,000 HIV genomes by the end of September 2016. To achieve this goal, the sequencing capacity at the Wellcome Trust Sanger Institute (Sanger Institute) needed to be scaled up, and a sustainable, high content sequencing pipeline had to be developed at the Africa Centre for Health and Population Sciences. These milestones were achieved during the first few months of the project.

The Sanger Institute in the United Kingdom is one of the world's leading research-led genomics centers, and a major part of its core activity focuses on high-throughput sequencing. It now has the capacity to generate up to 8,000 sample sequencing libraries per month.

The sequencing activities at the Sanger Institute are being supported by the Africa Centre for Health and Population Sciences at the University of KwaZulu-Natal in South Africa. The Africa Centre has been generating HIV genomic sequences for the past 12 years and has developed some of the most respected drug resistance databases and pipelines for sequencing analysis. It is one of the Wellcome Trust Major Overseas Programmes, which provides synergies with the Sanger Institute in terms of common approaches and data sharing.

Figure 1: Predicted sample and sequence generation rate at the Sanger Institute (blue line), the Africa Centre (red line) and as a cumulative total (green line) over the project.


The objective of work-package 2 is to obtain samples from prioritized clinical sites, with linked clinical and epidemiological data. To achieve this goal, numerous institutions and study teams have joined PANGEA_HIV.

Plasma samples are being collected as part of the Rakai cohort, the Botswana Combination and Prevention Project (BCPP), the Mochudi project and PopART, as well as studies conducted by the MRC Uganda and the Africa Centre in South Africa. Ethical and, where applicable, regulatory approval has been or will be obtained before these samples are shipped either to the Africa Centre in KwaZulu-Natal, South Africa, or to the Wellcome Trust Sanger Institute (Sanger Institute) in Cambridge, UK. This is illustrated in Figure 2 below.

Figure 2: Sample flow from the contributing studies to the two sequencing centers.


As part of work-package 3, an appropriate and ethical database is being developed. This database will contain effectively anonymized, linked clinical and sequencing data and will be housed within a secure haven at the Farr Institute at University College London (UCL).

The Farr Institute of Health Informatics Research was created in March 2013 and aims to harness health data for patient and public benefit by setting the international standard for the safe and secure use of electronic patient records and other population-based datasets for research purposes.

The data collected as part of PANGEA_HIV will only be used to investigate the specific objectives outlined in pre-agreed analysis proposals, and only effectively anonymized or pseudo-anonymized data will be released to consortium members and external collaborators. The patients’ identities will therefore always be protected.
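
To make the idea of pseudo-anonymized linkage concrete, the following is a minimal sketch in Python. It is not a description of the PANGEA_HIV database or of the Farr Institute's procedures. It assumes a keyed hash (HMAC-SHA256) as the pseudonymization step, and the field names (`patient_id`, `art_status`, `sequence_id`) and the functions `pseudonymize` and `link_records` are hypothetical names chosen for illustration.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for patient_id using keyed HMAC-SHA256.

    Hypothetical helper: the same ID and key always give the same pseudonym,
    but the ID cannot be recomputed without the secret key.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def link_records(clinical, sequences, secret_key):
    """Join clinical and sequence records on the pseudonym.

    The raw patient identifier is dropped from every released record.
    """
    by_pseudonym = {}
    for rec in clinical:
        pid = pseudonymize(rec["patient_id"], secret_key)
        by_pseudonym[pid] = {k: v for k, v in rec.items() if k != "patient_id"}

    linked = []
    for rec in sequences:
        pid = pseudonymize(rec["patient_id"], secret_key)
        if pid in by_pseudonym:
            linked.append(
                {"pseudonym": pid, **by_pseudonym[pid], "sequence_id": rec["sequence_id"]}
            )
    return linked
```

In a design like this, the secret key would stay with the data custodian, so that recipients of the linked records can connect a sequence to its clinical data without being able to recover the original identifier.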


Work-package 4 uses the HIV-1 genomes produced as part of PANGEA_HIV to characterize transmission dynamics in generalized epidemics with an initial focus on these specific issues: 

  1. What characterizes different epidemics at different stages: what is the difference in transmission dynamics between an older and a younger generalized epidemic? 
  2. How do high-risk groups (e.g. young women, commercial sex workers, migrants) influence the transmission dynamics of generalized epidemics?
  3. What is the role of acute infection in the transmission dynamics of a generalized epidemic?
  4. What is the current and estimated impact of antiretroviral treatment expansion on transmission?
  5. What is the relationship between ART intervention and the proportion of transmitted viruses carrying drug resistance?

Addressing the specific questions above will contribute substantially to our understanding of the epidemics in areas where major interventions are planned.
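
As a purely illustrative sketch of one simple sequence-based analysis, the Python below groups aligned sequences whose pairwise genetic distance falls below a threshold, a common first step when looking for putative transmission clusters. It is not the consortium's method. The toy sequences, the threshold value and the function names `p_distance` and `distance_clusters` are assumptions made for the example, and real analyses would typically use model-based distances and full phylogenetic trees.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        differing += x != y
    return differing / compared if compared else float("nan")


def distance_clusters(seqs: dict, threshold: float) -> list:
    """Single-linkage clusters: two sequences are linked if their
    p-distance is at most `threshold`. Singletons are not reported."""
    parent = {name: name for name in seqs}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [g for g in groups.values() if len(g) > 1]
```

The choice of threshold strongly affects which clusters appear, so any value used in practice would need to be justified against the epidemic and genomic region under study.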