GWAS revealed that the FOXP4 locus on chromosome 6 is associated with long COVID

The consortium of authors conducted the first genome-wide association study (GWAS) focused specifically on long COVID. The analysis identified genetic variants within the FOXP4 locus as a risk factor for long COVID.

A new condition, known as post-acute sequelae of COVID-19 (PASC) or long COVID, has been recognized since the start of the COVID-19 pandemic. The most common symptoms are fatigue, pulmonary dysfunction, muscle and chest pain, and neurological symptoms. Estimates of the incidence of long COVID vary widely, from 10% to 70%. The biological mechanisms that contribute to the development of long COVID remain unclear. Proposed mechanisms include persistent infection with SARS-CoV-2, autoimmunity, reactivation of latent pathogens such as Epstein-Barr virus, and dysregulation of the autonomic nervous system.

The COVID-19 Host Genetics Initiative (HGI) was launched to investigate the role of host genetics in COVID-19 and its various clinical subtypes. To date, it has identified 51 distinct genome-wide significant loci associated with severe forms of COVID-19 and hospitalization. The authors emphasized that these variants largely implicate canonical pathways involved in viral entry, mucosal airway defense, and type I interferon response.

About the study

The authors analyzed 24 independent GWAS with long COVID cases and performed four GWAS meta-analyses, one for each combination of two case definitions and two control definitions. In total, the study included 6,450 patients with long COVID and 1,093,995 controls. The data came from 16 countries and represented populations from six genetic ancestry groups.
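To illustrate how per-study GWAS results for a single variant can be combined, the following is a minimal sketch of an inverse-variance weighted fixed-effect meta-analysis. This is a common approach, but it is an assumption here: the article does not state which meta-analysis method the authors used. The function name and all effect sizes and standard errors are hypothetical and are not taken from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study log odds ratios with inverse-variance weights.

    betas: per-study effect estimates (log odds ratios) for one variant
    ses:   the matching standard errors
    Returns the pooled effect, its standard error, the z-score,
    and a two-sided p-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]          # more precise studies weigh more
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided normal tail
    return pooled_beta, pooled_se, z, p_value


# Hypothetical per-study results for one variant (illustration only).
example_betas = [0.40, 0.55, 0.30]
example_ses = [0.20, 0.25, 0.15]

beta, se, z, p = fixed_effect_meta(example_betas, example_ses)
print(f"pooled log OR = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.2e}")
print(f"pooled odds ratio = {math.exp(beta):.2f}")
```

In a design like the one described, a calculation of this kind would be repeated for each variant and for each of the four case and control definition pairs.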

A strict long COVID case definition required an earlier test-verified SARS-CoV-2 infection, whereas a broader long COVID case definition included self-reported or clinician-diagnosed infection.

According to the questionnaire-based studies with available symptom information, the most common symptoms in long COVID cases were fatigue, shortness of breath, and problems with memory and concentration.


The GWAS meta-analysis that used the strict case definition (N = 3,018) and the broad control definition (N = 994,582), drawn from 11 studies, revealed a strong association between the FOXP4 locus on chromosome 6 and long COVID. The rs9367106-C allele was associated with an increased risk of long COVID. The genomic region surrounding this lead variant contains four genes (FOXP4, FOXP4-AS1, LINC01276, MIR4641).

The authors then analyzed single-cell sequencing data to understand the role of FOXP4 in healthy lung tissue before SARS-CoV-2 infection, and to identify cells that express FOXP4 and may contribute to long COVID. The highest expression of FOXP4 was observed in type 2 alveolar cells. These cells participate in a robust innate immune response, secrete surfactant, keep alveoli free of fluid, and serve as progenitor cells for repopulating damaged epithelium after injury. Equally high expression of FOXP4 was observed in granulocytes, which are also involved in regulating the innate immune response.

Most of the studies included in this analysis were conducted in individuals of mainly European ancestry. The allele frequency of rs9367106-C at the FOXP4 locus varied widely among the study populations, from 1.6% in non-Finnish Europeans to 7.1% in Finnish, 19% in admixed American, and 36% in East Asian populations. Despite their smaller sample sizes, the studies of admixed American, East Asian, and Finnish ancestry showed significant associations for the FOXP4 variant.
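To give a sense of what these allele frequencies mean at the level of individuals, the sketch below converts each reported frequency into the expected share of people carrying at least one copy of rs9367106-C. It assumes Hardy-Weinberg equilibrium, which is a simplifying assumption made here for illustration and is not stated in the study; the function name is hypothetical.

```python
def carrier_frequency(allele_freq):
    """Expected fraction of individuals with at least one copy of the allele.

    Under Hardy-Weinberg equilibrium, the fraction with zero copies is
    (1 - p)^2, so the carrier fraction is 1 - (1 - p)^2.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 1.0 - (1.0 - allele_freq) ** 2


# Allele frequencies of rs9367106-C as reported in the article.
reported_frequencies = {
    "Non-Finnish European": 0.016,
    "Finnish": 0.071,
    "Admixed American": 0.19,
    "East Asian": 0.36,
}

for population, freq in reported_frequencies.items():
    carriers = carrier_frequency(freq)
    print(f"{population}: allele frequency {freq:.1%}, expected carriers {carriers:.1%}")
```

Under this assumption, carriers are far more common in populations where the allele is frequent, which is one reason smaller studies in those populations can still detect the association.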

FOXP4 is a transcription factor gene expressed in almost all tissues, with the highest expression in cervix, thyroid, vasculature, stomach, and testis. In addition, FOXP4 has been associated with COVID-19 severity, lung function, and cancer. The researchers therefore investigated a possible link between the FOXP4 variant and other diseases. They focused on Biobank Japan because the highest frequency of the long COVID risk allele was observed in East Asia. A phenome-wide association study of rs9367106 across all phenotypes in Biobank Japan (N = 262 phenotypes) showed that the long COVID risk allele was associated with lung cancer.

In conclusion, this first GWAS of long COVID identified an association at the FOXP4 locus and suggested possible mechanisms by which FOXP4 may contribute to the risk of long COVID. These findings provide direct genetic evidence that lung pathophysiology may play a significant role in the development of long COVID, and further support a role for pulmonary dysfunction in the condition. However, the authors emphasize that the severity of acute SARS-CoV-2 infection alone cannot explain the association between the genetic risk factor at the FOXP4 locus and long COVID.

The results of the study have been published on a preprint server and are currently being peer-reviewed.

Journal Reference

Lammi V, Nakanishi T, Jones SE et al. Genome-wide Association Study of Long COVID. medRxiv preprint. (Open Access)