Association of common variant rs9934336 of SLC5A2 (SGLT2) gene with SARS-CoV-2 infection and mortality

The life-threatening complications of COVID-19 are more pronounced in people with underlying health conditions such as diabetes, cardiovascular disease and kidney disease. Inhibition of the sodium-glucose cotransporter 2 (SGLT2), which primarily increases urinary glucose excretion, has been shown to be beneficial in patients with type 2 diabetes mellitus (T2D) and other comorbidities. SGLT2 is encoded by the SLC5A2 gene, and of its several genetic variants, SNP rs9934336 is gaining importance because of its association with reduced HbA1c levels and a lower incidence of T2D. Since a complex bidirectional relationship exists between COVID-19 and hyperglycaemia, we conducted a worldwide association study to investigate the effect of rs9934336 on COVID-19 outcomes. Worldwide prevalence data for SLC5A2 SNP rs9934336 were obtained from relevant published articles and databases of genomic variants. COVID-19 data obtained from the Worldometer website were used for Spearman's correlation analysis with minor allele frequency data of rs9934336. A significant positive correlation was observed between rs9934336 minor allele frequency and COVID-19 incidence (p < 0.0001, r = 0.6225) as well as deaths (p < 0.0001, r = 0.5837). The present finding of a significant association of SLC5A2 variant rs9934336 with COVID-19 risk has to be validated by case-control studies in diverse populations, along with other variants regulating the expression and function of SGLT2.


Introduction
The devastating pandemic of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has life-threatening complications that are more pronounced in people with underlying health conditions or comorbidities, such as diabetes, cardiovascular disease and kidney disease [1]. Of these, a complex bidirectional relationship seems to exist between COVID-19 and diabetes. While hyperglycaemia in diabetes predisposes to severe disease and adverse outcomes, including death and long-term COVID-19 sequelae, SARS-CoV-2 infection is described to induce new-onset hyperglycaemia and diabetes in individuals with COVID-19 as well as poor glycaemic control in pre-existing diabetes [2]. The kidney plays an important role in glucose homoeostasis. Specifically, of the two sodium-glucose cotransporters (SGLTs: SGLT1 and SGLT2) expressed in the kidneys, SGLT2 accounts for approximately 90% of the reabsorption of glucose from the glomerular ultrafiltrate, glucose that would otherwise be excreted in the urine [3]. Interestingly, inhibition of SGLT2 provides excellent blood glucose lowering in an insulin-independent manner via glucosuria, and SGLT2 inhibitors (SGLT-2is) are widely used for the control of type 2 diabetes (T2D). Moreover, SGLT-2is have been shown to have cardio- and reno-protective effects, which have been confirmed in clinical trials [3]. SGLT-2is are also gaining importance in the treatment of heart failure [4] and chronic kidney disease [5], even in patients without diabetes.
SGLT2 is encoded by the SLC5A2 (Solute Carrier Family-5 Member-2) gene, which is located on human chromosome 16. Common genetic variants of SLC5A2, primarily SNP rs9934336: G > A, may have a role in the regulation of glucose homoeostasis and may influence the risk of T2D, including the treatment response to SGLT-2is in T2D patients [3]. Since a complex bidirectional relationship seems to exist between COVID-19 and hyperglycaemia, and since SLC5A2 variant rs9934336 is likely to affect glucose homoeostasis, we conducted a worldwide association study to investigate the effect of this genetic variant on SARS-CoV-2 infection and mortality.

Prevalence data of SLC5A2 variant rs9934336
The present study is a worldwide retrospective cohort study. Genotype and/or allele frequency data of rs9934336 were obtained from relevant published articles through literature search (PubMed, Google Scholar) and from databases of genomic variants, such as ALFRED (Allele Frequency Database) (https://alfred.med.yale.edu/alfred/ or https://alfred.med.yale.edu/alfred/index.asp) and the 1000 Genomes Project (https://www.internationalgenome.org or https://www.ensembl.org/index.html), to acquire the maximum possible data across worldwide human populations. Though ALFRED lacked genotype data and had no information about Hardy-Weinberg equilibrium (HWE), inclusion of its data will probably not affect the reliability of the study dataset, as intensive curation and data integrity checks are performed before any data are uploaded into the database [6]. The minor allele frequency (MAF) data obtained from ALFRED were compared with the MAF data in gnomAD (Genome Aggregation Database) (https://gnomad.broadinstitute.org). Data were merged only for countries with more than one study/report, separately with and without the data from ALFRED.
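
The text does not specify how reports from the same country were merged. As one plausible illustration only, the sketch below pools reports by weighting each MAF by the number of chromosomes sampled (equivalent to summing allele counts). The function name `pooled_maf`, the tuple layout and the example figures are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def pooled_maf(reports):
    """Pool per-report MAFs into one MAF per country.

    reports: iterable of (country, maf, n_chromosomes) tuples.
    Assumption: reports are combined by summing minor-allele counts,
    i.e. a sample-size-weighted mean; the study may have merged differently.
    """
    minor_alleles = defaultdict(float)  # estimated minor-allele count per country
    chromosomes = defaultdict(int)      # total chromosomes sampled per country
    for country, maf, n_chrom in reports:
        minor_alleles[country] += maf * n_chrom
        chromosomes[country] += n_chrom
    return {c: minor_alleles[c] / chromosomes[c] for c in chromosomes}


# Hypothetical example: two reports for country "X", one for country "Y".
example = [("X", 0.20, 200), ("X", 0.30, 600), ("Y", 0.25, 100)]
merged = pooled_maf(example)
```

Here country "X" receives a single pooled value that lies closer to the larger report, while country "Y", with one report, keeps its MAF unchanged.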

Statistical analysis
Countrywide incidence (total cases per million population) and mortality (deaths per million population) data of COVID-19 were used for Spearman's correlation analysis with MAF data of SLC5A2 rs9934336 in GraphPad Prism, version 5.0. A p value ≤ 0.05 was considered statistically significant. In the absence of patient-wise rs9934336 genotype and COVID-19 outcome data, our association study correlates countrywide COVID-19 outcomes with the respective country-wise rs9934336 MAF data from people of all age groups and genders, which minimises the possible bias due to disparity in race, age and gender. Moreover, because data from more than one study/report for a country were merged, and studies lacking sufficient data and/or deviating from HWE were excluded from the analysis, the correlation results seem to be convincing.
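
The analysis itself was run in GraphPad Prism. For readers who want to see what the statistic does, the following is a minimal pure-Python sketch of Spearman's rank correlation coefficient (Pearson correlation of average ranks). It computes the coefficient only, not the p value, and the country values shown are invented for illustration; the helper names are hypothetical.

```python
import math


def per_million(count, population):
    """Express a raw count (cases or deaths) per million population."""
    return count * 1_000_000 / population


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented illustration: MAF and cases per million for five countries.
maf = [0.10, 0.15, 0.20, 0.25, 0.30]
cases_per_million = [500, 1200, 900, 4000, 7000]
rho = spearman_rho(maf, cases_per_million)
```

Because the coefficient depends only on ranks, it is insensitive to the very large differences in scale between countries' case counts, which is one reason a rank-based measure suits this kind of countrywide data.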

Results
A total of 105 reports with genotype and/or allele frequency data of SLC5A2 variant rs9934336 across worldwide human populations were obtained from relevant published articles and databases of genomic variants (ALFRED and 1000 Genomes Project) after a systematic and rigorous literature search. Some reports obeying HWE in the 1000 Genomes Project database were also found in ALFRED. Moreover, the MAF data obtained from ALFRED matched the data in gnomAD, indicating that ALFRED is a reliable source of MAF data though it lacks information about HWE. Countrywide COVID-19 data obtained from the two sources (Worldometer and WHO websites) were comparable.
Thirty reports obtained from published articles and the 1000 Genomes Project database had information about HWE. Among these 30 studies, three (one each from India, Italy and the Gambia) were omitted as they deviated from HWE. After merging MAF data for countries with more than one report (each obeying HWE), data from 21 countries were obtained. Considering the MAF data of rs9934336 acquired from ALFRED added data from 20 more countries. In total, data from 41 countries were taken into consideration for the present study (Table 1). Thus, among these 41 countries, some had data obeying HWE (from published articles and the 1000 Genomes Project database) as well as data from ALFRED, which had no information about HWE.
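
The text does not state which test was used to judge deviation from HWE. A common choice is a chi-square goodness-of-fit test with one degree of freedom on the three genotype counts; the sketch below assumes that test and a 0.05 threshold, and the function name and genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_gg, n_ga, n_aa):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic SNP.

    Returns (minor allele frequency of A, chi-square statistic, p value).
    Assumption: 1-df goodness-of-fit test; the study's actual test is not stated.
    """
    n = n_gg + n_ga + n_aa
    q = (2 * n_aa + n_ga) / (2 * n)  # frequency of the A allele
    p = 1.0 - q                      # frequency of the G allele
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_gg, n_ga, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return q, chi2, p_value


# Hypothetical genotype counts: the first set fits HWE exactly, the second does not.
in_hwe = hwe_chi_square(64, 32, 4)
out_of_hwe = hwe_chi_square(50, 0, 50)
```

Under the assumed threshold, a report such as the second example (p value far below 0.05) would be excluded, as was done for the three deviating studies.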

Discussion
The present findings of a significant association of COVID-19 occurrence and mortality with SLC5A2 variant rs9934336 across worldwide human populations suggest a possible deleterious impact of SLC5A2 expression on the disease. Upregulation of SGLT2 along with downregulation of apelin and angiotensin-converting enzyme 2, followed by severe inflammation, oxidative stress, endothelial damage and fibrosis, has been demonstrated to contribute to the adverse cardio-renal injury in COVID-19 patients [13]. Renal SGLT2 expression is also increased in T2D patients [14]. Though rs9934336 likely provides some protection from the development of T2D [7,12,15] and heart failure [16] in a few populations of the world, other studies investigating its association with susceptibility to T2D show inconsistent results [8,9]. However, the association of rs9934336 with poor glycaemic control and a higher risk of diabetic retinopathy in T2D patients in a Slovenian study [11], as well as with COVID-19 risk in our study, is interesting. Because it is located in a deep intronic position, rs9934336 is suggested to possibly have no functional effect [7]. Since the exact role of this variant with regard to SGLT2 expression is still unclear, it is possible that another variant associated with rs9934336 actually affects the function of SGLT2, which remains to be examined.
Currently, dapagliflozin (DAPA), empagliflozin, canagliflozin and ertugliflozin are the primary commercially available SGLT-2is [3]. They possess anti-inflammatory properties and reduce mRNA expression of some cytokines and chemokines (TNF-α, IL-6, MCP-1) and other inflammatory markers (C-reactive protein) [17,18]. Besides being cardio- and reno-protective [3,5,14], SGLT-2is also protect the lungs by decreasing oxidative stress, tissue hypoxia and inflammation [14]. Moreover, DAPA has been reported to reduce lactate levels and is proposed to protect against the severe course of SARS-CoV-2 infection by preventing the lowering of cytosolic pH and reducing the viral load [19]. SGLT-2i usage was also associated with reduced hospitalisation, severity and mortality in COVID-19 patients with T2D [20,21], and no increased risk of developing COVID-19 was observed [22], which supported the safe use of these drugs for primary care of diabetes during the COVID-19 pandemic. Though euglycaemic diabetic ketoacidosis was reported in some COVID-19 patients with T2D using SGLT-2is [23], DAPA was found to be safe in a randomised, phase 3 trial (DARE-19 study) [24]. Hence, discontinuation of DAPA was discouraged in these patients as long as they were monitored [24]. It seems that the benefits of using SGLT-2is in COVID-19 patients with pre-existing diabetes and/or cardio-renal conditions outweigh the rare side effects found in a minority of patients. A recent study documents that upregulated expression of SGLT2, induced by plasma from COVID-19 patients, disrupts vascular homoeostasis, which is restored upon treatment with SGLT-2is [25].
It is also documented that individuals carrying the SLC5A2 rs9934336 minor allele in either the homozygous (AA) or heterozygous (GA) state excrete higher levels of urinary glucose [7,15]. However, a better response to SGLT-2is is predicted in individuals who excrete lower levels of glucose in the urine [26]; thus, individuals carrying the G allele of rs9934336 may respond better to SGLT-2is than individuals with the minor allele A [7]. Though the exact mechanism behind the positive correlation between the rs9934336 A allele and COVID-19 detected in our study remains unclear, the correlation suggests a poor treatment response to SGLT-2is in COVID-19 patients carrying the variant, whose frequency is, however, found to be lower in T2D [7]. Additional studies to validate this observation are necessary.
The present study is the first to report the association of SLC5A2 variant rs9934336 with COVID-19 risk. However, the finding has to be validated by case-control studies in diverse populations, along with other variants regulating the expression and function of SGLT2. Moreover, treatment response studies involving the use of SGLT-2is in COVID-19 patients with different rs9934336 genotypes, as well as other functional genetic variants, might provide some insight into the utility of SGLT-2is in COVID-19 patients with T2D or other comorbidities.

Fig. 1 Correlation of minor allele frequency (MAF) of SLC5A2 rs9934336 with COVID-19 incidence and mortality. Data from 41 countries were analysed. Each circle in the figure represents a country. Spearman's correlation analysis showed that MAF of SLC5A2 variant rs9934336 was positively associated with COVID-19 incidence (cases/million of population) (p < 0.0001, r = 0.6225) (A) and COVID-19 mortality (deaths/million of population) (p < 0.0001, r = 0.5837) (B). The countries enrolled in the study are specified in Table 1.