The angiotensin-converting enzyme 2 (ACE2) receptor has been identified as the key adhesion molecule for the transmission of SARS-CoV-2. However, ACE2 gene variation is not associated with COVID-19 susceptibility, severity, or outcomes. We hypothesize that genes interacting with ACE2 are enriched for molecular pathways relevant to COVID-19 susceptibility. Accordingly, we employed a top-down approach starting with genes that interact with ACE2, followed by tissue gene expression enrichment, drug-gene interaction and enrichment analyses, and variant prioritization using genetic variants within the ACE2 gene-gene connectome and protein-protein interaction networks. With this approach, we identified several risk loci supported by the COVID-19 Host Genetics Initiative’s published genome-wide association studies (GWAS) and demonstrable effects of the ACE2 gene network relevant to the wide range of symptoms observed following SARS-CoV-2 infection.
Table for above graph
Each data point represents a trait associated with the gene, as mined from the GWAS Atlas. Traits are grouped into domains (x-axis), and the size of each data point represents the sample size (legend on the right) of the study for which the association statistic was reported. The y-axis shows the -log10(p-value) of the gene's association with the respective trait. The dotted line marks the Bonferroni significance threshold (1e-5), correcting for the number of traits present in the GWAS Atlas.
## [1] "Number of data points: 245"
##
## Not.Signi Signi
## 203 42
## [1] "Number of data points: 323"
##
## Not.Signi Signi
## 300 23
## [1] "Number of data points: 64"
##
## Not.Signi
## 64
## [1] "Number of data points: 240"
##
## Not.Signi Signi
## 239 1
## [1] "Number of data points: 393"
##
## Not.Signi Signi
## 380 13
## [1] "Number of data points: 401"
##
## Not.Signi Signi
## 296 105
## [1] "Number of data points: 267"
##
## Not.Signi Signi
## 265 2
## [1] "Number of data points: 259"
##
## Not.Signi
## 259
## [1] "Number of data points: 247"
##
## Not.Signi Signi
## 237 10
## [1] "Number of data points: 228"
##
## Not.Signi Signi
## 226 2
## [1] "Number of data points: 324"
##
## Not.Signi Signi
## 322 2
## [1] "Number of data points: 223"
##
## Not.Signi Signi
## 212 11
## [1] "Number of data points: 218"
##
## Not.Signi
## 218
## [1] "Number of data points: 466"
##
## Not.Signi Signi
## 400 66
## [1] "Number of data points: 477"
##
## Not.Signi Signi
## 431 46
## [1] "Number of data points: 260"
##
## Not.Signi Signi
## 258 2
## [1] "Number of data points: 422"
##
## Not.Signi Signi
## 382 40
## [1] "Number of data points: 314"
##
## Not.Signi Signi
## 311 3
## [1] "Number of data points: 226"
##
## Not.Signi Signi
## 224 2
## [1] "Number of data points: 425"
##
## Not.Signi Signi
## 393 32
## [1] "Number of data points: 256"
##
## Not.Signi
## 256
## [1] "Number of data points: 342"
##
## Not.Signi Signi
## 292 50
## [1] "Number of data points: 376"
##
## Not.Signi Signi
## 357 19
## [1] "Number of data points: 269"
##
## Not.Signi Signi
## 266 3
## [1] "Number of data points: 243"
##
## Not.Signi
## 243
## [1] "Number of data points: 237"
##
## Not.Signi Signi
## 236 1
## [1] "Number of data points: 296"
##
## Not.Signi Signi
## 294 2
## [1] "Number of data points: 196"
##
## Not.Signi Signi
## 195 1
## [1] "Number of data points: 366"
##
## Not.Signi Signi
## 360 6
## [1] "Number of data points: 307"
##
## Not.Signi Signi
## 302 5
## [1] "Number of data points: 314"
##
## Not.Signi Signi
## 307 7
## [1] "Number of data points: 313"
##
## Not.Signi Signi
## 308 5
## [1] "Number of data points: 257"
##
## Not.Signi Signi
## 254 3
## [1] "Number of data points: 217"
##
## Not.Signi Signi
## 216 1
## [1] "Number of data points: 275"
##
## Not.Signi Signi
## 273 2
## [1] "Number of data points: 309"
##
## Not.Signi Signi
## 308 1
## [1] "Number of data points: 347"
##
## Not.Signi Signi
## 346 1
## [1] "Number of data points: 384"
##
## Not.Signi Signi
## 360 24
## [1] "Number of data points: 957"
##
## Not.Signi Signi
## 682 275
## [1] "Number of data points: 167"
##
## Not.Signi
## 167
## [1] "Number of data points: 335"
##
## Not.Signi Signi
## 330 5
## [1] "Number of data points: 80"
##
## Not.Signi Signi
## 79 1
## [1] "Number of data points: 227"
##
## Not.Signi
## 227
## [1] "Number of data points: 252"
##
## Not.Signi Signi
## 249 3
## [1] "Number of data points: 83"
##
## Not.Signi
## 83
## [1] "Number of data points: 527"
##
## Not.Signi Signi
## 503 24
## [1] "Number of data points: 632"
##
## Not.Signi Signi
## 529 103
## [1] "Number of data points: 428"
##
## Not.Signi Signi
## 404 24
## [1] "Number of data points: 444"
##
## Not.Signi Signi
## 425 19
## [1] "Number of data points: 255"
##
## Not.Signi Signi
## 241 14
## [1] "Number of data points: 181"
##
## Not.Signi Signi
## 180 1
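The tallies above split each gene's trait associations into `Signi` and `Not.Signi` relative to the Bonferroni line. A minimal sketch of that bookkeeping follows, assuming an association counts as significant when its p-value is strictly below the 1e-5 threshold quoted in the caption. The function names and the example p-values are hypothetical and are not taken from the GWAS Atlas or from the authors' code.

```python
import math
from collections import Counter

# Threshold quoted in the figure caption (Bonferroni line at 1e-5).
BONFERRONI_P = 1e-5


def neg_log10(p):
    """Return -log10(p), the quantity plotted on the y-axis."""
    return -math.log10(p)


def tally_significance(p_values, threshold=BONFERRONI_P):
    """Count associations below the threshold (Signi) and the rest (Not.Signi).

    Assumption: 'significant' means p strictly less than the threshold.
    """
    counts = Counter("Signi" if p < threshold else "Not.Signi" for p in p_values)
    return {"Not.Signi": counts.get("Not.Signi", 0), "Signi": counts.get("Signi", 0)}


# Hypothetical p-values for one gene, for illustration only.
example_p = [0.2, 3e-4, 9e-6, 1e-12, 0.05]
summary = tally_significance(example_p)
print("Number of data points:", len(example_p))
print(summary)
```

On the plot, the same cut-off corresponds to a horizontal line at `neg_log10(1e-5)`, i.e. a y-value of 5.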
Genes common to all six domains: “SLC44A4”, “APOA1”, “RORA”
## Variation.ID dbSNP Chromosome Position REF.Allele ALT.Allele..IUPAC.
## 1 rs760194105 rs760194105 1 151768549 ATTC -
## 2 rs1265893702 rs1265893702 1 151768556 C T
## 3 rs1195052699 rs1195052699 1 151768558 A G
## 4 rs1488141823 rs1488141823 1 151768564 G Y
## 5 rs1202366215 rs1202366215 1 151768566 T G
## 6 rs545985998 rs545985998 1 151768567 C G
## Minor.Allele Minor.Allele.Global.Frequency Contig Contig.Position Band
## 1 None None GL000016.1 3257191 q21.3
## 2 None None GL000016.1 3257198 q21.3
## 3 None None GL000016.1 3257200 q21.3
## 4 None None GL000016.1 3257206 q21.3
## 5 None None GL000016.1 3257208 q21.3
## 6 G 0.000200 GL000016.1 3257209 q21.3
Download full length results here: Gene-coordinates.txt https://yaleedu-my.sharepoint.com/:f:/g/personal/gita_pathak_yale_edu/Emq-SyQupL9Nj8SkSYPuM04BlBebU0lIbwDU7lGR84bCzg?e=frnpjq
Column headers:
* Variation ID: <dbsnp rs#>
* dbSNP: link to dbSNP, if known
* Chromosome: Variant mapped chromosome location
* Position: Variant start position on chromosome
* REF Allele: Reference allele
* ALT Allele (IUPAC): Observed allele
* Minor Allele: Minor allele observed in global population, if known
* Minor Allele Frequency: Minor allele frequency observed in global population, if known
* Contig: Variant mapped contig location
* Contig Position: Variant start position on contig
* Band: SNP cytogenetic location
## Variation.ID Chromosome Position Overlapped.Gene Type Annotation
## 1 <NA> <NA> NA <NA> <NA> <NA>
## 2 rs760194105 chr1 151768549 None None None
## 3 rs1265893702 chr1 151768556 None None None
## 4 rs1195052699 chr1 151768558 None None None
## 5 rs1488141823 chr1 151768564 None None None
## 6 rs1202366215 chr1 151768566 None None None
## Nearest.Upstream.Gene Type.of.Nearest.Upstream.Gene
## 1 <NA> <NA>
## 2 RP11-98D18.9 antisense
## 3 RP11-98D18.9 antisense
## 4 RP11-98D18.9 antisense
## 5 RP11-98D18.9 antisense
## 6 RP11-98D18.9 antisense
## Distance.to.Nearest.Upstream.Gene Nearest.Downstream.Gene
## 1 <NA> <NA>
## 2 1671 RP11-98D18.17
## 3 1678 RP11-98D18.17
## 4 1680 RP11-98D18.17
## 5 1686 RP11-98D18.17
## 6 1688 RP11-98D18.17
## Type.of.Nearest.Downstream.Gene Distance.to.Nearest.Downstream.Gene
## 1 <NA> <NA>
## 2 lincRNA 1981
## 3 lincRNA 1974
## 4 lincRNA 1972
## 5 lincRNA 1966
## 6 lincRNA 1964
Download full length results here: nearestgene_annotation.txt https://yaleedu-my.sharepoint.com/:f:/g/personal/gita_pathak_yale_edu/Emq-SyQupL9Nj8SkSYPuM04BlBebU0lIbwDU7lGR84bCzg?e=frnpjq
Column headers:
* Variation ID: <dbsnp rs#>
* Chromosome: Variant mapped chromosome location
* Position: Variant start position on chromosome
* Overlapped Gene: Name of the gene (HGNC system) that the variant overlaps
* Type: Gene type, e.g., protein coding, miRNA, non coding, Pseudogene, snoRNA, lincRNA etc.
* Annotation: Summary of whether the variant overlapped with the coding, intronic or untranslated regions of the various transcript isoforms of the gene, as annotated from Ensembl gene system.
* Nearest Upstream Gene: If variant is not overlapped with any gene, then the gene whose end position is nearest to the variant on the left (considering the alignment of genes on the positive strand as left-to-right)
* Type of Nearest Upstream Gene: Gene type, e.g., protein coding, miRNA, non coding, Pseudogene, snoRNA, lincRNA etc.
* Distance to Nearest Upstream Gene: distance from the end position of the nearest upstream gene.
* Nearest Downstream Gene: If variant is not overlapped with any gene, then the gene whose start position is nearest to the variant on the right (considering the alignment of genes on the positive strand as left-to-right)
* Type of Nearest Downstream Gene: Gene type, e.g., protein coding, miRNA, non coding, Pseudogene, snoRNA, lincRNA etc.
* Distance to Nearest Downstream Gene: distance from the start position of the nearest downstream gene.
## Variation.ID Chromosome Position Variant PHRED
## 1 rs1265893702 chr1 151768556 C/T 13.53
## 2 rs1195052699 chr1 151768558 A/G 12.55
## 3 rs1309532353 chr1 151768828 T/C 15.02
## 4 rs946376285 chr1 151768833 G/A 14.38
## 5 rs1044802297 chr1 151768837 C/T 15.29
## 6 rs903034465 chr1 151768839 G/C 16.03
Download annotation for CADD PHRED score > 10 here: SNPswithpathogenicCADD-annotation.txt https://yaleedu-my.sharepoint.com/:f:/g/personal/gita_pathak_yale_edu/Emq-SyQupL9Nj8SkSYPuM04BlBebU0lIbwDU7lGR84bCzg?e=frnpjq
Column headers:
* Variation ID: <dbsnp rs#>
* Chromosome: Chromosome name
* Position: Variant start position on chromosome
* Variant: <reference allele,“/”,observed allele> as reported in the tool’s genome-wide score
* PHRED: PHRED-like (-10*log10(rank/total)) scaled CADD score ranking a variant relative to all possible substitutions of the human genome. A score ≥ 10 indicates that the variant is predicted to be among the 10% most deleterious possible substitutions in the human genome, a score ≥ 20 indicates the 1% most deleterious, and so on.
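To make the PHRED-like scaling concrete, the short sketch below applies the formula quoted above, -10*log10(rank/total), and inverts it to recover the fraction of substitutions ranked at least as deleterious. The function names are hypothetical, and rank 1 is assumed to denote the most deleterious substitution; this is an illustration of the arithmetic, not the scoring tool's implementation.

```python
import math


def phred_scaled(rank, total):
    """PHRED-like score: -10 * log10(rank / total).

    Assumption: rank 1 is the most deleterious of `total` substitutions.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return -10.0 * math.log10(rank / total)


def top_fraction(phred):
    """Invert the scaling: fraction of substitutions ranked at least as deleterious."""
    return 10 ** (-phred / 10.0)


# A variant ranked at the 10% boundary scores about 10; at the 1% boundary, about 20.
score_10pct = phred_scaled(100, 1000)
score_1pct = phred_scaled(10, 1000)

# The first row of the table above has PHRED 13.53, i.e. roughly the top 4-5%.
fraction_first_row = top_fraction(13.53)
print(score_10pct, score_1pct, fraction_first_row)
```

Read this way, the PHRED > 10 filter used for the download file keeps variants predicted to fall within the top 10% of possible substitutions.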
## Variation.ID Chromosome Position Variant Functional.Significance.Score
## 1 rs543482228 chr11 522483 C/T 0.55310
## 2 rs763218255 chr11 522724 A/T 0.52076
## 3 rs1381432050 chr11 522725 A/T 0.53188
## 4 rs1048970710 chr11 522953 T/A 0.50105
## 5 rs1030030831 chr11 523050 A/G 0.53433
## 6 rs987669684 chr11 525258 G/A 0.62874
## eQTL.Probability GWAS.Probability HGMD.Probability
## 1 0.35134 0.27441 0.36835
## 2 0.37491 0.29143 0.36774
## 3 0.42790 0.30231 0.37801
## 4 0.34974 0.27633 0.37562
## 5 0.41663 0.31488 0.37194
## 6 0.35291 0.24213 0.36358
Download annotation for functional score > 0.5 here: SNPsDeepScoreTOP-annotation.txt https://yaleedu-my.sharepoint.com/:f:/g/personal/gita_pathak_yale_edu/Emq-SyQupL9Nj8SkSYPuM04BlBebU0lIbwDU7lGR84bCzg?e=frnpjq
Column headers:
* Variation ID: <dbsnp rs#>
* Chromosome: Chromosome name
* Position: Variant start position in the chromosome
* Variant: <reference allele,“/”,observed allele> as reported in the tool’s genome-wide score
* eQTL Probability: The probability of the variant being an eQTL variant, given by the functional variant prioritization classifier.
* GWAS Probability: The probability of the variant being a trait-associated (GWAS) variant, given by the functional variant prioritization classifier.
* HGMD Probability: The probability of the variant being an inherited disease-associated (HGMD) variant, given by the functional variant prioritization classifier.
* Functional Significance Score: A measure in the range [0-1] depicting the significance of the magnitude of the predicted chromatin effect and evolutionary conservation. A lower score indicates a higher likelihood of functional significance of the variant.
The top 50 FDR-significant associations are shown in the bar graph; all significant associations are listed in the table under the figure.
We compared the mean probability of Neanderthal local ancestry (LA) between the ACE2 network SNP set (mean = 0.032) and 1,000 randomly selected SNP sets with comparable genomic features (range of Neanderthal LA means = 0.027-0.036). The ACE2 network SNP set had significantly greater Neanderthal LA probabilities than 663 of the 1,000 randomly selected SNP sets.
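The comparison against random SNP sets can be illustrated with a simple count of how many null sets have a lower mean than the observed set. This is only a sketch of the bookkeeping: it compares means directly and does not reproduce whatever per-set significance test was used in the analysis, which the text does not specify. The function name and the null means below are hypothetical.

```python
def count_exceeded(observed_mean, null_means):
    """Return how many null-set means fall below the observed mean, and the fraction."""
    n_below = sum(1 for m in null_means if m < observed_mean)
    return n_below, n_below / len(null_means)


# Hypothetical null-set means spanning the reported range (0.027-0.036);
# the real analysis used 1,000 matched random SNP sets.
null_means = [0.027, 0.029, 0.031, 0.033, 0.036]
n_below, fraction = count_exceeded(0.032, null_means)
print(n_below, fraction)
```

With the figures reported above, the analogous fraction would be 663/1,000.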
LD-pruned and p-value-clumped network SNPs from the COVID-19 Host Genetics Initiative (Freeze 3) for all six phenotypes - https://www.covid19hg.org/results/
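For readers unfamiliar with p-value clumping, the sketch below shows the general greedy idea: take the most significant SNP as an index SNP, absorb nearby SNPs in LD with it, and repeat with the next unassigned SNP. It covers clumping only, not the separate LD-pruning step. The thresholds (`p_index`, `r2_min`, `window_bp`), the data layout, and the SNP identifiers are illustrative assumptions; the source does not state which tool or parameters were used.

```python
def clump(snps, r2, p_index=1e-4, r2_min=0.5, window_bp=250_000):
    """Greedy p-value clumping (illustrative thresholds, not the authors' settings).

    snps: list of dicts with keys "id", "chrom", "pos", "p".
    r2:   dict mapping frozenset({id_a, id_b}) to pairwise LD r^2;
          pairs missing from the dict are treated as r^2 = 0.
    Returns a dict mapping each index SNP id to the SNP ids clumped under it.
    """
    ordered = sorted(snps, key=lambda s: s["p"])  # most significant first
    assigned = set()
    clumps = {}
    for index in ordered:
        # Skip SNPs already absorbed by a clump or too weak to seed one.
        if index["id"] in assigned or index["p"] > p_index:
            continue
        assigned.add(index["id"])
        members = []
        for other in ordered:
            if other["id"] in assigned:
                continue
            same_region = (other["chrom"] == index["chrom"]
                           and abs(other["pos"] - index["pos"]) <= window_bp)
            in_ld = r2.get(frozenset((index["id"], other["id"])), 0.0) >= r2_min
            if same_region and in_ld:
                assigned.add(other["id"])
                members.append(other["id"])
        clumps[index["id"]] = members
    return clumps


# Hypothetical SNPs and LD values, for illustration only.
snps = [
    {"id": "snpA", "chrom": "1", "pos": 1_000, "p": 1e-8},
    {"id": "snpB", "chrom": "1", "pos": 5_000, "p": 1e-6},
    {"id": "snpC", "chrom": "1", "pos": 900_000, "p": 5e-5},
    {"id": "snpD", "chrom": "2", "pos": 1_000, "p": 0.2},
]
ld = {frozenset(("snpA", "snpB")): 0.8}
result = clump(snps, ld)
print(result)
```

Here `snpB` is absorbed into the clump led by `snpA`, `snpC` forms its own clump because it lies outside the window, and `snpD` is dropped because its p-value is too large to seed a clump.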
Figure panels: the association of SNPs as QTLs for expression and methylation (four panels); the association of SNPs as QTLs for expression, methylation, protein, histone, splicing and miRNA (one panel). Panel labels: noQTLs, nomQTLs.
Pathak et al. (2020) ACE2 Netlas: In-silico functional characterization and drug-gene interactions of ACE2 gene network and its potential involvement in COVID-19 susceptibility
Gita A Pathak, Frank R Wendt, Aranyak Goswami, Flavio De Angelis, COVID-19 Host Genetics Initiative, Renato Polimanti
Affiliation: Yale School of Medicine, Department of Psychiatry, Division of Human Genetics, New Haven, CT; Veterans Affairs Connecticut Healthcare System, West Haven, CT
Corresponding authors:
* Renato Polimanti - renato.polimanti@yale.edu
* Gita Pathak - gita.pathak@yale.edu