Prediction and in silico validation of MYH7 gene missense variants in the Iranian population: a bioinformatics analysis based on Iranome database
Egyptian Journal of Medical Human Genetics volume 21, Article number: 17 (2020)
Identifying disease-causing genetic variants in a particular population improves the molecular diagnosis of genetic disorders. National genome databases provide valuable information on this matter. This study aimed to investigate the genomic variants of the MYH7 gene, related to the common heart disease, i.e., hereditary cardiomyopathy.
MYH7 gene variants were extracted from the Iranome database and loaded into SPSS software. The filtration steps were performed based on the variant specification, with emphasis on identifying missense changes. Using predictive algorithms, different aspects of the changes such as allele frequency and functional defects were investigated. Our results showed that 41 (17.4%) coding variants were synonymous compared with 18 (7.7%) missense alterations. The missense variants were mostly observed in exons 20–40, which encode the MyHC α-helical rod tail. The variants p.Pro211Leu, p.Arg787His, p.Val964Leu, p.Arg1277Gln, and p.Ala1603Thr were already known to be associated with inherited cardiomyopathy. Four of the missense variants, p.Asn1623Ser, p.Arg1588His, p.Phe1498Tyr, and p.Arg1129Ser, were located in the MyHC α-helical rod tail, and none of them was annotated in the dbSNP or gnomAD databases.
Our study showed several MYH7 variants associated with the disease in the Iranian population. The results emphasize the importance of analyzing the exons encoding MyHC α-helical rod tail. The investigation of genomic databases can be considered as a cost-effective strategy using targeted mutation detection analyses. The efficacy of this prediction method should be elucidated in further studies on patients’ cohorts.
Recent progress in the detection of molecular genetic defects has led to a major development in the diagnosis and treatment of diseases. Decoding the human genome has provided important clues about the genetic diversity of diseases and paved the way for the development of more specialized prevention, diagnostic and therapeutic strategies. By using high-throughput technologies, next-generation sequencing (NGS) has generated a significant amount of genomic data, which has been widely used over the past decade.
The NGS method can generally be used to sequence genes regardless of their size and complexity and to cover all parts of the genome. This widespread coverage has improved the sensitivity of mutation detection beyond that of conventional approaches. Currently, the causative variants of many single-gene disorders have been identified by NGS-based methods. However, at the clinical level, identifying the effect of genetic variants on cell function and pathogenesis is extremely important. Thus, various software and web-based bioinformatics tools have been designed for variant evaluation.
Hereditary cardiomyopathies include a group of diseases that involve the heart muscle. Their most common complications comprise the thickening of the heart muscle or dilation of the ventricles, which lead to hypertrophic (HCM) and dilated (DCM) cardiomyopathies, respectively. Importantly, the patients may be asymptomatic or have mild non-specific symptoms. For this reason, heart failure can progress to sudden cardiac arrest in a seemingly healthy individual. Since cardiomyopathies run in families, rapid and accurate molecular diagnosis can be of great value in preventing disease progression in individuals with a positive family history.
One of the genes associated with cardiomyopathies is the myosin heavy chain gene (known as MYH7), mutations of which are reported in 14–25% of all cardiomyopathy cases. The MYH7 gene is located at chromosomal position 14q11-12 and consists of 40 exons. Myosin heavy chain (MyHC) protein is almost exclusively expressed in heart muscle and contributes to the formation of thick filaments in a hexamer format along with myosin light chains. The protein has 1934 amino acids and consists of two spherical heads followed by an extended α-helical myosin rod tail, which are joined at the neck region.
Since early studies of cardiomyopathies commonly reported mutations in the head region, the importance of the rod tail region is often underestimated.
Given that conventional analysis of the MYH7 gene is time-consuming and costly, regional studies have been limited to analyzing exons in the MyHC head domain. Consequently, they have had little success in mutation detection.
Iranian Genome Database (Iranome) has provided genomic information on 800 individuals regardless of their disease or health status [8, 9]. The distribution of reported variants could help to predict the occurrence of mutations in the related pathological conditions. It can be assumed that the variants reported in Iranome could also be distributed in the related patients. Although this is not a straightforward link, it can be a key to predicting pathogenic mutations. Owing to these facts, we aimed to perform further bioinformatics studies regarding MYH7 variants based on the Iranome database. The objective of our study was to identify variants that could be disease-causing. By detecting these variants, further clinical validation studies can focus on exons which probably have a higher chance of mutation in the Iranian population.
An analysis of the Iranome database revealed a total of 235 variants in the MYH7 gene, of which 161 (68.5%) were predicted to be intronic (Fig. 1). Among coding variants, synonymous alterations had the highest frequency (17.4%, N = 41). Missense substitutions accounted for 18 (7.7%) of all reported changes. As indicated in Fig. 1, the remainder included 3′ UTR (1.7%, N = 4), frameshift (0.4%, N = 1), splice region (1.7%, N = 4), and nonsense (0.9%, N = 2) variants.
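The percentages in this paragraph follow directly from the counts out of 235 variants. A minimal sketch in Python that recomputes them is given here; the dictionary and function names are illustrative and are not part of the original analysis, which was carried out in SPSS and Excel.

```python
# Counts of MYH7 variant classes as quoted in the text (total reported: 235).
TOTAL_VARIANTS = 235
variant_counts = {
    "intronic": 161,
    "synonymous": 41,
    "missense": 18,
    "3' UTR": 4,
    "splice region": 4,
    "nonsense": 2,
    "frameshift": 1,
}


def percent_of_total(count, total=TOTAL_VARIANTS):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


variant_percentages = {
    name: percent_of_total(n) for name, n in variant_counts.items()
}

for name, pct in variant_percentages.items():
    print(f"{name}: N = {variant_counts[name]} ({pct}%)")
```

Note that the classes itemized in the text sum to 231 rather than 235; the text does not say how the remaining four variants were classified.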
When variants were analyzed based on the exon-intron distribution, it was found that intron 22 had the highest rate of changes. The synonymous alterations were located almost uniformly in all exons and two nonsense changes were reported in exons 3 and 33. Interestingly, the missense variants were mostly observed in exons 20–40 that encode MyHC α-helical rod tail (Fig. 2).
MYH7 missense variants
For further identification of the variants that could be considered pathogenic in the Iranian population, missense substitutions were studied in more detail. Variants located in exons that led to amino acid changes in the MyHC protein were then filtered. The filtering analysis found 18 missense alterations, including p.Pro211Leu, p.Arg787His, p.Val964Leu, p.Arg1277Gln, and p.Ala1603Thr, which were already known to be associated with inherited cardiomyopathy. Some substitutions had previously been identified as causative mutations in cardiomyopathies, although subsequent studies did not confirm their pathogenicity. This group includes p.Ala26Val and p.Arg1662His. Due to their high prevalence in the human genome databases and the results of clinical and bioinformatics studies, two variants, p.Asn1257Ser and p.Ser1491Cys, were previously considered polymorphisms. Variants p.Ala1191Thr, p.Ser1366Leu, p.Ser1596Leu, p.Asn1824Asp, and p.Asn1824Ser were found with relatively rare allele frequencies in the dbSNP or gnomAD databases. However, they have not been reported in relation to any disease and are generally considered variants of uncertain significance (Table 1).
The majority of the variants were detected in the heterozygous state in only one individual out of 800, indicating that they were very rare (allele frequency of 0.000625). Three variants were found in the heterozygous state, each in two different individuals. With an allele frequency of 0.0025, the variant p.Arg1277Gln was found in four heterozygous individuals. The most common variant was p.Ser1491Cys, with 22 reported heterozygous individuals and an allele frequency of 0.01375, which implies that it is a population polymorphism.
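The allele frequencies quoted above are consistent with counting one alternate allele per heterozygous carrier among the 1600 alleles of 800 diploid individuals. The following sketch reproduces that arithmetic; the function name and its `hom_carriers` parameter are illustrative assumptions (the text reports only heterozygous carriers).

```python
# Iranome cohort size as stated in the text. MYH7 is autosomal, so each
# individual contributes two alleles.
N_INDIVIDUALS = 800


def allele_frequency(het_carriers, hom_carriers=0, n_individuals=N_INDIVIDUALS):
    """Alternate-allele frequency from carrier counts in a diploid cohort."""
    alt_alleles = het_carriers + 2 * hom_carriers
    return alt_alleles / (2 * n_individuals)


print(allele_frequency(1))   # a variant seen in one heterozygous individual
print(allele_frequency(4))   # p.Arg1277Gln, four heterozygous individuals
print(allele_frequency(22))  # p.Ser1491Cys, 22 heterozygous individuals
```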
The results of variant pathogenicity from the databases and in silico analysis are presented in Table 2. As shown in the table, the results obtained from different sources were not necessarily consistent, and conflicting outcomes were observed. The variants with the most evidence of being disease-causing were p.Val964Leu, p.Arg1277Gln, and p.Ala1603Thr.
Interpretation of unannotated MYH7 missense variants
As indicated in Table 1, four reported missense variants, p.Asn1623Ser, p.Arg1588His, p.Phe1498Tyr, and p.Arg1129Ser, were not annotated in the dbSNP or gnomAD databases. All four variants were located in the MyHC α-helical rod tail (Fig. 3). Except for p.Asn1623Ser, none of these variants has been reported on the ClinVar website.
The variant p.Arg1129Ser (c.3387G>C), located in MYH7 exon 27, was identified as damaging by FATHMM. Another substitution at nucleotide 3387 (c.3387G>A) has been reported on ClinVar. This synonymous change, which does not alter the amino acid (p.Arg1129=), has been reported in cardiomyopathy and is considered likely benign.
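The relationship between the coding position c.3387 and residue 1129 can be checked with simple arithmetic, since each codon spans three coding nucleotides. The sketch below assumes standard HGVS c. numbering (position 1 is the A of the start codon) and the standard genetic code; the helper names are illustrative. The codon inferred at the end is a deduction from the two substitutions described above, not something stated in the source.

```python
def codon_position(cds_nucleotide):
    """Map a 1-based coding position to (codon number, position in codon)."""
    codon_number = (cds_nucleotide - 1) // 3 + 1
    position_in_codon = (cds_nucleotide - 1) % 3 + 1
    return codon_number, position_in_codon


# Subset of the standard genetic code (DNA codons) needed for this check.
CODON_TO_AA = {
    "CGT": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "AGA": "Arg", "AGG": "Arg",
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TCG": "Ser",
    "AGT": "Ser", "AGC": "Ser",
}


def arg_codons_consistent_with_text():
    """Arg codons ending in G where G>C gives Ser and G>A stays Arg."""
    hits = []
    for codon, amino_acid in CODON_TO_AA.items():
        if amino_acid == "Arg" and codon.endswith("G"):
            after_g_to_c = CODON_TO_AA.get(codon[:2] + "C")
            after_g_to_a = CODON_TO_AA.get(codon[:2] + "A")
            if after_g_to_c == "Ser" and after_g_to_a == "Arg":
                hits.append(codon)
    return hits


print(codon_position(3387))               # third base of codon 1129
print(arg_codons_consistent_with_text())  # candidate reference codon(s)
```

Because c.3387 is the third base of codon 1129, a G>C change there can alter the residue while a G>A change leaves it unchanged, which matches the missense and synonymous outcomes described.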
The p.Phe1498Tyr is located on exon 32 and has been declared as damaging by the majority of the algorithms, but not by MutationAssessor and FATHMM which interpreted this variant as tolerated (Table 3).
With a score of 34, p.Arg1588His (c.4763G>A) has the highest combined annotation-dependent depletion (CADD) score, indicating that the variant is among the top 0.1% of deleterious variants in the human genome. This variant has also been evaluated as disease-causing in almost all in silico analyses. On ClinVar, another missense variant, p.Arg1588Pro (c.4763G>C), and a synonymous alteration of p.Arg1588 (c.4764C>T) have been reported at this position; they are related to distal myopathy 1 and hypertrophic cardiomyopathy, respectively.
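To make the "top 0.1%" statement concrete: CADD scaled scores are PHRED-like, so a scaled score S corresponds to a rank fraction of 10^(-S/10) among all possible single nucleotide variants. The sketch below assumes that definition of the scaled score; the function name is illustrative.

```python
def cadd_phred_to_top_fraction(phred_score):
    """Fraction of possible SNVs ranked at or above a CADD scaled score.

    Assumes the PHRED-like definition: score = -10 * log10(rank fraction).
    """
    return 10 ** (-phred_score / 10)


for score in (20, 30, 34):
    fraction = cadd_phred_to_top_fraction(score)
    print(f"CADD {score}: top {fraction:.4%} of variants")
```

Under this definition a score of 30 marks the top 0.1%, and the score of 34 reported for p.Arg1588His lies well inside that range (roughly the top 0.04%).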
The variant p.Asn1623Ser has been declared pathogenic by most of the software and is reported on ClinVar as associated with cardiomyopathy phenotypes with uncertain significance. This variant occurs at a highly conserved asparagine residue encoded by exon 34 of the MYH7 gene.
Using various predictive algorithms, we have evaluated the MYH7 gene variants reported on the Iranome website. Following the filtering steps, 18 missense MYH7 variants were found that could be related to the pathogenesis of cardiomyopathies. Located in exon 3, p.Ala26Val was previously reported in HCM and DCM probands from families of Asian origin. Further studies revealed that the alanine 26 substitution is likely benign, as it occurs at a poorly conserved amino acid. Furthermore, it has an allele frequency of 0.55 in the East Asian population, which, according to the ClinGen Inherited Cardiomyopathy Expert Panel, is above the threshold and should be considered benign.
Another variant, p.Pro211Leu, has been identified in several studies related to cardiomyopathies [17, 18]. It has been reported in several patients as a compound heterozygous alteration along with other MYH7 missense mutations. It should be noted that mutations adjacent to Pro211Leu have been reported to be involved in disease pathogenesis. Its low prevalence is another reason to consider it a disease-causing mutation.
In a previous study, p.Arg787His was described as a mutation that could cause phenotypes of varying severity. This mutation has been reported in several studies from India, while in the Iranome database it was identified in a heterozygous Persian Gulf Islander. Given the geographic proximity, a founder effect may be involved, although in studies from India this mutation has been identified as de novo.
The variant that should be considered seriously in Iranian cardiomyopathy patients is p.Val964Leu, located in exon 23. In Iranome, two individuals, of Turkmen and Persian ethnicity, carried this substitution. The p.Val964Leu variant has been linked to cardiomyopathies, either HCM or DCM, in numerous studies [22,23,24]. However, this variant is listed in ClinVar with conflicting interpretations of pathogenicity because of its relatively high frequency in the European population (0.08%). Valine 964 is located in the neck region of MyHC and is a highly conserved amino acid; thus, the change to leucine was predicted to be pathogenic.
Another variant of uncertain significance is p.Arg1277Gln, which is a semi-conservative amino acid change. This substitution is located in exon 34 and has been reported from different parts of the world [26, 27].
The p.Ala1603Thr variant is another alteration that should be considered in Iranian studies. In silico testing, including protein predictors and evolutionary conservation, showed that p.Ala1603Thr can be pathogenic. This variant was first reported in a cohort of HCM patients using the high resolution melting (HRM) method. In a recent study, p.Ala1603Thr was also reported in an HCM patient and was deemed pathogenic in the population study.
The next variant of uncertain significance is p.Arg1662His, which is found in both HCM and DCM [30, 31]. It should be noted that Histidine is the wild-type amino acid at this position, in different species.
In our study, four amino acid substitutions in the MYH7 protein were taken into consideration. These variants occurred in the rod tail region of the protein and were reported as disease-causing by most prediction software. Among them, p.Asn1623Ser was reported in ClinVar and suggested to be deleterious based on a computational algorithm that was developed to evaluate the pathogenicity of MYH7 gene variants. The other three variants were not present in the dbSNP or gnomAD databases and have not been reported in individuals with MYH7-related cardiomyopathy according to the literature. That could be evidence of their pathogenicity in the Iranian population. However, this finding should be confirmed by conducting molecular studies on potential patients. In summary, studies of Iranian patients should prioritize the evaluation of MYH7 exons 20–40.
Given the high cost of molecular diagnosis and its vital importance for many patients, the availability of national databases should be considered as a valuable opportunity. The availability of this information will also prevent blind studies and have a promising impact on the perspective of genetics research.
Based on the literature review and an extensive search of available databases, MYH7 was selected because it makes the greatest contribution to hereditary cardiomyopathies. All national reports that had described MYH7 mutations and their association with cardiac disease were screened. In the next step, all the reported MYH7 variants were extracted from the Iranome website and loaded into SPSS version 20.0 and Excel 2010. The Iranome database includes the results of NGS analysis of 800 genomes obtained from Iranian individuals over 35 years old. The samples were collected from 8 different Iranian ethnic groups, 100 individuals from each. The Iranome website provides a search tool based on the gene name, genomic region, transcript, and multi-allele variants, which are continually updated with new genomic data. The majority of the reported variants are shared with other populations, while 30% (422,000) of these genetic changes are unique to the Iranian population.
To determine the MYH7 gene variants associated with cardiomyopathies, the data in Iranome were filtered in several steps. All variants that occurred in exons and led to an amino acid change were selected and studied. To identify the pathogenic effects of the variants, they were divided into two groups: previously reported and unannotated variants. An “unannotated variant” was defined as an alteration not previously reported in the dbSNP or gnomAD databases. Published articles and documents related to the reported mutations were also analyzed, and variants that were associated with the cardiomyopathy phenotype were identified.
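The filtering steps described above can be sketched as follows. This is a minimal illustration, not the SPSS/Excel workflow actually used: the record layout, the field names (`consequence`, `in_dbsnp_or_gnomad`), and the toy intronic record are assumptions, while the annotation status of the three named variants follows the text.

```python
# Toy records with a hypothetical layout; only the three named missense
# variants and their database status are taken from the text.
variants = [
    {"name": "toy_intronic_variant", "consequence": "intronic",
     "in_dbsnp_or_gnomad": True},
    {"name": "p.Ser1491Cys", "consequence": "missense",
     "in_dbsnp_or_gnomad": True},
    {"name": "p.Val964Leu", "consequence": "missense",
     "in_dbsnp_or_gnomad": True},
    {"name": "p.Arg1588His", "consequence": "missense",
     "in_dbsnp_or_gnomad": False},
]


def split_missense(records):
    """Keep missense variants and split them by database annotation status."""
    missense = [r for r in records if r["consequence"] == "missense"]
    annotated = [r["name"] for r in missense if r["in_dbsnp_or_gnomad"]]
    unannotated = [r["name"] for r in missense if not r["in_dbsnp_or_gnomad"]]
    return annotated, unannotated


annotated, unannotated = split_missense(variants)
print("previously reported:", annotated)
print("unannotated:", unannotated)
```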
Bioinformatics analysis was done on putative MYH7 nucleotide substitutions selected from filtering steps using the following databases and online resources.
Genome Aggregation Database: http://gnomad.broadinstitute.org/
The data were interpreted using various online algorithms. Each tool is score-based: after the analysis, it returns a numerical value. The results are presented in the tables after the final interpretation. Deleterious thresholds were PolyPhen2 > 0.5, MutationTaster > 0.5, SIFT > 0.95, Mutation Assessor > 0.65, FATHMM > 0.453, and CADD ≥ 30 (in the top 0.1% of deleterious variants in the human genome).
Mutation Assessor: mutationassessor.org
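The deleterious thresholds listed above can be applied mechanically to a set of tool scores. The sketch below encodes the cutoffs exactly as stated in the text; the function name and the example scores are illustrative assumptions, except for the CADD score of 34, which is the value reported for p.Arg1588His. The SIFT cutoff of > 0.95 is taken as given; it presumably refers to a converted score, since raw SIFT scores conventionally flag low values as damaging.

```python
import operator

# Cutoffs as stated in the text: (comparison, threshold) per tool.
DELETERIOUS_RULES = {
    "PolyPhen2": (operator.gt, 0.5),
    "MutationTaster": (operator.gt, 0.5),
    "SIFT": (operator.gt, 0.95),
    "MutationAssessor": (operator.gt, 0.65),
    "FATHMM": (operator.gt, 0.453),
    "CADD": (operator.ge, 30),
}


def classify(scores):
    """Label each tool's score as 'deleterious' or 'tolerated'."""
    calls = {}
    for tool, score in scores.items():
        compare, cutoff = DELETERIOUS_RULES[tool]
        calls[tool] = "deleterious" if compare(score, cutoff) else "tolerated"
    return calls


# Illustrative scores only; CADD = 34 is the one value taken from the text.
example_scores = {"PolyPhen2": 0.9, "FATHMM": 0.2, "CADD": 34}
example_calls = classify(example_scores)
print(example_calls)
```

Keeping the comparison operator next to each cutoff makes the one inclusive threshold (CADD ≥ 30) explicit instead of hiding it in a special case.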
Availability of data and materials
The data that support the findings of this study are available on Iranome website and on request from the author.
SNVs: Single nucleotide variants
MyHC: Myosin heavy chain
MYH7: Myosin heavy chain 7
Phe: Phenylalanine
Ser: Serine
HRM: High resolution melting
SIFT: Sorting Intolerant From Tolerant
FATHMM: Functional Analysis through Hidden Markov Models
CADD: Combined annotation-dependent depletion
Le Gallo M, Lozy F, Bell DW (2017) Next-generation sequencing. Adv Exp Med Biol. 943:119–148
Yohe S, Thyagarajan B (2017) Review of clinical next-generation sequencing. Arch Pathol Lab Med. 141(11):1544–1557
Burns C, Bagnall RD, Lam L, Semsarian C, Ingles J (2017) Multiple gene variants in hypertrophic cardiomyopathy in the era of next-generation sequencing. Circ Cardiovasc Genet. 10(4)
Wilcox JE, Hershberger RE (2018) Genetic cardiomyopathies. Curr Opin Cardiol. 33(3):354–362
Goff ZD, Calkins H (2019) Sudden death related cardiomyopathies - hypertrophic cardiomyopathy. Prog Cardiovasc Dis
Sedaghat-Hamedani F, Kayvanpour E, Tugrul OF, Lai A, Amr A, Haas J et al (2018) Clinical outcomes associated with sarcomere mutations in hypertrophic cardiomyopathy: a meta-analysis on 7675 individuals. Clin Res Cardiol. 107(1):30–41
Oldfors A, Lamont PJ (2008) Thick filament diseases. Adv Exp Med Biol. 642:78–91
Fattahi Z, Beheshtian M, Mohseni M, Poustchi H, Sellars E, Nezhadi SH et al (2019) Iranome: a catalog of genomic variations in the Iranian population. Hum Mutat. 40(11):1968–1984
Ng D, Johnston JJ, Teer JK, Singh LN, Peller LC, Wynter JS et al (2013) Interpreting secondary cardiac disease variants in an exome cohort. Circ Cardiovasc Genet. 6(4):337–346
Liu SX, Hu SJ, Sun J, Wang J, Wang XT, Jiang Y et al (2005) Characteristics of the beta myosin heavy chain gene Ala26Val mutation in a Chinese family with hypertrophic cardiomyopathy. Eur J Intern Med. 16(5):328–333
Kelly MA, Caleshu C, Morales A, Buchan J, Wolf Z, Harrison SM et al (2018) Adaptation and validation of the ACMG/AMP variant classification framework for MYH7-associated inherited cardiomyopathies: recommendations by ClinGen’s Inherited Cardiomyopathy Expert Panel. Genet Med. 20(3):351–359
Perrot A, Schmidt-Traub H, Hoffmann B, Prager M, Bit-Avragim N, Rudenko RI et al (2005) Prevalence of cardiac beta-myosin heavy chain gene mutations in patients with hypertrophic cardiomyopathy. J Mol Med (Berl). 83(6):468–477
Woo A, Rakowski H, Liew JC, Zhao MS, Liew CC, Parker TG et al (2003) Mutations of the beta myosin heavy chain gene in hypertrophic cardiomyopathy: critical functional sites determine prognosis. Heart. 89(10):1179–1185
Fourey D, Care M, Siminovitch KA, Weissler-Snir A, Hindieh W, Chan RH et al (2017) Prevalence and clinical implication of double mutations in hypertrophic cardiomyopathy: revisiting the gene-dose effect. Circ Cardiovasc Genet. 10(2)
Purushotham G, Madhumohan K, Anwaruddin M, Nagarajaram H, Hariram V, Narasimhan C et al (2010) The MYH7 p.R787H mutation causes hypertrophic cardiomyopathy in two unrelated families. Exp Clin Cardiol. 15(1):e1–e4
Bashyam MD, Purushotham G, Chaudhary AK, Rao KM, Acharya V, Mohammad TA et al (2012) A low prevalence of MYH7/MYBPC3 mutations among familial hypertrophic cardiomyopathy patients in India. Mol Cell Biochem. 360(1-2):373–382
Hershberger RE, Parks SB, Kushner JD, Li D, Ludwigsen S, Jakobs P et al (2008) Coding sequence mutations identified in MYH7, TNNT2, SCN5A, CSRP3, LBD3, and TCAP from 313 patients with familial or idiopathic dilated cardiomyopathy. Clin Transl Sci. 1(1):21–26
van Spaendonck-Zwarts KY, van Rijsingen IA, van den Berg MP, Lekanne Deprez RH, Post JG, van Mil AM et al (2013) Genetic analysis in 418 index patients with idiopathic dilated cardiomyopathy: overview of 10 years' experience. Eur J Heart Fail. 15(6):628–636
Captur G, Lopes LR, Mohun TJ, Patel V, Li C, Bassett P et al (2014) Prediction of sarcomere mutations in subclinical hypertrophic cardiomyopathy. Circ Cardiovasc Imaging. 7(6):863–871
Jordan DM, Kiezun A, Baxter SM, Agarwala V, Green RC, Murray MF et al (2011) Development and validation of a computational method for assessment of missense variants in hypertrophic cardiomyopathy. Am J Hum Genet. 88(2):183–192
Walsh R, Thomson KL, Ware JS, Funke BH, Woodley J, McGuire KJ et al (2017) Reassessment of Mendelian gene pathogenicity using 7,855 cardiomyopathy cases and 60,706 reference samples. Genet Med. 19(2):192–203
Zou Y, Wang J, Liu X, Wang Y, Chen Y, Sun K et al (2013) Multiple gene mutations, not the type of mutation, are the modifier of left ventricle hypertrophy in patients with hypertrophic cardiomyopathy. Mol Biol Rep. 40(6):3969–3976
Millat G, Chanavat V, Crehalet H, Rousson R (2010) Development of a high resolution melting method for the detection of genetic variations in hypertrophic cardiomyopathy. Clin Chim Acta. 411(23-24):1983–1991
Newman R, Jefferies JL, Chin C, He H, Shikany A, Miller EM et al (2018) Hypertrophic cardiomyopathy genotype prediction models in a pediatric population. Pediatr Cardiol. 39(4):709–717
Kassem H, Azer RS, Saber-Ayad M, Moharem-Elgamal S, Magdy G, Elguindy A et al (2013) Early results of sarcomeric gene screening from the Egyptian National BA-HCM Program. J Cardiovasc Transl Res. 6(1):65–80
Waldmuller S, Erdmann J, Binner P, Gelbrich G, Pankuweit S, Geier C et al (2011) Novel correlations between the genotype and the phenotype of hypertrophic and dilated cardiomyopathy: results from the German Competence Network Heart Failure. Eur J Heart Fail. 13(11):1185–1192
The author would like to express her appreciation to the founders and research staff at the Iranome genomics center for providing valuable data on Iranian population variants.
This work was supported by the Center for International Scientific Studies and Collaboration, CISSC, Iranian Ministry of Science, Research and Technology, no. 2390.
Ethics approval and consent to participate
The study was approved by the Ethics Committee of the School of Medicine, Tarbiat Modares University, Tehran, Iran. The consent to participate is not applicable in this study.
Consent for publication
The consent for publication is not applicable in this study.
The author declares no conflicts of interest in this research.
Cite this article
Shahbazi, S. Prediction and in silico validation of MYH7 gene missense variants in the Iranian population: a bioinformatics analysis based on Iranome database. Egypt J Med Hum Genet 21, 17 (2020). https://doi.org/10.1186/s43042-020-00058-4
- Iranian population
- Inherited cardiomyopathy