In silico analysis of mutation spectrum of Ehlers–Danlos, osteogenesis imperfecta, and cutis laxa overlapping phenotypes in Iranian population

Background Ehlers–Danlos syndrome (EDS), osteogenesis imperfecta (OI), and cutis laxa (CL) are three rare and heterogeneous connective tissue disorders. Patients with these syndromes have similar manifestations and unpredictable prognosis, making a misdiagnosis highly probable. Some of their subtypes are inherited in autosomal recessive patterns, so they are expected to be prevalent in populations like Iran, where consanguineous marriages are common. In the current work, a cohort of Iranian patients with overlapping phenotypes of the EDS/OI/CL and their mutation spectrum was defined. Based on this, in silico analysis was conducted to anticipate further probable genetic variations. Pathogenicity of EDS, OI, and CL variants in Iranian patients was evaluated using Web servers. A protein interaction network was created by String database and visualized using a Python-based library. The Iranome database was used to predict other genetic mutations in all reported genes of EDS, OI, and CL syndromes. Results In the EDS/OI/CL overlap phenotype, 32 variants in 18 genes have been involved. At least 59% of patients were from families with consanguineous marriages. Interaction analysis showed that COL1A1 , COL1A2 , CRTAP , LEPRE1 , PLOD1 , and ADAMTS2 have the most significant impact within the protein network of EDS/OI/CL overlap phenotype. Analyzing the Iranome database revealed 46 variants of EDS, OI, and CL genes potentially disease causing. Conclusion The overlapping phenotype of EDS, OI, and CL syndromes requires genetic testing (e.g., whole-exome sequencing) to reveal respective variants, which helps to diagnose more accurately and manage the disease more effectively. Particularly in populations with high rates of consanguineous marriages, such as Iran, genetic screening plays a crucial role in premarital and prenatal counseling to prevent the transmission of these rare connective tissue disorders.


Background
Hereditary connective tissue disorders (HCTD) comprise a heterogeneous and pleiotropic group of genetic conditions with structural and functional disruptions in extracellular matrix (ECM) components.Dermal, ocular, and musculoskeletal manifestations, along with heart and lung defects, contribute to the burden of HCTDs [2].From an epidemiological perspective, every HCTD is a rare disease, but combined, they are a notable part of human congenital disorders [9].Studying these syndromes enhances our understanding of the nature of connective tissue (CT) and has the potential to lead to more effective treatments.Connective tissue is one of the mesodermal germ layer derivatives that exist in almost every part of the body.It connects biological structures and establishes the framework necessary for the normal functioning of organs.This tissue comprises three basic parts: soft CT, which surrounds internal organs; hard CT, including bone and cartilage; and liquid CT, which is blood.Extracellular matrix in CT consists of four components: collagens, elastic fibers, glycoproteins, and glycosaminoglycans [8,30].
Collagens are fibrillar proteins that account for one-third of the human body's total protein. There are five types of classical fibrillar collagen: types I, II, III, V, and XI, which are different helical conformations of alpha-chain polypeptide strands coiling around each other. The alpha chain is made of an amino acid triplet repeat, glycine-X-Y, where X and Y are commonly proline and hydroxyproline, respectively [12,36]. While collagen fibrils are responsible for the strength of the structures, their resiliency is provided by elastic fibers. The process of elastin formation, also referred to as elastogenesis, is complex and not yet fully understood. Microfibrils are the main building blocks of elastic fibers. They are a polymerized scaffold of fibrillins, large proteins with a molecular weight of 150 kDa [45]. Collagen, elastic fibers, and other ECM components like fibronectin and laminin interact to mediate tissue morphogenesis, cell adhesion, migration, and differentiation.
Clinical management of HCTDs is faced with three challenges [35]: (1) Ambiguity: The ubiquitous presence of connective tissue throughout the human body contributes to the challenge of defining and observing the phenotypes of HCTDs in various organs.(2) Variability: patients with the same diagnosis of an HCTD can differ, even in intra-familial cases.(3) Unpredictability: phenotypes of an individual with an HCTD can change over the lifetime, and also they might have temporal manifestations.Therefore, a misdiagnosis at the early stages is highly probable.
Based on which component of the ECM is dysregulated, HCTDs are categorized into two major classes: collagenopathies, including Ehlers-Danlos syndrome (EDS), osteogenesis imperfecta (OI), Alport syndrome, and chondrodysplasias; and elastinopathies, including cutis laxa (CL), Marfan syndrome, and pseudoxanthoma elasticum (PXE). These diseases exhibit a wide range of phenotypic variation and genetic heterogeneity. To date, a total of 20, 16, and 13 genes have been implicated in EDS, OI, and CL, respectively.
Ehlers-Danlos syndrome is a soft HCTD characterized by skin hyperextensibility, joint hypermobility, bone fragility and osteoporosis, atrophic scars, loose skin, and cardiovascular problems like mitral valve prolapse [28]. The prevalence of different subtypes ranges from about 1 in 5000 to 1 in 20,000. Based on the 2017 international classification, classical EDS, arthrochalasia EDS, and cardiac valvular EDS are the three main subtypes of the syndrome [27].
Osteogenesis imperfecta has a prevalence of 1 in 20,000 live births. It mostly manifests with growth defects, bone fragility, osteopenia, dentinogenesis imperfecta, and bluish sclera. Up to 90 percent of OI cases are due to mutations in COL1A1 and COL1A2. These two genes are also responsible for many EDS cases [24]. The initial step in diagnosing these two syndromes is identifying their clinical signs, whose similarity makes it challenging to provide follow-up care and genetic counseling. There is an extremely rare condition called EDS/OI overlap, which affects approximately 1 in every 1,000,000 individuals (based on Orphanet data). It was first described in 2013 when patients with combined symptoms were reported. Molecular analysis of this overlap revealed an association with N-terminal mutations in type I collagen [33].
An abnormal synthesis of elastic fibers can result in CL syndrome, characterized primarily by loose and redundant skin, developmental emphysema, cardiovascular issues like aortic aneurysm, hernia, delayed growth, and fragile bones. In some CL cases, patients mimic manifestations of EDS with similar skin hyper-elasticity, scarring, and joint laxity [13]. Furthermore, CL cases with mutations in RIN2 and ELN exhibit phenotypic similarities to EDS patients, including sparse hair and alopecia [14,48].
Genetic defects in CT components mostly manifest as phenotypic traits. In these three disorders, in addition to the shared CT nature, intermediate clinical phenotypes (e.g., bluish sclera in EDS and OI and bone fragility in all three) increase the probability of misdiagnosis. Consanguineous marriage (marriage between relatives) is common in Iran, with an overall rate of 38.6% throughout the country [44]. Thus, it has received great attention as a potential risk factor for many genetically influenced health outcomes, especially autosomal recessive (AR) disorders.
According to NORD's database (https://rarediseases.org/), CL, EDS, and OI have various subtypes and inheritance patterns. For CL, subdivisions are as follows: acquired cutis laxa, ALDH18A1-related cutis laxa, ATP6V0A2-related cutis laxa, autosomal dominant cutis laxa (ADCL), autosomal recessive cutis laxa type 1A (ARCL1A), autosomal recessive cutis laxa type 1B (ARCL1B), autosomal recessive cutis laxa type 1C (ARCL1C), autosomal recessive cutis laxa type 2A (ARCL2A), autosomal recessive cutis laxa type 2B (ARCL2B), autosomal recessive cutis laxa type 3, Debre-type cutis laxa, EFEMP2-related cutis laxa, ELN-related cutis laxa, geroderma osteodysplasticum, LTBP4-related cutis laxa, MACS syndrome, PYCR1-related cutis laxa, RIN2-related cutis laxa, Urban-Rifkin-Davis syndrome, and wrinkly skin syndrome. Most cases of autosomal dominant cutis laxa are caused by mutations in the elastin (ELN) gene and are also known as ELN-related cutis laxa or autosomal dominant cutis laxa type 1 (ADCL1). One case, classified as autosomal dominant cutis laxa type 2 (ADCL2), was caused by a mutation in the fibulin-5 (FBLN5) gene. Ehlers-Danlos syndrome subdivisions are as follows: classic EDS, classical-like EDS, cardiac valvular EDS, vascular EDS, hypermobile EDS, arthrochalasia EDS, dermatosparaxis EDS, kyphoscoliotic EDS, brittle cornea syndrome, spondylodysplastic EDS, musculocontractural EDS, myopathic EDS, and periodontal EDS. Among these syndromes, classical-like, cardiac valvular, dermatosparaxis, and kyphoscoliotic types are inherited in an AR manner. Myopathic EDS has both autosomal dominant (AD) and AR inheritance, while the remaining types are AD. Osteogenesis imperfecta types I to XXI are subtypes of OI. Types I to V exhibit AD inheritance, while the remaining types are inherited in an AR manner. These three rare HCTDs have multiple AR subtypes, which adds another layer of complexity to their clinical overlap.
As depicted in Fig. 1, a total of 45 genes have been identified globally to be associated with the overlapping phenotype of EDS, OI, and CL. About half are on chromosomes 1, 11, 12, and 17. Details like function, related pathway, and ontology of these genes are listed in Table 1. The aim of this study is to address the clinical and genetic complexities of the overlapping phenotypes of Ehlers-Danlos syndrome (EDS), osteogenesis imperfecta (OI), and cutis laxa (CL). The research question is to what extent specific genetic variants contribute to the clinical features of these disorders within the Iranian population, which is characterized by a high consanguineous marriage rate [16,49]. It is hypothesized that a clear genetic basis of these diseases can aid in the development of more precise diagnostic and therapeutic strategies like whole-exome sequencing or RNA therapeutics [5,22]. The genetic diversity of the Iranian population is leveraged in this study to fill a critical knowledge gap in understanding the pathogenesis of these syndromes and to propose potential targets for intervention in populations with similar genetic backgrounds.

Data collection
A systematic search was performed to identify Iranian patients with EDS, OI, and CL and to map their mutation spectrum. For that purpose, the keywords 'osteogenesis imperfecta', 'ehlers danlos', and 'cutis laxa' along with 'Iran' (or 'Iranian') in both English and Farsi were used. PubMed, Web of Science, Scopus, Cochrane Library, Google Scholar, and the Scientific Information Database (SID, an Iranian medical database) were searched. The search covered publications up to March 2023, with no further restrictions. Afterward, the manuscripts were filtered to retain those in which genetic tests reported the variants, and duplicates were removed. Moreover, the HGMD Professional 2021.4 database was queried for each gene to evaluate the number and types of mutations.

Protein interaction analysis using NetworkX Python package
The NetworkX package (https://github.com/networkx/networkx), a Python library, was employed to explore and visualize complex networks [59]. The protein interaction data set was obtained from the STRING database, according to the latest NGS panel for each syndrome. Subsequently, a list of 50 proteins was created. Pairwise relations between these proteins were modeled as a graph, with proteins as nodes and interactions as edges, following the graph-theoretic framework of NetworkX.
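The graph model described above can be sketched as follows. This is a minimal illustration that uses only the Python standard library rather than NetworkX itself; the edge list, the interaction scores, and the `min_score` cutoff of 0.4 are hypothetical placeholders and are not taken from the study's STRING export.

```python
from collections import defaultdict

# Hypothetical STRING-style rows: (protein A, protein B, combined score in [0, 1]).
# The scores are illustrative placeholders, not values from the study.
EDGES = [
    ("COL1A1", "COL1A2", 0.99),
    ("COL1A1", "CRTAP", 0.95),
    ("COL1A2", "CRTAP", 0.93),
    ("CRTAP", "LEPRE1", 0.99),
    ("COL1A1", "PLOD1", 0.90),
    ("COL1A1", "ADAMTS2", 0.85),
    ("ELN", "FBLN5", 0.97),
]


def build_graph(edges, min_score=0.4):
    """Return an undirected weighted graph as {node: {neighbour: score}}.

    min_score is an assumed confidence cutoff; pairs below it are dropped.
    """
    graph = defaultdict(dict)
    for a, b, score in edges:
        if a == b or score < min_score:
            continue  # skip self-loops and low-confidence pairs
        graph[a][b] = score  # store the edge in both directions
        graph[b][a] = score
    return dict(graph)


graph = build_graph(EDGES)
n_nodes = len(graph)
# Every undirected edge is stored twice, once per endpoint.
n_edges = sum(len(neighbours) for neighbours in graph.values()) // 2
print(n_nodes, n_edges)
```

In NetworkX the same structure would typically be built with `nx.Graph()` and `add_edge(a, b, weight=score)`; the dictionary form is used here only to keep the sketch dependency-free.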

Prediction of probable damaging variants using Iranome database
The reported genes of CL, EDS, and OI syndromes were evaluated using the Iranome Genomic Database (http://www.iranome.ir/) to predict probable damaging variants. This database was established from whole-exome sequencing (WES) data of 800 healthy Iranian individuals from eight major Iranian populations, including Iranian Arabs, Azeris, Persians, Lurs, Baluchs, Persian Gulf Islanders, Kurds, and Turkmen. Iranome discovered more than 1,500,000 variants, more than 300,000 of which were novel [7,31]. The pathogenicity of these variants was investigated using six tools, namely SIFT, Polyphen2, MutationTaster, MutationAssessor, FATHMM, and FATHMM MKL, as listed on the Iranome Website. Here, the missense heterozygous alleles of all reported EDS, OI, and CL genes in the database were obtained; then, inclusion and exclusion criteria were established to predict which variants are most likely to cause one of the three syndromes. A variant was retained if it met all of the following criteria: (A) predicted as damaging by three or more of the six mentioned tools; (B) more than 10 heterozygotes; and (C) a CADD score of 20 or more.
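The three inclusion criteria can be expressed as a simple filter. The sketch below is illustrative only: the `Variant` record, its field names, and the four example variants are hypothetical and do not correspond to the Iranome export format or to any variant in Table 5.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    gene: str
    hgvs: str
    damaging_calls: int  # how many of the six predictors call it damaging
    heterozygotes: int   # number of heterozygous carriers in the database
    cadd: float          # CADD scaled score


def is_probable_damaging(v, min_calls=3, min_het=10, min_cadd=20.0):
    """Apply criteria A-C: >= 3 damaging calls, > 10 heterozygotes, CADD >= 20."""
    return (
        v.damaging_calls >= min_calls
        and v.heterozygotes > min_het   # strictly more than 10
        and v.cadd >= min_cadd
    )


# Hypothetical placeholder variants, for illustration only.
candidates = [
    Variant("GENE_A", "c.100A>G", damaging_calls=5, heterozygotes=14, cadd=24.1),
    Variant("GENE_B", "c.200C>T", damaging_calls=2, heterozygotes=30, cadd=26.0),
    Variant("GENE_C", "c.300G>A", damaging_calls=4, heterozygotes=10, cadd=22.5),
    Variant("GENE_D", "c.400T>C", damaging_calls=3, heterozygotes=11, cadd=20.0),
]

selected = [v for v in candidates if is_probable_damaging(v)]
print([v.gene for v in selected])  # GENE_A and GENE_D pass all three criteria
```

GENE_B fails criterion A and GENE_C fails criterion B, because exactly 10 heterozygotes does not satisfy "more than 10".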

Data collection
Based on the initial keyword search in the six search engines, 416, 1593, and 1748 manuscripts were found for CL, EDS, and OI, respectively. Further detailed mining of the papers and removal of duplicated studies revealed 13, 8, and 8 manuscripts reporting homozygous variants in cases with CL, EDS, and OI, respectively (Fig. 2). In total, 32 variants in 18 genes were identified by genetic tests in 43 patients. The novelty status of variants, reported disorders, applied genetic test, and type of marriage are summarized in Table 2. The HGMD Professional 2021.4 database also revealed that most of the mutations in these genes are missense/nonsense, splicing, and small deletions (Table 3).

Pathogenicity and stability of EDS/OI/CL overlap phenotype variants
The results of both pathogenicity and stability analysis by Web servers are listed in Table 4, divided into two sections-one for missense variants and another for splice site, deletion, and duplication variants.From eight OI variants, only one is missense (COL1A1: c.2298T > C).Three CL and one EDS variants were inconsistent with the reported phenotype.FBLN5: c.544G > C was found in two patients and, according to I-Mutant 2.0, has a positive effect on protein stability.The Web server also reported the same effect for PYCR1: c.722C > A. One patient with CL type 2 had PYCR1: c.797G > A, which was identified as benign by Polyphen2.It also reported FKBP14: c.143T > A as a benign variant in an EDS patient.

Protein interaction analysis using NetworkX Python package
A list of 50 proteins involved in the EDS/OI/CL overlap phenotype was analyzed using NetworkX. Figure 3 shows an undirected weighted graph in which the nodes and edges represent proteins and their interactions, respectively, so that each edge's length shows the interaction score. The average degree of the graph (the average number of edges connected to each node) equals 4.47. The 'betweenness centrality' of each node was computed with NetworkX. It is a measure in graph theory that indicates which nodes are more important (i.e., whose absence would cause more disruption in the network) based on the shortest paths passing through them. In this graph, the size of a node indicates its betweenness centrality. Also, a color range from dark green to white indicates the degree: greener nodes have a higher degree (more connected edges), and bigger nodes have more impact on the network. Moreover, a wider and greener edge indicates a higher interaction score between a pair of proteins. The graph shows that COL1A1, COL1A2, CRTAP, LEPRE1, PLOD1, and ADAMTS2 have the biggest impact on the protein network of the overlapping phenotype.
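To make the two graph measures concrete, the following sketch computes the average degree and the betweenness centrality of a small toy graph. It is a standard-library implementation of Brandes' algorithm for unweighted graphs, not the code used in the study; the node labels P1 to P5 are hypothetical. NetworkX offers the same measure as `networkx.betweenness_centrality`, which by default also normalizes the values, whereas the sketch returns raw counts of shortest paths.

```python
from collections import deque


def average_degree(adj):
    """Mean number of edges per node; equals 2 * |E| / |V| for an undirected graph."""
    return sum(len(neighbours) for neighbours in adj.values()) / len(adj)


def betweenness_centrality(adj):
    """Brandes' algorithm for an undirected, unweighted graph {node: [neighbours]}.

    Returns, for each node, the (fractional) number of shortest paths between
    other node pairs that pass through it. Values are not normalized.
    """
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        order = []                        # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}      # predecessors on shortest paths
        n_paths = {v: 0 for v in adj}     # number of shortest paths from source
        dist = {v: -1 for v in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:                      # breadth-first search from source
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:           # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):         # accumulate from the farthest nodes back
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical toy network: P2 is a hub, P3 bridges P4 to the rest.
toy = {
    "P1": ["P2"],
    "P2": ["P1", "P3", "P5"],
    "P3": ["P2", "P4"],
    "P4": ["P3"],
    "P5": ["P2"],
}

bc = betweenness_centrality(toy)
hub = max(bc, key=bc.get)
print(average_degree(toy), bc, hub)
```

In the toy graph, P2 lies on the shortest paths of five node pairs and P3 on three, so P2 would be drawn as the largest node under the sizing convention described for Fig. 3.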

Iranome database reveals 46 probable disease-causing variants for EDS, OI and CL
A total of 46 genes previously reported to be related to CL, EDS, and OI syndromes were investigated using the Iranome database. Searching a gene in Iranome returns all discovered variants, including deletion, duplication, splice region, intronic, and single-nucleotide variants. Missense variants of each gene were then selected for further evaluation. Missense variants with more than 10 heterozygotes, a CADD score of at least 20 (which indicates the variant is among the 1% most deleterious variants in the genome) [41], and a damaging prediction from at least three of the pathogenicity Web servers were considered probable damaging variants. Of all evaluated genes, 46 variants in 18 genes were found to have a probable damaging effect. They are listed in Table 5, along with the populations with the highest and the lowest frequency of the alleles in Iran.
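The link between a CADD score of 20 and the top 1% of variants follows from the PHRED-like scaling of CADD scaled scores, in which the score equals -10 times the base-10 logarithm of a variant's rank fraction. The sketch below illustrates that relationship; the function names are hypothetical helpers and not part of any CADD software.

```python
import math


def cadd_scaled_score(rank, total):
    """PHRED-like scaling: -10 * log10(rank / total), with rank 1 the most deleterious."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return -10 * math.log10(rank / total)


def top_fraction(scaled_score):
    """Fraction of variants ranked at least as deleterious as the given scaled score."""
    return 10 ** (-scaled_score / 10)


for score in (10, 20, 30):
    print(score, top_fraction(score))  # 10 -> top 10%, 20 -> top 1%, 30 -> top 0.1%
```

Under this scaling, the threshold of 20 used here keeps only variants predicted to be among the most deleterious 1% of possible substitutions.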

Discussion
Due to the clinical overlap between CL, EDS, and OI, it is difficult to provide proper follow-up care and genetic counseling. Their similar phenotypes increase the likelihood of misdiagnosis. The phenotypic overlap is likely due to the functional roles and interactions of the genes associated with these syndromes. Consequently, traditional clinical guidelines and methods are no longer sufficient to differentiate between them. Whole-exome sequencing (WES) has the potential to enhance diagnostic capabilities significantly. This widely used next-generation sequencing (NGS) method is cost-effective, requires fewer sequencing reagents, and enables faster bioinformatic analysis compared to whole-genome sequencing. Data collection revealed that genetic tests such as WES are rarely conducted in case studies, despite their potential to facilitate more precise diagnosis and more effective patient management. This study included 43 patients exhibiting the overlap phenotype of EDS/OI/CL, with a total of 32 genetic variants. Among these unrelated families, the rate of consanguineous marriage (CMR) was approximately 59%, with 12.5% reporting non-consanguineous marriages and 28.5% not reporting the type of marriage. Out of the 32 variants identified, 12 were previously unreported and considered novel. In approximately 94% of cases, a sequencing method (direct, whole exome, or whole genome) was employed, successfully identifying the genetic variant. Figure 4 provides a graphical representation of all the reported variants in this cohort. Variants of PYCR1 (a protein that supports proper mitochondrial function and proline synthesis), B3GALT6 (an enzyme essential for the production of ECM components), FKBPs (a family of chaperones that fold proline-containing proteins), FBLN5 (which has a variety of roles in the ECM and also plays a role in artery development), and collagen genes were identified in more patients than others. There were 8, 6, 5, and 4 patients with mutations in PYCR1, B3GALT6, FKBPs, and collagen genes, respectively. The result of the NetworkX interaction analysis also showed that these genes, along with ADAMTS2, COL1A1, COL1A2, CRTAP, LEPRE1, FBLN5, ATP6V0A2, and PLOD1, have the most impact on their protein network.
In addition to the direct roles these genes play in the production and structure of connective tissues, they also exert regulatory influence over each other's expression and function. This intricate network of regulatory interactions highlights the complexity of connective tissue homeostasis and the challenges in pinpointing the specific genetic defect responsible for each case. PYCR1, which encodes the mitochondrial enzyme pyrroline-5-carboxylate reductase 1, catalyzes the final step of proline biosynthesis; proline is a major constituent of collagen. Mutations in PYCR1 are linked to hypermobile Ehlers-Danlos syndrome (hEDS), characterized by loose joints and hyperextensibility, as well as cutis laxa type 2B [42]. B3GALT6, encoding beta-1,3-galactosyltransferase 6, contributes to the synthesis of glycosaminoglycans (GAGs), essential components of the extracellular matrix. Deficiencies in B3GALT6 lead to brittle bone disease with severe skin, joint, and eye involvement (BBSJI), demonstrating the intricate relationship between GAGs and connective tissue health [37]. FKBPs, a family of heat shock protein (HSP) binders, safeguard cells from stress-induced damage. Mutations in FKBP genes are associated with Ehlers-Danlos syndrome type VII, highlighting the importance of HSPs in connective tissue homeostasis [58]. ADAMTS2, encoding a collagen-cleaving enzyme, regulates collagen fiber degradation, influencing tissue flexibility and strength. Mutations in ADAMTS2 are linked to classical Ehlers-Danlos syndrome (cEDS), characterized by hyperextensibility, easy bruising, and fragile skin [3]. COL1A1 and COL1A2, encoding the alpha 1 and alpha 2 chains of type I collagen, the most abundant type of collagen in the body, are essential for connective tissue integrity.
Mutations in these genes are associated with various EDS subtypes, including cEDS, dermatosparaxis, and osteogenesis imperfecta type VI, emphasizing the critical role of type I collagen in connective tissue function [18].
Moreover, our investigation of all reported EDS, OI, and CL genes using the Iranome database reveals 46 variants that are dormant in heterozygous carriers with different frequencies in each ethnic group.Considering the high CMR in the country, it is probable for these heterozygous variants to rise in the next generation as a homozygous form, especially for populations with a high frequency of disease-causing alleles [1,15].For example, Baloch, Iranian Arab, and Kurd populations have the highest allele frequency for more than 6 variants of EDS.In this regard, carrier screening could be an effective strategy to prevent the birth of affected offspring [23].Also, less than 1 percent of reported patients have undergone genetic study.This fact, which limited our cohort number, necessitates performing genetic tests on more patients in future studies.
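The effect of consanguinity on the appearance of homozygous offspring can be illustrated with the standard population-genetics expression for the probability of homozygosity, q^2 + F*q*(1 - q), where q is the allele frequency and F is the inbreeding coefficient (1/16 for the offspring of first cousins). The sketch below is a worked example under that textbook model; the allele frequency of 0.01 is a hypothetical placeholder and not a value from Table 5.

```python
FIRST_COUSIN_F = 1 / 16  # inbreeding coefficient for offspring of first cousins


def homozygote_probability(q, inbreeding_f=0.0):
    """Probability that a child is homozygous for an allele of frequency q.

    Uses q^2 + F*q*(1 - q); with F = 0 this reduces to Hardy-Weinberg q^2.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if not 0.0 <= inbreeding_f <= 1.0:
        raise ValueError("inbreeding coefficient must lie in [0, 1]")
    return q * q + inbreeding_f * q * (1.0 - q)


q = 0.01  # hypothetical allele frequency, for illustration only
random_mating = homozygote_probability(q)
first_cousins = homozygote_probability(q, FIRST_COUSIN_F)
fold_increase = first_cousins / random_mating
print(random_mating, first_cousins, fold_increase)
```

For this placeholder frequency, a first-cousin union raises the chance of a homozygous child roughly sevenfold relative to random mating, and the relative increase grows as the allele becomes rarer.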
The literature data on EDS, OI, and CL overlap phenotypes are limited and may have some biases, as studies may have been conducted in specific populations or may have focused on particular clinical presentations.Future studies should aim to recruit more diverse patient cohorts and utilize standardized clinical diagnostic criteria to enhance the generalizability of findings.
While in silico tools and databases can serve as valuable resources for identifying potential disease-causing variants, it is crucial to acknowledge their limitations.These tools are still under development and may not always accurately predict pathogenicity, particularly for rare or novel variants.The reliability of these predictions can be enhanced by validation through experimental data, such as functional studies or animal models.It is important to note that the present study primarily encompasses the Iranian population.Therefore, the findings may not be entirely representative of other ethnic or geographical groups.Future research endeavors should aim to investigate these phenotypes in a broader range of populations to enhance the universality and applicability of the results.This will contribute to a more comprehensive understanding of EDS, OI, and CL overlap phenotypes across diverse populations.

Fig. 1 Genes involved in overlap phenotype of EDS, OI, and CL demonstrated on their respective chromosomal region. None are located on chromosomes 4, 13, 16, 18, 21, and Y, while chromosomes 11 and 17 host five genes.

Table 1 (fragment)
Pathway: ECM-receptor interaction, PI3K-Akt signaling pathway, focal adhesion, platelet activation, relaxin signaling pathway, protein digestion and absorption. Ontology: Identical protein binding and platelet-derived growth factor binding
Gene: ADAMTS2. Full name: ADAM metallopeptidase with thrombospondin type 1 motif 2. Function: Cleaves the propeptides of type I and II collagen prior to fibril assembly. Pathway: Collagen synthesis pathway, O-glycosylation of TSR domain-

Fig. 3 Protein Interaction Network in EDS/OI/CL Overlap Phenotype.Using NetworkX Python Library, an Undirected Unweighted Graph is Visualized Based on the Betweenness Centrality

Fig. 4 Schematic Illustration of Genes and Variants Involved in EDS/OI/CL Overlapping Phenotype.(Red lines show the position of variants at the protein domains)

Table 1 (continued) Gene Full name Function Pathway Ontology
Function, pathway, and ontology of the genes involved in overlap phenotype of EDS, OI, and CL.All OI-related genes have a role in collagen biosynthesis and function or bone development.Elastic fibers biosynthesis and function, different amino acid biosynthesis, and energy production are the main roles of CL-related genes.EDS-related genes have similar functions

Table 1 (
continued) cornea syndrome, spondylodysplastic EDS, musculocontractural EDS, myopathic EDS, periodontal EDS.Among these syndromes, classical-like, cardiac valvular, dermatosparaxis, and kyphoscoliotic types are inherited in an AR manner.Myopathic EDS has both AD and AR inheritance, while the remaining types are AD.

Table 2
Review of Revealed Genes and Variants in Iranian Patients with EDS, OI, or CL Syndromes, up to November 2022

Table 3
Types of Mutations in Genes Involved in Overlap Phenotype of EDS/OI/CL in Iran, According to the HGMD database

Table 4
Evaluation of Pathogenicity and Effect on Protein Stability in Variants of Overlap Phenotype of EDS/OI/CL in Cohort of the Study

Table 5
Probable Disease-Causing Variants of EDS/OI/CL Overlap Phenotype based on Iranome Database