Retinoic Acid-Induced 1 gene variants associated with Smith–Magenis syndrome circadian phenotypes enriched in autism spectrum disorder: whole-genome sequencing study



This study aimed to characterize the frequency of RAI1 genetic aberrations associated with Smith–Magenis syndrome (SMS) in a large cohort of autism spectrum disorder (ASD) whole-genome sequencing samples. Specifically, we aimed to determine the frequencies of RAI1 single-nucleotide variants (SNVs) and copy number variants (CNVs).


We report a 2.5 × enrichment of the major deletion and a > 5 × enrichment of the frameshift variants relative to the known prevalence of SMS (1/15,000). Additionally, we report a significant enrichment of RAI1 rare missense variants in ASD subjects with respect to controls (54 variants/6080 ASD subjects and 6 variants/2541 controls, p-value < 0.002, OR 3.78, CI 1.62–8.81).


The SMS phenotype, including circadian dysregulation and associated sleep disturbances, is mainly caused by RAI1 haploinsufficiency. Sleep disturbances as seen in SMS may also occur in ASD, especially in patients with consequential variants in the RAI1 gene.


Smith–Magenis syndrome (SMS; OMIM 182290) is a rare genetic disorder that results from an interstitial deletion of 17p11.2 and, in rare cases, from Retinoic Acid-Induced 1 (RAI1) gene variants [15]. The prevalence is estimated to be 1/15,000–25,000 [8, 16]. Haploinsufficiency of RAI1 is the primary cause of the neurobehavioral and metabolic phenotype in SMS [8, 16]. Patients with SMS are characterized by a distinct pattern of mild to moderate intellectual disability, delayed speech and language skills, distinctive craniofacial and skeletal abnormalities, behavioral disturbances, and significant sleep disturbances [16]. Alterations in RAI1 copy number have also been linked to a number of neurodevelopmental disorders including ASD [9]. In fact, 90% of SMS patients meet diagnostic criteria for ASD at some point in their lives [9].

ASD comprises a complex of neurodevelopmental disorders primarily characterized by deficits in verbal communication, impaired social interactions, and repetitive behaviors [5, 6]. The profound clinical heterogeneity of ASD poses challenges in diagnosis and treatment. Heritable factors account for up to 80% of ASD risk, with the remainder attributable to environmental factors acting alone or through interaction with genetics [23]. Tremendous progress has been made in understanding the genetic underpinnings of ASD, with potential variants covering the entire spectrum of mutations from single-nucleotide variants to loss or gain of copy number. In addition to inherited variants, genomes of probands are enriched in de novo genetic variants [1, 23]. Genetic studies have pointed to hundreds of presumptive causative or susceptibility variants in ASD, making it difficult to find common underlying pathogenic mechanisms and suggesting that multiple different genetic etiologies for ASD influence a continuum of traits. Sleep problems are almost twice as common in children with ASD as in ancestry-matched controls with no diagnosis of ASD. Several aspects of sleep are aberrant, including inability to initiate sleep, delayed sleep, and fragmented sleep. About 50% of children with ASD meet the diagnostic criteria for insomnia as defined by sleep latency greater than 30 min.

RAI1 is a dosage-sensitive gene expressed in many tissues. It is highly conserved among species. Multiple studies have demonstrated that RAI1 and its homologs act as transcription factors implicated in embryonic neurodevelopment and neuronal differentiation, as well as in behavioral functions and, importantly, in circadian activity. Patients with RAI1 pathogenic point mutations show some phenotypic differences when compared to those carrying the larger typical deletion; however, haploinsufficiency of RAI1 is the main cause of the neurobehavioral and metabolic phenotype in SMS [7]. Based on the current data, over 90% of cases present with a large deletion [17]. As exon 3 constitutes 95% of the coding sequence, that is where the majority of the RAI1 variants have been reported to date. The purpose of this study was to examine the frequency of RAI1 consequential SNVs as well as CNVs in a large whole-genome sequencing set of ASD patients.


We conducted a large-scale association analysis of the ASD MSSNG whole-genome sequencing data to determine the frequency of RAI1 SNVs and CNVs. We accessed the MSSNG database hosting over 11,000 genomes (6080 probands) and queried both SNVs and CNVs.

Specifically, we focused on the frequency of the classic SMS large deletions, microdeletions of exon 3, and rare (gnomAD [10] maximum allele frequency < 0.005) missense, frameshift, and splicing variants. We report a single case of the classic SMS deletion spanning the 17p11.2 critical region (chr17:16845401-20516200). We also report two frameshift variants and one known splicing variant. Given that the SMS deletion frequency is ~ 1:15,000, we observe a 2.5 × enrichment of the major deletion (1:6080) and a > 5 × enrichment of the frameshift variants (2:6080) (Table 1). The two reported frameshifts are loss-of-function variants predicted to result in a premature stop codon. The Table 1 supplement includes CADD scores as well as SIFT scores, which predict whether an amino acid substitution is likely to affect protein function based on sequence homology and the physico-chemical similarity between the alternate amino acids. Figure 1 depicts how the identified RAI1 variants are localized across domains.
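The fold-enrichment figures relative to SMS prevalence can be reproduced with simple arithmetic. The sketch below is illustrative only: the function name `fold_enrichment` is hypothetical, and it assumes that enrichment is the ratio of the observed carrier frequency in the 6080 probands to a prevalence of one in N, evaluated at both ends of the 1/15,000–25,000 range cited in the text.

```python
from fractions import Fraction

N_PROBANDS = 6080  # number of ASD probands queried in MSSNG (from the text)


def fold_enrichment(observed_count, cohort_size, prevalence_denominator):
    """Ratio of observed frequency to an expected prevalence of 1/prevalence_denominator."""
    observed = Fraction(observed_count, cohort_size)
    expected = Fraction(1, prevalence_denominator)
    return float(observed / expected)


if __name__ == "__main__":
    findings = [("classic 17p11.2 deletion", 1), ("frameshift variants", 2)]
    # Both bounds of the prevalence estimate quoted in the text.
    for label, count in findings:
        for denom in (15_000, 25_000):
            ratio = fold_enrichment(count, N_PROBANDS, denom)
            print(f"{label}: {count}/{N_PROBANDS} vs 1/{denom:,} -> {ratio:.2f}x")
```

Under these assumptions the deletion ratio is about 2.5 at a prevalence of 1/15,000, and the frameshift ratio is about 4.9 at 1/15,000 and about 8.2 at 1/25,000, so the frameshift enrichment is at or above roughly five-fold across the cited prevalence range.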

Table 1 All rare (MAF < 0.005) missense, splicing, and frameshift variants detected in the ASD WGS set
Fig. 1 All rare (MAF < 0.005) missense, splicing, and frameshift variants detected in the ASD cases, depicted as localized across domains

In the set of 6080 probands, we also observed 54 unique missense variants located within exon 3 of the RAI1 gene, together accounting for 84 alleles. We observe a significant enrichment of rare RAI1 missense variants in comparison with the control dataset of individuals confirmed to have neither an SMS nor an ASD diagnosis (54 variants/6080 ASD subjects and 6 variants/2541 controls, p-value < 0.002, OR 3.78, CI 1.62–8.81). This effect persists when a dosage effect is tested. Altogether, 11% of these variants are de novo. The variants detected in the ASD set are not present in the control dataset of 2541 whole-genome sequencing mixed-ancestry samples (see Methods section). The identified variants of interest are depicted in Table 1. We therefore observe enrichment of RAI1 genetic aberrations (CNVs and SNVs) as compared to the known prevalence of SMS (1:15,000), as well as enrichment of RAI1 variants as compared to a set of non-ASD, non-SMS ancestry-matched controls.
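As a rough check on the case-control comparison, an odds ratio and a Wald (Woolf) 95% confidence interval can be computed from a 2 × 2 table. This is a minimal sketch, not the author's analysis: it assumes the reported counts (54 of 6080 ASD subjects, 6 of 2541 controls) can be treated as the exposed cells of the table, and the function name `odds_ratio_with_ci` is hypothetical. The text does not state which test produced the p-value, so none is computed here.

```python
import math


def odds_ratio_with_ci(case_hits, case_total, control_hits, control_total, z=1.96):
    """Odds ratio with a Wald (Woolf) confidence interval from a 2x2 table."""
    a = case_hits                       # cases with a rare variant
    b = case_total - case_hits          # cases without
    c = control_hits                    # controls with a rare variant
    d = control_total - control_hits    # controls without
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    or_, lower, upper = odds_ratio_with_ci(54, 6080, 6, 2541)
    print(f"OR = {or_:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

Under these assumptions the result lands close to the odds ratio and interval reported above, which suggests the published figures are consistent with a standard 2 × 2 calculation on those counts.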


Both ASD patients and SMS patients suffer from sleep disturbances. Currently, the prevailing theory is that an underlying circadian pathophysiology associated with RAI1 haploinsufficiency causes the sleep disturbances in SMS, as these patients exhibit low overall melatonin concentrations and abnormal timing of peak plasma melatonin concentrations. This abnormal inverted circadian rhythm is estimated to occur in 95% of individuals with SMS [3, 18]. Sleep disturbances as seen in SMS may also occur in ASD, especially in patients with consequential variants in the RAI1 gene. This could be tested in future follow-up studies. ChIP-Chip and reporter studies showed that RAI1 binds, directly or in a complex, to the first intron of the CLOCK gene, enhancing transcriptional activity [22]. Reduced expression of RAI1 results in reduced CLOCK expression both in animal models and in SMS patient-derived cell lines. This is consistent with the observation that treatment with a circadian regulator such as a melatonin agonist can, in part, correct the deficiencies caused by RAI1 abnormalities, which provides further evidence of RAI1 interaction with the molecular clock and of its impact on circadian rhythm.

Interestingly, two recent studies showed an association between the methylation status of RAI1 and sleepiness scores [2, 11]. In another recent GWAS of self-reported daytime sleepiness, the authors reported a significant association of variant rs11078398 (MAF 0.04) with daytime sleepiness in the general population [20]. These reports further support the role of RAI1 gene variants in the regulation of sleep and circadian rhythms in the general population.

It is important to note that there are other potential genetic causes of sleep disorders, such as sleep-related breathing disorders, hypersomnolence, parasomnias, sleep-related movement disorders, and circadian rhythm sleep–wake disorders, which are attributed to distinct variants and pathways [12]. Moreover, other genes have also been implicated in sleep disturbances, such as problems with sleep induction, both in the healthy population and in individuals with neurodevelopmental disorders and comorbid sleep disturbances [14, 19]. In a study that examined 80 variants across 5 circadian clock genes in healthy Japanese individuals, the strongest associations with sleep induction disorder were those of rs11113179 in CRY1 and of rs1026071 and rs1562438 in BMAL1 [14]. Variants in melatonin receptors, specifically in MTNR1B, were observed in individuals with ASD and sleep disturbances [19]. This suggests that RAI1 variants could be one of many underlying causes of comorbid sleep disorders, although this is yet to be confirmed.

The results of the current study warrant further confirmation in ASD patients manifesting with sleep disturbances. Despite limitations and the need to confirm these findings in an independent cohort, our pilot findings offer promising insights. This study lays the foundation for important follow-up studies, such as one comparing ASD cases with and without sleep disturbance. Further studies will help discern the role of RAI1 variants in ASD, particularly in patients with sleep disturbances.


Datasets and ethical considerations

MSSNG [4] is a large whole-genome sequencing dataset of samples obtained from over 10,000 individuals from families in the Autism Genetic Research Exchange (AGRE) repository and from other well-phenotyped cohorts. The Vanda control dataset is a well-phenotyped cohort of whole-genome sequencing samples obtained from consented individuals participating in Vanda sleep studies. The controls were selected to match the cases in terms of age, sex, and ancestry. All individuals consented to participate in genetic research. Potential biases could have arisen from the geographically limited ascertainment and recruitment of the cases and from sample selection criteria such as willingness to participate in research studies.

Genetic analysis of the Vanda dataset

Incoming nucleic acid samples were quantified using a fluorescence-based assay (PicoGreen) to determine whether sufficient material was available for library preparation and sequencing. DNA sample size distributions were profiled with a Fragment Analyzer (Advanced Analytics) or BioAnalyzer (Agilent Technologies) to assess sample quality and integrity. All sequenced human DNA samples were also genotyped on the HumanCoreExome 24v1.3 array. Whole-genome sequencing (WGS) libraries were prepared using the TruSeq DNA PCR-free Library Preparation Kit. Whole-genome data were processed on the NYGC automated pipeline. Paired-end 150-bp reads were aligned to the GRCh37 human reference (BWA-MEM v0.7.8) and processed with the GATK best-practices workflow (GATK v3.4.0). The mean coverage, averaged across samples, was 35.8. All high-quality variants obtained from GATK were annotated for functional effects (intronic, intergenic, splicing, non-synonymous, stop-gain, and frameshift) based on RefSeq transcripts using Annovar [21]. Additionally, Annovar was used to match general population frequencies from public databases (ExAC, gnomAD, ESP6500, 1000 Genomes) and to prioritize rare, loss-of-function variants. Linear models adjusted for principal components, age, and sex were fitted in PLINK [13].

Enrichment analysis

The analysis in both datasets focused on rare (as defined by MAF < 0.005 in gnomAD) missense, frameshift, and splicing variants. Enrichment was defined and tested in two ways: with regard to the established SMS prevalence, and with regard to an ancestry-matched set of non-SMS, non-ASD control samples.
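The variant selection rule described here can be expressed as a small filter. The following is a sketch under stated assumptions rather than the pipeline actually used: the `Variant` record, its field names, and the toy entries are hypothetical, and treating a variant that is absent from gnomAD as having frequency zero is an assumption not stated in the text.

```python
from dataclasses import dataclass
from typing import Optional

RARE_MAF_THRESHOLD = 0.005  # rarity cut-off stated in the text
CONSEQUENCES_OF_INTEREST = {"missense", "frameshift", "splicing"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical annotated variant record (field names are illustrative)."""
    variant_id: str
    consequence: str
    gnomad_max_af: Optional[float]  # None means not observed in gnomAD


def is_rare_consequential(variant):
    """True if the variant is rare and belongs to a functional class of interest."""
    # Assumption: a variant absent from gnomAD is treated as frequency 0.
    af = 0.0 if variant.gnomad_max_af is None else variant.gnomad_max_af
    return variant.consequence in CONSEQUENCES_OF_INTEREST and af < RARE_MAF_THRESHOLD


if __name__ == "__main__":
    # Toy records for illustration only; these are not variants from the study.
    toy_variants = [
        Variant("toy_1", "missense", 0.0004),
        Variant("toy_2", "synonymous", 0.0001),
        Variant("toy_3", "frameshift", None),
        Variant("toy_4", "missense", 0.02),
    ]
    kept = [v.variant_id for v in toy_variants if is_rare_consequential(v)]
    print(kept)
```

The same predicate would be applied to cases and controls alike, so that any difference in counts reflects the cohorts rather than the filter.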

Availability of data and materials

Data are available upon request and pending application approval.



ASD: Autism spectrum disorder

CNV: Copy number variant

MAF: Minor allele frequency

RAI1: Retinoic Acid-Induced 1 gene

SMS: Smith–Magenis syndrome

SNV: Single-nucleotide variant

WGS: Whole-genome sequencing


  1. Anney R, Klei L, Pinto D, Regan R, Conroy J, Magalhaes TR, Hallmayer J (2010) A genome-wide scan for common alleles affecting risk for autism. Hum Mol Genet 19(20):4072–4082.

  2. Barfield R, Wang H, Liu Y, Brody JA, Swenson B, Li R, Sofer T (2019) Epigenome-wide association analysis of daytime sleepiness in the Multi-Ethnic Study of Atherosclerosis reveals African-American-specific associations. Sleep.

  3. Boone PM, Reiter RJ, Glaze DG, Tan D-X, Lupski JR, Potocki L (2011) Abnormal circadian rhythm of melatonin in Smith–Magenis syndrome patients with RAI1 point mutations. Am J Med Genet.

  4. Yuen RKC, Merico D, Bookman M, Howe JL, Thiruvahindrapuram B, Patel RV, Scherer SW (2017) Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder. Nat Neurosci 20(4):602–611.

  5. de la Torre-Ubieta L, Won H, Stein JL, Geschwind DH (2016) Advancing the understanding of autism disease mechanisms through genetics. Nat Med 22(4):345–361.

  6. Devlin B, Scherer SW (2012) Genetic architecture in autism spectrum disorder. Curr Opin Genet Dev 22(3):229–237.

  7. Girirajan S, Vlangos CN, Szomju BB, Edelman E, Trevors CD, Dupuis L, Elsea SH (2006) Genotype-phenotype correlation in Smith–Magenis syndrome: evidence that multiple genes in 17p11.2 contribute to the clinical spectrum. Genet Med 8(7):417–427.

  8. Greenberg F, Guzzetta V, Montes de Oca-Luna R, Magenis RE, Smith AC, Richter SF, Lupski JR (1991) Molecular analysis of the Smith–Magenis syndrome: a possible contiguous-gene syndrome associated with del(17)(p11.2). Am J Hum Genet 49(6):1207–1218

  9. Huang W-H, Wang DC, Allen WE, Klope M, Hu H, Shamloo M, Luo L (2018) Early adolescent Rai1 reactivation reverses transcriptional and social interaction deficits in a mouse model of Smith–Magenis syndrome. Proc Natl Acad Sci 115(42):10744–10749.

  10. Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, MacArthur DG (2019) Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. BioRxiv, 531210.

  11. Lahtinen A, Puttonen S, Vanttola P, Viitasalo K, Sulkava S, Pervjakova N, Paunio T (2019) A distinctive DNA methylation pattern in insufficient sleep. Sci Rep 9(1):1193.

  12. Mainieri G, Montini A, Nicotera A, Di Rosa G, Provini F, Loddo G (2021) The genetics of sleep disorders in children: a narrative review. Brain Sci.

  13. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Sham PC (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81(3):559–575.

  14. Sakurada K, Konta T, Takahashi S, Murakami N, Sato H, Murakami R, Kayama T (2021) Circadian clock gene polymorphisms and sleep-onset problems in a population-based cohort study: the Yamagata study. Tohoku J Exp Med 255(4):325–331.

  15. Slager RE, Newton TL, Vlangos CN, Finucane B, Elsea SH (2003) Mutations in RAI1 associated with Smith-Magenis syndrome. Nat Genet 33(4):466–468.

  16. Smith ACM, Magenis RE, Elsea SH (2005) Overview of Smith–Magenis syndrome. J Assoc Genet Technol 31(4):163–167

  17. Smith AC, McGavran L, Robinson J, Waldstein G, Macfarlane J, Zonona J, Magenis E (1986) Interstitial deletion of (17)(p11.2p11.2) in nine patients. Am J Med Genet 24(3):393–414.

  18. Spruyt K, Braam W, Smits M, Curfs LMG (2016) Sleep complaints and the 24-h melatonin level in individuals with Smith–Magenis syndrome: assessment for effective intervention. CNS Neurosci Ther 22(11):928–935.

  19. Veatch OJ, Keenan BT, Gehrman PR, Malow BA, Pack AI (2017) Pleiotropic genetic effects influencing sleep and neurological disorders. Lancet Neurol 16(2):158–170.

  20. Wang H, Lane JM, Jones SE, Dashti HS, Ollila HM, Wood AR, Saxena R (2019) Genome-wide association analysis of self-reported daytime sleepiness identifies 42 loci that suggest biological subtypes. Nat Commun 10(1):3503.

  21. Wang K, Li M, Hakonarson H (2010) ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38(16):e164–e164.

  22. Williams SR, Zies D, Mullegama SV, Grotewiel MS, Elsea SH (2012) Smith–Magenis syndrome results in disruption of CLOCK gene transcription and reveals an integral role for RAI1 in the maintenance of circadian rhythmicity. Am J Hum Genet 90(6):941–949.

  23. Yuen RKC, Thiruvahindrapuram B, Merico D, Walker S, Tammimies K, Hoang N, Scherer SW (2015) Whole-genome sequencing of quartet families with autism spectrum disorder. Nat Med 21(2):185–191.


The author wishes to acknowledge the resources of MSSNG and/or AGRE, Autism Speaks and The Centre for Applied Genomics at The Hospital for Sick Children, Toronto, Canada. I also thank the participating families for their time and contributions to these resources, as well as the generosity of the donors who supported these programs.


No relevant funding reported.

Author information

Authors and Affiliations



SPS performed the analysis and wrote the manuscript.

Corresponding author

Correspondence to Sandra Paulina Smieszek.

Ethics declarations

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Competing interest

The author is an employee of Vanda Pharmaceuticals Inc.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Smieszek, S.P. Retinoic Acid-Induced 1 gene variants associated with Smith–Magenis syndrome circadian phenotypes enriched in autism spectrum disorder: whole-genome sequencing study. Egypt J Med Hum Genet 25, 55 (2024).
