Selection of hub microRNAs as biomarkers in breast cancer stem cells in the extracellular matrix using bioinformatics analyses

Background: Breast cancer is one of the most common cancers in women, with many new cases diagnosed every year. Cancer stem cells may play a crucial role in the exacerbation and relapse of breast cancer. Therefore, finding biomarkers in human secretions could be a suitable approach for early detection and neoadjuvant therapy. This study aimed to investigate the molecular events related to cancer stem cells in breast cancer and then to nominate suitable microRNAs that participate in breast cancer pathogenesis. Methods: In this study, we investigated the relationship between molecular pathways using a bioinformatics approach. First, we selected the appropriate RNA-Seq datasets from the GEO database. We used Enrichr

• SPARC, INHBA, FN1, and GBA genes play a significant role in breast cancer stem cells.
• hsa miR-9.5p, hsa miR-203.3p, and hsa miR-429, which are associated with breast cancer stem cells, can be detected in human secretions.

Background
Breast cancer is one of the most common cancers in women. Because its incidence is rising rapidly worldwide [1], a variety of treatments are in use. Despite these treatments, recurrence of the disease, either as a secondary tumor or in a more severe form, remains a major concern [2,3]. Identifying the factors that drive recurrence should therefore point to better ways of treating breast cancer. One of the most critical factors in the recurrence of breast cancer is cancer stem cells. These cells account for about 1% of the total tumor cell population [4]. Even after multiple treatments and surgery, small numbers of cancer stem cells may remain in the patient [5]. These residual cells can repopulate the tumor, and the tumor cell population that forms after therapy is more resistant [6]. Various studies in recent years have tried to destroy breast cancer stem cells in different ways. Domenici et al. showed that the Sox2 and Sox9 genes could be potent markers for breast cancer stem cells [7]. Another study showed that doxycycline weakened mitochondria in breast cancer stem cells [8]. The study by Palomeras et al. also showed that the CD44 and CD24 markers can be used to identify breast cancer stem cells more accurately [9]. These markers, like many others identified in the literature, are more useful for diagnosis and may not be suitable as targets for eliminating breast cancer stem cells. Bioinformatics analysis therefore allows us to select miRNAs by examining the gene expression profile of breast cancer stem cells and choosing appropriate candidate genes, which may offer more options for both diagnosis and treatment.

Selection of the appropriate bioinformatics data
In this study, we selected an RNA-Seq dataset (GSE109798) from the Biojupies database. These data, obtained from triple-negative breast cancer patients, comprised six samples: three in the control group and three related to breast cancer. To evaluate the gene expression profile of cancer stem cells more accurately, genes with logFC > 1 or logFC < −1 and a p value < 0.05 were selected. The differentially expressed genes were then collected in an Excel file for use in the subsequent analyses. Figures 1 and 2 show the MA plot and heatmap of the gene expression profiles (Additional file 1).
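
The filtering step described above can be illustrated with a minimal Python sketch. It assumes a differential-expression table with one row per gene; the gene names, values, and the helper name split_deg are hypothetical and are not taken from GSE109798. Only the cutoffs (|logFC| > 1 and p value < 0.05) come from the text.

```python
# Toy differential-expression table. Gene names and values are invented
# for illustration only; they are NOT results from GSE109798.
rows = [
    {"gene": "GENE_A", "logFC": 2.4, "p_value": 0.001},
    {"gene": "GENE_B", "logFC": -1.7, "p_value": 0.020},
    {"gene": "GENE_C", "logFC": 0.6, "p_value": 0.003},   # fold change too small
    {"gene": "GENE_D", "logFC": 3.1, "p_value": 0.200},   # not significant
    {"gene": "GENE_E", "logFC": -2.2, "p_value": 0.049},
]

LOGFC_CUTOFF = 1.0   # from the text: logFC > 1 or logFC < -1
P_CUTOFF = 0.05      # from the text: p value < 0.05


def split_deg(rows, logfc_cutoff=LOGFC_CUTOFF, p_cutoff=P_CUTOFF):
    """Return (high_expression, low_expression) lists of gene names."""
    significant = [r for r in rows if r["p_value"] < p_cutoff]
    high = [r["gene"] for r in significant if r["logFC"] > logfc_cutoff]
    low = [r["gene"] for r in significant if r["logFC"] < -logfc_cutoff]
    return high, low


high_genes, low_genes = split_deg(rows)
print("high-expression genes:", high_genes)
print("low-expression genes:", low_genes)
```

Keeping the two lists separate mirrors the later steps, in which high-expression and low-expression genes are analyzed separately.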

Investigation of gene ontology and signaling pathways
In this step, we loaded the high-expression and low-expression genes separately into the Enrichr database and used the signaling pathway libraries and the ontology section to evaluate signaling pathways, molecular functions, biological processes, and cellular components. The KEGG library was used for molecular pathways. A p value < 0.05 was considered significant.
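
Enrichment tools of this kind typically rest on an over-representation test, such as the one-sided Fisher exact test, which is equivalent to a hypergeometric tail probability. The sketch below shows that calculation only; it does not reproduce Enrichr's own scoring, and the function name and all numbers are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_query, n_overlap):
    """P(X >= n_overlap) when n_query genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, a 100-gene pathway,
# a 500-gene query list, and 10 query genes found in the pathway.
p_example = hypergeom_enrichment_p(20000, 100, 500, 10)
print(f"enrichment p value: {p_example:.3g}")
print("significant at 0.05:", p_example < 0.05)
```

In practice, p values across many pathways are usually corrected for multiple testing before a threshold is applied.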

Communication network between proteins
In this step, we uploaded the high-expression genes that were present in important signaling pathways to the STRING database and plotted the protein network of these genes. We then isolated, for further evaluation, the proteins that were most closely connected to other proteins and therefore appeared to play a more important role.
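
One common way to pick the most connected proteins in such a network is to rank nodes by degree, that is, by their number of interaction partners. The sketch below assumes that criterion, which may differ from the scoring actually used in this study; the edge list and protein names are placeholders, not STRING output.

```python
from collections import defaultdict

# Hypothetical undirected interaction list; protein names are placeholders.
edges = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P2", "P3"), ("P4", "P5"), ("P5", "P1"),
]


def degree_table(edges):
    """Map each protein to its number of distinct interaction partners."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(adj) for node, adj in neighbors.items()}


def top_hubs(edges, n=4):
    """Return the n proteins with the highest degree."""
    degrees = degree_table(edges)
    # Sort by degree (descending), then by name so ties have a stable order.
    return sorted(degrees, key=lambda node: (-degrees[node], node))[:n]


print(degree_table(edges))
print("top hubs:", top_hubs(edges, n=2))
```

Other centrality measures, such as betweenness or closeness, can rank the same network differently, so the choice of measure should be stated alongside the selected hubs.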

A closer look at the candidate proteins in patients with breast cancer
We used the GEPIA database for this part of the analysis. We queried each candidate gene separately in this database and examined its expression, stage plot, and survival in patients with breast cancer compared with the control group.
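
Survival curves of this kind are usually Kaplan-Meier estimates. The sketch below assumes that estimator and shows how the survival probability is updated at each observed death; the follow-up times and event flags are invented for illustration and are not patient data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


# Toy cohort (hypothetical): five patients, two of them censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

To compare patients with high and low expression of a gene, one curve is computed per group, and the difference between the curves is then assessed with a statistical test such as the log-rank test.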

Examination of miRNAs associated with breast cancer stem cells
We uploaded the genes previously evaluated in patients with breast cancer versus the control group to the MienTURNET database to select miRNAs, and plotted the relationships between the genes and miRNAs as a network.

Breast cancer stem cell genes were enriched in the ECM receptor, proteoglycan, and gap junction signaling pathways
Gene expression profile analysis comparing breast cancer stem cells with the control group yielded 510 high-expression genes and 460 low-expression genes. High-expression genes were enriched in the ECM receptor, proteoglycans in cancer, phagosome, focal adhesion, PI3K/AKT, adherens junction, and FoxO signaling pathways. Low-expression genes were present in the ribosome, RNA transport, spliceosome, apoptosis, gap junction, cell cycle, and biosynthesis of amino acids pathways. The genes with the highest differential expression scores are shown in Table 1.

Gene ontology in breast cancer stem cells
We examined high-expression and low-expression genes separately for molecular functions, biological processes, and cellular components. In biological processes, high-expression genes were involved in regulation of transcription factors, positive regulation of transcription, extracellular matrix disassembly, positive regulation of the cell cycle, positive regulation of proliferation, and positive regulation of exocytosis. In molecular functions, they were involved in transcription regulatory DNA binding, transcription coactivator activity, protein kinase binding, cadherin binding, collagen binding, and integrin binding. In biological processes, low-expression genes were involved in translation, SRP cotranslational protein targeting, peptide biosynthesis, protein targeting to the ER, cellular macromolecule biosynthesis, rRNA metabolism, and ribosome biogenesis. Their molecular functions included RNA binding, GTP binding, translation initiation factor activity, ubiquitin ligase activity, and purine binding. More information is shown in Fig. 3.

The communication network of proteins in the extracellular matrix
In this part of the study, proteins in the extracellular matrix were examined to find more accurate markers of cancer stem cells in the blood or other secretions of breast cancer patients. The communication network between these proteins is plotted in Fig. 4. This protein network consists of 53 nodes and 93 edges. Based on the average correlation between proteins, four proteins, SPARC, INHBA, FN1, and GBA, stood out as significant compared with the other proteins in this network.

Evaluation of proteins in human data in databases
In this part of the study, SPARC, INHBA, FN1, and GBA were evaluated in the GEPIA database in breast cancer samples compared with controls. Consistent with the bioinformatics data, all four showed a significant increase in expression in the breast cancer samples compared with the control samples. In the plot, higher data density corresponds directly to increased gene expression. In the survival chart, on average, high expression of SPARC, INHBA, FN1, and GBA was associated with a reduction in patient survival of about 60% over time, which is a significant rate (Fig. 5).

The candidacy of miRNAs associated with proteins in the extracellular matrix
Following the bioinformatics analyses performed in the previous steps, we uploaded SPARC, INHBA, FN1, and GBA to the MienTURNET database and selected the miRNAs related to these genes. The miRNAs hsa miR-9.5p, hsa miR-203.3p, hsa miR-429, hsa miR-200c, hsa miR-1, hsa miR-206, and hsa miR-613 were significantly identified; their communication network is shown in Fig. 6.

Discussion
There are many challenges today in dealing with breast cancer recurrence and the increasing severity of the disease [10,11]. Cancer stem cells play a vital role in this phenomenon because they can become resistant to treatments such as chemotherapy and radiotherapy. When the disease recurs, managing breast cancer treatment becomes complicated [12]. For this reason, finding biomarkers, especially miRNAs, can help in developing new drugs and finding better ways to kill cancer stem cells.
In this study, which was performed through a series of bioinformatics analyses, we evaluated the molecular pathways associated with breast cancer stem cells and then chose genes, proteins, and miRNAs that could support better identification and targeting of these cells. For this purpose, the SPARC, INHBA, FN1, and GBA proteins and hsa miR-9.5p, hsa miR-203.3p, hsa miR-429, hsa miR-200c, hsa miR-1, hsa miR-206, and hsa miR-613 were selected. Below, we examine these biomarkers, beginning with the SPARC gene. This gene encodes a cysteine-rich acidic matrix-associated protein. The encoded protein is required for the collagen in bone to become calcified but is also involved in extracellular matrix synthesis and the promotion of changes to cell shape. The gene product has been associated with tumor suppression but has also been correlated with metastasis based on changes to cell shape, which can promote tumor cell invasion [13,14]. Various studies have shown that SPARC plays a crucial role in tumorigenesis and in the progression of breast cancer and other cancers. For example, one study showed that SPARC expression was significantly increased in breast cancer patients compared with healthy individuals, and that the invasion of breast cancer cells decreased when the gene was inhibited [15]. Sanita et al. used nanoparticles to target SPARC albumin. Inhibition of the SPARC gene in a breast cancer cell line has been shown to reduce the survival of these cells and induce apoptosis and cell invasion [16]. A study by Bawazeer et al. also showed that polymorphisms in the SPARC gene could affect VEGF and exacerbate breast cancer [17]. Although various studies have shown SPARC activity in breast cancer, examining the role of this gene specifically in breast cancer stem cells could help to target these cells better. SPARC also plays a significant role in other cancers, including prostate [18], colon [19], and lung [20] cancer.
The INHBA gene encodes a member of the TGF-beta (transforming growth factor-beta) superfamily of proteins. The encoded preproprotein is proteolytically processed to generate a subunit of the dimeric activin and inhibin protein complexes. These complexes activate and inhibit, respectively, follicle stimulating hormone secretion from the pituitary gland. The encoded protein also plays a role in eye, tooth, and testis development [21,22]. The study by Hamalian et al. showed that INHBA in the SNAI2/PEAK1/INHBA signaling pathway plays a vital role in the invasion of HER2+ breast cancer cells. This signaling pathway is associated with the actin cytoskeleton and is in contact with the microenvironment and extracellular matrix. It is also involved in the integrin growth factor, and any disruption of it can disturb cell connections and initiate cell invasion [23]. The study by Wang et al. showed a clear association between circulating tumor cells and INHBA, and that both were more active in breast cancer patients than in healthy individuals; after chemotherapy, however, the activity of both decreased significantly [24]. The study by Yu et al. showed that INHBA expression was significantly increased in breast cancer patients compared with healthy individuals. It was also found that INHBA activated the TGFB signaling pathway, which intensified the invasion of breast cancer cells by activating the EMT pathway [22]. The study by Xueqin et al. showed that INHBA could play a major role in the division of breast cancer cells by affecting the Wnt/β-catenin signaling pathway [25]. The function of the INHBA gene in breast cancer stem cells has not yet been clearly defined; it could be tested further, particularly as a study has shown that INHBA is involved in the invasion and division of gastric cancer stem cells.
The FN1 gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in the extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes, including embryogenesis, wound healing, blood coagulation, host defense, and metastasis [26,27]. A few studies have closely examined the association of FN1 with breast cancer. The study by Yang et al. showed that FN1, under regulation by miR-200b, was involved in the induction of the EMT pathway and the invasiveness of breast cancer cells [28]. The study by Wang et al., using bioinformatics analysis, showed that FN1 is involved in the progression and invasion of breast cancer [29]. The study by Hellinger et al. also showed that FN1 plays a role in the acute stage of breast cancer and is essential in promoting the disease [30].
The GBA gene encodes a lysosomal membrane protein that cleaves the beta-glucosidic linkage of glycosylceramide, an intermediate in glycolipid metabolism. Mutations in this gene cause Gaucher disease, a lysosomal storage disorder characterized by glucocerebroside accumulation [31,32]. The study by Zhou et al. showed that GBA plays a key role in reducing the sensitivity of breast cancer tumor cells to chemotherapy; with increased GBA expression, the PI3K/AKT/mTOR signaling pathway contributes to this reduced drug sensitivity, which poses many challenges for the management of breast cancer treatment [33]. The study by Moro et al. also showed that GBA plays an important role in the invasion of breast cancer cells [34]. This gene has not been studied in detail in breast cancer stem cells and could be a good option for targeting them.
Following the bioinformatics analyses, SPARC, INHBA, FN1, and GBA were specifically selected in breast cancer stem cells in this study. As shown in Fig. 5, these genes have a critical effect on survival in breast cancer, and as the disease progresses to an invasive phase, the number of patients with high expression of these genes increases. In this regard, the study of these four genes or their protein products, and especially of their miRNAs, in breast cancer can be a strong basis for managing breast cancer treatment.

Conclusion
In this study, we identified SPARC, INHBA, FN1, and GBA and their associated miRNAs in breast cancer stem cells. Since these genes have been studied mainly in breast cancer and less in cancer stem cells, their importance in these cells was investigated. Accordingly, targeting the SPARC, INHBA, FN1, and GBA genes or their protein products could be used as a neoadjuvant treatment for breast cancer. The candidate miRNAs can also be evaluated further for detecting breast cancer stem cells in various human secretions.

Fig. 5 This section identified the proteins most associated with other proteins. In samples from patients and healthy individuals, we examined three aspects: the difference in gene expression, gene expression at different stages, and survival. As can be seen in the box plots, the expression of the genes is significantly higher in breast cancer patients than in healthy individuals. The same is confirmed in the stage plot, which shows that the expression of these four genes is high in all the main stages of breast cancer. The survival chart also shows mortality of up to 40%, which underlines the importance of the pathogenicity of these genes in breast cancer, especially in breast cancer stem cells. BRCA, breast invasive carcinoma.