Introduction
Embryonic development in mammals starts with the fertilization of an oocyte by a sperm cell, followed by the formation of the totipotent zygote and differentiation into a new individual (Shpargel et al., 2014). During this process, the embryo undergoes dramatic morphological changes, coupled with widespread epigenetic reprogramming, of which the maternal-to-zygotic transition (MZT) represents the most critical biological event. MZT involves two main stages, namely the degradation of maternal products and the activation of the zygotic genome, through which the embryo completes the conversion from maternal to zygotic control (Tadros and Lipshitz, 2009; Sha et al., 2019). The mature oocyte is initially transcriptionally silent (Moore et al., 1974); once it is fertilized, the genome is rapidly activated, and zygotic RNA accumulates as maternal products are gradually degraded.
In the last few decades, numerous studies have revealed the importance of maternal products in embryonic development (Nüsslein-Volhard and Wieschaus, 1980; Driever and Nüsslein-Volhard, 1988). The initiating events of early embryonic development in mammals are mainly controlled by maternal effectors, which are encoded by maternal-effect genes. These factors accumulate during oogenesis and make possible zygotic genome activation (ZGA), embryo cleavage and the development of a blastocyst with an inner cell mass and trophectoderm cells. To date, more than 30 mouse genes with maternal-effect mutations (effects caused by mutation of the maternal gene) have been documented, including Nelfa (Hu et al., 2020), X-linked Huwe1 (Eisa et al., 2020) and Argonaute 2 (Zhang et al., 2020). Functionally, these genes regulate the fusion of the female and male nuclei, the elimination of maternal material, the activation of the embryonic genome, the cleavage of the zygote and the compaction of the embryo (Li et al., 2010; Zheng and Liu, 2012). Moreover, they have been shown to play important roles in oogenesis, meiotic maturation, and preimplantation and post-implantation embryonic development (Innocenti et al., 2022).
ZGA is considered the most essential step in regulating early embryo development. Previous studies have demonstrated that ZGA defects prevent embryos from developing beyond the 2-cell stage. For example, the growth of Btg4 knock-out embryos was arrested at the 2-cell stage (Yu et al., 2016), and similar effects were observed in zebrafish embryos lacking specific factors such as Nanog, Soxb1 and Pou5f3 (Lee et al., 2013). Additional evidence has shown that Dux4, a classical ZGA gene, also plays a vital role in ZGA in 2-cell-like cells (De Iaco et al., 2017; Hendrickson et al., 2017; Whiddon et al., 2017). Although subsequent research has demonstrated that the loss of Dux4 can be largely compensated for by alternative factors (Chen and Zhang, 2019), it undoubtedly remains an important regulatory factor in the process of ZGA in the mouse embryo.
Successful embryo development depends critically on ZGA (Vastenhouw et al., 2019). To date, many studies have described the roles in ZGA of DNA methylation (Messerschmidt et al., 2014; Iurlaro et al., 2017), post-translational modifications of histones (Dahl et al., 2016; Xia et al., 2019), chromatin accessibility (Wu et al., 2016) and pioneer transcription factors (Duan et al., 2021; Riesle et al., 2023), among others. Although a series of epigenetic modifications, such as DNA and histone modifications, are reportedly associated with ZGA, the core issue of embryo development remains gene activation, transcription and translation. Moreover, it has been reported that only a proportion (12–15%) of the genome sequence, rather than the whole genome, is activated by transcription factors in the subsequent process (Rizvi et al., 2017; Eckersley-Maslin et al., 2018). That is, numerous genes are silenced or repressed, and only a few genes are activated in this specific process. It is therefore meaningful, but difficult, to determine which genes participate in this process and how they interact with each other within the regulatory network.
Therefore, the present study sought to identify candidate genes in ZGA and provide a detailed list of those genes, with the aim of setting up a platform for future exploration of the underlying elusive mechanism. Additionally, we also devoted our efforts to identifying the conserved genes between human and mouse with the aim of generating relevant insights to guide future studies on the ZGA process in human embryos. Taken together, these findings are expected to broaden the knowledge of the underlying mechanisms of ZGA and deepen the understanding of the process of early embryo development.
Materials and methods
Collection of transcriptome data
We comprehensively searched the PubMed, Embase and Web of Science databases for RNA-seq data on embryo development published up to 21 January 2023, using the strategy ‘RNA-seq and embryo and Mus musculus or Homo sapiens’. Because this study aimed to screen candidate genes in ZGA, including minor ZGA and major ZGA, and to distinguish whether they were maternal factors, each mouse dataset had to include at least four stages: the oocyte stage, zygote stage, early 2-cell stage and late 2-cell stage. Similarly, each human dataset had to include the oocyte stage, the zygote or 2-cell stage, the 4-cell stage and the 8-cell stage. We used a two-step filter for dataset retrieval. First, we searched for all embryo transcriptome datasets from human and mouse that included all four stages mentioned above, to ensure reliability. Second, we selected the datasets reported in high-quality articles, also taking into consideration the time of publication and the sequencing platform. Finally, only the datasets GSE101571 (Wu et al., 2018; human) and GSE71434 (Zhang et al., 2016; mouse) were chosen for further analysis. The search and selection processes are shown in Figure 1, and the stages involved in ZGA are presented in Figure 2.
According to our search strategy, 1524 records were retrieved for mouse and 469 for human; after duplicates were removed, 43 mouse datasets and 14 human datasets were included. Preimplantation embryo development comprises several stages, including oocyte (GV), zygote (PN5), 2-cell, 4-cell, 8-cell, morula and blastocyst, and this study focused on the ZGA-related processes during mouse and human preimplantation development. Therefore, a dataset had to contain transcriptome data for every stage related to ZGA, and it was preferable to compare gene expression at the same level; ultimately, only one dataset each for mouse and human was eligible.
Identification of candidate genes during ZGA
Candidate genes were identified by comparing the calculated gene expression levels at different stages. Numerous transcription events occur during ZGA. We hypothesized that a gene that is significantly upregulated during ZGA is highly likely to play a role in this process, and that the extent of its upregulation might be positively correlated with its importance. Only genes with reliable sequence annotation were retained for further analysis. Gene expression levels were calculated from the RPKM or FPKM values (Log2 RPKM or Log2 FPKM) as described previously (Sha et al., 2020). Because the Log2 of an RPKM or FPKM value below 1 is negative, we added 1 to the value of each gene before the transformation so that all expression levels were non-negative.
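The transformation described above can be illustrated with a minimal sketch. It assumes that the pseudocount of 1 is applied to every value, as the notation Log2 (RPKM/FPKM+1) used in the Results suggests; the function name, the stage labels and the RPKM values are hypothetical and are not taken from GSE71434 or GSE101571.

```python
import math


def log2_expression(value, pseudocount=1.0):
    """Return log2(value + pseudocount) for a non-negative RPKM/FPKM value."""
    if value < 0:
        raise ValueError("RPKM/FPKM values cannot be negative")
    return math.log2(value + pseudocount)


# Hypothetical RPKM values for one gene across the four mouse stages.
rpkm = {"GV": 0.2, "PN5": 0.0, "Early2C": 3.0, "Late2C": 63.0}

# Expression level per stage on the log2(RPKM + 1) scale.
expression = {stage: log2_expression(v) for stage, v in rpkm.items()}
```

With the pseudocount, a gene that is not detected (RPKM = 0) receives an expression level of exactly 0 instead of an undefined logarithm, and no expression level is negative.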
Screening criteria of candidate genes during ZGA
In mouse, minor ZGA is completed by the early 2-cell stage (Early 2C), whereas major ZGA occurs at the late 2-cell stage (Late 2C). Therefore, for minor ZGA we selected genes for which Expression (Early 2C) > Expression (PN5) + 1; for major ZGA, we selected genes for which Expression (Late 2C) > Expression (PN5) + 1. Because expression is measured on a Log2 scale, a difference of 1 corresponds to a more than two-fold increase in the value of RPKM/FPKM + 1. In human, ZGA occurs at the 4–8-cell stage. Therefore, we defined the 4- and 8-cell stages as minor ZGA and major ZGA, respectively. Genes for ZGA in human were screened in a similar fashion to that in mouse.
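The mouse screening rule can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: expression is taken to be log2(RPKM + 1), the gene names and values are invented, and the helper `screen_zga` is hypothetical. The baseline stage for human is not specified in the text, so only the mouse rule is shown.

```python
import math


def log2_expr(value):
    """Expression level on the log2(RPKM + 1) scale (assumed definition)."""
    return math.log2(value + 1.0)


def screen_zga(table, zga_stage, baseline_stage="PN5", margin=1.0):
    """Return the genes with Expression(zga_stage) > Expression(baseline) + margin."""
    return sorted(
        gene
        for gene, stages in table.items()
        if log2_expr(stages[zga_stage]) > log2_expr(stages[baseline_stage]) + margin
    )


# Hypothetical RPKM values per gene and stage.
toy = {
    "geneA": {"PN5": 0.0, "Early2C": 3.0, "Late2C": 15.0},  # up at both stages
    "geneB": {"PN5": 1.0, "Early2C": 1.5, "Late2C": 7.0},   # up at Late 2C only
    "geneC": {"PN5": 7.0, "Early2C": 7.0, "Late2C": 3.0},   # not upregulated
}

minor_zga = screen_zga(toy, "Early2C")
major_zga = screen_zga(toy, "Late2C")

# Genes passing both screens correspond to the co-expressed factors.
co_expressed = sorted(set(minor_zga) & set(major_zga))
```

Keeping the margin as a parameter makes it easy to see how sensitive the size of each gene list is to the chosen cut-off.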
Distinction between maternal and non-maternal genes
Next, we divided the identified key genes into maternal and non-maternal categories. Following a previously described method (Sha et al., 2020), an mRNA was classified as maternal if its RPKM or FPKM value at the GV stage was > 2, and as non-maternal if its RPKM or FPKM value at the GV stage was < 0.5 (Li et al., 2018; Wu et al., 2018). For convenience, we defined the genes corresponding to maternal and non-maternal mRNAs as maternal genes and non-maternal genes, respectively. Genes with RPKM or FPKM values between 0.5 and 2 were classified into the uncertain group.
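The three-way classification can be written as a short function. The sketch below is illustrative only: the function name is hypothetical, and the treatment of values exactly equal to 0.5 or 2 (placed in the uncertain group) is an assumption, because the text does not state how boundary values were handled.

```python
def classify_by_gv(gv_value, maternal_cutoff=2.0, non_maternal_cutoff=0.5):
    """Classify a gene from its raw RPKM/FPKM value at the GV oocyte stage."""
    if gv_value > maternal_cutoff:
        return "maternal"
    if gv_value < non_maternal_cutoff:
        return "non-maternal"
    # Values from 0.5 to 2 inclusive fall here (boundary handling is assumed).
    return "uncertain"


# Hypothetical GV-stage RPKM values.
gv_rpkm = {"geneA": 12.4, "geneB": 0.1, "geneC": 1.3}
categories = {gene: classify_by_gv(v) for gene, v in gv_rpkm.items()}
```

Note that the cut-offs apply to the untransformed RPKM or FPKM values, not to the log2-transformed expression levels used for screening.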
Evaluation of gene age and the identification of conserved genes
The emergence of new genes is so far assumed to be driven mostly by the duplication and divergence of existing genes. Generally, the older a gene is, the more conserved it is. Therefore, to identify conserved genes between human and mouse, we first identified the orthologous genes and then assessed their gene age. Previous studies have used phylostratigraphic approaches to classify gene ages and divided human and mouse genes into 20 groups (P1–P20; Domazet-Lošo and Tautz, 2008, 2010; Neme and Tautz, 2013). The assignment of all mouse and human loci, based on Ensembl gene ID, to their respective phylostrata is available, together with the gene age data, from published research (Neme and Tautz, 2013). The age of each gene identified in ZGA in mouse and human was obtained by converting gene names so that each gene could be matched to its gene age. Next, we used gene ages to distinguish conserved genes. Specifically, based on a previous protocol (Gao et al., 2018), genes in the P1–P10 and P11–P20 groups were considered older and relatively younger genes, respectively. Those in the P1–P10 group were regarded as conserved genes and subjected to further analysis. The previously published gene age data are presented in Table S1.
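The age-based grouping can be sketched as below. The sketch assumes that each gene has already been matched to a phylostratum number between 1 and 20; the dictionary `gene_to_stratum` is a hypothetical stand-in for the published gene age table, and the gene names are invented.

```python
def age_group(phylostratum):
    """Map a phylostratum (1-20) to 'older' (P1-P10) or 'younger' (P11-P20)."""
    if not 1 <= phylostratum <= 20:
        raise ValueError("phylostratum must lie between 1 and 20")
    return "older" if phylostratum <= 10 else "younger"


def split_by_age(genes, gene_to_stratum):
    """Return (older, younger, unmapped) gene lists, each sorted by name."""
    older, younger, unmapped = [], [], []
    for gene in sorted(genes):
        if gene not in gene_to_stratum:
            unmapped.append(gene)  # no entry in the gene age table
        elif age_group(gene_to_stratum[gene]) == "older":
            older.append(gene)
        else:
            younger.append(gene)
    return older, younger, unmapped


# Hypothetical gene age table and candidate list.
gene_to_stratum = {"geneA": 1, "geneB": 10, "geneC": 11}
older, younger, unmapped = split_by_age(
    ["geneA", "geneB", "geneC", "geneD"], gene_to_stratum
)
```

Genes in the `older` list correspond to what the text calls conserved genes; keeping a separate `unmapped` list makes visible how many candidates could not be matched to the age data.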
Pathway analysis
Bioinformatics analysis was mainly focused on signaling pathways. Gene ontology (GO) pathway analysis is a functional analysis that associates differentially expressed mRNAs with GO categories. Pathway analysis was performed using the ‘Pathview’ and ‘org.Rn.eg.’ functions (rat and human genome-wide annotation). The P-value of each enriched pathway was derived from the Metascape tool (http://metascape.org/; Zhou et al., 2019), which is a convenient, independent, free site providing comprehensive functional annotation analysis, with the default settings requiring enriched terms to include ≥ 3 candidates, a P-value ≤ 0.01 and an enrichment factor ≥ 1.5.
Enrichment analysis
Given a gene list, pathway/process enrichment analysis applies the standard accumulative hypergeometric statistical test to identify ontology terms in which the input genes are significantly over-represented. Compared with other GO-based enrichment analysis tools, Metascape provides additional ontology terms, including those from Broad’s Molecular Signatures Database (MSigDB), and automatically clusters the resultant terms to reduce redundancy.
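The statistical test named above can be illustrated with a small, self-contained sketch of a one-sided hypergeometric test. It is not Metascape's implementation: the function names and all counts are hypothetical, and the three filters simply mirror the default settings quoted earlier (at least 3 candidates, P-value ≤ 0.01, enrichment factor ≥ 1.5).

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def enrichment_factor(k, N, K, n):
    """Observed fraction of term genes in the list over the background fraction."""
    return (k / n) / (K / N)


def is_enriched(k, N, K, n, min_hits=3, max_p=0.01, min_factor=1.5):
    """Apply the three default filters described in the text."""
    return (
        k >= min_hits
        and hypergeom_upper_tail(k, N, K, n) <= max_p
        and enrichment_factor(k, N, K, n) >= min_factor
    )


# Hypothetical example: 100 background genes, 10 annotated to a term,
# 10 input genes, 5 of which carry the term.
p_value = hypergeom_upper_tail(5, 100, 10, 10)
factor = enrichment_factor(5, 100, 10, 10)
```

Exact integer binomial coefficients are used here so that the sketch needs no external library; for genome-scale lists a statistics package would normally be used instead.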
Enriched terms clustering
Because ontology terms, especially within GO, overlap heavily, output terms typically show a large degree of redundancy. Metascape adopts an idea similar to that of the Database for Annotation, Visualization and Integrated Discovery (DAVID) and automatically clusters all resultant terms into groups based on their similarities. As a result, users can review one term group at a time and can uncheck the boxes for terms that represent a biological process too broad to be useful, so that these terms are ignored in the export. Terms are hyperlinked to web pages that give their detailed definitions.
Results
Identification of candidate genes in minor ZGA and major ZGA in mouse
We identified the key genes expressed during the ZGA process by calculating the expression levels [Log2 (RPKM/FPKM+1)] of all annotated genes from the aforementioned datasets. We then analyzed which genes were involved in both minor ZGA and major ZGA, here termed co-expressed factors. Out of 20,000 genes screened in mouse, 432 and 3829 were identified as key genes in minor and major ZGA, respectively. Notably, 352 of the 432 genes in minor ZGA were co-expressed factors, indicating that the majority of these genes play a role in both minor ZGA and major ZGA (Figure 3A). In addition, several previous studies have demonstrated that early embryonic development is entirely dependent on and driven by maternal factors (Innocenti et al., 2022). Therefore, we further examined whether the factors identified in minor ZGA and major ZGA were maternal genes. All selected genes were classified into two groups based on their expression level in the GV oocyte (FPKM/RPKM > 2 or FPKM/RPKM < 0.5). Our results showed that 196 and 85 genes fell into the maternal and non-maternal categories, respectively, in minor ZGA, whereas 1988 and 1222 genes were maternal and non-maternal, respectively, in major ZGA (Figure 3B).
Next, we aimed to identify the most important candidate genes in the ZGA process by calculating the difference in expression levels between the two stages in minor ZGA (Early 2C vs. PN5) and major ZGA (Late 2C vs. PN5). We assumed that the more strongly the expression level of a gene increases, the more likely the gene is to be critical in the ZGA process. As presented in Table 1, Snora7a, Snora81, Snora74a, Rn4.5s, Amd1, Zscan4f, Zscan4a, Zscan4c, Zscan4b and Zfp352 were among the most important candidate genes in minor ZGA, and Mir8099-2, Snora78, Snora81, Mt2, Mt1, Guca1a, Obox3, Obox6, Vimp and Cdk2ap1 were among the most important in major ZGA. Interestingly, some genes, including Gcsh and Ctsl, appeared in both the minor and major ZGA gene lists (Table S2), indicating that these genes might play critical roles in both minor ZGA and major ZGA. The top 100 candidate genes in major ZGA are listed in Table S2.
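The ranking step can be sketched as follows, assuming that genes are ordered by the difference in log2(RPKM + 1) between the ZGA stage and PN5. The function name, the tie-breaking rule (alphabetical) and the toy values are hypothetical and do not reproduce Table 1.

```python
import math


def log2_expr(value):
    """Expression level on the log2(RPKM + 1) scale (assumed definition)."""
    return math.log2(value + 1.0)


def rank_by_increase(table, zga_stage, baseline="PN5", top_n=10):
    """Return the top_n (gene, increase) pairs, largest increase first."""
    increase = {
        gene: log2_expr(stages[zga_stage]) - log2_expr(stages[baseline])
        for gene, stages in table.items()
    }
    # Sort by decreasing increase; ties are broken alphabetically (assumption).
    ranked = sorted(increase, key=lambda gene: (-increase[gene], gene))
    return [(gene, increase[gene]) for gene in ranked[:top_n]]


# Hypothetical RPKM values at PN5 and Late 2C.
toy = {
    "geneA": {"PN5": 0.0, "Late2C": 31.0},  # increase of 5
    "geneB": {"PN5": 1.0, "Late2C": 15.0},  # increase of 3
    "geneC": {"PN5": 3.0, "Late2C": 7.0},   # increase of 1
}

top2 = rank_by_increase(toy, "Late2C", top_n=2)
```

Applying the same function with `zga_stage="Early2C"` would give the corresponding ranking for minor ZGA.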
* Table shows the RPKM value for each stage of the genes identified; values for the stages associated with the ZGA are bold.
Identification of candidate genes in minor ZGA and major ZGA in human
To explore the candidate genes in ZGA in human, we similarly first evaluated the expression level of each gene [Log2 (RPKM/FPKM+1)] in minor ZGA and major ZGA, and then analyzed which genes were co-expressed factors in both. As shown in Figure 4A, only 60 genes were identified as co-expressed factors, markedly fewer than in mouse (60 vs. 352). We attributed this discrepancy to the small number of genes identified in minor ZGA (130 vs. 432). By contrast, 3566 genes in total were selected in major ZGA, a number comparable with that in mouse (3566 vs. 3829; Figures 3A and 4A).
Next, we characterized whether the selected genes were maternal genes. As depicted in Figure 4B, there were 85 maternal genes and 27 non-maternal genes in minor ZGA, whereas 1971 and 1130 genes were classified as maternal and non-maternal genes, respectively, in major ZGA. Notably, more maternal than non-maternal genes were recorded in both minor ZGA and major ZGA, indicating that maternal genes play a role at the stage of zygotic genome activation, which is followed by the degradation of maternal factors. In major ZGA, the number of genes in each category in human was comparable with that in mouse (1971 vs. 1988 for maternal genes and 1130 vs. 1222 for non-maternal genes), indicating that the number of key transcripts during major ZGA is approximately consistent across species, regardless of whether they belong to the maternal or the non-maternal group.
Next, we identified the most important candidate genes in the ZGA process in human. Table 2 lists the top 10 candidate genes identified in minor ZGA and major ZGA: SNAR-C3, S100A1, TMSB10, MBD3L2, LOC101927482, FRG2C, G0S2, RRAD, MBD3L3 and LXN in minor ZGA, and KHDC1L, H2AFZ, MBD3L2, DUXA, LOC100506790, MBD3L3, BIK, TCEAL9, ZSCAN4 and NANOGNB in major ZGA. The top 100 candidate genes in major ZGA are listed in Table S3.
* Table shows the RPKM value for each stage of the genes identified; values for the stages associated with the ZGA are bold.
Identification of genes conserved between mouse and human
Generally, different genes play unique roles in biological evolution, whereas conserved genes play roughly the same role across species. We therefore screened the lists of candidate genes related to the ZGA process in both human and mouse and explored the conserved genes. Because genes were markedly upregulated in major ZGA but only slightly upregulated in minor ZGA, and because more genes were involved in major than in minor ZGA, we included only the genes identified in major ZGA in the subsequent analysis. In addition, we set a fold-change threshold of three for each gene to screen out the most significant candidate genes.
As shown in Figure 5A, we compared the candidate genes in the two lists: 135 genes were considered orthologous, whereas 887 and 760 genes were specific to human and mouse, respectively. Next, we explored the age of all 135 genes and used the resulting ages to determine which genes were conserved. Because the age data for mouse and human (Neme and Tautz, 2013; see Materials and methods for more details) are not identical, we investigated the age of the 135 genes with both gene age datasets. In total, 115 and 116 genes in mouse and human, respectively, were successfully mapped, whereas the remainder failed to match. Most of the genes (109 in mouse and 107 in human) were grouped as older genes, with only a few falling into the younger category (Figure 5B), indicating that the shared genes have a high degree of conservation. A summary of the conserved genes identified here is provided in Table S4.
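The comparison of the two candidate lists amounts to a set operation once mouse genes have been mapped to their human orthologues. The sketch below assumes a simple one-to-one mapping held in a dictionary; `mouse_to_human` and all gene names are hypothetical, and real orthology tables can contain one-to-many relationships that this sketch does not handle.

```python
def compare_candidate_lists(human_genes, mouse_genes, mouse_to_human):
    """Return (shared, human_specific, mouse_specific) candidate sets.

    shared and human_specific hold human gene names; mouse_specific holds
    mouse gene names, including genes with no orthologue in the mapping.
    """
    mouse_as_human = {
        mouse_to_human[gene] for gene in mouse_genes if gene in mouse_to_human
    }
    shared = human_genes & mouse_as_human
    human_specific = human_genes - shared
    mouse_specific = {
        gene for gene in mouse_genes if mouse_to_human.get(gene) not in shared
    }
    return shared, human_specific, mouse_specific


# Hypothetical candidate lists and orthologue mapping.
human = {"GENE1", "GENE2", "GENE3"}
mouse = {"Gene1", "Gene4", "Gene5"}
mouse_to_human = {"Gene1": "GENE1", "Gene4": "GENE9"}

shared, human_specific, mouse_specific = compare_candidate_lists(
    human, mouse, mouse_to_human
)
```

If each shared gene is counted once per species, the counts reported above would imply candidate lists of 135 + 887 = 1022 human genes and 135 + 760 = 895 mouse genes.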
Signal pathways analyses of the genes identified in the major ZGA
Biological development comprises a complex network of regulatory mechanisms, in which different genes play specific roles. To obtain a more comprehensive understanding of the roles of these genes in major ZGA, we conducted GO and pathway enrichment analyses on all conserved genes (n = 135) using Metascape. The results revealed 20 enriched signal pathways, among which ‘rRNA processing in the nucleus and cytosol’ (n = 21), ‘ribonucleoprotein complex biogenesis’ (n = 26), ‘ribonucleoprotein complex assembly’ (n = 12) and ‘ribosome large subunit biogenesis’ (n = 9) were significantly enriched (Figure 6A). Interestingly, all of these pathways were related to the ribosome, and they contained many more genes than the other pathways, indicating that genes associated with the ribosome and rRNA make a large contribution to major ZGA; this observation is consistent with previously reported results (Shen et al., 2022).
In addition, GO enrichment analysis showed that the mouse-specific genes (n = 760) were significantly enriched in ‘ribonucleoprotein complex biogenesis’, ‘ribonucleoprotein complex subunit organization’ and ‘mRNA processing’ (Figure 6B). The human-specific genes (n = 887) were predominantly enriched in ‘metabolism of RNA’, ‘ribonucleoprotein complex biogenesis’, ‘mitochondrial gene expression’ and ‘mitochondrion organization’ (Figure 6C). Overall, these results suggest that the ribosome-related signal pathway is a major signal pathway in major ZGA in both human and mouse.
Discussion
Recent evidence has confirmed that activation of the zygotic genome is not a single event, but a process through which embryonic transcripts are progressively activated. The ZGA process is divided into two stages based on the level of RNA synthesis. The first large-scale synthesis of RNA during embryonic development is called major ZGA, and it is preceded by a smaller wave named minor ZGA (Hamatani et al., 2004). The two stages differ in both the number and the content of their transcripts. Researchers have long focused on major ZGA while ignoring the role of minor ZGA. However, major ZGA cannot be successfully activated when minor ZGA is inhibited; that is, the inhibition of minor ZGA prevents major ZGA. Moreover, when the inhibition is reversed, the transcription that resumes has the characteristics of minor rather than major ZGA (Abe et al., 2018). Consequently, both the minor and the major wave of ZGA play crucial roles in embryo development.
How does the ZGA program occur, and how many genes are important for ZGA? The present study attempted to address these questions by identifying the key genes expressed in minor and major ZGA in both human and mouse. Notably, we screened 1222 non-maternal genes from more than 20,000 genes in mouse, roughly the same number as was identified in previous work (n = 1312; Li et al., 2018), which supports the reliability of our results. Compared with that study, we additionally identified both non-maternal and maternal genes in minor ZGA (top 10 on the list) and major ZGA (top 100 on the list). Considering the possible limitations of results based on a single dataset, we further compared the candidate genes identified during major ZGA with two additional datasets, GSE53386 (Fan et al., 2015) and GSE71257 (Yu et al., 2016). These datasets were not included in the main analysis because they contained data for only three mouse stages (oocyte, zygote and late 2-cell) and not for the early 2-cell stage. Using the same method described in this paper, we calculated and compared the expression level of each gene at the different stages in these two datasets and screened out the genes upregulated in major ZGA in mouse. The results showed that 3987 genes in GSE53386 and 3541 genes in GSE71257 were upregulated in major ZGA, which is similar to the 3829 genes in our study. Moreover, the majority of the top 100 genes identified in major ZGA (Table S2) could be matched to the genes identified in the other two datasets, with significantly increased expression levels (Table S5).
Most of the current knowledge regarding early mammalian development comes from mouse, because mice reproduce rapidly, are easy to obtain and raise fewer ethical concerns. However, biological development is stage specific, and the timing of ZGA is not exactly the same across species. For example, ZGA occurs at the 4–8-cell stage in human (Li et al., 2010; Wu et al., 2016), but at the 8–16-cell stage in cattle and sheep (Schultz, 2002; Chen et al., 2012). This indicates that the process in human cannot be directly inferred from that in mouse. However, extensive experiments on human embryos are limited by the preciousness and scarcity of the embryos. To overcome this problem, we identified the conserved genes in ZGA and presented the gene list with the aim of pinpointing targets for future exploration of the underlying mechanisms in mouse and for prediction of their roles in human embryonic development. In short, our findings provide relevant insights to guide further exploration of human embryonic development.
Although our findings are encouraging, the study had some limitations. First, we analyzed gene expression using the FPKM or RPKM values provided in the database, which might introduce some errors. However, as more than 40,000 genes were obtained in this study, it is difficult to analyze gene expression experimentally using methods such as quantitative real-time polymerase chain reaction (qPCR). Second, because recently published work has already demonstrated important roles for some genes on our list, such as Mt1/Mt2 (Shi et al., 2018), Obox (Ji et al., 2023), Zfp352 (Mwalilino et al., 2023), Zscan5b (Ogawa et al., 2019) and Zscan4 (Cheng et al., 2020; Srinivasan et al., 2020) in mouse, and DUX4 (Vuoristo et al., 2022), ZSCAN4 (Vuoristo et al., 2022) and NANOGNB (Dunwell and Holland, 2017) in human (Table 3), we did not pick any other candidate genes for functional validation. Moreover, it was impractical to validate the genes one by one. Third, as mentioned above, ZGA proceeds in two phases, minor and major ZGA, and the pattern of gene expression changes dramatically between these two phases (Abe et al., 2015; Yamamoto and Aoki, 2017). In this research, a gene was therefore considered a potential key factor in the ZGA process if its expression level was significantly increased. However, it must be admitted that some factors whose expression levels change only slightly during ZGA may also be of great importance to ZGA. Last, the candidate genes identified may not only play a critical role in ZGA, but may also be of great significance in early embryonic development, such as the formation of blastocysts. For instance, Mga (Washkowitz et al., 2015) and Myc (Wang et al., 2010; Wan et al., 2013), whose expression levels were increased in ZGA, have also been reported to have roles in maintaining pluripotency in the mammalian embryo.
Overall, we identified potential key regulators in minor ZGA and major ZGA in both human and mouse and generated a list of these genes for each species. Moreover, we compiled a list of conserved genes in major ZGA and revealed that their functions are mainly related to ribosomal RNA. In summary, our findings provide a platform for future studies on ZGA, making it quicker and easier for other researchers to select one or several genes from the whole genome for subsequent research, and contribute to revealing the candidate genes and regulatory mechanisms involved in this special process.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/S0967199423000631
Acknowledgements
We thank all participants for the work, members of Jiawei Xu for the research design, Wenbo Li, Shuxia Ma, and Dong Fang for database supervision and validation. Other authors contributed to the manuscript revision, read it, and all the authors approved the submitted version.
Funding statement
This study was supported by the National Natural Science Foundation of China (31870817), the National Key R&D Programme of China (2019YFA0110900 and 2019YFA0802200), and the Science and Natural Science Foundation of Henan Province (22100026/20).
Competing interests
The authors declare that they have no competing interests.