Cotton Genomics and Genetics, 2024, Vol. 15, No. 1
Received: 29 Dec., 2023 Accepted: 08 Feb., 2024 Published: 19 Feb., 2024
Whole genome duplications (WGDs) have played a significant role in the evolution and diversification of the Gossypium genus. This study examines the processes and outcomes of genome duplication in Gossypium, highlighting their evolutionary, genetic, and functional implications. WGDs have been pivotal in shaping the genetic architecture of cotton, leading to increased complexity and adaptability. Drawing on published findings, the study describes how WGDs contribute to the accumulation of deleterious mutations, structural variations, and species-specific traits. It also explores the evolutionary advantages and challenges associated with polyploidy, including the potential for increased genetic diversity and resilience in stressful environments. The aim is to clarify the multifaceted impacts of genome duplications on the Gossypium genus and to offer insights that could inform future research and breeding programs.
1 Introduction
The Gossypium genus, commonly known as cotton, encompasses a diverse group of species that are economically significant for their natural fiber production. This genus includes both diploid and allotetraploid species, with the latter being particularly important in agriculture due to their superior fiber qualities and adaptability to various environmental conditions. Notable species within this genus include Gossypium hirsutum and Gossypium barbadense, which are widely cultivated for their high-quality fibers (Wang et al., 2018; Hu et al., 2019). The genus also includes wild species such as Gossypium longicalyx and Gossypium australe, which possess traits valuable for breeding programs, such as disease resistance and nematode immunity (Cai et al., 2019; Grover et al., 2020).
Genome duplication, or polyploidization, is a pivotal evolutionary process that has significantly shaped the genomes of many plant species, including those within the Gossypium genus. Polyploidization events can lead to increased genetic diversity, novel gene functions, and enhanced adaptability, which are crucial for the survival and evolution of species (Nardeli et al., 2018; Hu et al., 2019). In Gossypium, the allotetraploid species have undergone extensive structural variations and gene duplications, which have contributed to their superior agronomic traits and fiber qualities (Wang et al., 2018; Hu et al., 2019). Understanding the mechanisms and outcomes of these genome duplications is essential for advancing cotton breeding programs and improving crop resilience to environmental stresses (Nardeli et al., 2018; Huang et al., 2020).
This study aims to provide an in-depth analysis of genome duplications within the Gossypium genus and their evolutionary and functional outcomes. This study elucidates the evolutionary history and mechanisms of genome duplications in Gossypium species, with a focus on both diploid and allotetraploid genomes; explores the structural variations and gene duplications that have occurred post-polyploidization and their implications for cotton evolution and domestication; and discusses the potential applications of these genomic insights in cotton breeding programs aimed at improving fiber quality, disease resistance, and environmental adaptability.
2 Evolutionary Background of Gossypium
2.1 Phylogenetic relationships within the genus
The Gossypium genus, which includes economically significant cotton species, exhibits a complex phylogenetic structure. The phylogenetic relationships within Gossypium have been elucidated through comprehensive genomic studies. For instance, the genome sequence of Gossypium herbaceum and updates to the genomes of Gossypium arboreum and Gossypium hirsutum have provided insights into the evolutionary history of the A-genome lineage, suggesting that all existing A-genomes may have originated from a common ancestor (Huang et al., 2020). Additionally, the genome sequence of Gossypioides kirkii has illustrated the descending dysploidy in plants, highlighting structural rearrangements that have occurred over time (Udall et al., 2019).
2.2 Historical events leading to genome duplications
Genome duplications, particularly whole-genome duplications (WGDs), have played a pivotal role in the evolution of Gossypium. The allotetraploid cotton species, Gossypium hirsutum and Gossypium barbadense, are products of such duplications. Comparative genomic analyses have revealed extensive structural variations and gene family expansions that occurred post-polyploidization, contributing to the speciation and evolutionary history of these species (Wang et al., 2018; Hu et al., 2019). Historical WGD events are not unique to Gossypium but are widespread among angiosperms, often correlating with periods of global environmental changes and contributing to species diversification (Ren et al., 2018; Wu et al., 2020).
2.3 Implications of polyploidy in plant evolution
Polyploidy, or the condition of having multiple sets of chromosomes, has significant implications for plant evolution. It is associated with increased genetic diversity, which can enhance adaptability to environmental stresses. Polyploid plants often exhibit novel traits and improved resilience, which can be advantageous in harsh or changing environments (Yao et al., 2019; Peer et al., 2020). In Gossypium, polyploidy has facilitated the development of superior fiber qualities and better survival in diverse environments (Hu et al., 2019). Moreover, polyploidy has been linked to the retention of gene duplicates that play critical roles in stress responses, further supporting the adaptive potential of polyploid plants (Wu et al., 2020).
3 Mechanisms of Genome Duplication
3.1 Types of genome duplication: whole-genome, segmental, and tandem duplications
Genome duplications can be broadly categorized into three types: whole-genome duplications (WGDs), segmental duplications, and tandem duplications. Whole-genome duplications, also known as polyploidy, involve the duplication of the entire genome and are a significant evolutionary force in plants, including Gossypium species (Figure 1) (Conover and Wendel, 2021; Johri et al., 2022). Segmental duplications refer to the duplication of large segments of the genome, which can include multiple genes and regulatory elements. These duplications can contribute to genetic diversity and complexity within the genome (Vollger et al., 2018; Li et al., 2020). Tandem duplications involve the duplication of small segments of DNA that are adjacent to each other on the chromosome. These duplications can lead to variations in gene expression and function, as seen in studies of Drosophila melanogaster and other species (Loehlin et al., 2021; Schimmel et al., 2022).
Figure 1 Phylogeny and biogeography of Gossypium allopolyploids and progenitor diploids (Adapted from Conover and Wendel, 2021) Image caption: Diploid species of Gossypium are sorted into eight genome groups. Groups A (exemplified by G. herbaceum) and D (exemplified by G. raimondii) diverged around 5 million years ago, originating from different hemispheres. Allopolyploids emerged around 1-1.6 million years ago after a transoceanic journey brought an ancestor of the A genome (represented by G. herbaceum [A1]) to the Americas, where it hybridized with a local D genome species (represented by G. raimondii [D5]). This event led to the diversification and evolution of seven recognized allopolyploid species that now span the Americas and the Pacific islands. Each species' flower and fruit morphology, along with their island locations and geographic extents, are depicted. The phylogenetic tree's branch lengths are illustrative, with significant divergence times marked.
Figure 1, adapted from Conover and Wendel (2021), visualizes the phylogeny and biogeography of Gossypium allopolyploids and their diploid ancestors, showing the evolutionary trajectory of cotton species. It illustrates the diversification of allopolyploids such as G. hirsutum (AD1) and G. barbadense (AD2) following a key allopolyploidization event roughly 1 to 1.6 million years ago. The figure pairs flower and fruit morphologies with geographic distributions across the Americas and Pacific islands, emphasizing ecological adaptations and speciation. It provides the phylogenetic context for findings that polyploidy influences the accumulation of mutations, which in turn affects genetic diversity within the genus.
3.2 Molecular mechanisms underlying duplications
The molecular mechanisms underlying genome duplications vary depending on the type of duplication. Whole-genome duplications often result from errors during meiosis or mitosis, leading to the duplication of the entire set of chromosomes (Conover and Wendel, 2021). Segmental duplications can occur through mechanisms such as unequal crossing over during meiosis, replication slippage, or the activity of transposable elements (Vollger et al., 2018; Li et al., 2020). Tandem duplications are frequently the result of errors in DNA replication or repair processes, such as non-homologous end joining (NHEJ) or microhomology-mediated end joining (MMEJ) (Loehlin et al., 2021; Schimmel et al., 2022). These mechanisms can lead to the creation of new gene copies that may acquire new functions (neofunctionalization), divide the original function (subfunctionalization), or maintain the same function to increase gene dosage (Birchler and Yang, 2022).
3.3 Detection and analysis of duplications in Gossypium
Detecting and analyzing duplications in the Gossypium genome involves various genomic and bioinformatic techniques. Comparative genomic analyses have been used to identify repetitive elements and segmental duplications between different Gossypium species, such as Gossypium raimondii and Gossypium arboreum. Fluorescence in situ hybridization (FISH) has been employed to visualize the distribution of specific repetitive elements across chromosomes, aiding in the identification of duplicated regions (Lu et al., 2019). High-throughput sequencing technologies and advanced computational tools, such as the Segmental Duplication Assembler (SDA) and TARDIS, have been developed to resolve complex duplications and accurately characterize structural variations in the genome (Soylev et al., 2018; Vollger et al., 2018). These methods enable researchers to better understand the evolutionary history and functional implications of genome duplications in Gossypium.
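The distinction between tandem duplications and larger segmental or whole-genome duplications can be illustrated with a small positional heuristic. The sketch below is a simplified illustration, not the procedure used by SDA, TARDIS, or any cited study. The gene names, the gap threshold `max_tandem_gap`, and the `in_collinear_block` flag (assumed to come from a separate synteny analysis) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str    # hypothetical gene identifier
    chrom: str   # chromosome label, e.g. "A01"
    order: int   # rank of the gene along its chromosome


def classify_pair(a: Gene, b: Gene, in_collinear_block: bool,
                  max_tandem_gap: int = 1) -> str:
    """Label a homologous gene pair by its likely duplication mode.

    Assumption: adjacent homologs on one chromosome are treated as tandem;
    pairs inside a collinear (syntenic) block are treated as segmental or
    whole-genome duplicates; everything else is called dispersed.
    """
    if a.chrom == b.chrom and abs(a.order - b.order) <= max_tandem_gap:
        return "tandem"
    if in_collinear_block:
        return "segmental_or_wgd"
    return "dispersed"


# Hypothetical gene pairs for illustration only.
g1 = Gene("geneA", "A01", 10)
g2 = Gene("geneB", "A01", 11)
g3 = Gene("geneC", "D01", 12)

print(classify_pair(g1, g2, in_collinear_block=False))  # tandem
print(classify_pair(g1, g3, in_collinear_block=True))   # segmental_or_wgd
```

Positional adjacency alone cannot separate segmental from whole-genome duplicates, which is why the sketch merges them into one label; telling them apart requires genome-wide synteny and dating evidence.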
4 Genomic Consequences of Duplications
4.1 Gene retention and loss post-duplication
Gene duplication is a prevalent phenomenon across the tree of life, and the processes that lead to the retention of duplicated genes are complex and multifaceted. Retention of duplicated genes can occur through various mechanisms, including neofunctionalization, subfunctionalization, back-up compensation, and dosage amplification (Kuzmin et al., 2021). Whole genome duplications (WGDs) are particularly significant, as they result in the retention of hundreds to thousands of gene duplicates, which can contribute to genome complexity and species diversity. However, not all duplicated genes are retained; many are lost over time. The rate of gene loss following WGD events is exponential, with an estimated half-life of approximately 21.6 million years (Ren et al., 2018). Additionally, the retention of duplicated genes is influenced by their functional roles, with genes involved in essential cellular processes being preferentially retained (Ren et al., 2018).
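To make the half-life figure concrete, the following sketch computes the fraction of duplicate pairs expected to remain after a given time under a simple exponential model. It assumes a single constant half-life (the 21.6-million-year estimate quoted above) and applies it uniformly, which is a simplification; the function name and the example time points are illustrative choices.

```python
HALF_LIFE_MY = 21.6  # half-life in million years, as quoted from Ren et al. (2018)


def retained_fraction(t_my: float, half_life_my: float = HALF_LIFE_MY) -> float:
    """Fraction of WGD-derived duplicates still retained after t_my million years.

    Assumes simple exponential decay: N(t) / N(0) = 0.5 ** (t / half_life).
    """
    if t_my < 0 or half_life_my <= 0:
        raise ValueError("time must be non-negative and half-life positive")
    return 0.5 ** (t_my / half_life_my)


# Illustrative time points (million years since duplication).
for t in (1.5, 21.6, 60.0):
    print(f"{t:5.1f} My: {retained_fraction(t):.3f} of duplicates retained")
```

Under this model, an event as recent as the cotton allopolyploidization (about 1 to 1.6 million years ago) would have lost only a few percent of its duplicates, whereas after one half-life exactly half remain.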
4.2 Functional divergence and subfunctionalization
Following duplication, gene copies can undergo functional divergence, leading to subfunctionalization or neofunctionalization. Subfunctionalization occurs when duplicated genes partition the original functions between them, while neofunctionalization involves one copy acquiring a new function (Kuzmin et al., 2021; Birchler and Yang, 2022). In the context of Gossypium, duplicated genes may also exhibit tissue-specific expression evolution, reflecting strong selection pressures to maintain genome stability and adapt to new functional roles. The divergence in gene expression can be driven by various factors, including transposable element insertions in promoters, which can create novel binding sites for transcription factors and drive tissue-specific gains in expression (Gillard et al., 2021).
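Tissue-specific expression divergence between duplicate copies can be summarized numerically. One commonly used summary is the tau tissue-specificity index, which ranges from 0 (uniform expression) to 1 (expression confined to a single tissue). The sketch below is an illustration only: the cited studies are not stated to use this index, and the expression values for the two hypothetical copies are invented for the example.

```python
from typing import Sequence


def tau_index(expression: Sequence[float]) -> float:
    """Tau tissue-specificity index for one gene across n tissues.

    tau = sum(1 - x_i / x_max) / (n - 1), where x_i is the expression
    level in tissue i. Returns 0.0 for a gene with no expression at all.
    """
    values = list(expression)
    if len(values) < 2:
        raise ValueError("need expression values for at least two tissues")
    if any(v < 0 for v in values):
        raise ValueError("expression values must be non-negative")
    peak = max(values)
    if peak == 0:
        return 0.0
    return sum(1 - v / peak for v in values) / (len(values) - 1)


# Hypothetical expression of two duplicate copies in four tissues
# (e.g. root, stem, leaf, fiber); values are made up for illustration.
copy_a = [10.0, 10.0, 10.0, 10.0]   # broadly expressed copy
copy_b = [0.0, 0.0, 20.0, 0.0]      # copy expressed in one tissue only

print(tau_index(copy_a))  # 0.0
print(tau_index(copy_b))  # 1.0
```

A duplicate pair in which one copy keeps a low tau while the other shifts toward 1 would be consistent with expression divergence after duplication, although the index by itself does not distinguish subfunctionalization from neofunctionalization.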
4.3 Pseudogenization and gene redundancy
Pseudogenization is a common fate for duplicated genes, where one copy becomes a non-functional pseudogene due to the accumulation of deleterious mutations (Zachariah and Gray, 2019; Birchler and Yang, 2022). In Gossypium, deleterious mutations accumulate faster in allopolyploids than in diploids, likely due to the masking effect of redundant gene copies. This process can lead to unequal accumulation of deleterious mutations between subgenomes, further contributing to genomic divergence (Conover and Wendel, 2021). Additionally, the fractionation process following WGD can result in the loss of syntenic paralogs, with few relics of the missing sequence remaining (Yu et al., 2020). This highlights the dynamic nature of genome evolution post-duplication, where gene redundancy and pseudogenization play significant roles in shaping the genomic landscape.
5 Phenotypic Outcomes of Genome Duplications
5.1 Impact on morphological traits
Genome duplications in Gossypium species have led to significant changes in morphological traits. Polyploidy, or whole genome duplication (WGD), is a common phenomenon in plants and has been shown to affect various phenotypic traits. For instance, a meta-analysis of WGD effects on flowering traits in plants revealed that floral morphology traits generally increased in size following genome duplication, although reproductive output decreased and flowering phenology remained unaffected (Porturas et al., 2019). In Gossypium, the allotetraploid species Gossypium hirsutum and Gossypium barbadense have evolved distinct morphological traits post-polyploidization, with G. hirsutum producing higher fiber yields and better surviving harsh environments compared to G. barbadense, which produces superior-quality fibers (Hu et al., 2019).
5.2 Influence on fiber quality and yield
Genome duplications have had a profound impact on fiber quality and yield in Gossypium species. The genetic variation and structural changes resulting from polyploidization have been harnessed to improve these traits. For example, a genome-wide association study (GWAS) on Gossypium barbadense identified several candidate genes associated with fiber strength and lint percentage, which are crucial for fiber quality (Yu et al., 2021). Additionally, introgression lines developed from crosses between G. hirsutum and G. tomentosum have been used to map quantitative trait loci (QTLs) for fiber quality and yield traits, identifying numerous QTLs that contribute to these important agricultural traits (Keerio et al., 2018). Comparative genomics analyses have also highlighted structural variations and gene family expansions that have contributed to the superior fiber quality and yield in G. hirsutum (Wang et al., 2018; Hu et al., 2019).
5.3 Adaptation to environmental stresses
Genome duplications have also played a critical role in the adaptation of Gossypium species to environmental stresses. The allotetraploid nature of Gossypium hirsutum has endowed it with greater resilience to environmental challenges compared to its diploid progenitors and other cotton species. This resilience is partly due to the extensive structural variations and gene family expansions that have occurred post-polyploidization, which have enhanced the species' ability to survive in diverse and harsh environments (Hu et al., 2019). Furthermore, the introgression of favorable chromosome segments from G. barbadense to G. hirsutum has been shown to improve fiber quality and stress resilience, providing valuable genetic resources for breeding programs aimed at enhancing cotton's adaptability to changing environmental conditions (Wang et al., 2018).
Genome duplications in Gossypium species have led to significant phenotypic outcomes, including changes in morphological traits, improvements in fiber quality and yield, and enhanced adaptation to environmental stresses. These changes have been driven by genetic variations, structural modifications, and gene family expansions that have occurred post-polyploidization, providing valuable insights for cotton breeding and improvement programs.
6 Comparative Genomics and Duplication Events
6.1 Comparative analysis of diploid and tetraploid Gossypium species
The comparative analysis of diploid and tetraploid Gossypium species reveals significant insights into the genomic changes and evolutionary processes that have occurred due to polyploidization. Allotetraploid Gossypium species, such as Gossypium hirsutum and Gossypium barbadense, originated from an allopolyploidization event involving an A-genome diploid species and a D-genome diploid species (Cheng et al., 2019). This event has led to the expansion and diversification of transposable elements (TEs) in the tetraploid genomes, with Gypsy being the most abundant TE type (Cheng et al., 2019). The Dof (DNA-binding one zinc finger) transcription factor family has expanded significantly in tetraploid species compared to their diploid progenitors, primarily through segmental duplication (Figure 2) (Li et al., 2020).
Figure 2 Expression patterns of Dof genes in different tissues (Adapted from Li et al., 2020) Image caption: A: G. hirsutum, B: G. barbadense
Figure 2, adapted from Li et al. (2020), shows the expression patterns of Dof genes across various tissues in Gossypium hirsutum and Gossypium barbadense, based on transcriptome data. Expression of Dof genes varies significantly across tissues, indicating diverse biological roles. Similar to patterns reported in millet, certain Dof genes are highly expressed in all tissues, suggesting essential regulatory functions, while others are expressed minimally, hinting at more specialized roles. This variability highlights the complexity of the Dof gene family's involvement in tissue-specific regulatory mechanisms and contributes to the understanding of its functional diversification in cotton development and stress responses.
The whole genome resequencing of multiple Gossypium species has demonstrated that deleterious mutations accumulate faster in allopolyploids than in their diploid progenitors, likely due to the masking effect of redundant gene copies. This accumulation is uneven between the subgenomes, suggesting differential selective pressures and evolutionary trajectories for the A and D subgenomes (Conover and Wendel, 2021). Furthermore, the genome of the diploid wild species Gossypium australe shows closer collinear relationships with G. arboreum than with G. raimondii, indicating less extensive genome reorganization (Cai et al., 2019).
6.2 Cross-species comparisons with other crop genomes
Comparative genomic studies between Gossypium species and other crop genomes provide valuable insights into the evolutionary dynamics of polyploidy and genome duplication. For instance, the rice genus Oryza, which includes both recently formed and older allopolyploid species, serves as an excellent model for studying the temporal progression of genomic responses to allopolyploidy (Zou et al., 2020). In Oryza, the process of diploidization and the expression divergence driven by changes in selective constraints have been well documented, offering parallels to the evolutionary processes observed in Gossypium (Zou et al., 2020).
The identification of genome-specific repetitive elements, such as the ICRd motif in the Gossypium D genome, highlights the role of repetitive sequences in genome variation and evolution. These elements are common in the D5 genome but rare in the A2 genome, providing a basis for understanding the differences between the A and D genomes and facilitating research on Gossypium genome evolution (Lu et al., 2019).
6.3 Insights from comparative genomic studies
Comparative genomic studies have significantly advanced our understanding of the evolutionary history and functional genomics of Gossypium species. The improved genome assemblies of Gossypium hirsutum and Gossypium barbadense have revealed extensive structural variations, including large paracentric and pericentric inversions, that likely occurred after polyploidization. These structural variations have important implications for cotton breeding programs, as they can be associated with traits such as fiber quality (Wang et al., 2018).
The construction of genetic linkage maps and the identification of key genes involved in plant hormone signaling, development, and defense reactions have provided valuable resources for cotton improvement (Kirungu et al., 2018). The development of introgression lines and the identification of quantitative trait loci (QTL) associated with superior fiber quality further underscore the potential of comparative genomics in enhancing cotton breeding efforts (Wang et al., 2018).
The integration of high-throughput sequencing technologies and bioinformatics analyses has ushered in a new era of Gossypium genomics, enabling a deeper understanding of the genomic basis of fiber biogenesis and the landscape of cotton functional genomics. These advancements pave the way for multidisciplinary genomics-enabled breeding strategies aimed at achieving high fiber yield, quality, and environmental resilience in future cotton breeding programs (Yang et al., 2020).
7 Genomic Tools and Techniques
7.1 Advances in sequencing technologies
Recent advancements in sequencing technologies have significantly enhanced our understanding of the Gossypium genome. The integration of single-molecule real-time sequencing, BioNano optical mapping, and high-throughput chromosome conformation capture techniques has led to the development of reference-grade genome assemblies for Gossypium hirsutum and Gossypium barbadense. These assemblies exhibit improved contiguity and completeness, particularly in regions with high repeat content such as centromeres, compared to previous draft genomes (Wang et al., 2018). Similarly, the genome of Gossypium australe was sequenced using a combination of PacBio, Illumina short read, BioNano (DLS), and Hi-C technologies, resulting in a high-quality reference genome (Cai et al., 2019). These technological advancements have provided valuable genomic resources for cotton research and breeding programs.
7.2 Bioinformatics approaches for duplication analysis
Bioinformatics tools have been pivotal in analyzing gene duplications within the Gossypium genome. For instance, the MCScanX tool was employed to identify segmental duplications in the Dof transcription factor gene family in Gossypium hirsutum, revealing that these genes expanded due to segmental duplications (Li et al., 2018). Comparative genomics analyses have also been used to identify structural variations and transposable elements (TEs) that influence gene expression. In Gossypium hirsutum, TEs, particularly those targeted by 24-nt small interfering RNAs (siRNAs), were associated with reduced gene expression, highlighting the role of TEs in regulating gene expression (Cheng et al., 2019). Additionally, the identification of genome-specific repetitive elements, such as the ICRd motif in the Gossypium D genome, has facilitated the study of genome evolution and subgenome identification (Lu et al., 2019).
7.3 Functional genomics tools for studying duplicated genes
Functional genomics tools, including quantitative real-time polymerase chain reaction (qRT-PCR) and transcriptomics analyses, have been employed to study the expression and functional divergence of duplicated genes in Gossypium. For example, qRT-PCR analysis of the Dof transcription factor gene family in Gossypium hirsutum revealed differential expression patterns under various stress conditions, indicating their roles in stress responses (Li et al., 2018). Similarly, transcriptomics analyses in Gossypium australe identified multiple genes involved in disease resistance responses, with experiments confirming the induction of these genes by various plant hormones and pathogens (Cai et al., 2019). The study of the PEPC gene family in Gossypium also demonstrated that duplicated genes displayed diverse expression patterns, suggesting functional divergence and their roles in abiotic stress responses (Zhao et al., 2019). These functional genomics tools are essential for understanding the roles of duplicated genes in cotton evolution and adaptation.
8 Applications in Cotton Breeding
8.1 Exploiting duplications for trait improvement
Genomic duplications in cotton have provided a wealth of genetic material that can be harnessed for trait improvement. The reference-grade genome assemblies of Gossypium hirsutum and Gossypium barbadense have identified extensive structural variations, including large paracentric and pericentric inversions, which are crucial for understanding the genetic basis of fiber quality and other agronomic traits (Figure 3) (Wang et al., 2018). These duplications and structural variations can be exploited to introduce favorable traits from one species to another, as demonstrated by the identification of 13 quantitative trait loci (QTL) associated with superior fiber quality through the construction of introgression lines (Wang et al., 2018).
Figure 3 Identification of favorable chromosome segments controlling fiber quality by using introgression lines (Adapted from Wang et al., 2018) Image caption: a: Development of an introgression line population using G. hirsutum Emian22 (recurrent parent) and G. barbadense 3-79 (donor parent). The top track displays the fiber characteristics of both cotton varieties. The bottom track depicts the distribution of introgression segments across the 26 chromosomes in G. hirsutum, identified from 168 introgression lines; the x-axis represents the 26 chromosomes, while the y-axis represents the 168 lines. b: Fiber traits of introgression line N29, with a specific introgression segment on chromosome D12 (from 47.4 Mb to 54.6 Mb) highlighted in red. c: SNP ratio analysis by mapping-by-sequencing for the Xuzhou142fl variant, comparing SNPs on chromosomes A12 and D12 between two F2 population pools (one mirroring the fiber characteristics of Xuzhou142, the other showing the Xuzhou142fl phenotype). d: Localization of fiber-quality quantitative trait loci (QTLs) within G. hirsutum chromosomes, marked by red boxes; locations detailed in Supplementary Table 30. e: QTL analysis for fiber length (mm). f: QTL analysis for fiber elongation rate (%). Both e and f plot introgression segments on the x-axis and LOD scores on the y-axis, with QTL locations indicated by arrows.
The study by Wang et al. (2018) provides key insights into cotton genetics and fiber quality enhancement using introgression lines between Gossypium hirsutum and G. barbadense. By transferring advantageous traits from G. barbadense, researchers identified crucial genetic segments on chromosome D12 that significantly impact fiber characteristics. This investigation not only verifies the utility of introgression lines in cotton breeding but also isolates specific genetic loci for targeted improvement. Through extensive QTL analysis, the study unveils new loci that affect fiber traits, thereby establishing a foundation for advanced breeding strategies. This research demonstrates the potential for genetic advancements in cotton, suggesting sophisticated breeding methods for superior fiber quality.
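The LOD scores plotted in Figure 3e and 3f measure how much better a model with a QTL at a given segment explains the trait than a model without one. The sketch below shows one textbook form, a single-marker regression LOD, computed from residual sums of squares. It is an illustrative assumption and not necessarily the method used by Wang et al. (2018); the genotype labels and fiber-length values are hypothetical.

```python
import math
from statistics import fmean
from typing import Mapping, Sequence


def _rss(values: Sequence[float]) -> float:
    """Residual sum of squares around the mean of the values."""
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def lod_single_marker(pheno_by_genotype: Mapping[str, Sequence[float]]) -> float:
    """LOD score for one marker from phenotypes grouped by genotype.

    LOD = (n / 2) * log10(RSS_null / RSS_qtl), where RSS_null ignores the
    genotype and RSS_qtl fits a separate mean for each genotype class.
    """
    groups = [list(g) for g in pheno_by_genotype.values()]
    if any(len(g) == 0 for g in groups):
        raise ValueError("every genotype class needs at least one phenotype")
    pooled = [v for g in groups for v in g]
    rss_null = _rss(pooled)
    rss_qtl = sum(_rss(g) for g in groups)
    if rss_qtl == 0:
        raise ValueError("perfect fit within classes; LOD is undefined")
    return (len(pooled) / 2) * math.log10(rss_null / rss_qtl)


# Hypothetical fiber lengths (mm) for lines with and without an
# introgressed segment; values are invented for illustration.
fiber_length = {
    "recurrent": [28.0, 29.0, 30.0],
    "introgressed": [31.0, 32.0, 33.0],
}
print(f"LOD = {lod_single_marker(fiber_length):.2f}")
```

When the genotype classes have identical phenotype distributions the two residual sums are equal and the LOD is zero; larger separation between classes relative to within-class spread gives a higher LOD.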
The whole-genome resequencing of G. barbadense has revealed significant genetic variation, including single-nucleotide polymorphisms (SNPs) and insertion-deletions (indels), which are associated with important traits such as fiber strength and lint percentage (Yu et al., 2021). These genetic variations provide a rich resource for selecting and breeding cotton varieties with improved fiber quality.
8.2 Breeding strategies integrating genomic duplication insights
Breeding strategies that integrate insights from genomic duplications involve the use of high-quality genome assemblies and comparative genomics to identify and select for beneficial traits. The high-quality de novo-assembled genomes of G. hirsutum and G. barbadense have provided a detailed understanding of species-specific alterations in gene expression and structural variations, which are essential for speciation and evolutionary history (Hu et al., 2019). This information can be used to develop breeding programs that focus on improving fiber quality and resilience to environmental stress.
The construction of ultra-dense genetic maps using re-sequencing data has also facilitated the identification of QTL associated with important traits such as crossover frequency and floral transition (Shen et al., 2021). These maps provide a valuable resource for understanding the recombination landscape and for developing breeding strategies that incorporate genomic duplication insights to enhance cotton traits.
8.3 Case studies of successful breeding programs
Several successful breeding programs have leveraged genomic duplications to improve cotton traits. For instance, the introgression of favorable chromosome segments from G. barbadense to G. hirsutum has led to the identification of QTL associated with superior fiber quality, demonstrating the practical application of genomic duplications in breeding (Wang et al., 2018).
Another example is the use of whole-genome resequencing to identify candidate genes associated with fiber strength and lint percentage in G. barbadense. The identified genes, such as orthologs of HD16, WDL2, and TUBA1, serve as promising targets for genetic engineering and breeding programs aimed at improving these traits (Yu et al., 2021).
Furthermore, the high-quality genome assembly of Gossypium tomentosum has provided insights into the genetic variations and recombination landscape, which are essential for understanding interspecific crosses and for developing breeding strategies that enhance genetic improvement in cotton (Shen et al., 2021). By integrating genomic duplication insights into breeding programs, researchers and breeders can develop cotton varieties with improved fiber quality, resilience to environmental stress, and other desirable traits, ultimately contributing to the advancement of cotton breeding and production.
9 Future Perspectives and Challenges
9.1 Emerging trends in genome duplication research
Recent advancements in genome sequencing technologies have significantly enhanced our understanding of genome duplications and their evolutionary implications. The integration of single-molecule real-time sequencing, BioNano optical mapping, and high-throughput chromosome conformation capture techniques has led to the development of high-quality genome assemblies for Gossypium species, providing deeper insights into their evolutionary history and structural variations (Wang et al., 2018). Comparative genomic and phylogenomic analyses have revealed widespread whole genome duplications (WGDs) across angiosperms, highlighting their role in species diversification and adaptation to environmental changes (Ren et al., 2018). Additionally, the use of digital organisms to simulate the evolutionary consequences of WGDs under different environmental scenarios has provided new perspectives on the adaptive potential of polyploids (Yao et al., 2019).
9.2 Potential challenges in studying complex genomes
Despite these advancements, several challenges remain in studying complex genomes like those of Gossypium species. One major challenge is the accumulation of deleterious mutations in allopolyploids, which can complicate the interpretation of evolutionary dynamics and the functional significance of duplicated genes (Conover and Wendel, 2021). Another challenge is the high level of genome-specific repetitive elements, which can cause genome variation and complicate genome assembly and annotation processes (Lu et al., 2019). Furthermore, the narrow genetic base of tetraploid cotton cultivars poses a bottleneck for genetic improvement, necessitating the development of interspecific hybrids and the introduction of wild germplasm (Kirungu et al., 2018).
9.3 Future directions for Gossypium genomics
Future research in Gossypium genomics should focus on several key areas to overcome these challenges and leverage the potential of genome duplications. First, more comprehensive, high-resolution genetic maps are needed to facilitate the identification of quantitative trait loci (QTL) associated with important agronomic traits, such as fiber quality and stress resistance (Wang et al., 2018; Shen et al., 2021). Second, further studies should investigate the functional roles of duplicated genes and their contributions to plant development and stress responses, as exemplified by the Dof transcription factor gene family in Gossypium hirsutum (Li et al., 2018). Third, systematic meta-analyses of the impact of genome duplication on secondary metabolite composition can provide valuable insights into the ecological and physiological consequences of polyploidy (Gaynor et al., 2020). Finally, the development of innovative computational models and bioinformatics tools will be crucial for dissecting the complex interactions between duplicated genomes and their environments, ultimately informing breeding programs and conservation strategies for Gossypium species (Yao et al., 2019). By pursuing these directions, researchers can deepen our understanding of genome duplications and their outcomes, paving the way for improved cotton breeding and sustainable agricultural practices.
10 Concluding Remarks
The in-depth analysis of Gossypium genome duplications has revealed several critical insights into the evolutionary history and functional outcomes of polyploidization in cotton species. The reference-grade genome assemblies of Gossypium hirsutum and Gossypium barbadense have significantly improved our understanding of cotton evolution, highlighting extensive structural variations and the identification of quantitative trait loci associated with superior fiber quality. Comparative genomics analyses have elucidated species-specific alterations in gene expression and structural variations that underpin the speciation and evolutionary history of these allotetraploid species. Additionally, the assembly of the Gossypium herbaceum genome and updates to Gossypium arboreum and Gossypium hirsutum genomes have provided valuable insights into the phylogenetic relationships and origins of cotton A-genomes. The accumulation of deleterious mutations in allopolyploid cotton compared to diploid progenitors has also been demonstrated, providing a genome-wide perspective on the evolutionary fate of gene duplications.
The findings from these studies have profound implications for cotton science and agriculture. The improved genome assemblies and comparative analyses offer a robust framework for future breeding programs aimed at enhancing fiber quality and environmental resilience. The identification of structural variations and quantitative trait loci associated with fiber quality can inform targeted breeding strategies to introduce favorable traits from G. barbadense into G. hirsutum, potentially leading to the development of superior cotton cultivars. Furthermore, understanding the evolutionary dynamics of gene duplications and the accumulation of deleterious mutations in allopolyploid cotton can guide the management of genetic diversity and the selection of beneficial traits in breeding programs. The insights gained from the phylogenetic relationships and origins of cotton genomes also provide a foundation for exploring the domestication history and genetic improvement of other economically important crops.
Genome duplications have played a pivotal role in the evolution and domestication of cotton species, leading to the diversification of traits and the enhancement of fiber quality. The comprehensive genomic resources and analyses presented in these studies underscore the complexity and significance of polyploidization in shaping the genetic architecture of cotton. As we continue to unravel the intricacies of genome duplications, it is essential to leverage these insights to drive innovation in cotton breeding and to address the challenges posed by changing environmental conditions. The integration of advanced genomic technologies and multidisciplinary approaches will be crucial in harnessing the full potential of cotton genomes for sustainable agricultural practices and the continued improvement of this vital crop.
Acknowledgments
The authors extend sincere thanks to two anonymous peer reviewers, whose evaluations and suggestions have greatly contributed to the improvement of this manuscript.
Conflict of Interest Disclosure
The authors affirm that this research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.
References
Birchler J., and Yang H., 2022, The multiple fates of gene duplications: deletion, hypofunctionalization, subfunctionalization, neofunctionalization, dosage balance constraints, and neutral variation, The Plant Cell, 34: 2466-2474.
https://doi.org/10.1093/plcell/koac076
Cai Y., Cai X., Wang Q., Wang P., Zhang Y., Cai C., Xu Y., Wang K., Zhou Z., Wang C., Geng S., Li B., Dong Q., Hou Y., Wang H., Ai P., Liu Z., Yi F., Sun M., An G., Cheng J., Zhang Y., Shi Q., Xie Y., Shi X., Chang Y., Huang F., Chen Y., Hong S., Mi L., Sun Q., Zhang L., Zhou B., Peng R., Zhang X., and Liu F., 2019, Genome sequencing of the Australian wild diploid species Gossypium australe highlights disease resistance and delayed gland morphogenesis, Plant Biotechnology Journal, 18: 814-828.
https://doi.org/10.1111/pbi.13249
Cheng H., Sun G., He S., Gong W., Peng Z., Wang R., Lin Z., and Du X., 2019, Comparative effect of allopolyploidy on transposable element composition and gene expression between Gossypium hirsutum and its two diploid progenitors, Journal of Integrative Plant Biology, 61(1): 45-59.
https://doi.org/10.1111/jipb.12763
Conover J., and Wendel J., 2021, Deleterious mutations accumulate faster in allopolyploid than diploid cotton (Gossypium) and unequally between subgenomes, Molecular Biology and Evolution, 39(2): msac024.
https://doi.org/10.1093/molbev/msac024
Gaynor M., Lim-Hing S., and Mason C., 2020, Impact of genome duplication on secondary metabolite composition in non-cultivated species: a systematic meta-analysis, Annals of Botany, 126(3): 363-376.
https://doi.org/10.1093/aob/mcaa107
Gillard G., Grønvold L., Røsæg L., Holen M., Monsen Ø., Koop B., Rondeau E., Gundappa M., Mendoza J., Macqueen D., Rohlfs R., Sandve S., and Hvidsten T., 2021, Comparative regulomics supports pervasive selection on gene dosage following whole genome duplication, Genome Biology, 22: 1-18.
https://doi.org/10.1186/s13059-021-02323-0
Grover C., Pan M., Yuan D., Arick M., Hu G., Brase L., Stelly D., Lu Z., Schmitz R., Peterson D., Wendel J., and Udall J., 2020, The Gossypium longicalyx genome as a resource for cotton breeding and evolution, G3: Genes|Genomes|Genetics, 10: 1457-1467.
https://doi.org/10.1534/g3.120.401050
Hu Y., Chen J., Fang L., Zhang Z., Ma W., Niu Y., Ju L., Deng J., Zhao T., Lian J., Baruch K., Fang D., Liu X., Ruan Y., Rahman M., Han J., Wang K., Wang Q., Wu H., Mei G., Zang Y., Han Z., Xu C., Shen W., Yang D., Si Z., Dai F., Zou L., Huang F., Bai Y., Zhang Y., Brodt A., Ben-Hamo H., Zhu X., Zhou B., Guan X., Zhu S., Chen X., and Zhang T., 2019, Gossypium barbadense and Gossypium hirsutum genomes provide insights into the origin and evolution of allotetraploid cotton, Nature Genetics, 51: 739-748.
https://doi.org/10.1038/s41588-019-0371-5
Huang G., Wu Z., Percy R., Bai M., Li Y., Frelichowski J., Hu J., Wang K., Yu J., and Zhu Y., 2020, Genome sequence of Gossypium herbaceum and genome updates of Gossypium arboreum and Gossypium hirsutum provide insights into cotton A-genome evolution, Nature Genetics, 52: 516-524.
https://doi.org/10.1038/s41588-020-0607-4
Johri P., Gout J., Doak T., and Lynch M., 2022, A population-genetic lens into the process of gene loss following whole-genome duplication, Molecular Biology and Evolution, 39(6): msac118.
https://doi.org/10.1093/molbev/msac118
Keerio A., Shen C., Nie Y., Ahmed M., Zhang X., and Lin Z., 2018, QTL mapping for fiber quality and yield traits based on introgression lines derived from Gossypium hirsutum × G. tomentosum, International Journal of Molecular Sciences, 19(1): 243.
https://doi.org/10.3390/ijms19010243
Kirungu J., Deng Y., Cai X., Magwanga R., Zhou Z., Wang X., Wang Y., Zhang Z., Wang K., and Liu F., 2018, Simple Sequence Repeat (SSR) genetic linkage map of D genome diploid cotton derived from an interspecific cross between Gossypium davidsonii and Gossypium klotzschianum, International Journal of Molecular Sciences, 19(1): 204.
https://doi.org/10.3390/ijms19010204
Kuzmin E., Taylor J., and Boone C., 2021, Retention of duplicated genes in evolution, Trends in Genetics: TIG, 38(1): 59-72.
https://doi.org/10.1016/j.tig.2021.06.016
Li H., Dou L., Li W., Wang P., Zhao Q., Xi R., Pei X., Liu Y., and Ren Z., 2018, Genome-wide identification and expression analysis of the Dof transcription factor gene family in Gossypium hirsutum L., Agronomy, 8(9): 186.
https://doi.org/10.3390/agronomy8090186
Li Y., Liu Z., Zhang K., Chen S., Liu M., and Zhang Q., 2020, Genome-wide analysis and comparison of the DNA-binding one zinc finger gene family in diploid and tetraploid cotton (Gossypium), PLoS One, 15(6): e0235317.
https://doi.org/10.1371/journal.pone.0235317
Loehlin D., Kim J., and Paster C., 2021, A tandem duplication in Drosophila melanogaster shows enhanced expression beyond the gene copy number, Genetics, 220(3): iyab231.
https://doi.org/10.1093/genetics/iyab231
Lu H., Cui X., Zhao Y., Magwanga R., Li P., Cai X., Zhou Z., Wang X., Liu Y., Xu Y., Hou Y., Peng R., Wang K., and Liu F., 2019, Identification of a genome-specific repetitive element in the Gossypium D genome, PeerJ, 8: e8344.
https://doi.org/10.7717/peerj.8344
Nardeli S., Artico S., Aoyagi G., Moura S., Silva T., Grossi-de-Sá M., Romanel E., and Alves-Ferreira M., 2018, Genome-wide analysis of the MADS-box gene family in polyploid cotton (Gossypium hirsutum) and in its diploid parental species (Gossypium arboreum and Gossypium raimondii), Plant Physiology and Biochemistry: PPB, 127: 169-184.
https://doi.org/10.1016/j.plaphy.2018.03.019
Peer Y., Ashman T., Soltis P., and Soltis D., 2020, Polyploidy: an evolutionary and ecological force in stressful times, The Plant Cell, 33: 11-26.
https://doi.org/10.1093/plcell/koaa015
Porturas L., Anneberg T., Curé A., Wang S., Althoff D., and Segraves K., 2019, A meta-analysis of whole genome duplication and the effects on flowering traits in plants, American Journal of Botany, 106(3): 469-476.
https://doi.org/10.1002/ajb2.1258
Ren R., Wang H., Guo C., Zhang N., Zeng L., Chen Y., Ma H., and Qi J., 2018, Widespread whole genome duplications contribute to genome complexity and species diversity in Angiosperms, Molecular Plant, 11(3): 414-428.
https://doi.org/10.1016/j.molp.2018.01.002
Schimmel J., Wezel M., Schendel R., and Tijsterman M., 2022, Chromosomal breaks at the origin of small tandem DNA duplications, BioEssays, 45(1): 2200168.
https://doi.org/10.1002/bies.202200168
Shen C., Wang N., Zhu D., Wang P., Wang M., Wen T., Le Y., Wu M., Yao T., Zhang X., and Lin Z., 2021, Gossypium tomentosum genome and interspecific ultra-dense genetic maps reveal genomic structures, recombination landscape and flowering depression in cotton, Genomics, 113(4): 1999-2009.
https://doi.org/10.1016/j.ygeno.2021.04.036
Soylev A., Le T., Amini H., Alkan C., and Hormozdiari F., 2018, Discovery of tandem and interspersed segmental duplications using high throughput sequencing, Bioinformatics, 35(20): 3923-3930.
https://doi.org/10.1093/bioinformatics/btz237
Udall J., Long E., Ramaraj T., Conover J., Yuan D., Grover C., Gong L., Arick M., Masonbrink R., Peterson D., and Wendel J., 2019, The genome sequence of Gossypioides kirkii illustrates a descending dysploidy in plants, Frontiers in Plant Science, 10: 472622.
https://doi.org/10.3389/fpls.2019.01541
Vollger M., Dishuck P., Sorensen M., Welch A., Dang V., Dougherty M., Graves-Lindsay T., Wilson R., Chaisson M., and Eichler E., 2018, Long-read sequence and assembly of segmental duplications, Nature Methods, 16: 88-94.
https://doi.org/10.1038/s41592-018-0236-3
Wang M., Tu L., Yuan D., Zhu D., Shen C., Li J., Liu F., Pei L., Wang P., Zhao G., Ye Z., Huang H., Yan F., Ma Y., Zhang L., Liu M., You J., Yang Y., Liu Z., Huang F., Li B., Qiu P., Zhang Q., Zhu L., Jin S., Yang X., Min L., Li G., Chen L., Zheng H., Lindsey K., Lin Z., Udall J., and Zhang X., 2018, Reference genome sequences of two cultivated allotetraploid cottons, Gossypium hirsutum and Gossypium barbadense, Nature Genetics, 51: 224-229.
https://doi.org/10.1038/s41588-018-0282-x
Wu S., Han B., and Jiao Y., 2020, Genetic contribution of paleopolyploidy to adaptive evolution in angiosperms, Molecular Plant, 13(1): 59-71.
https://doi.org/10.1016/j.molp.2019.10.012
Yang Z., Qanmber G., Wang Z., Yang Z., and Li F., 2020, Gossypium genomics: trends, scope, and utilization for cotton improvement, Trends in Plant Science, 25(5): 488-500.
https://doi.org/10.1016/j.tplants.2019.12.011
Yao Y., Carretero-Paulet L., and Peer Y., 2019, Using digital organisms to study the evolutionary consequences of whole genome duplication and polyploidy, PLoS One, 14(7): e0220257.
https://doi.org/10.1371/journal.pone.0220257
Yu J., Hui Y., Chen J., Yu H., Gao X., Zhang Z., Li Q., Zhu S., and Zhao T., 2021, Whole-genome resequencing of 240 Gossypium barbadense accessions reveals genetic variation and genes associated with fiber strength and lint percentage, Theoretical and Applied Genetics, 134: 3249-3261.
https://doi.org/10.1007/s00122-021-03889-w
Yu Z., Zheng C., Albert V., and Sankoff D., 2020, Excision dominates pseudogenization during fractionation after whole genome duplication and in gene loss after speciation in plants, Frontiers in Genetics, 11: 603056.
https://doi.org/10.3389/fgene.2020.603056
Zachariah S., and Gray D., 2019, Deubiquitinating enzymes in model systems and therapy: redundancy and compensation have implications, BioEssays, 41(11): 1900112.
https://doi.org/10.1002/bies.201900112
Zhao Y., Guo A., Wang Y., and Hua J., 2019, Evolution of PEPC gene family in Gossypium reveals functional diversification and GhPEPC genes responding to abiotic stresses, Gene, 698: 61-71.
https://doi.org/10.1016/j.gene.2019.02.061
Zou X., Du Y., Wang X., Wang Q., Zhang B., Chen J., Chen M., Doyle J., and Ge S., 2020, Genome evolution in Oryza allopolyploids of various ages: insights into the process of diploidization, The Plant Journal, 105(3): 721-735.
https://doi.org/10.1111/tpj.15066