Comprehensive analysis of phylogenetic relationships within the genus Coffea, tracing the evolutionary origins of cultivated species, wild relatives, and the remarkable diversification across Africa, Madagascar, and the Mascarene Islands.
The genus Coffea comprises approximately 124 species distributed across tropical Africa, Madagascar, and the Mascarene Islands. Phylogenetic analysis reveals the evolutionary relationships among these species and illuminates the origins of cultivated coffee [2][5][7][9].
Phylogenetic studies using molecular markers, including chloroplast DNA (trnL-trnF, accD, rpl16), nuclear ribosomal DNA, and genome-wide SNPs, have revolutionized our understanding of coffee evolution [1][7][10].
Recent genomic studies have provided chromosome-level assemblies for C. arabica and its diploid progenitors, enabling detailed analysis of subgenome evolution and the genetic basis of important traits [3][4][8].
Schematic phylogenetic tree based on chloroplast and nuclear markers. Bootstrap values and posterior probabilities support major clades [1][7][10].
Phylogenetic analysis of chloroplast DNA (trnL-trnF intergenic spacer) from 38 tree samples representing 23 Coffea taxa reveals several major clades with strong geographical correspondence [1][7][10].
Distribution: Guinea, Côte d'Ivoire, Liberia, Sierra Leone
Representative species: C. liberica, C. dewevrei, C. abeokutae
Key characteristics: Largest coffee species, disease resistance sources
Notes: Evidence of introgressive hybridization among West African taxa [1][10]
Distribution: Congo Basin, Cameroon, Central African Republic
Representative species: C. canephora, C. congensis, C. klainii
Key characteristics: Source of Robusta coffee, high disease resistance, self-incompatible
Robusta subgroups: Congolese (SG1, SG2, E, C, O, R) and Guinean (D) [7]
Distribution: Ethiopia, Kenya, Tanzania, Uganda, South Sudan
Representative species: C. arabica, C. eugenioides, C. racemosa, C. zanguebariae
Key characteristics: Center of origin for C. arabica, drought tolerance, rapid fruit development [6][9]
Special note: C. arabica is the only allotetraploid in this clade [3][4][8]
Distribution: Madagascar, Mauritius, Réunion, Comoros
Representative species: C. mogenetii, C. perrieri, C. vianneyi (50+ species)
Key characteristics: Highest species diversity in Madagascar, island endemics
Notes: These species are not commercially cultivated but represent important genetic resources
Allotetraploid formation through natural hybridization between Coffea canephora and Coffea eugenioides in the highlands of Ethiopia or South Sudan [3][4][8]
Genomic evidence: Chromosome-level assemblies reveal largely conserved genome structure between diploid parents and descendant subgenomes, with no obvious global subgenome dominance [3][4]
Analysis reveals that a split between wild accessions and cultivar progenitors occurred ~30.5 thousand years ago, followed by a period of migration between the two populations [3][4][8]
Several pre-domestication bottlenecks resulted in narrow genetic variation in cultivated arabica [3][4][8]
The low genetic divergence found between the two constitutive genomes of C. arabica and those of its progenitor species supports the hypothesis that C. arabica resulted from a very recent speciation event occurring between 10,000 and 50,000 years BC [7]
From its center of origin in Ethiopia, coffee spread to Yemen, where it underwent further domestication and selection [7][8]
Coffee spread from Yemen to Europe and throughout the tropical world, a journey that shaped the worldwide varietal landscape and has consequences for breeding strategies [7][8]
The landmark phylogenetic analysis of chloroplast DNA variation in Coffea (1998) sequenced the trnL-trnF intergenic spacer from 38 tree samples representing 23 Coffea taxa and the related genus Psilanthus [1][7][10].
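As a minimal sketch of the kind of comparison that underlies a sequence-based analysis like this one, the proportion of differing sites (p-distance) between two aligned sequences can be computed as shown below. The function name `p_distance` and the toy sequences are invented for illustration and are not data from the 1998 study. Skipping sites with a gap or ambiguous base is one common convention, assumed here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned DNA sequences.

    Sites where either sequence has a gap or a non-ACGT symbol are skipped
    (an assumed convention; other treatments of gaps exist).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical aligned fragments (not real trnL-trnF data).
toy_alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTTC",
    "taxon_C": "ACG-ACGAAC",
}

d_ab = p_distance(toy_alignment["taxon_A"], toy_alignment["taxon_B"])
d_ac = p_distance(toy_alignment["taxon_A"], toy_alignment["taxon_C"])
print(d_ab, d_ac)
```

Distances of this kind are only the raw input to tree building. Published analyses typically apply a substitution model and a tree-inference method on top of them.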
A recent study of the Coffeeae tribe based on three chloroplast markers (accD, rpl16, and trnL-trnF) identified the Argocoffeopsis-Calycosiphonia (AC) clade and formally described Calycosiphonia albertina sp. nov. from the Albertine Rift [1]. Bayesian posterior probability values are indicated at the nodes.
The study showed that the molecular data support placing the new species in Calycosiphonia, which weakens the morphological distinction between Calycosiphonia and Kupeantha. The previous inclusion of Calycosiphonia pentamera in Kupeantha based on morphology was corroborated by molecular analyses [1].
Phylogenetic analysis of specific gene families provides insights into functional diversification and evolution of important traits [2].
Phylogenetic tree of PPOs revealed:
Phylogenetic tree of DDCs revealed:
Node numbers in the phylogenetic trees give the number of times, out of 1000 bootstrap replicates, that a pair of groups or individual sequences clustered together; dashed lines mark nodes where the pair clustered together in fewer than 500 (50%) of the replicates [2].
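The node-labelling convention described above can be sketched in a few lines: a raw count out of 1000 replicates is converted to a percentage, and nodes below 50% are flagged as the ones drawn with dashed lines. The function names and the node counts are hypothetical and are not values from the cited trees.

```python
def bootstrap_support(count: int, replicates: int = 1000) -> float:
    """Convert a raw bootstrap count into percent support."""
    if not 0 <= count <= replicates:
        raise ValueError("count must lie between 0 and the number of replicates")
    return 100.0 * count / replicates


def is_weakly_supported(count: int, replicates: int = 1000,
                        threshold: float = 50.0) -> bool:
    """True when support falls below the threshold (dashed line in the figure)."""
    return bootstrap_support(count, replicates) < threshold


# Hypothetical node counts, for illustration only.
toy_nodes = {"node_1": 987, "node_2": 500, "node_3": 412}

for name, count in toy_nodes.items():
    style = "dashed" if is_weakly_supported(count) else "solid"
    print(f"{name}: {bootstrap_support(count):.1f}% ({style})")
```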
Recent research using DNA sequencing and morphology has elucidated the phylogenetic relationships of two minor coffee crop species from East Africa: Coffea racemosa and C. zanguebariae [6][9].
Distribution: Central and southern Mozambique, northern South Africa (KwaZulu-Natal), eastern Zimbabwe; elevation 0-500 m [6][9]
Habitat: Coastal forest, riverine forest, deciduous woodland, bushland [6][9]
Phylogenetic position: East African clade, closely related to but distinct from C. zanguebariae
Useful traits: Heat tolerance, low precipitation requirement (700-1,600 mm), high precipitation seasonality (dry season tolerance), rapid fruit development (approx. 4 months flowering to mature fruit) [6][9]
Distribution: Southern Tanzania, northern Zimbabwe, northern Mozambique; elevation 10-350 m [6][9]
Habitat: Dry deciduous forest, riverine and coastal thicket [6][9]
Phylogenetic position: East African clade, historically confused with C. racemosa; confirmed as a distinct species through molecular analysis
Note: Bridson (1988, 2003) resolved the taxonomic confusion and demonstrated they are separate species [6][9]
Multi-omics analysis using SLAF-seq and transcriptome sequencing of 20 C. arabica accessions revealed genetic relationships within Chinese coffee germplasm [2][5].
| Method | Accessions | Key Findings |
|---|---|---|
| SLAF-seq analysis | 20 C. arabica + 2 C. canephora + 1 C. liberica + 1 C. racemosa | 3,347,069 SLAF tags; 198,955 high-quality SNPs used for phylogenetic tree construction [5] |
| Transcriptome analysis | 20 C. arabica accessions | 128.50 Gb clean reads; expression levels of 25,872 genes used for correlation analysis [5] |
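To illustrate how SNP genotypes can feed a phylogenetic tree, the sketch below builds a pairwise distance matrix from genotype calls. It assumes genotypes are coded as the count of alternate alleles (0, 1, or 2) with `None` for missing calls, and uses a simple mean allele-difference distance. Both the coding and the distance are assumptions chosen for illustration, not the method reported in [5], and the accession names and values are made up.

```python
from itertools import combinations


def genotype_distance(g1, g2):
    """Mean per-site allele difference, scaled to the range 0..1.

    Sites where either genotype is missing (None) are ignored.
    """
    diffs = [abs(a - b) / 2
             for a, b in zip(g1, g2)
             if a is not None and b is not None]
    if not diffs:
        raise ValueError("no shared genotyped sites")
    return sum(diffs) / len(diffs)


def distance_matrix(genotypes):
    """Symmetric distance lookup keyed by (accession, accession) pairs."""
    names = sorted(genotypes)
    dist = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        d = genotype_distance(genotypes[a], genotypes[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist


# Hypothetical genotype calls at five SNP sites (not real data).
toy_genotypes = {
    "acc_1": [0, 0, 2, 1, None],
    "acc_2": [0, 0, 2, 2, 0],
    "acc_3": [2, 1, 0, 1, 0],
}

matrix = distance_matrix(toy_genotypes)
print(matrix[("acc_1", "acc_2")], matrix[("acc_1", "acc_3")])
```

A matrix like this could then be passed to a distance-based clustering method such as neighbor-joining. With hundreds of thousands of SNPs, an array library would be a more practical choice than plain lists.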
MoccaDB is an integrative database for functional, comparative and diversity studies in the Rubiaceae family, containing phylogenetic information for Coffea species [7].
The database includes a schematic phylogenetic tree adapted from [21] showing the number of successfully amplified/tested markers (percentage) observed for each species. Names of Coffea species follow [22, 23] [7].
Access: Phylogenetic data and marker information are available for comparative studies across the genus Coffea and related genera.
Phylogenetic diversity (PD) provides a framework for prioritizing conservation of coffee genetic resources. Species representing deep branches in the phylogenetic tree (e.g., Malagasy species, West African lineages) contribute unique genetic diversity not found elsewhere.
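One widely used formulation is Faith's PD: the sum of the branch lengths on the rooted subtree that connects a chosen set of taxa to the root. The sketch below assumes that formulation and a toy tree stored as a child-to-parent map. The taxon names and branch lengths are invented and do not represent a real Coffea phylogeny.

```python
def faith_pd(parent_of, taxa):
    """Sum of branch lengths on the paths from the given taxa to the root.

    parent_of maps each non-root node to a (parent, branch_length) tuple.
    Each branch is counted once, even when several taxa share it.
    """
    visited = set()
    total = 0.0
    for taxon in taxa:
        if taxon not in parent_of:
            raise KeyError(f"unknown taxon: {taxon}")
        node = taxon
        # Walk up to the root, stopping early on an already-counted branch.
        while node in parent_of and node not in visited:
            visited.add(node)
            parent, length = parent_of[node]
            total += length
            node = parent
    return total


# Toy rooted tree (hypothetical): sp_A and sp_B are sisters; sp_C sits on a
# deep branch of its own.
toy_tree = {
    "n1": ("root", 1.0),
    "sp_A": ("n1", 2.0),
    "sp_B": ("n1", 2.0),
    "sp_C": ("root", 3.0),
}

print(faith_pd(toy_tree, {"sp_A", "sp_B"}))  # two close relatives
print(faith_pd(toy_tree, {"sp_A", "sp_C"}))  # includes the deep-branch taxon
```

In this toy tree, pairing `sp_A` with the deep-branch `sp_C` captures more branch length than pairing it with its sister `sp_B`, which is the reasoning behind giving priority to deep lineages.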
Cros J., et al. (1998). Molecular Phylogenetics and Evolution 9(1):109-117 [1][10]
trnL-trnF sequencing of 38 samples (23 Coffea taxa). Four major clades with geographic correspondence (West, Central, East Africa, Madagascar). Evidence of introgressive hybridization in West Africa.
Salojärvi J., Rambani A., Yu Z., et al. (2024). Nature Genetics [3][4][8]
Chromosome-level assemblies; polyploidy 600k years ago; split between wild/cultivar progenitors ~30.5k years ago; conserved genome structure.
(2024). Forests 15(1):163 [2][5]
SLAF-seq and transcriptome analysis of 20 C. arabica accessions; 198,955 SNPs; clear distinction between Typica and Bourbon types; classification of local selections.
Davis A.P., et al. (2021). Frontiers in Sustainable Food Systems 5:740137 [6][9]
DNA sequencing and morphology confirm two distinct species; heat tolerance, low precipitation requirement, rapid fruit development.
Ntore S., Robbrecht E., et al. (2024). European Journal of Taxonomy [1]
Phylogenetic tree based on accD, rpl16, and trnL-trnF markers; AC clade; Bayesian posterior probability values at nodes.
(2023). International Journal of Molecular Sciences 24(13):12466 [2]
Neighbor-joining clustering method; 1000 bootstrap replicates; PPO.CAR7/8 and DDC.CCA3/CAR6 show functional diversification.
Peer-reviewed sources and official reports cited in this research