🌳 Coffee Phylogenetics

The Evolutionary History of Coffee

Comprehensive analysis of phylogenetic relationships within the genus Coffea, tracing the evolutionary origins of cultivated species, wild relatives, and the remarkable diversification across Africa, Madagascar, and the Mascarene Islands.

124 Coffea Species [2][5][7][9]
600k Years: Arabica Origin [3][4][8]
4 Major Biogeographic Clades [1][7][10]
37,729 SNP Loci Analyzed [2][5]

Understanding Coffee's Evolutionary Tree

The genus Coffea comprises approximately 124 species distributed across tropical Africa, Madagascar, and the Mascarene Islands. Phylogenetic analysis reveals the evolutionary relationships among these species and illuminates the origins of cultivated coffee [2][5][7][9].

Phylogenetic studies using molecular markers including chloroplast DNA (trnL-trnF, accD, rpl16), nuclear ribosomal DNA, and genome-wide SNPs have revolutionized our understanding of coffee evolution. These analyses have [1][7][10]:

  • Confirmed the allotetraploid origin of Coffea arabica from hybridization between C. canephora and C. eugenioides approximately 600,000 years ago [3][4][8]
  • Identified four major biogeographic clades corresponding to West Africa, Central Africa, East Africa, and Madagascar [1][7][10]
  • Revealed complex patterns of introgressive hybridization, particularly among West African taxa [1][10]
  • Established phylogenetic relationships for wild species with potential for coffee improvement, such as C. racemosa, C. zanguebariae, and C. stenophylla [6][9]

Recent genomic studies have provided chromosome-level assemblies for C. arabica and its diploid progenitors, enabling detailed analysis of subgenome evolution and the genetic basis of important traits [3][4][8].

Key Methods

  • Chloroplast markers: accD, rpl16, trnL-trnF [1][7][10]
  • Nuclear markers: ITS, SNP arrays [2][5]
  • Genome-wide SNPs: SLAF-seq, RAD-seq [2][5]
  • Whole genome sequencing: Chromosome-level assemblies [3][4]

Schematic Phylogenetic Tree of the Genus Coffea

[Figure: schematic tree rooted at a Coffea ancestor, with four clades. West Africa: C. liberica, C. dewevrei, C. abeokutae. Central Africa: C. canephora, C. congensis. East Africa: C. eugenioides, C. arabica (allotetraploid), C. racemosa. Madagascar: C. mogeneti, C. perrieri. Legend: Bayesian posterior probabilities ≥0.95 and ≥0.85.]

Schematic phylogenetic tree based on chloroplast and nuclear markers. Bootstrap values and posterior probabilities support major clades [1][7][10].

Major Biogeographic Clades of Coffea

Phylogenetic analysis of chloroplast DNA (trnL-trnF intergenic spacer) from 38 tree samples representing 23 Coffea taxa reveals several major clades with strong geographical correspondence [1][7][10].

West African Clade

Distribution: Guinea, Côte d'Ivoire, Liberia, Sierra Leone

Representative species: C. liberica, C. dewevrei, C. abeokutae

Key characteristics: Largest coffee species, disease resistance sources

Notes: Evidence of introgressive hybridization among West African taxa [1][10]

Central African Clade

Distribution: Congo Basin, Cameroon, Central African Republic

Representative species: C. canephora, C. congensis, C. klainii

Key characteristics: Source of Robusta coffee, high disease resistance, self-incompatible

Robusta subgroups: Congolese (SG1, SG2, E, C, O, R) and Guinean (D) [7]

East African Clade

Distribution: Ethiopia, Kenya, Tanzania, Uganda, South Sudan

Representative species: C. arabica, C. eugenioides, C. racemosa, C. zanguebariae

Key characteristics: Center of origin for C. arabica, drought tolerance, rapid fruit development [6][9]

Special note: C. arabica is the only allotetraploid in this clade [3][4][8]

Madagascar & Mascarene Clade

Distribution: Madagascar, Mauritius, Réunion, Comoros

Representative species: C. mogeneti, C. perrieri, C. vianneyi (50+ species)

Key characteristics: Highest species diversity in Madagascar, island endemics

Notes: These species are not commercially cultivated but represent important genetic resources

Biogeographic pattern: The phylogenetic structure corresponds strongly to geographic distribution, suggesting that Coffea diversified following a pattern of allopatric speciation with limited long-distance dispersal [1][7][10].
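
One way to make "geographic correspondence" concrete is to check whether the species from each region form a single clade (are monophyletic) on a tree. The sketch below is only an illustration: the nested-tuple tree format, the function names, and the unresolved toy topology are assumptions, not the published tree.

```python
def tips(node):
    """Return the set of tip names under a node (a tip string or a tuple of children)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))


def clades(node):
    """Yield the tip set of every subtree, including single tips and the whole tree."""
    yield tips(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, taxa):
    """True if some subtree contains exactly the given taxa and nothing else."""
    return frozenset(taxa) in set(clades(tree))


# Hypothetical toy tree: four regional groups radiating from the root.
# Relationships inside and between groups are deliberately left unresolved.
regions = {
    "West Africa": ("C. liberica", "C. dewevrei", "C. abeokutae"),
    "Central Africa": ("C. canephora", "C. congensis"),
    "East Africa": ("C. eugenioides", "C. racemosa"),
    "Madagascar": ("C. mogeneti", "C. perrieri"),
}
tree = tuple(regions.values())

results = {region: is_monophyletic(tree, spp) for region, spp in regions.items()}
mixed = is_monophyletic(tree, ["C. liberica", "C. canephora"])
```

On this toy tree every regional group passes the check, while a group mixing West and Central African species does not. On a real tree, a failed check for a region would be one signal of dispersal or hybridization across regions.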

Evolutionary Timeline of Coffea

600,000 years ago

Origin of Coffea arabica

Allotetraploid formation through natural hybridization between Coffea canephora and Coffea eugenioides in the highlands of Ethiopia or South Sudan [3][4][8]

Genomic evidence: Chromosome-level assemblies reveal largely conserved genome structure between diploid parents and descendant subgenomes, with no obvious global subgenome dominance [3][4]

30,500 years ago

Split Between Wild and Cultivar Progenitors

Analysis reveals that wild accessions and cultivar progenitors split ~30,500 years ago, followed by a period of migration between the two populations [3][4][8]

Pre-domestication

Population Bottlenecks

Several pre-domestication bottlenecks resulted in narrow genetic variation in cultivated arabica [3][4][8]

10,000-50,000 years ago

Recent Speciation Event

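The low genetic divergence between the two constitutive genomes of C. arabica and those of its progenitor species has been taken to support the hypothesis that C. arabica arose from a very recent speciation event, between 10,000 and 50,000 years ago [7]. This earlier estimate is much younger than the ~600,000-year date inferred from the 2024 chromosome-level assemblies [3][4][8].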
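
The step from "low divergence" to "recent origin" usually rests on molecular-clock reasoning: if two lineages accumulate substitutions at a roughly constant rate, the time since they split is about T = d / (2r), where d is the per-site divergence between them and r is the per-site, per-year substitution rate. The sketch below only illustrates that arithmetic. The function name and the input values are invented for the example and are not measurements from the cited studies.

```python
def divergence_time(distance, rate):
    """Years since two lineages split under a strict molecular clock.

    distance: substitutions per site separating the two lineages
    rate:     substitutions per site per year along one lineage
    The factor 2 reflects that both lineages accumulate changes after the split.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2 * rate)


# Purely illustrative inputs (not taken from any coffee dataset).
example_years = divergence_time(distance=0.002, rate=5e-9)

# Halving the observed divergence halves the inferred age.
younger = divergence_time(distance=0.001, rate=5e-9)
```

Because the result scales inversely with the assumed rate, different rate calibrations or different genomic regions can give quite different dates for the same event.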

1400s-1600s CE

Spread from Ethiopia to Yemen

From its center of origin in Ethiopia, coffee spread to Yemen, where it underwent further domestication and selection [7][8]

1700s-present

Global Dissemination

Coffee spread from Yemen to Europe and throughout the tropical world, a journey that shaped the worldwide varietal landscape and still has consequences for breeding strategies [7][8]

Chloroplast DNA Phylogeny

The landmark phylogenetic analysis of chloroplast DNA variation in Coffea (1998) sequenced the trnL-trnF intergenic spacer from 38 tree samples representing 23 Coffea taxa and the related genus Psilanthus [1][7][10].

Key Findings

  • Radial mode of speciation: Results suggest a radial mode of speciation and a recent origin in Africa for the genus Coffea [1][10]
  • Geographic correspondence: Phylogenetic relationships suggest several major clades with strong geographical correspondence (West Africa, Central Africa, East Africa, and Madagascar) [1][7][10]
  • Agreement with nuclear data: Overall results agree well with the phylogeny previously inferred from nuclear genome data [1][10]

Evidence of Hybridization

  • Introgressive hybridization: Several inconsistencies are observed among taxa endemic to West Africa, suggesting the occurrence of introgressive hybridization [1][10]
  • Genetic origin of Arabica: Evidence was obtained for the genetic origin of the allotetraploid species C. arabica [1][10]
  • Markers used: accD, rpl16, and trnL-trnF chloroplast markers

Phylogenetic Tree of the Coffeeae Tribe (2024)

A recent study of the Coffeeae tribe based on three chloroplast markers (accD, rpl16, and trnL-trnF) identified the Argocoffeopsis-Calycosiphonia (AC) clade and formally described Calycosiphonia albertina sp. nov. from the Albertine Rift [1]. Bayesian posterior probability values are indicated at the nodes.

Molecular data placed the new species in Calycosiphonia, which weakens the morphological distinction between Calycosiphonia and Kupeantha. Molecular analyses also corroborated the earlier, morphology-based inclusion of Calycosiphonia pentamera in Kupeantha [1].

Gene Family Phylogenies

Phylogenetic analysis of specific gene families provides insights into functional diversification and evolution of important traits [2].

Polyphenol Oxidases (PPOs)

Phylogenetic tree of PPOs revealed:

DOPA Decarboxylases (DDCs)

Phylogenetic tree of DDCs revealed:

Node numbers in phylogenetic trees correspond to the sum of occurrences of pairs of groups or individual sequences that clustered together in a total of 1000 bootstraps; dashed lines represent nodes in which group pairs were clustered together in less than 500 (50%) of the bootstraps [2].
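
The node numbers described above can be reproduced by counting, for each grouping, how many bootstrap replicate trees contain it. The following is a minimal sketch of that counting step. It assumes each replicate tree has already been reduced to a list of clades (sets of sequence names); the sequence names and the four-replicate example are hypothetical, whereas the study used 1000 replicates.

```python
from collections import Counter


def clade_support(replicate_trees):
    """Count in how many replicates each clade (a frozenset of names) appears."""
    counts = Counter()
    for clades in replicate_trees:
        counts.update(set(clades))  # a clade counts at most once per replicate
    return counts


def annotate(counts, n_replicates, threshold=0.5):
    """Map clade -> (count, is_weak); is_weak means support below the threshold."""
    return {
        clade: (n, n / n_replicates < threshold)
        for clade, n in counts.items()
    }


# Hypothetical example with invented sequence names and only 4 replicates.
replicates = [
    [frozenset({"PPO_A", "PPO_B"}), frozenset({"PPO_C", "PPO_D"})],
    [frozenset({"PPO_A", "PPO_B"}), frozenset({"PPO_B", "PPO_C"})],
    [frozenset({"PPO_A", "PPO_B"}), frozenset({"PPO_C", "PPO_D"})],
    [frozenset({"PPO_A", "PPO_C"}), frozenset({"PPO_B", "PPO_D"})],
]
support = annotate(clade_support(replicates), n_replicates=len(replicates))
```

Clades flagged as weak correspond to the dashed lines in the description: groupings recovered in fewer than half of the replicates. A clade found in exactly half is not flagged, matching the "less than 50%" wording.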

Phylogenetic Relationships of Minor Coffee Crop Species

Recent research using DNA sequencing and morphology has elucidated the phylogenetic relationships of two minor coffee crop species from East Africa: Coffea racemosa and C. zanguebariae [6][9].

Coffea racemosa

Distribution: Central and southern Mozambique, northern South Africa (KwaZulu-Natal), eastern Zimbabwe; elevation 0-500 m [6][9]

Habitat: Coastal forest, riverine forest, deciduous woodland, bushland [6][9]

Phylogenetic position: East African clade, closely related to but distinct from C. zanguebariae

Useful traits: Heat tolerance, low precipitation requirement (700-1,600 mm), high precipitation seasonality (dry season tolerance), rapid fruit development (approx. 4 months flowering to mature fruit) [6][9]

Coffea zanguebariae

Distribution: Southern Tanzania, northern Zimbabwe, northern Mozambique; elevation 10-350 m [6][9]

Habitat: Dry deciduous forest, riverine and coastal thicket [6][9]

Phylogenetic position: East African clade, historically confused with C. racemosa; confirmed as distinct species through molecular analysis

Note: Bridson (1988, 2003) resolved the taxonomic confusion and demonstrated they are separate species [6][9]

Breeding potential: These species possess useful traits for coffee crop plant development, particularly heat tolerance, low precipitation requirement, and rapid fruit development. These attributes would be best accessed via breeding programs, although these species also have niche-market potential [6][9].

Phylogenetic Analysis of Chinese Coffee Germplasm

Multi-omics analysis using SLAF-seq and transcriptome sequencing of 20 C. arabica accessions revealed genetic relationships within Chinese coffee germplasm [2][5].

Method: SLAF-seq analysis
Accessions: 20 C. arabica + 2 C. canephora + 1 C. liberica + 1 C. racemosa
Key findings: 3,347,069 SLAF tags; 198,955 high-quality SNPs used for phylogenetic tree construction [5]

Method: Transcriptome analysis
Accessions: 20 C. arabica accessions
Key findings: 128.50 Gb clean reads; expression levels of 25,872 genes used for correlation analysis [5]
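
SNP-based trees are typically built from a matrix of pairwise genetic distances between accessions. The sketch below shows one simple way to compute such a matrix. It is not the pipeline used in the cited study: the 0/1/2 genotype coding (count of the alternate allele, diploid-style), the use of None for missing calls, and the accession names are all assumptions made for illustration.

```python
from itertools import combinations


def snp_distance(a, b):
    """Mean per-locus allele difference (0 to 1) over loci called in both samples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no loci genotyped in both samples")
    return sum(abs(x - y) for x, y in shared) / (2 * len(shared))


def distance_matrix(genotypes):
    """Return {(name1, name2): distance} for every pair of accessions."""
    return {
        (s, t): snp_distance(genotypes[s], genotypes[t])
        for s, t in combinations(sorted(genotypes), 2)
    }


# Hypothetical genotypes at four loci; None marks a missing call.
genotypes = {
    "acc_A": [0, 0, 2, 1],
    "acc_B": [0, 0, 2, None],
    "acc_C": [2, 2, 0, 1],
}
matrix = distance_matrix(genotypes)
```

A matrix like this can then be passed to a clustering method such as neighbor-joining. Skipping loci with missing calls, as done here, is one common choice; imputing them is another.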

Phylogenetic Results

MoccaDB: Phylogenetic Database for Coffee

MoccaDB is an integrative database for functional, comparative and diversity studies in the Rubiaceae family, containing phylogenetic information for Coffea species [7].

The database includes a schematic phylogenetic tree showing, for each species, the number and percentage of markers that were successfully amplified or tested [7].

Access: Phylogenetic data and marker information are available for comparative studies across the genus Coffea and related genera.

Phylogenetic Insights for Coffee Breeding

Identifying Genetic Resources

  • Disease resistance: West African clade (C. liberica, C. dewevrei) provides sources of coffee leaf rust resistance
  • Climate adaptation: East African clade (C. racemosa, C. zanguebariae) offers heat tolerance, drought tolerance, and rapid fruit development [6][9]
  • Quality traits: C. eugenioides (progenitor of arabica) contributes fine flavor and low caffeine characteristics

Interspecific Hybridization

  • The Timor Hybrid (HdT), a natural cross between C. arabica and C. canephora, demonstrates the potential of interspecific gene flow for disease resistance breeding [7]
  • Phylogenetic distance between species can predict hybridization success and introgression potential

Conservation Priorities

Phylogenetic diversity (PD) provides a framework for prioritizing conservation of coffee genetic resources. Species representing deep branches in the phylogenetic tree (e.g., Malagasy species, West African lineages) contribute unique genetic diversity not found elsewhere.
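
Phylogenetic diversity is commonly computed as the total branch length connecting a set of species on the tree. The sketch below uses the rooted variant, which sums every branch on the paths from the chosen tips back to the root. The tree shape, the branch lengths, and the internal node names are hypothetical and chosen only to show why a species on a long, isolated branch adds more PD than a close relative of a species already conserved.

```python
def faith_pd(parent, taxa):
    """Sum the lengths of all distinct branches between the chosen tips and the root.

    parent maps each non-root node to (its parent node, length of the branch to it).
    """
    visited = set()
    total = 0.0
    for tip in taxa:
        node = tip
        # Walk upward until the root or a branch already counted for another tip.
        while node in parent and node not in visited:
            visited.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


# Hypothetical toy tree with invented branch lengths.
parent = {
    "C. canephora": ("n1", 1.0),
    "C. congensis": ("n1", 1.0),
    "n1": ("n2", 2.0),
    "C. eugenioides": ("n2", 3.0),
    "n2": ("root", 1.0),
    "C. perrieri": ("root", 4.0),
}

pd_close_pair = faith_pd(parent, ["C. canephora", "C. congensis"])
pd_distant_pair = faith_pd(parent, ["C. canephora", "C. perrieri"])
```

In this toy case the pair that includes the species on the long separate branch scores higher than the pair of close relatives, which is the logic behind prioritizing deep-branching lineages.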

Key Publications on Coffee Phylogenetics

Phylogenetic analysis of chloroplast DNA variation in Coffea L.

Cros J., et al. (1998). Molecular Phylogenetics and Evolution 9(1):109-117 [1][10]

trnL-trnF sequencing of 38 samples (23 Coffea taxa). Four major clades with geographic correspondence (West, Central, East Africa, Madagascar). Evidence of introgressive hybridization in West Africa.

The genome and population genomics of allopolyploid Coffea arabica

Salojärvi J., Rambani A., Yu Z., et al. (2024). Nature Genetics [3][4][8]

Chromosome-level assemblies; polyploidy 600k years ago; split between wild/cultivar progenitors ~30.5k years ago; conserved genome structure.

Multi-Omics Analyses Unravel Genetic Relationship of Chinese Coffee Germplasm Resources

(2024). Forests 15(1):163 [2][5]

SLAF-seq and transcriptome analysis of 20 C. arabica accessions; 198,955 SNPs; clear distinction between Typica and Bourbon types; classification of local selections.

Hot Coffee: The Identity, Climate Profiles, Agronomy, and Beverage Characteristics of Coffea racemosa and C. zanguebariae

Davis A.P., et al. (2021). Frontiers in Sustainable Food Systems 5:740137 [6][9]

DNA sequencing and morphology confirm two distinct species; heat tolerance, low precipitation requirement, rapid fruit development.

Calycosiphonia or Kupeantha (Coffeeae, Rubiaceae)? A morphological and molecular study of a new species from the Albertine Rift

Ntore S., Robbrecht E., et al. (2024). European Journal of Taxonomy [1]

Phylogenetic tree based on accD, rpl16, and trnL-trnF markers; AC clade; Bayesian posterior probability values at nodes.

Consensus tree depicting evolutionary relationships of Coffea PPOs and DDCs

(2023). International Journal of Molecular Sciences 24(13):12466 [2]

Neighbor-joining clustering method; 1000 bootstrap replicates; PPO.CAR7/8 and DDC.CCA3/CAR6 show functional diversification.


References

Peer-reviewed sources and official reports cited in this research

[1] Ntore, S., Robbrecht, E., et al. (2024). Calycosiphonia or Kupeantha (Coffeeae, Rubiaceae)? A morphological and molecular study of a new species from the Albertine Rift. Supplementary material 8. European Journal of Taxonomy. friscris.be
[2] Consensus tree using the neighbor-joining clustering method depicting the evolutionary relationships of Coffea arabica and Coffea canephora polyphenol oxidases (PPOs) and DOPA decarboxylases (DDCs). (2023). International Journal of Molecular Sciences, 24(13), 12466. PMC10419165
[3] IRD. (2024). The genomes of Arabica coffee and its parents finally deciphered. French National Research Institute for Sustainable Development. en.ird.fr
[4] Salojärvi, J., Rambani, A., Yu, Z., et al. (2024). The genome and population genomics of allopolyploid Coffea arabica reveal the diversification history of modern coffee cultivars. Nature Genetics. doi:10.1038/s41588-024-01695-w
[5] Multi-Omics Analyses Unravel Genetic Relationship of Chinese Coffee Germplasm Resources. (2024). Forests, 15(1), 163. MDPI
[6] Davis, A.P., Gargiulo, R., Almeida, I.N.M., Caravela, M.I., Denison, C., & Moat, J. (2021). Hot Coffee: The Identity, Climate Profiles, Agronomy, and Beverage Characteristics of Coffea racemosa and C. zanguebariae. Frontiers in Sustainable Food Systems, 5, 740137.
[7] Herrera, J.C., & Lambot, C. (2017). The Craft and Science of Coffee. ScienceDirect.
[8] Coffee trees already have a 600,000-year history (咖啡树已有60万年历史). (2024). Cankao Xiaoxi (参考消息). baijiahao.baidu.com
[9] Davis, A.P., Gargiulo, R., Almeida, I.N.M., Caravela, M.I., Denison, C., & Moat, J. (2021). Hot Coffee: The Identity, Climate Profiles, Agronomy, and Beverage Characteristics of Coffea racemosa and C. zanguebariae. Frontiers in Sustainable Food Systems, 5, 740137. (Full text)
[10] Cros, J., et al. (1998). Phylogenetic analysis of chloroplast DNA variation in Coffea L. Molecular Phylogenetics and Evolution, 9(1), 109-117. unibo.it
