What follows is the first in a series of two essays covering human haplogroups, their origins and distributions. According to geneticists, no more than 90% of existing human mtDNA and yDNA haplogroups have been precisely identified. One theory is that the missing 10% were acquired through archaic interbreeding between humans and at least two non-human species. To me, this conjecture smacks of an argument from ignorance, i.e. we do not know where these haplogroups arose; therefore, we (sorta) know where they came from. But the speculation remains food for thought.
- yDNA haplogroups
- clusters of non-recombinant DNA from the Y chromosome passed down the male line
yDNA haplogroups are used as genetic markers in tracing the ancestry of male individuals to geographically distributed populations. Haplogroups are NOT known to be visible to selection; that is, they are traits, carried by individuals, which do NOT confer either survival or reproductive advantages. (Nor are they known to confer survival or reproductive disadvantages.) Their frequencies are driven by genetic drift. All identified Y haplogroups are the results of downstream mutations altering the original human haplogroup (A), now estimated to have arisen 140Kyr ago in one male, Adam, the most recent common male ancestor.
In the image below are striking clues pointing towards drift. The pie-charts represent the relative frequency of a haplogroup (or haplogroups) in a given region. As examples - in the Americas among male Amer-Indians, haplogroup Q (light purple) is the most common, and in sub-Saharan West Africa, E1b1a (light blue) is the most common Y haplogroup. The high frequency distributions of these haplogroups on both continents are expectable - in that the Americas and sub-Saharan Africa were (largely) reproductively isolated for tens of thousands of years from the rest of the world.
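The drift argument can be made concrete with a toy simulation. What follows is only a minimal sketch of a neutral Wright-Fisher model, not a reconstruction of any real population: the function name `simulate_drift`, the population size, the starting frequency and the number of generations are all assumptions chosen for illustration. Each son simply draws his father at random from the previous generation, with no survival or reproductive advantage attached to the haplogroup.

```python
import random


def simulate_drift(pop_size, start_freq, generations, seed):
    """Track one haplogroup's frequency in an isolated male population.

    Neutral Wright-Fisher sketch: every generation has `pop_size` males,
    and each one inherits his Y chromosome from a randomly chosen male
    of the previous generation. No selection is applied.
    """
    rng = random.Random(seed)
    carriers = round(pop_size * start_freq)
    history = [carriers / pop_size]
    for _ in range(generations):
        freq = carriers / pop_size
        # A son carries the haplogroup with probability equal to its
        # frequency among the fathers.
        carriers = sum(1 for _ in range(pop_size) if rng.random() < freq)
        history.append(carriers / pop_size)
    return history


if __name__ == "__main__":
    # Hypothetical isolated populations, identical except for chance.
    for seed in range(5):
        history = simulate_drift(pop_size=200, start_freq=0.3,
                                 generations=400, seed=seed)
        print(f"isolated population {seed}: final frequency {history[-1]:.2f}")
```

Run over enough generations, small isolated populations in a model like this tend to end up with the haplogroup either lost or carried by every male, which is the pattern the pie-charts hint at.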
The remaining account will be a compact summary of Y haplogroups with my comments and extensions - where necessary.
140Kyr in North East or South West Africa
Namibia (San 66%), Khoisan 44%, Mbuti ("pygmy"), Namibia (Nama 64%), Sudan (Dinka, Shilluk and Nuba) and Ethiopian Jews
The sub-clades of A are:
current populations: Cameroon (Bakola) and Algeria (Berbers)
current populations: Guinea-Bissau, Senegambia (Mandinka) and Mali (Dogon)
current populations: Khoisan and Nama
current populations: Sudan (Nuba and Hausa), Ethiopia (Amhara)
current populations: Eastern and Southern Africa
In Africa, A1a-M32 is found at high frequency in large populations, whose male members carry A, but outside of Africa - in Turkey, Egypt, Palestine, Jordan, Oman, Yemen (Jews) and Sardinia, A1a-M32 shows up at low frequencies in small (localized) populations.
Complete Khoisan and Bantu genomes from southern Africa
70-80Kyr in North West or central West Africa
BT has not been found in any current population; no male has been shown to carry BT (BT*).
In Y haplogroups, paragroups are represented by an asterisk " * ", placed after the main haplogroup nomenclature. Paragroups contain the mutations which define the parent haplogroup, but they do not have any further (known) unique markers. Without these unique markers, they do not form truly independent sub-clades.
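The paragroup rule can be sketched in a few lines. This is an illustration only: the marker names are the ones this essay gives further on for R1b (M343) and R1b1b2 (P25 and M269), the function `classify` is hypothetical, and a real haplogroup tree lists many more sub-clades and markers than the single one used here.

```python
# Marker sets as stated in this essay (a deliberately tiny tree):
# M343 defines the parent R1b; P25 and M269 define the sub-clade R1b1b2.
PARENT = ("R1b", {"M343"})
SUBCLADES = {"R1b1b2": {"P25", "M269"}}


def classify(observed, parent=PARENT, subclades=SUBCLADES):
    """Assign a sample to a sub-clade, to the paragroup, or to neither."""
    name, defining = parent
    if not defining <= observed:
        return None  # lacks the parent's defining mutations
    for subclade, markers in subclades.items():
        if markers <= observed:
            return subclade  # carries a known downstream marker set
    # Parent mutations present, no known sub-clade markers: paragroup.
    return name + "*"


print(classify({"M343"}))                 # R1b*
print(classify({"M343", "P25", "M269"}))  # R1b1b2
print(classify({"M168"}))                 # None
```

The asterisk is therefore a statement about what has not (yet) been found in a sample, which is why a paragroup does not count as a truly independent sub-clade.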
60-65Kyr in Central Africa
B is localized among the Baka and Mbuti peoples of the tropical forests of West-Central Africa and the Hadza of Tanzania. 2.3% of African-American males carry B.
B is the second oldest and a very diverse Y haplogroup, but it is scattered widely and thinly in Africa, suggesting that the carriers of B were displaced by later (5Kyr) flows of people and events. A competing hypothesis runs that the sub-Saharan population dwindled (to ~2K persons at 35Kyr) and that there were few remaining carriers of B around to have been displaced by even much later migrations of people.
Some of the sub-clades of B are:
current population: southern Cameroon (Bamileke 4%)
current population: Burkina Faso (Mossi 2%)
current populations: Congo (Mbuti), southern Cameroon (Bakola), Namibia (Dama) and Central African Republic (Biaka "pygmy")
current populations: Congo (Mbuti 8%), Cameroon (Tupuri 11%), Mali (Dogon 6%) and Kenya (Kikuyu and Kamba 2%)
current population: northern Cameroon
current populations: Cameroon, Central African Republic, Tanzania, Kenya, Ethiopia, South Africa, Zimbabwe, Sudan, Egypt (2%), Southern Iran (3%), African-Americans (1.5%), Pakistan and India
current populations: Central African Republic (Baka 67%), Tanzania (Hadza 51%), Congo (Mbuti 43%), Namibia (San 31%)
current populations: Central African Republic (Baka 67% and Biaka 45%) and Congo (Mbuti 21%)
current populations: Central African Republic (Biaka 20%)
68.5Kyr in East Africa
CT is often referred to as "Eurasian Adam" - the most recent common ancestor of all non-African males. This hypothetical male is supposed to have existed in Africa, immediately prior to the exodus of Anatomically Modern Humans. CT is considered the common ancestral lineage of most men living today, though no male has been shown to carry CT (CT*). However, mutations M168, P9.1 and M294 have been found in all males tested with the exception of those exclusively carrying A and B (Sub-Saharan) haplogroups.
Paragroup (CT*) contains the mutations which define the parent haplogroup (M168, P9.1 and M294), but it does not have any further (known) unique markers.
55-65Kyr in Southwest Asia
No male has been shown to carry CF (CF*). However, it is the hypothetical ancestor of haplogroups C and F.
50-60Kyr in the Middle East or South Asia
Northern Eurasia, Eastern Eurasia, Oceania and the Americas
C appeared shortly after humans expanded from Africa, and it may have originated and diversified in India or along the coasts of South Asia. After the African exodus, C (as with D below) was spread by a "Great Coastal Migration" from Arabia to Southeast Asia then northward into East Asia.
current populations: India, Sri Lanka and South East Asia
The sub-clades of C are:
current population: the rarest lineage of C. Found in Japan.
current populations: New Guinea, Melanesia and Polynesia
current populations: high frequency in Polynesia
current populations: Northern Asia, Japan (Ainu 15%), the Americas and Eastern and Central Europe
C3 was probably spread to Europe by the Huns in the Middle Ages.
current populations: Amer-Indians (Na-Dene, Algonquian and Siouan-speaking populations)
current populations: Siberia, Mongolia and Central Asia
current population: aboriginal Australians
current population: India
65Kyr in Africa or Asia
Africa and East Asia (very rare)
65-70Kyr in Asia
Central Asia, Southeast Asia and Japan
D, like C, represents the "Great Coastal Migration" from Arabia to Southeast Asia then northward into East Asia. It is found today at high frequency in Tibet, Japan and the Andaman Islands.
The sub-clades of D are:
current populations: Central Asia, East Asia, and Southeast Asia
Tibet (12.5%) and China (Qiang 23%)
current populations: Ainu, Japanese (35%) and Ryukyuans
current populations: China (Pumi and Naxi) and Tibet
50-55Kyr in East Africa or the Middle East
sub-Saharan, North and North East Africa, the Near East and Europe
Originating in North East Africa or the Middle East, E was later introduced to West Africa, where it spread (5Kyr ago) to Central, Southern and South Eastern Africa with the Bantu migrations, swamping (or displacing) populations whose members carried A and B.
The sub-clades of E are:
origin: 45-55Kyr in Africa
mutations: L504, L511, P147
mutations: L633, M33, M132
current populations: sub-Saharan Africa, North Africa, the Near East and Europe
current populations: E1b1a is the most common Y haplogroup in sub-Saharan Africa, reaching a frequency of ~99% in Central West Africa.
current populations: About 14Kyr, E1b1b spread throughout North and North East Africa, the Near East and later Europe. E1b1b is the third most frequent Y haplogroup in Europe.
current populations: E2 is found in East, Southern, Central and West Africa. The highest frequencies are among Bantu males of Kenya and South Africa.
45Kyr in North Africa, the Middle East or South Asia
North Africa, the Near East, Europe and South Asia
F is frequently referred to as "the second-wave out of Africa". F is the parent of all Y haplogroups G through T, and it contains more than 90% of the world's non-sub-Saharan male population. Some male populations carrying F later migrated back to North Africa. For a discussion of the bi-directional gene flow between North Africa and the Near East - see: The Levant versus the Horn of Africa: Evidence for Bidirectional Corridors of Human Migrations.
The sub-clades of F are:
15-35Kyr in Near East or Southern Asia
Iran, the Caucasus (~60% of Ossetian males), ~10% of Jewish males, Turkey and Pakistan
G was one of the "F-scale" haplogroups "injected" into the (R1b and I) populations of Old Europe during the Neolithic expansion of peoples from the Near East about 10Kyr.
The sub-clades of G are:
origin: 5Kyr in Iran
current populations: Iran, Turkey, Kazakhstan and the southern and Northern Caucasus
G1 is relatively rare in Europe.
origin: possibly 3Kyr in Anatolia
current populations: Caucasus, Southwest and Southern Asia
G2 is more common than G1.
current populations: Caucasus, Eastern Europe and Ashkenazi Jews
current populations: Southwest and southern Asia, Corsica and Sardinia
Ötzi, the Iceman preserved for over 5Kyr in the icy Italian Alps, belongs to G2a1b.
current populations: Turkey (5%), Greece (5%), Iraq (Kurds), Italy, Spain, Netherlands and Switzerland
current populations: Europe and Turkey (Armenia)
Haplogroup G Project
25-45Kyr in India, Iran or the Middle East
Europe (Romani), India and Sri Lanka
current populations: India (Dravidians 33%), Sri Lanka (Sinhalese) and Nepal
current populations: Europe (Romani people), India and Cambodia
35-40Kyr Southwest Asia
IJ is a hypothetical haplogroup, considered to have given rise to I and J. Some speculate that when Cro-Magnon males entered Europe (~40Kyr), they carried IJ. Under the theory, after IJ picked up a mutation, the dominant haplogroup in the male population of Europe prior to the Last Glacial Maximum (LGM, ~25Kyr) was I1.
25-30Kyr in Europe or the Middle East
I is carried by the descendants of men who are believed to have arrived in Europe from the Middle East 20-25Kyr ago; they were associated with the Gravettian culture (22-28Kyr). I is the second most common Y haplogroup in North West Europe after R1b. 25% of males in Europe (the Balkans, Germany, Scandinavia and North Western Europe) carry I. (Bosnia and Herzegovina 65%, Norway 40%, Denmark 39%, Germany 24% and England 20%)
However, a competing theory runs that I was the oldest Y haplogroup to appear in Europe, and that it (not R1b) was carried by the descendants of Cro-Magnon - at about 25Kyr.
The sub-clades of I are:
origin: 15-25Kyr in Europe
current populations: found in 35% of the Scandinavian population (Southern Norway, South Western Sweden and Denmark), Iceland and Northwestern Europe
I1 is associated with the Viking conquest of Britain.
origin: 15Kyr in Poland or south eastern Europe
current populations: Bosnia and Herzegovina, Croatia, Serbia, Sardinia, Spain (Basques), Denmark, Germany and Sweden
Phylogeography of Y-Chromosome Haplogroup I Reveals Distinct Domains of Prehistoric Gene Flow in Europe
30Kyr in Southwest Asia
Arabia, the Near East, Southern Europe, Central Asia, South Asia, North Africa and the Horn of Africa
The distribution of haplogroups J, R1b and T among the ancient (pre-Western colonial) populations of Africa is closely correlated with the language distribution of the Afro-Asiatic superfamily.
The sub-clades of J are:
origin: 15-24Kyr in Western Asia
current populations: Southwest Asia, North Africa and Ethiopia
origin: 18.5Kyr in Turkey or the Fertile Crescent
current populations: Turkey, the Levant, Mesopotamia, the South Caucasus, Iran, Central Asia and South Asia
J2 spread into the Mediterranean area with the expansion of agricultural peoples from the Near East during the early Neolithic (~10Kyr). 29% of Sephardic Jews and 23% of Ashkenazi Jews carry J2.
47Kyr in South Western or Central Asia
Asia, Europe and the Americas
25-30Kyr in Iran or Southern Central Asia
India (Dravidian upper and middle castes), Pakistan, the Near East and Europe
L may have been (with the exception of J2) the original Y haplogroup of the creators of the Indus Valley Civilization.
32-47Kyr in Southeast Asia
Indonesia, Melanesia, Micronesia and Polynesia
In Western New Guinea, M is the most frequent male haplogroup.
15-20Kyr in Southeast Asia
Siberia, Eurasia and Europe (Finland 60%, Latvia and Lithuania 40%, Russia 20%)
The sub-clades of N are:
current population: Siberia and northern Europe
current populations: Kazakhstan, Korea and China
35Kyr in Siberia or Central Asia
80-90% of all men in East and Southeast Asia carry O.
The sub-clades of O are:
current populations: Malaysia, Vietnam, Indonesia and southern China
current populations: Japan and Korea
current populations: China
27-41Kyr in Central Asia or Southern Siberia
P is the parent haplogroup of R and Q. It contains the patrilineal ancestors of most Europeans and most Amer-Indians.
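The parent and child relationships stated so far can be written down as a small tree and queried. The sketch below is a simplification under stated assumptions: it records only the links this essay spells out (CF as the ancestor of C and F, IJ as the ancestor of I and J, P as the parent of Q and R, F as the parent of everything from G through T), it hangs every remaining letter directly under F even though the real phylogeny has intermediate nodes, and it places CF under CT because CT's mutations are found in all males outside A and B. The names `PARENT_OF`, `lineage` and `common_ancestor` are mine.

```python
# Child -> parent links, limited to relationships stated in this essay.
PARENT_OF = {
    "CF": "CT",            # CT mutations occur in all males outside A and B
    "C": "CF", "F": "CF",  # CF is the hypothetical ancestor of C and F
    "I": "IJ", "J": "IJ",  # IJ gave rise to I and J
    "Q": "P", "R": "P",    # P is the parent of R and Q
}
# "F is the parent of all Y haplogroups G through T": attach the rest
# directly to F (a flattening; intermediate nodes are omitted).
for name in ["G", "H", "IJ", "K", "L", "M", "N", "O", "P", "S", "T"]:
    PARENT_OF[name] = "F"


def lineage(haplogroup):
    """Return the path from a haplogroup up to the root of this sketch."""
    path = [haplogroup]
    while path[-1] in PARENT_OF:
        path.append(PARENT_OF[path[-1]])
    return path


def common_ancestor(a, b):
    """Return the most recent node shared by both lineages, if any."""
    ancestors_of_a = set(lineage(a))
    for node in lineage(b):
        if node in ancestors_of_a:
            return node
    return None


print(lineage("R"))               # ['R', 'P', 'F', 'CF', 'CT']
print(common_ancestor("R", "Q"))  # P
print(common_ancestor("I", "R"))  # F
print(common_ancestor("C", "R"))  # CF
```

Read this way, the statement that P contains the patrilineal ancestors of most Europeans and most Amer-Indians is just the observation that R and Q meet at P.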
15-20Kyr in Siberia
Humans colonized Siberia by ~45Kyr. It is striking that 10-20Kyr after the African exodus, some of its descendants appear to have made a bee-line straight to (and successfully inhabited) Siberia - one of the coldest regions on the planet. Routinely - during Siberian winters, temperatures drop to -60°F. Haplogroup Q (and possibly O and P) arose in Siberia.
Amer-Indians and North Eurasians
Q is the most common haplogroup among Amer-Indian males.
The sub-clade of Q is:
origin: Beringia 10-15Kyr
current population: Q1a3a1 is almost exclusively associated with the Amer-Indian population. Though it is found in Siberia at low frequencies, its presence there may be the result of ancient back-flow from North America, before the Americas became isolated from the rest of the world.
The Peopling of the New World: Perspectives from Molecular Anthropology
20-35Kyr in Central or South Asia
Europe, Central and South Asia, the Middle East and Africa
The sub-clades of R are:
origin: 12-25Kyr in Central or South Asia
current populations: Europe, Western Asia, Africa, Siberia and the Americas
R1 is relatively common among male Amer-Indians - in North Eastern Canada and the US, triggering speculation that R1 was brought to the Americas recently during the time of the European Conquest.
R1 is believed to have existed long before the end of last Ice Age. It has been associated with the Aurignacian culture (32-21Kyr). Archeological evidence supports the view that the Aurignacian culture arrived from Anatolia during the Upper Paleolithic (rather than earlier theories which tied this culture to the Iranian plateau). The Aurignacian culture and Cro-Magnon, the first modern humans to enter Europe 35-40Kyr, are linked. However, the contention that Cro-Magnon males carried R1 has been challenged. Any link, connecting R1 to the Aurignacian culture, is weak as some estimates suggest that R1 arose only 18.5Kyr ago.
origin: 18.5Kyr in Asia, South Asia, Central Asia, Middle East or Eastern Europe
current populations: Its distribution is associated with the re-settlement of Eurasia following the LGM - 18-22Kyr.
origin: 18.5Kyr in the Eurasian Steppes
current populations: R1a1a, common in Europe, is associated with the expansion of the Kurgan people who spread Indo-European languages to Central Asia, India, Sri Lanka, Central, Northern and Eastern Europe. The Kurgans were pastoral nomads, who rode the horse and chariot, shot a compound bow, smelted bronze and worshipped the sky god. They conquered (or co-opted) many cultures, notably Greece and the Indus Valley civilization. They also invaded Babylon, establishing the 500 year long Kassite dynasty.
origin: 18.5Kyr in Western Asia
current populations: R1b is the most common Y haplogroup in Western Europe. The present-day male population of Western Europe, carrying R1b, is believed to have descended from a "refugium" in the Iberian Peninsula (Portugal and Spain) during the LGM, where the R1b1b2 haplogroup achieved a "genetic homogeneity". After the ice sheets receded in Europe, these R1b carrying males (in part) re-colonized Europe. However, see the contrary discussion regarding R1b here. It is speculated that in Old Europe the dominant Paleolithic (pre-LGM) Y haplogroup was I (not R1b).
origin: 18.5Kyr(?) in Central Asia or South Central Siberia
mutations: P25 and M269
current populations: Most of the present-day European males carrying the M343 (R1b) marker also have the P25 and M269 markers, which define the R1b1b2 as a subclade.
current populations: Northern Cameroon
R1b1* represents archaic gene flow from Eurasia into sub-Saharan Africa (~22Kyr). Modern-day populations of Northern Cameroon speak Chadic languages, which are classified as an ancient branch of the Afro-Asiatic superfamily of languages. The extinct language of the Ancient Egyptians belonged to this superfamily.
A Back Migration from Asia to Sub-Saharan Africa Is Supported by High-Resolution Analysis of Human Y-Chromosome Haplotypes
origin: 25Kyr in South Central Asia
current populations: India, Pakistan and Sri Lanka
The Genetic Legacy of Paleolithic Homo sapiens sapiens in Extant Europeans
28-41Kyr in Southeast Asia (New Guinea)
New Guinea (~50%), Indonesia and Melanesia
The sub-clade of S is:
19-34Kyr in Western Asia
India, Egypt, Oman, Tanzania, Ethiopia and Morocco
T is found at low frequencies in Europe and the Middle East. Thomas Jefferson, the 3rd President of the US, carried T.
The sub-clade of T is: