Genetic Stratigraphy of Key Demographic Events in Arabia
Verónica Fernandes,
Petr Triska,
Joana B. Pereira,
Farida Alshamali,
Teresa Rito,
Alison Machado,
Zuzana Fajkošová,
Bruno Cavadas,
Viktor Černý,
Pedro Soares,
Martin B. Richards,
Luísa Pereira
PLOS
Published: March 4, 2015
https://doi.org/10.1371/journal.pone.0118625
The obvious missing element in those studies was the whole-mtDNA sequencing of Arabian JT lineages, which we have performed here, providing a detailed phylogeographic analysis in Supplemental Material (outline topology in S1 and S2 Figs.; S1 Text). Following the pattern for the remaining N lineages, the frequency and diversity maps (S3, S4, S5, S7, S12, S13, S16 and S19 Figs.; S3 and S4 Tables) of JT lineages, displaying similarity across the Near East and Arabian Peninsula, as well as the many basal Arabian lineages (S8, S9, S10, S11, S14, S15, S17, S18, S20, S21, S22 and S23 Figs.), suggest that both regions were in close contact throughout the late Pleistocene and Holocene. Haplogroup J assumes a more important role in Arabia overall than haplogroup T, as testified by frequencies (between 7.7–20.6% and 3.2–10.2%, respectively) and the many star-like J sub-clades observed in Arabia, dating to ∼6–7 ka. These expansions in haplogroup J are reflected in the BSP analysis (S6 Fig.), for which the main increase in effective size was between 8–12 ka in Arabia (S6A Fig.), after the expansion observed in the Near East around 11–15 ka (S6B Fig.). Haplogroup J also shows signs of having crossed into eastern Africa, particularly the sub-clade J1d1a1, necessarily after its emergence in Arabia at ∼7.1 ka (S14 Fig.). Thus haplogroup JT indicates that demographic expansion in Southwest Asia was a continuous phenomenon from the Late Glacial period to the Neolithic period.
At the Younger Dryas/Neolithic boundary, 34–41% of lineages, mainly unclassified HV, R0a, J1b, T1a and M1, migrated to Arabia. The remaining 12–19% moved very recently, ∼1 ka, and consist of derived lineages (including J1d1a, K1, HV8 and N1a3). Although it is hard to discriminate clearly between the Near Eastern and Pakistan/Iranian influences, due to their largely shared mtDNA pool, the results suggest a higher Pakistan/Iranian impact in the east (41%) than in the west (25%) of Arabia for private founders, but just 14% and 11%, respectively, when considering the overall pool. This seems to indicate that the Pakistan/Iranian contribution was recent, as the lineages introduced from this region did not reach high frequencies, and that, as expected, its impact was higher in the eastern Arabian countries.
Although it is not possible to securely date events as old as those of the Pleistocene/Holocene transition based on genome-wide data alone, it is interesting to observe how the patterns of shared genome-wide ancestry support the inferences made for the mtDNA. All the Arabian populations form a close group with Near Eastern populations in PC analysis (Fig. 3), with the first component explaining 44% of the diversity and partitioning populations along a west–east axis, and the second component explaining 8% and organising populations on a north–south axis. A few individuals in Arabian populations most probably had recent ancestry within Africa (especially for Yemen) or Pakistan (in the United Arab Emirates; UAE). Yemen shows the highest dispersion along the first axis, testifying again to the higher African input in the country closest to the Horn of Africa. We confirmed the clustering of Yemeni Jews with Bedouin and Saudi Arabians, identified previously [23], which probably indicates that they were less open to recent admixture with non-Arabian populations than their Yemeni Arab/Muslim neighbours.
The ADMIXTURE results indicate that K = 6 (Fig. 4 and Table 1; other K plots are displayed in S38 Fig.) is the number of clusters that best represents the population structure of the analysed populations. Here it is already possible to distinguish between a Southwest Asian/Caucasian and an Arabian/North African component; these two components have similar proportions of ∼30% each in Yemen and UAE, but the Arabian/North African proportion increases to 52–60% in Saudi and Bedouin. In Near Eastern populations, correspondingly, the Southwest Asian/Caucasian component rises to ∼50% and the Arabian/North African cluster decreases to ∼20–30%, even in Palestinians (similar to the Samaritans and some of the Druze), highlighting their primarily indigenous origin, with the most extreme values for the Druze, carrying the Southwest Asian/Caucasian component at ∼80%.
European background is higher in Near Eastern populations (around 9–15%) than in Arabia (1.5–5%) while the African ancestry is ∼25% in Yemen, and then 4–8% in all Arabian and Near East populations except in Samaritans and Druze, with 0–2%. The UAE has a substantial pool from South Asia (21%) similar to the proportion displayed in Iran (24%), which falls to below 10% in all other Arabian and Near Eastern populations, except Turkey (18%).
ADMIXTURE allows us to calculate FST values between the components in order to quantify their similarity (Fig. 5A). For K = 6, Arabia showed a lower distance from the Near East (0.046) than from Europe (0.052), eastern Africa (0.098) and finally western Africa (0.140). Arabia and the Near East have similar genetic distances from eastern Africa (0.098 and 0.097, respectively), double the value between western and eastern Africa (0.046). When evaluating FST values in pairwise comparisons between Arabian and Near Eastern populations (Fig. 5B), we see that FST values are higher between Yemen and all other populations (and also for comparisons with Samaritans, but these results may be biased by low sample size). The UAE is closer to Jordan, Syria and Lebanon than Saudi Arabia is, while Saudi Arabia is closer to Palestinians, Druze and Samaritans than the UAE is. Thus, FST values indicate that genetic distances between the UAE and Near Eastern populations are lower than or similar to those between Saudi Arabia and Near Eastern populations, while Yemen is clearly more divergent.
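The component-level comparisons quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary holds the K = 6 FST values reported in the text for Fig. 5A, while the names `fst` and `ranked_from` are hypothetical helpers and are not part of the ADMIXTURE software.

```python
# FST values between ADMIXTURE components at K = 6, as quoted in the text (Fig. 5A).
fst = {
    ("Arabia", "Near East"): 0.046,
    ("Arabia", "Europe"): 0.052,
    ("Arabia", "eastern Africa"): 0.098,
    ("Arabia", "western Africa"): 0.140,
    ("Near East", "eastern Africa"): 0.097,
    ("western Africa", "eastern Africa"): 0.046,
}


def ranked_from(component, table=fst):
    """Return (other component, FST) pairs involving `component`, closest first."""
    pairs = []
    for (a, b), value in table.items():
        if a == component:
            pairs.append((b, value))
        elif b == component:
            pairs.append((a, value))
    return sorted(pairs, key=lambda pair: pair[1])


# Order of genetic distance from the Arabian component.
for other, value in ranked_from("Arabia"):
    print(f"Arabia vs {other}: {value:.3f}")

# How much larger is the Arabia/eastern Africa distance than the
# western/eastern Africa distance?
ratio = fst[("Arabia", "eastern Africa")] / fst[("western Africa", "eastern Africa")]
print(f"Arabia-eastern Africa relative to western-eastern Africa: {ratio:.2f}x")
```

The ratio comes out at a little over 2, which matches the description of the Arabian and Near Eastern distances from eastern Africa as roughly double the distance between western and eastern Africa.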
The genome-wide analyses performed here on the available data from Arabian populations provide estimates of African admixture, disentangling the contributions of the western and eastern African gene pools (Table 1). The eastern African background is around 4.0% in Saudi and Bedouin, ∼7.7% in Yemen (although Yemeni Jews have a lower admixture of 5.1%), and 1.8% in UAE; this input decreases beyond Jordan, and is negligible in Samaritans, Druze, Turks and Iranians. The western African component also varies between 2.0 and 6.4%, except for Yemen (16.9%) where it has likely been inflated due to indirect recent migration (the Bantu component which is present in many eastern African populations). The ROLLOFF estimates for the admixture event were 8–27 generations ago when using eastern Africa as parental population, and 8–37 generations ago when using a western African source.
Both date estimates are compatible with the Arab slave trade, which operated between the 6th and 19th centuries AD, mainly from eastern Africa (from Nubia to Zanzibar), although many of these populations bear a significant western African component (as shown in Fig. 4). These values are in agreement with the estimates of Moorjani et al. [1] for Levantine groups, showing 4–15% African ancestry and an admixture event about 32 generations ago, interpreted as consistent with close political, economic, and cultural links with Egypt in the late Middle Ages. They also estimated 72 generations ago for the event leading to 3–5% sub-Saharan ancestry in diverse Jewish populations, arguing that this reflects the descent of these groups from a common ancestral population that already had some African ancestry prior to the Jewish Diaspora.
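To see why the generation counts are read as compatible with the 6th to 19th centuries AD, they can be converted to approximate calendar years. The sketch below rests on two assumptions that are not stated in the text: a generation time of 29 years and a reference year of 2015 (the publication year). The function names are hypothetical, and a different generation time would shift the dates accordingly.

```python
GENERATION_YEARS = 29   # assumed generation time in years, not given in the text
REFERENCE_YEAR = 2015   # assumed "present", taken as the publication year

# ROLLOFF ranges quoted in the text, in generations before present.
rolloff_ranges = {
    "eastern African source": (8, 27),
    "western African source": (8, 37),
}

# 6th to 19th centuries AD, expressed as calendar years.
SLAVE_TRADE_PERIOD = (501, 1900)


def generations_to_year_ad(generations,
                           generation_years=GENERATION_YEARS,
                           reference_year=REFERENCE_YEAR):
    """Convert generations before present to an approximate calendar year AD."""
    return reference_year - generations * generation_years


def within_period(generation_range, period=SLAVE_TRADE_PERIOD):
    """True if the whole generation range falls inside the calendar period."""
    youngest, oldest = generation_range
    earliest_year = generations_to_year_ad(oldest)
    latest_year = generations_to_year_ad(youngest)
    return period[0] <= earliest_year and latest_year <= period[1]


for label, gens in rolloff_ranges.items():
    years = [generations_to_year_ad(g) for g in sorted(gens, reverse=True)]
    print(f"{label}: about AD {years[0]} to AD {years[1]}, "
          f"inside period: {within_period(gens)}")
```

Under these assumptions both ranges fall between roughly the 10th and 18th centuries AD, inside the period given for the slave trade. The result depends on the assumed generation time, so it should be read as an order-of-magnitude check rather than a precise date.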
The phylogenetic analyses for N(xR) lineages performed by Fernandes et al. [24] also provided insights into back-to-Africa movements, evidently at various time periods. Some lineages (I, N1a and N1f) displayed deep branches in eastern Africa, a sign of introduction in Africa which could have begun as early as ∼40 ka (the upper bound defined by the TMRCA of the founder clades) and extending till ∼15 ka (the lower bound defined by the TMRCA of the derived African clades). The migration of J1d1a lineages into eastern Africa in the Neolithic period is confirmed in the whole-mtDNA sequencing (S14 Fig.) and complemented by the frequency interpolation and founder analysis (S13 Fig.) performed here.
From the genome-wide results, we can infer this back-to-Africa migration was considerable, leading to a proportion of 12% of Near Eastern and 26% Arabian ancestry in Ethiopia (Table 1). The ROLLOFF estimate for the date of admixture was 93 generations ago—twice as old as the time of African admixture in Arabia and Near East. For comparison, in the Maasai from Kenya and Tanzania, the Eurasian component is an order of magnitude lower (4.5%), and the time of admixture is 47 generations, reflecting most probably later admixture events.
The parallel introduction of Eurasian lineages from the Near East, Iran and Arabia into North Africa through the Sinai Peninsula revealed two well-defined peaks (Fig. 1G) at ∼2.4 ka and 6.8 ka with the f1 criterion, and two peaks at ∼9.0 ka and ∼12.4 ka when using the f2 criterion. This seems to point to a significant role for dispersal in the Neolithic period, consistent with results obtained for the North African MSY pool, interpreted as suggesting a large Neolithic origin [51]. A major Neolithic impact is supported when imposing periods for the migration of founders (Fig. 1H), leading to: 7–16% at ∼2 ka, mainly HV1 and other undefined HV lineages, M1 and U (U6a1, K1a1); 52–58% at ∼10 ka for most of HV, U (U5b, U5 and K), T (some T2c1 and T2b), J (J1d1a, J2a2b and other undefined J), and X; and 26–41% at ∼16 ka for some HV, T (T1a, T2) and U (U3, U3a, U5b1b, U5a, U6a) lineages (S1 Text, S36 and S37 Figs.). It seems likely that some JT lineages, especially T ones, were introduced into Northeast Africa before the Neolithic, following Late Glacial population expansions in the Near East/Arabia. They could then have been involved locally in population expansions in the Neolithic period, leading to signs of autochthonous founder effects, such as the one detected in the El-Hayez oasis (400 km southwest of Cairo) for sub-haplogroup T1a2a [52].
At the genome-wide level, Egypt is quite similar to its Levantine neighbours, displaying a mainly Near Eastern (39.8%) and Arabian/North African (30.5%) background, with slightly higher western (5.6%) and eastern (15.1%) African proportions, and lower European (8.4%) and South Asian (0.6%) proportions. The ROLLOFF estimate for admixture in Egypt (using Africans and Europeans as ancestral populations) was 30 generations, predictably young due to continuous gene flow between the two regions. Morocco and Tunisia presented similar western (9.8–12.2%) and eastern African (10.4–12.1%) components and roughly twice the magnitude for each of the European (22.8–25.5%), Near Eastern (21.4–26.0%) and Arabian (28.9–31.0%) pools. Again these young dates show that simple genome-wide dating approaches based on linkage disequilibrium decay must be applied cautiously in complex scenarios of several migrations occurring over a long span of time, such as the ones which took place across the Red Sea, North Africa [56] and Iberia [57].
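As a quick consistency check on the Egyptian ancestry proportions quoted above, the six values can be summed to confirm that they account for the whole K = 6 profile. This is only an arithmetic sketch: the component labels follow the wording of the paragraph, and the variable names are hypothetical.

```python
# Ancestry proportions (%) for Egypt at K = 6, as quoted in the text.
egypt = {
    "Near Eastern": 39.8,
    "Arabian/North African": 30.5,
    "western African": 5.6,
    "eastern African": 15.1,
    "European": 8.4,
    "South Asian": 0.6,
}

# The six components should account for the whole genome-wide profile.
total = sum(egypt.values())

# Combined African share (western plus eastern components).
african_total = egypt["western African"] + egypt["eastern African"]

# Components listed from largest to smallest.
ranked = sorted(egypt.items(), key=lambda item: item[1], reverse=True)

print(f"sum of components: {total:.1f}%")
print(f"combined African share: {african_total:.1f}%")
for name, share in ranked:
    print(f"{name}: {share:.1f}%")
```

The components add up to 100%, and the combined African share of about 20.7% falls between the 4–8% reported for most Arabian and Near Eastern populations and the ∼25% reported for Yemen.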
The detailed evaluation of the Arabian and neighbouring mtDNA pools has allowed us to establish a genetic stratigraphy of Arabia’s maternal line of descent, testifying to the pivotal role of the Peninsula at the crossroads between Africa and Eurasia. The successful out-of-Africa migration led to continuous settlement of parts of the Peninsula, most probably centred on the Gulf Oasis, which likely functioned as the cradle for the emergence of the haplogroup N lineages. No haplogroup L(xMN) relicts of this migration into Arabia are detected in mtDNA founder analysis and we have confirmed their absence by whole-mtDNA sequencing of lineages from L3 [16] and its sister clades L4 and L6.
Although it is likely that the Gulf Oasis region eventually formed part of an extended source region together with the Near East, if we assume that the Near East was the main source population for current Arabian diversity, the Late Glacial period was responsible for the introduction of 40–54% of lineages, the Younger Dryas/Neolithic for 34–41%, and recent times (at 1.0 ka) for the remaining 12–19%. The Neolithic in Arabia was more characterised by the expansion in effective size of local haplogroup N lineages, mostly within R0a and J, than by the entrance of new lineages. Arabia, together with the Near East and Iran, was involved in the “back-to-Africa” migration of Eurasian lineages, beginning in the Pleistocene but becoming more significant with the establishment of maritime commercial routes. The Late Glacial period was more important for bringing Eurasian lineages into eastern Africa, probably reflecting the higher impact of this period in the expansion of Arabian populations, while the Neolithic, especially linked to the Near East, affected to a greater extent the dispersals towards North Africa. The biparental genome averaged the African input to 6–25% of the Arabian pool, concordant with the 35% female and 0% male inputs estimated from uniparental systems. ROLLOFF dating of admixture events across the Red Sea suggested recent ages of 8–37 generations for the African input into Arabia, 93 generations for the Arabian/Near Eastern input into eastern Africa and 30 generations for North Africa.