The genetics of East African populations: a Nilo-Saharan component in the African genetic landsca



Kamal900
06-18-2015, 10:54 PM
East Africa is a strategic region to study human genetic diversity due to the presence of ethnically, linguistically, and geographically diverse populations. Here, we provide new insight into the genetic history of populations living in the Sudanese region of East Africa by analysing nine ethnic groups belonging to three African linguistic families: Niger-Kordofanian, Nilo-Saharan and Afro-Asiatic. A total of 500 individuals were genotyped for 200,000 single-nucleotide polymorphisms. Principal component analysis, clustering analysis using ADMIXTURE, FST statistics, and the three-population test were used to investigate the underlying genetic structure and ancestry of the different ethno-linguistic groups. Our analyses revealed a genetic component for Sudanese Nilo-Saharan speaking groups (Darfurians and part of Nuba populations) related to Nilotes of South Sudan, but not to other Sudanese populations or other sub-Saharan populations. Populations inhabiting the North of the region showed close genetic affinities with North Africa, with a component that could be remnant of North Africans before the migrations of Arabs from Arabia. In addition, we found very low genetic distances between populations in genes important for anti-malarial and anti-bacterial host defence, suggesting similar selective pressures on these genes and stressing the importance of considering functional pathways to understand the evolutionary history of populations.

We applied a principal component analysis (PCA) to investigate the population structure of the new populations genotyped in this study from the Sudanese region (Supplementary Fig. S1a). PC1 (3.56% of the variation) follows a North-South cline and separates populations inhabiting the region between the Nile River and the Red Sea (Nubians and Arabs along the Nile, Beja and Ethiopians along the coast) from Darfurians and Nuba of South-West Sudan, and Nilotes of South Sudan. Copts are a separated group close to the North-East populations, in a more outlier position: they are the extreme of the northern genetic component. PC2 (0.7%) separates the nomadic Fulani from the other populations.
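For readers who want to see what a PCA like the one described above does mechanically, here is a minimal sketch in Python with NumPy. It is not the paper's pipeline. The function names, the simulated data, and the per-SNP standardisation are all assumptions made for illustration. The input is an individuals x SNPs matrix of 0/1/2 allele counts, and the output includes the fraction of variance per component, which is the kind of number quoted as "PC1 (3.56% of the variation)".

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """PCA of an individuals x SNPs matrix of 0/1/2 allele counts.

    Returns (scores, explained_fraction) for the first n_components.
    """
    g = np.asarray(genotypes, dtype=float)
    # Centre each SNP on its mean allele count.
    centred = g - g.mean(axis=0)
    # Drop SNPs with no variation in the sample; they cannot be scaled.
    sd = centred.std(axis=0)
    keep = sd > 0
    z = centred[:, keep] / sd[keep]
    # SVD of the standardised matrix; squared singular values are
    # proportional to the variance carried by each component.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = u[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]


def simulate_two_groups(n_per_group=20, n_snps=500, seed=0):
    """Hypothetical toy data: two groups with drifted allele frequencies."""
    rng = np.random.default_rng(seed)
    freq_a = rng.uniform(0.1, 0.9, n_snps)
    freq_b = np.clip(freq_a + rng.normal(0, 0.2, n_snps), 0.05, 0.95)
    a = rng.binomial(2, freq_a, size=(n_per_group, n_snps))
    b = rng.binomial(2, freq_b, size=(n_per_group, n_snps))
    return np.vstack([a, b])


if __name__ == "__main__":
    scores, explained = genotype_pca(simulate_two_groups())
    print("fraction of variance, PC1 and PC2:", explained)
    print("PC1 scores, first group:", scores[:20, 0].round(2))
    print("PC1 scores, second group:", scores[20:, 0].round(2))
```

On the toy data the two simulated groups fall on opposite sides of PC1. Real analyses also prune SNPs in linkage disequilibrium and handle missing genotypes, which this sketch leaves out.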

Next, we combined our new populations (140K data set) with previously studied populations of special interest for this analysis: Qatar 12, Egypt 13, and three sub-Saharan populations (Luhya, Yoruba and Maasai) from the 1000 Genomes Project 14, to have external references both in the north and south of the Sudanese region. This new data set contains 14,343 SNPs (14K data set). Even if the number of SNPs in this second set is small, it is enough to differentiate components in the African genetic landscape 15. Fig. 2 shows a PCA of this extended data set, where East African populations are distinct from both sub-Saharan and North African populations. PC1 (6.08%) separates populations from North Africa/Middle East from those of sub-Saharan Africa (Fig. 2a). Copts are closer to North African and Middle East populations but remain as a separate cluster when PC2 is considered. PC2 (1.46%) along with PC1 separates the two homogeneous clusters of North-East and South-West populations: Nubians, Arabs, Beja and Ethiopians on one hand, and Nuba, Darfurians and Nilotes on the other. PC2 separates all Sudanese and Ethiopian populations from the rest. PC3 (0.56%) differentiates West-African populations (Fulani and Yoruba) from Sub-Saharan East African populations (Maasai) (Fig. 2b). Both PC analyses, which use data sets with different numbers of SNPs, preserve the topology of the populations. As expected, with a low number of SNPs we observe a higher intra-population variation (Supplementary Fig. S1b).

To infer the ancestral populations of the East African individuals, we ran ADMIXTURE from k = 2 to k = 10 in the 14 populations (the analysis for the internal nine populations is presented in Supplementary Fig. S7,S10). We analysed the results from k = 2 to k = 5 as higher numbers of ancestral components do not have a clear origin. A complex pattern of admixture is observed in East African populations (Fig. 3). At k = 2, we already detect different ancestries in the Sudanese populations. Copts show a common ancestry with North African and Middle Eastern populations (dark blue), whereas the South-West cluster (Darfurians, Nuba and Nilotes) shares an ancestry component (light blue) with sub-Saharan samples. The North-East cluster (Beja, Ethiopians, Arabs and Nubians) shows both components, although the main component (~70%) is that detected in North Africa and Middle East (Fig. 3). At k = 3 (best statistically supported model, see Supplementary Fig. S8b), a new component (light green) appears, well differentiated from other South Saharan or North Africa and Middle East populations. This component defines South-West Sudanese populations (Nuba and Darfurians) and Nilotes of South Sudan and is different from the main sub-Saharan component as seen in Yoruba and Luhya.
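To make the idea of "ancestral components" a bit more concrete, the sketch below evaluates the kind of binomial log-likelihood that model-based clustering tools such as ADMIXTURE try to maximise. Each individual's expected allele frequency at a SNP is a mixture of K ancestral frequencies weighted by that individual's ancestry proportions. The function name and the toy numbers are assumptions for illustration. The sketch only scores a given solution and does not perform the optimisation itself.

```python
import numpy as np


def admixture_log_likelihood(genotypes, q, f):
    """Log-likelihood of 0/1/2 genotypes under an admixture model.

    genotypes: individuals x SNPs allele counts
    q: individuals x K ancestry proportions (rows sum to 1)
    f: K x SNPs allele frequencies of the K ancestral components
    """
    g = np.asarray(genotypes, dtype=float)
    q = np.asarray(q, dtype=float)
    f = np.asarray(f, dtype=float)
    # Expected allele frequency for every individual at every SNP.
    p = q @ f
    # Keep the logarithms finite at frequencies of exactly 0 or 1.
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return float(np.sum(g * np.log(p) + (2 - g) * np.log(1 - p)))


if __name__ == "__main__":
    # Hypothetical toy case: K = 2 components, 2 individuals, 2 SNPs.
    f = [[0.9, 0.9], [0.1, 0.1]]
    genotypes = [[2, 2], [0, 0]]
    matching = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("matching assignment:", admixture_log_likelihood(genotypes, matching, f))
    print("swapped assignment:", admixture_log_likelihood(genotypes, swapped, f))
```

The assignment that puts each toy individual in the component whose frequencies match its genotypes gets the higher log-likelihood. Choosing the number of components (k = 3 in the excerpt) is a separate step that this sketch does not cover.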

This Nilo-Saharan component, which is also found at lower percentage in the North-East cluster and Maasai, will be outlined in the discussion.
Copts share the same main ancestral component as North African and Middle East populations (dark blue), supporting a common origin with Egypt (or other North African/Middle Eastern populations). They are known to be the most ancient population of Egypt and at k = 4 (Fig. 3), they show their own component (dark green), different from the current Egyptian population, which is closer to the Arabic population of Qatar.
The case of the Fulani is noteworthy: they feature more Sudanese ancestry (>45%) than North African (<40%) or sub-Saharan (<15%) and at k = 5 show their own component (Fig. 3). They have a high individual component variance, suggesting a recent admixture event in this population.

To formally test the results of the admixture analysis, we applied the three-population test (f3 statistics) 16. We used all possible pairs of populations as surrogates of the ancestral populations of each ethno-linguistic group. All populations that have a complex pattern of admixture (Fig. 3) showed statistically significant results (Z-score < −4, p-value < 3.2 × 10^−5): those of the North-East cluster (Beja, Ethiopians, Arabs and Nubians) and Fulani. Populations from the North-East cluster (Beja, Ethiopians, Arabs and Nubians; Table 2) may be explained as admixture products of an ancestral North African population (similar to Copts) and an ancestral South-West population (Nuba, even if in one case Darfurians give a better fit). These four populations had an intermediate position between Copts and South-West Sudanese populations both in the PC and admixture analyses. Fulani, who are known to have West-African ancestry, have a negative f3 with Copts and Yoruba as source populations (Table 2). As they have a complex history and present high levels of admixture with different populations and high individual variance, this three-population phylogeny seems naïve to explain their complex population history. None of the South-West populations (Darfurians, Nuba and Nilotes) appear as admixed in the three-population test. This result fits the ADMIXTURE analysis (Fig. 3 and Supplementary Fig. S10) and it confirms a specific ancestral component for these populations.
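The logic of the three-population test can be sketched in a few lines. For a target population C and two candidate sources A and B, f3(C; A, B) is the average over SNPs of (c − a)(c − b), where a, b and c are allele frequencies. A clearly negative value is taken as evidence that C is a mixture of populations related to A and B. The code below is a simplified illustration only: the function names are made up, it skips the finite-sample correction used by real tools, and the jackknife blocks are contiguous runs of SNP indices standing in for genomic blocks.

```python
import numpy as np


def f3(target, source_a, source_b):
    """Plain f3(C; A, B): mean over SNPs of (c - a) * (c - b)."""
    c, a, b = (np.asarray(x, dtype=float) for x in (target, source_a, source_b))
    return float(np.mean((c - a) * (c - b)))


def f3_z_score(target, source_a, source_b, n_blocks=20):
    """Z-score of f3 from a delete-one-block jackknife (simplified)."""
    c, a, b = (np.asarray(x, dtype=float) for x in (target, source_a, source_b))
    per_snp = (c - a) * (c - b)
    blocks = np.array_split(per_snp, n_blocks)
    total, n = per_snp.sum(), per_snp.size
    # f3 recomputed with each block left out in turn.
    leave_one_out = np.array(
        [(total - blk.sum()) / (n - blk.size) for blk in blocks]
    )
    spread = np.sum((leave_one_out - leave_one_out.mean()) ** 2)
    std_err = np.sqrt((n_blocks - 1) / n_blocks * spread)
    return float(per_snp.mean() / std_err)


if __name__ == "__main__":
    # Hypothetical allele frequencies: C is an even mix of A and B.
    rng = np.random.default_rng(1)
    a = rng.uniform(0.1, 0.9, 2000)
    b = rng.uniform(0.1, 0.9, 2000)
    c = 0.5 * a + 0.5 * b
    print("f3(C; A, B):", f3(c, a, b), "Z:", f3_z_score(c, a, b))
    print("f3(A; B, C):", f3(a, b, c))
```

In the toy data the mixed population gives a negative f3 with a strongly negative Z-score, while testing one of the unmixed sources as the target gives a positive value. That matches the pattern in the excerpt, where the North-East cluster tests as admixed and the South-West populations do not.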

In this study we present an extensive genome-wide data set characterizing East African human genetic diversity in populations from Sudan, South Sudan and Ethiopia. We further analyse the Nilo-Saharan ancestral component within the variation of South-Saharan Africans. This component belongs linguistically to Eastern Sudanic languages and geographically to South and West of Sudan and South Sudan, including highly diverse ethnic groups in a similar genetic background. This component was identified in previous studies using Nilotic populations, but it was not analysed in other Nilo-Saharan populations, such as Darfurians or the Nuba people. In addition, we show convergent evolutionary pressures exerted on genes involved in anti-malaria and anti-bacterial host defence processes.

Africa's genetic landscape is shaped by geographic barriers 19, but the forces clustering populations vary depending on the scale. On a regional scale, East African populations cluster mainly by linguistic affiliation 5. However, it has been previously reported that language plays a lesser role in the genetic clustering of Sudanese populations, as geography is the main factor that groups them 10. This observation is supported by our data, as shown in the PCA (Fig. 2), where PC1 represents a north-east to south-west axis delimited by the Nile River and its main tributaries: the Blue Nile and the White Nile. Genetic and geographic distances between populations of the Sudanese region are positively correlated (Mantel test; r = 0.5105, p-value < 0.0001), with Sudanese populations clustering in four groups according to their geographic location (Supplementary Fig. S1).

Nubians are the only Nilo-Saharan speaking group that does not cluster with groups of the same linguistic affiliation, but with Sudanese Afro-Asiatic speaking groups (Arabs and Beja) and Afro-Asiatic Ethiopians (Supplementary Fig. S1a). Y-chromosome and mitochondrial DNA studies reported Nubians to be more similar to Egyptians than to other Nilo-Saharan populations 1,8: Nubians were influenced by Arabs as a direct result of the penetration of large numbers of Arabs into the Nile Valley over long periods of time following the arrival of Islam around 651 A.D. 20.
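The Mantel test mentioned above compares two distance matrices over the same set of populations, here genetic and geographic distance. A rough sketch of a permutation version follows. It correlates the upper triangles of the two matrices, then shuffles the population labels of one matrix many times to see how often a correlation at least as large arises by chance. The function name, the one-sided p-value and the toy distances are assumptions for illustration and will not reproduce the paper's r = 0.5105.

```python
import numpy as np


def mantel_test(dist_x, dist_y, n_permutations=9999, seed=0):
    """One-sided permutation Mantel test for a positive correlation.

    dist_x, dist_y: square symmetric distance matrices over the same items.
    Returns (r_observed, p_value).
    """
    x = np.asarray(dist_x, dtype=float)
    y = np.asarray(dist_y, dtype=float)
    n = x.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of items counted once
    r_obs = np.corrcoef(x[upper], y[upper])[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n)
        # Relabel items in y: rows and columns are permuted together.
        y_perm = y[np.ix_(perm, perm)]
        if np.corrcoef(x[upper], y_perm[upper])[0, 1] >= r_obs:
            hits += 1
    # Count the observed arrangement as one of the permutations.
    p_value = (hits + 1) / (n_permutations + 1)
    return float(r_obs), p_value


if __name__ == "__main__":
    # Hypothetical example: nine populations spaced along a line, with
    # genetic distance equal to geographic distance plus symmetric noise.
    coords = np.arange(9.0)
    geo = np.abs(coords[:, None] - coords[None, :])
    rng = np.random.default_rng(42)
    noise = rng.normal(0, 0.3, size=geo.shape)
    gen = geo + (noise + noise.T) / 2
    np.fill_diagonal(gen, 0.0)
    print(mantel_test(geo, gen, n_permutations=999))
```

Permuting rows and columns together matters because the entries of a distance matrix are not independent, so an ordinary correlation test on the pairs would overstate significance.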

We also found this relationship of Nilo-Saharan Sudanese populations with other Nilo-Saharan populations from Kenya (Maasai), but not as strong, as Maasai show their own genetic component at k = 6, which is different from the Sudanese component (Supplementary Fig. S7) and do not cluster with our Nilo-Saharan speaking populations. In a previous Y-chromosome study 8, most Nilo-Saharan speaking populations, except Nubians, showed little evidence of gene flow with other Sudanese populations.

The presence of the core of Nilo-Saharan languages in the confluence of the two Nile rivers suggests that the Sudanese region is the place of origin of the Nilo-Saharan linguistic family despite their fragmented distribution, as shown by the location of the Nubian language 21,22. It is interesting to note that Nuba populations constitute a homogeneous group, even if some speak Kordofanian (of the Niger-Kordofanian family) and others different languages of two branches of the Nilo-Saharan family. Their genetic composition denotes their Nilo-Saharan origin, with linguistic replacements in some groups. Population displacement, whether it is followed by cultural or genetic exchange with local populations, would explain why not every Nilo-Saharan speaking group has this genetic component (as is the case of Nubians) and not every population that has it is mainly formed by Nilo-Saharan speakers (as is the case of Niger-Kordofanian speaking Nuba). The North African/Middle Eastern genetic component is identified especially in Copts. The Coptic population present in Sudan is an example of a recent migration from Egypt over the past two centuries. They are close to Egyptians in the PCA, but remain a differentiated cluster, showing their own component at k = 4 (Fig. 3). Copts lack the influence found in Egyptians from Qatar, an Arabic population. It may suggest that Copts have a genetic composition that could resemble the ancestral Egyptian population, without the present strong Arab influence.

http://www.nature.com/srep/2015/150528/srep09996/pdf/srep09996.pdf

What do you think, guys?

Yuffayur
06-18-2015, 11:03 PM
I saw this study about 2 weeks ago. It's nonsense to say that Copts have more NA than Egyptians, and that Egyptians and Qataris are the same.

Anyway, check this thread; you will find some interesting PCA plots.

http://www.forumbiodiversity.com/showthread.php/44363-Sudan-autosomal-study

Kamal900
06-18-2015, 11:08 PM
Indeed. I find that study to be quite odd if you ask me. How do you place the Egyptians genetically in your part?

Yuffayur
06-18-2015, 11:15 PM
Well, from this study we can draw two conclusions:

1/ Ancient Egyptians were close to the southern Levant and Bedouins.

2/ Copts are outlier NE Africans, and may have a recent Near Eastern origin.

I go with the second one, since we know that ancient Egyptians had a clear East African influence (5 to 25%).

Yuffayur
06-18-2015, 11:17 PM
Also, some months ago I saw a study about Sudanese DNA (Y and mt) and Copts had interesting results: their mtDNA was like (U6a3 up to 25%, and M1 more than 20%), while their Y-DNA was mostly E-M78, J1 and T.

Kamal900
06-18-2015, 11:24 PM
So, you think that the reason why Copts are outlier Egyptians is that they tend to have more West Asian affinity than their Muslim counterparts? We have a Coptic member here, Oblivion, who scores over 40 percent Anatolian & Caucasus with some North African admixture (15%). I think it would make a lot of sense if that were the case.

Yuffayur
06-18-2015, 11:47 PM
Yup, Copts are West Asian from these results, but by Eurogenes K7 they lack ANE, unlike other West Asians (e.g. Sinai Bedouins). As for NA, we don't have ANE. Also, Copts show minor WHG (7 to 10%), more than West Asians.

Kamal900
06-18-2015, 11:51 PM
What's WHG?

Here's my results:
ANE 8.76%
ASE 2.25%
WHG-UHG 11.38%
East_Eurasian 0.70%
West_African 0.97%
East_African 4.09%
ENF 71.84%

What do you think?

Yuffayur
06-18-2015, 11:54 PM
Western Hunter-Gatherer, European Mesolithic.

http://i61.tinypic.com/4zvlh3.jpg

credit of the map to Longbowman.

Kamal900
06-18-2015, 11:58 PM
Oh, okay. Have you read my results?

ANE 8.76%
ASE 2.25%
WHG-UHG 11.38%
East_Eurasian 0.70%
West_African 0.97%
East_African 4.09%
ENF 71.84%

It seems I score around 11.38 percent Western Hunter-Gatherer. What's ENF? It seems I score 71 percent.

Yuffayur
06-19-2015, 12:10 AM
Yes, but you will probably lose half of it with Polako's new Eurogenes K8. In K7 I'm 0.3% ANE, 0% ASE, 27% WHG, 3% East Eurasian, 0% West African, 9% East African, and the rest is ENF, about 60%.

ENF is Early Neolithic Farmers, the people who entered Europe (from the Near East) some 7000 ybp.


http://i60.tinypic.com/2zfqxpt.jpg

Longbowman
06-19-2015, 12:29 AM
WHG is 'Western Hunter Gatherer,' a Mesolithic European component. ENF is 'Early Neolithic Farmer,' i.e., Middle Eastern.

The maps Yuffayur is posting were done, however, for the more accurate K8, whereas you are using ANE K7. Trust me, I made them ;)

WHG-UHG isn't the same as WHG, it includes a WHG-like subcomponent found in the Middle Eastern component.

You'd probably get very little WHG.

Kamal900
06-19-2015, 12:34 AM
I couldn't find the calculator on the GEDmatch website. Can you give me the link to the calculator or something?

Longbowman
06-19-2015, 12:36 AM
You have to donate $20 to Eurogenes, on the Eurogenes blog, and send them your 23andMe data, and Davidski will run the K8, K6 and perhaps another run for you, and give you charts.

Yuffayur
06-25-2015, 01:13 PM
Copts also have a lot of B and A; I forgot to mention it.