Detailed specific formal study of IA populations affecting Iranics
Background:
About a year ago I decided to commission a serious detailed and specific study of Iron Age and Bronze Age populations affecting Iranics using formal methods. I was tired of amateurish calculators giving results that many times didn't make much sense.
I commissioned Eurasian DNA to do this study because of their proficiency in DNA processing and innovation in adapting formal methods to specific purposes and because they do more customized studies on West, Central, and South Asia than other outfits.
One of the first things Dilawer from Eurasian DNA told me is that if I wanted results as accurate as possible I would have to whole genome sequence myself, since the 1240K panel that the ancients are genotyped on has low overlap with the 23andMe, FTDNA, and AncestryDNA panels. He also mentioned that he has noticed some genotyping errors from the commercial companies when he has compared their samples with the same samples done with 30X sequencing.
So I decided to do whole genome sequencing for me and my mom so that Dilawer could use our data with the 1240K ancients for maximum accuracy and SNP overlap.
The other thing Dilawer cautioned was that there was a limited number of modern samples and populations that had maximum overlap with the 1240K SNP ancients. I told him I was fine with that since I just wanted Kurds compared with a couple of other West Asians and maybe a couple of E Europeans and S Asians.
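The overlap issue described above can be checked by intersecting the SNP IDs of two panels. Here is a minimal sketch in Python, assuming each panel is available as a plain list of SNP identifiers. The rsIDs and function names below are made-up placeholders for illustration, not real panel contents or anything Eurasian DNA uses:

```python
def snp_overlap(panel_a, panel_b):
    """Return the SNP IDs present in both panels."""
    return set(panel_a) & set(panel_b)


def overlap_fraction(panel_a, panel_b):
    """Fraction of panel_a's SNPs that are also in panel_b."""
    unique_a = set(panel_a)
    if not unique_a:
        return 0.0
    return len(unique_a & set(panel_b)) / len(unique_a)


# Hypothetical toy panels (placeholder IDs, for illustration only).
ancient_panel = ["rs1", "rs2", "rs3", "rs4", "rs5"]
commercial_chip = ["rs2", "rs5", "rs9"]

shared = snp_overlap(ancient_panel, commercial_chip)
print(sorted(shared), overlap_fraction(ancient_panel, commercial_chip))
```

With real data the same intersection would be run on the SNP lists of the ancient dataset and the modern sample, and only the shared SNPs can enter the analysis.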
Methods Used:
Dilawer adapted the ADMIXTOOLS qpWave method using the following 11 outgroups to get rid of junk SNPs that were shared by many due to very deep ancestry:
South_Africa_2000BP.SG
Russia_Ust_Ishim.DG
Russia_Shamanka_Eneolithic
Onge_1000G
Russia_HG_Karelia.SG
Serbia_Mesolithic_IronGates
DevilsCave_N.SG
Anatolia_N
Karitiana.DG
Iran_GanjDareh_N
Georgia_Kotias.SG
So in other words these outgroups help base the analysis on derived alleles and not on shared ancestral alleles, which skew a lot of calculators.
Dilawer is also very strict with the quality of ancient samples so in instances where there are many samples of a population he only used the highest quality ones and discarded the others.
Here are the quantities of samples from each outgroup population (n):
South_Africa_2000BP.SG ; n=3
Russia_Ust_Ishim.DG ; n=1
Russia_Shamanka_Eneolithic ; n=10
Onge_1000G ; n=6
Russia_HG_Karelia.SG ; n=1
Serbia_Mesolithic_IronGates ; n=29
DevilsCave_N.SG ; n=4
Anatolia_N ; n=29
Karitiana.DG ; n=3
Iran_GanjDareh_N ; n=4
Georgia_Kotias.SG ; n=1
Formal methods used
I found this explanation of the method Dilawer used in the admixr tutorial (https://bodkan.net/admixr/articles/tutorial.html):
To run qpWave, you must provide a list of left and right populations (using the terminology of Haak et al. 2015 above). The aim of the method is to get an idea about the number of migration waves from right to left (with no back-migration from left to right!). This is done by estimating the rank of a matrix of all possible F4 statistics
F4(left1,lefti;right1,rightj),
where left1 and right1 are some fixed populations and the i and j indices run over all other possible choices of populations.
The qpWave() function returns a data frame which shows the results of a series of matrix rank tests. The rank column is the matrix rank tested, df, chisq and tail give the degrees of freedom, X2 value and p-value for the comparison with the saturated model (the p-value then indicates which matrix rank is consistent with the data - see example below), and dfdiff, chisqdiff and taildiff give the same, but always comparing a model to the model with one rank less.
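To make the F4 matrix above concrete, here is a minimal sketch in Python that computes F4 as the average product of allele-frequency differences and fills the matrix for a fixed left1 and right1. The allele frequencies are invented toy numbers and the function names are hypothetical. Real qpWave works on genotype data and adds a block jackknife and a formal rank test, all of which this sketch leaves out:

```python
def f4(p_a, p_b, p_c, p_d):
    """F4(A, B; C, D): mean over SNPs of (pA - pB) * (pC - pD)."""
    products = [(a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)]
    return sum(products) / len(products)


def f4_matrix(left, right):
    """Rows: left_i (i > 1); columns: right_j (j > 1); left1/right1 fixed."""
    left1, right1 = left[0], right[0]
    return [[f4(left1, l_i, right1, r_j) for r_j in right[1:]]
            for l_i in left[1:]]


# Toy allele frequencies at 4 SNPs (invented for illustration).
left = [
    [0.2, 0.4, 0.6, 0.8],  # left1
    [0.3, 0.3, 0.6, 0.8],  # left2: differs from left1 in a way unrelated to the rights
    [0.2, 0.4, 0.7, 0.6],  # left3: differs from left1 in a way correlated with the rights
]
right = [
    [0.5, 0.5, 0.5, 0.5],  # right1
    [0.3, 0.3, 0.4, 0.8],  # right2
    [0.1, 0.1, 0.5, 0.2],  # right3
]

matrix = f4_matrix(left, right)
for row in matrix:
    print([round(value, 4) for value in row])
```

In this toy data the row for left2 is zero up to floating-point rounding, meaning left1 and left2 relate to the right populations in the same way. The row for left3 is clearly nonzero, so a matrix containing it cannot have rank 0.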
Targeted Ancients:
POPULATION
Turkmenistan_IA ; n=1
Sarmatian-Kazakhstan ; n=2
Kura-Araxes-EBA ; n=4
Iran_C_HajjiFiruz ; n=4
Iran_IA_Hasanlu ; n=4
Iran-BA-HajjiFiruz ; n=1
Turkmenistan_Gonur_BA1 ; n=8
Alan-Russia ; n=5
Saka-TianShan-IA ; n=6
Iran-IA-HajjiFiruz ; n=1
Iran-ShahrISokhta-BA1 ; n=5
Sarmatian-Russia ; n=6
Pakistan-Barikot-IA ; n=3
Sintashta-MLBA ; n=16
Voskos
03-29-2020, 02:44 PM
Cool.Where's the results?
I was waiting for some further explanation from Dilawer on the technical aspect of the methodology because it's difficult to understand so that I can explain it better. I'll start posting charts as soon as he gets back to me.
I think I understand the methodology now. First this is what he said
With a sufficient number of outgroups, which outnumber the sources (n), the qpWave matrix has at most n-1 independent columns (= rank) and at minimum zero nontrivial columns (i.e. a zero matrix). For example, two unadmixed West Asian left populations are a sister group to each other against all non-West Asian right populations. Then any F4(WAsian1, WAsian2; EAsian1, EAsian2) will be zero.
As per Dilawer, he is adapting the rank=0 test in qpWave, which gives a passing p-value for 2 source populations that are very closely related to each other, to infer the degree of relatedness for any 2 source populations.
As per Dilawer, if 2 sources are symmetrically related to ALL the outgroups, the p-value will be high and the chi-squared (X2) will be close to 0.
If they are very differentially related to ALL the outgroups, the chi-squared (X2) will be very high.
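A rough way to see why symmetric sources give a low X2 and a high p-value is the sketch below. It is a simplification, not Dilawer's or ADMIXTOOLS' actual computation: it assumes the F4 statistics are independent, so that X2 is just the sum of squared Z-scores, whereas qpWave uses a full covariance matrix estimated by block jackknife. The F4 values, standard errors, and function names are all invented for illustration. As I understand it, with 2 left populations and 11 outgroups the real test would have 10 statistics and 10 degrees of freedom; the toy example uses 3 to stay short:

```python
import math


def rank0_chisq(f4_values, std_errors):
    """Sum of squared Z-scores; assumes the f4 statistics are independent."""
    return sum((f / se) ** 2 for f, se in zip(f4_values, std_errors))


def chi2_tail(x, df):
    """Upper-tail p-value of a chi-squared distribution with df degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df=2 for even df, df=1 for odd df.
    if df % 2 == 0:
        q, k = math.exp(-half), 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(k + 2) = Q(k) + half**(k/2) * exp(-half) / Gamma(k/2 + 1).
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Invented f4 values and standard errors for two scenarios.
se = [0.0002, 0.0002, 0.0002]
symmetric = [0.0001, -0.0002, 0.0001]     # all within about 1 SE of zero
differential = [0.0030, -0.0024, 0.0040]  # many SEs away from zero

for name, values in [("symmetric", symmetric), ("differential", differential)]:
    x2 = rank0_chisq(values, se)
    print(name, round(x2, 2), chi2_tail(x2, len(values)))
```

The symmetric case gives a small X2 and a p-value that passes, while the differential case gives an X2 in the hundreds and a p-value that is effectively zero. That is the scale the charts in this thread are built on.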
Only the most densely genotyped samples were used in the study, to get to 700,000 overlapping SNPs with the ancients. That's why there is a limited number of moderns used.
I'll start with Chalcolithic Iranian Zagrosian HajiFiruz from the Kurdistan area. Looking at the results it looks like there is no modern population in the area that would be extremely similar because of the Indo-Europeanization of the area after his time.
1- Haji-Firuz-Chl's immediate relatives, if they still existed, would score 1400 in this chart. Even though Kurmanji Kurds and Armenians are close, they are sufficiently different due to the Indo-Europeanization (IE) of the region after his time.
2- Notice the distance between Estonians and Indo-Iranians is very large at this time. You will see that this distance decreases with the Iron Age HajiFiruz after there is influx of steppe into this region.
https://i.imgur.com/3JUDOQx.jpg
Here is Iron Age Hasanlu. Here you see Iranics getting much closer to E Europeans after Indo-Europeanization (look at the scale: 600 instead of 1400).
The maximum chi-squared spread between Iranics and Estonians is only about 550, vs about 1200 in the Chalcolithic period before Iran had an influx of steppe.
https://i.imgur.com/MbgBYSr.jpg
With respect to the relatedness to the higher quality 2 Sarmatian Kazakhstan samples N Ossetians and Kurmanji Kurds top the list and are shoulder to shoulder. However you'll see later that the gap between N Ossetians and Kurmanji Kurds grows a little with the 6 Alan samples. N Ossetians take the lead by a little.
I was surprised to see Armenians that far down.
https://i.imgur.com/Fe406qM.jpg
Okay so Armenians are the real Kurds/Medes/Aryans according to MS85/Eline?
Here we see that the BMAC people were very interesting and somewhat different from modern populations. The most similar would be Kurmanji Kurds and Baloch, but even they are not exactly their immediate relatives, otherwise their scores would have been 1800.
Also notice how far away Estonians are from BMAC folks
https://i.imgur.com/uxQpYkv.jpg
Okay so Armenians are the real Kurds according to MS85/Eline?
lol no. They are not significantly ahead. I think Kurds got a little more Sarmatian/Turkic after IA which diluted their Hasanlu a bit. You can see the Sarmatian chart from the previous page where Kurds are quite a bit more related to Sarmatians than Armenians are
lol no. Neither Armenians nor Kurds score 1400. HajiFiruz Chl was quite different from any moderns in that area because of all the steppe that came later, so I suppose Armenians did not receive as much steppe as Kurds and their HajiFiruz Chl was diluted a little less.
Yes but Eline/MS85 claimed IA Hasanlu is a Mede/Aryan.
But again it shows talking about Iranics (Persians, Kurds etc) without migration from North East is pointless. The native element, which could be part Proto-Armenian, is shared with Western Iranians, but it's not Iranic.
Turkmenistan-IA is very Iranic as you may have read in other places
Also notice the gap between Iranics and Estonians is small compared to the other charts I posted
https://i.imgur.com/GZwhc3L.jpg
True
With Haji-Firuz-IA, Kurds have an insignificant lead over Armenians, but as Kyp mentioned one should characterize IA Haji Firuz and Hasanlu as general NW Iranic/Armenian instead of assigning them to a specific modern group.
https://i.imgur.com/KS5Anf0.jpg
I'll post the others later. Sintashta's chart is interesting, with Estonians holding a huge lead, and so is Pakistan-Barikot-IA.
Voskos
03-29-2020, 07:20 PM
It would've been nice to include Iranians among the populations. Cool study.
I asked him to include a couple of Persians, but he said he only had 2 Persian samples with 700,000 SNP overlap with the ancients, and they were ambiguous because they were classified as Persian migrants to Kuwait. The HGDP, 23andMe, FTDNA, and Ancestry Persian samples unfortunately don't have the required SNP overlap with these ancient samples, and we wanted maximum accuracy.
Compare the previously posted Turkmenistan-IA to this Sintashta-MLBA chart. I think the reason there is a bigger spread between Estonians and Iranics in the Sintashta chart is because whereas Turkmenistan-IA had BOTH Iranic and steppe admixture in decent quantities, Sintashta lacked the Iranic admixture putting it further from Iranics and increasing the spread between Estonians and Iranics.
These are some of the things that amateur calculators don't readily reveal just like variation between Armenians and Kurds when it comes to Central Asian admixture.
https://i.imgur.com/WFx37OY.jpg
Of all the charts I have, Estonians are furthest away from Shahr-e-Sokhteh and BMAC. They are in fact closer even to Pakistan-Barikot-IA, probably because it had some steppe admixture whereas the other 2 didn't.
This one is for the least ASI Shahr-e-Sokhteh group. Here again we see substantial differences between Kurds and Armenians not accurately portrayed by amateur calculators.
https://i.imgur.com/f2NgDyN.jpg
This one is for the Pakistan-Barikot-IA group. Notice how close to max the Pak Balochi samples are. If a test sample were a close relative of the Barikot samples they would score about 1250.
https://i.imgur.com/HMnMXQD.jpg
The 5 highest quality Russian Alan samples. As expected N Ossetians show the highest relatedness to them, followed by Kurds. The reason formal methods don't show E Europeans as high as amateur calculators do is that the calculators count relatedness due to very ancient alleles. Ancient EHG and WHG, which are higher in Europeans than in W Asians, increase relatedness to Sarmatians and Alans in amateur calculators, but these ancestral SNPs are discounted by formal methods. That's why you see a more accurate picture with formal methods.
These tests use 11 ancestral outgroups for this purpose. They are shown on page 1.
https://i.imgur.com/rFIQcrT.jpg
I just noticed something comparing the Sarmatian-Kazakhstan relatedness chart to the Alan relatedness chart. Armenian positions are quite different in the 2 charts. In the Sarmatian chart the Kurd-Armenian spread is about 350 chi-squared, but in the Alan chart the spread is only 80 chi-squared.
I'm inclined to think that Alans have some sort of Anatolian/Kura-Araxes admixture that puts them closer to Armenians than Sarmatians are to Armenians or maybe Kazakh Sarmatians have more of the E Asian that puts them further away from Armenians. Feel free to comment.
Also notice the very different Estonian position in the Sarmatian vs Alan chart.
Eline
04-04-2020, 06:37 AM
Please elaborate more. Why do you think EHG in Alans is ancient, and why don't you discount ancient CHG? It doesn't make any sense.
Eline
04-04-2020, 08:38 AM
Okay so Armenians are the real Kurds/Medes/Aryans according to MS85/Eline?
Not really.
I don't know how these strange calculators work. Something is wrong with them.
Ancient Iranians were much more related to Kurds than to Armenians. Ancient Iranians had much more 'Gedrosia' DNA in them.
That Hasanlu guy (F38) had 28.81% Gedrosia, as much as modern Kurds. Armenians have much less of it.
https://i.postimg.cc/KcCrzYGs/Iranians.png
Looks more like a mutt to me. Closeness to Kurds is probably just because you guys are mixed with Mesopotamians.
Eline
04-04-2020, 12:59 PM
Kurds are actually the very definition of Northern Mesopotamians, with some minor admixture from the Caucasus (Hurrians, Scythians etc.) and South-Central Asia (Parthians).
Armenians are much more shifted toward the Mediterranean Sea/Levant.
With respect to the relatedness to the higher quality 2 Sarmatian Kazakhstan samples N Ossetians and Kurmanji Kurds top the list and are shoulder to shoulder. However you'll see later that the gap between N Ossetians and Kurmanji Kurds grows a little with the 6 Alan samples. N Ossetians take the lead by a little.
I was surprised to see Armenians that far down.
https://i.imgur.com/Fe406qM.jpg
One of the biggest differences between Armenians and Kurds.