My top 20 and graph by mnd661 from ABF
http://img28.imageshack.us/img28/1334/no2b.jpg
Code:
SE4
SE5
NO3
EE1
NO1
SE8
US71
SE1
SE2
NO5
NO6
NO7
US61
US182
US111
SE6
North_Russian
US5
EE2
RU
When is a genetic map also a geographic map? Always and never
Quote:
Just after posting my last entry I realized that it might seem a bit confusing to those who don't really have much experience in reading PCA/MDS plots. So instead of rehashing it, I decided to make another entry about how to read such plots, which might come in handy to my project members. It's really not that difficult, if you keep in mind that they're never really geographic maps, but at the same time, always contain at least some results that show high correlation with geography.
In that last blog entry about the Balto-Slavs I concluded that, in terms of intra-European genetic diversity, Balts were more easterly than Slavs. I based this on both PCA/MDS plots and genetic distances. The reason I did this was because, as mentioned above, PCA/MDS results don't always gel with geography, especially when relative genetic isolates are included in the analysis. The plot I used from Nelis et al. 2009 showed clearly how adding a relative genetic isolate, like Eastern Finns from Kuusamo, can wreck the correlation between genes and geography.
http://i129.photobucket.com/albums/p..._Europe2-2.png
Adding the samples from the Kuusamo isolate basically means that "east" is no longer in the same direction for all the samples on that plot. If we are to assume that it is in the same direction, we get absurd results like Swedes being more easterly than Russians. Obviously, there are now two, perhaps more, directions which correlate well with the geographic "east" within Europe. As you can see, I marked the two obvious ones on that plot as East 1 and East 2.
So now, it seems, we're facing a problem. Which "east" applies to the Balts and Slavs? If it's East 1, then Russians are more easterly than Lithuanians. If it's East 2, then it's the other way around. But not to worry, because it's easy to work that out. The simplest thing to do is to focus on the samples we're interested in, and zoom in on that area of the plot. So let's just leave the Swedes, North Germans, Czechs, Poles, Russians, Lithuanians and Latvians in the analysis, and ignore the Finnic and Southern European samples.
http://i129.photobucket.com/albums/p..._Europe3-1.png
OK, so now the plot makes much more sense; Swedes and North Germans are located west of Poles and Russians, as we'd expect. At the same time, if we're to follow this line of thinking, Lithuanians and Latvians are located east of Poles and Russians. Just to make sure this is correct, let's see what happens on a plot that doesn't include the Kuusamo isolate.
http://img251.imageshack.us/img251/4...12overview.png
The difference is clear; there's now only one "east", which runs towards the bottom of the plot (i.e. west to east = vertical axis).
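The effect described above, where a relative isolate pulls the main axis of a plot away from the west-east direction, can be reproduced with toy numbers. This is only a minimal sketch: the coordinates are made up (they are not genotype data from Nelis et al. or any other study), the names `leading_axis`, `cline` and `isolate` are hypothetical, and it uses the closed-form first principal axis of a 2-D point cloud rather than a full PCA/MDS run.

```python
import math

def leading_axis(points):
    """Unit vector along the first principal axis of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Angle of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    return (math.cos(theta), math.sin(theta))

# Hypothetical cline: nine samples spread west to east along x,
# with almost no variation along y.
cline = [(-2 + 0.5 * i, 0.1 if i % 2 == 0 else -0.1) for i in range(9)]

# Hypothetical isolate: a tight cluster displaced far along y.
isolate = [(0.5 + 0.1 * (i - 4), 4.0 + (0.1 if i % 2 == 0 else -0.1))
           for i in range(9)]

axis_without = leading_axis(cline)
axis_with = leading_axis(cline + isolate)

print("leading axis without isolate:", axis_without)
print("leading axis with isolate:   ", axis_with)
```

Without the isolate, the leading axis lies along x, the toy "west to east" direction. With the isolate included, it swings almost entirely onto y, so position along that axis no longer tracks west versus east for the cline samples.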
In order to cement these findings, we can now look at some pairwise intra-European genetic distances. Let's double check, for instance, that the Lithuanians and Latvians are indeed more easterly than the Poles and Russians, as opposed to just being, say, more northerly. For this I use the same table as in my last blog entry.
Russians from Tver versus...
- Utah Americans of Northern and Western European origin (1.56)
- French (1.94)
- Swedes (1.59)
- Finns from Helsinki (2.10)
- Southern Italians (2.68)
- Spanish (2.32)
Lithuanians versus...
- Utah Americans of Northern and Western European origin (1.74)
- French (2.20)
- Swedes (1.74)
- Finns from Helsinki (2.33)
- Southern Italians (2.96)
- Spanish (2.62)
Lithuanians and Latvians are much less similar to western Europeans than Russians are. Therefore, they are more easterly than Russians in terms of intra-European genetic diversity. At the same time, they're not more similar to Northern European populations such as Swedes and Finns. Therefore, any claims, for instance, that they're simply more northerly than Russians don't hold up. In fact, based on all the pairwise scores, the best way to describe the situation is to say that Russians are more mainstream as far as intra-European genetic diversity is concerned, while there's something fairly unique about the Balts, which is especially evident when looking at Latvians.
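The comparison between the two lists above can be tabulated directly. A minimal sketch, using only the distances quoted in this post for Russians from Tver and Lithuanians (no Latvian figures are listed here, so none are included); the dictionary and function names are hypothetical.

```python
# Pairwise distances as quoted above (reference population -> distance).
russians_tver = {
    "Utah Americans (N/W European origin)": 1.56,
    "French": 1.94,
    "Swedes": 1.59,
    "Finns from Helsinki": 2.10,
    "Southern Italians": 2.68,
    "Spanish": 2.32,
}
lithuanians = {
    "Utah Americans (N/W European origin)": 1.74,
    "French": 2.20,
    "Swedes": 1.74,
    "Finns from Helsinki": 2.33,
    "Southern Italians": 2.96,
    "Spanish": 2.62,
}

def distance_gaps(a, b):
    """For each reference population, how much farther b is than a."""
    return {ref: round(b[ref] - a[ref], 2) for ref in a}

gaps = distance_gaps(russians_tver, lithuanians)
for ref, gap in gaps.items():
    print(f"{ref}: Lithuanians farther by {gap:.2f}")

# True if Lithuanians are farther than Russians from every reference.
all_larger = all(gap > 0 for gap in gaps.values())
print("Lithuanians farther from every reference:", all_larger)
```

On these figures the Lithuanian distance is larger for every reference population, including Swedes and Finns from Helsinki, which is the pattern the paragraph above relies on.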
As I've already noted in my last blog entry, I can't see this as being a recent development. Rather, the close genetic relationship between current Northern Slavs and Balts seems like the recent development, and likely due to mixing in the last 1000 years or so. Before that, I suspect, these two groups were much more distinct from each other than they are today.
By the way, there are two points I'd like to stress before signing off. Firstly, it's important to understand that what is "east" on an intra-European PCA/MDS plot, need not be "east" on an inter-continental plot. For instance, consider a Central European ethnic group with some minor East Eurasian admixture, and an Eastern European group with less East Eurasian admixture. The former will cluster "west" of the latter on an intra-European plot. However, when East Asian samples are added to the analysis, it then becomes an inter-continental plot, and the Central European group with the more significant Asian influence will pull "east" further than the Eastern Europeans. You can actually see something like that on the following two plots I published recently; intra-North Eurasian where Lithuanians cluster west of Hungarians, and intra-European, where the situation is reversed.
http://img255.imageshack.us/img255/6818/neura12.png
http://img251.imageshack.us/img251/4...12overview.png
Anyone with an interest in these sorts of analyses should try and sort this out in their minds before attempting any sort of interpretation of PCA/MDS plots.
Secondly, and this goes without saying, but I'll say it anyway; beware of making generalizations about entire language groups based on small sample sets. For instance, it's not reasonable to draw inferences about the relationship between Balts and Slavs based on the 10 Lithuanians from the Behar et al. study. These are just 10 people, possibly from near the Belorussian and Polish borders, and might be very different from their countrymen from another part of Lithuania, and even more distinct from Latvians. However, it is reasonable to do what I did, and that was to look at sample sets of tens and hundreds of individuals, from Poland, Lithuania, Latvia and Russia, featured in recent peer-reviewed studies. As I say, this is all pure logic, but sometimes it needs to be reiterated.
There's not much to do really. Just load the samples into ADMIXTURE correctly, and give them a spin.
There you have it, mystery solved.
You can call it the pink cluster if that makes you feel better.
Quote:
The Southeast European cluster appears highest in the Turks and Jews but not in the Southeast Europeans, so that’s a bad name for it.
Have you ever heard of rounding off figures to the nearest per cent?
Quote:
For example, PL1 sums to 1.02 and CA5 sums to 0.98. I would like to see more clusters. Dienekes will be doing 64 clusters. With more clusters, we would be able to notice more differences among the samples. Polako does have an agenda though.
By the way, what's my agenda? Please do tell. Mind you, if you come up with some crock of shit I'm gonna pressure you to prove it. And you better do a good job by loading up these samples into ADMIXTURE and showing us exactly how I "manipulated" them.
If you can't do that, then I'd suggest you keep your conspiracy theories to yourself.
I don't make mistakes like that. It's a sample from the Finnish HapMap. You can see how he's behaving in a study done by the Finns (look for the southernmost Finn).
Quote:
Maybe Polako made a mistake? It’s not from someone that submitted his/her raw data to him but from a dataset.
Founder population-specific HapMap panel increases power in GWA studies through improved imputation accuracy and CNV tagging - supp info
Not gonna work, because ADMIXTURE goes ballistic at high Ks, especially with closely related groups.
Quote:
With more clusters, it should be more apparent.
Europe at K=5 is all I can do without hundreds of samples from each country and ethnic group. Maybe 6 might work, but that'd take all night.
Obviously, they're related. Again, mystery solved. Glad I could help out.
Quote:
What happened to NO4? Some Germans are missing. Maybe they requested withdrawal?
I find it funny that you actually decided to post this in public, without the foggiest idea how ADMIXTURE works.
The program picks who has the highest of whatever, and the cluster that seems "Northern European" always peaks in Lithuanians.
If you don't think so, then try it. Everything's online. I'll give you $500 if you put it all together properly, and then miraculously prove myself and Behar wrong by showing that Lithuanians aren't the modal group for that component.
But if you fail, you owe me $500.
Bwahaha...yeah, you do that sunshine.
Quote:
Now if I made a genetic research and started collecting 23andme raw data, I could take the Scanian(southern-Swedish) population as my Northern-European sample population and then I could claim that Scanians are the most Northern-European Europeans, followed by Danes and Swedes.
Basque 13%
Mediterranean 2%
Southeast European 3%
Baltic Finnish 7%
North Euro 75%
Top matches, excluding Americans, Canadians etc.:
Quote:
- BY2
- IE3
- NO7
- IE4
- FR
- IE6
- UK16
- UK4
- DE1
- FR
- IE9
- IE8
- DE9
- RU11
- IE10
Something like that I expected. Parameters can be changed. There's more genetic diversity in there, so that explains the not so distinct cluster. Jews and Turks aren’t Southeastern European. As for cluster names, those that know more history, migrations, archaeology, etc. would understand them better. For those that don't, it's misleading.
I've programmed using rounding algorithms. It's obviously rounding up whether it's round-ceiling or round-half-up, or whatever. Comparing the two extremes, 98% and 102%, there's a substantial 4% difference. Rounding to the nearest whole percent isn’t causing that. You or the spreadsheet is causing the limitation. You should fix it.
Quote:
Have you ever heard of rounding off figures to the nearest per cent?
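Whether nearest-percent rounding alone can produce totals of 98 or 102 is easy to check. A minimal sketch, assuming five clusters as in this run; the proportions below are made up for illustration (they are not PL1's or CA5's actual values) and `rounded_percent_total` is a hypothetical helper.

```python
def rounded_percent_total(proportions):
    """Round each proportion to the nearest whole percent, then sum."""
    return sum(round(p * 100) for p in proportions)

# Hypothetical five-cluster results; each list sums to 1.0 before rounding.
mostly_rounds_up = [0.206, 0.206, 0.206, 0.206, 0.176]
mostly_rounds_down = [0.194, 0.194, 0.194, 0.194, 0.224]

total_up = rounded_percent_total(mostly_rounds_up)
total_down = rounded_percent_total(mostly_rounds_down)

print(total_up)    # 21 + 21 + 21 + 21 + 18
print(total_down)  # 19 + 19 + 19 + 19 + 22
```

Each rounded value is off by less than half a point, so with five clusters the total can drift by up to about 2.5 points either way; these two made-up cases land on 102 and 98. This only shows that such totals are arithmetically possible under nearest-percent rounding, not how the actual spreadsheet was produced.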
By the way, what's my agenda? Please do tell. Mind you, if you come up with some crock of shit I'm gonna pressure you to prove it. And you better do a good job by loading up these samples into ADMIXTURE and showing us exactly how I "manipulated" them.
If you can't do that, then I'd suggest you keep your conspiracy theories to yourself.
In order for me to prove that, you would have to give me all the data. It's not necessarily the results but the interpretations. Conspiracies are fun. I'm not the only one.
On your global MDS, some SNPs you picked between v3 and v2 data didn't overlap as they were no-calls. That caused skewness.
Quote:
I don't make mistakes like that. It's a sample from the Finnish HapMap. You can see how he's behaving in a study done by the Finns (look for the southernmost Finn).
Founder population-specific HapMap panel increases power in GWA studies through improved imputation accuracy and CNV tagging - supp info
Upgrade your hardware or wait longer. v1.1 can use parallel processing on multi-core processors. Still, these simulations, and programs like STRUCTURE, which uses a different algorithm, can have problems if the user uses them wrong.
Quote:
Not gonna work, because ADMIXTURE goes ballistic at high Ks, especially with closely related groups.
Europe at K=5 is all I can do without hundreds of samples from each country and ethnic group. Maybe 6 might work, but that'd take all night.
It wasn’t obvious, but it was a mystery.
Quote:
Obviously, they're related. Again, mystery solved. Glad I could help out.
*Loki pours a glass of Tokaj while making this post of his top matches*
(thanks Graham)
Quote:
1.HU
2.HU
3.HU
4.HU
5.HU
6.HU
7.CH1
8.HU
9.HU
10.HU
11.HU
12.HU
13.DE7
14.HU
15.US46
Polako did his run? I was supposed to be in the next one, but I never got my name.
Fine scale analysis of Eurogenes' British and Irish
http://i129.photobucket.com/albums/p217/dpwes/UK.png
Spreadsheet
Quote:
Key: Red = Belarussian + Lithuanian (Baltic?), Yellow = Spanish (Iberian), Green = French (Atlantic?), Aqua = Hungarian (Central European), Blue = Italian (Southern European), Pink = Norwegian + Swedish (Scandinavian).
Fine scale analysis of Eurogenes' Scandinavians
http://i129.photobucket.com/albums/p217/dpwes/NOR-1.png
(the second bar is me)
Spreadsheet
Quote:
Key: Red = Belorussian + Lithuanian (Baltic?), Orange = East Finnish (Finnic), Green = French (Atlantic?), Aqua = Hungarian (Central European), Blue (not recorded) = Italian (Southern European), Dark Blue = Nganassan + Koryak + Yakut (Siberian), Lezgin (Caucasus or maybe Odin?).
Fine scale analysis of Eurogenes' Finns and Estonians
http://i129.photobucket.com/albums/p217/dpwes/FIN-1.png
Key: Red = Belorussian + Lithuanian (Balto-Slavic?), Orange = Pathan + Burusho (South Central Asian), Light Green = Finnish (Finnic), Green = French (Atlantic?), Aqua = Hungarian (Central European), Blue = Italian (Southern European), Dark Blue = Karitiana + Pima + Koryak (Amerindian), Purple = Koryak + Nganassan + Yakut (Siberian), Pink = Norwegian + Swedish (Scandinavian).
Spreadsheet
The two first bars are Estonians, EE1 and EE2.
It's pretty interesting, how the founder population in Lithuania has had an effect on these charts.
When purely looking at Y-DNA, Lithuania is the 2nd most Finno-Ugric country in Europe, after Finland. The red "Balto-Slavic" genetic group that forms there quite definitely has a very large "southern-Finnic" component, which is automatically labeled as Balto-Slavic, thus inflating the "Balto-Slavic" share in native southern-Finnic populations.
What's the difference between the "Baltic" in the Scandinavian charts and the "Balto-Slavic" in the Estonian-Finnish chart?