Eurogenes Biogeographic Ancestry Project

02-27-2011, 08:24 PM
Pallantides

My top 20 and graph by mnd661 from ABF

Code:

SE4 SE5 NO3 EE1 NO1 SE8 US71 SE1 SE2 NO5 NO6 NO7 US61 US182 US111 SE6 North_Russian US5 EE2 RU

http://img28.imageshack.us/img28/1334/no2b.jpg
02-28-2011, 05:59 AM
Pallantides

When is a genetic map also a geographic map? Always and never

Quote:

Just after posting my last entry I realized that it might seem a bit confusing to those who don't really have much experience in reading PCA/MDS plots. So instead of rehashing it, I decided to make another entry about how to read such plots, which might come in handy to my project members. It's really not that difficult, if you keep in mind that they're never really geographic maps, but at the same time, always contain at least some results that show high correlation with geography.

In that last blog entry about the Balto-Slavs I concluded that, in terms of intra-European genetic diversity, Balts were more easterly than Slavs. I based this on both PCA/MDS plots and genetic distances. The reason I did this was because, as mentioned above, PCA/MDS results don't always gel with geography, especially when relative genetic isolates are included in the analysis. The plot I used from Nelis et al. 2009 showed clearly how adding a relative genetic isolate, like Eastern Finns from Kuusamo, can wreck the correlation between genes and geography.

http://i129.photobucket.com/albums/p..._Europe2-2.png

Adding the samples from the Kuusamo isolate basically means that "east" is no longer in the same direction for all the samples on that plot. If we are to assume that it is in the same direction, we get absurd results like Swedes being more easterly than Russians. Obviously, there are now two, perhaps more, directions which correlate well with the geographic "east" within Europe. As you can see, I marked the two obvious ones on that plot as East 1 and East 2.

So now, it seems, we're facing a problem. Which "east" applies to the Balts and Slavs? If it's East 1, then Russians are more easterly than Lithuanians. If it's East 2, then it's the other way around. But not to worry, because it's easy to work that out. The simplest thing to do is to focus on the samples we're interested in, and zoom in on that area of the plot. So let's just leave the Swedes, North Germans, Czechs, Poles, Russians, Lithuanians and Latvians in the analysis, and ignore the Finnic and Southern European samples.

http://i129.photobucket.com/albums/p..._Europe3-1.png

OK, so now the plot makes much more sense; Swedes and North Germans are located west of Poles and Russians, as we'd expect. At the same time, if we're to follow this line of thinking, Lithuanians and Latvians are located east of Poles and Russians. Just to make sure this is correct, let's see what happens on a plot that doesn't include the Kuusamo isolate.

http://img251.imageshack.us/img251/4...12overview.png

The difference is clear; there's now only one "east", which runs towards the bottom of the plot (i.e. west to east = vertical axis).

In order to cement these findings, we can now look at some pairwise intra-European genetic distances. Let's double check, for instance, that the Lithuanians and Latvians are indeed more easterly than the Poles and Russians, as opposed to just being, say, more northerly. For this I use the same table as in my last blog entry.

Russians from Tver versus...

- Utah Americans of Northern and Western European origin (1.56)
- French (1.94)
- Swedes (1.59)
- Finns from Helsinki (2.10)
- Southern Italians (2.68)
- Spanish (2.32)

Lithuanians versus...

- Utah Americans of Northern and Western European origin (1.74)
- French (2.20)
- Swedes (1.74)
- Finns from Helsinki (2.33)
- Southern Italians (2.96)
- Spanish (2.62)

Lithuanians and Latvians are much less similar to western Europeans than Russians are. Therefore, they are more easterly than Russians in terms of intra-European genetic diversity. At the same time, they're not more similar to Northern European populations such as Swedes and Finns. Therefore, any claims, for instance, that they're simply more northerly than Russians don't hold up. In fact, based on all the pairwise scores, the best way to describe the situation is to say that Russians are more mainstream as far as intra-European genetic diversity is concerned, while there's something fairly unique about the Balts, which is especially evident when looking at Latvians.
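The comparison can be checked directly from the two lists of distances. This is a minimal sketch in Python that uses only the six values quoted for each group. The variable names are made up for illustration, the units are whatever the source table uses, and no Latvian values are listed here, so only Lithuanians are compared:

```python
# Pairwise genetic distances as quoted in the two lists (units as in the source table).
russians_tver = {
    "Utah Americans (N/W European)": 1.56,
    "French": 1.94,
    "Swedes": 1.59,
    "Finns (Helsinki)": 2.10,
    "Southern Italians": 2.68,
    "Spanish": 2.32,
}
lithuanians = {
    "Utah Americans (N/W European)": 1.74,
    "French": 2.20,
    "Swedes": 1.74,
    "Finns (Helsinki)": 2.33,
    "Southern Italians": 2.96,
    "Spanish": 2.62,
}

# How much further Lithuanians sit from each reference population than Russians do.
extra = {pop: round(lithuanians[pop] - russians_tver[pop], 2) for pop in russians_tver}

for pop, diff in extra.items():
    print(f"{pop}: {diff:+.2f}")

# True only if Lithuanians are further from every reference population listed,
# including the northern ones (Swedes, Finns).
lithuanians_always_further = all(diff > 0 for diff in extra.values())
print("Further from every reference:", lithuanians_always_further)
```

Every difference comes out positive, including those for Swedes and Finns, which is the point being made: the extra distance is not explained by Lithuanians simply being more northerly.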

As I've already noted in my last blog entry, I can't see this as being a recent development. Rather, the close genetic relationship between current Northern Slavs and Balts seems like the recent development, and likely due to mixing in the last 1000 years or so. Before that, I suspect, these two groups were much more distinct from each other than they are today.

By the way, there are two points I'd like to stress before signing off. Firstly, it's important to understand that what is "east" on an intra-European PCA/MDS plot, need not be "east" on an inter-continental plot. For instance, consider a Central European ethnic group with some minor East Eurasian admixture, and an Eastern European group with less East Eurasian admixture. The former will cluster "west" of the latter on an intra-European plot. However, when East Asian samples are added to the analysis, it then becomes an inter-continental plot, and the Central European group with the more significant Asian influence will pull "east" further than the Eastern Europeans. You can actually see something like that on the following two plots I published recently; intra-North Eurasian where Lithuanians cluster west of Hungarians, and intra-European, where the situation is reversed.

http://img255.imageshack.us/img255/6818/neura12.png http://img251.imageshack.us/img251/4...12overview.png

Anyone with an interest in these sorts of analyses should try and sort this out in their minds before attempting any sort of interpretation of PCA/MDS plots.

Secondly, and this goes without saying, but I'll say it anyway: beware of making generalizations about entire language groups based on small sample sets. For instance, it's not reasonable to draw inferences about the relationship between Balts and Slavs based on the 10 Lithuanians from the Behar et al. study. These are just 10 people, possibly from near the Belorussian and Polish borders, and might be very different from their countrymen from another part of Lithuania, and even more distinct from Latvians. However, it is reasonable to do what I did, and that was to look at sample sets of tens and hundreds of individuals, from Poland, Lithuania, Latvia and Russia, featured in recent peer-reviewed studies. As I say, this is all pure logic, but sometimes it needs to be reiterated.
02-28-2011, 06:35 AM
Polako

Quote:

Originally Posted by Franz

The Southeastern European cluster makes no sense. Unlike the others, it’s not clear at all what this cluster means, or what he did.

There's not much to do really. Just load the samples into ADMIXTURE correctly, and give them a spin.

There you have it, mystery solved.

Quote:

The Southeast European cluster appears highest in the Turks and Jews but not in the Southeast Europeans, so that’s a bad name for it.

You can call it the pink cluster if that makes you feel better.

Quote:

For example, PL1 sums to 1.02 and CA5 sums to 0.98. I would like to see more clusters. Dienekes will be doing 64 clusters. With more clusters, we would be able to notice more differences among the samples. Polako does have an agenda though.

Have you ever heard of rounding off figures to the nearest per cent?
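To make the rounding point concrete: with five clusters, each share rounded to the nearest per cent can move by up to half a point, so a row can add up to anything from 98 to 102. Here is a quick sketch in Python with made-up shares (not taken from the spreadsheet), held as tenths of a per cent to keep floating point noise out of it:

```python
def round_to_percent(tenths):
    """Round a share given in tenths of a per cent to the nearest whole per cent (half up)."""
    return (tenths + 5) // 10

# Hypothetical K=5 rows; each adds up to exactly 100.0% before rounding.
row_high = [206, 206, 206, 206, 176]  # 20.6% x 4 + 17.6%
row_low = [194, 194, 194, 194, 224]   # 19.4% x 4 + 22.4%

for row in (row_high, row_low):
    assert sum(row) == 1000  # sanity check: 100.0% in tenths of a per cent
    rounded = [round_to_percent(t) for t in row]
    print(rounded, "sum =", sum(rounded))
```

The first row rounds to a total of 102 and the second to 98, even though both are exactly 100% before rounding.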

By the way, what's my agenda? Please do tell. Mind you, if you come up with some crock of shit I'm gonna pressure you to prove it. And you better do a good job by loading up these samples into ADMIXTURE and showing us exactly how I "manipulated" them.

If you can't do that, then I'd suggest you keep your conspiracy theories to yourself.

Quote:

Maybe Polako made a mistake? It’s not from someone that submitted his/her raw data to him but from a dataset.

I don't make mistakes like that. It's a sample from the Finnish HapMap. You can see how he's behaving in a study done by the Finns (look for the southernmost Finn).

Founder population-specific HapMap panel increases power in GWA studies through improved imputation accuracy and CNV tagging - supp info

Quote:

With more clusters, it should be more apparent.

Not gonna work, because ADMIXTURE goes ballistic at high Ks, especially with closely related groups.

Europe at K=5 is all I can do without hundreds of samples from each country and ethnic group. Maybe 6 might work, but that'd take all night.

Quote:

What happened to NO4? Some Germans are missing. Maybe they requested withdrawal?

Obviously, they're related. Again, mystery solved. Glad I could help out.

Quote:

Originally Posted by Karl

I find it extremely intriguing that Polako decided to take the Lithuanians as a sample population for "Northern-European". Knowing his theories about the history of European populations, I shouldn't be surprised.

This is one of the reasons why I do not trust these "one-man" genetic researches.

I'll give an example of how this can influence people: Loki told me that Lithuanians and Belorussians are the most Northern-European Europeans.

Now if I made a genetic research and started collecting 23andme raw data, I could take the Scanian (southern-Swedish) population as my Northern-European sample population and then I could claim that Scanians are the most Northern-European Europeans, followed by Danes and Swedes.

Some people take these "one-man" genetic researches very seriously, which I find funny.

I find it funny that you actually decided to post this in public, without the foggiest idea how ADMIXTURE works.

The program picks who has the highest of whatever, and the cluster that seems "Northern European" always peaks in Lithuanians.
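Roughly what "peaks in" means in practice: ADMIXTURE outputs a table of ancestry fractions, one row per individual and one column per cluster, and the modal group for a cluster is the population with the highest average fraction in that column. A toy sketch in Python follows. The numbers, population labels and function name are all invented for illustration and are not real output from any run:

```python
from collections import defaultdict

def modal_population(q_rows, labels, component):
    """Return (population with the highest mean fraction for one component, all means)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row, pop in zip(q_rows, labels):
        totals[pop] += row[component]
        counts[pop] += 1
    means = {pop: totals[pop] / counts[pop] for pop in totals}
    return max(means, key=means.get), means

# Invented ancestry fractions for K=3; column 0 plays the "Northern European" role.
q_rows = [
    [0.90, 0.05, 0.05],
    [0.86, 0.08, 0.06],
    [0.70, 0.20, 0.10],
    [0.66, 0.22, 0.12],
    [0.30, 0.60, 0.10],
]
labels = ["Lithuanian", "Lithuanian", "Swedish", "Swedish", "Italian"]

top, means = modal_population(q_rows, labels, component=0)
print(top, means)
```

Nobody chooses the reference population by hand in this setup. The cluster forms on its own, and the label just follows whichever group ends up with the highest share.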

If you don't think so, then try it. Everything's online. I'll give you $500 if you put it all together properly, and then miraculously prove myself and Behar wrong by showing that Lithuanians aren't the modal group for that component.

But if you fail, you owe me $500.

Quote:

Now if I made a genetic research and started collecting 23andme raw data, I could take the Scanian (southern-Swedish) population as my Northern-European sample population and then I could claim that Scanians are the most Northern-European Europeans, followed by Danes and Swedes.

Bwahaha...yeah, you do that sunshine.
02-28-2011, 07:49 PM
Graham
Basque 13%
Mediterranean 2%
Southeast European 3%
Baltic Finnish 7%
North Euro 75%

Top matches, excluding Americans, Canadians etc.
Quote:
BY2
IE3
NO7
IE4
FR
IE6
UK16
UK4
DE1
FR
IE9
IE8
DE9
RU11
IE10
03-01-2011, 07:20 AM
Franz

Quote:

Originally Posted by Polako

There's not much to do really. Just load the samples into ADMIXTURE correctly, and give them a spin.

There you have it, mystery solved.

You can call it the pink cluster if that makes you feel better.

Something like that I expected. Parameters can be changed. There's more genetic diversity in there, so that explains the not so distinct cluster. Jews and Turks aren’t Southeastern European. As for cluster names, those that know more history, migrations, archaeology, etc. would understand them better. For those that don't, it's misleading.

Quote:

Have you ever heard of rounding off figures to the nearest per cent?

By the way, what's my agenda? Please do tell. Mind you, if you come up with some crock of shit I'm gonna pressure you to prove it. And you better do a good job by loading up these samples into ADMIXTURE and showing us exactly how I "manipulated" them.

If you can't do that, then I'd suggest you keep your conspiracy theories to yourself.

I've programmed using rounding algorithms. It's obviously rounding up whether it's round-ceiling or round-half-up, or whatever. Comparing the two extremes, 98% and 102%, there's a substantial 4% difference. Rounding to the nearest whole percent isn’t causing that. You or the spreadsheet is causing the limitation. You should fix it.

In order for me to prove that, you would have to give me all the data. It's not necessarily the results but the interpretations. Conspiracies are fun. I'm not the only one.

Quote:

I don't make mistakes like that. It's a sample from the Finnish HapMap. You can see how he's behaving in a study done by the Finns (look for the southernmost Finn).

Founder population-specific HapMap panel increases power in GWA studies through improved imputation accuracy and CNV tagging - supp info

On your global MDS, some SNPs you picked between v3 and v2 data didn't overlap as they were no-calls. That caused skewness.

Quote:

Not gonna work, because ADMIXTURE goes ballistic at high Ks, especially with closely related groups.

Europe at K=5 is all I can do without hundreds of samples from each country and ethnic group. Maybe 6 might work, but that'd take all night.

Upgrade your hardware or wait longer. v1.1 can use parallel processing on multi-core processors. Still, these simulations, and programs like STRUCTURE, which uses a different algorithm, can have problems if the user uses them wrong.

Quote:

Obviously, they're related. Again, mystery solved. Glad I could help out.

It wasn’t obvious, but it was a mystery.
03-01-2011, 08:12 PM
Loki

*Loki pours a glass of Tokaj while making this post of his top matches*

Quote:

1.HU
2.HU
3.HU
4.HU
5.HU
6.HU
7.CH1
8.HU
9.HU
10.HU
11.HU
12.HU
13.DE7
14.HU
15.US46

(thanks Graham)
03-01-2011, 08:13 PM
Grumpy Cat

Polako did his run? I was supposed to be in the next one, but I never got my name.
03-03-2011, 10:07 AM
Pallantides

Fine scale analysis of Eurogenes' British and Irish
http://i129.photobucket.com/albums/p217/dpwes/UK.png

Quote:

Key: Red = Belorussian + Lithuanian (Baltic?), Yellow = Spanish (Iberian), Green = French (Atlantic?), Aqua = Hungarian (Central European), Blue = Italian (Southern European), Pink = Norwegian + Swedish (Scandinavian).

Spreadsheet

Fine scale analysis of Eurogenes' Scandinavians
http://i129.photobucket.com/albums/p217/dpwes/NOR-1.png (the second bar is me)

Quote:

Key: Red = Belorussian + Lithuanian (Baltic?), Orange = East Finnish (Finnic), Green = French (Atlantic?), Aqua = Hungarian (Central European), Blue (not recorded) = Italian (Southern European), Dark Blue = Nganassan + Koryak + Yakut (Siberian), Lezgin (Caucasus or maybe Odin?).

Spreadsheet
03-03-2011, 12:52 PM
Pallantides

Fine scale analysis of Eurogenes' Finns and Estonians
http://i129.photobucket.com/albums/p217/dpwes/FIN-1.png

Key: Red = Belorussian + Lithuanian (Balto-Slavic?), Orange = Pathan + Burusho (South Central Asian), Light Green = Finnish (Finnic), Green = French (Atlantic?), Aqua = Hungarian (Central European), Blue = Italian (Southern European), Dark Blue = Karitiana + Pima + Koryak (Amerindian), Purple = Koryak + Nganassan + Yakut (Siberian), Pink = Norwegian + Swedish (Scandinavian).

Spreadsheet

The two first bars are Estonians, EE1 and EE2.
03-03-2011, 02:04 PM
Äike

Quote:

Originally Posted by Pallantides

Fine scale analysis of Eurogenes' Finns and Estonians
http://i129.photobucket.com/albums/p217/dpwes/FIN-1.png

Key: Red = Belorussian + Lithuanian (Balto-Slavic?), Orange = Pathan + Burusho (South Central Asian), Light Green = Finnish (Finnic), Green = French (Atlantic?), Aqua = Hungarian (Central European), Blue = Italian (Southern European), Dark Blue = Karitiana + Pima + Koryak (Amerindian), Purple = Koryak + Nganassan + Yakut (Siberian), Pink = Norwegian + Swedish (Scandinavian).

Spreadsheet

The two first bars are Estonians, EE1 and EE2.

It's pretty interesting how the founder population in Lithuania has had an effect on these charts.

When purely looking at Y-DNA, Lithuania is the 2nd most Finno-Ugric country in Europe, after Finland. The red "Balto-Slavic" genetic group that forms quite definitely has a very large "southern-Finnic" component, which is automatically labeled as Balto-Slavic, thus increasing the "Balto-Slavic" share in native southern-Finnic populations.

What's the difference between the "Baltic" in the Scandinavian charts and the "Balto-Slavic" in the Estonian-Finnish chart?
