Eurogenes K15 confusion



RyoHazuki
03-02-2020, 08:33 AM
When I PCA plot the G25 K15 samples for Southern England and Southern France (taken from the K15 Vahaduo reference sheet), I come out exactly halfway in between them.

https://i.imgur.com/r0Cy1WR_d.jpg?maxwidth=640&shape=thumb&fidelity=medium

Yet when I compare the distances in Vahaduo, I'm several times closer to England.

Distance to: RyoHazuki
4.55089002 England
19.06995281 French_South

What gives? Shouldn't I be equidistant to both?

RyoHazuki
03-02-2020, 09:01 AM
I assume that, because of the legal restrictions on DNA testing in France, the French sample set isn't broad enough to show an affinity equal to the one I have with England, so it comes up as very distant.
https://i.imgur.com/ymNLJ6X_d.jpg?maxwidth=640&shape=thumb&fidelity=medium
https://i.imgur.com/xdG59hw_d.jpg?maxwidth=640&shape=thumb&fidelity=medium

Korialstrasz
03-02-2020, 06:26 PM
A PCA chart is usually a projection onto a two-dimensional plane (in other words, it uses only two of all the components available). In Eurogenes K15's case, the distances are calculated in the full 15-dimensional space. Therefore, these distances do not quite match those seen on the PCA charts.

So, unless the two components that make up the plot represent the relevant populations well, the resulting plot can be quite misleading.

For a "true" visualization on a two-dimensional plane, we would need 15C2 = 105 such plots.
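To make the point above concrete, here is a minimal sketch. The three 15-component vectors are made-up numbers, not real K15 data, and I'm assuming the distance tool uses plain Euclidean distance over all 15 components. It shows a sample that sits exactly midway between two populations when only two components are plotted, yet is much closer to one of them in the full space.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def project(v, i, j):
    """Keep only components i and j, i.e. one 2D view of the data."""
    return (v[i], v[j])

# Hypothetical 15-component percentage vectors (each sums to 100).
pop_a = [10.0, 20.0, 70.0] + [0.0] * 12
pop_b = [30.0, 40.0, 0.0] + [0.0] * 11 + [30.0]
me    = [20.0, 30.0, 50.0] + [0.0] * 12

# Distances in a 2D view that uses only components 0 and 1.
d2_a = euclidean(project(me, 0, 1), project(pop_a, 0, 1))
d2_b = euclidean(project(me, 0, 1), project(pop_b, 0, 1))

# Distances using all 15 components.
d15_a = euclidean(me, pop_a)
d15_b = euclidean(me, pop_b)

# Number of distinct component pairs, i.e. 15C2.
n_plots = math.comb(15, 2)

print(f"2D view:  A={d2_a:.2f}  B={d2_b:.2f}")
print(f"15D:      A={d15_a:.2f}  B={d15_b:.2f}")
print(f"pairwise plots needed: {n_plots}")
```

In the 2D view the sample is equidistant from both populations, but the components left out of the view push population B much further away once all 15 are counted.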

RyoHazuki
03-02-2020, 07:37 PM
Thank you! Comparing the raw distance numbers with the visualizations should have given me the clue that the two don't calculate distance the same way.

Gorilla
03-02-2020, 08:03 PM
How do you plot your results? Can you show me please?

RyoHazuki
03-02-2020, 08:06 PM
Take your K15 Admixture percentages from Gedmatch.
Paste them into the boxes and generate a coordinate on this site: https://gen3553.pagesperso-orange.fr/ADN/K15.htm
Download the chart image and open it in MS Paint.
Match the generated coordinate with the mouse position shown in the bottom left of the window.
Click to map yourself.

RyoHazuki
03-02-2020, 09:00 PM
Also, I think the dimensions of the K15 PCA aren't the 15 Admixture percentages Eurogenes reports themselves, since those admixture groups are mapped on the chart as well, and you have to enter all 15 numbers to get a coordinate.
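
A rough sketch of what I mean, assuming the chart coordinate is a standard PCA projection: each axis is a weighted combination of all 15 percentages. I don't know the formula the site actually uses, and the `mean` and `loadings` values below are invented for illustration; real ones would come from fitting PCA on the reference populations.

```python
def pca_coordinate(percentages, mean, loadings):
    """Project a 15-component vector onto 2 axes.

    Each axis is a list of 15 weights; the coordinate on that axis is the
    dot product of the mean-centered input with those weights.
    """
    centered = [p - m for p, m in zip(percentages, mean)]
    return tuple(sum(c * w for c, w in zip(centered, axis)) for axis in loadings)

# Hypothetical values, NOT the real K15 PCA parameters.
mean = [100.0 / 15] * 15
loadings = [
    [0.3 if i < 5 else -0.15 for i in range(15)],       # made-up axis 1
    [0.2 if i % 2 == 0 else -0.2 for i in range(15)],   # made-up axis 2
]

sample = [20.0, 30.0, 50.0] + [0.0] * 12

# Shift 5 points from the first component to the last one.
tweaked = list(sample)
tweaked[0] -= 5.0
tweaked[14] += 5.0

coord = pca_coordinate(sample, mean, loadings)
tweaked_coord = pca_coordinate(tweaked, mean, loadings)

print("original:", coord)
print("tweaked: ", tweaked_coord)
```

Because every one of the 15 inputs carries a weight on each axis, changing any of them can move the point, which fits with having to enter all 15 numbers to get a coordinate.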