Thread: Vahaduo K13 PCA Plot

  1. #1
    Member MoroLP's Avatar
    Join Date
    Apr 2020
    Last Online
    04-01-2023 @ 07:39 PM
    Ethnicity
    N/A
    Country
    Croatia
    Gender
    Posts
    136
    Thumbs Up
    Received: 116
    Given: 218

    I don't know if this was already discussed, but what's the reason individual samples (e.g. from Croatia, because I have them at hand) have Vahaduo K13 distances where they are closest to "Croat_Bosnia":

    Distance to: RG
    1.97208012 Croat_Bosnia
    2.03388790 Croat_Slavonia
    2.23008968 Croat_Herzegovina
    2.27530218 Croat_Southern
    2.59792225 Croat_Istria
    2.63294512 Bosniak
    2.91192376 Serb_NorthEastBosnia
    2.91849276 Croat_Dalmatia
    2.95428502 Croat_Lika
    3.03103943 Romanian_Moldavia_North
    3.17102507 Bosniak_Krajina_West
    3.23069652 Serb_CentralCroatia
    3.25024614 Romanian_North
    3.33393161 Serb_WestSerbia
    3.40797594 Bosniak_Southeast
    3.83605266 Croat
    3.84108058 Serb_Lika
    4.04990123 Serb_BosanskaKrajina
    4.10195076 Croat_Kvarner
    4.18637074 Serb_CentralSerbia
    4.26313265 Moldovan_Central
    4.26313265 Moldovan_Central
    4.39130960 Bosniak_Central_Northeast
    4.53995595 Moldovan_North
    4.53995595 Moldovan_North

    but on the Vahaduo Custom PCA they are closer to "Bosniak_Krajina_West"?

    It's as if the current Custom PCA (X: PC1, Y: PC2) pulls the plot toward the south-east. Such mismatches with the distances did happen before, but they weren't as significant. I also see slight differences in the plot of some Croatian regional averages compared to a few weeks ago. Are some (maybe new) population references under Bigger/Smaller Regions influencing plot accuracy, and should they be removed from "Source"? Should I use a different "PC" combination?

  2. #2
    Banned
    Join Date
    Jun 2017
    Last Online
    04-13-2024 @ 09:22 AM
    Ethnicity
    Healthy human being
    Country
    Moldova
    Gender
    Posts
    5,581
    Thumbs Up
    Received: 5,506
    Given: 1,507

    Don't expect them to be surgically accurate. If you take the same samples and create averages in other calculators, say Dodecad K12b, HarappaWorld, maybe MDLP K11, etc., you will see that the preference for the closest pops can vary. Therefore, it's more useful to just look at the general picture, which tells you that they cluster with Croats and some Bosniaks and Serbs.

  3. #3
    Veteran Member
    Join Date
    Aug 2014
    Last Online
    Today @ 10:48 AM
    Location
    Côte d'Azur
    Ethnicity
    Solutrean
    Country
    Monaco
    Region
    Lyon
    Y-DNA
    R1b-Z367
    mtDNA
    H1c1
    Gender
    Posts
    7,405
    Thumbs Up
    Received: 9,496
    Given: 5,740

    A distance in the oracle doesn't equate to a metric distance on a PCA. It also depends on how the PCA was done, and on whether each component is weighted properly. You can visualize the distances better in 3 dimensions than in 2: your actual point could sit in front of or behind the ones around it, because the data is really multidimensional, with 13 components that have complex relationships with each other.

    Fits with 2, 3 or 4 pops will usually give you a better representation of oracle values on a PCA, as they negate some of the translation effects of the PCA's flat dimensions for most of the components. In general, the values you see in the oracle are more a metric of precision than distances seen on a PCA.
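A minimal sketch of that "in front of or behind" effect, in Python. The three points are made up for illustration (they are not K13 or Vahaduo values), and the third axis stands in for everything a flat plot leaves out.

```python
import math

# Hypothetical coordinates, invented for illustration only.
# Think of the third axis as the depth a 2D plot cannot show.
target = (0.0, 0.0, 0.0)
refs = {
    "Ref_A": (1.0, 1.0, 0.0),  # same depth as the target
    "Ref_B": (0.5, 0.5, 3.0),  # far "behind" the target
}

def nearest(point, references, dims):
    """Return the reference closest to point, using only the first `dims` axes."""
    return min(
        references,
        key=lambda name: math.dist(point[:dims], references[name][:dims]),
    )

print("closest on the flat plot:", nearest(target, refs, 2))
print("closest in full space:   ", nearest(target, refs, 3))
```

On the flat plot Ref_B looks closest, while in full space Ref_A is, which is the kind of mismatch between a plot and an oracle list described above.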

  4. #4
    Member MoroLP's Avatar

    Quote: Originally Posted by Ion Basescul
    Don't expect them to be surgically accurate. If you take the same samples and create averages in other calculators, say Dodecad K12b, HarappaWorld, maybe MDLP K11, etc., you will see that the preference for the closest pops can vary. Therefore, it's more useful to just look at the general picture, which tells you that they cluster with Croats and some Bosniaks and Serbs.
    Of course they aren't surgically accurate, but other calculators aren't part of the question. What's the reason for the difference between the Vahaduo distance and the distance on the Vahaduo PCA? Which is more accurate, the Vahaduo distance or the Vahaduo PCA? If it's the Vahaduo distance, what can be done on the Vahaduo PCA so that it accurately shows the Vahaduo distance (e.g. removing some population references)? There must be some exact technical answer.

    Are you saying that nothing can be done with Vahaduo PCA?

  5. #5
    Member MoroLP's Avatar

    Quote: Originally Posted by Petalpusher
    A distance in the oracle doesn't equate to a metric distance on a PCA. It also depends on how the PCA was done, and on whether each component is weighted properly. You can visualize the distances better in 3 dimensions than in 2: your actual point could sit in front of or behind the ones around it, because the data is really multidimensional, with 13 components that have complex relationships with each other.

    Fits with 2, 3 or 4 pops will usually give you a better representation of oracle values on a PCA, as they negate some of the translation effects of the PCA's flat dimensions for most of the components. In general, the values you see in the oracle are more a metric of precision than distances seen on a PCA.
    I assume Lucas knows the exact answer to all of it.

  6. #6
    Banned

    Quote: Originally Posted by MoroLP
    Of course they aren't surgically accurate, but other calculators aren't part of the question. What's the reason for the difference between the Vahaduo distance and the distance on the Vahaduo PCA?
    It's just a matter of slight discrepancies when you convert K13 data to coordinates.
    Let me give you an example.

    Here are some random K13 values (4 because that's apparently the minimum that you are allowed to input).

    Romania_Transylvania_44_Cluj,26.75,25.54,17.98,8.75,16.54,1.83,0,0.22,0.93,0.96,0.48,0,0
    Romania_Transylvania_45_Cluj,28.95,23.38,16.35,9.31,17.97,0,0,0.27,1.93,0.84,1.01,0,0
    Romania_Transylvania_46_Cluj,22.09,26.93,18.03,11.41,16.92,1.60,0,0.48,1.39,0.42,0.06,0,0.65
    Romania_Transylvania_47_Cluj,23.20,31.08,17.07,5.78,16.60,1.42,1.58,0.84,1.55,0.55,0.33,0,0



    But for the PCA to plot them, they are converted to projected data.
    If you go to "PCA Data", you will see the real values on which the PCA is based.
    In this case that's

    Romania_Transylvania_44_Cluj,1.671709,0.329760,1.531675
    Romania_Transylvania_45_Cluj,5.211808,1.180717,-0.899635
    Romania_Transylvania_46_Cluj,-1.566898,-3.929776,-0.328587
    Romania_Transylvania_47_Cluj,-5.316619,2.419299,-0.303453

    So converting from one format to the other loses some of the accuracy, maybe because of which dimensions are selected or something as simple as rounding. And as Petalpusher explained, 2D PCAs have drawbacks because they can capture only 2 dimensions. That's why you get the option to pick "from which perspective you want to see the data", and a 2D PCA lets you select 2. A 3D PCA is more objective, and that's why 3D PCAs appeared later.
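For anyone who wants to check where the accuracy goes, here is a small Python sketch that compares pairwise distances for the four samples above: in K13 space, in all three "PCA Data" coordinates, and in PC1/PC2 only. The numbers are the ones posted above; the short keys and the `compare` helper are just names picked for this sketch. By my arithmetic the 44/45 pair comes out at roughly 4.38 both in K13 and in the three PCA coordinates, but only about 3.64 on PC1/PC2, so at least for this tiny set it looks like dropping axes, rather than rounding, is what shrinks the distance.

```python
import math
from itertools import combinations

# K13 rows as posted above (keys shortened from Romania_Transylvania_NN_Cluj).
k13 = {
    "44": (26.75, 25.54, 17.98, 8.75, 16.54, 1.83, 0, 0.22, 0.93, 0.96, 0.48, 0, 0),
    "45": (28.95, 23.38, 16.35, 9.31, 17.97, 0, 0, 0.27, 1.93, 0.84, 1.01, 0, 0),
    "46": (22.09, 26.93, 18.03, 11.41, 16.92, 1.60, 0, 0.48, 1.39, 0.42, 0.06, 0, 0.65),
    "47": (23.20, 31.08, 17.07, 5.78, 16.60, 1.42, 1.58, 0.84, 1.55, 0.55, 0.33, 0, 0),
}

# "PCA Data" rows as posted above (PC1, PC2, PC3).
pca = {
    "44": (1.671709, 0.329760, 1.531675),
    "45": (5.211808, 1.180717, -0.899635),
    "46": (-1.566898, -3.929776, -0.328587),
    "47": (-5.316619, 2.419299, -0.303453),
}

def compare(a, b):
    """Euclidean distance between samples a and b in K13, in PC1-3, and in PC1-2."""
    return (
        math.dist(k13[a], k13[b]),
        math.dist(pca[a], pca[b]),
        math.dist(pca[a][:2], pca[b][:2]),
    )

for a, b in combinations(k13, 2):
    d13, d3, d2 = compare(a, b)
    print(f"{a} vs {b}: K13={d13:.3f}  PC1-3={d3:.3f}  PC1-2={d2:.3f}")
```

With only four samples there are at most three meaningful PCs, so this says nothing about how much a large reference sheet loses when it is squeezed into two axes.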

  7. #7
    Veteran Member
    Join Date
    Jul 2019
    Last Online
    03-11-2024 @ 04:25 PM
    Ethnicity
    Unknown
    Country
    Antarctica
    Gender
    Posts
    3,911
    Thumbs Up
    Received: 3,471
    Given: 1,541

    Quote: Originally Posted by MoroLP
    I don't know if this was already discussed, but what's the reason individual samples (e.g. from Croatia, because I have them at hand) have Vahaduo K13 distances where they are closest to "Croat_Bosnia":

    Distance to: RG
    1.97208012 Croat_Bosnia
    2.03388790 Croat_Slavonia
    2.23008968 Croat_Herzegovina
    2.27530218 Croat_Southern
    2.59792225 Croat_Istria
    2.63294512 Bosniak
    2.91192376 Serb_NorthEastBosnia
    2.91849276 Croat_Dalmatia
    2.95428502 Croat_Lika
    3.03103943 Romanian_Moldavia_North
    3.17102507 Bosniak_Krajina_West
    3.23069652 Serb_CentralCroatia
    3.25024614 Romanian_North
    3.33393161 Serb_WestSerbia
    3.40797594 Bosniak_Southeast
    3.83605266 Croat
    3.84108058 Serb_Lika
    4.04990123 Serb_BosanskaKrajina
    4.10195076 Croat_Kvarner
    4.18637074 Serb_CentralSerbia
    4.26313265 Moldovan_Central
    4.26313265 Moldovan_Central
    4.39130960 Bosniak_Central_Northeast
    4.53995595 Moldovan_North
    4.53995595 Moldovan_North

    but on the Vahaduo Custom PCA they are closer to "Bosniak_Krajina_West"?

    It's as if the current Custom PCA (X: PC1, Y: PC2) pulls the plot toward the south-east. Such mismatches with the distances did happen before, but they weren't as significant. I also see slight differences in the plot of some Croatian regional averages compared to a few weeks ago. Are some (maybe new) population references under Bigger/Smaller Regions influencing plot accuracy, and should they be removed from "Source"? Should I use a different "PC" combination?
    A PCA is used for a visual overview of a large number of samples. Of course, a lot of the data is lost when 13 dimensions are reduced to 2.

    In this case his closeness to that average is sacrificed so that something else can be better represented.

    I don't think different PCs would help. Maybe try using only Croatian and some closely related averages as sources, and project the rest, to get a PCA which reflects some Croatian-specific differences better.

  8. #8
    Veteran Member Apricity Funding Member
    "Friend of Apricity"


    Join Date
    Oct 2016
    Last Online
    @
    Ethnicity
    me
    Country
    European Union
    Y-DNA
    R1a > YP1337 > R-BY160486*
    mtDNA
    H3*
    Gender
    Posts
    6,066
    Thumbs Up
    Received: 7,243
    Given: 2,623

    What is more interesting is that the PCA output created by Vahaduo PCA (for example of Updated K13) can be run in the original Vahaduo oracle too, and it gives different results (to some extent).
    The distances are also lower. Someone could explore this topic more, and it might be worthwhile to make a special PCA-based sheet for K13 Updated.

    I mean, such PCA values can be run in the Vahaduo oracle; you can easily convert the whole sheet yourself, and your own coords too.

    Code:
    German_Baden-Württemberg,-27.036395,2.153501,0.521963,-1.587294,-6.163404,2.930148,0.195385,4.090403,-7.315862,-1.609318
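A rough Python sketch of what running PCA coordinates through a distance ranking could look like. It assumes the single-population distance is a plain Euclidean distance, and it reuses the Romanian "PCA Data" rows from post #6 as a stand-in sheet. `parse_sheet` and `rank` are names made up for this sketch, not Vahaduo functions.

```python
import math

def parse_sheet(text):
    """Parse 'Name,v1,v2,...' lines into {name: tuple_of_floats}."""
    rows = {}
    for line in text.strip().splitlines():
        name, *values = line.strip().split(",")
        rows[name] = tuple(float(v) for v in values)
    return rows

def rank(target, sources):
    """Sort source populations by Euclidean distance to the target, closest first."""
    return sorted((math.dist(target, coords), name) for name, coords in sources.items())

# Stand-in source sheet: PCA coordinates posted earlier in the thread.
sheet = parse_sheet("""
Romania_Transylvania_45_Cluj,5.211808,1.180717,-0.899635
Romania_Transylvania_46_Cluj,-1.566898,-3.929776,-0.328587
Romania_Transylvania_47_Cluj,-5.316619,2.419299,-0.303453
""")

# Target coordinates in the same PCA space.
target_name = "Romania_Transylvania_44_Cluj"
target = parse_sheet(f"{target_name},1.671709,0.329760,1.531675")[target_name]

print(f"Distance to: {target_name}")
for distance, name in rank(target, sheet):
    print(f"{distance:.8f} {name}")
```

If the PCA is an unscaled rotation of the centered data (an assumption here), leaving out components can only remove squared terms from the sum, so distances on fewer PCs cannot come out higher than on all of them. That would be consistent with the lower distances mentioned above.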
