
Thread: Georgian qpAdm: 4-way inconsistency, advice appreciated.

  1. #1
    eupator



    Georgian (Reich dataset 1240K) is easily modelled with qpAdm:

    38.4% Turkey_N + 16.6% Russia_Samara_EBA_Yamnaya + 18.3% Georgia_Kotias.SG + 26.8% Iran_GanjDareh_N.


    The same model with G25:

    Code:
    Target: Georgian_Imer
    Distance: 3.0576% / 0.03057601
    55.2	GEO_CHG
    38.4	TUR_Barcin_N
    6.4	IRN_Ganj_Dareh_N

    Why such a big inconsistency? What is the issue here?
    Last edited by kingmob; 02-06-2022 at 09:53 AM.

  2. #2
    FinalFlash

    Originally posted by eupator:
    Why such a big inconsistency? What is the issue here?
    Georgians aren't a monolith. Where is the Georgian from in the qpAdm run?

  3. #3
    eupator



    Originally posted by FinalFlash:
    Georgians aren't a monolith. Where is the Georgian from in the qpAdm run?

    I don't know; it's the one in the Reich 1240K dataset, which was probably also used in G25 before user/regional references were added.

    However, the inconsistency remains huge even after accounting for regional variance.

    Armenians also show such inconsistency, although not as big, if you are interested.

  4. #4
    eupator



    qpAdm Armenian (1240K) 4-way:

    44.4% Turkey_N + 6.3% Russia_Samara_EBA_Yamnaya + 4.9% Georgia_Kotias.SG + 44.4% Iran_GanjDareh_N.



  5. #5
    FinalFlash

    Originally posted by eupator:
    I don't know; it's the one in the Reich 1240K dataset, which was probably also used in G25 before user/regional references were added.

    However, the inconsistency still remains huge, even if someone also accounts for regional variance.

    Armenians also show such inconsistency, although not as big, if you are interested.
    I believe qpAdm uses more SNPs and is more accurate than G25 at estimating actual ancestry. G25 tends to suffer from the calculator effect: you can swap one ancient population for a similar one without the fit distance changing. qpAdm is also the tool of choice for professional geneticists, so I'd wager it's the better choice.

    As for the Georgian run, it may be a Northeastern Georgian (Tusheti or Khevsur), given the excess Yamnaya. Imeretians score almost none in G25 runs.

    You'd be surprised how different Georgians can be. Make some runs on G25 for all subgroups and you'll see.
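The calculator effect described above can be illustrated with a toy random-walk fit. This is only a sketch under stated assumptions: the two-dimensional coordinates and the source names (`src_A`, `src_B`, `src_B_lookalike`) are made up for illustration and are not G25 data, and `random_walk_fit` is a hypothetical helper, not nMonte's or Vahaduo's actual code.

```python
import math
import random

def random_walk_fit(target, sources, slots=100, iterations=5000, seed=0):
    """Toy mixture fit: fill `slots` with source labels and keep any
    single-slot swap that moves the slot average closer to the target."""
    rng = random.Random(seed)
    names = list(sources)
    dims = len(target)
    assignment = [rng.choice(names) for _ in range(slots)]

    def distance(assign):
        # Euclidean distance between the average of the slots and the target.
        mean = [sum(sources[n][d] for n in assign) / slots for d in range(dims)]
        return math.dist(mean, target)

    best = distance(assignment)
    for _ in range(iterations):
        i = rng.randrange(slots)
        old = assignment[i]
        assignment[i] = rng.choice(names)
        trial = distance(assignment)
        if trial < best:
            best = trial          # keep the improving swap
        else:
            assignment[i] = old   # reject the swap
    shares = {n: 100.0 * assignment.count(n) / slots for n in names}
    return shares, best

# Made-up coordinates: the target is an exact 60/40 mix of src_A and src_B.
target = (0.4, 0.0)
shares_a, dist_a = random_walk_fit(
    target, {"src_A": (0.0, 0.0), "src_B": (1.0, 0.0)})
# Swap src_B for a near-identical population and refit.
shares_b, dist_b = random_walk_fit(
    target, {"src_A": (0.0, 0.0), "src_B_lookalike": (1.0, 0.01)})

print(shares_a, dist_a)
print(shares_b, dist_b)
```

In this toy setup both runs assign the same share to whichever of the two similar sources is offered, and the fit distance barely moves, so the distance alone cannot tell you which of the two populations is the real source.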

  6. #6
    Zoro

    Originally posted by eupator:
    Georgian (Reich dataset 1240K) is easily modelled with qpAdm:

    Why such a big inconsistency? What is the issue here?
    Congrats on learning to use qpAdm! Looks like you have a pretty solid model. Your standard errors are low at about 3% and your p-value is decent at 0.18. Your one-to-one results show Georgians a little closer to Iran-N than the rest and a little further from Yamnaya than the rest.

    As for your question: G25, like any set of PCA coordinates, is hugely affected by which samples go into it. Davidski recently came out and said you shouldn't take G25 distances too seriously. You can google the papers that discuss the problems with PCA coordinates.

    The author of nMonte, the program people use with G25 for modelling, has also posted about its problems (below). qpAdm, on the other hand, isn't affected by those issues. That's why it is used in scientific papers and G25 PCA is not.

    January 23, 2022 at 2:15 AM
    Blogger huijbregts said...
    I was surprised to find Matt yesterday explaining the place of nMonte in genetic history. I was not aware of this all.
    Hopefully David will permit me a few supplementary remarks.
    I wrote nMonte as an experiment because I was curious whether a simple random walk algorithm could identify relevant groups.
    I was pleasantly surprised when it did. Actually it identified a single set of non-unique samples; next it used the naming labels within this set to infer distances to well known predefined groups.
    I am still surprised that the simple trick worked so well.
    By then (about 2015) I knew next to nothing about mathematical genetics.
    By now I better understand the problems with this kind of algorithm.
    In the first place there is the problem of overfitting. The ancients division of Global25 contains some five thousand samples. This permits no more than 12 binary choices (because 2^12 = 4096).
    And if you select a subset, the number is still smaller. So if you use 25 dimensions, you are heavily overfitting.
    Many guys have tried to repair this by what they call "scaling" the data, but which is really an anti-scaling and does not solve the problem of too many dimensions. It is much better to truncate the dimensions at a much lower value than 25.
    In nMonte3 I have limited the damage of overfitting by applying a penalty on larger distances. Unfortunately I also offered the opportunity to switch off this penalizing by using the option pen=0. In spite of my repeated warnings not to use this unless you perfectly understand what you are doing, many users interpreted this as an opportunity to prove their expert status.

    If I were younger, I might try to improve nMonte by using a Bayesian algorithm.
    As it is, I incidentally use nMonte as a quick and dirty simple method.
    But mostly I am content with visualizing the data with algorithms like UMAP, which I think is underestimated, especially at this forum.
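The arithmetic in the quoted comment (about five thousand samples, 2^12 = 4096) can be checked with a few lines. This is a minimal sketch: `max_binary_choices` and `truncate_dimensions` are hypothetical helper names, the coordinates are made up, and the number of dimensions kept in the example is arbitrary because the comment does not name a specific cut-off.

```python
def max_binary_choices(n_samples):
    """Largest k with 2**k <= n_samples, i.e. floor(log2(n_samples))."""
    return n_samples.bit_length() - 1

def truncate_dimensions(coords, keep):
    """Keep only the first `keep` coordinates of every sample."""
    return {name: values[:keep] for name, values in coords.items()}

# "Some five thousand samples" from the quoted comment.
k = max_binary_choices(5000)
print(k, 2 ** k)

# Made-up 5-dimensional coordinates, truncated to 3 dimensions for illustration.
toy = {"sample_1": [0.1, 0.2, 0.3, 0.4, 0.5],
       "sample_2": [0.5, 0.4, 0.3, 0.2, 0.1]}
truncated = truncate_dimensions(toy, keep=3)
print(truncated)
```

Selecting a subset of samples lowers the count further, which is the comment's argument for using far fewer than 25 dimensions.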

  7. #7
    eupator



    Originally posted by FinalFlash:
    As for the Georgian run, I think it's possible that it may be a Northeastern Georgian(Tusheti or Khevsur) given the excess Yamnaya. Imeretians score almost none in G25 runs.

    It doesn't look like it.


    Also, I looked up the Georgian details in the Reich .anno file and it states: Zugdidi in Megrelia.

  8. #8
    eupator



    Originally posted by Zoro:
    Congrats on learning to use qpAdm! Looks like you have a pretty solid model. Your standard errors are low at about 3% and p-value is decent at 0.18. Your one to one results show Georgians a little closer to Iran-N than the rest and a little further from Yamnaya than the rest.



    Thank you for the good post, very informative.

    A couple of my Pontic Greek runs have standard errors of around 6.3%, and I can't seem to bring them down any further. What do you think is the threshold for an acceptable model?

  9. #9
    Zoro

    Originally posted by eupator:
    qpAdm Armenian (1240K) 4-way:

    44.4% Turkey_N + 6.3% Russia_Samara_EBA_Yamnaya + 4.9% Georgia_Kotias.SG + 44.4% Iran_GanjDareh_N.


    The p-value is a little on the low side at 0.11, but still a pass.

    You can test whether your models are solid by modelling both Georgians and Armenians with

    Iran-N
    Kotias
    ENF
    EHG
    WHG

    for starters, with and without WHG or CHG.
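The set of runs suggested above can be listed mechanically. This is a rough sketch of one reading of "with and without WHG or CHG" (drop neither, either one, or both); the labels are the shorthand from the post rather than dataset population names, and `candidate_source_sets` is a hypothetical helper.

```python
from itertools import combinations

# Shorthand labels from the post; Kotias is the CHG source.
base = ["Iran-N", "Kotias", "ENF", "EHG", "WHG"]
optional = ["WHG", "Kotias"]

def candidate_source_sets(base, optional):
    """Return the base source list with every subset of `optional` removed."""
    sets = []
    for r in range(len(optional) + 1):
        for dropped in combinations(optional, r):
            sets.append([s for s in base if s not in dropped])
    return sets

models = candidate_source_sets(base, optional)
for sources in models:
    print(" + ".join(sources))
```

Each printed line is one source list to try for both Georgians and Armenians, so the results can be compared like for like.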

  10. #10
    Zoro

    Originally posted by eupator:
    Thank you for the good post, very informative.

    A couple of my Pontic Greek runs have standard errors of around 6.3%, and I can't seem to bring them down any further. What do you think is the threshold for an acceptable model?
    Yeah, that's acceptable. Standard errors can be lowered by introducing an outgroup that is more related to some of your sources than to others, but that may or may not push your p-value below the passing threshold of 0.05.
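The pass rule used in this thread (p-value of at least 0.05) can be written as a small check. This is a sketch, not part of qpAdm itself: `passes` is a hypothetical helper, and the extra requirement that every weight lies between 0 and 1 is an added assumption reflecting the usual feasibility check rather than something stated in the posts. The p-values and weights are the ones reported above for the Georgian and Armenian runs.

```python
def passes(p_value, weights, p_threshold=0.05):
    """True if the model is not rejected and all weights are feasible."""
    feasible = all(0.0 <= w <= 1.0 for w in weights)  # added assumption
    return feasible and p_value >= p_threshold

# Values reported in the thread (weights as fractions).
georgian_ok = passes(0.18, [0.384, 0.166, 0.183, 0.268])
armenian_ok = passes(0.11, [0.444, 0.063, 0.049, 0.444])
print(georgian_ok, armenian_ok)
```

A model that clears this check is only "not rejected"; it still has to be compared against competing source sets.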
