Thread: qpAdm thread

  1. #71
    Zoro

    Quote Originally Posted by Korialstrasz View Post

    I have got these results:

    Code:
    left       weight  se     z
    Adygei     0.439   0.595  0.738
    Turkmen    0.348   0.304  1.14
    Bulgarian  0.213   0.404  0.527
    Not bad for a first attempt, I guess. If you put closely related populations together, you will likely get negative estimates with very high standard errors.
    I would like to reiterate that I have no prior knowledge of population genetics and am quite ignorant compared to other Apricians. For all I know, what I attempted might just be bullshit.
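As a quick sanity check on the table above, the z column is just the weight divided by its standard error. The sketch below recomputes it from the posted numbers; the rule of thumb in the comment (|z| below about 2 is not clearly different from zero) is a general statistical convention, not something stated in the thread.

```python
# Values copied from the qpAdm output table in the post.
rows = {
    "Adygei": (0.439, 0.595),
    "Turkmen": (0.348, 0.304),
    "Bulgarian": (0.213, 0.404),
}


def z_score(weight, se):
    """z is the estimated weight divided by its standard error."""
    return weight / se


for pop, (weight, se) in rows.items():
    z = z_score(weight, se)
    # |z| below about 2 suggests the weight is not clearly different from zero.
    print(f"{pop:10s} weight={weight:.3f} se={se:.3f} "
          f"z={z:.2f} se/weight={se / weight:.0%}")
```

All three weights have standard errors close to or larger than the weights themselves, which is why the model is hard to interpret.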


    I would also like to add that, after extracting the f2 statistics, I get notified that

    √ 1034771 SNPs read in total
    ! 1331 SNPs remain after filtering. 1331 are polymorphic.
    i Allele frequency matrix for 1331 SNPs and 22 populations is 0 MB

    I am not sure if this is normal, but it seemed suspicious to me, as it eliminates almost all of the SNPs. (I checked how many SNPs my FTDNA data and the Reich dataset have in common, and it turned out to be a little fewer than 130k. Both the Reich data and my raw data have almost 600k SNP lines, a significant number of which I believe are either no-calls or missing values.)
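The overlap check described in the post could be sketched roughly as follows. This is only an illustration under assumptions: the raw data is taken to be an FTDNA-style CSV with the columns RSID, CHROMOSOME, POSITION, RESULT, the reference panel is taken to be an EIGENSTRAT .snp file whose first field is the SNP ID, and both function names are made up for this example.

```python
import csv


def called_raw_ids(lines):
    """SNP IDs with a called genotype in FTDNA-style CSV lines.

    Header rows, short rows and no-calls such as "--" are skipped.
    """
    ids = set()
    for row in csv.reader(lines):
        if len(row) < 4 or row[0].upper() == "RSID":
            continue
        genotype = row[3].strip()
        if not genotype or "-" in genotype:
            continue
        ids.add(row[0].strip())
    return ids


def snp_file_ids(lines):
    """SNP IDs from EIGENSTRAT .snp lines (the ID is the first field)."""
    return {line.split()[0] for line in lines if line.strip()}


# Usage with real files (paths are placeholders):
# with open("my_ftdna.csv") as raw, open("v44.3_HO_public.snp") as snp:
#     print(len(called_raw_ids(raw) & snp_file_ids(snp)))
```

Counting only called genotypes gives a more realistic upper bound on the SNPs that can survive filtering than counting lines.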

    First, congrats on getting the software running, and absolutely not: you are no more ignorant than other people here. In fact, 98% of the people here wouldn't have a clue what you just wrote, so I would say you're more knowledgeable than 98% of them!

    OK, so here are a few observations and tips:

    1- "! 1331 SNPs remain after filtering. 1331 are polymorphic." This is absolutely not acceptable. It will give you horrible results, and it is mostly responsible for the huge standard errors you got (for Turkmen, an SE of 0.304 on a weight of 0.348, i.e. a z-score of only 1.14). Although it's very important for Admixtools 2 not to have missing SNPs in any of your samples (in other words, maxmiss=0), it's just as important that you salvage at least 100,000 SNPs. Drop low-coverage samples if you have to.

    2- Assuming you were able to get close to 100,000 SNPs, if you still get high SEs it means your left pops are too closely related and your right pops are unable to distinguish between them. So add some right pops that are differentially related to the left pops, i.e. clearly closer to one left pop than to the other.

    3- Let me know if you need a simple script to convert your FTDNA or Ancestry data to 23andMe format.
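A conversion of the kind offered in tip 3 might look like the sketch below. It is not the script mentioned in the post. It assumes an FTDNA-style CSV (RSID, CHROMOSOME, POSITION, RESULT, usually quoted) as input and writes the tab-separated rsid / chromosome / position / genotype layout used by 23andMe raw data; the function name is hypothetical, and AncestryDNA files (separate allele1 and allele2 columns) are not handled.

```python
import csv


def ftdna_to_23andme(lines):
    """Convert FTDNA-style CSV lines to 23andMe-style tab-separated lines."""
    out = ["# rsid\tchromosome\tposition\tgenotype"]
    for row in csv.reader(lines):
        # Skip the header row and anything that is not a full record.
        if len(row) < 4 or row[0].upper() == "RSID":
            continue
        rsid, chrom, pos, genotype = (field.strip() for field in row[:4])
        out.append("\t".join((rsid, chrom, pos, genotype)))
    return out


# Usage with real files (paths are placeholders):
# with open("my_ftdna.csv") as src, open("my_23andme.txt", "w") as dst:
#     dst.write("\n".join(ftdna_to_23andme(src)) + "\n")
```

No-calls are passed through unchanged, so the number of data lines in the output should equal the number of records in the input, which makes it easy to confirm that no SNPs were lost.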

    You're on a good track. Good luck!

  2. #72
    Korialstrasz

    Thanks! Your instructions have been tremendously helpful. I think I managed to convert the FTDNA file myself without losing any SNPs, but I don't know if I missed anything.

    I took the part below from the admixtools documentation and this is pretty much in line with what you advise.

    By default, extract_f2() will be very cautious and exclude all SNPs which are missing in any population (maxmiss = 0). If you lose too many SNPs this way, you can either

    1. limit the number of populations for which to extract f2-statistics,
    2. compute f3- and f4-statistics directly from genotype files, or
    3. increase the maxmiss parameter (maxmiss = 1 means no SNPs will be excluded).

    The advantages and disadvantages of the different approaches are described here. Briefly, when running qpadm() and qpdstat() it can be better to choose the safer but slower options 1 and 2, while for qpgraph(), which is not centered around hypothesis testing, it is usually fine to choose option 3. Since the absolute difference in f-statistics between these approaches is usually small, it can also make sense to use option 3 for exploratory analyses, and confirm key results using options 1 or 2.
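To make the effect of maxmiss concrete, here is a toy sketch, not ADMIXTOOLS code. It assumes maxmiss is the largest fraction of populations in which a SNP may be missing, which matches the two endpoints quoted above (0 drops any SNP missing anywhere, 1 drops nothing); the behaviour in between is an assumption of this example, and the function name and data are invented.

```python
def snps_kept(missing, maxmiss):
    """Return indices of SNPs that survive the filter.

    missing[i][j] is True if SNP i has no data in population j.
    A SNP is kept when its fraction of missing populations is <= maxmiss.
    """
    kept = []
    for i, row in enumerate(missing):
        frac_missing = sum(row) / len(row)
        if frac_missing <= maxmiss:
            kept.append(i)
    return kept


# Four SNPs across four populations (made-up data).
missing = [
    [False, False, False, False],  # complete in all populations
    [True, False, False, False],   # missing in 1 of 4
    [True, True, False, False],    # missing in 2 of 4
    [True, True, True, True],      # missing everywhere
]

for maxmiss in (0, 0.25, 0.5, 1):
    print(maxmiss, snps_kept(missing, maxmiss))
```

Under this reading, a single low-coverage sample treated as its own population can remove a large share of SNPs at maxmiss = 0, which is why dropping such samples or reducing the number of populations recovers so many.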

    I tried different maxmiss values to salvage some SNPs, but the models I ran afterwards did not make much sense. I need to try different sets of populations, it seems. I had the impression that right-hand side populations function like a "control variable". So would it make sense to run an analysis on modern populations using, let's say, Iron Age samples that provide enough "control" for the left? Or is it better not to take too many liberties in this regard?


    I'll be reading the instructions here: https://www.biorxiv.org/content/bior...ed/media-1.pdf
    50% Turkish_Deliorman + 50% Adygei @ 4,879

  3. #73
    Zoro

    Post a couple of runs here showing all the details of the output, such as the number of SNPs and the right and left pops, and I'll try to diagnose it for you. I would use maxmiss=0.002 or 0.003.

  4. #74
    vbnetkhio

    some might find this useful:

    I made an AncestryDNA raw data to .ped converter script for R:
    Attachment 106242

    To run it, rename your raw data to "data.txt", place "data.txt" and "anc_to_ped.r" in your R working directory, and run this command in R: source("anc_to_ped.r")

    The file has to be in the AncestryDNA format.
    If you have a different format, e.g. 23andMe, you can convert it first with DNA Kit Studio (don't use a raw data template, just choose the AncestryDNA format).

  5. #75
    Kaspias

    So I felt the need to learn how to run qpAdm, but I'm pretty much a beginner with these tools.

    I get this error while trying to create the 3rd file in plink:

    Code:
    1426149 (of 1426149) markers to be included from [ data.map ]
    
    ERROR:
    A problem with line 1 in [ data.ped ]
    Expecting 6 + 2 * 1426149 = 2852304 columns, but found 2842162
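PLINK's text format puts six leading fields (family ID, individual ID, father, mother, sex, phenotype) on each .ped line, followed by two allele columns per marker listed in the .map file, which is where the "6 + 2 * 1426149" in the error comes from. A small check like the following sketch can report which lines have the wrong width before PLINK is run; the function name is made up for this example.

```python
def check_ped(ped_lines, n_markers):
    """Compare each .ped line against the expected width.

    Returns the expected column count and a list of
    (line_number, found_columns) for every line that differs.
    """
    expected = 6 + 2 * n_markers
    bad = []
    for number, line in enumerate(ped_lines, start=1):
        found = len(line.split())
        if found != expected:
            bad.append((number, found))
    return expected, bad


# Usage with real files (paths are placeholders):
# with open("data.map") as m:
#     n_markers = sum(1 for line in m if line.strip())
# with open("data.ped") as p:
#     print(check_ped(p, n_markers))
```

In the error above, 2852304 - 2842162 = 10142 columns are missing. That would correspond to 5071 markers with no allele pair at all, or 10142 markers written as a single column instead of two; the check alone cannot tell which.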

  6. #76
    vbnetkhio

    Did you use my script?
    There's a bug, of course... I'll try to fix it.

    Edit:
    It all seems to work fine for me. What are you trying to do with the file?

  7. #77
    Kaspias

    I was following Korialstrasz's entries in #68. Here is what I have done:

    I have got the .ped and .map files, but when I ran this command: plink --file yourfile --make-bed --out yourfile_new to convert them to PLINK binary format (.bed, .bim, .fam), I received the error I posted. I used the R script you posted to get the .ped and .map.

    Besides, when extracting populations from the EIGENSTRAT file, I could not manage to get multiple populations into one file, only a single pop, like: eigenstrat_to_plink("v44.3_HO_public",outpref = "master_plink",pops = 316)

    I think I will have some more problems in the following steps as I'm clueless, but that's it for now.

  8. #78
    vbnetkhio

    Did you convert your raw data to AncestryDNA format first (with allele1 and allele2 in separate columns)?
    That seems to be causing your error.

  9. #79
    vbnetkhio

    ...

  10. #80
    Kaspias

    I have. However, I used a super kit (created from 3 different raw data files) that is ~40 MB in size, while an average raw data file is 15-20 MB. I mention it in case that is the cause.
