Should I buy G25 coordinates? List pros and cons. I am expecting a debate



Scandal
08-15-2020, 11:29 AM
Pros:
- another calculator that can give a different perspective
- good calculator according to certain members

Cons:
- costs some money
- not a good calculator according to certain members

:confused: :shrug:

Kaspias
08-15-2020, 11:40 AM
Future of the genetic community lies in G25. You should buy while it is still cheap.

Scandal
08-15-2020, 11:43 AM
Future of the genetic community lies in G25. You should buy while it is still cheap.

I don't know much about g25 tbh. I heard it's based on k15.

mitalit
08-15-2020, 11:46 AM
yes you should, I guess it is not to everyone's taste (nothing is to everyone's taste) but it is an entertaining tool.

Kaspias
08-15-2020, 11:48 AM
I don't know much about g25 tbh. I heard it's based on k15.

It is based on a Eurogenes calculator with 25 components (which has not been released publicly). Newly published ancient genomes can be found in the G25 SS, and Davidski updates it regularly. So the main reasons are access to ancient DNA and the third-party tools built on G25 coordinates, such as various calculators.

Jana
08-15-2020, 11:50 AM
Not worth it in my opinion because 12 EUR/USD (whatever currency it is) isn't small money.
For modern populations it's horrible for East Europeans (very scarce and poor references); for ancients it is very good indeed, but you don't strike me as the type who is that interested in ancient origins.

And K13 added almost the same ancient samples G25 has (it's still a little behind perhaps, but getting there), so you have it covered.

My advice - don't buy it.

But you should make your own decision :)

vbnetkhio
08-15-2020, 11:51 AM
Pros:
- another calculator that can give a different perspective
- has good reputation according to certain members

Cons:
- costs some money
- not a good calculator according to certain members

:confused: :shrug:

i would summarize the good calc/bad calc argument like this...

cons
- sometimes gives very similar results to k13, so it doesn't offer much of a different perspective.
- modern populations are lacking
- has problems with calculating proportions of some very ancient or distant admixtures, like Asian and African in Europeans, or Iran Neolithic

pros
-can measure Yamnaya/WHG/EEF influence more accurately than k13.
-includes some ancient samples that aren't publicly available otherwise.

Jana
08-15-2020, 11:53 AM
I also want to add that the problem isn't with the G25 tool (which I like); the problem is with David's modern references, the lack of regional samples and the unrepresentative modern national East European averages.
People here throw money at any genetic toy that comes out; use it wisely.

Scandal
08-15-2020, 11:58 AM
I also want to add that the problem isn't with the G25 tool (which I like); the problem is with David's modern references, the lack of regional samples and the unrepresentative modern national East European averages.
People here throw money at any genetic toy that comes out; use it wisely.

The reference populations may be improved in the future.

Jana
08-15-2020, 12:02 PM
The reference populations may be improved in the future.

Yes, and then I would consider buying it (if I hadn't already).
IMO in your place I would save the money and invest it in finding out more about your Y/mt DNA. It may be a small part of the genome after all, but it's something unique to you and can shed light on your direct paternal/maternal origins.

For example: http://www.yseq.net/product_info.php?products_id=11788

gixajo
08-15-2020, 12:03 PM
Pros:
- another calculator that can give a different perspective
- has good reputation according to certain members

Cons:
- costs some money
- not a good calculator according to certain members

:confused: :shrug:

You will use it very often, and you can do a lot of things just by playing with coordinates and creating models. There are some members who periodically post new models, so I think that for that price it's worth it.

As for precision, it all depends on the references, so in that sense, as a tool itself, it is as accurate as (or, in my opinion, more accurate than) any other similar previous project.

I suppose the references will be expanded and updated.

But take it for what it is, a toy with a certain "sophistication", not a proof of anything, and as always, see together all the results that you are obtaining and observe trends, and thus, together with other results of commercial companies, gedmatch etc., we can get a general idea.

Edit: I forgot one "con" that I thought about before obtaining my G25: you must give your raw data to an "unknown" person.

Vid Flumina
08-15-2020, 12:17 PM
only see pros and cost is that of a pint of beer..

Scandal
08-15-2020, 12:24 PM
only see pros and cost is that of a pint of beer..

You've got quite expensive beers in Italy : - )

vbnetkhio
08-15-2020, 12:28 PM
Future of the genetic community lies in G25. You should buy while it is still cheap.

the future lies in a commercial version of scientific tools which you can't run easily on an average PC, like qpAdm or fastIBD.
g25 is just on gedmatch level, has some advantages over it, but also disadvantages.

J. Ketch
08-15-2020, 12:30 PM
I would buy it, but then I'm not you. It's also possible that Davidski despises Hungarians and uses the money for anti-Hungary hate propaganda.

Do what you want.

Defcon2
08-15-2020, 12:37 PM
I wouldn't buy it, it entertains too much ...

Luke35
08-15-2020, 12:38 PM
You already know my opinion my friend.

My vote is Yea.

Leto
08-15-2020, 12:47 PM
I'm not the most active user of G25 here and I have to admit I find the constant spamming of the section with G25 threads rather annoying. Those who do so know I'm talking about them. However I think you can still buy the coordinates, they are definitely useful in some cases. After all, 12 bucks is not a huge amount of money.

Leto
08-15-2020, 12:49 PM
the future lies in a commercial version of scientific tools which you can't run easily on an average PC, like qpAdm or fastIBD.
g25 is just on gedmatch level, has some advantages over it, but also disadvantages.
You're the first follower of the crank Zoro xD

Jana
08-15-2020, 12:50 PM
I'm not the most active user of G25 here and I have to admit I find the constant spamming of the section with G25 threads rather annoying. Those who do so know I'm talking about them. However I think you can still buy the coordinates, they are definitely useful in some cases. After all, 12 bucks is not a huge amount of money.

I'd like to see your results here :)

https://www.theapricity.com/forum/showthread.php?329782-G25-East-Slavic-ancient-calculator

vbnetkhio
08-15-2020, 12:56 PM
I would buy it, but then I'm not you. It's also possible that Davidski despises Hungarians and uses the money for anti-Hungary hate propaganda.

Do what you want.

well, there are 37 Hungarian samples available, and he included only 14. that's pretty high on his scale, he included only 3 for some Greek regions.

vbnetkhio
08-15-2020, 12:57 PM
You're the first follower of the crank Zoro xD

i used to be on the dark side, but he converted me.

Rocinante
08-15-2020, 01:08 PM
I highly recommend it. Now i understand better myself.

Zoro
08-15-2020, 01:11 PM
G25 is based on the Plink program PCA. No scientist in his/her right mind uses PCA to calculate admixture percentages. I have given several reasons why and I’m not going to repeat myself since no one here except for a couple will understand the technical reasons anyways.

PROOF:

Scientific papers use formal methods such as qpAdm (including Davidski himself) MOST of the time
ADMIXTURE SOME of the time for quick and dirty work
G25 or PCA NONE of the time for admixture calculations

vbnetkhio
08-15-2020, 01:31 PM
G25 is based on the Plink program PCA. No scientist in his/her right mind uses PCA to calculate admixture percentages. I have given several reasons why and I’m not going to repeat myself since no one here except for a couple will understand the technical reasons anyways.

PROOF:

Scientific papers use formal methods such as qpAdm (including Davidski himself) MOST of the time
ADMIXTURE SOME of the time for quick and dirty work
G25 or PCA NONE of the time for admixture calculations

i think i would be right to say in most cases it's not horribly inaccurate.

it's just that it's on gedmatch level, not some super advanced tool like most people believe.

here is a comparison of qpadm and g25, g25 is mostly 5-10% off.
https://anthrogenica.com/showthread.php?21236-Eurogenes-Hunter_Farmer_Herder_2020&p=689155&viewfull=1#post689155

it's a very simple model though. in more complicated models there is a problem of reduced precision: Illyrians being identical to modern Italians, some of the Bell Beakers identical to modern Norwegians, etc. and they aren't identical, of course.
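For anyone wondering how a coordinate-based tool turns G25 numbers into percentages at all, here is a minimal sketch of the general idea, assuming a simple two-source model fitted by least squares on Euclidean distance. The function name `two_source_fit` and the 3-dimensional coordinates are made up for illustration (real G25 coordinates have 25 dimensions), and actual tools may well use a different fitting procedure.

```python
import math

def two_source_fit(target, src_a, src_b):
    """Return (share_a, distance) for the best blend of src_a and src_b."""
    diff = [a - b for a, b in zip(src_a, src_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("sources are identical")
    # Project the target onto the line running between the two sources.
    share_a = sum((t - b) * d for t, b, d in zip(target, src_b, diff)) / denom
    share_a = min(1.0, max(0.0, share_a))  # keep the share between 0 and 1
    blend = [share_a * a + (1 - share_a) * b for a, b in zip(src_a, src_b)]
    return share_a, math.dist(target, blend)

# Hypothetical coordinates, not real G25 data.
source_a = [0.10, 0.02, -0.01]
source_b = [0.02, -0.06, 0.03]
target = [0.08, 0.00, 0.00]

share, distance = two_source_fit(target, source_a, source_b)
print(f"source A: {share:.1%}, source B: {1 - share:.1%}, distance: {distance:.4f}")
```

A small distance only means the blend lands near the target in this coordinate space. It does not prove the two sources are the real ancestors, which is the reduced-precision problem described above.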

ph2ter
08-15-2020, 01:37 PM
You won't learn anything spectacular with G25, but it is fun to play with these numbers.
There are a zillion threads about G25 here and you can't take part in any of them.
We here are not like normal people who have no interest in DNA, and $12 is nothing if you like to play with your autosomal data.

Why miss out on such pleasures?

gixajo
08-15-2020, 01:39 PM
G25 is based on the Plink program PCA. No scientist in his/her right mind uses PCA to calculate admixture percentages. I have given several reasons why and I’m not going to repeat myself since no one here except for a couple will understand the technical reasons anyways.

Can you tell me ( summarized or detailed) your reasons in a private message,please? I don´t remember them.

vbnetkhio
08-15-2020, 02:05 PM
Pros:
- another calculator that can give a different perspective
- good calculator according to certain members

Cons:
- costs some money
- not a good calculator according to certain members

:confused: :shrug:

this is what i would do. i would ask him to add all the available Hungarian samples.

there are 37 of them, and they are mostly labeled as "Budapest" and "Hungary", but their results are very diverse, and they probably have ancestry from all Hungarian regions.
i don't know about you, but if I was Hungarian it would be the most important to analyze the ancestry of all available samples and compare myself to them.
this is just my problem with G25, maybe you don't mind this.

if he adds them, i would buy it. it isn't that expensive.

Scandal
08-15-2020, 02:35 PM
Can you tell me ( summarized or detailed) your reasons in a private message,please? I don´t remember them.

He can write it here.

Zoro
08-15-2020, 02:35 PM
i think i would be right to say in most cases it's not horribly inaccurate.

it's just that it's on gedmatch level, not some super advanced tool like most people believe.

here is a comparison of qpadm and g25, g25 is mostly 5-10% off.
https://anthrogenica.com/showthread.php?21236-Eurogenes-Hunter_Farmer_Herder_2020&p=689155&viewfull=1#post689155

it's a very simple model though. in more complicated models there is a problem of reduced precision: Illyrians being identical to modern Italians, some of the Bell Beakers identical to modern Norwegians, etc. and they aren't identical, of course.

Trust but verify those models yourself using an appropriate set of outgroups. It’s easy to get a qpAdm result to pass. Some people use ridiculously few outgroups just to get their model to pass!

I’ll try to verify some of those models using at least 10 outgroups if i have time. I haven’t had much luck to see a decent agreement between qpAdm and G25 when i use at least 10 outgroups

Zoro
08-15-2020, 02:37 PM
He can write it here.

Sorry bro. You can search my last 30 posts and find it

Kaspias
08-15-2020, 02:43 PM
the future lies in a commercial version of scientific tools which you can't run easily on an average PC, like qpAdm or fastIBD.
g25 is just on gedmatch level, has some advantages over it, but also disadvantages.

If an automated version of qpAdm will be released, then yes, that's correct. But 90% of this community simply can't run it.

I personally even believe Gedmatch is better than G25. But they refuse to add new projects/samples.

Token
08-15-2020, 02:44 PM
Yes, if you don't feel like learning how to use qpAdm and Plink. While it takes half an hour to learn G25, learning how to use Admixtools decently will take you at least a few weeks.

Zoro
08-15-2020, 02:57 PM
If an automated version of qpAdm will be released, then yes, that's correct. But 90% of this community simply can't run it.

I personally even believe Gedmatch is better than G25. But they refuse to add new projects/samples.

I don’t want to confuse the issue even more, but all those Gedmatch calculators were made using samples genotyped with the older, W Eurasian-biased Illumina OmniExpress array. If someone were to make a Gedmatch calculator today using exclusively data from the newer Illumina GSA array, or better yet the Illumina Multi-Ethnic array or WGS, you would see many more E Eurasian alleles uncovered as the W Eurasian bias decreases. In most cases an individual’s E Eurasian percentage would increase, more so for E European and W Asian populations.

Leto
08-15-2020, 02:57 PM
If an automated version of qpAdm will be released, then yes, that's correct. But 90% of this community simply can't run it.

I personally even believe Gedmatch is better than G25. But they refuse to add new projects/samples.
For most people even G25 won't be a thing. Go to the 23andme subreddit and see what kind of people take the tests. Commercial results and maybe Gedmatch (not necessarily).

Kaspias
08-15-2020, 03:23 PM
I don’t want to confuse the issue even more, but all those Gedmatch calculators were made using samples genotyped with the older, W Eurasian-biased Illumina OmniExpress array. If someone were to make a Gedmatch calculator today using exclusively data from the newer Illumina GSA array, or better yet the Illumina Multi-Ethnic array or WGS, you would see many more E Eurasian alleles uncovered as the W Eurasian bias decreases. In most cases an individual’s E Eurasian percentage would increase, more so for E European and W Asian populations.

The same goes for G25, because it is based on a Gedmatch-like calculator as well. What is worse is that there is something off in G25, which I call the PCA effect. It makes Balkan Turks and Balkars look like close populations just because both plot between W and E Eurasians. On the other hand, despite glitches such as the calculator effect on oracles or the array issue you mentioned, Gedmatch oracles might work better when supplied with the required data, and you also get to compare individuals using fixed components. My criticism here is of Davidski, who decided to improve G25 rather than Gedmatch. It could be more useful if it were supplied with fresh data, whole-genome-sequenced component references and a spreadsheet.

I feel like Davidski tried to find a solution to West and East Eurasian differentiation, and in the end it resulted in overestimation.





For most people even G25 won't be a thing. Go to the 23andme subreddit and see what kind of people take the tests. Commercial results and maybe Gedmatch (not necessarily).

I'm sure half of this forum is thinking I'm Anatolian because of my commercial results.

Lucas
08-15-2020, 03:31 PM
G25 is based on the Plink program PCA.

It's Eigenstrat PCA, not Plink. But rest is true.

Lucas
08-15-2020, 03:34 PM
If an automated version of qpAdm will be released, then yes, that's correct. But 90% of this community simply can't run it.

I personally even believe Gedmatch is better than G25. But they refuse to add new projects/samples.

I posted automated version already here. And since that time second such tool was released by Skoglund if I remember.

Token
08-15-2020, 03:39 PM
I posted automated version already here. And since that time second such tool was released by Skoglund if I remember.

You still need to set up Admixtools manually and know how to use Plink, so still far from being user friendly unless you are a Linux nerd

Damião de Góis
08-15-2020, 03:54 PM
It's the most recent calculator available, with lots of samples (both ancient and modern), and the possibilities for experimentation are endless with websites like https://vahaduo.github.io/vahaduo/.
12 usd (~10€) shouldn't be an issue compared to what 23andme cost you.

Lucas
08-15-2020, 03:58 PM
You still need to set up Admixtools manually and know how to use Plink, so still far from being user friendly unless you are a Linux nerd

Yes. But much easier still.

Zoro
08-15-2020, 04:51 PM
The same goes for G25, because it is based on a Gedmatch-like calculator as well. What is worse is that there is something off in G25, which I call the PCA effect. It makes Balkan Turks and Balkars look like close populations just because both plot between W and E Eurasians. On the other hand, despite glitches such as the calculator effect on oracles or the array issue you mentioned, Gedmatch oracles might work better when supplied with the required data, and you also get to compare individuals using fixed components. My criticism here is of Davidski, who decided to improve G25 rather than Gedmatch. It could be more useful if it were supplied with fresh data, whole-genome-sequenced component references and a spreadsheet.

I feel like Davidski tried to find a solution to West and East Eurasian differentiation, and in the end it resulted in overestimation.






I'm sure half of this forum is thinking I'm Anatolian because of my commercial results.



The systemic issue I was referring to which causes a W Eurasian bias affects all analysis whether it's PCA, qpAdm, dstats, ADMIXTURE, G25, etc.

It's an innocent chip design issue (nothing malicious), but it's a secret I guarantee no one here or on any forum is familiar with. I'll be back later to explain it after consulting with Dilawer to make sure I have everything correct.

gixajo
08-15-2020, 05:02 PM
:picard1:

Adamm
08-15-2020, 05:27 PM
G25 is the best thing, so yes you should get it.

vbnetkhio
08-15-2020, 05:43 PM
The systemic issue I was referring to which causes a W Eurasian bias affects all analysis whether it's PCA, qpAdm, dstats, ADMIXTURE, G25, etc.

It's an innocent chip design issue (nothing malicious), but it's a secret I guarantee no one here or on any forum is familiar with. I'll be back later to explain it after consulting with Dilawer to make sure I have everything correct.

there have been rumors for a long time that the new chips have a lower European resolution and a higher Asian one. not that it underestimates the overall European percentage, but it has problems differentiating between the European subregions.

actually, the current 23andme/myheritage/FTDNA chip misses around 100k of the 150k SNPs used by gedmatch calculators.
and those 150k SNPs are selected by maf and indep-pairwise pruning (of a mostly European dataset) , which is supposed to remove the noise and keep the ancestry relevant SNPs if i got it right?
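To check the overlap claim against your own file, something like the following could be used. It is only a sketch: it assumes a tab-separated raw data file with the rsID in the first column and `#` comment lines, plus a calculator SNP list with one rsID per line. The helper names `read_rsids` and `overlap_report` and the sample data are invented for illustration.

```python
import io

def read_rsids(lines, column=0, sep="\t"):
    """Collect rsIDs from an iterable of text lines, skipping comments and blanks."""
    rsids = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rsids.add(line.split(sep)[column])
    return rsids

def overlap_report(raw_rsids, calculator_rsids):
    """Return (shared, missing, share_covered) for a calculator's SNP list."""
    shared = raw_rsids & calculator_rsids
    missing = calculator_rsids - raw_rsids
    covered = len(shared) / len(calculator_rsids) if calculator_rsids else 0.0
    return shared, missing, covered

# Invented sample data standing in for real files.
raw_file = io.StringIO("# rsid\tchromosome\tposition\tgenotype\n"
                       "rs1\t1\t1000\tAA\n"
                       "rs2\t1\t2000\tAG\n"
                       "rs5\t2\t3000\tCC\n")
calculator_file = io.StringIO("rs1\nrs2\nrs3\nrs4\n")

raw = read_rsids(raw_file)
calc = read_rsids(calculator_file)
shared, missing, covered = overlap_report(raw, calc)
print(f"{len(shared)} of {len(calc)} calculator SNPs present ({covered:.0%})")
```

With the figures quoted above (around 100k of 150k SNPs missing), the covered share would come out at roughly one third.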

Lucas
08-15-2020, 05:47 PM
there have been rumors for a long time that the new chips have a lower European resolution and a higher Asian one. not that it underestimates the overall European percentage, but it has problems differentiating between the European subregions.

actually, the current 23andme/myheritage/FTDNA chip misses around 100k of the 150k SNPs used by gedmatch calculators.
and those 150k SNPs are selected by maf and indep-pairwise pruning, which is supposed to remove the noise and keep the ancestry relevant SNPs if i got it right?

I think if Dienekes made Dodecad 3.0 today, it would include many more SNPs from the new chips. Gedmatch simply uses Dodecad calculators created with the old chips and doesn't read many of the new SNP names(?) used in the new ones.

Someone could hack it and make new Dodecad;/

knez01
08-15-2020, 05:48 PM
I don't buy that, the current chip is horrible, the phased results for me are miles more realistic for both me and hundreds of other people on this site.

Zoro
08-15-2020, 07:17 PM
The systemic issue I was referring to which causes a W Eurasian bias affects all analysis whether it's PCA, qpAdm, dstats, ADMIXTURE, G25, etc.

It's an innocent chip design issue (nothing malicious), but it's a secret I guarantee no one here or on any forum is familiar with. I'll be back later to explain it after consulting with Dilawer to make sure I have everything correct.

Ok I got everything confirmed.

So let's say humans have 4 million positions which are variable. It's too expensive to genotype all 4 million positions because 23andme or AncestryDNA customers are not willing to pay $800 each to get tested.

So 23andMe comes up with the idea let's not genotype 4 million SNPs since our customers won't pay $800, let's only genotype a SAMPLE of 700,000 SNPs instead since that will be cheaper.

NOW the question is which 700,000 SNPs out of the 4 million to use. Some of those SNPs have alleles derived in E Asians, some in Europeans, and some in Africans.

SO 23andme has to make a business decision. They figure that since most of their customers are Americans and Europeans, they want to sample more SNPs derived in Europeans than in E Asians so that they can more accurately tell their customers what part of Europe they have ancestry from.

SO they need to sacrifice some E Asian SNPs for European SNPs to serve their main customers which are Americans and Europeans more accurately.

So hypothetically they decide that out of the 700,000 SNPs, 300 K will be European, 200K E Asian, 200 K African (just making the number up but you get the idea)

Let's simplify this and suppose humans only have 20 SNPs, 10 are derived in E Asians and 10 in Europeans like this:


rs1  rs2  rs3  rs4  rs5  rs6  rs7  rs8  rs9  rs10
rs11 rs12 rs13 rs14 rs15 rs16 rs17 rs18 rs19 rs20


Let's say a W Asian or E European tester comes along and he has the following SNPs

rs1  rs2  rs3  rs4  rs5  rs6  rs7  rs8  rs9  rs10
rs11 rs12 rs13 rs14 rs15 rs16 rs17 rs18 rs19 rs20


So to save money let's say 23andMe tells ILLUMINA make us a chip that samples SNPs rs1-rs6 and rs11-rs14. So the E European or W Asian tester will test like this

rs1  rs2  rs3  rs4  rs5  rs6
rs11 rs12 rs13 rs14

So if you do his admixture calculation he will be 100% European and 0% E Asian but had 23andMe sampled his whole genome he would have shown

6/8 = 75% European
2/8 = 25% E Asian

So the fact that the whole genome was not sampled fucks up qpAdm, Admixture, PCA, and everything else under the sun because of the W Eurasian bias in sampling SNPs
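The toy example above can be run as a small sketch. Two things are assumptions here, because the post does not state them: that rs1-rs10 are the European-derived SNPs and rs11-rs20 the E Asian-derived ones, and that the tester carries derived alleles at rs1-rs6 plus rs15 and rs16 (chosen so the whole-genome result matches the 6/8 and 2/8 given above). Only the chip contents (rs1-rs6 and rs11-rs14) come from the post. Counting alleles like this follows the post's simplification and is not how real admixture tools work.

```python
def rs(start, end):
    """Build the set of SNP names rs<start>..rs<end>, inclusive."""
    return {f"rs{i}" for i in range(start, end + 1)}

def ancestry_shares(carried, european_snps, e_asian_snps, sampled=None):
    """Share of carried derived alleles per group, optionally limited to sampled SNPs."""
    if sampled is not None:
        carried = carried & sampled  # only SNPs on the chip are visible
    eur = len(carried & european_snps)
    eas = len(carried & e_asian_snps)
    total = eur + eas
    if total == 0:
        return None
    return eur / total, eas / total

# ASSUMPTION: rs1-rs10 are European-derived, rs11-rs20 are E Asian-derived.
european = rs(1, 10)
e_asian = rs(11, 20)
# ASSUMPTION: the tester carries 6 European and 2 E Asian derived alleles.
tester = rs(1, 6) | {"rs15", "rs16"}
# From the post: the chip samples rs1-rs6 and rs11-rs14.
chip = rs(1, 6) | rs(11, 14)

whole = ancestry_shares(tester, european, e_asian)
on_chip = ancestry_shares(tester, european, e_asian, sampled=chip)
print("whole genome:", whole)
print("chip only:   ", on_chip)
```

Under these assumptions the tester's two E Asian alleles sit at positions the chip never looks at, so they vanish from the chip-only result.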

Zoro
08-15-2020, 07:20 PM
I don't buy that, the current chip is horrible, the phased results for me are miles more realistic for both me and hundreds of other people on this site.

I don't expect you to buy it since you have very very limited knowledge on how this stuff works

Of course its going to be horrible but that's not because of the chip. It's because the new chip only has a small percentage of SNPs overlapping with the old calculators which were based on the old chip !

The fix is you have to design new calculators based on the SNPs in the new chip.

knez01
08-15-2020, 07:41 PM
I don't expect you to buy it since you have very very limited knowledge on how this stuff works

Of course its going to be horrible but that's not because of the chip. It's because the new chip only has a small percentage of SNPs overlapping with the old calculators which were based on the old chip !

The fix is you have to design new calculators based on the SNPs in the new chip.

How come the imputed v3 provides better results on g25 as well? Is the g25 based on the old SNP overlapping or? But yes, I understand what you're trying to say.

waam
08-15-2020, 08:32 PM
A question due to my lack of knowledge: the Pro version of Admixture Studio gives you the option to see your nMonte oracle. That doesn't necessarily mean it uses G25 coordinates for the calculation, right?

Zoro
08-15-2020, 09:37 PM
there have been rumors for a long time that the new chips have a lower European resolution and a higher Asian one. not that it underestimates the overall European percentage, but it has problems differentiating between the European subregions.

actually, the current 23andme/myheritage/FTDNA chip misses around 100k of the 150k SNPs used by gedmatch calculators.
and those 150k SNPs are selected by maf and indep-pairwise pruning (of a mostly European dataset) , which is supposed to remove the noise and keep the ancestry relevant SNPs if i got it right?

Yes so what’s more important being able to detect your E Asian alleles or higher European resolution?

For 23andme and others its European resolution and not as much detecting your E Asian alleles since europeans and Americans are their main customer. For W Asians it’s Detecting E Asian alleles and not European resolution so W Asians have to find east Asian companies to do their testing for them.

Or you can have the Best of both worlds and do whole genome sequencing but that’s a little pricey and you can’t really take advantage of your whole genome sequence using Gedmatch calculators because those calculators are not designed for analyzing all those SNPs

Zoro
08-15-2020, 09:41 PM
How come the imputed v3 provides better results on g25 as well? Is the g25 based on the old SNP overlapping or? But yes, I understand what you're trying to say.

Almost all those calculators have more overlapping SNPs with V3 and V4 than V5. Someone needs to sit down and design new calculators based on V5 data only so all the references have to be V5. Then and only then do you have some hope of maximizing accuracy of detection of your non-European ancestry but even then it’s not as good as whole genome sequencing. But for you to take advantage of whole genome sequencing someone would have to sit down and make calculators based on whole genome sequencing references only

knez01
08-15-2020, 09:44 PM
Almost all those calculators have more overlapping SNPs with V3 and V4 than V5. Someone needs to sit down and design new calculators based on V5 data only so all the references have to be V5. Then and only then do you have some hope of detecting some of your non-European ancestry but even then it’s not as good as whole genome sequencing. But for you to take advantage of whole genome sequencing someone would have to sit down and make calculators based on whole genome sequencing references only

Interesting, thank you for the lesson.

Elias.99
08-15-2020, 10:54 PM
Tbh its good because it gives my % correctly but the admixture thingy is pretty bad compared to GEDmatch.