Scaled vs. Unscaled, why?
JerryS.
08-06-2020, 03:59 PM
what's the difference between scaled and unscaled g25 deciphered results? all the DIY calculators I've seen recommend scaled data, so why is there unscaled data? what is the value if any of unscaled data?
second question I have is your thoughts on slanted G25 data based on supposed author bias against some southern populations. I've read posts here expressing concern that sample populations chosen for the G25 data break down are not the best references for some southern Europeans and tend to cause the data to shift or lean north.
thanks.
marco
08-06-2020, 04:31 PM
Scaled is the more accurate of the two; there is no "vs", it's not even an argument. G25 is only as good as its references. After the release of some components like Basal West African, results have become a lot more accurate for some people.
E.g. my results now:
Distance: 1.8674% / 0.01867441
52.0 LEVANTINE
21.6 IBEROMAURUSIAN
16.6 IBERIAN
7.2 Basal_West_African
2.0 EAST-HUNNIC-STEPPE
0.6 GREEK
Distance: 1.4910% / 0.01491036
34.0 TUR_Barcin_N
16.0 Ancient_Arabian
12.6 Ancestral_N_African
9.6 Caspian
6.8 ITA_Villabruna
6.4 Basal_West_African
6.0 RUS_Ust_Ishim
4.0 Yamnaya_RUS_Samara
3.8 Pre-Pastoralist
0.6 Papuan
0.2 Basal_East_African
JerryS.
08-06-2020, 04:39 PM
thank you for your reply. since scaled data is more accurate, why is there unscaled data if it has no real value on its face? regarding the population samples used, I've read that eurogenes and G25 data have a north bias. I mentioned my own data results on another thread where this statement was made:
https://www.theapricity.com/forum/showthread.php?326147-(((Davidski))) South Slavs (there have been rumors, and the averages he made for them speak for themselves.) Also probably Italians and Greeks, since his arch-nemesis Dienekes(Dodecad) and Angela(Eupedia) are Greek and Italian.
It seems there is something fishy with his Italian averages as well https://www.theapricity.com/forum/sh...(((Davidski)))
thank you again.
marco
08-06-2020, 04:42 PM
It’s an option provided. Some people are pro-unscaled, but let me ask you this: when you are doing a PCA plot, do you use scaled coords or unscaled? Scaled coords make the most sense considering distance; if you’ve worked with unscaled you will know they give unstable results. Like I said, G25 can only be as good as its references: the more references that come out, the more accurate it will be.
One is better for ancient samples while the other is better for modern, don't remember which though.
digital_noise
08-06-2020, 09:00 PM
There’s nothing “fishy” about it, meaning there’s no sabotage intended, Jerry. Italian samples are probably the hardest “pure” sample population to nail down, due to the sheer number of outside influences on almost all the samples. But for everyday calc usage, this isn’t going to be a problem. Rest assured, your results are accurate.
JerryS.
08-06-2020, 09:08 PM
so is the scaled for modern use and unscaled for ancient use, or what? I still haven't gotten a clear answer on my initial question. also, welcome to the methadone clinic until 08 August 1300 ET. LOL. btw the something fishy or sabotage comments are not mine.
ph2ter
08-06-2020, 09:23 PM
Unscaled are your real G25 values and scaled are a filtered version of unscaled.
This filter accentuates your lower PC components and attenuates your higher PC components.
Those higher PC values make unscaled models unstable, because that is where most of your individual genetic drift lands, and it can be falsely attributed to some exotic populations in the 25-dimensional G25 space.
For that reason scaled is more popular.
But with scaled you can sometimes lose a real part of your ancestry.
Scaled is usually better for ancients and unscaled for moderns.
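A toy sketch of the filtering idea described above, in Python. Everything in it is an assumption made for illustration: the four per-PC weights are hypothetical (they are not the real G25 scaling factors), the coordinates are made up, and only 4 PCs are used instead of 25. It only shows how weighting the low-numbered PCs up and the high-numbered PCs down can change which reference ends up closest.

```python
import math

def scale(coords, weights):
    """Multiply each PC coordinate by that PC's weight."""
    return [c * w for c, w in zip(coords, weights)]

def distance(a, b):
    """Plain Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical weights: large for the low-numbered PCs, small for the high ones.
# These are NOT the real G25 scaling factors.
weights = [4.0, 2.0, 1.0, 0.5]

# Made-up unscaled coordinates on 4 PCs (G25 itself has 25).
you = [0.10, 0.05, 0.02, 0.08]         # the PC4 value stands in for individual drift
ref_local = [0.10, 0.05, 0.02, 0.00]   # matches you on PC1-PC3, not on PC4
ref_exotic = [0.06, 0.05, 0.02, 0.08]  # differs on PC1, happens to match on PC4

unscaled_local = distance(you, ref_local)
unscaled_exotic = distance(you, ref_exotic)
scaled_local = distance(scale(you, weights), scale(ref_local, weights))
scaled_exotic = distance(scale(you, weights), scale(ref_exotic, weights))

print("unscaled:", round(unscaled_local, 4), round(unscaled_exotic, 4))
print("scaled:  ", round(scaled_local, 4), round(scaled_exotic, 4))
```

With these made-up numbers the unscaled distances put ref_exotic closer, because the PC4 drift counts at full weight, while the scaled distances put ref_local closer. The flip side is also visible: if that PC4 value were real ancestry rather than drift, the scaled version would be playing it down.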
JerryS.
08-06-2020, 09:26 PM
that's a little over my head but I think I understand you. Thanks
vbnetkhio
08-07-2020, 01:52 PM
There’s nothing “fishy” about it, meaning there’s no sabotage intended Jerry.
did you even read the thread Jerry linked?
if there are 100 Tuscan samples available, then making an average from all 100 will be pretty accurate, simply because of the law of large numbers.
if you apply an outlier removing algorithm to remove obviously foreign samples, it's going to be even more accurate.
now imagine if you sorted those samples by name and kept only the first 5. Let's say the samples are labeled Tuscan1, Tuscan2.. etc. and for some reason you decide to keep only 1 through 5, and make an average from those.
how accurate is the average going to be? it's hit or miss. if just 2 of those 5 are too northern shifted, then the average will be too northern shifted too, right?
Also, it makes absolutely no sense to apply this "method" in the first place, when just running all 100 samples would be much faster and easier, and more accurate.
it's absolutely ridiculous, and makes no sense why somebody would do this, unless he wants to make inaccurate averages on purpose.
then you also add some non-academic samples you pulled out of your ass to the average.
well this is exactly what Davidski does. and when he was confronted about this by his customer, he resorted to banning and blocking.
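To put rough numbers on the 100-vs-5 argument above, here is a toy Python sketch. All of it is assumed for the demo: the sample values are invented (a single PC1-like number centred on 0.100), the two "northern shifted" samples are placed among the first five on purpose, and the outlier filter is a crude median cutoff, not any specific algorithm that anyone is known to use.

```python
from statistics import mean, median

# 100 hypothetical samples with a made-up PC1-like value centred on 0.100.
# The small repeating jitter averages out to zero over every 5 samples.
samples = [(f"Tuscan{i}", 0.100 + 0.001 * ((i % 5) - 2)) for i in range(1, 101)]

# Pretend two of the first five are northern shifted (an assumption for the demo).
shift = {"Tuscan2": 0.030, "Tuscan4": 0.030}
samples = [(name, value + shift.get(name, 0.0)) for name, value in samples]

values = [value for _, value in samples]

avg_all = mean(values)         # all 100 samples
avg_first5 = mean(values[:5])  # only Tuscan1 .. Tuscan5

# Crude outlier filter: drop anything more than 0.010 away from the median.
mid = median(values)
kept = [v for v in values if abs(v - mid) <= 0.010]
avg_filtered = mean(kept)

print("all 100:      ", round(avg_all, 4))
print("first 5 only: ", round(avg_first5, 4))
print("outliers cut: ", round(avg_filtered, 4))
```

In this toy setup the same two shifted samples move the 100-sample average by 0.0006 and the 5-sample average by 0.012, twenty times as much, and the filtered average lands closest to 0.100.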
his "methodology" of calculating averages is one big problem, the other is that a PCA simply isn't meant for measuring genetic admixture. no matter which samples he includes, there will always be big underlying issues. neither using scaled nor unscaled can help. user Zoro elaborated more about this in his posts.
admixture calculators, which are designed for the purpose of measuring admixture, are surely much more accurate.
also, admixture calculators at least give you some kind of results as a reference point. in G25 you just get the coordinates, and then you have to "model" yourself.
when modelling yourself, you'll always be biased, and include the references which you think make sense. this is useless for me, i want to get some neutral, unbiased information from an algorithm.
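A minimal Python sketch of what "modelling" from coordinates can look like, and why the choice of references matters. This is an assumption-heavy illustration, not how any particular calculator works: the function name best_two_way_fit is made up, the coordinates are invented 2-D points, and the search is a simple 1% grid over two sources. I'm also assuming the Distance figure in results like the ones posted earlier in the thread is this kind of leftover gap between the target and the best mix.

```python
import math

def distance(a, b):
    """Plain Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_two_way_fit(target, ref_a, ref_b, steps=100):
    """Try every ref_a/ref_b mix in equal steps; return (distance, share of ref_a)."""
    best = None
    for i in range(steps + 1):
        w = i / steps
        mix = [w * x + (1 - w) * y for x, y in zip(ref_a, ref_b)]
        d = distance(target, mix)
        if best is None or d < best[0]:
            best = (d, w)
    return best

# Invented 2-D coordinates, purely for illustration.
target = [0.5, 0.5]
ref_a = [1.0, 0.0]
ref_b = [0.0, 1.0]
ref_c = [0.0, 0.0]

fit_ab = best_two_way_fit(target, ref_a, ref_b)  # model offered A and B
fit_ac = best_two_way_fit(target, ref_a, ref_c)  # model offered A and C

print("A + B:", fit_ab)
print("A + C:", fit_ac)
```

Both runs report a 50% share of ref_a, but the A + B model fits with distance 0 while the A + C model is left with a distance of 0.5. The algorithm can only split the target between whatever references it is handed, which is where the bias described above gets in.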
Lucas
08-07-2020, 10:27 PM
I think he switched from admixture calcs like K15 and K36 to the G25 PCA just to get rid of the Dodecad tool for calculating components, because it was created by his enemy Dienekes and for years he had to use it, which was a pain for him. I could understand such a reason :) At least he has greatly diminished the popularity of admixture calcs now and finally defeated Dienekes...
vbnetkhio
08-10-2020, 09:06 PM
:icon_lol: this stuff is hilarious.
who knows, maybe there'll be a "Dienekes strikes back"?
JerryS.
08-10-2020, 11:31 PM
I find your lack of faith in the Dodecad disturbing.
Luke35
08-10-2020, 11:50 PM
Enough of this! JerryS, release him!
Solitude
10-02-2023, 04:08 PM
hello vbnet, can you explain to me the difference between averages calculators and admixture calculators? what kind of calculator do companies like ancestry and 23andme use to calculate their dna samples? in your opinion, which calc is best: g25 or the calcs from the companies?
Solitude
10-02-2023, 04:09 PM
hello lucas, can you explain to me the difference between averages calculators and admixture calculators? what kind of calculator do companies like ancestry and 23andme use to calculate their dna samples? in your opinion, which calc is best: g25 or the calcs from the companies?