The PCA-based Eurogenes G25 calculator has been a topic of great interest on this forum in discussions of autosomal tests. That made it worth examining the method G25 uses and replicating it to test its accuracy.
METHODOLOGY
Using PLINK, I ran a PCA on samples from the Estonian Biocentre (European, Asian and Amerindian samples) and from Henn et al. (African samples).
The PCA had 20 dimensions instead of the 25 used by G25, and quality control was run with geno set to 20%, meaning variants with a missing call rate above 20% were removed.
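For anyone who wants to reproduce this step, here is a minimal sketch of how the PLINK run could be driven from Python. The flags `--bfile`, `--geno`, `--pca` and `--out` are standard PLINK 1.9 options, but the file prefixes (`merged_dataset`, `pca20`) and the helper names are hypothetical, and the exact command I ran may have differed.

```python
import subprocess


def build_plink_pca_command(bfile, out, n_pcs=20, geno=0.2, plink="plink"):
    """Build the argument list for a PLINK 1.9 run with a missingness filter and a PCA.

    bfile and out are file prefixes (hypothetical names, chosen by the caller).
    --geno 0.2 drops variants missing in more than 20% of samples.
    --pca 20 asks for the top 20 principal components.
    """
    return [
        plink,
        "--bfile", bfile,
        "--geno", str(geno),
        "--pca", str(n_pcs),
        "--out", out,
    ]


def run_plink_pca(bfile, out, **kwargs):
    """Run PLINK; on success it writes <out>.eigenvec and <out>.eigenval."""
    cmd = build_plink_pca_command(bfile, out, **kwargs)
    subprocess.run(cmd, check=True)
    return cmd


# Example (requires the plink binary and a merged .bed/.bim/.fam fileset):
# run_plink_pca("merged_dataset", "pca20")
```

Keeping the command builder separate from the function that runs it makes it easy to inspect the exact command before launching a long job.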
After PLINK quality control and the PCA, I also removed individuals that fell far from the cluster they were expected to belong to (for example, a San sample that plotted between the Europeans and the Africans, probably as a consequence of colonialism).
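The outlier removal could also be done programmatically instead of by eye. The sketch below is only one possible rule, not the procedure described above: it flags a sample when its distance to its population's median position is more than `n_mads` median absolute deviations above the typical distance. The function name and the threshold of 5 are assumptions.

```python
import numpy as np


def flag_cluster_outliers(coords, labels, n_mads=5.0):
    """Return a boolean mask marking samples that sit far from their own cluster.

    coords: (n_samples, n_pcs) array of PCA coordinates.
    labels: population label for each sample.
    n_mads: assumed threshold, in median absolute deviations.
    """
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    flagged = np.zeros(len(labels), dtype=bool)
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)
        if idx.size < 3:
            continue  # too few samples to define a cluster
        centre = np.median(coords[idx], axis=0)
        dist = np.linalg.norm(coords[idx] - centre, axis=1)
        med = np.median(dist)
        mad = np.median(np.abs(dist - med))
        if mad == 0:
            continue  # no spread in distances, nothing to compare against
        flagged[idx] = (dist - med) / mad > n_mads
    return flagged


# Toy example: five tightly grouped samples and one far away.
coords = [[0, 0], [0.1, 0], [0, 0.2], [-0.3, 0], [0, -0.4], [5, 5]]
labels = ["San"] * 6
print(flag_cluster_outliers(coords, labels))
```

Medians are used instead of means so that the outlier itself does not drag the cluster centre toward it.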
I then scaled the eigenvectors from the PCA using the eigenvalues and loaded the scaled data into Vahaduo to estimate admixture.
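The scaling and export step can be sketched as follows. This assumes the default PLINK 1.9 output layout (`.eigenvec` with FID, IID and then one column per PC, no header; `.eigenval` with one eigenvalue per line) and the comma-separated `name,pc1,pc2,...` rows used for G25-style coordinates. Whether to multiply by the eigenvalue or by its square root is left as a switch, since both conventions exist; the function names are hypothetical.

```python
import numpy as np


def load_plink_pca(eigenvec_path, eigenval_path):
    """Read PLINK --pca output (assumes no header line in the .eigenvec file)."""
    ids, rows = [], []
    with open(eigenvec_path) as fh:
        for line in fh:
            parts = line.split()
            if not parts:
                continue
            ids.append(parts[1])  # column 2 is the individual ID (IID)
            rows.append([float(x) for x in parts[2:]])
    eigenvals = np.loadtxt(eigenval_path, ndmin=1)
    return ids, np.array(rows), eigenvals


def scale_eigenvectors(eigenvecs, eigenvals, use_sqrt=True):
    """Weight each PC column by its eigenvalue (or its square root).

    use_sqrt is an assumption: set it to False to multiply by the raw eigenvalue.
    """
    eigenvecs = np.asarray(eigenvecs, dtype=float)
    weights = np.asarray(eigenvals, dtype=float)[: eigenvecs.shape[1]]
    if use_sqrt:
        weights = np.sqrt(weights)
    return eigenvecs * weights  # broadcasts across rows, one weight per column


def to_vahaduo_rows(ids, scaled, decimals=6):
    """Format samples as comma-separated lines: name,pc1,pc2,..."""
    return [
        ",".join([name] + [f"{value:.{decimals}f}" for value in row])
        for name, row in zip(ids, scaled)
    ]


# Example with hypothetical file names:
# ids, vecs, vals = load_plink_pca("pca20.eigenvec", "pca20.eigenval")
# print("\n".join(to_vahaduo_rows(ids, scale_eigenvectors(vecs, vals))))
```

Scaling gives the leading components more weight in the distance calculations, so the choice of weighting directly affects the admixture estimates.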
RESULTS
Although the results looked broadly consistent for the samples tested, they were not very accurate. At the continental level, the residual percentages (up to 6%) were unreliable: well-known residuals were often missing, or false residuals were assigned (Southeast Asian with Sub-Saharan African, SSA). At the intracontinental level, beyond residuals alone, the percentages were also off, with the most apparent problems in West Eurasia.
In conclusion, further testing is needed to determine whether the methodology can be refined or whether it has inherent flaws. In this case, though, it did not pinpoint ancestry with the high accuracy that forum members attribute to it.
REFERENCES
http://www-evo.stanford.edu/repository/paper0002/
https://www.cog-genomics.org/plink/1.9/
https://evolbio.ut.ee/
https://vahaduo.github.io/vahaduo/