Tutorial for autosomal calculators



Petalpusher
07-16-2015, 08:26 AM
I'll make this quick, in 5 simple steps, since so many people ask how to use these calculators outside of GEDmatch.


1- First, download R and Dodecad DIY:

http://www.r-project.org/

https://drive.google.com/file/d/0B7AJcY18g2GaZGU4OWQ5OWItMzY2NC00NzI1LWIzNWMtMzUxYWI4NjRmMTlk/view


2- Put your raw genome data in your Dodecad folder, in the form:

Genome_Yourname.zip
Genome_Yourname.txt (unzipped)

Drag and drop the calculator files into this folder as well:

Calculator files:
https://docs.google.com/file/d/0B7AJcY18g2GaM2ZlOGQ0NjMtYzRlMS00YjA5LWIzZmUtM2RkMDIyNDEzZWZm/edit?hl=en_US&pli=1


3- Open R and use "change current folder" to point it to the calculator's folder

4- Since it's the first time you are using it, you will have to standardize your data:


Enter the command : source('standardize.r')

This loads a small program that will convert your data from the
company-specific format to a common format in the next step.

- At the R prompt, enter:

a. If you have 23andMe data (either v2 or v3 chip):

standardize('johndoe.txt', company='23andMe')

b. if you have Family Finder data (Illumina chip only):

standardize('johndoe.csv', company='ftdna')

This command will write a file called 'genotype.txt' in the working directory;
this contains your genotype in a format understood by DIYDodecad.
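
If step 4 fails, a missing or misnamed file is worth ruling out first. Here is a minimal sketch of such a check, assuming your raw data file is named as in the example above. `check_dodecad_folder` is a hypothetical helper written for this post, not something that ships with DIYDodecad:

```r
# Hypothetical helper (not part of DIYDodecad): report which of the files
# needed for step 4 are present in a given folder.
check_dodecad_folder <- function(folder, genome_file) {
  needed <- c("standardize.r", genome_file)
  present <- file.exists(file.path(folder, needed))
  names(present) <- needed
  present
}

# Example: check R's current working directory for johndoe.txt
print(check_dodecad_folder(getwd(), "johndoe.txt"))
```

Any entry that prints FALSE is a file R cannot see in that folder, so either the file is elsewhere or the working directory is not the calculator folder.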


5- Finally, run the calculator with this command:

system('DIYDodecadWin dv3.par')

Replace the "dv3" part with the name of the calculator you are currently using.

For example, with Dodecad "Euro7":

system('DIYDodecadWin Euro7.par')
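
If you switch between calculators often, the command can be built from the calculator name. This is only a sketch: `calculator_command` and `run_calculator` are hypothetical names made up here, and it assumes the .par file sits in R's current working directory as in step 3:

```r
# Hypothetical helper: build the command string for a given calculator name.
calculator_command <- function(calculator, exe = "DIYDodecadWin") {
  paste0(exe, " ", calculator, ".par")
}

# Hypothetical helper: stop early if the .par file is not in the working
# directory, otherwise hand the command to system() as in step 5.
run_calculator <- function(calculator) {
  par_file <- paste0(calculator, ".par")
  if (!file.exists(par_file)) {
    stop("Parameter file not found in ", getwd(), ": ", par_file)
  }
  system(calculator_command(calculator))
}

# Only builds and prints the command; nothing is executed here
print(calculator_command("Euro7"))
```

Calling run_calculator("Euro7") from the calculator folder would then do the same as the system() line above.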

The program will seem to stall for a minute or two depending on the calculator (that means it's working), and then you get your admix results, which might look like this (not as cool, of course):

FINAL ADMIXTURE PROPORTIONS:
----------------------------

3.03% Caucasus
44.43% Northwestern
17.76% Northeastern
14.39% Southeastern
0.00% African
0.02% Far_Asian
20.36% Southwestern

CPU time = 69.56 sec
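
If you want to reuse these numbers in R instead of retyping them, the printed block can be turned into a named vector. A minimal sketch, assuming the result lines look exactly like the sample above (a number, a percent sign, then the component name):

```r
# Sample output copied from the run above
output <- c(
  "FINAL ADMIXTURE PROPORTIONS:",
  "----------------------------",
  "",
  "3.03% Caucasus",
  "44.43% Northwestern",
  "17.76% Northeastern",
  "14.39% Southeastern",
  "0.00% African",
  "0.02% Far_Asian",
  "20.36% Southwestern",
  "",
  "CPU time = 69.56 sec"
)

# Keep only lines of the form "<number>% <component>"
pattern <- "^\\s*([0-9.]+)%\\s+(\\S+)\\s*$"
hits <- grepl(pattern, output, perl = TRUE)

# Extract the percentage and the component name from each matching line
admix <- as.numeric(sub(pattern, "\\1", output[hits], perl = TRUE))
names(admix) <- sub(pattern, "\\2", output[hits], perl = TRUE)

print(admix)
print(sum(admix))  # should be close to 100, allowing for rounding
```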



How do you run Oracle then?

Download Admix4 : https://drive.google.com/file/d/0B9o3EYTdM8lQSVFBYmRWTU1GdEE/view

Oracle4 file : 59268

Put the Oracle calculator file in the folder. Every time you launch Admix4, it will ask which calculator file you want to use. Once the right one is loaded, just input your final admix results and let it process them; you'll get a file named "Output1" in that same folder with your single- and multi-population results.

gültekin
07-24-2015, 10:00 AM
Well explained and detailed :thumbs up This thread needs a sticky; it would be helpful for the other testers.

Longbowman
07-31-2015, 12:32 AM
Stickied.

Peterski
09-23-2016, 06:32 PM
Bump! Good tutorial! :thumb001:

Peterski
04-06-2017, 12:13 AM
ANOTHER TUTORIAL:

http://www.theapricity.com/forum/showthread.php?170036-Post-your-Steppe-K10-results&p=4330097&viewfull=1#post4330097

Nope, it is Steppe K10. This calculator is not on GEDmatch.

But you can use it on your computer, following these tips:

HOW TO?:

Let’s presume that John tested with 23andMe, and that Adam tested with FTDNA.

John will download his 23andMe raw data and name the file John.txt, while Adam will download his FTDNA raw data (Build 37 Raw Data Concatenated) and name the file Adam.csv. Both are using Windows.

Download SteppeK10.zip (save icon - an arrow in top right):
https://drive.google.com/file/d/0B-XBmvmgdkfVM2RRRHlRSkcwV0k/view

Extract the files into a folder called SteppeK10 in C:/. You should now have the folder C:/SteppeK10 containing the files DIYDodecadWin.exe, standardize.r, Steppe.10.F, Steppe.alleles, steppe.par and Steppe.txt.

John and Adam will put the files John.txt and Adam.csv into the folder C:/SteppeK10, respectively.

Download R software: https://cran.r-project.org/bin/windows/base/

Install* this program in English, using the default options (if you don’t know how to do it, just ask). After that, run it (click on the “R x.. 3.3.1” icon on the Desktop).
*I installed just the 64-bit version.

Once R is running, click on “File” (top left), then click on “Change dir”. Select the folder C:/SteppeK10 and click “OK”.

Do you see the R console? After the “>” character, type the command below and press Enter:
source('standardize.r')

Now:
a. John, who hypothetically tested with 23andMe, will type the command below, press Enter and wait until he can type again:
standardize('John.txt', company='23andMe')

b. Adam will do that with this command:
standardize('Adam.csv', company='ftdna')

Finally, both of you will type the command below and wait for the results to show up:
system('DIYDodecadWin steppe.par')

========================

Spreadsheet with population averages:

https://docs.google.com/spreadsheets/d/1Hb0GVyrf2ztR_QvoIYcmhWtsYv0p39avjqM-G3-6Xew/edit#gid=1809893991

========================

Official Tutorial:

http://dodecad.blogspot.com/2011/09/do-it-yourself-dodecad-v-21.html

========================

Turkic K11 calculator:

http://www.theapricity.com/forum/showthread.php?195528-How-T%FCrkic-are-you-Post-your-Turkic-K11-results&p=4068246&viewfull=1#post4068246

Enflamme
05-03-2017, 06:25 PM
I don't understand...

Peterski
06-20-2017, 10:04 AM
Here you can download Eurogenes K36 DIY calculator files:

https://onedrive.live.com/?authkey=%21AqtZhF2ljrwt6-0&cid=5223CC821FDFEB45&id=5223CC821FDFEB45%21141&parId=root&action=locate

Iloko
06-22-2017, 08:08 AM
Someone should make a Youtube tutorial video :)

Maguzanci
07-11-2017, 09:37 PM
Is it possible to run gedmatch kit numbers in calculators outside gedmatch?

Dibran
08-01-2017, 11:37 PM

I get an error message after entering the following:

standardize('f.txt', company='23andMe')

ERROR: could not find function "standardize"

standardize.r is in the folder, and the K11 folder is in C: as instructed. What am I doing wrong?

Petalpusher
08-02-2017, 12:23 PM
Is it possible to run gedmatch kit numbers in calculators outside gedmatch?

No, you need the raw data. GEDmatch kit numbers are just numbers tied to people's genomes in their database.


error message after entering the following

standardize('f.txt', company='23andMe')

ERROR: could not find function "standardize"



Standardize.r is in the folder? What am I doing wrong? K11 folder is in C as instructed??

I don't know if that's your error, but you should run source('standardize.r') first and wait for a new prompt, then run standardize('f.txt', company='23andMe')


Also, f.txt has to be replaced by the actual name of your genome file, like Dibran16022017.txt
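
To see why the order matters, here is a small sketch that can be run anywhere. It writes a stand-in standardize.r into a temporary folder (an assumption for the demo only; the real file comes with the calculator) and checks whether R knows the function before and after source():

```r
# Stand-in for the real standardize.r, written to a temporary folder so the
# example is self-contained. With a real calculator you would skip this and
# simply change to the calculator folder.
demo_dir <- tempfile("calc")
dir.create(demo_dir)
writeLines("standardize <- function(file, company) invisible(NULL)",
           file.path(demo_dir, "standardize.r"))

old_wd <- setwd(demo_dir)

# Before source(): in a fresh session the function is not defined yet, which
# is what triggers 'could not find function "standardize"'.
before <- exists("standardize", mode = "function")

source("standardize.r")

# After source(): the function is defined and can be called.
after <- exists("standardize", mode = "function")
print(c(before = before, after = after))

# Go back to the folder R was in before the demo
setwd(old_wd)
```

If source('standardize.r') itself fails with a "cannot open file" message, R is most likely not pointed at the calculator folder.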

Proto-Shaman
01-26-2018, 01:20 AM
Is it possible to run gedmatch kit numbers in calculators outside gedmatch?
well, this thread actually does not exist :rolleyes:

DarkWater
07-16-2018, 08:13 PM
does it work with raw data from Ancestry?

Yinwang888
01-28-2019, 06:32 AM
Very nice explanation, thank you very much!

Carlito's Way
03-22-2019, 07:21 AM
I'm trying to do this for a new calculator and I'm getting really mad because I can't get it to work. I have followed all the steps and I get no damn results.

Taiguaitiaoghyrmmumin
03-22-2019, 07:32 AM
These calculators are boring. The only thing I'm still interested in, as far as genetics goes, is haplogroups.

Lemgrant
08-04-2019, 11:23 AM
+

How to make 3D PCA in RStudio and perform hierarchical k-means clustering
https://youtu.be/_lZ_EqV-cZw

Eurogenes K13 datasheet: https://drive.google.com/file/d/1cDv9fq4TyQuI21Y3Sbs8p0ffwoAHewJ6/view

Script:



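# One-time setup: install the packages used below
install.packages("randomcoloR")
install.packages("factoextra")
install.packages("rgl")
install.packages("rmarkdown", repos = "https://cran.revolutionanalytics.com")

library(rgl)          # 3D plotting
library(factoextra)   # hkmeans()
library(randomcoloR)  # distinctColorPalette()

# Choose the datasheet CSV; the 'Population' column becomes the row names
data01 <- read.csv(file.choose(), row.names = 'Population')

# PCA on the 13 admixture components, based on the correlation matrix
pca <- princomp(data01[, 1:13], cor = TRUE, scores = TRUE)
summary(pca)

# 3D scatter of the first three principal components, labelled by population
plot3d(pca$scores[, 1:3], col = "red", type = "p")
text3d(pca$scores[, 1:3], texts = rownames(data01), font = 2)
grid3d('x')
grid3d('y')
grid3d('z')

# Label each component's loading and draw a line to it from the origin
text3d(pca$loadings[, 1:3], texts = rownames(pca$loadings), col = "red")
coords <- NULL
for (i in 1:nrow(pca$loadings)) {
  coords <- rbind(coords, rbind(c(0, 0, 0), pca$loadings[i, 1:3]))
}
lines3d(coords, col = "red", lwd = 4)

# Hierarchical k-means on the same 13 columns
cl <- hkmeans(data01[, 1:13], 130)  # 130 clusters
data01$cluster <- as.factor(cl$cluster)

# One distinct colour per cluster; cluster numbers are looked up in the active palette
cluster_colors <- distinctColorPalette(130)  # 130 colors
palette(cluster_colors)
plot3d(pca$scores[, 1:3], col = data01$cluster, main = "Hierarchical k-means clusters")
text3d(pca$scores[, 1:3], texts = rownames(data01), font = 2, col = data01$cluster)

# Set the viewpoint position
observer3d(0, 0, 40)  # x, y, z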
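
The 3D plot only shows the first three principal components, so it can be worth checking how much of the total variance they cover. A minimal sketch on a toy table (the numbers are randomly generated for illustration and are not real population averages); with the real datasheet you would use data01[, 1:13] in place of `toy`:

```r
# Toy stand-in for the datasheet: 6 made-up populations x 4 components.
# Values are random and for illustration only.
set.seed(1)
toy <- as.data.frame(matrix(runif(24, 0, 50), nrow = 6))
colnames(toy) <- paste0("Component", 1:4)
rownames(toy) <- paste0("Pop", 1:6)

# Same PCA call as in the script, on the toy table
pca_toy <- princomp(toy, cor = TRUE, scores = TRUE)

# Share of the total variance carried by each principal component
var_share <- pca_toy$sdev^2 / sum(pca_toy$sdev^2)

# Cumulative share: the third value is what a 3D plot can show
print(round(cumsum(var_share), 3))
```

If the third cumulative value is low, populations that look close in the 3D plot may still differ on the components that are not drawn.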

__________________________________________

and

How to make custom oracle in RStudio (Gedmatch, G25...)

https://www.youtube.com/watch?v=cFpBinU5E18

Algorithm: https://docs.google.com/document/d/1aGDxJJJSTDE1_znyI_lTE_3c7BFBgGTSHcgAZMCzx8A/edit
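
For anyone who just wants the rough idea before watching the video: the sketch below is NOT the algorithm from the linked document, only a simple stand-in that assumes a single-population oracle ranks reference populations by Euclidean distance to your own results. The population names and all numbers are invented for illustration:

```r
# Invented reference averages (rows = populations, columns = components).
# Real values would come from a calculator's population spreadsheet.
ref <- rbind(
  PopA = c(North = 60, South = 30, East = 10),
  PopB = c(North = 20, South = 70, East = 10),
  PopC = c(North = 30, South = 30, East = 40)
)

# Invented individual result, with components in the same order
me <- c(North = 55, South = 33, East = 12)

# Subtract the individual's values from every row, then take the
# Euclidean distance to each reference population
dist_to_ref <- sqrt(rowSums(sweep(ref, 2, me)^2))

# Closest populations first
print(sort(dist_to_ref))
```

A multi-population oracle would compare your results against mixtures of reference rows instead of single rows, which this sketch does not attempt.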

Lucas
08-08-2019, 02:49 PM

Did you know that MyTrueAncestry used your tutorial to make their 3d plot lol?:)

Lemgrant
08-08-2019, 08:01 PM
Did you know that MyTrueAncestry used your tutorial to make their 3d plot lol?:)

They have 3D plots? I only saw a 2D PCA. I made this so that everyone could make a 3D PCA, since RStudio is free software.
https://www.rstudio.com/products/rstudio/download/

Mayuk24
09-09-2022, 10:52 AM

Thanks so much... btw I like Giordano Bruno too c;

I have never tried R or any data analysis before... I didn't have any project in mind... until now...

I'm self-taught, but can I ask you about nMonte, or how I can create my custom G25 coordinates from raw files?

I mean, I just need some insight into how to translate the data into coordinates... I saw another post about normalizing data...

I want to make coordinates from my chromosomes... where can I start? Dx