I used data from the file "Global25 pop averages modern scaled": https://eurogenes.blogspot.com/2019/...obal25_12.html.
The numbers displayed in the heatmap are the same Euclidean distances that are shown by Vahaduo, but I multiplied them by 100 so I could make them fit the cells of the heatmap better, and because integers are nice.
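To illustrate what that scaling does to the displayed values, here is a minimal Ruby sketch that applies the same multiply-by-100 and `%.0f` formatting to the Chuvash distances quoted further down in this post. The variable names are only for illustration. Note that neighbours that differ by 0.001, such as Besermyan and Udmurt, end up with the same integer label.

```ruby
# Distances from Chuvash, copied from the command output quoted in this post.
dist = {
  "Besermyan" => 0.048, "Udmurt" => 0.049, "Mari" => 0.056,
  "Tatar_Kazan" => 0.064, "Saami" => 0.071, "Komi" => 0.073,
  "Saami_Kola" => 0.077
}

# Same transformation as the heatmap labels: multiply by 100, format with %.0f.
shown = dist.transform_values { |d| "%.0f" % (100 * d) }

shown.each { |pop, label| puts "#{label} #{pop}" }
# prints:
# 5 Besermyan
# 5 Udmurt
# 6 Mari
# 6 Tatar_Kazan
# 7 Saami
# 7 Komi
# 8 Saami_Kola
```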
Distances based on G25 have to be taken with a grain of salt, but based on the distances shown in the image below, Kola Saami are closer to Komis than to other Saami, and the population closest to Komis is Kola Saami. Bashkirs are far from every population. Maris are closest to non-Kola Saami. Non-Kola Saami are closest to Kola Saami, followed by Udmurts. Kazan Tatars are closest to Mishars, followed by Besermyans.
The heatmap was made with the following commands:

Code:
```
brew install R
R -e 'install.packages(c("pheatmap","RColorBrewer"),repos="https://cloud.r-project.org")'
curl -Ls 'drive.google.com/uc?export=download&id=1wZr-UOve0KUKo_Qbgeo27m-CQncZWb8y'>modernave
awk -F, 'NR==FNR{a[$0];next}$1 in a' <(printf %s\\n Bashkir Besermyan Chuvash Estonian Finnish Karelian Komi Mari Mordovian Russian_Kostroma Russian_Pinega Saami Saami_Kola Tatar_Kazan Tatar_Mishar Udmurt Vepsian) modernave>selected
R -e 'library("pheatmap");library("RColorBrewer");t<-read.csv("selected",header=F,row.names=1);t2<-100*as.matrix(dist(t,upper=T));diag(t2)<-NA; pheatmap(t2,filename="output.png",main="G25 Euclidean distances multiplied by 100",cellwidth=12,cellheight=12,fontsize=9,border_color=NA, display_numbers=T,number_format="%.0f",fontsize_number=7,number_color="black",rev(colorRampPalette(brewer.pal(11,"Spectral"))(256)))'
```

Here are also some Siberian population averages:
Here are 16 random populations:
Note that the heatmaps above use different ranges of numbers for the color scale. You can give an argument like `breaks=seq(0,14.88,14.88/256)` for `pheatmap` to use a fixed scale from 0 to 14.88.
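For reference, the break vector in that example has one more element than the number of colours, which is what pheatmap expects (`breaks` should be one element longer than `color`). Here is a small Ruby sketch of the same arithmetic, with 14.88 taken from the example above and 256 matching the palette size in the heatmap command. The variable names are only for illustration.

```ruby
max = 14.88      # upper end of the fixed scale, from the example above
colors = 256     # palette size used in the pheatmap call

step = max / colors
# Same sequence as R's seq(0, max, max/256): 0, step, 2*step, ..., max
breaks = (0..colors).map { |i| i * step }

puts breaks.size   # 257, one more than the number of colours
puts breaks.last   # upper end of the scale
```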
It's easy to calculate Euclidean distances in R:

Code:
```
$ Rscript -e 'round(dist(read.csv("modernave",row.names=1,header=T)[c("Chuvash","Khanty","Komi","Mari","Nenets","Udmurt"),],upper=T),3)'
        Chuvash Khanty  Komi  Mari Nenets Udmurt
Chuvash          0.205 0.073 0.056  0.302  0.049
Khanty    0.205        0.247 0.173  0.112  0.186
Komi      0.073  0.247       0.125  0.343  0.067
Mari      0.056  0.173 0.125        0.270  0.082
Nenets    0.302  0.112 0.343 0.270         0.286
Udmurt    0.049  0.186 0.067 0.082  0.286
$ Rscript -e 't<-read.csv("modernave",row.names=1,header=T);round(head(sort(as.matrix(dist(t))["Chuvash",]),8),3)'
    Chuvash   Besermyan      Udmurt        Mari Tatar_Kazan       Saami        Komi  Saami_Kola
      0.000       0.048       0.049       0.056       0.064       0.071       0.073       0.077
$ Rscript -e 't<-read.csv("modernave",row.names=1,header=T);p<-t["Chuvash",];head(round(sort(apply(t,1,function(x)sqrt(sum((x-p)^2)))),3),8)'
    Chuvash   Besermyan      Udmurt        Mari Tatar_Kazan       Saami        Komi  Saami_Kola
      0.000       0.048       0.049       0.056       0.064       0.071       0.073       0.077
```

You can also use awk:

Code:
```
$ awk -F, 'NR==FNR{for(i=2;i<=NF;i++)a[i]=$i;next}{s=0;for(i=2;i<=NF;i++)s+=($i-a[i])^2;print s^.5","$1}' <(grep Chuvash, modernave) modernave|sort -n|awk -F, '{printf"%.03f %s\n",$1,$2}'|sed s/^0//|head -n8
.000 Chuvash
.048 Besermyan
.049 Udmurt
.056 Mari
.064 Tatar_Kazan
.071 Saami
.073 Komi
.077 Saami_Kola
```

Or use this Ruby script:

Code:
```
$ cat ~/bin/eud
#!/usr/bin/env ruby -roptparse
opt={}
OptionParser.new{|x|
  x.on("-m NUM",Integer){|y|opt[:m]=y}
  x.on("-f NUM",Integer){|y|opt[:f]=y}
}.parse!
a=IO.readlines(ARGV[0]).map{|l|x,*y=l.chomp.split(",");[x,y.map(&:to_f)]}
puts IO.readlines(ARGV[1]).map{|l|
  x,*y=l.chomp.split(",")
  y.map!(&:to_f)
  d=a.reject{|z|z[0]==x}.map{|z|[z[1].map.with_index{|v,i|(v-y[i])**2}.sum**0.5,z[0]]}.sort_by(&:first)
  d=d.take(opt[:m])if opt[:m]
  "Distance to: #{x}\n"+d.map{|x,y|("%.#{opt[:f]||3}f"%x).sub(/^0/,"")+" "+y}*"\n"
}*"\n\n"
$ eud -m8 modernave <(grep Chuvash modernave)
Distance to: Chuvash
.048 Besermyan
.049 Udmurt
.056 Mari
.064 Tatar_Kazan
.071 Saami
.073 Komi
.077 Saami_Kola
.087 Tatar_Mishar
```

The distances calculated by Vahaduo are also simple Euclidean distances.
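As a hand-checkable illustration of the same calculation, here is a minimal Ruby sketch on made-up three-coordinate vectors. Real G25 rows have 25 coordinates; the sample names and values below are hypothetical and only chosen so the distances come out as round numbers.

```ruby
# Euclidean distance between two equal-length coordinate arrays.
def euclid(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Hypothetical toy coordinates, not real G25 data.
samples = {
  "A" => [0.0, 0.0, 0.0],
  "B" => [3.0, 4.0, 0.0],
  "C" => [1.0, 2.0, 2.0]
}

target = samples["A"]
# Rank the other samples by distance to A, nearest first.
ranked = samples.reject { |name, _| name == "A" }
                .map { |name, v| [euclid(target, v), name] }
                .sort_by(&:first)

ranked.each { |d, name| puts "%.3f %s" % [d, name] }
# prints:
# 3.000 C
# 5.000 B
```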