Arborean
08-21-2018, 02:45 PM
So someone ran this for me and explained their process. I always have trouble getting the program to work on my own.
"Very interesting test and puzzle. When I run you through the Gradient Descent algorithm against all of the moderns, it creates a very simple model without any adjusting, but it isn't as close as I would like to see (I would like the distance to be less than 0.01, or 1%):
distance: 0.011711932
Macedonian 0.30014296621084213
Albanian 0.24731162935495377
Greek 0.19394290447235107
Sardinian 0.1770375818014145
Greek_Trabzon 0.08156487345695496
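For anyone wondering what a fit like the one above is doing, here is a minimal sketch of a gradient-descent mixture fit. It is not the tool's actual code: the softmax parametrisation, the learning rate, the step count, and the toy 2-D reference points (RefA, RefB, RefC) are all assumptions made for illustration. The idea is only that the weights are kept positive and summing to 1 while the Euclidean distance between the weighted average of the references and the target is pushed down.

```python
import math

def softmax(z):
    # Turns unconstrained numbers into positive weights that sum to 1.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def mix(weights, refs):
    # Weighted average of the reference coordinates.
    dims = len(refs[0])
    return [sum(w * r[d] for w, r in zip(weights, refs)) for d in range(dims)]

def fit_mixture(target, refs, lr=0.5, steps=5000):
    # lr and steps are hypothetical settings, not taken from any real tool.
    z = [0.0] * len(refs)
    for _ in range(steps):
        w = softmax(z)
        resid = [f - t for f, t in zip(mix(w, refs), target)]
        # Gradient of the squared distance with respect to each weight.
        g = [2.0 * sum(e * r[d] for d, e in enumerate(resid)) for r in refs]
        g_mean = sum(wk * gk for wk, gk in zip(w, g))
        # Chain rule through the softmax.
        z = [zk - lr * wk * (gk - g_mean) for zk, wk, gk in zip(z, w, g)]
    w = softmax(z)
    fitted = mix(w, refs)
    distance = math.sqrt(sum((f - t) ** 2 for f, t in zip(fitted, target)))
    return w, distance

# Toy data: three made-up references and a target that is an exact
# 50/30/20 blend of them.
names = ["RefA", "RefB", "RefC"]
refs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
target = [0.3, 0.2]

weights, distance = fit_mixture(target, refs)
print("distance:", distance)
for name, w in zip(names, weights):
    print(name, w)
```

On real data the target would be the 25 PC coordinates and the references the population averages from the spreadsheet.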
It seems like these components aren't bad ones and all are relevant to your ancestry. When I run nMonte against all of the moderns, I get the following (displaying just those above 1% in the results):
[1] "1. CLOSEST SINGLE ITEM DISTANCE%"
Italian_Abruzzo:ItalyAbruzzo20 Macedonian:Macedonian2 Albanian:ALB220 Greek:GREEKGRALPOP10 Greek:NA17373
1.915411 1.983457 1.984641 2.029828 2.090191
Bulgarian:Bulgaria1 Italian_Abruzzo:ItalyAbruzzo17 Greek:NA17377
2.123794 2.134549 2.257056
[1] "2. FULL TABLE nMONTE"
[1] "penalty= 0.001"
[1] "Ncycles= 1000"
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 PC12 PC13 PC14
Arborean 0.0112000 0.0148000 0.0054000 -0.008000 0.0093000 -0.002600 0.0026000 0.0012000 -0.0014000 0.012200 0.0010000 0.0049000 -0.0078000 0.0023000
fitted 0.0102916 0.0141782 0.0036164 -0.005606 0.0076852 -0.002623 0.0011714 0.0009352 0.0009064 0.009098 -0.0009366 0.0018392 -0.0050268 0.0007484
dif -0.0009084 -0.0006218 -0.0017836 0.002394 -0.0016148 -0.000023 -0.0014286 -0.0002648 0.0023064 -0.003102 -0.0019366 -0.0030608 0.0027732 -0.0015516
PC15 PC16 PC17 PC18 PC19 PC20 PC21 PC22 PC23 PC24 PC25
Arborean -0.015900 0.0052000 0.0233000 -0.0021000 0.0049000 -0.005900 -0.0085000 0.0001000 0.0008000 0.000800 -0.0017000
fitted -0.011812 0.0022038 0.0146336 -0.0013674 0.0037306 -0.001793 -0.0058286 -0.0000242 0.0012714 0.001373 -0.0012754
dif 0.004088 -0.0029962 -0.0086664 0.0007326 -0.0011694 0.004107 0.0026714 -0.0001242 0.0004714 0.000573 0.0004246
[1] "distance%=1.3433"
Greek,24.2
Italian_Abruzzo,22.2
Albanian,19.2
Macedonian,16.4
Sardinian,6
Serbian,4
Montenegrin,3.2
Bulgarian,1.6
Cypriot,1.2
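The distance% figure in that output appears to be just the Euclidean length of the "dif" row (fitted minus target) multiplied by 100. A quick check using the 25 target and fitted values copied from the table above; the variable names are mine, not nMonte's:

```python
import math

# "Arborean" row (target) and "fitted" row, PC1..PC25, copied from the table.
target = [0.0112, 0.0148, 0.0054, -0.0080, 0.0093, -0.0026, 0.0026, 0.0012,
          -0.0014, 0.0122, 0.0010, 0.0049, -0.0078, 0.0023, -0.0159, 0.0052,
          0.0233, -0.0021, 0.0049, -0.0059, -0.0085, 0.0001, 0.0008, 0.0008,
          -0.0017]
fitted = [0.0102916, 0.0141782, 0.0036164, -0.005606, 0.0076852, -0.002623,
          0.0011714, 0.0009352, 0.0009064, 0.009098, -0.0009366, 0.0018392,
          -0.0050268, 0.0007484, -0.011812, 0.0022038, 0.0146336, -0.0013674,
          0.0037306, -0.001793, -0.0058286, -0.0000242, 0.0012714, 0.001373,
          -0.0012754]

# dif = fitted - target, as in the table.
dif = [f - t for f, t in zip(fitted, target)]

# Euclidean norm of dif, expressed as a percentage.
distance_pct = 100 * math.sqrt(sum(d * d for d in dif))
print(f"distance%={distance_pct:.4f}")
```

This should land very close to the reported 1.3433. It also shows where the misfit comes from: PC17 alone (dif of -0.0086664) contributes far more than any other coordinate.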
A little more complicated, but really a similar type of model. When I run with just the five populations from the Gradient Descent model in that tool, I get the following:
distance: 0.0116983
Macedonian 0.328005850315094
Albanian 0.23669229075312614
Greek 0.17675118148326874
Sardinian 0.17566224932670593
Greek_Trabzon 0.0828883945941925
And with nMonte, I get this:
[1] "1. CLOSEST SINGLE ITEM DISTANCE%"
Macedonian:Macedonian2 Albanian:ALB220 Greek:GREEKGRALPOP10 Greek:NA17373 Greek:NA17377 Greek:GREEKGRALPOP9 Macedonian:Macedonian8
1.983457 1.984641 2.029828 2.090191 2.257056 2.258517 2.309091
Greek:GREEKGRALPOP5
2.346146
[1] "2. FULL TABLE nMONTE"
[1] "penalty= 0.001"
[1] "Ncycles= 1000"
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 PC12 PC13 PC14 PC15
Arborean 0.0112000 0.0148000 0.0054000 -0.008000 0.0093000 -0.0026000 0.0026000 0.0012000 -0.00140 0.0122000 0.001000 0.0049000 -0.0078000 0.002300 -0.0159000
fitted 0.0103082 0.0142238 0.0046926 -0.004633 0.0087132 -0.0022728 0.0017514 0.0015826 0.00127 0.0082596 -0.001081 0.0017918 -0.0049668 0.001826 -0.0131254
dif -0.0008918 -0.0005762 -0.0007074 0.003367 -0.0005868 0.0003272 -0.0008486 0.0003826 0.00267 -0.0039404 -0.002081 -0.0031082 0.0028332 -0.000474 0.0027746
PC16 PC17 PC18 PC19 PC20 PC21 PC22 PC23 PC24 PC25
Arborean 0.005200 0.02330 -0.0021000 0.0049000 -0.0059000 -0.0085000 0.000100 0.0008000 0.000800 -0.0017000
fitted 0.002179 0.01478 -0.0019074 0.0054234 -0.0016982 -0.0070448 -0.000167 0.0010294 0.001666 -0.0014172
dif -0.003021 -0.00852 0.0001926 0.0005234 0.0042018 0.0014552 -0.000267 0.0002294 0.000866 0.0002828
[1] "distance%=1.3024"
Macedonian,37.2
Greek,27.2
Albanian,24.6
Sardinian,9.4
Greek_Trabzon,1.6
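Note that the nMonte percentages all come in steps of 0.2, which would be consistent with the result being a count over a batch of 500 sampled reference rows (the batch size isn't shown in the log, so that is an inference). Below is a minimal sketch of that kind of Monte Carlo fit, assuming a simple "swap one row at random, keep the swap if the distance drops" rule. It is not the nMonte script itself; the function name, the small batch, and the toy populations are made up for illustration, and no penalty term is included.

```python
import math
import random

def distance(sum_vec, n, target):
    # Distance between the batch average and the target.
    return math.sqrt(sum((s / n - t) ** 2 for s, t in zip(sum_vec, target)))

def monte_carlo_fit(target, samples, batch=50, cycles=200, seed=0):
    """samples: list of (population_name, coordinates) rows."""
    rng = random.Random(seed)
    dims = len(target)
    chosen = [rng.randrange(len(samples)) for _ in range(batch)]
    sum_vec = [sum(samples[i][1][d] for i in chosen) for d in range(dims)]
    start = best = distance(sum_vec, batch, target)
    for _ in range(cycles):
        for slot in range(batch):
            cand = rng.randrange(len(samples))
            old = chosen[slot]
            # Batch sum if this slot were swapped for the candidate row.
            trial = [s - samples[old][1][d] + samples[cand][1][d]
                     for d, s in enumerate(sum_vec)]
            d_trial = distance(trial, batch, target)
            if d_trial < best:  # keep only swaps that improve the fit
                chosen[slot], sum_vec, best = cand, trial, d_trial
    counts = {}
    for i in chosen:
        counts[samples[i][0]] = counts.get(samples[i][0], 0) + 1
    shares = {name: 100.0 * c / batch for name, c in counts.items()}
    return shares, best, start

# Toy data: two made-up individuals for each of three made-up populations.
samples = [("PopA", [0.0, 0.0]), ("PopA", [0.1, 0.0]),
           ("PopB", [1.0, 0.0]), ("PopB", [1.0, 0.1]),
           ("PopC", [0.0, 1.0]), ("PopC", [0.1, 1.0])]
target = [0.3, 0.2]

shares, best, start = monte_carlo_fit(target, samples)
print(f"distance%={100 * best:.4f}")
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name},{pct:g}")
```

With a batch of 50 the shares move in steps of 2%; with 500 they would move in steps of 0.2%, like the output above.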
I ran you against all of the ancients with Gradient Descent. I normally don't consider this a finished model, but there are so many kinds of models (bronze age, medieval, copper age, iron age, etc.), that I don't really know where to go with it from here. But, these results turned out to be important:
distance: 0.0065850196
Baltic_BA 0.15700548887252808
Koros_N 0.1548050194978714
Oy_Dzhaylau_MLBA 0.13288335502147675
Balaton_Lasinja_CA 0.11794568598270416
Armenia_EBA 0.10862516611814499
Peloponnese_N 0.09092467278242111
LBK_N 0.07686357200145721
Boncuklu_N 0.04197021201252937
CHG 0.03837617486715317
Afanasievo 0.025242319330573082
Mentese_N 0.023140691220760345
Alan 0.019512005150318146
Sappali_Tepe_BA 0.00653311051428318
Balkans_N 0.006172508001327515
It turns out it is easier to model you with the ancients! The distance is better. My father has a similar result where adding an ancient reference to his modern model improves it quite a bit. So, I played around to find out what it would take to transform your modern model. It turned out to be pretty simple. I kept Macedonian, Greek, and Albanian and needed two of the ancients. With Gradient Descent, it didn't even use the Albanian, though nMonte does. First, with the Gradient Descent:
distance: 0.0069065676
Macedonian 0.522225096821785
Balaton_Lasinja_CA 0.33253446221351624
CHG 0.08547361940145493
Greek 0.05976679176092148
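The Gradient Descent tool reports fractions while nMonte reports percentages, so a small conversion makes the two easier to compare side by side. This snippet only reformats the four weights listed above and checks that they add up to 1; nothing else is assumed.

```python
# Weights copied from the Gradient Descent output above.
gd_weights = {
    "Macedonian": 0.522225096821785,
    "Balaton_Lasinja_CA": 0.33253446221351624,
    "CHG": 0.08547361940145493,
    "Greek": 0.05976679176092148,
}

# The weights should sum to 1 (up to rounding in the printed values).
total = sum(gd_weights.values())

# Same layout as the nMonte result lines: name,percentage
gd_percent = {name: round(100 * w, 1) for name, w in gd_weights.items()}
for name, pct in gd_percent.items():
    print(f"{name},{pct}")
```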
With nMonte, the distance improves, too:
[1] "1. CLOSEST SINGLE ITEM DISTANCE%"
Macedonian:Macedonian2 Albanian:ALB220 Greek:GREEKGRALPOP10 Greek:NA17373 Greek:NA17377 Greek:GREEKGRALPOP9 Macedonian:Macedonian8
1.983457 1.984641 2.029828 2.090191 2.257056 2.258517 2.309091
Greek:GREEKGRALPOP5
2.346146
[1] "2. FULL TABLE nMONTE"
[1] "penalty= 0.001"
[1] "Ncycles= 1000"
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 PC12 PC13 PC14
Arborean 0.0112000 0.0148000 0.0054000 -0.0080000 0.0093000 -0.0026000 0.0026000 0.0012000 -0.0014000 0.0122000 0.0010000 0.0049000 -0.0078000 0.0023000
fitted 0.0103608 0.0144154 0.0049702 -0.0052802 0.0091866 -0.0027528 0.0016666 0.0012542 0.0016076 0.0091088 -0.0007764 0.0020346 -0.0045784 0.0026116
dif -0.0008392 -0.0003846 -0.0004298 0.0027198 -0.0001134 -0.0001528 -0.0009334 0.0000542 0.0030076 -0.0030912 -0.0017764 -0.0028654 0.0032216 0.0003116
PC15 PC16 PC17 PC18 PC19 PC20 PC21 PC22 PC23 PC24 PC25
Arborean -0.015900 0.0052000 0.0233000 -0.0021000 0.0049000 -0.0059000 -0.0085000 0.0001000 0.0008000 0.0008000 -0.0017000
fitted -0.014747 0.0023916 0.0156686 -0.0019596 0.0059326 -0.0016248 -0.0071012 0.0005006 0.0010506 0.0020898 -0.0016498
dif 0.001153 -0.0028084 -0.0076314 0.0001404 0.0010326 0.0042752 0.0013988 0.0004006 0.0002506 0.0012898 0.0000502
[1] "distance%=1.1851"
Macedonian,36.8
Greek,26.8
Albanian,23.8
Balaton_Lasinja_CA,12
CHG,0.6
It still isn't under 1% because, I think, the distance penalty heavily penalizes the ancients in the model. If I shut the distance penalty off, it becomes similar to the Gradient Descent model:
[1] "distance%=0.6723"
Macedonian,49.4
Balaton_Lasinja_CA,32.2
Albanian,9.6
CHG,8.2
Greek,0.6
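To illustrate why a distance penalty would favour nearby moderns over distant ancients, here is a minimal sketch. It assumes the penalty adds a term proportional to how far each chosen reference sits from the target; that form is an assumption made for illustration and is not taken from the nMonte script. All names and the 2-D toy points are hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def penalized_objective(weights, refs, target, penalty):
    dims = len(target)
    fitted = [sum(w * r[d] for w, r in zip(weights, refs)) for d in range(dims)]
    fit_term = sq_dist(fitted, target)
    # Assumed penalty: weighted mean squared distance of the references
    # themselves from the target, so far-away references cost more.
    spread_term = sum(w * sq_dist(r, target) for w, r in zip(weights, refs))
    return fit_term + penalty * spread_term

target = [0.0, 0.0]
refs = [[0.1, 0.0],   # "near" reference (think: a close modern population)
        [1.0, 0.0],   # "far" reference 1 (think: an ancient)
        [-1.0, 0.0]]  # "far" reference 2 (think: another ancient)

near_only = [1.0, 0.0, 0.0]  # slightly off target, but close to it
far_only = [0.0, 0.5, 0.5]   # averages exactly onto the target

for penalty in (0.0, 0.05):
    near = penalized_objective(near_only, refs, target, penalty)
    far = penalized_objective(far_only, refs, target, penalty)
    print(f"penalty={penalty}: near={near:.4f} far={far:.4f}")
```

With the penalty at 0 the two far references win because their average hits the target exactly. With the penalty on, the near reference wins even though its raw fit is worse. That is the same pattern as above: the ancients' share shrinks and the reported distance goes up when the penalty is active.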
It seems like maybe the moderns reference spreadsheet may need some additional references to help with your situation, maybe your region isn't well represented??? I am not sure. I hope this helps."