Recently curupira was kind enough to let me run some analyses on the data he collected from Brazilian 23andMe results.
I got 159 users with both results and a location available. There might be more: I'm extracting this automatically and the location is free-form text, so I might have missed someone just because their location is written in an unexpected format.
Here are the averages by state (all values are percentages):
State: BA Sample Size:9 Euro:75.80 MENA:1.36 NA:2.52 SSA:19.28 S. Asian:0.03 Ocean.:0.00
State: PR Sample Size:8 Euro:85.55 MENA:0.66 NA:7.10 SSA:5.55 S. Asian:0.04 Ocean.:0.00
State: RS Sample Size:14 Euro:88.60 MENA:0.45 NA:3.68 SSA:6.54 S. Asian:0.04 Ocean.:0.00
State: PB Sample Size:5 Euro:86.56 MENA:1.46 NA:5.20 SSA:5.30 S. Asian:0.02 Ocean.:0.00
State: PA Sample Size:6 Euro:67.53 MENA:1.35 NA:18.05 SSA:10.27 S. Asian:0.00 Ocean.:0.00
State: PE Sample Size:10 Euro:83.54 MENA:0.85 NA:4.93 SSA:9.07 S. Asian:0.04 Ocean.:0.00
State: RN Sample Size:2 Euro:87.05 MENA:1.00 NA:5.50 SSA:3.90 S. Asian:0.15 Ocean.:0.00
State: RO Sample Size:1 Euro:90.50 MENA:1.10 NA:6.60 SSA:0.60 S. Asian:0.00 Ocean.:0.00
State: RJ Sample Size:17 Euro:88.26 MENA:0.79 NA:3.01 SSA:6.85 S. Asian:0.02 Ocean.:0.01
State: AM Sample Size:2 Euro:79.00 MENA:0.90 NA:14.25 SSA:3.95 S. Asian:0.05 Ocean.:0.00
State: AL Sample Size:3 Euro:84.03 MENA:0.63 NA:6.07 SSA:7.57 S. Asian:0.00 Ocean.:0.00
State: CE Sample Size:7 Euro:76.43 MENA:1.17 NA:8.79 SSA:11.44 S. Asian:0.03 Ocean.:0.00
State: GO Sample Size:6 Euro:76.88 MENA:0.65 NA:8.85 SSA:11.87 S. Asian:0.00 Ocean.:0.00
State: ES Sample Size:7 Euro:80.23 MENA:1.14 NA:4.51 SSA:13.24 S. Asian:0.07 Ocean.:0.00
State: MG Sample Size:21 Euro:85.08 MENA:3.23 NA:3.96 SSA:6.53 S. Asian:0.02 Ocean.:0.00
State: PI Sample Size:1 Euro:89.40 MENA:0.60 NA:4.60 SSA:3.10 S. Asian:0.00 Ocean.:0.00
State: MA Sample Size:2 Euro:80.95 MENA:1.00 NA:6.90 SSA:9.35 S. Asian:0.15 Ocean.:0.00
State: SP Sample Size:24 Euro:89.27 MENA:1.83 NA:5.18 SSA:2.23 S. Asian:0.02 Ocean.:0.00
State: MT Sample Size:1 Euro:94.10 MENA:0.00 NA:3.00 SSA:2.40 S. Asian:0.00 Ocean.:0.00
State: MS Sample Size:2 Euro:85.20 MENA:2.40 NA:6.50 SSA:4.35 S. Asian:0.00 Ocean.:0.00
State: SC Sample Size:9 Euro:88.84 MENA:0.80 NA:6.04 SSA:2.96 S. Asian:0.00 Ocean.:0.00
State: SE Sample Size:2 Euro:89.10 MENA:0.80 NA:2.50 SSA:6.15 S. Asian:0.00 Ocean.:0.00
Note: Users can have more than one state assigned to them, so deciding which one to use can strongly affect the averages for states with small sample sizes. What I did (slightly technical explanation ahead) was to choose, among a user's options, the state that so far had the fewest results assigned to it. I did this to try to give representation to as many different states as possible. This is a greedy approach and it clearly won't optimize the "representation", but it is much quicker to do than implementing an optimal algorithm (I don't know what that would be; maybe the problem is NP-complete).
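For anyone curious, the greedy assignment described above could be sketched roughly like this in Python. This is not the actual script I ran, just an illustration: the function names, the input layout (a dict from user to a list of candidate states, and a dict from user to component percentages) and the tie-breaking rule (first listed state wins) are all assumptions.

```python
from collections import Counter, defaultdict


def assign_states(candidates):
    """Greedily pick one state per user.

    candidates: dict mapping user -> list of candidate states (hypothetical layout).
    Each user gets the candidate state with the fewest users assigned so far;
    ties go to the state listed first. The result depends on processing order.
    """
    counts = Counter()
    chosen = {}
    for user, states in candidates.items():
        state = min(states, key=lambda s: counts[s])
        chosen[user] = state
        counts[state] += 1
    return chosen


def state_averages(chosen, results):
    """Average each ancestry component per assigned state.

    results: dict mapping user -> {component: percentage} (hypothetical layout).
    """
    sizes = Counter(chosen.values())
    sums = defaultdict(lambda: defaultdict(float))
    for user, state in chosen.items():
        for component, pct in results[user].items():
            sums[state][component] += pct
    return {
        state: {c: total / sizes[state] for c, total in comps.items()}
        for state, comps in sums.items()
    }
```

Since each user is handled once and never reassigned, an early choice can block a later user from filling an otherwise empty state, which is why this doesn't guarantee the best possible "representation".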