It's not about the distances; it's about crucial population signals that you can miss when you limit the number of source populations. I actually agree that 6 will be enough for most populations, but there are people who think even 6 is overfitting and won't go over 3-4. 3-4 will be extremely inaccurate.
I'll give you an example. Say you model Italians with Celts, Germanics, Mycenaeans, Republican Romans, Natufians, and Iran_N. That's already 6. A crucial Anatolian signal will be missed (and possibly vice versa if you replaced one of them with Anatolian): Mycenaean will pick up a lot of the Anatolian admixture, and the total Middle Eastern admixture stays unknown. Crucial information is missing.
Including multiple proxies for each population you're trying to detect (e.g. both Hallstatt and Lech_MBA as Celtic proxies) is also useful, because various groups in the G25 coordinates include outliers or have really small sample sizes (Hallstatt is only 2 individuals, for example, and the two are very different from each other). It's even more useful when you use individuals or small averaged groups rather than large averaged groups, because one individual may carry specific genetic drift that another does not, and an average can hide it.
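To make the "it's not about the distances" point concrete, here is a rough toy sketch. Everything in it is an assumption for illustration: the coordinates are invented 2-D points (not real G25 values), the source names are placeholders, and the brute-force `best_fit` helper is just a stand-in for whatever fitting tool you actually use. The target is built as 40% Mycenaean_like, 30% Anatolian_like and 30% Germanic_like, then fitted once with all three sources and once with Anatolian_like left out.

```python
from itertools import product
from math import sqrt

# Hypothetical 2-D coordinates, invented for this example (not real G25 values).
# Mycenaean_like is deliberately placed close to Anatolian_like.
SOURCES = {
    "Mycenaean_like": (0.7, 0.2),
    "Anatolian_like": (1.0, 0.0),
    "Germanic_like": (0.0, 1.0),
}


def mix(weights, names):
    """Weighted average of the named sources' coordinates."""
    return tuple(
        sum(w * SOURCES[n][d] for w, n in zip(weights, names))
        for d in range(2)
    )


def best_fit(target, names, steps=100):
    """Brute-force the non-negative weights (summing to 1, in 1/steps
    increments) whose mixture lies closest to the target."""
    best_dist, best_weights = None, None
    for combo in product(range(steps + 1), repeat=len(names) - 1):
        if sum(combo) > steps:
            continue  # the remaining weight would be negative
        counts = list(combo) + [steps - sum(combo)]
        weights = [c / steps for c in counts]
        fitted = mix(weights, names)
        dist = sqrt(sum((f - t) ** 2 for f, t in zip(fitted, target)))
        if best_dist is None or dist < best_dist:
            best_dist, best_weights = dist, dict(zip(names, weights))
    return best_dist, best_weights


all_names = ["Mycenaean_like", "Anatolian_like", "Germanic_like"]
# Toy target with a known composition: 40% / 30% / 30%.
target = mix([0.4, 0.3, 0.3], all_names)

# Fit with every true source available.
full_dist, full_w = best_fit(target, all_names)
# Fit again with the Anatolian-like source left out of the model.
reduced_dist, reduced_w = best_fit(target, ["Mycenaean_like", "Germanic_like"])

print("full model:   ", full_w, "distance", round(full_dist, 4))
print("reduced model:", reduced_w, "distance", round(reduced_dist, 4))
```

In this toy setup the reduced model still fits at a small distance, but Mycenaean_like ends up with about twice its true share because it absorbs the Anatolian_like component that had no source of its own. A good-looking distance says nothing about a signal you never gave the model a chance to pick up.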