Investigating the origins of eastern Polynesians



Iloko
07-10-2018, 07:38 PM
https://www.nature.com/articles/s41598-018-20026-8

Introduction
The cultural and linguistic unity of the islands and atolls of the central Pacific was first documented in detail by Johann Reinhold Forster, a naturalist on James Cook’s second voyage of discovery to the Pacific (1772–1775). He suggested that the similarity of the languages spoken there, now known as Polynesian, reflected a comparatively shallow time-depth since their dispersal1. Forster’s seminal comparative study of Austronesian languages identified the lowland region of the Philippines in Island Southeast Asia (ISEA) as the ultimate source for the Polynesian languages and proposed a long-distance migration from there by the ancestors of today’s Polynesian speakers. This appeared to be the only explanation for the striking difference in phenotype that he observed between the peoples of the central Pacific and those of the intervening region, which is now known as Melanesia. Herein, the terms Melanesia and Micronesia are used in their geographical sense. We use the term Polynesia to include all islands and atolls whose inhabitants speak Polynesian languages, including 23 found throughout Melanesia and Micronesia, referred to as outlier Polynesia (Fig. 1a).
https://media.springernature.com/lw900/springer-static/image/art%3A10.1038%2Fs41598-018-20026-8/MediaObjects/41598_2018_20026_Fig1_HTML.jpg

Sampling locations and overview of genomic diversity. (a) Sources of population data used in the present study. The Philippine group names are abbreviated as follows: Aet (Aeta); Agt (Agta); Bat (Batak); Cas (Casiguran); Kan (Kankanaey); Taga (Tagalog); Tagb (Tagbanua); Zam (Zambales); and Phi (Philippines, incorporating all other groups from this region). Colours indicate regional affiliation of populations used for analysis of autosomal DNA: orange – mainland Southeast Asia and East Asia; dark blue – Taiwan; brown – Philippines Aeta, Agta and Batak negritos; light blue – Philippines non-negritos; red – western Indonesia; pink – eastern Indonesia; purple – northern Melanesia and New Guinea; black – Australia; green – Polynesia. The usage of populations varies with the type of analysis employed (Supplementary Table S1). Inset map shows the three populations from the Leeward Society Isles, and Tahiti, the major island in the Windward Society Isles. The red circles within Micronesia and Melanesia represent 20 of the atolls and islands referred to collectively as outlier Polynesia. The red stars denote the three additional Polynesian outlier populations (Rennell and Bellona, Tikopia), which, together with Tonga, were used in analysis of ancient admixture by Skoglund et al.25. Detailed sample information is given in Supplementary Table S1. The map was created using R v. 3.4.1 (R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, https://www.R-project.org/), and packages ‘maps’ v. 3.2.0 (https://cran.r-project.org/package=maps) and ‘mapdata’ v. 2.2-6 (https://cran.r-project.org/package=mapdata). (b) Inset at top right shows two alternative reconstructed sub-groupings of Polynesian languages discussed in the text. The critical differences are the position of the East Polynesian languages relative to the rest of nuclear Polynesian, and their relationship to the Central Northern Outlier languages.
In the sub-grouping according to Pawley31, all the Polynesian Outlier languages group within Samoic, implying an early separation of Proto-East Polynesian from the rest of the Nuclear Polynesian languages. In the alternative sub-grouping proposed by Wilson32, the Central Northern Outlier languages group with the languages of East Polynesia, within a larger clade containing the other Northern Outlier languages. (c) Principal components analysis of genome-wide SNP diversity in 639 individuals from the populations shown in panel (a); axes are scaled by the proportion of variance described by the corresponding principal component.

Separating the demographic histories of Polynesia and Melanesia became difficult to sustain with developments in archaeology during the second half of the 20th century. These established that the settlement of southern Melanesia (Santa Cruz, Vanuatu, New Caledonia and Fiji) and western Polynesia (Tonga, Samoa, Niue and Futuna) is marked by the same archaeological horizon, known as the Lapita Cultural Complex (LCC). The LCC first appears in northern Melanesia (the Bismarck Archipelago, Bougainville, and the Solomon Islands main chain) ~3,450–3,250 BP, and quickly spread into southern Melanesia ~3,200–3,000 BP, reaching Tonga and Samoa ~2,900 BP2,3,4. At the same time, the study of comparative linguistics has shown that the Oceanic branch of the Austronesian phylum of languages, of which Polynesian is a member, is spoken throughout most of Melanesia and parts of coastal New Guinea, and appears to be a recent intrusion from ISEA5. So while there is considerable overlap between the distributions of the LCC and the Oceanic languages, there remains a phenotypic divide between southern Melanesia and western Polynesia, which is observed between Fiji and Tonga6,7.

A central theme in this debate is the extent to which the development of the LCC involved local people in the Bismarck Archipelago of northern Melanesia8,9,10. An alternative is that the LCC represents the arrival of a largely pre-formed cultural package carried by speakers of proto-Oceanic languages from Taiwan, via the Philippines, in ISEA11. Hypotheses are placed on a continuum from a dendritic, radiating, phylogenetic model of cultural evolution that relies on the relative isolation of populations12, to one based on complex ongoing biological and cultural interaction between groups, leading to reticulated networks of genes and culture9. A compromise position has been promoted by the recognition of a Lapita homeland in the Bismarck Archipelago10, together with evidence that the genomes of contemporary Polynesians contain 20–30% ancestry typical of northern Melanesia and New Guinea13,14. This posits a period of limited cultural and genetic admixture involving migrants from ISEA during the early LCC phase in northern Melanesia ~3,450–3,250 BP15. Polynesian society then developed in relative isolation following the pioneering settlement of Tonga and Samoa ~2,900 BP12.

Genetic evidence for this intermediate model is provided by the presence of members of Y chromosome haplogroup (hg) C2a-M208, together with its daughter lineage C2a1-P33, among Polynesian speakers16,17. This is seen as a proxy for male-mediated admixture from northern Melanesian and New Guinean sources into the gene pool of migrants from ISEA during the formative period of the LCC in the Bismarck Archipelago, prior to the settlement of southern Melanesia and western Polynesia13,18. In contrast, the near fixation in Polynesian speaking groups of the mitochondrial lineage B4a1a1 is seen as evidence of a predominantly ISEA maternal heritage13,19. Subsequent research, however, has shown that B4a1a1 is widespread throughout northern Melanesia20, including regions that show no evidence of autosomal admixture with people from ISEA21. Alternatively, therefore, hg B4a1a1 might also have been present in northern Melanesia before the emergence of the LCC22,23. Similar ambiguity now exists over the origins of paternal lineage C2a-M208, due to its presence in ISEA24 and rather low overall frequencies in the Bismarck Archipelago and coastal New Guinea17.

An important advance in this debate is the recovery of ancient genomic DNA from LCC contexts on Vanuatu (~2,900 BP) (n = 3) and post-Lapita Tonga (~2,500 BP) (n = 1), since the results indicate people with close to 100% ancestry related to an ISEA heritage25. These data show that some settlers of the LCC period appear to have transited northern Melanesia and New Guinea from ISEA without receiving any significant amounts of genetic admixture. A second major finding is that the 20–30% ancestry originating from northern Melanesia and New Guinea, detected in contemporary genomes from the eastern fringe of southern Melanesia and western Polynesia, appears to have arrived during the 2nd millennium BP (1,900–1,200 BP). This result is consistent with post-LCC movements of people into southern Melanesia and western Polynesia, in a process of polygenesis, being responsible for the differences in phenotype observed between the two regions6.

The potential significance of this proposed post-LCC migration for the phylogenetic approach to cultural evolution cannot be overstated. This is because the model is based on an Ancestral Polynesian Society (APS) developing in a western Polynesian homeland during the mid 3rd millennium BP, followed by a rapid settlement of eastern Polynesia ~2,200 BP12. The settlement of eastern Polynesia, however, has witnessed significant reductions in the earliest secure radiometric dates in recent years. These currently stand at ~950 BP and come from Rai’atea in the Leeward Society Isles26,27, thereby excluding the original calibration for the model and subsequent revisions to it28. The archaeology for the phylogenetic model can also be challenged because the evidence post 2,500 BP suggests isolation of Tonga and Samoa, rather than the interaction invoked for the development of Proto-Polynesian language29. By ~950 BP, society in western Polynesia was differentiated, both culturally and linguistically, indicating that, if this late chronology is accurate, the source population for eastern Polynesia was likely a regional group rather than the hypothetical APS29,30.

A central component of the original phylogenetic model is the long-standing sub-grouping of the Polynesian languages. The initial divergence of Nuclear Polynesian from the Tongic languages is followed by a second-order split, between Proto-East Polynesian (Rapa Nui, Marquesan and Tahitic) and the rest of the Nuclear Polynesian languages (Samoic and all the Polynesian outlier languages)31 (Fig. 1b, left-hand tree). This sub-grouping recognizes the separation of Tongic and Samoic but is difficult to reconcile with a settlement of eastern Polynesia commencing ~950 BP, since it necessitates the second-order split, involving Proto-East Polynesian, to occur up to ~1,200 years earlier. An alternative linguistic sub-grouping that places the East Polynesian languages together with those of the central northern outliers (east coast of the northern Solomon Islands) provides a potential solution for the apparent discordance between archaeology and language32,33 (Fig. 1b, right-hand tree). This also challenges the orthodoxy within Polynesian studies that eastern Polynesia was settled directly from Samoa11,12,28. For Kirch and Green28, Samoa is ancient Hawa’iki, the cradle of Polynesian culture. In contrast, for Wilson32 Hawa’iki represents the ancient name for the Leeward Society Isles, which are referred to as the cultural and spiritual hub of eastern Polynesia in oral histories of the region, from where other islands and atolls were settled34.

The Leeward Society Isles, therefore, are of central importance to understanding the reasons for these conflicting signals from archaeology and language. If the ancestors of the Leeward Society Islanders experienced the same episode of ancient admixture as people in western Polynesia and outlier Polynesia during the mid 2nd millennium BP25, this would support the late settlement chronology. In this study, we report the first genomic data from Bora Bora, Rai’atea and Taha’a, three of the Leeward Society Isles. We use the analysis of genotype and haplotype data to ascertain whether the signals of admixture present in these eastern Polynesian populations are similar to those from western and outlier Polynesia, and to identify potential donors to the ancestors of the Leeward Society Islanders. Further insights into the demographic history of eastern Polynesia are provided by the first deep re-sequencing of Polynesian Y chromosomes, complemented by high-resolution genotyping of key paternal and maternal lineages from the Leeward Society Isles and New Zealand.

Autosomal analysis
The first two PCs of the principal components analysis (PCA, Fig. 1c) account for 38% of the variation in the studied dataset. The close overlap between eastern Polynesians and Samoans on the PC1 axis suggests similar amounts of genetic ancestry shared with New Guinea and northern Melanesia. The model-based analysis of autosomal SNPs using ADMIXTURE35 shows that, at K = 4, 70–80% of the Leeward Society Islander genomes can be characterized by the component typical of ISEA/East Asia (Fig. 2a); the remaining 20–30% of their genetic ancestry is best represented by Papuan speakers from New Guinea (light purple). From K = 5, Polynesians form their own ancestry component (green), which, like their deflection on the PCA plot, is most likely due to genetic drift or, alternatively, to cryptic relatedness or extreme inbreeding in the studied populations. However, the latter explanations are unlikely, given the lack of close relatives (up to third-degree, inclusive) in the four Polynesian groups and the normal range of inbreeding coefficients compared with other human populations (F_IS, Supplementary Table S6).
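As a rough illustration of how a figure such as "the first two PCs account for 38% of the variation" is obtained, the sketch below computes the proportion of variance per principal component from a genotype matrix. The toy matrix, the function name and the plain mean-centring are assumptions made for illustration; this is not the pipeline used in the paper, and real analyses usually also normalise each SNP and prune linked markers.

```python
import numpy as np

def pca_variance_explained(genotypes: np.ndarray) -> np.ndarray:
    """Return the fraction of total variance carried by each principal component.

    genotypes: (individuals x SNPs) matrix of allele counts coded 0, 1 or 2.
    """
    # Centre each SNP column so the PCs describe variation around the mean genotype.
    centred = genotypes - genotypes.mean(axis=0)
    # Singular values of the centred matrix; their squares are proportional
    # to the variance along each principal component (sorted largest first).
    singular_values = np.linalg.svd(centred, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

# Hypothetical toy data: 6 individuals x 5 SNPs, two loosely separated groups.
toy = np.array([
    [0, 0, 1, 2, 2],
    [0, 1, 1, 2, 2],
    [0, 0, 0, 2, 1],
    [2, 2, 1, 0, 0],
    [2, 1, 2, 0, 0],
    [2, 2, 2, 0, 1],
], dtype=float)

explained = pca_variance_explained(toy)
first_two = explained[:2].sum()
print(f"PC1 + PC2 explain {first_two:.0%} of the variance")
```

Scaling the plot axes by these proportions, as described for Fig. 1c, keeps distances along PC1 and PC2 visually comparable.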
https://media.springernature.com/lw900/springer-static/image/art%3A10.1038%2Fs41598-018-20026-8/MediaObjects/41598_2018_20026_Fig2_HTML.jpg

Ancestral genomic components in study populations estimated using ADMIXTURE. Details of the populations are provided in Supplementary Table S1B. The colors used have been selected to be equivalent to those used in Fig. 1. Only runs from K = 4 to 10 are shown; complete results (K = 2 to 15) are given in Supplementary Fig. S1. (a) For every value of K, the modal solution with the highest number (superscript) of ADMIXTURE35 runs is shown; individual ancestry proportions were averaged across all runs and the average cross-validation statistics were calculated across all runs from the same mode (Supplementary Fig. S2). The minimum cross-validation score is observed at K = 11 but no further components appear in the profiles of Polynesians after K = 10. Populations from the Philippines can be generally divided into Negritos (Aeta, Agta and Batak), Kankanaey of northwestern Luzon, and all others representing an amalgamation of groups from Luzon, Palawan and Visayas (see Fig. 1 and Supplementary Table S1B). (b) The average of K = 10 ADMIXTURE profiles for groups of Leeward Society individuals clustered by fineSTRUCTURE37 (Supplementary Fig. S6), indicating the heterogeneous distribution of East Asian and European ancestry among the Leeward Society Islanders.

The lowest cross-validation (CV) score of ADMIXTURE is observed at K = 11, but no additional ancestries appear in Polynesians after K = 10, which has the second lowest CV score (Fig. 2a, Supplementary Figs S1 and S2). At K = 10, a dark blue component appears that is almost fixed in the Kankanaey of northwestern Luzon. The distinctive and uniform profiles of additional ISEA, Melanesian, and East Asian ancestries in two (Tonga and Samoa) out of four, otherwise very closely related, Polynesian groups hint that these may be the result of an old admixture process, rather than genetic drift, extreme bottlenecks or algorithmic artifacts. In contrast, the noticeably uneven distribution of the East Asian (yellow) and western European (grey) ancestry components within the profiles of the Leeward Society individuals (Fig. 2b) is consistent with recent historical admixture events (see haplotype-based admixture analysis below).

The outgroup f3 allele-sharing plot36 shows the length of a phylogenetic branch shared between two study populations and African Yoruba. For the Leeward Society Isles (Supplementary Fig. S3, Supplementary Table S7), the f3 allele-sharing results are consistent with a most recent evolutionary history shared with Samoa, Tahiti, and Tonga. They also suggest that the Kankanaey of the Philippines and Taiwanese aborigines are the next closest populations to all four Polynesian groups. These results remain robust to the different SNP subsets or population clustering schemes used in the present study (Supplementary Figs S3, S4, Supplementary Table S7). In contrast, the f3 admixture plots (Supplementary Fig. S5, Supplementary Table S7), which detect the presence of admixture in a study population from two reference groups, display different results for western and eastern Polynesia. These differences could be explained by a reduced effective population size for eastern Polynesians, caused by bottlenecks during the initial settlement process, or because Tonga and Samoa have experienced additional admixture since they last shared a common ancestral gene pool with Tahiti and the Leeward Society Isles.
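The two uses of f3 described above can be sketched with one small function. This is a naive version that averages (t - a)(t - b) over SNPs from population allele frequencies; it omits the finite-sample bias correction and block-jackknife standard errors that production tools apply. All population names and frequencies below are hypothetical.

```python
import numpy as np

def f3(target, source_a, source_b):
    """Naive f3(target; source_a, source_b) from per-SNP allele frequencies.

    Averages (t - a) * (t - b) over SNPs. No bias correction and no
    standard error are computed in this sketch.
    """
    t, a, b = (np.asarray(x, dtype=float) for x in (target, source_a, source_b))
    return float(np.mean((t - a) * (t - b)))

# Outgroup form: f3(outgroup; X, Y) grows with the drift X and Y share
# after splitting from the outgroup. Frequencies are made up.
outgroup = [0.0, 0.0, 0.0, 0.0]
pop_x = [0.5, 0.5, 0.5, 0.5]
pop_y = [0.5, 0.5, 0.5, 0.5]   # drifted together with pop_x
pop_z = [0.1, 0.1, 0.1, 0.1]   # shares less drift with pop_x

shared_xy = f3(outgroup, pop_x, pop_y)
shared_xz = f3(outgroup, pop_x, pop_z)

# Admixture form: f3(target; ref1, ref2) < 0 indicates that the target
# has frequencies intermediate between the two references.
src_1 = [0.0, 1.0, 0.2, 0.8]
src_2 = [1.0, 0.0, 0.8, 0.2]
mixed = [0.5, 0.5, 0.5, 0.5]
admix_signal = f3(mixed, src_1, src_2)

print(shared_xy > shared_xz, admix_signal < 0)
```

Strong drift in the target after admixture can push the admixture form of f3 back above zero, which is one reason a bottlenecked population may fail this test even when it is admixed.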

The unsupervised fineSTRUCTURE (FS) analysis of haplotypes37 placed individuals into genetic clusters that include: Philippine groups from lowland Luzon, Palawan, and Visayas (‘Philippines 1’), Malaysia, Sulawesi, East Asia, northern Melanesia (Bougainville), New Guinea, and western Europe (Supplementary Fig. S6). The GLOBETROTTER (GT) analysis38 produced strong statistical support for two separate episodes of admixture involving the ancestors of the Leeward Society Islanders (Fig. 3, Supplementary Table S8). The first represents an average contribution of ~6% western European ancestry, which is dated to 1749–1803 CE. This is consistent with documented contact during Cook’s three voyages of exploration1, which took place in 1768–71, 1772–75 and 1776–80. The second episode is estimated to have occurred in an interval from ~1,200 to 1,700 BP (229–725 CE), and is composed of a minor component (~17%), comprising mainly northern Melanesian and New Guinea sources, and a major one (~83%), in which the largest contributions are attributable to the ‘Philippines 1’, Sulawesi, and Malaysian clusters. The chronology indicates that this episode occurred prior to the earliest widely accepted radiometric dates for the permanent settlement of eastern Polynesia, which centre on ~950 BP and come from archaeological sites on Rai’atea in the Leeward Society Isles26,27. In addition, the presence of northern Melanesian ancestry in the minor component of the second (older) episode of admixture (~8% of the genome) reflects some genetic contact with this region for the ancestors of the Leeward Society Islanders prior to 1,200–1,700 BP.
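The admixture dates above are quoted both in years BP and in calendar years CE. The sketch below shows the arithmetic linking the two. It assumes the usual convention that "present" in BP means 1950 CE, and it assumes a generation time of 28 years for converting generations to years; neither value is stated in the text quoted here, and the generation count in the example is hypothetical.

```python
def ce_to_bp(year_ce: float, present: int = 1950) -> float:
    """Convert a calendar year (CE) to years before present (assumed 1950)."""
    return present - year_ce

def generations_to_ce(generations: float,
                      years_per_generation: float = 28.0,  # assumed value
                      reference_year: int = 1950) -> float:
    """Convert an admixture age in generations to an approximate year CE."""
    return reference_year - years_per_generation * generations

# The quoted interval for the older episode, 229-725 CE, in years BP:
older_bound = ce_to_bp(229)    # 1721
younger_bound = ce_to_bp(725)  # 1225

# Hypothetical example: an event inferred 10 generations before 1950.
example_year = generations_to_ce(10)  # 1670.0

print(older_bound, younger_bound, example_year)
```

Under these assumptions the interval 229–725 CE corresponds to 1,721–1,225 BP, which matches the rounded "~1,200 to 1,700 BP" given in the text.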

Discussion
The genomes of contemporary Polynesian-speaking groups appear to be a mosaic of components derived from the coming together of long-diverged sources from ISEA and the region of northern Melanesia/New Guinea13,14,25. How this came about is the subject of considerable debate9,11,12,30. Our haplotype-based analysis of high-density autosomal SNPs indicates that, for the ancestors of the Leeward Society Isles, most of this admixture occurred during a period spanning ~1,200–1,700 BP. These genetic dates are nearly identical to those of a previous analysis that used a different method and amalgamated haplotype data from western (Tonga) and outlier (Rennell, Bellona and Tikopia) Polynesia25. They contrast with older dates obtained using different data sets and methods, which vary from ~7,000 BP to ~2,700 BP13,14,43. The method used here has been demonstrated to accurately identify known historical admixture events during the past 2,000 years38,44, but it is also possible that other analytical approaches may provide insights into a different part of the genealogical process.

The presence of this demographic signal in the data from the Leeward Society Isles is important, since it is consistent with archaeological evidence for a late settlement model for eastern Polynesia ~950 BP, and, therefore, the linguistic sub-grouping of Wilson32 (Fig. 1b). The substantial body of linguistic evidence supporting this sub-grouping includes over 200 lexical and grammatical innovations that are shared between the languages of eastern Polynesia and the central northern outliers (Luangiua, spoken on Ontong Java; Takuu; Nukumanu; and Nuguria). Moreover, these innovations are stepwise and directional in nature, a pattern that is only consistent with a west-to-east movement of people, tracing the origins of eastern Polynesians to central northern outlier Polynesia, rather than Samoa32,33. The principal component analysis and phylogenetic reconstruction of the Polynesian mtDNA B4a1a1 sub-groups and C2a1-P33 paternal lineages (Supplementary Figs S8–S10, S12) are consistent with this linguistic evidence for the recent settlement of eastern Polynesia from the central northern outliers.

A further important contribution to the debate on Polynesian origins is the partitioning of northern Melanesian ancestry into both sides of the admixture episode taking place ~1,200–1,700 BP in the ancestors of the Leeward Society Islanders (Fig. 3). In particular, the contribution of ~8% of this ancestry to the side containing the ISEA sources is significant, because it suggests an earlier episode of admixture affecting the population ancestral to the Leeward Society Islanders. This result is robust to analysis by subsets of the data (Supplementary Fig. S7), but it is not possible to determine how and when this northern Melanesian ancestry entered into the ancestral gene-pool of the Leeward Society Islanders. It, therefore, remains feasible that, for some groups of Austronesian speaking migrants from ISEA, genetic admixture accompanied cultural interaction during the formative period of the LCC in the Bismarck archipelago ~3,450–3,250 BP8,15, which precedes the settlement of southern Melanesia and western Polynesia by at least 200 years3,45.

The position of the Kankanaey as the closest group to the Leeward Society Islanders in the outgroup f3 allele-sharing plots (Supplementary Figs S3 and S4), while not making any significant contribution to their genomes in the GLOBETROTTER38 (GT) results (Fig. 3), is potentially very revealing. It is arguable that one or the other result is misleading as an effect of severe genetic drift. However, this hypothesis requires the concurrent excess retention of either SNPs (should the f3 results be taken at face value) or haplotypes (should we trust only GT) typical of those found in the Leeward Society Islands today, which is statistically unlikely. Alternatively, while the Kankanaey are indeed the single best remaining proxy for the ancestors of the Leeward Society Islanders, the ‘Philippines 1’ cluster is admixed with a population that is genetically closer to those ancestors than the Kankanaey are. Specifically, although the ‘Philippines 1’ cluster has received extensive admixture from other groups, which lowers its f3 score, it retains the best proxy for the haplotypic variation found in the original ancestors of the Leeward Society Islanders. This hypothesis is preferred because the GT approach models the recipient population using donors who are reconstructed rather than observed, allowing for subsequent admixture in the donor groups38.

Within the geographical context of the Philippines, the GT results make sense because the populations making up the other three Philippines clusters are all located in mountainous regions and have languages that are either relics or indicate long-term isolation46,47. In contrast, the ancestors of the demographic expansion that led to the settlement of Polynesia are anticipated to be part of a recent seafaring tradition. This necessarily would have been based in the coastal regions and could be related to pre-existing trading networks within ISEA that already had links to Melanesia (see Donohue and Denham48 and comments for a discussion of this subject). In this respect, it is interesting to note that the age of the most recent common ancestor of the Y chromosome haplogroup O3i-B451 (5,900–8,100 BP, Supplementary Table S10B), proposed as a marker for the expansion of Austronesian speaking people throughout ISEA40, exceeds the proposed timing for the transfer of the Neolithic from Taiwan (4,200 BP)11.

Within the Society Islands themselves, maternally-inherited mitochondrial DNA lineages are strongly biased towards variants thought to be associated with the dispersal of Austronesian speakers (96% B4a1a1, Supplementary Table S9A). The best candidate for a contribution from the Austronesian speaking diaspora of ISEA to the male lineages of the Leeward Society Islands is haplogroup O3i-B451. However, it contributes less than 10% to the Leeward Society Islands paternal lineages (Supplementary Table S3). The majority of Y chromosome lineages have proposed ancestral associations with modern Papuan groups (C2a1-P33 and S2a-P79, Supplementary Table S9B)13,17. This sex bias holds across Polynesia and is observed as far back as Island Southeast Asia49, and may have resulted from the practice of exogamy and matrilocal post-marital residence among early Austronesian speaking groups50. A sex bias is also reflected in the nuclear genomes of Austronesian speakers and appears to be a characteristic of the Pacific region as a whole25,51.

In conclusion, the picture of Polynesian origins emerging from the present study is one of a more complex demographic history than that originally envisioned in the phylogenetic model of cultural evolution12. The results presented here provide support for models based on inter-connectivity among, and within, the different parts of the Pacific, rather than their relative isolation8,9. The new data concur with a late chronology for the settlement of eastern Polynesia, which fits better with the linguistic arguments and haploid data linking this region to the central northern Polynesian outliers. With respect to the ultimate origin of the Island Southeast Asian ancestry found in the Leeward Society Isles, the results indicate a significant role for the lowland region of the Philippines, as predicted by Johann Reinhold Forster in his seminal comparative study of languages conducted more than two hundred and forty years ago.