Sunday, July 31, 2016

East Asians are part Ancient North Eurasian?

This was one helluva bomb the new Lazaridis et al. paper managed to drop at the end of their supplemental where they claimed East Asians are a mixture between MA-1-related peoples and a truly "Eastern Non-African" population:


To be fair, the ANE-related admixture doesn't seem substantial (10-15% ANE + 85-90% ENA/South Eurasian-like) when dealing with East Asians like the Japanese and Han Chinese who still seem mostly Eastern Non-African derived:




Once again we have formal-stat-based methods like qpAdm picking up on gene flow which, to my knowledge, wasn't really caught by ADMIXTURE.

Now, it is intriguing to point out that, at least at the higher Ks of some runs, the Japanese & Han did show some Nganasan-like ancestry:




However, the Nganasan-like admixture seems minuscule in the Japanese, Korean and Northern Han Chinese samples and doesn't even show up in the non "Northern" Han Chinese sample-set. Nganasans are a Siberian population whom you'd expect to show some notable ANE-related ancestry but the amounts of ancestry the Japanese, Korean and Northern Han samples are showing from a population like them aren't enough to explain the levels of ANE-related ancestry we see with this study's qpAdm models.

So, ADMIXTURE, in this case, really didn't pick up on something qpAdm did, as far as I can see. David Wesolowski's own K=8 had the Japanese at about 1-3% Ancient North Eurasian at most, and those are negligible/noise levels.

ADMIXTURE might've mildly caught wind of this ANE-related admixture but its results are definitely not consistent with what these new qpAdm runs are implying.




But I suppose one could argue that the problem lies with ADMIXTURE itself: East Asian populations usually form their own cluster at the very early Ks, so the results are being heavily skewed by how drifted East Asians are from West Eurasians.

That same substantive genetic drift helps form the genetic structure in the above PCA. The Y-axis marks the divide between African populations (with very little to no Eurasian admixture) and Out-of-Africa populations whilst the X-axis marks the divide between Eastern Non-Africans and West Eurasians (East Asians pulling farther away from West Eurasians than Papuans do in this case).

So, it is possible, I suppose, that all prior ADMIXTURE analyses were being fooled by this substantive and perhaps somewhat recently cemented drift (last 10,000-20,000 or so years?) which analyses like qpAdm are more resistant to. But, what's even more intriguing is that tree-mixes from David Wesolowski support this new paper's claims & data:




Nevertheless, I'm remaining somewhat skeptical about this until we have ancient DNA from around East Asia that might refute or reaffirm these new findings. We might discover something more interesting than these populations simply being part ANE-related. 

What I find especially odd is that even Southeast Asians show such admixture at levels comparable to those of the Japanese, Han and Koreans whom they're somewhat distinct from in terms of their genetic history. ~14% for the Thai and ~12% for Cambodians? Why is everyone so uniformly Ancient North Eurasian here (between 10-15%)? This somewhat brings the whole Mota debacle to mind, actually. [note]

Everyone (other than Siberians, Mongolians & Central Asians whom we've known for a while now have some Ancient North Eurasian-related ancestry) is turning up as 10-15% "ANE" and that's honestly a bit suspicious and is why I'm skeptical about this new data.


References:


2.  Ancient human genomes suggest three ancestral populations for present-day Europeans, Lazaridis et al. 

Notes:

1. I think this is honestly a quirk of the qpAdm model they used, which basically models these populations as "Onge + ANE". It could be that all we're seeing is that these populations have an understandable shift away from Onge-like Eastern Non-Africans that isn't necessarily the result of Ancient North Eurasian-related admixture. However, it does say something that the same pattern presented itself (when dealing with the Han) via tree-mixes. At any rate, we should wait on some East Asian ancient DNA.
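
To make the worry in this note a bit more concrete, here's a toy sketch. It is NOT qpAdm (which works on f4 statistics against a set of outgroups); it's just a two-source least-squares fit on made-up 2D coordinates. The function name and every number below are hypothetical. The point is only that if you force a target to be "source A + source B", any shift away from B that partly lines up with A gets reported as A-related ancestry, even when the real donor is a third, unsampled population.

```python
# Toy illustration only: a constrained two-source least-squares fit.
# This is NOT qpAdm, and all coordinates are invented for the example.

def two_way_fit(target, src_a, src_b):
    """Return (w, sse): weight w on src_a (1 - w on src_b) that minimises
    the squared distance between target and w*src_a + (1 - w)*src_b."""
    diff = [a - b for a, b in zip(src_a, src_b)]
    resid = [t - b for t, b in zip(target, src_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two sources are identical")
    w = sum(r * d for r, d in zip(resid, diff)) / denom
    fitted = [w * a + (1 - w) * b for a, b in zip(src_a, src_b)]
    sse = sum((t - f) ** 2 for t, f in zip(target, fitted))
    return w, sse

onge_like = [0.0, 0.0]   # hypothetical "Onge-like" source
ane_like = [1.0, 0.0]    # hypothetical "ANE-like" source
unsampled = [0.4, 1.0]   # hypothetical third source, absent from the model

# Target built as 70% Onge-like + 30% unsampled: no ANE-like input at all.
target = [0.7 * o + 0.3 * u for o, u in zip(onge_like, unsampled)]

w, sse = two_way_fit(target, ane_like, onge_like)
print(f"fitted 'ANE-like' weight: {w:.2f}, leftover squared error: {sse:.2f}")
```

The fit hands back a non-zero "ANE-like" weight anyway, and the only hint that something is off is the leftover error. In a real qpAdm run that hint would presumably only show up as a poor fit if the outgroups used can tell the unsampled source apart from the two in the model.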

Tuesday, July 19, 2016

Natufians were dark-skinned?

The creator of the PuntDNAL ancestry project (Abdullahi Warsame) managed to convert one of the Natufian samples' raw data (I1072) into a .txt file [note] and it seems, as per his findings, that the sample is GG for the SLC24A5 gene's rs1426654 SNP.





I've somewhat gone over this before, but that would mean he lacks the derived A allele for the SNP, which plays an important role in modern West Eurasians' de-pigmentation. It's responsible for roughly 1/3 of the skin pigmentation difference between Europeans and African populations with little to no Eurasian admixture.
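
For anyone who wants to check a converted .txt file themselves, here's a minimal sketch of the lookup. It assumes a 23andMe-style layout (whitespace-separated columns: rsid, chromosome, position, genotype, with "#" comment lines); I don't know the exact layout of the file Abdullahi produced, so treat that as an assumption. The function names are mine, and the sample line uses a placeholder position.

```python
# Minimal sketch: look up one SNP in a raw-data text file and count derived alleles.
# Assumed layout (23andMe-style): rsid, chromosome, position, genotype.

def lookup_genotype(lines, rsid):
    """Return the genotype string for rsid, or None if the SNP is absent."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.split()
        if len(fields) >= 4 and fields[0] == rsid:
            return fields[3]
    return None

def count_derived(genotype, derived_allele):
    """Number of copies (0, 1 or 2) of the derived allele in a genotype call."""
    return genotype.count(derived_allele)

# Hypothetical sample content; the position column is a placeholder, not a real coordinate.
sample_lines = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs1426654\t15\t0\tGG",
]

genotype = lookup_genotype(sample_lines, "rs1426654")
copies = count_derived(genotype, "A")
print(f"rs1426654: {genotype}, derived A copies: {copies}")
```

With a real file you'd pass the open file handle as `lines`. Bear in mind that with ancient samples a single call like this can still be down to low coverage or DNA damage.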

Lacking it, as well as the derived allele for SLC45A2's rs16891982 SNP, is why the reconstruction of La Braña-1 above has him being rather pigmented/dark-skinned. Western European Hunter-Gatherers like him had the alleles required for light eyes but not for light skin:


Neolithic farmers from Western Anatolia are responsible for bringing the derived alleles required from SLC24A5 and Eastern European Hunter-Gatherers seemingly carried the derived alleles required from SLC45A2 (the farmers carried this particular derived allele at very low frequencies). So, Bronze Age pastoralists from the steppe and Neolithic Farmers from West Asia ultimately brought de-pigmentation/light-skin to much of Peninsular Europe.

Anyway, if this Natufian is really GG for that SNP (he's most likely ancestral for SLC45A2's rs16891982 SNP as well), and the result is legit rather than an artifact of DNA damage, then it seems this particular Natufian lacked modern West Eurasians' de-pigmentation. No idea if the other samples are the same, but they might be, and it'd be pretty intriguing if they are.


General spread of the Natufian culture

What's even more interesting is that the Neolithic Levantines whose data I've managed to sift through (4 samples) are AA for SLC24A5's rs1426654 SNP, making them more similar to Western Neolithic Anatolians and Early European Farmers in this respect.

If this Natufian's result is truly legitimate, and the other samples and numerous future Natufian samples prove to be just like him in this respect, it seems like something from outside the Levant brought the derived allele to the region, as Caucasus Hunter-Gatherers carried it, Neolithic Iranians seemingly carried it and Western Neolithic Anatolians did as well. For Neolithic and not Epipaleolithic Levantines to carry it could mean an outside population brought it in, perhaps from somewhere more north or east. But we'll see what future data on Natufians reveals.

Reference List: 

1. The genetic structure of the world's first farmers, Lazaridis et al. 2016

2. The genetic history of Ice Age Europe, Fu et al.

3. Eight thousand years of natural selection in Europe, Mathieson et al.

Update:

I originally made a mistake in not noting that the early farmers (Neolithic Anatolians and Early European Farmers) did seemingly carry the derived allele at SLC45A2's rs16891982 SNP. They did seem to carry the derived (C) allele for this SNP but at what look to be somewhat low frequencies (see here). I used to be under the incorrect impression that they didn't carry it. Apologies...

Tuesday, July 5, 2016

Somali qpAdm models using new ancient genomes

So, I asked David over at Eurogenes to run Somalis as a mixture between South Sudanese people and Natufians in order to see how well the model would fit using a formal statistical method like qpAdm and he got some pretty surprising results overall:


Natufian + Sudanese (south):

Sudanese: 54%
Natufian: 46%

Neolithic Levant + Sudanese (south):

Sudanese: 54%
Neolithic Levant: 46%

Neolithic Levant + Chalcolithic Iran + Sudanese (south):

Sudanese: 55%
Neolithic Levant:  34%
Chalcolithic Iran: 11%


Now, what's going to surprise you is that the third model is the one that fits the best, and by a long shot when compared to the first model. Natufian + Sudanese (south) fits the worst (chisq: 26.256, tail prob: 0.09%, std. errors: 0.009), Neolithic Levant + Sudanese (south) fits much better (chisq: 7.593, tail prob: 47%, std. errors: 0.006) and Neolithic Levant + Chalcolithic Iran + Sudanese (south) fits even better (chisq: 4.975, tail prob: 66%, std. errors: 0.057).
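
If you want to sanity-check those tail probabilities, they're just the upper tail of a chi-square distribution. Here's a small standard-library sketch. The degrees of freedom aren't given in the output quoted above, so the 8 (two-source models) and 7 (three-source model) used below are my assumption; they're simply the values that reproduce the quoted tail probs.

```python
import math

def chi2_tail_prob(chisq, dof):
    """Upper-tail probability P(X >= chisq) for a chi-square variable
    with an integer number of degrees of freedom (dof >= 1)."""
    if dof < 1 or chisq < 0:
        raise ValueError("need dof >= 1 and chisq >= 0")
    half = chisq / 2.0
    # Closed-form starting points: dof = 1 (odd case) or dof = 2 (even case).
    if dof % 2:
        prob, k = math.erfc(math.sqrt(half)), 1
    else:
        prob, k = math.exp(-half), 2
    # Step up two degrees of freedom at a time:
    # Q(x; k) = Q(x; k - 2) + (x/2)^(k/2 - 1) * exp(-x/2) / Gamma(k/2)
    while k < dof:
        k += 2
        prob += half ** (k / 2 - 1) * math.exp(-half) / math.gamma(k / 2)
    return prob

# chisq values are from the post; the dof values are assumed (see above).
models = [
    ("Natufian + Sudanese (south)", 26.256, 8),
    ("Neolithic Levant + Sudanese (south)", 7.593, 8),
    ("Neolithic Levant + Chalcolithic Iran + Sudanese (south)", 4.975, 7),
]

for name, chisq, dof in models:
    print(f"{name}: tail prob = {chi2_tail_prob(chisq, dof):.2%}")
```

A higher tail prob means the model is harder to reject, which is why the third model counts as the best fit and the first (below 0.1%) as the worst.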

The last one almost fits as well as a Corded Ware sample being modeled as ~70% Yamnaya & ~30% Esperstedt Middle-Neolithic (chisq: 2.621, std. errors: 0.060) which roughly fits with the data from peer-reviewed studies like Haak et al. 2015:





This oddly reminds me of some models the new Lazaridis et al. pre-print shared where they were asserting that Somalis were a mixture between Mota & a population along the Iran_ChL→Levant_BA cline:




I didn't make much of the above at the time. For one, Mota is a poorer fit for Somalis' African ancestry than the South Sudanese (due to various reasons alluded to here), and it made little sense that Somalis' West Eurasian ancestry corresponded better with Bronze Age Levantines and Copper Age Iranians than Neolithic Levantines, for example. At least in my humble opinion.

I figured Lazaridis & company just didn't try a different model that would probably fit better, but oddly enough it now lines up fairly well with what the above qpAdm models imply: "Neolithic Levantine + Chalcolithic Iranian + Sudanese (south)" fits much better than "Natufian + Sudanese (south)" and somewhat better than "Neolithic Levant + Sudanese (south)".



Future analyses and data will be needed. I should point out that some current ADMIXTURE runs don't seem entirely supportive of such a model, but then ADMIXTURE is not necessarily as precise as qpAdm can be.

For one, qpAdm is preferable because it outright lets you take a Natufian and then a Neolithic Levantine and see which one is the better source for you (mixture-wise). ADMIXTURE is a lot messier: it lets all of these clusters form among various modern & pre-historic populations and is thus more prone to producing clunky results. Formal statistical methods like qpAdm also seem to be "drift resistant", i.e. resistant to being skewed by recent genetic drift, and can thus pick up on deeper ancestry better than ADMIXTURE, to a certain degree.

But we should see what some other analyses say like d-stats and tree-mix. I'm skeptical about the third model in particular (for the time being), despite how well it fits.

Notes: 

1. Link to the full qpAdm results.

Saturday, July 2, 2016

David's Treemix results for Natufians & Neolithic Levantines

I hesitated to make this post as David covered it adequately himself, and you can check his post out for all the tree-mixes, but it is worth noting that even tree-mixes show Natufians and even Neolithic Levantines having some African ancestry:




This is a strange puzzle, quite frankly. Especially given the Lazaridis et al. 2016 Pre-Print's claim below:



"However, no affinity of Natufians to sub-Saharan Africans is evident in our genome-wide analysis, as present-day sub-Saharan Africans do not share more alleles with Natufians than with other ancient Eurasians (Extended Data Table 1). (We could not test for a link to present-day North Africans, who owe most of their ancestry to back-migration from Eurasia)."


In fact, as David's found, formal stats don't imply that Natufians have African admixture the way these tree-mixes, ADMIXTURE & PCAs do:


Columns: Pop1 Pop2 Pop3 Pop4 D Z-score SNPs

Chimp Biaka Anatolia_Neolithic Israel_Natufian -0.000422 -1.539 414749
Chimp Biaka Iran_Hotu Israel_Natufian 0.000981 1.199 70803
Chimp Biaka Iran_Neolithic Israel_Natufian -0.000223 -0.566 367632

Chimp Mbuti.DG Anatolia_Neolithic Israel_Natufian -0.000312 -1.113 481333
Chimp Mbuti.DG Iran_Hotu Israel_Natufian 0.000703 0.906 81688
Chimp Mbuti.DG Iran_Neolithic Israel_Natufian -0.000043 -0.104 425175

Chimp Mota Anatolia_Neolithic Israel_Natufian -0.000734 -1.933 481191
Chimp Mota Iran_Hotu Israel_Natufian 0.000686 0.644 81676
Chimp Mota Iran_Neolithic Israel_Natufian -0.000388 -0.768 425056

Chimp Yoruba Anatolia_Neolithic Israel_Natufian -0.000407 -1.407 414749
Chimp Yoruba Iran_Hotu Israel_Natufian 0.000552 0.654 70803
Chimp Yoruba Iran_Neolithic Israel_Natufian 0.000026 0.063 367632


The above shows that Natufians don't have any "shift" toward Africans away from other pre-historic Eurasians (every Z-score is below 2 in absolute value). In other words, Africans don't share more alleles with them than with Neolithic Anatolians or Neolithic Iranians (who don't seem to show such admixture in tree-mixes and such), which is in line with what the Lazaridis Pre-Print was asserting.
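
Here's a quick sketch for reading the D-stats above programmatically. It assumes each row is four population labels followed by D, the Z-score and the SNP count, and it uses |Z| >= 3 as the significance cut-off. That threshold is a common convention rather than anything stated in David's output, so consider it my assumption, along with the function names.

```python
# D-statistic rows copied from the post: four populations, D, Z-score, SNP count.
DSTATS = """\
Chimp Biaka Anatolia_Neolithic Israel_Natufian -0.000422 -1.539 414749
Chimp Biaka Iran_Hotu Israel_Natufian 0.000981 1.199 70803
Chimp Biaka Iran_Neolithic Israel_Natufian -0.000223 -0.566 367632
Chimp Mbuti.DG Anatolia_Neolithic Israel_Natufian -0.000312 -1.113 481333
Chimp Mbuti.DG Iran_Hotu Israel_Natufian 0.000703 0.906 81688
Chimp Mbuti.DG Iran_Neolithic Israel_Natufian -0.000043 -0.104 425175
Chimp Mota Anatolia_Neolithic Israel_Natufian -0.000734 -1.933 481191
Chimp Mota Iran_Hotu Israel_Natufian 0.000686 0.644 81676
Chimp Mota Iran_Neolithic Israel_Natufian -0.000388 -0.768 425056
Chimp Yoruba Anatolia_Neolithic Israel_Natufian -0.000407 -1.407 414749
Chimp Yoruba Iran_Hotu Israel_Natufian 0.000552 0.654 70803
Chimp Yoruba Iran_Neolithic Israel_Natufian 0.000026 0.063 367632
"""

def parse_dstats(text):
    """Parse whitespace-separated rows into dicts; skip blank or malformed lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 7:
            continue
        rows.append({
            "pops": tuple(parts[:4]),
            "d": float(parts[4]),
            "z": float(parts[5]),
            "snps": int(parts[6]),
        })
    return rows

def significant(rows, threshold=3.0):
    """Rows whose |Z| reaches the (assumed) significance threshold."""
    return [row for row in rows if abs(row["z"]) >= threshold]

rows = parse_dstats(DSTATS)
largest = max(rows, key=lambda row: abs(row["z"]))
print(len(rows), "tests,", len(significant(rows)), "with |Z| >= 3")
print("largest |Z|:", abs(largest["z"]), "for", " ".join(largest["pops"]))
```

None of the twelve tests clears the threshold. The closest is the Mota / Anatolia_Neolithic comparison, and even that one falls short of |Z| = 2.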

It's quite strange because essentially all analyses other than D/f4 stats show that the Natufians in particular supposedly have African admixture. It's also interesting how the tree-mixes David's released show two migration edges (arrows) going toward the Neolithic Levantines and Natufians:



One tends to look more overtly African and comes in from either Mota's branch or, as it is above, from in-between Biakas and Yorubas whilst the other tends to sit between Mota and all the Eurasians present in the tree. David suggests that the latter is a sign of "Basal Eurasian", and it is interestingly the migration edge that tends to go directly into the Natufians themselves. The other seems more overtly African and tends to go to the root of both Natufians and Neolithic Levantines while being less significant.

We've had Natufian~Neolithic Levantine-like samples show African affinities via tree-mixes in the past such as Stuttgart (an Early European Farmer):



And, interestingly, Stuttgart's African admixture in that tree-mix above (also made by David) looks Hadza-like, which is notable because of the similarity between Hadzas and Mota (see here). But there aren't two migration edges or two distinct elements from the African samples present going into Stuttgart.

Perhaps it's a quirk of Stuttgart's heightened WHG-related admixture in comparison to Neolithic Levantines and Natufians, or perhaps the latter two groups of samples actually do have some African admixture alongside their Basal Eurasian ancestry and the formal stats and such are wrong? Or is all of this, in all the tree-mixes including the new ones, caused by Basal Eurasian, in which case we shouldn't make too much of it?

Though, it's worth noting that Stuttgart's PCA position does not imply a pull toward African populations when compared to various West Eurasians and North Africans. There is a slight pull in the case of Neolithic Levantines and a more overt one in the case of Natufians, however (see here). And, whilst I haven't seen the ADMIXTURE results of a Neolithic Levantine sample, unlike that one Natufian from earlier, Stuttgart clearly shows no African admixture via ADMIXTURE runs.

Puzzle indeed.


Notes:

1. I said Stuttgart/Early European Farmers have Natufian-like ancestry because that's what this new Pre-print actually implies with its ADMIXTURE run. The Neolithic Anatolian Farmers and Neolithic/Early European Farmers look like they're a mixture between the blue cluster which dominates Neolithic Levantines and Natufians (Epipaleolithic Levantines) and the red cluster which dominates "WHGs". The farmers who hit up Europe clearly seem more related to those in the Levant than those in Iran if that ADMIXTURE and mostly Pan-West Eurasia PCA are any indication.

2. Stuttgart's Gedmatch kit number: F999916

3. At this point, ADMIXTURE & PCAs imply African admixture in these Natufian samples but the formal stats just aren't picking up on this, which is odd. The formal stats indicate that Africans don't share more alleles with Natufians than with other pre-historic Eurasians, as this new Pre-Print asserts at one point.