Friday, August 19, 2016

How things are looking so far

As David's recently pointed out, it does seem as though the old fateful triangle still remains:




Most West Eurasian populations still look like they're mostly, on a basal level, divvied up between a Basal Eurasian rich component (rather similar to the old "ENF" cluster), Ancient North Eurasian-related ancestry and Western European Hunter-Gatherer/Villabruna-cluster-related ancestry.

It seems that, in this set-up, the main distinction between groups such as Neolithic Iranians (alongside Caucasus Hunter-Gatherers) & Neolithic Levantines is that one is "Basal-rich + ANE"  and the other is "Basal-rich + Villabruna":

The first PCA, where the triangle is clearly visible, is based directly on David's new Basal-rich K=7 ADMIXTURE run, but its data is still in line with what we see in a PCA based directly on autosomal SNPs like the one below:


Here, you can see roughly the same population structure. The farther north a population pulls, the more ANE-related ancestry it has; the farther east it pulls, the more "Basal-rich" type ancestry it has; and, finally, the farther west it pulls, the more Villabruna-related ancestry it has.

So it seems David's indeed come up with a decent model here. What we really need now is to figure out exactly what "Basal Eurasian" is and to understand some of the earlier pre-history of West Asia to a point where we can grasp how the substantial Ancient North Eurasian-related and Villabruna-related ancestry, found in its Neolithic and Epipaleolithic inhabitants, got there. [note]


References:

1. The genetic structure of the world's first farmers, Lazaridis et al. 2016

2. The Demographic Development of the First Farmers in Anatolia, Kılınç et al.

Notes:

1. The "Basal-rich" cluster itself is likely to be a mixture between something related to European Hunter-Gatherers like those of the "Villabruna/WHG" cluster and Basal Eurasian ancestry. Even the new Lazaridis pre-print implies as much via this figure when demonstrating what makes up the Natufian samples.

Some new Neolithic Anatolians

Well, thanks to a new study, we now have some new Neolithic Anatolian samples. This time they're from South-Central Anatolia and date to between 8300 and 5800 BCE.




It seems the Boncuklu samples (~8300-7500 BCE) are nearly identical to the Barcın (Northwestern) Neolithic Anatolians in terms of WHG/Villabruna-related ancestry and "ENF-like/Basal-rich"-related ancestry, whilst the Tepecik samples (~7500-5800 BCE) are less Villabruna-shifted and thus pull a bit more toward Neolithic Levantines and Natufians. You can see this in David's PCA (Principal Component Analysis based on autosomal SNPs) below:



The farther left a population pulls, the greater its affinity for Villabruna-type Hunter-Gatherers, whilst the farther right it pulls, the more ENF-like/Basal-rich it is. Finally, the farther north a population pulls, the more "ANE"-shifted it is. Neolithic Anatolians, Neolithic Levantines and Natufians pull the least toward the north as they seem to lack ANE-related admixture.

References

1. The Demographic Development of the First Farmers in Anatolia, Kılınç et al.

Notes:

1. The mtDNA diversity among these samples is rather interesting to me as someone from the Horn of Africa, to be honest. N1a1a1, N1b, K1a, K1a12a, U3: these are quite close to or directly overlap with the mtDNA Haplogroups you can find among Somalis and other Horn African populations. 

Sunday, July 31, 2016

East Asians are part Ancient North Eurasian?

This was one helluva bomb the new Lazaridis et al. paper managed to drop at the end of their supplemental where they claimed East Asians are a mixture between MA-1-related peoples and a truly "Eastern Non-African" population:


To be fair, the ANE-related admixture doesn't seem substantial (10-15% ANE + 85-90% ENA/South Eurasian-like) when dealing with East Asians like the Japanese and Han Chinese who still seem mostly Eastern Non-African derived:




Once again we have formal stat based methods like qpAdm picking up on gene-flow which wasn't really caught by ADMIXTURE, to my knowledge. 

Now, it is intriguing to point out that, at least at the higher Ks of some runs, the Japanese & Han did show some Nganasan-like ancestry:




However, the Nganasan-like admixture seems minuscule in the Japanese, Korean and Northern Han Chinese samples and doesn't even show up in the non "Northern" Han Chinese sample-set. Nganasans are a Siberian population whom you'd expect to show some notable ANE-related ancestry but the amounts of ancestry the Japanese, Korean and Northern Han samples are showing from a population like them aren't enough to explain the levels of ANE-related ancestry we see with this study's qpAdm models.

So, ADMIXTURE, in this case, really didn't pick up on something qpAdm did, as far as I can see. David Wesolowski's own K=8 at most had the Japanese at ~1-3% Ancient North Eurasian and those are negligible/noise levels.

ADMIXTURE might've mildly caught wind of this ANE-related admixture but its results are definitely not consistent with what these new qpAdm runs are implying.




But I suppose the argument one can make is that the problem with ADMIXTURE is how East Asian populations usually form their own cluster at the very early Ks. The analysis is being heavily skewed by how drifted East Asians are from West Eurasians.

That same substantive genetic drift helps form the genetic structure in the above PCA. The Y-axis marks the divide between African populations (with very little to no Eurasian admixture) and Out-of-Africa populations whilst the X-axis marks the divide between Eastern Non-Africans and West Eurasians (East Asians pulling farther away from West Eurasians than Papuans do in this case).

So, it is possible, I suppose, that all prior ADMIXTURE analyses were being fooled by this substantive and perhaps somewhat recently cemented drift (last 10,000-20,000 or so years?) which analyses like qpAdm are more resistant to. But, what's even more intriguing is that tree-mixes from David Wesolowski support this new paper's claims & data:




Nevertheless, I'm remaining somewhat skeptical about this until we have ancient DNA from around East Asia that might refute or reaffirm these new findings. We might discover something more interesting than these populations simply being part ANE-related. 

What I find especially odd is that even Southeast Asians show such admixture at levels comparable to those of the Japanese, Han and Koreans whom they're somewhat distinct from in terms of their genetic history. ~14% for the Thai and ~12% for Cambodians? Why is everyone so uniformly Ancient North Eurasian here (between 10-15%)? This somewhat brings the whole Mota debacle to mind, actually. [note]

Everyone (other than Siberians, Mongolians & Central Asians whom we've known for a while now have some Ancient North Eurasian-related ancestry) is turning up as 10-15% "ANE" and that's honestly a bit suspicious and is why I'm skeptical about this new data.


References:


2.  Ancient human genomes suggest three ancestral populations for present-day Europeans, Lazaridis et al. 

Notes:

1. I think this is honestly a quirk of the qpAdm model they used. Since these populations are basically being modeled as "Onge + ANE", it could be that all we're seeing is that they have an understandable shift away from Onge-like Eastern Non-Africans that may not necessarily be characterized by Ancient North Eurasian-related admixture. However, it does say something that the same pattern presented itself (when dealing with the Han) via tree-mixes. At any rate, we should wait on some East Asian ancient DNA.

Tuesday, July 19, 2016

Natufians were dark-skinned?

The creator of the PuntDNAL ancestry project (Abdullahi Warsame) managed to convert one of the Natufian samples' raw data (I1072) into a .txt file [note] and it seems, as per his findings, that the sample is GG for the SLC24A5 gene's rs1426654 SNP.
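For anyone wanting to repeat this kind of check on a converted file, here is a minimal sketch in Python. It assumes the .txt file follows the common tab-separated raw-data layout (rsid, chromosome, position, genotype), which is my assumption and not something confirmed for this particular conversion. The embedded rows, the placeholder rsid and the function names are hypothetical and only there for illustration.

```python
import io

# Hypothetical excerpt of a tab-separated raw-data export
# (columns assumed: rsid, chromosome, position, genotype).
# The rows are illustrative and are not the real I1072 file.
RAW_TEXT = """# rsid\tchromosome\tposition\tgenotype
rs1426654\t15\t48426484\tGG
rs_example_nocall\t1\t1000\t--
"""


def load_genotypes(handle):
    """Return {rsid: genotype} from an assumed four-column raw-data file."""
    genotypes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        rsid, _chrom, _pos, genotype = line.split("\t")
        genotypes[rsid] = genotype
    return genotypes


def describe_rs1426654(genotype):
    """Summarise a call at rs1426654, where A is derived and G ancestral."""
    if not genotype or "-" in genotype:
        return "no call"
    derived = genotype.count("A")
    return f"{derived} derived (A), {len(genotype) - derived} ancestral (G)"


genotypes = load_genotypes(io.StringIO(RAW_TEXT))
print(describe_rs1426654(genotypes.get("rs1426654")))
```

With a real file you would pass an opened file handle in place of the `io.StringIO` wrapper. A lookup like this says nothing about coverage or DNA damage, so it is only a first look.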





I've somewhat gone over this before but that would mean he lacks the derived A allele for the SNP which is important in modern West Eurasians' de-pigmentation. It's responsible for 1/3 of the skin pigmentation difference between Europeans and mostly non-Eurasian admixed African populations.

Lacking it, as well as the derived alleles for SLC45A2's rs16891982 SNP, is why the reconstruction of La Braña-1 above has him being rather pigmented/dark-skinned. Western European Hunter-Gatherers like him had the alleles required for light eyes but not for light skin:


Neolithic farmers from Western Anatolia are responsible for bringing the derived SLC24A5 allele, and Eastern European Hunter-Gatherers seemingly carried the derived SLC45A2 allele. So, Bronze Age pastoralists from the steppe and Neolithic Farmers from West Asia ultimately brought de-pigmentation/light skin to much of Peninsular Europe.

Anyway, if this Natufian is really GG for that SNP (he's most likely ancestral for SLC45A2's rs16891982 SNP as well), and the result is legitimate and not caused by DNA damage, then it seems like this particular Natufian was an individual lacking in modern West Eurasians' de-pigmentation. No idea if the other samples are the same but they might be and it'd be pretty intriguing if they are.


General spread of the Natufian culture

What's even more interesting is that the Neolithic Levantines whose data I've managed to sift through (4 samples) are AA for SLC24A5's rs1426654 SNP, making them more similar to Western Neolithic Anatolians and Early European Farmers in this respect.

If this Natufian's result is truly legitimate, and the other samples and numerous future Natufian samples prove to be just like him in this respect, it seems like something from outside the Levant brought the derived allele to the region, as Caucasus Hunter-Gatherers carried it, Neolithic Iranians seemingly carried it and Western Neolithic Anatolians did as well. For Neolithic and not Epipaleolithic Levantines to carry it could mean an outside population brought it in, perhaps from somewhere more north or east. But we'll see what future data on Natufians reveals.

Reference List: 

1. The genetic structure of the world's first farmers, Lazaridis et al. 2016

2. The genetic history of Ice Age Europe, Fu et al.

3. Eight thousand years of natural selection in Europe, Mathieson et al.

Tuesday, July 5, 2016

Somali qpAdm models using new ancient genomes

So, I asked David over at Eurogenes to run Somalis as a mixture between South Sudanese people and Natufians in order to see how well the model would fit using a formal statistical method like qpAdm and he got some pretty surprising results overall:


Natufian + Sudanese (south):

Sudanese: 54%
Natufian: 46%

Neolithic Levant + Sudanese (south):

Sudanese: 54%
Neolithic Levant: 46%

Neolithic Levant + Chalcolithic Iran + Sudanese (south):

Sudanese: 55%
Neolithic Levant: 34%
Chalcolithic Iran: 11%


Now, what's going to surprise you is that the third model is the one that fits the best, and by a long shot when compared to the first model. Natufian + Sudanese (south) fits the worst (chisq: 26.256, tail prob: 0.09%, std. errors: 0.009), Neolithic Levant + Sudanese (south) fits much better (chisq: 7.593, tail prob: 47%, std. errors: 0.006) and Neolithic Levant + Chalcolithic Iran + Sudanese (south) fits even better (chisq: 4.975, tail prob: 66%, std. errors: 0.057).
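As a rough cross-check on those tail probabilities, here is a small Python sketch that converts a chi-square value into an upper-tail probability. The degrees of freedom are not given in the results quoted above; the values used here (8 for the two-way models, 7 for the three-way model) are purely my assumption, picked because they come close to the quoted percentages. The function name is hypothetical too.

```python
import math


def chi2_tail(chisq, dof):
    """Upper-tail probability P(X >= chisq) for a chi-square with integer dof."""
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    half = chisq / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum over i < dof/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, dof // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd dof: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_j x^j / (2j+1)!!
    term, total = 1.0, 0.0
    for j in range((dof - 1) // 2):
        if j > 0:
            term *= chisq / (2 * j + 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * chisq / math.pi) * math.exp(-half) * total)


# (model, chisq from the post, ASSUMED degrees of freedom)
MODELS = [
    ("Natufian + Sudanese (south)", 26.256, 8),
    ("Neolithic Levant + Sudanese (south)", 7.593, 8),
    ("Neolithic Levant + Chalcolithic Iran + Sudanese (south)", 4.975, 7),
]

for name, chisq, dof in MODELS:
    print(f"{name}: tail prob = {100 * chi2_tail(chisq, dof):.2f}%")
```

The higher the tail probability, the less the data reject the model, which is why the Natufian model counts as a poor fit and the other two as acceptable ones.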

The last one almost fits as well as a Corded Ware sample being modeled as ~70% Yamnaya & ~30% Esperstedt Middle-Neolithic (chisq: 2.621, std. errors: 0.060) which roughly fits with the data from peer-reviewed studies like Haak et al. 2015:





This oddly reminds me of some models the new Lazaridis et al. pre-print shared where they were asserting that Somalis were a mixture between Mota & a population along the Iran_ChL→Levant_BA cline:




I didn't make much of the above at the time. For one, Mota is a poorer fit for Somalis' African ancestry than the South Sudanese (due to various reasons alluded to here), and it made little sense that Somalis' West Eurasian ancestry corresponded better with Bronze Age Levantines and Copper Age Iranians than Neolithic Levantines, for example. At least in my humble opinion.

I figured Lazaridis & company just didn't try a different model that would probably fit better, but it's odd now that this lines up fairly well with what the above qpAdm models imply, which is that "Neolithic Levantine + Chalcolithic Iranian + Sudanese (south)" fits much better than "Natufian + Sudanese (south)" and somewhat better than "Neolithic Levant + Sudanese (south)".



Future analyses and data will be needed. I should point out that some current ADMIXTURE runs don't seem entirely supportive of such a model, but then ADMIXTURE is not necessarily as precise as qpAdm can be.

For one, qpAdm is preferable because it outright allows you to take a Natufian and then a Neolithic Levantine and see which one you have a greater affinity for (mixture wise) but ADMIXTURE is a lot more messy in that it allows all of these clusters to form among various modern & pre-historic populations and could thus be more prone to producing perhaps more clunky results. Formal statistical methods like qpAdm also seem to be "drift resistant" / resistant to being skewed by recent genetic drift and can thus notice deeper ancestry better than ADMIXTURE to a certain degree.

But we should see what some other analyses say like d-stats and tree-mix. I'm skeptical about the third model in particular (for the time being), despite how well it fits.


Notes: 

1. Link to the full qpAdm results.

Saturday, July 2, 2016

David's Treemix results for Natufians & Neolithic Levantines

I hesitated to make this post as David covered it adequately himself and you can check his post out for all the tree-mixes, but it is worth noting that even tree-mixes show that Natufians and even Neolithic Levantines have some African ancestry:




This is a strange puzzle, quite frankly. Especially given the Lazaridis et al. 2016 Pre-Print's claim below:



"However, no affinity of Natufians to sub-Saharan Africans is evident in our genome-wide analysis, as present-day sub-Saharan Africans do not share more alleles with Natufians than with other ancient Eurasians (Extended Data Table 1). (We could not test for a link to present-day North Africans, who owe most of their ancestry to back-migration from Eurasia)."


In fact, as David's found, formal stats don't imply that Natufians have African admixture the way these tree-mixes, ADMIXTURE & PCAs do:


Chimp Biaka Anatolia_Neolithic Israel_Natufian -0.000422 -1.539 414749
Chimp Biaka Iran_Hotu Israel_Natufian 0.000981 1.199 70803
Chimp Biaka Iran_Neolithic Israel_Natufian -0.000223 -0.566 367632

Chimp Mbuti.DG Anatolia_Neolithic Israel_Natufian -0.000312 -1.113 481333
Chimp Mbuti.DG Iran_Hotu Israel_Natufian 0.000703 0.906 81688
Chimp Mbuti.DG Iran_Neolithic Israel_Natufian -0.000043 -0.104 425175

Chimp Mota Anatolia_Neolithic Israel_Natufian -0.000734 -1.933 481191
Chimp Mota Iran_Hotu Israel_Natufian 0.000686 0.644 81676
Chimp Mota Iran_Neolithic Israel_Natufian -0.000388 -0.768 425056

Chimp Yoruba Anatolia_Neolithic Israel_Natufian -0.000407 -1.407 414749
Chimp Yoruba Iran_Hotu Israel_Natufian 0.000552 0.654 70803
Chimp Yoruba Iran_Neolithic Israel_Natufian 0.000026 0.063 367632


The above shows that Natufians don't have any "shift" toward Africans relative to other pre-historic Eurasians. In other words, Africans don't share more alleles with Natufians than they do with Neolithic Iranians (who don't seem to show such admixture in tree-mixes and such) or Neolithic Anatolians, which is in line with what the Lazaridis Pre-Print was asserting.
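To make that reading of the list explicit, here is a short Python sketch that parses the rows quoted above and flags any that pass a significance cut-off. I am assuming the last three columns are the statistic, its Z-score and the SNP count, and I am using |Z| >= 3 as the cut-off since that is a commonly used convention; neither is stated with the results themselves, and the names in the code are my own.

```python
ROWS = """\
Chimp Biaka Anatolia_Neolithic Israel_Natufian -0.000422 -1.539 414749
Chimp Biaka Iran_Hotu Israel_Natufian 0.000981 1.199 70803
Chimp Biaka Iran_Neolithic Israel_Natufian -0.000223 -0.566 367632
Chimp Mbuti.DG Anatolia_Neolithic Israel_Natufian -0.000312 -1.113 481333
Chimp Mbuti.DG Iran_Hotu Israel_Natufian 0.000703 0.906 81688
Chimp Mbuti.DG Iran_Neolithic Israel_Natufian -0.000043 -0.104 425175
Chimp Mota Anatolia_Neolithic Israel_Natufian -0.000734 -1.933 481191
Chimp Mota Iran_Hotu Israel_Natufian 0.000686 0.644 81676
Chimp Mota Iran_Neolithic Israel_Natufian -0.000388 -0.768 425056
Chimp Yoruba Anatolia_Neolithic Israel_Natufian -0.000407 -1.407 414749
Chimp Yoruba Iran_Hotu Israel_Natufian 0.000552 0.654 70803
Chimp Yoruba Iran_Neolithic Israel_Natufian 0.000026 0.063 367632
"""

Z_THRESHOLD = 3.0  # assumed convention, not taken from the post


def parse_rows(text):
    """Parse rows of: four population labels, statistic, Z-score, SNP count."""
    parsed = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 7:
            continue  # ignore blank or malformed lines
        parsed.append({
            "pops": tuple(parts[:4]),
            "stat": float(parts[4]),
            "z": float(parts[5]),
            "snps": int(parts[6]),
        })
    return parsed


rows = parse_rows(ROWS)
flagged = [r for r in rows if abs(r["z"]) >= Z_THRESHOLD]
largest = max(rows, key=lambda r: abs(r["z"]))

print(f"{len(flagged)} of {len(rows)} rows reach |Z| >= {Z_THRESHOLD}")
print("largest |Z|:", largest["pops"], largest["z"])
```

The row closest to the cut-off is the Mota vs. Neolithic Anatolian comparison, and even that one falls short of it.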

It's quite strange because essentially all analyses other than D/f4 stats show that the Natufians in particular supposedly have African admixture. It's also interesting how the tree-mixes David's released show two migration edges (arrows) going toward the Neolithic Levantines and Natufians:



One tends to look more overtly African and comes in from either Mota's branch or, as it is above, from in-between Biakas and Yorubas whilst the other tends to sit between Mota and all the Eurasians present in the tree. David suggests that the latter is a sign of "Basal Eurasian", and it is interestingly the migration edge that tends to go directly into the Natufians themselves. The other seems more overtly African and tends to go to the root of both Natufians and Neolithic Levantines while being less significant.

We've had Natufian~Neolithic Levantine-like samples show African affinities via tree-mixes in the past such as Stuttgart (an Early European Farmer):



Interestingly, Stuttgart's African admixture in the tree-mix above (also made by David) looks Hadza-like, which is notable because of the similarity between Hadzas and Mota (see here). But there aren't two migration edges, or two distinct elements resembling the African samples present, going into Stuttgart.

Perhaps it's a quirk of Stuttgart's heightened WHG-related admixture in comparison to Neolithic Levantines and Natufians, or perhaps the latter two groups of samples actually do have some African admixture alongside their Basal Eurasian ancestry and the formal stats and such are wrong? Or is all of this, in all the tree-mixes including the new ones, caused by Basal Eurasian, so that we shouldn't make too much of it?

Though, it's worth noting that Stuttgart's PCA position does not imply a pull toward African populations when compared to various West Eurasians and North Africans. There is a slight pull in the case of Neolithic Levantines and a more overt one in the case of Natufians, however (see here). And, whilst I haven't seen the ADMIXTURE results of a Neolithic Levantine sample, unlike that one Natufian from earlier, Stuttgart clearly shows no African admixture via ADMIXTURE runs.

Puzzle indeed.



Notes:

1. I said Stuttgart/Early European Farmers have Natufian-like ancestry because that's what this new Pre-print actually implies with its ADMIXTURE run. The Neolithic Anatolian Farmers and Neolithic/Early European Farmers look like they're a mixture between the blue cluster which dominates Neolithic Levantines and Natufians (Epipaleolithic Levantines) and the red cluster which dominates "WHGs". The farmers who hit up Europe clearly seem more related to those in the Levant than those in Iran if that ADMIXTURE and mostly Pan-West Eurasia PCA are any indication.

2. Stuttgart's Gedmatch kit number: F999916

3. At this point, ADMIXTURE & PCAs imply African admixture in these Natufian samples but the formal stats just aren't picking up on this, which is odd. The formal stats indicate that Africans don't share more alleles with Natufians than with other pre-historic Eurasians, as this new Pre-Print asserts at one point.

Thursday, June 30, 2016

PCA & ADMIXTURE results for Natufians




That's a 3D interactive PCA (Principal Component Analysis) based on autosomal SNPs made by David Wesolowski who authors the Eurogenes genome blog and ancestry project. What's particularly interesting to me about it are the PCA positions of the Natufians and the Neolithic Levantines, with the former group pulling southwards toward African populations such as North, East & West-Central Africans.


Eurogenes ANE K7

ENF: 77%
East African: 8%


Hunter_Gatherer vs. Farmer

Middle Eastern Herder: 64%
Mediterranean Farmer: 30%
East African Pastoralist: 7%


Eurogenes K12b

Southwest Asian: 54%
Mediterranean: 38%
East African: 8%



That pull along with the above ADMIXTURE results (via Gedmatch) of one Natufian seem to contradict what Lazaridis et al. was saying about the Natufians lacking African admixture but I would caution against using modern PCA positions (like those of Bedouins) and, of course, modern ADMIXTURE runs (with modern clusters based on modern genetic diversity) to gauge how "African" or "Eurasian/Out-of-Africa" an ~11,000-14,000 year old population was.

I.e. these Natufians are, of course, not "Southwest Asian" + "Mediterranean"; they're just showing the greatest affinity for these modern clusters. As in, populations probably quite like them, to some degree, contributed to the formation of clusters like Southwest Asian & Mediterranean. But it's still strange that they'd show such an affinity for an African cluster like the East African one.



"However, no affinity of Natufians to sub-Saharan Africans is evident in our genome-wide analysis, as present-day sub-Saharan Africans do not share more alleles with Natufians than with other ancient Eurasians (Extended Data Table 1). (We could not test for a link to present-day North Africans, who owe most of their ancestry to back-migration from Eurasia)."


If anything, it makes what Lazaridis et al. noticed with their analyses, as noted in the above quote, all the more interesting. It could, in my opinion, just be a quirk of their age. Mota, according to the academics who sampled him, didn't have any "Eurasian" admixture based on formal statistics (from what I recall) yet his PCA position (pulling toward Out-of-Africa populations more than the Southern Sudanese or Western Africans) on a global PCA implied otherwise:




Mota's ADMIXTURE results also implied some vague and broadly Out-of-Africa/Eurasian admixture:



Eurogenes ANE K7:

ANE: 3%
ASE: 2%
East Eurasian: 2%
West African: 20.30%
East African: 65.23%
ENF: 7.65%


Hunter_Gatherer vs. Farmer

Baltic Hunter Gatherer: 1%
South American Hunter Gatherer: 1%
South Asian Hunter Gatherer: 2%
East African Pastoralist: 56%
Oceanian Hunter Gatherer: 1%
Pygmy Hunter Gatherer: 25%
Bantu Farmer: 14%


Eurogenes K12b

East African: 54%
West African: 40%
South Asian: 2%
Siberian: 1%
Western European: 1%
East Asian: 1%



And Mota is only a ~4,500 year old Southwestern Ethiopian sample. So, I'd remain rather skeptical about what global PCA positions, where these pre-historics are thrown in alongside modern populations, have to say. And ADMIXTURE runs that were usually (not always) based on modern populations can also produce somewhat dubious results.
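To put a number on how much "Eurasian" each calculator hands Mota, here is a small Python sketch that adds up the non-African components from the three results listed above. Which clusters count as "African" is my own rough assumption (any label containing "African", "Pygmy" or "Bantu"), and the helper name is hypothetical.

```python
# Mota's calculator results as listed in the post (percentages).
MOTA_RESULTS = {
    "Eurogenes ANE K7": {
        "ANE": 3, "ASE": 2, "East Eurasian": 2,
        "West African": 20.30, "East African": 65.23, "ENF": 7.65,
    },
    "Hunter_Gatherer vs. Farmer": {
        "Baltic Hunter Gatherer": 1, "South American Hunter Gatherer": 1,
        "South Asian Hunter Gatherer": 2, "East African Pastoralist": 56,
        "Oceanian Hunter Gatherer": 1, "Pygmy Hunter Gatherer": 25,
        "Bantu Farmer": 14,
    },
    "Eurogenes K12b": {
        "East African": 54, "West African": 40, "South Asian": 2,
        "Siberian": 1, "Western European": 1, "East Asian": 1,
    },
}

# ASSUMPTION: a cluster is treated as African if its label has one of these words.
AFRICAN_KEYWORDS = ("African", "Pygmy", "Bantu")


def non_african_share(components):
    """Sum the percentages of every cluster not labelled as African."""
    total = sum(
        value for label, value in components.items()
        if not any(word in label for word in AFRICAN_KEYWORDS)
    )
    return round(total, 2)


for calculator, components in MOTA_RESULTS.items():
    print(f"{calculator}: {non_african_share(components)}% non-African")
```

The share differs quite a bit between calculators, and in each case it is spread over several unrelated clusters, which is part of why I'd read it as a vague affinity and not as real admixture.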

That they lack African admixture (discounting whatever the source of Basal Eurasian will one day turn out to be) may still very well be the case if they truly do not share more alleles with African populations than other pre-historic Out-of-Africa samples do (such being Lazaridis et al. 2016's finding).





On another note, David's placement of them in his mostly Pan-West Eurasia PCA is rather intriguing as well. In this case, they do not cluster with any modern West Eurasian populations and, like NW Neolithic Anatolians, Early European Farmers and Sardinians, they break off from other West Eurasians as they seemingly lack the eastern-pulling (it's "northern" in this PCA, I suppose) affinities (i.e. ANE-related admixture) which somewhat pull all the other populations away from them.

The Early European Farmers and Neolithic Anatolians pull much more west toward WHG/Villabruna-type peoples while the Natufians cluster just south of Negevite Bedouins which implies that they have the least "WHG" related affinities when compared to the Neolithic Levantine, Anatolian and European samples. Which is, roughly, what Lazaridis et al.'s ADMIXTURE run implied:



It really does, as I pointed out earlier, look like these Natufians (and their Neolithic Levantine counterparts) might, for the most part, be the modern Southwest Asian cluster or David's old ENF/Near Eastern cluster (or Lazaridis et al. 2013-2014's Near Eastern component) in the flesh. The Neolithic Levantines, in the Pan-West Eurasia PCA, just look like somewhat WHG-shifted versions of the Natufians. Perhaps this shift is why they are supposedly ~15% less Basal Eurasian than their Natufian predecessors:




At any rate, I'll leave it at that for now... I'll be interested in seeing what analyses from third parties like David such as d-stats turn up on these Natufians (haven't really sifted through those yet).



Notes:

1. Gedmatch kit number for the Natufian: M041601, and the Gedmatch kit number for Mota: M261275

2. I'm open to these Natufians having African ancestry. Just pointing out that PCA positions and the current ADMIXTURE runs we have might not be the way to go. Nevertheless, it is interesting that these Natufians and not the later (~7,500 year old) Neolithic European samples like Stuttgart (F999916) show such African affinities in ADMIXTURE runs. Might just be because of the very heightened WHG-shift in a Neolithic European Farmer like Stuttgart and perhaps the sample's younger age by ~4,000-7,000 years.

3. Mota's Eurasian admixture in those calculators also, unlike the Natufians' African admixture, looks more broad (I.e. ANE + East Eurasian + ENF etc.) and looks more like an affinity. The Natufians, on the other hand, seem to mostly lean toward the East African cluster which looks less like a broadly African affinity and more like actual admixture, quite frankly.

4. Most of the Natufians do clearly have a bit of a western-shift though. They don't cluster exactly where David's "ENF" cluster theoretically would (only one of them does).