Testing how Vahaduo differentiates Admixed Nilo-Saharan from Cushitic

I often see Cushitic proxies used in models, particularly when looking at the values of Sudanese (usually Nubian) samples. Alongside these, a Nilotic proxy is typically used, as well as a directly West Eurasian proxy. However, I have always questioned Vahaduo's ability to differentiate between these sources, and wondered whether it simply lumps values together based on their principal components. I believe this isn't discussed enough, so I have conducted my own small investigation into how effective this differentiation is, and into how to interpret Vahaduo output using the correct proxies and sources, especially for populations that carry numerous, divergent components in their DNA that are usually over-simplified. I should add that I am an amateur and am open to any criticism. I also welcome anyone to build on what I have started and to suggest more accurate methods or ideas, so that we can reach a more nuanced understanding of the Vahaduo tool and how to improve the accuracy of its results.

First, I will be using two East African samples that carry Ancestral East African ancestry as well as a significant West Eurasian component, and comparing their effectiveness as a proxy for Cushitic. The two samples are "Daju_Nyala" and "Kenyan_Early_Pastoral_N".

The former is sourced from the Daju of Darfur, a Nilo-Saharan people with some Eurasian admixture, the source of which I admittedly do not currently understand. The Daju are among Sudan's western tribes that did not undergo the Arabisation that most of the country did. They are concentrated in the South Darfur region, quite isolated from the regions through which Eurasian admixture was introduced to Sudan and the Horn. However, they live in close proximity to some Sudanese-Arab tribes that do carry significant Eurasian admixture, so it is possible that some of their admixture came from the Arab tribes of South Darfur. My research on the Daju is very brief, and I do not wish to spend too much time on the sources of this admixture. Given the very nomadic lifestyle of the historic Daju, its presence is not a surprise to me. I have also come across sources stating that the Daju were assimilated within the Kingdom of Kush at some point in history, before dispersing west into the Marra mountains of Darfur and then, more recently, migrating south to what is now the city of Nyala. What we can be reasonably confident about is that the West Eurasian admixture in the Daju is closely associated with, or possibly directly sourced from, the West Eurasian admixture present in most of Sudan's other tribes.

Our second sample is the Kenyan pastoral sample, which shows a much more typical Cushitic model: an even split of SSA Proto-Nilotic and Levant-originated Eurasian ancestry. For obvious reasons, the Kenyan sample is the actual proxy for Cushitic. The Daju is simply a comparison to see how it affects the data, so in this sense it is a makeshift Cushitic proxy.

The images below show the admixture present in both of these samples. I have used Levant_ISRC for Eurasian admixture due to its close association with Medieval Nubian Eurasian admixture, as well as with the admixture present in Horners. If anyone is critical of this specific proxy, I am open to suggestions for an alternative, as an accurate Levantine sample matters for this investigation. (But forget about the Daju here: I am not trying to use it as a legitimate Cushitic proxy. It serves as nothing more than a point of comparison to see what happens to certain values.) I used the good old Dinka samples as a proxy for Proto-Nilotic, making sure to include several samples in the source to reduce the minor but notable error that their West African admixture could introduce.
1675270227380.png

1675270234621.png

Now I would like to conduct the exact same run on the Kulubnarti R and S averages to see where they fall in comparison to both samples.
1675270293443.png

Just from a quick glance at the values, we can see that the Kulubnarti averages for both cemeteries from which they were sourced are modeled much more similarly to the Kenyan Pastoral sample.
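One way to put a number on "modeled more similarly", rather than relying on a glance, is the Euclidean distance between coordinate rows. The sketch below is only an illustration: it assumes the comma-separated "label,coord1,coord2,..." row layout, and the labels (Target_X, Proxy_A, Proxy_B) and the three-column values are invented placeholders, not real sample coordinates.

```python
import math

# Placeholder rows in a "label,coord1,coord2,..." layout.
# The numbers are invented and shortened to 3 columns for readability.
ROWS = """\
Target_X,0.10,0.20,0.05
Proxy_A,0.12,0.18,0.05
Proxy_B,0.30,0.05,0.01
"""


def parse_rows(text):
    """Map each label to its list of float coordinates."""
    samples = {}
    for line in text.strip().splitlines():
        label, *values = line.split(",")
        samples[label] = [float(v) for v in values]
    return samples


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


samples = parse_rows(ROWS)
target = samples["Target_X"]

# Distance from the target to every other row.
distances = {
    name: euclidean(target, coords)
    for name, coords in samples.items()
    if name != "Target_X"
}
closest = min(distances, key=distances.get)
print(closest, round(distances[closest], 4))
```

With real rows pasted in place of the placeholders, the smaller distance would indicate which proxy sits closer to the target overall, which is a different question from how much of the target a model assigns to it.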

Now, to observe how Vahaduo differentiates between an Admixed Nilo-Saharan and a Cushitic source, I use each sample in turn as the proxy for "Cushitic" and test it against the Kulubnarti samples. I also include the same Dinka and Levant samples (plus a West African proxy, because why not) to see whether any Eurasian or Nilotic admixture unrelated to a Cushitic component is picked up. However, I have previously seen several errors in this kind of setup, such as a 0% Egyptian-associated Eurasian component in Nubian samples when a Levantine sample is already in the source to pick up Eurasian admixture, so we must be cautious and skeptical about the values produced by these side samples. Here are the results from the Daju Cushitic proxy and the Kenyan Pastoral Cushitic proxy.
1675271026079.png

Zeros were not printed for this run, and we can see that no Dinka appears when Daju is used as the Cushitic proxy. However, we are able to identify a separate Eurasian component unrelated to the Daju Eurasian admixture. These results are to be taken with a grain of salt and should only be used for understanding Vahaduo as a tool and for identifying methods.
Now for the Kenyan pastoral results.
1675271263347.png

The first difference to note is the severe drop in the percentage identified as "Cushitic". I will discuss this in more detail later, so I will move on to the second difference. We can now see a Dinka component, drawn from two different Dinka samples that each have slightly different West African admixture. This separation of Nilotic-related ancestry from Cushitic differs from the Daju run, where all Nilotic-related ancestry was forced into the Daju (makeshift Cushitic) component. Here, differences between Nilotic components have been acknowledged, and we have two sources of differing Nilotic ancestry (possibly three, judging by the two separate Dinka components): Cushitic-related and non-Cushitic. Interestingly, Eurasian admixture was split into two sources of its own in both tests, again Cushitic-related and non-Cushitic. What's more, the amount of Eurasian admixture unrelated to the Cushitic component is broadly consistent across both tests: we see a total difference of 1.6 and an average difference of 0.8 in this component between them.

Now I want to focus on the drop in Cushitic, and I would seriously appreciate input from anyone knowledgeable enough to help with this specifically. From my observations, the Kenyan sample is modeled much closer to the Kulubnarti samples than the Daju is, with a high Eurasian component and a roughly similar amount of Proto-Nilotic. The Daju sample, on the other hand, sits much further from the Kulubnarti samples, with a much higher Proto-Nilotic component (≈70%). To my knowledge, the ancestry breakdown, simplified, works by allocating different components of the target sample to the source samples based on how close they are to each source. The fact that the Daju proxy produced a much higher Cushitic value than the Kenyan proxy contradicts what I expected, and is probably caused by several factors that we can suggest but not firmly establish. I expected a higher value from the Kenyan pastoral sample: because it is modeled much more similarly to Kulubnarti, it should logically line up with Kulubnarti's ancestry and so take a larger share than a more distant proxy such as the Daju. However, the opposite happened. The Cushitic value decreased, and alongside that decrease came an increase in non-Cushitic Nilotic. This split in the Nilotic component seems to be what drove the drop in the Cushitic value. But why was the previously Cushitic-related Nilotic cut out and assigned as Isolate Nilotic in the second test? As I said earlier, I do not know, and I reiterate that I would love some feedback on this.
The continuity of the Isolate Eurasian component suggests that the source of this admixture is almost totally absent from both the Daju and the Kenyan pastoral samples, since the Isolate is almost homogeneous in abundance across both tests. Because switching proxies had little to no impact on the Isolate Eurasian component, this could also suggest partial homogeneity between the two samples in their own Eurasian component.
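One factor that may be worth checking for the drop in the Cushitic share: when a proxy is itself roughly a blend of other sources in the same run, very different splits can fit the target about equally well, so the share given to that proxy is poorly constrained. The sketch below is a toy illustration of that idea only. It is not Vahaduo's actual algorithm, and the 2-D coordinates (DINKA, LEVANT, KENYAN_LIKE, TARGET) are invented stand-ins, not real sample data.

```python
# Toy illustration only: 2-D coordinates are invented, NOT real data.
# Axis 0 stands in for "Nilotic-like", axis 1 for "Eurasian-like".
DINKA = (1.0, 0.0)
LEVANT = (0.0, 1.0)
KENYAN_LIKE = (0.5, 0.5)  # assumption: even Nilotic/Eurasian split
TARGET = (0.45, 0.55)     # assumption: a Kulubnarti-like target


def mix(weights, sources):
    """Return the weighted average of the source coordinates."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    dims = range(len(sources[0]))
    return tuple(sum(w * s[d] for w, s in zip(weights, sources)) for d in dims)


def distance(a, b):
    """Euclidean distance between two coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


sources = (KENYAN_LIKE, DINKA, LEVANT)
model_a = (0.9, 0.0, 0.1)  # almost everything assigned to the proxy
model_b = (0.3, 0.3, 0.4)  # most ancestry assigned to Dinka and Levant

fit_a = distance(mix(model_a, sources), TARGET)
fit_b = distance(mix(model_b, sources), TARGET)
print(f"model_a fit: {fit_a:.6f}, model_b fit: {fit_b:.6f}")
```

Both models reproduce the toy target essentially exactly, even though the proxy share is 90% in one and 30% in the other. With real coordinates the tie would be broken by small differences between the proxy and the other sources, which might be part of why the "Cushitic" share moves so much between runs. The same ambiguity would apply to a Daju-like proxy in this toy setup, so it does not by itself explain why Dinka only appears in the Kenyan run.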
 
To finish, Vahaduo seems quite able to separate Cushitic-associated Nilotic from Isolate Nilotic, but how effectively? That is yet to be determined. I have opened this thread to discuss methods and ideas around this, to eventually promote more accurate methods in Vahaduo, and to improve my own and other people's understanding of the tool, hopefully minimizing the errors we make when running our own little investigations. If anyone has samples worth mentioning, feel free to post them here for other people to use.

The samples used are attached at the following link; download the file.
 
