To understand the origin and history of the Iraqi population of Eastern U.P. accurately and in detail, we need to continue our efforts in the form of a common Ancestry Project. This includes obtaining the information outlined below and characterizing the most recent common ancestors (and their ages, TMRCAs) in a way that is consistent with ethnological data and the historical migration timeline. In practice, it may mean determining when, 20-30 generations ago (more than 500 years ago), our ancestors lived in India following the migration reported in history (see Section A). This and the items below are parts of the Ancestry Project. The actual success of the project will depend on members of the Society taking an interest and ordering voluntary DNA tests. Participation is therefore needed from across the whole Society to achieve the aims of the proposed Project.

Finding the gene pool of the extant Iraqi population of Eastern U.P.: identification of Y-haplogroups and subclades and their allele frequencies; haplogroup comparison with other populations, such as neighbouring upper caste Hindu and historical Muslim populations within U.P.; maternal ancestry and composition of mitochondrial haplogroups; gene admixture in the extant Iraqi population, i.e. identification of gene flow from Hindu castes and other ethnic Muslim populations found in U.P., India; Y-chromosome and autosomal gene comparison with West Asian and, more importantly, Near Eastern (Iraqi) populations or subpopulations; similarity to DNA from ancient cultures. STR data on living Iraqi individuals: phylogenetic structure; relationship with world populations. The proposed Iraqi Indian Ancestry Project therefore has the following aims.


(1)   Link between living Iraqi families and Sunni Converts under the Genealogical tree of Syed Masud Al Hussaini (migration/post-migration period: 1330-1366 AD)

(2)   Genetics of Extant Iraqi Population: Characterization of Y Haplogroups and autosomal SNPs

(3)   Y-STR Data on individuals from Iraqi population

(4)   Maternal Ancestry composition of Extant Iraqi Population

RESULTS: DNA Genealogy Data, Genealogical Tree Data

TIMELINE:  Milestone achievements and project completion within a reasonable time (2-3 years)

PUBLICATIONS: Genetic composition, affinity and phylogenetic tree and genome similarity/sharing within community of known population groups of U.P. and worldwide.  



Aim 1: Link between living Iraqi families and Sunni Converts under the Genealogical Tree of Syed Masud Al Hussaini (Sunni converts born circa 1517 AD)

The objectives under this aim are to establish the history of the early honorific Sunni Converts (born 1517 AD), as ancestors of the Iraqis of Eastern U.P., with respect to biography and civil records in Ghazipur and, more importantly, their genealogical links both to the ascending Shia Syeds and to the deep descending Iraqi families.

The background information from the genealogical tree of Syed Masud will be used because a few of his descendants converted to Sunni Islam in the early 1500s AD. Those Syed converts, together with other Arab and West Asian noblemen, arguably formed the foundation of the surviving Iraqis of Eastern U.P. (1-3). The early Sunni Converts in the genealogical tree of Syed Masud are shown in Figure 1, from which a link may be traced to or from the living Iraqi families in India and Pakistan.

It can be seen that the Syed population in Nonehra has been predominantly descended from his son Syed Qutubuddin from about 1330 AD to date (Figure 1). The present population of Nonehra is known to include both Shia Syeds and Iraqi Muslims. In Nonehra, the descendants of Syed Masud corresponding to the 6th and 7th generations are, respectively, Syed Saloni and his sons Syed Piyare and Syed Ladle. It is also noted that all descendants of Syed Ladle settled in Nonehra town (Figure 1). It is probable that Syed Ladle and his children lived in the early to middle 1500s AD. As shown by the arrow, this is the period given in history in which the earliest descendants of Syed Masud Al Hussaini converted to Sunni Islam; some names similar to those of Sunni descendants are shown in red in Figure 1 (3). To verify this honorific ancestor, the names of downstream descendants need to be investigated.

It is reported that Syed Nooruddin, son of Syed Masud Al Hussaini, settled in Gangauli town, Ghazipur. Although the names of his descendants are not yet available, at least one or two Sunni Convert families are expected (see Figure 1). It is worth adding that Gangauli is portrayed as an imaginary miniature India in the novel “Aadha Gaon”, which gives a harrowing account of the partition of India in 1947; among its main characters is an Iraqi (“Raqi”) trader family. The novel is a famous work of the renowned writer Rahi Masoom Raza, who was born in 1927 into a Shia Syed (Zamindar) family descended from Syed Nooruddin/Syed Masud in Gangauli.

Genetic drift is common after population bottlenecks, which are events that drastically decrease the size of a population. Such random genetic drift can result in the loss of a rare lineage (allele), here the Sunni Convert/Iraqi Syed lineage as opposed to the dominant Shia Syed lineage of Ghazipur. If this occurred, it would have reduced the number of Iraqi Syeds even within the Iraqi population of Eastern U.P. The reason would be that the Sunni descendants of Syed Masud Al Hussaini were rare within the Iraqi Biradri from its origin, so the propensity to lose the Sunni Convert lineage was greater than for the lineages of other founders, such as IE-West Asian/Persian or other Arab lineages. In view of random genetic drift, the growth of the Iraqi Biradri over the last 500 years can be imagined as follows. Firstly, a loss of genetic diversity, or a reduction of the rare Sunni Convert (Syed) lineage relative to other lineages, would not be surprising. On the one hand, this change makes the Iraqi population vastly distinct from the overlapping Shia descendant population of Syed Masud. On the other hand, it is unclear whether there is any real probability of finding deep Sunni descendants of Syed Masud in the living Iraqi population and, if so, how many descendants of the earliest Sunni ancestor, the honorific Syed Abu Bakr, survive within the population after 500-700 years of growth.
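The drift argument above can be illustrated with a minimal simulation. The sketch below assumes a simple neutral Wright-Fisher model of paternal lineages with a constant number of males per generation; the population size, starting counts and number of generations are invented for illustration and are not estimates for the Iraqi Biradri.

```python
import random


def lineage_survives(start_copies, pop_size, generations, rng):
    """Run one neutral Wright-Fisher simulation of a paternal lineage.

    Returns True if at least one male still carries the lineage after
    the given number of generations.
    """
    copies = start_copies
    for _ in range(generations):
        if copies == 0 or copies == pop_size:
            break  # lineage lost or fixed; nothing can change further
        freq = copies / pop_size
        # Each male of the next generation draws his father at random.
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
    return copies > 0


def survival_probability(start_copies, pop_size, generations, runs=500, seed=1):
    """Fraction of simulated runs in which the lineage is still present."""
    rng = random.Random(seed)
    survived = sum(
        lineage_survives(start_copies, pop_size, generations, rng)
        for _ in range(runs)
    )
    return survived / runs


if __name__ == "__main__":
    # Hypothetical numbers: 100 founding males, 25 generations (~500-700 years).
    rare = survival_probability(start_copies=2, pop_size=100, generations=25)
    common = survival_probability(start_copies=30, pop_size=100, generations=25)
    print(f"rare lineage still present in {rare:.0%} of runs")
    print(f"common lineage still present in {common:.0%} of runs")
```

Under these assumptions a lineage that starts with only a few carriers is lost in most runs, while a common lineage usually persists. A real population grows and is structured, so the numbers only show the direction of the effect.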

It may be possible to find Sunni descendants in the extant Iraqi population. Many Iraqi families are known to have deep roots in Nonehra, Gangauli and other places, but how many 18th-generation Iraqi families with ascending links to the honorific Converted Sunnis exist? Without written records, it becomes very hard to create a genealogical list. To the knowledge of most of us, the answer is not known, and it is our objective to find those individuals and families under this project. It remains to be seen whether such families exist in India or in Karachi, Pakistan (4). One approach is to seek feedback from families within the Iraqi Biradri whose history is associated with the titles Iraqi Syed, Hashmi, Quraishi, Shaikh or Iraqi. This feedback may help us form a partial or complete link between Iraqis, as descendants of Syed Al Hussaini, and his genealogical tree (see Figure 1 below, to be investigated under this section).

Alternatively, the small number of Sunni Converts and their later descendants, the Iraqi Syeds, may have been lost completely through random genetic drift in the relatively small Iraqi/Indian population over the many generations of the last 500-700 years. Finally, because of its small effective size, the Iraqi population of Eastern U.P. has been subject, for some time, to inbreeding and consequently has less paternal (and maternal) haplogroup diversity than the initial Iraqi population in India in 1330-1500 AD. This trend and its effects on our Society must be understood clearly, the sooner the better. It has consequences not only for the Society's growth rate but also for susceptibility to recessive autosomal hereditary diseases, i.e. a higher frequency of newborns with inherited disorders in the extant Iraqi Biradri in India.


Figure 1:  The genealogical tree of Syed Masud Al-Hussaini.  The names of the descendants of Syed Masud in India from 1330 AD to the early 1500s AD in the tree are compiled from data published earlier (1-3).

Note: Progress on this goal is challenging and will certainly constitute a milestone under this project. For clarity, genealogical trees of the postulated Arab ancestor from Larestan, or of the non-Arab West Asian Iraqi ancestors corresponding to the lines Y40/M560 and L657/Y6+, from their origin around 1500 AD to date are not included under this project.


Aim 2: Genetics of Iraqi Population of Eastern U.P. India

In Indian population history, four separate waves of migration into the subcontinent took place: (i) an ancient Palaeolithic migration by modern humans, (ii) an early Neolithic migration, probably of Proto-Dravidian speakers from the eastern horn of the Fertile Crescent, (iii) an influx of Indo-European speakers, and (iv) a migration from East and Southeast Asia, i.e. of Tibeto-Burman speakers. In addition, migrations of medieval Muslim preachers, invaders/warriors and travellers to India, and later of Europeans in the 18th century, have also contributed to the ethnic multiplicity. Furthermore, it has been reported that the Y-lineages of Indian castes are more closely related to those of Central Asians/East Europeans than to those of Indian tribal/native populations, suggesting that the castes are primarily the descendants of Indo-European migrants. Adequate genetic studies have not yet been conducted on ethnic Muslim populations in India, including the Iraqi Muslims of Eastern U.P.

Y-chromosome uniparental binary markers trace human evolution over the past several thousand years in the form of a DNA family tree. Thus, DNA genealogy uses single nucleotide polymorphisms (SNPs) found in a population by comparing the non-recombining region of the Y chromosome (NRY), and likewise mitochondrial DNA and Y-chromosome STR markers. These DNA features are free from the randomness that arises from not knowing which of the two autosomal chromosomes of each pair was inherited by the offspring, and from the recombination process during the meiosis that produces paternal sperm and maternal egg cells. A specific combination of stable, heritable SNPs is known as a haplotype. Using Y-chromosome haplotypes, the ancestry of any male can be traced through his paternal lineage. A group of people who share similar haplotypes is known as a haplogroup (5).

Since a mutation at a single base is very rare compared to changes in STRs (short tandem repeats), Y-DNA SNP tests are able to trace both ancient anthropological migrations and more recent prehistoric/historic movements. A Y-DNA SNP test also identifies the haplogroup, which may represent deep ancestral origins (tens of thousands of years before present, ybp). The evolution of modern humans from very early times, through different prehistoric periods, to the contemporary period has been studied by identifying genetic families, or haplogroups, each of which originated at a specific time (in ybp) and geographic location. The haplogroups, denoted by letters, make up the genealogical family tree of modern humans (Homo sapiens), with the letter A at the root. Haplogroup A arose in Africa, 60,000 ybp, in the earliest humans and, through evolution, gave rise to later branches denoted by the following letters, each thousands of years before present (Figure 2, Top Panel). When a defining SNP of a haplogroup is detected in an individual, it identifies the particular haplogroup, from which the origin, the age of the distant ancestor (TMRCA) and the geographic spread over thousands of years from the origin can be learned. A large cohort of Y-chromosome haplogroup data, in conjunction with autosomal SNP data from a population group or family, may be used to identify DNA relatives, i.e. very old common ancestors and their TMRCAs. Similar databases are used by ancestry companies to find DNA relatives.

Y-chromosome data from different population groups found in U.P., India are presented below as a relevant example for background information (6).

          Iraqis of Eastern U.P., preliminary study, n=5: 5 R1a

Figure 2:  Genetic families of modern humans in the form of a tree, and haplogroup data from a genetic study of different population groups from U.P. (India).  The data are reproduced from work described in detail elsewhere (6). Iraqi data are taken from Table 2.

The genetic study alluded to above was carried out on distinct population groups living in U.P., India, namely upper caste Hindu Brahmins, Awasthis, Chaturvedis, Syeds (Shia) and non-Syed Muslims with recent ancestry outside India (i.e. Sunnis) (6). A total of 560 individuals from these five groups were examined for Y-chromosome binary markers, i.e. 32 Y-chromosome unique event polymorphisms (UEPs) comprising 27 single nucleotide polymorphisms (SNPs), four insertions/deletions and one Alu repeat (6). These mutations have been found to be stable, and the variants are known to segregate into distinct haplotypes (6). Figure 2 shows the human family tree and its haplogroup branches, together with the defining mutations under each branch (Top Panel). Detection of SNP/indel mutations in the 560 Y chromosomes from the five groups yielded several haplotypes, which were assigned to 13 haplogroups as described in detail elsewhere (6). The Y-haplogroups found in the five population groups are summarized in Figure 2 (Lower Panel).

The haplogroups observed in the five populations were J2, R1a, R2, H, F, C, K, O and P; all nine were found in each population group, albeit in a unique composition. Although J2, R1a and R2 were the dominant haplogroups, the others (H, F, C, K, O and P) were present at small percentages in all populations. Haplogroups G, L, R1b and E1b1b1 were not observed in all populations but were present in some (6).

For example, haplogroup J2 was highest in the Shia Muslim group, which also carried haplogroup E1b1b1; the latter appears to be Shia-specific in this study, as the other groups lack it. Haplogroup J2 is observed in the Middle East and, in India, is linked to the early Neolithic migration and to some extent to the historical migration of Shia Muslim groups. More importantly, E1b1b1 is the most frequently observed haplogroup in the Africa/Mid-East region. This observation therefore suggests that Shia Muslims might carry some North African and Middle Eastern ancestral alleles that were brought to India during Muslim rule.

Preliminary results on Iraqis of Eastern U.P. are shown in Figure 2 (23andMe and YSEQ, Table 2). However, haplogroup J (M267 and M172 branches) has not yet been detected in the extant Iraqi population of Eastern U.P. If it is found in the actual study, it may provide a further link to surviving descendant families of Syed Masud Al Hussaini (circa 1330-1360) or to descendants of the Arab Sunni converts who lived from circa 1500 AD (see Section A). It is interesting to note that all four individuals tested belonged to haplogroup R1a1a. As the R1a carriers clearly belonged to distinct clades, i.e. M560+ and L657/Y6+ (under Asian R1a-Z93-Z95), these individuals also belong to two separate family trees, descending from two distinct common ancestors within one Biradri.

More Haplogroup Background information:

Haplogroup E

  • Haplogroup: E, a subgroup of D/E

  • Age: 30,000 years

  • Region: Africa, Europe, Near East

  • Example Populations: Bantu-speakers, African Americans, Berbers

  • It has two known branches, E-V68 and E-Z827, which contain by far the majority of all modern E-M215 men. E-V68 and E-V257 have been found in the highest numbers in North Africa and the Horn of Africa, but also in lower numbers in parts of the Middle East and Europe, and in isolated populations of Central Asia.

Haplogroup J

Paternal haplogroups are families of Y chromosomes that all trace back to a single mutation at a specific place and time. By looking at the geographic distribution of these related lineages, we learn how our ancient male ancestors migrated throughout the world.

  • Haplogroup: J, a subgroup of F

  • Age: 20,000 years

  • Region: Southern Europe, Near East, Northern Africa

  • Example Populations: Bedouins, Ashkenazi Jews, Greeks

  • Highlight: Haplogroup J was carried out of the Near East by Muslims and Jews during the first millennium AD.

Haplogroup J (P209) Branches

  1. J1 (M267)     2. J2  (M172)


J   12f2.1, L134/PF4539, M304/Page16/PF4609, P209/PF4584, S6/L60, S34, S35

•    J*   -

•    J1   L255, L321/PF4646, M267/PF4782

•   • J1*   -

•   • J1a   CTS5368/Z2215

•   • •    J1a*   -

•   • •    J1a1   M365.1

•   • •    J1a2   L136

•   • •   • J1a2*   -

•   • •   • J1a2a   P56

•   • •   • J1a2b   P58/Page8/PF4698

Haplogroup J1, also known as J-M267, is a subclade of Y-DNA haplogroup J-P209 (Haplogroup J). Haplogroup J1 separated from haplogroup J approximately 31,500 years ago (YFull, 2015). Today, haplogroup J1 is mostly seen in the Caucasus, Mesopotamia, the Levant and the Arabian Peninsula, but it is also present at moderate or low frequencies in Turkey, Azerbaijan, Iran, Europe, Central Asia and the Indian subcontinent.

Distribution: Haplogroup J-M267

Haplogroup J-M267, defined by the M267 SNP, is in modern times most frequent in the Arabian Peninsula: Yemen (up to 76%), Saudi Arabia (up to 64%) (Alshamali 2009), Qatar (58%), and Dagestan (up to 56%). J-M267 is generally frequent among Arab Bedouins (62%), Ashkenazi Jews (20%) (Semino 2004), and in Algeria (up to 35%) (Semino 2004), Iraq (28%) (Semino 2004), Tunisia (up to 31%), Syria (up to 30%), Egypt (up to 20%) (Luis 2004), and the Sinai Peninsula. To some extent, the frequency of haplogroup J-M267 collapses at the borders between Arabic/Semitic-speaking territories and mainly non-Arabic/Semitic-speaking territories, such as Turkey (9%), Iran (5%), Sunni Indian Muslims (2.3%) and Northern Indian Shia (11%) (Eaaswarkhanth 2009). However, it should be noted that some of the figures above tend to be the larger ones obtained in some studies, while the smaller figures obtained in other studies are omitted. J-M267 is also highly frequent among Jews, especially the Kohanim line (46%) (Hammer 2009). ISOGG states that J-M267 originated in the Middle East. It is found in parts of the Near East, Anatolia and North Africa, with a much sparser distribution on the southern Mediterranean flank of Europe and in Ethiopia. Not all studies agree on the point of origin, however. The Levant has been proposed, but a 2010 study concluded that the haplogroup had a more northern origin, possibly Anatolia.

The J-P58 subclade likely originated in the more northerly populations and then spread southward into the Arabian Peninsula. The high Y-STR variance of J-P58 in ethnic groups in Turkey, as well as in northern regions of Syria and Iraq, supports the inference of an origin of J-P58 in nearby eastern Anatolia. Moreover, the network analysis of J-P58 haplotypes shows that some of the populations with low diversity, such as Bedouins from Israel, Qatar, Sudan and the United Arab Emirates, are tightly clustered near high-frequency haplotypes. This suggests founder effects with star-burst expansion into the Arabian Desert (Chiaroni 2010).

Haplogroup: J2,

  • a subgroup of J, defining mutation M172 (Age: 18,000 years before present)

  • Region: Southern Europe, Near East, Northern Africa

  • Example Populations: Ashkenazi Jews, Sephardic Jews, Lebanese

  • Highlight: Haplogroup J2 is found in nearly one-quarter of Sephardic Jewish men.


R1a Haplogroup: 

The Y-chromosome haplogroup R is divided into two main components: R1-M173 and R2-M479. R1 is in turn divided into two major groups: R1a-M420 and R1b-M343. R1a is more common in Eastern Europe, West Asia and Central/South Asia, while R1b is prevalent in Western Europe. Previous studies have suggested that this division reflects the expansion of populations after the last ice age, particularly during the Neolithic period. Over 10% of men living in a region extending from South Asia to Scandinavia share a common ancestor belonging to haplogroup R1a-M420 (5).

One sub-clade of R1a (haplogroup R1a1) is much more common than the others in all major geographical regions. R1a1, defined by the single-nucleotide polymorphism (SNP) mutation M17, (and sometimes alternatively defined as R-M198), is particularly common in a large region extending from South Asia and southern Siberia to Central Europe and Scandinavia (5). The R1a family is defined most broadly by the SNP mutation M420, which was discovered after M17. The discovery of M420 resulted in a reorganization of the lineages, in particular, establishing a new paragroup (designated R-M420*) and R-M420 clade as R1a that leads to R-SRY10831.2 branch as R1a1; next branch is R-M17/R-M198 as R1a1a  (5).

R1a and R2 were the most frequently observed haplogroups in the Sunni and upper caste Hindu groups. R1a, or haplogroup R-M420, is found mostly in North India compared with other regions; it most likely has a West Asian origin and is linked to the migration of Indo-European speakers from West Asia to South Asia 4-5 K ybp (6).

Origin and Distribution of R1a and R1a1 (from paper by Underhill et al 2014)

This study is based on the analysis of genetic tests on 16,244 men from 126 Eurasian populations (13 of these samples, belonging to haplogroup R, were completely sequenced over 9.99 million base pairs of the Y chromosome). Of all the samples, 2923 individuals belong to haplogroup R1a-M420. These were then tested for the following SNPs: SRY10831.2, M17/M198, M417/Page7, Z282/Z280, Z284, M458, M558/CTS3607, Z93/Z94, Z95, Z2125, M434, M560, M780, M582, M746, M204 and L657 (see Figure 1, ref 7). The geographic frequency of R1a was calculated for all samples. Of the 2923 samples, 2893 belong to the sub-clade R1a-M417/Page7, of which 1693 are European and 1200 are Asian. The rare lineages are 24 R1a-M420*(xSRY10831.2), 6 R1a1-SRY10831.2*(xM17/M198) and 12 R1a-M417/Page7*(xZ282, Z93). Among the 1693 European R1a-M417 samples, over 96% belong to the sub-clade R1a-Z282.

Among the 490 South Asian R1a-M417 samples, more than 98.4% belong to the sub-clade R1a-Z93. Both subclades, R1a-Z282 and R1a-Z93, occur among the 560 samples from the Near East, the Middle East and the Caucasus (7). The sub-clade R1a-Z282* is located mainly in northern Ukraine, Belarus and Russia. The sub-clade R1a-Z284 is confined to Scandinavia. R1a-M458 and R1a-M558 have similar distributions, with their highest frequencies in Central and Eastern Europe. However, R1a-M558 is also present in the Volga-Urals region, unlike R1a-M458 (7).

The sub-clade R1a-Z93* is most common in southern Siberia, in the Altai region. The sub-clade R1a-Z2125 is found in Kyrgyzstan, among the Pashtuns of Afghanistan, and in populations of the Caucasus and Iran. The sub-clade R1a-M780 (or L657) is located in South Asia: India, Pakistan, Afghanistan and the Himalayas. This group is also found in Iran and among Roma in Croatia and Hungary. Finally, the rare sub-clade R1a-M560 is found in two Burushaski-speaking individuals in northern Pakistan, one Hazara individual from Afghanistan and one Iranian Azeri individual (7).

About 1335 R1a samples were then typed for 10-19 STR markers. Networks were built for R1a-Z282 and R1a-Z93. However, little substructure was identified in these networks. The lowest diversity appears in the sub-clades R1a-Z93*, R1a-M582 among Jews, and R1a-M780 among Roma. These results are consistent with founder effects in these groups. A principal components analysis separated the European and Asian groups on the first component, and the Jewish groups (especially Ashkenazi) on the second component (7).

To look for a link between ancient cultures in Europe and the origin of haplogroup R1a, note that the oldest R1a sample in Europe comes from individuals of the Corded Ware culture dated to 2600 BC. Earlier samples, from the early Neolithic, are F* and G2a. These findings suggest a rapid spread of haplogroup R1a-Z282 in Europe in the Chalcolithic/Copper Age or Early Bronze Age, from the River Volga to the Rhine. The equivalent ancient culture for R1a in the Middle East and South Asia is more obscure. However, samples from the Indus civilization could match this period (see in particular the geographical distribution of R1a-M780) (7).

To evaluate the potential of R1a clades for studying archaeological events, the authors used two approaches to estimate the age of the common ancestors of the different subclades of R1a. The first uses Y-STR markers together with an assumed STR mutation rate. The resulting dates depend strongly on the rate chosen: some rates greatly underestimate the ages, whereas dates obtained with the Zhivotovsky rates are over-estimated and should be considered upper bounds. The second approach is based on complete sequencing of the Y chromosome. For this, 13 individuals (8 R1a and 5 R1b) from the younger European and Asian clades of R1a and R1b were tested, and 928 SNPs belonging to the R1 tree were used.

There is no consensus yet on the mutation rate to be used for sequencing data covering 9.99 million base pairs. Estimates vary from one mutation every 100 years to one every 162 years. Using a rate of one mutation every 122 years, the authors estimated that R1a and R1b separated about 25,000 years ago. The age of R1a-M417 is estimated at 5,800 years. The shape of the tree obtained for the R1a clades suggests a rapid diffusion of these haplogroups. In conclusion, the authors of this study believe that haplogroup R1a spread from Iran and eastern Turkey to other geographic locations about 5,800 years ago. This implies a dispersal/growth during the Bronze Age (7).
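The dependence of a sequencing-based age on the assumed mutation rate can be made concrete with simple arithmetic: the age is the average number of SNPs accumulated since the common ancestor multiplied by the number of years per mutation. The sketch below uses the three rates quoted above; the SNP count of 48 is a hypothetical value chosen only because it gives an age near 5,800 years at the 122-year rate, and it is not taken from the study.

```python
# Years per mutation over the sequenced region, as quoted in the text.
YEARS_PER_MUTATION = {
    "fastest rate": 100,
    "rate used by the authors": 122,
    "slowest rate": 162,
}

# Hypothetical average number of SNPs separating living men from the
# common ancestor of the clade (an assumption for illustration only).
HYPOTHETICAL_SNP_COUNT = 48


def age_from_snps(snp_count, years_per_mutation):
    """Estimated age of a common ancestor in years."""
    return snp_count * years_per_mutation


def age_range(snp_count):
    """Age estimate under each of the quoted mutation rates."""
    return {
        label: age_from_snps(snp_count, rate)
        for label, rate in YEARS_PER_MUTATION.items()
    }


if __name__ == "__main__":
    for label, age in age_range(HYPOTHETICAL_SNP_COUNT).items():
        print(f"{label}: about {age} years")
```

With these inputs the same SNP count gives ages between 4,800 and 7,776 years, which shows why the choice of rate matters when matching a clade to an archaeological period.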

R1a and Subclades

To infer the geographic origin of hg R1a-M420, authors in the above paper identified populations harbouring at least one of the two most basal haplogroups and possessing high haplogroup diversity. Among the 120 populations with sample sizes of at least 50 individuals and with at least 10% occurrence of R1a, just 6 met these criteria, and 5 of these 6 populations reside in modern-day Iran. Haplogroup diversities among the six populations ranged from 0.78 to 0.86 (Supplementary Table 4) (7). Of the 24 R1a-M420*(xSRY10831.2) chromosomes in our data set, 18 were sampled in Iran and 3 were from Eastern Turkey. Similarly, five of the six observed R1a1-SRY10831.2*(xM417/Page7) chromosomes were also from Iran, with the sixth occurring in a Kabardin individual from the Caucasus. Owing to the prevalence of basal lineages and the high levels of haplogroup diversities in the region, we find a compelling case for the Middle East, possibly near present-day Iran, as the geographic origin of haplogroup R1a (7).
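The haplogroup diversity values quoted above can be illustrated with a short calculation. The sketch below assumes Nei's unbiased gene diversity estimator, a common choice for this statistic; the text does not state which formula the authors used, and the sample assignments are invented.

```python
from collections import Counter


def haplogroup_diversity(assignments):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum of squared frequencies).

    `assignments` is a list with one haplogroup label per sampled man.
    """
    n = len(assignments)
    if n < 2:
        raise ValueError("at least two individuals are needed")
    counts = Counter(assignments)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_sq)


if __name__ == "__main__":
    # Invented sample: ten men assigned to four haplogroups.
    sample = ["R1a"] * 4 + ["J2"] * 3 + ["R2"] * 2 + ["H"]
    print(f"diversity = {haplogroup_diversity(sample):.2f}")
```

A value of 0 means every sampled man carries the same haplogroup, and values near 1 mean almost every man carries a different one.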

In Figure 3 and ref 8, it is pointed out that there is at least one West Asian sequence (from Turkey) within the M420 paragroup that appears to form an independent R1a1a lineage, shown as Branch B. Similarly, there are one Indian and one Norwegian sequence that are shown with North European R1a1a as Branch A. However, both can be interpreted in terms of West Asian centrality within this key paragroup (see Figure 3) (8).

  • Branch A went back to West Asia from where it spread again to Eastern Europe and Central South Asia.

  • Branch B is actually at the origin of the two derived and highly spread subhaplogroups.

Whatever the case it is understood that there are good reasons to think that these spread first from West Asia, at the very least Z93 and very likely also East European Z282 (Figure 3) (ref 8).

R1a1a1b2 Subclade (i.e. Z93, see Figure 3)

There is nothing European in this lineage: only some lesser terminal branches at the Southern Urals, roughly where the Kurgan phenomenon began some 6000 years ago.

This detail is indeed remarkable because, if, as often argued, R1a or some of its subclades spread from there, we should expect at least some basal diversity being retained. Instead all we see are some highly derived branches. So the main conclusion must be that the expansion of R1a does not seem related to the Kurgan phenomenon, except maybe in some secondary instances.

As mentioned before, this lineage is Central and South Asian and comprises the vast majority of R1a in those two regions.

The detailed haplotype network can be seen in Supp. Info fig. 2 (7).

Downstream Z93 or terminal clades:

  • Z93* has three apparent distinct branches stemming from West Asia (incl. Caucasus) and another one from South Asia/Altai (1).

  • Z95* has two apparent distinct branches:

    • A small one with presence in West Asia and Southern Europe

    • Another one (pre-M780?) stemming from South or West Asia

  • M780 (L657) has clear origins in South Asia (incl. most Roma lineages)

  • Z2125 also appears to originate in South Asia, even if it has a greater spread outside it, notably to Central Asia

  • M560 and M582 appear related and surely originated in West Asia

Therefore the origin of Z95 should be thought of as West-South Asian but undecided between the two regions: Afghanistan, for example.

  • Z93

In this case I would say that West Asia is almost certainly the origin, although tending to Central/South Asia. For example: Iran again.

So, regardless of whether the previous stage (M417) represents a stay in West Asia or a back-migration from Europe into West Asia, West Asia is clearly at the origin of Z93. It does not represent any Kurgan migration but an Asian phenomenon with origins towards the West (around Iran) (Figure 3; ref 8).


Figure 3:  Schematic presentation of Origin of R1a and younger downstream clades of R1a. Migration of R1a-M417 clade along with downstream branches based on phylogenetic and geographic data on Y chromosome R1a haplogroup by Underhill (7-8).   

R2 Haplogroup

  • R2 is linked to early prehistoric Neolithic migration to India, its distribution was found in all populations studied but was not specifically linked to any one group in India (6).

  • Possible time of Origin 12000 ybp.

  • Possible place of origin South Asia or Central Asia

Ancestor  R

  • Descendants R2a

  • Defining mutation M479

  • Highest Frequency South Asia

G and L Haplogroups

Small numbers of haplogroup G and L carriers are found in the Shia, Sunni and Hindu caste populations of U.P., India. The geographic spread of these haplogroups includes the north-eastern Middle East, Iran and Iraq, and in India they are linked to Neolithic migration (6).


Haplogroup R1b, also known as haplogroup R-M343, is the most frequently occurring Y-chromosome haplogroup in Western Europe, as well as in some parts of Russia (the Bashkir minority), Central Asia (e.g. Turkmenistan) and Central Africa (e.g. Chad and Cameroon) (5).

Haplogroups C, H, F, I, K, O and P

Haplogroups H and F are found in indigenous/tribal populations in India. Contributions from the above haplogroups were small but present in both Hindu caste and Muslim populations (6).


Aim 3.  Y-STR haplotypes

Unlike UEPs, Y-STRs mutate much more easily, which allows them to be used to distinguish recent (as well as ancient) genealogy. For the same reason, i.e. because STR mutations are neither rare nor stable, the Y-STR haplotypes of a population are likely to have spread apart, forming a cluster of more or less similar results. Typically, this cluster has a definite most probable centre, the modal haplotype (presumably similar to the haplotype of the original founding event), and a haplotype diversity, the degree to which it has become spread out.

If the population growth took place further back in the past, counted from the STR-defining event (i.e. the modal haplotype), the haplotype diversity for a given number of descendants is greater. Conversely, a smaller haplotype diversity for a given number of descendants may indicate a more recent common ancestor or a recent population expansion.
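The two quantities described above, the modal haplotype and the spread around it, can be sketched in a few lines. This is only an illustration: the repeat counts are invented, and the mean squared difference from the modal value is an assumed, simple stand-in for the more formal diversity statistics used in population genetics.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Return the most common repeat count at each marker position."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*haplotypes))

def mean_squared_distance(haplotypes, center):
    """Average squared repeat-count difference from the center, per marker."""
    total = sum((a - c) ** 2 for h in haplotypes for a, c in zip(h, center))
    return total / (len(haplotypes) * len(center))

# Hypothetical repeat counts at four markers for five men (not project data).
sample = [
    (13, 25, 16, 10),
    (13, 25, 16, 10),
    (13, 24, 16, 10),
    (13, 25, 15, 11),
    (13, 25, 16, 10),
]

modal = modal_haplotype(sample)                # center of the cluster
spread = mean_squared_distance(sample, modal)  # larger value = older or broader cluster
```

A tight cluster gives a small spread; the same number of men with a larger spread would point to an older founding event under this simplified picture.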

It is important to note that, unlike for UEPs/SNPs, two individuals with a similar Y-STR haplotype do not necessarily share a similar ancestry, because Y-STR events are neither unique nor stable. Instead, the clusters of Y-STR haplotype results inherited from different events and different histories tend to overlap. In most cases a long time has passed since a haplogroup's defining event, so the cluster of Y-STR haplotype results associated with the descendants of that event has typically become rather broad. These results tend to overlap significantly with the (similarly broad) clusters of Y-STR haplotypes associated with other haplogroups. This makes it impossible for researchers to predict with absolute certainty which Y-DNA haplogroup a Y-STR haplotype points to.

STR tests are done by Family Tree DNA and can trace a male lineage within genealogical/historic times. "Your genealogical connections will be shown on the Y-DNA – Matches page of your myFTDNA account. The Y-DNA – Ancestral Origins page of your myFTDNA account will point towards possible countries of origin" (9).

Aim 4: Maternal Ancestry

mtDNA tests can be used to test the direct maternal lineage, i.e. the haplogroup passed from the mother's mother to the mother to the child. mtDNA mutates much more slowly than Y-DNA, so it is really only useful for determining distant maternal ancestry. mtDNA results are generally compared to a common reference sequence called the Cambridge Reference Sequence (CRS) to identify the specific haplotype, a set of closely linked alleles (variant forms of the same gene) that are inherited as a unit. People with the same haplotype share a common ancestor somewhere in the maternal line. This ancestor could be as recent as a few generations back, or dozens of generations back in the family tree. mtDNA testing is generally done in two regions of the genome known as hyper-variable regions: HVR1 (16024-16569) and HVR2 (00001-00576). HVR1 and HVR2 test results also help identify the ethnic and geographic origin of the maternal line. The evolutionary tree of human mitochondrial DNA (mtDNA) haplogroups gives more information on the haplotypes/haplogroups discussed below (10).

The extant Iraqi Muslim population in Eastern U.P., like other Indo-Muslim groups in U.P. (i.e. Shia and Sunni), may represent descendants of Hindu converts and offspring of Hindu mothers. Alternatively, the Iraqis may consist entirely of descendants of Middle Eastern (Arab/Iraq) or Central Asian (Iran, Turkic clans) migrants without admixture from surrounding populations. This project is intended to test these and other hypotheses.
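The comparison against a reference sequence mentioned above amounts to listing the positions where a sample differs from the reference. A minimal sketch, assuming the two sequences are already aligned; the bases and the start coordinate are invented for illustration and are not taken from the CRS.

```python
def differences_from_reference(reference, sample, start):
    """List (position, reference_base, sample_base) where the sample differs."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start + i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]

# Toy 10-base stretch; sequences and numbering are hypothetical.
reference = "ACCTGATTCA"
sample = "ACCTAATTCG"

diffs = differences_from_reference(reference, sample, start=1)
# Short labels in the usual "position + new base" style, e.g. "5A".
labels = [f"{pos}{alt}" for pos, _, alt in diffs]
```

The resulting list of differences is what a haplogroup assignment is built on: a set of differences shared with a known branch of the mtDNA tree places the sample on that branch.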

Genetic studies on maternal ancestry in the extant Iraqi population have not been done before. However, some information on Shia and Sunni Muslim groups in U.P. and other populations is summarized here (see also ref 11). The dominant maternal haplogroups found in the two Muslim populations are M (more than 50%), R and U, which together are known as South Asian haplogroups for the following reasons. These haplogroups are also found at high frequency in Hindu castes (Brahmans, Bhargavas, Chaturvedis) in U.P. and in the Hunza people of Pakistan. They are reported to be minimal to nonexistent in Middle Eastern, Central Asian, and North East African populations; West Asian maternal haplogroups include H, I, J, K and T.

It is also important to know that none of the Indo-Muslim groups possesses any of the ancient mitochondrial haplogroups M2, U2i, R5 and R6, which have a coalescence time of more than 50,000 ybp; these are found in 15% of the general Indian population. In addition, the Shia population shows a greater proportion of the India-specific mitochondrial haplogroup M and has its subclades, e.g. M3, M4, M25, etc. By contrast, the Sunni group lacks the M subhaplogroups, but maternal haplogroup R is more frequent in the Sunni population (11).

These data are consistent with the view that the mitochondrial haplogroup affinity of both Sunni and Shia Muslim groups in U.P. lies much more with upper caste Hindu population groups. The finding that M and R haplogroup frequencies are minimal in Middle Eastern, Central Asian, and North East African populations supports the view that the M and R lineages carried by the Muslim population groups derive from the female population of South Asia rather than from outside.

The affinity in maternal ancestry between the Sunni and Shia populations and the Hindu caste groups suggests that the former two groups show gene admixture as a result of gene flow from Hindu groups. This may be the result of conversion of the Hindu population to the Sunni Islamic faith by both males and females, i.e. family oriented; the conversion to the Shia faith may have been female biased (see ref 11).



Preliminary DNA Test Data


Table 2:  List of Haplogroup/subhaplogroup/ STR-Haplogroup/Maternal Haplogroup based on Genetic Tests performed on Iraqi individuals native to Eastern Uttar Pradesh, India.

ID | Origin/Birth | Residence/Age | Y Haplogroup | Sub-Haplogroup and SNP | Y-DNA STR* | Maternal Haplogroup | DNA Tests*
----------------------------------------------------------------------------------------------------
001 | Lar/Deoria | USA/27 | R1a1a | Z95+/Y40+/Y39+/YP294-/Y42- | | M3c | 23andMe/YSEQ
002 | Lar/Deoria | Canada/60 | R1a1a | Y40+/M39+ or M560+/Y42- | +* | A4b | 23andMe/YSEQ
003 | Nawanagar/Ballia | India/35 | R1a1a | Z95+/L657+/Y6+/Y7- | | M3c | 23andMe/YSEQ
004 | Gorakhpur | Pakistan | R1a1a | | | |
005 | India (F) | Karachi, Pakistan | | | | U7 | 23andMe
006 | India (F) | Karachi, Pakistan | | | | R8a | 23andMe
007 | U.P. India | Karachi | R1a1a | | | R8a | 23andMe
008 | (50% South Asia, L Nawaz, F) | D/ID 7, SNAWAZ | | | | T2a1 | 23andMe
009 | Nawanagar/Ballia, F | India/52 | | | | M3c | 23andMe
010 | India | unknown | J1a | | | U2c1 | 23andMe
011 | (50% South Asia, C Hobson, M) | unknown | R1a1a | L657+/Y6+ | | H7 | 23andMe
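As a small illustration of how the maternal haplogroups in Table 2 could be summarized, the sketch below tallies them by top-level haplogroup (the leading letter). It is a rough count only: several of the listed individuals are close relatives, so the shares are not independent population frequencies, and the sample is far too small for any conclusion.

```python
from collections import Counter

# Maternal haplogroups as listed in Table 2 (ID 004 has none recorded).
maternal = {
    "001": "M3c", "002": "A4b", "003": "M3c", "005": "U7", "006": "R8a",
    "007": "R8a", "008": "T2a1", "009": "M3c", "010": "U2c1", "011": "H7",
}

# Group by the leading letter, i.e. the top-level haplogroup.
major = Counter(h[0] for h in maternal.values())
total = sum(major.values())
shares = {group: count / total for group, count in major.items()}

for group, count in major.most_common():
    print(f"{group}: {count} of {total}")
```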

*B. BI-PARENTAL AUTOSOMAL DNA TEST: The 23&Me autosomal DNA results for the IDs denoted Lar001 and Lar002 in Table 2 show an ethnic composition that is mainly (98-100%) South Asian, with minor proportions of European and Middle Eastern ethnicity. Lar001, Lar002, Karachi005 and Karachi006 are shown as DNA relatives, ranging from close to distant cousins.

Table 2 lists the Iraqi Biradri individuals by ID number, together with their Y-chromosome and maternal DNA haplogroups and the DNA relationships between individuals; it is mainly based on DNA tests by 23&Me. The autosomal DNA data establish that Lar001 shares 47% of his DNA with Lar002, i.e. one half of each chromosome 1-22 is similar to Lar002's chromosomal DNA, consistent with their known close relationship (sibling/parent). Similarly, Karachi005 shares 15% of her DNA sequences with Karachi006 (the two are known first cousins). The DNA relationships below are characterized by half-identical DNA at multiple chromosome locations, with segment size in cM, between two or more individuals.

Table 2a: *Detailed chromosomal segment (half-identical) sharing between known families. IDs Lar001 and/or Lar002 share DNA with ID Karachi005. 23&Me estimates the identical autosomal DNA at about 1%, equivalent to 3rd to 4th cousins; between Lar001 and Karachi006 it is less than 0.5%. Note that ID Karachi005, and possibly Karachi006, descend paternally from the distinguished ancestor ZH Lari, of a reputed U.P. family of the past, fraternal to the well-known family of Haji Qaiyum/Tayyab Sahab/Ms Ghazala at Lar/Gorakhpur. According to the oral/civil records, Lar001 and Karachi005 have no common ancestor in the past 8 generations, although the families lived side by side until 60 years ago.

Conclusion: 23&Me estimates the identical DNA between Karachi005 and Lar001/Lar002 at about 1%. Given the actual knowledge, as noted above, that the most recent common ancestor lies 10 or more generations back, the DNA relative estimate should be a distant cousin relationship and not the 3rd-4th cousin relationship estimated by 23&Me.
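To see why about 1% shared DNA is read as a 3rd to 4th cousin match, the textbook expectation for full cousins can be computed directly. This is a sketch under simplifying assumptions: one shared ancestral couple, no other relatedness between the families, and average rather than actual inheritance. Real sharing varies around these averages, and repeated marriage within a community can raise it.

```python
def expected_shared_fraction(cousin_degree):
    """Expected autosomal fraction shared by full cousins of the given degree."""
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    # Each extra cousin degree adds one generation on both sides (factor 1/4).
    return 0.5 ** (2 * cousin_degree + 1)

# Expected percentage for 1st, 3rd, 4th and 9th cousins.
expected_percent = {k: 100 * expected_shared_fraction(k) for k in (1, 3, 4, 9)}

for degree, pct in expected_percent.items():
    print(f"cousin degree {degree}: {pct:.4f}%")
```

Under these assumptions first cousins average 12.5%, 3rd cousins about 0.78% and 4th cousins about 0.2%, while 9th cousins (a shared couple 10 generations back) average far below 0.01%. A measured 1% between families with no known link in 8 to 10 generations is therefore higher than a single distant ancestral couple would predict on average.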

Comparison | Chrom | Start position | End position | Genetic distance (cM) | # of SNPs | Identity
----------------------------------------------------------------------------------------------------
Karachi005 (Female, U7) | 5 | 126897060 | 142129500 | 12.9 | 2733 | Half
Lar002 (R1a1a, A4b) & Karachi005 (Female, U7) | 5 | 127375931 | 142133585 | 11.9 | 2626 | Half
Lar001 (R1a1a, M3) & Karachi005 (Female, U7) | 13 | 73921855 | 78662693 | 5.94 | 1140 | Half
Karachi005 (Female, U7) | 13 | 73911462 | 78408655 | 5.91 | 1110 | Half

More details of chromosomal DNA

Karachi006 (Female, R8a) & Lar001 (R1a1a, M3) | 5 | 126897060 | 141801216 | 11.42 | 2584 | Half
Karachi006 (Female, R8a) & Lar002 (R1a1a, A4b) | 5 | 126897060 | 141801216 | 11.42 | 2586 | Half


Table 2b: *Chromosomal DNA sharing between the known kits Lar001/Lar002 and an uncharacterized kit under the 23&Me Ancestry DNA family section. The latter kit belongs to a female (European maternal haplogroup T2a1) whose South Asian father is Shahid Nawaz (R1a1a haplogroup, R8a maternal). Father and daughter each share about 1% DNA with the Lar kits, which corresponds to a 3rd to 4th cousin relationship. The major sharing at multiple chromosomal locations is shown below. Conclusion: The match presented below may indicate that a common paternal ancestor existed 10-15 generations earlier in Eastern U.P., India, in South Asia.

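Comparison | Chrom | Start position | End position | Genetic distance (cM) | # of SNPs | Identity
----------------------------------------------------------------------------------------------------
Lar001 (R1a1a, M3) & Female (European T2a1) | 7 | 39610117 | 45523448 | 6.58 | 1077 | Half
Lar002 (R1a1a, A4b) & Female (European T2a1) | 7 | 39422795 | 45370510 | 6.57 | 1100 | Half
Lar001 (R1a1a, M3) & Female (European T2a1) | 12 | 27921968 | 52163789 | 16.28 | 4161 | Half
Lar002 (R1a1a, A4b) & Female (European T2a1) | 12 | 27921968 | 51816998 | 16.5 | 4192 | Half
Lar001 (R1a1a, M3) & Female (European T2a1) | 16 | 13300207 | 19028549 | 8.37 | 929 | Half
Lar002 (R1a1a, A4b) & Female (European T2a1) | 16 | 13665563 | 19192727 | 8.57 | 863 | Half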


Table 2c: *Chromosomal sharing between individuals from Iraqi families. The first-cousin Iraqi kits Karachi005/Karachi006 share about 0.10%-0.32% DNA with the putative Iraqi family (daughter, T2a1; father Shahid Nawaz, R1a1a, R8a); the chromosomal sharing pattern is shown below.

Conclusion: Considering these two sets of individuals from distinct families, the DNA sharing pattern is consistent with a common ancestor who lived more than 15 generations ago.

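Comparison | Chrom | Start position | End position | Genetic distance (cM) | # of SNPs | Identity
----------------------------------------------------------------------------------------------------
Karachi005 (Female, U7) & Female (European T2a1) | 4 | 181492677 | 184529193 | 7.66 | 879 | Half
Karachi006 (Female, R8a) & Female (European T2a1) | 4 | 110910016 | 126915614 | 13.76 | 2849 | Half
Karachi006 (Female, R8a1) & Female (European T2a1) | 8 | 99374659 | 109477449 | 9.81 | 2023 | Half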

Table 2d: *Smaller amounts (0.10-0.12%) of DNA shared between Biradri kits 001/002 and other samples as distant/unknown cousins. The latter samples share DNA (7-9 cM) at a single common location or at different locations. This type of DNA match may, however, highlight the migration pattern of our presumed common ancestors who lived in the Bronze Age, 1000-2000 BCE. At that time the ancestors mainly carried markers such as M417, downstream of R1a (in combination with other contemporary Y or maternal clades/haplogroups); as steppe nomads they migrated to West Asia, and their descendants expanded rapidly, in separate branches, into East Europe, Central Asia and South Asia.

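North America/East Europe Match

Comparison | Chrom | Start position | End position | Genetic distance (cM) | # of SNPs | Identity
----------------------------------------------------------------------------------------------------
Female (Poland, H5a1) | 18 | 11879931 | 22492049 | 7.83 | 1156 | Half
Lar002 (R1a1a, A4b) & Female (U.S.A, U3b1) | 18 | 11289231 | 22492049 | 9.44 | 1289 | Half
Female (U.S.A, H1) | 18 | 11283187 | 21702106 | 8.32 | 1045 | Half
Male (R-S660, J1c7a) | 18 | 10827757 | 18789301 | 7.56 | 732 | Half
Female (U.S.A, U5b2a1a) | 18 | 11283157 | 22143129 | 9.02 | 1144 | Half
Male (R-U152, T2b) | 18 | 11289231 | 20248798 | 7.14 | 804 | Half
Lar002 (R1a1a, A4b) & Male, Ukraine (R-CTS3403, H) | 18 | 10927139 | 19626195 | 7.54 | 772 | Half
Male (Sweden, I-L205.1, V) | 18 | 11186392 | 20700413 | 7.68 | 864 | Half
Lar002 (R1a1a, A4b) & Female (Finland, H11a) | 18 | 13358206 | 23786855 | 7.68 | 1175 | Half
DW, Female (Europe, U5a1a1) | 16 | 7657432 | 11218606 | 8.75 | 1079 | Half
DW, Female (Europe, U5a1a1) | 16 | 7953180 | 11218606 | 7.78 | 955 | Half

South Asia/India Match

Comparison | Chrom | Start position | End position | Genetic distance (cM) | # of SNPs | Identity
----------------------------------------------------------------------------------------------------
NO, Female (South Asia, N1e’1) | 11 | 71221574 | 80025521 | 9.05 | 1585 | Half
Lar002 (R1a1a, A4b) & NO, Female (South Asia, N1e’1) | 11 | 71231233 | 80099803 | 9.38 | 1606 | Half
Lar002 (R1a1a, A4b) & SS, Female (South Asia, U7) | 2 | 18206898 | 23664914 | 7.42 | 1239 | Half
Lar001 (R1a1a, M3) & SS, Female (South Asia, U7) | 2 | 18206898 | 23653691 | 7.41 | 1237 | Half
AJ, Male (R-L266, M3a) | 14 | 88715918 | 92875213 | 7.10 | 908 | Half
AJ, Male (R-L266, M3a) | 14 | 88734134 | 92923034 | 7.02 | 933 | Half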


C.  Ancient European Origins (result of a representative individual from the Family Finder test performed by Family Tree DNA, FTDNA)

      Hunter-Gatherer DNA           0%

      Early Farmers DNA            12%

      Metal-Age Invader DNA        65%

      Non-European DNA             23%

D.  Ethnicity proportions (autosomal DNA by Family Finder test, FTDNA)

      South Central Asian DNA      44%

      Central Asian DNA, ANI       54%

      Others as minor DNA           2%



Autosomal DNA SNPs Test:  Admixture DNA Proportions related to ANCESTRAL INDIAN and WEST ASIAN  proportions  (GEDMatch Applications on 23&Me DNA test)

The Ancestral Indian proportion in an Iraqi Biradri kit (autosomal DNA test by 23&Me) ranges from 17% to 44% across two separate calculators. Examining the population approximations (1, 2, 3 or 4 population groups), each with a smaller genetic distance, helps to explain the admixture data. The DNA is consistent with 44% Indian DNA as ASI, including ~17% inherited from very distant common ancestors, i.e. ancient ancestral population(s); it is closer to the modern GujaratiB/U.P. Brahmin populations than to the North Pakistani Burusho.

The Caucausus_Hunter_Gatherer (CHG) component is also noteworthy; it is genetically similar to people who lived earlier in the Zagros Mountain region of Western Iran, recognized here as Iran_Neolithic farmers and distinct from Early_Neolithic European farmers. Modern and even Bronze Age populations from South Asia show DNA proportions similar to the Iran Neolithic farmer component, which is found more in West Asian/Caucasus, Central Asian and European populations. However, the 36% Iran Neolithic noted in the kit described below possibly includes 10% DNA modal to a West Asian population, possibly inherited from a proto-Iranic population or steppe pastoralists, distinct from the earlier Neolithic West Asian population with the J2 haplogroup. (The 10% ancestral proportion comes from phasing the autosomal DNA into paternal and maternal halves; 23&Me, GEDmatch/MDLP World-22 calculator.) These observations suggest that the original kit carries two kinds of ancestral autosomal DNA (SNP) stretches and that the admixed proportion of ancestral Indian/Hindu DNA is greater than the ancestral West Asian DNA, a genetic effect of gene flow from other populations native to U.P., India, during the last 700 years of growth [A].

1.  Eurasia K9 ASI 4-Ancestors Oracle (this calculator has been designed for individuals of predominantly South Asian and West Asian ancestry)

This program is based on 4-Ancestors Oracle Version 0.96 by Alexandr Burnashev. Questions about results should be sent to him at: Alexandr.Burnashev@gmail.com. Original concept proposed by Sergey Kozlov.
Many thanks to Alexandr for helping us get this web version developed.

Admix Results (sorted):

# Population Percent

  1. Caucausus_Hunter_Gatherer 50.53 

  2. Ancestral_South_Indian 17.61 

  3.   SE_Asian 12.3

  4.  Eastern_Hunter_Gatherer 8.84 

  5.  WHG 3.13

  6. Siberian_E_Asian 3.13

  7.  Early_Neolithic_Farmers 2.75

  8.  W_African 1.01

Least-squares method.

Using 1 population approximation:
1. Burusho @ 6.267201
2. Punjabi @ 9.009089
3. Pathan @ 11.293012
4. Bengali @ 13.289005
5. Kurd_SE @ 14.754052
6. Pashtun_Afghan @ 15.224401
7. Kalash @ 15.929560
8. Tajik_Afghan @ 18.837255
9. Balochi @ 20.230923
10. Brahui @ 20.682590
11. Uzbek_Afghan @ 21.399572
12. Tajik_Pomiri @ 22.135880
13. Makrani @ 22.377605
14. Hazara_Afghan @ 25.918503
15. Paniyas @ 28.288652
16. KOTIAS @ 28.7028
17. Puliyar @ 29.717783
18. Lezgin @ 31.874170
19. Azeri_Dagestan @ 32.499008
20. Turkmen @ 33.221107

Using 2 populations approximation:
1 50% Bengali +50% Kalash @ 2.506377

Using 3 populations approximation:
1 50% Burusho +25% Kurd_SE +25% Paniyas @ 1.856417

Using 4 populations approximation:
1 Burusho + Burusho + Kurd_SE + Paniyas @ 1.856417
2 Burusho + Kurd_SE + Paniyas + Punjabi @ 2.213182
3 Ho + Kalash + Punjabi + Punjabi @ 2.252892
4 Kalash + Kalash + Kharia + Punjabi @ 2.319063
5 Kalash + Kharia + Punjabi + Punjabi @ 2.336960
6 Bengali + Bengali + Kalash + Pathan @ 2.350164
7 Burusho + Burusho + Kurd_SE + Puliyar @ 2.451984
8 Ho + Kalash + Kalash + Punjabi @ 2.486270
9 Burusho + Kurd_SE + Paniyas + Pathan @ 2.489619
10 Ho + Kalash + Kurd_SE + Punjabi @ 2.493097
11 Burusho + Burusho + Kalash + Paniyas @ 2.505259
12 Bengali + Bengali + Kalash + Kalash @ 2.506377
13 Bengali + Bengali + Kalash + Pashtun_Afghan @ 2.674645
14 Bengali + Bengali + Burusho + Kalash @ 2.684473
15 Bengali + Bengali + Kalash + Punjabi @ 2.701431
16 Kalash + Kharia + Kurd_SE + Punjabi @ 2.713263
17 Burusho + Ho + Kalash + Kalash @ 2.767573
18 Ho + Kalash + Pathan + Punjabi @ 2.841585
19 Burusho + Kalash + Kalash + Kharia @ 2.901459
20 Burusho + Ho + Kalash + Punjabi @ 2.903419
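The "@" values in the lists above are distances between the kit's admixture percentages and those of a reference population, or of an equal-parts blend of several populations; smaller means a closer fit. The sketch below shows the idea with a plain Euclidean distance, which is an assumption about how the Oracle computes its figure. The kit values are the first four K9 components reported above; the two reference vectors are invented placeholders, not calculator data.

```python
import math

def distance(a, b):
    """Euclidean distance between two admixture vectors (same component order)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mix(weighted_pops):
    """Weighted average of reference vectors; weights are fractions summing to 1."""
    size = len(weighted_pops[0][1])
    return [sum(w * vec[i] for w, vec in weighted_pops) for i in range(size)]

# Kit percentages for CHG, ASI, SE_Asian and EHG, as listed in the K9 results.
kit = [50.53, 17.61, 12.30, 8.84]

# Invented reference vectors for two placeholder populations.
pop_a = [60.0, 10.0, 10.0, 10.0]
pop_b = [40.0, 26.0, 14.0, 8.0]

d_a = distance(kit, pop_a)
d_b = distance(kit, pop_b)
d_mix = distance(kit, mix([(0.5, pop_a), (0.5, pop_b)]))
```

With these placeholder numbers the 50/50 blend fits far better than either population alone, which mirrors how the 2, 3 and 4 population approximations above reach smaller distances than the single-population list. A good fit to a blend describes similarity of proportions; it does not by itself show that the kit descends from those populations.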

2. Near East Neolithic K13 4-Ancestors Oracle


Admix Results (sorted):

# Population Percent


  2.  IRAN_NEOLITHIC 35.18

  3.   EHG 8.98

  4.  CHG_EEF 3.43

  5.  PAPUAN 2.06

  6.   POLAR 1.90

  7.  SE_ASIAN 1.83

Least-squares method.

Using 1 population approximation:

1. GujaratiB @ 3.860208
2. Punjabi @ 4.983410
3. GujaratiA @ 7.722285
4. GujaratiC @ 8.861904
5. Sindhi @ 11.800466
6. Burusho @ 12.736398
7. GujaratiD @ 13.240778
8. Bengali @ 15.237478
9. Pathan @ 18.255499
10. Kurd_SE @ 20.662148
11. Kalash @ 23.601383
12. Balochi @ 26.028467
13. Brahui @ 26.840475
14. Pashtun_Afghan @ 29.363560
15. Makrani @ 29.772766
16. Iranian_Bandari @ 34.5227
17. Tajik @ 37.116909
18. Iranian_Shirazi @ 43.943119
19. Iranian @ 44.875771
20. Iranian_Mazandarani @ 45.189472

Using 2 populations approximation:
1 50% GujaratiB +50% GujaratiB @ 3.860208

Using 3 populations approximation:
1 50% Bengali +25% GujaratiA +25% Kurd_SE @ 2.330745

Using 4 populations approximation:
1. Bengali + Bengali + GujaratiA + Kurd_SE @ 2.330745
2. Bengali + Burusho + GujaratiB + GujaratiB @ 2.756356
3. Bengali + GujaratiA + GujaratiC + Kurd_SE @ 2.802253
4. Bengali + Bengali + Burusho + Kurd_SE @ 2.874341
5. Bengali + Bengali + Kurd_SE + Sindhi @ 2.915078
6. Bengali + GujaratiB + Kurd_SE + Punjabi @ 2.9294
7. Bengali + Kurd_SE + Punjabi + Punjabi @ 2.953123
8. Bengali + GujaratiB + GujaratiB + Pathan @ 2.962659
9. Bengali + Bengali + Kurd_SE + Pathan @ 2.988634
10. Bengali + GujaratiB + GujaratiC + Kurd_SE @ 3.011317
11. Bengali + Burusho + GujaratiD + Kurd_SE @ 3.042461
12. Bengali + GujaratiA + GujaratiB + Sindhi @ 3.069682
13. Bengali + GujaratiA + GujaratiA + GujaratiB @ 3.092335
14. Bengali + GujaratiB + GujaratiB + Kurd_SE @ 3.093217
15. Bengali + GujaratiA + GujaratiD + Kurd_SE @ 3.133299
16. Bengali + Burusho + GujaratiB + Punjabi @ 3.140735
17. Bengali + GujaratiB + Pathan + Punjabi @ 3.171373
18. Bengali + Burusho + GujaratiA + GujaratiB @ 3.175251
19. Bengali + Bengali + GujaratiB + Kalash @ 3.191180
20. Bengali + GujaratiB + GujaratiB + Sindhi @ 3.193273

PuntDNAL K10 Ancient Admixture Proportions

This calculator incorporates the newly discovered Caucasus HG as well as Early Neolithic Farmers and Western European HG. Questions and comments about this calculator should be directed to Abdullahi Warsame at puntdnalking@gmail.com

CHG is believed to have left its imprint on modern populations from the Caucasus and also Central and South Asia, possibly marking the arrival of Indo-European languages.

puntDNAL K10 Ancient 4-Ancestors Oracle


Admix Results (sorted):

# Population Percent

  1.   ASI   53.21

  2.   CHG   29.75

  3.   WHG   7.13

  4.  E_Asian   2.82

  5.  Amerindian   2.45

  6.   ENF   1.72

  7.   Beringian   1.16

Finished reading population data. 108 populations found.

10 components mode. Least-squares method.

Using 1 population approximation:

1. UP_Brahmin @ 4.176692
2. Punjabi @ 9.576191
3. Velama @ 15.593409
4. Tamil_Nadu @ 17.645811
5. Sindhi @ 17.960485
6. Pathan @ 19.4969
7. Burusho @ 20.761257
8. Piramalai_Kallars @ 23.869427
9. Kalash @ 28.102774
10. Brahui @ 35.144211
11. Pashtun @ 35.712833
12. Makrani @ 37.952305
13. Balochi @ 38.648911
14. Pulliyar @ 47.644615
15. Hazara @ 53.721985
16. Uzbek @ 54.214390
17. Iranian @ 55.362373
18. Chechen @ 56.438847
19. Nogai @ 58.634083
20. Kurdish @ 58.839500

Using 2 populations approximation:
1 50% UP_Brahmin +50% UP_Brahmin @ 4.176692

Using 3 populations approximation:
1 50% UP_Brahmin +25% Burusho +25% Piramalai_Kallars @ 2.708749

Using 4 populations approximation:

1 Kalash + Punjabi + Punjabi + Punjabi @ 1.922829
2 UP_Brahmin + Burusho + Punjabi + Punjabi @ 2.275809
3 UP_Brahmin + Pathan + Punjabi + Punjabi @ 2.372204
4 UP_Brahmin + UP_Brahmin + Burusho + Piramalai_Kallars @ 2.708749
5 UP_Brahmin + Sindhi + Punjabi + Punjabi @ 2.716629
6 Kalash + UP_Brahmin + Punjabi + Piramalai_Kallars @ 2.759054
7 Kalash + UP_Brahmin + Punjabi + Tamil_Nadu @ 2.784717
8 UP_Brahmin + UP_Brahmin + Burusho + Tamil_Nadu @ 3.023813
9 UP_Brahmin + Burusho + Punjabi + Tamil_Nadu @ 3.037355
10 Pashtun + Punjabi + Punjabi + Punjabi @ 3.089234
11 UP_Brahmin + Burusho + Punjabi + Velama @ 3.097992
12 UP_Brahmin + UP_Brahmin + Pathan + Piramalai_Kallars @ 3.167916
13 Burusho + Punjabi + Punjabi + Punjabi @ 3.215161
14 Kalash + Punjabi + Punjabi + Velama @ 3.341131
15 Kalash + Punjabi + Punjabi + Tamil_Nadu @ 3.345859
16 UP_Brahmin + UP_Brahmin + Pathan + Tamil_Nadu @ 3.355330
17 Pashtun + Punjabi + Punjabi + Velama @ 3.356923
18 Kalash + UP_Brahmin + Punjabi + Punjabi @ 3.377925
19 Pathan + Punjabi + Punjabi + Punjabi @ 3.388728
20 UP_Brahmin + UP_Brahmin + UP_Brahmin + Punjabi @ 3.408858

Note: The Caucasus Hunter Gatherer component is linked to ANI, and the Onge/Andamanese component stands for ancestral South Asian/Indian ancestry (ASI). The ratio of the two (ANI/ASI) is found to be the highest in the Burusho, Pathan and Kalash in South Asia, and is higher in Brahmins than in other castes in India.

"It can also be noted that two of the Indian populations that are best modelled with D-stats as mixtures of Kotias (one of the two CHG genomes) and Onge are Dravidian speakers (Mala and Vishwabrahmin or Vishwakarma, a Malayali community), D-Stats (Yoruba x* Onge value) = 0.04. Another three are Indo-Aryans (GujaratiC, GujaratiD and Lodhi), but with high levels of Ancestral South Indian (ASI) admixture, D-Stat (Yoruba x* Onge value) = 0.05 (is <0.06, see below). On the other hand, the three populations that are best modelled as Afanasievo (a pastoralist group from the Early Bronze Age steppe, like the Yamnaya Culture) and Onge are all Indo-Aryans (GujaratiA, GujaratiB and Brahmin/Tiwari), D-Stat (Yoruba x* Onge value) = 0.06."


* Y-DNA STR Data, Family Tree DNA (DNA matches and estimation of the time to the most recent common ancestor, TMRCA; FTDNA kit from a Biradri individual, also see Table 2, ID 002)

-Y-DNA 12 Marker

DYS393=13, DYS390=25, DYS19=16, DYS391=10, DYS385=11-14, DYS426=12, DYS388=13, DYS439=10, DYS389-I=13, DYS392=11, DYS389-II=29

-Y-DNA 25 Marker

DYS458=16, DYS459=9-10, DYS455=11, DYS454=11, DYS447=24, DYS437=14, DYS448=20, DYS449=33, DYS464=12-15-15-16

-Y-DNA 37 Marker

DYS460=12, GATA-H4=12, YCAII=19-23, DYS456=17, DYS607=16, DYS576=18, DYS570=19, CDY=35-36, DYS442=14, DYS438=11

-Y-DNA 67 Marker/111 Marker

DYS531=11, DYS578=8, DYF395S1=17-17, DYS590=8, DYS537=12, DYS641=10, DYS472=8, DYF406S1=11, DYS511=10

DYS425=12, DYS413=22-22, DYS557=15, DYS594=10, DYS436=12, DYS490=12, DYS534=14, DYS450=8, DYS444=13, DYS481=23, DYS520=22, DYS446=13

DYS617=12, DYS568=11, DYS487=13, DYS572=11, DYS640=11, DYS492=12, DYS565=13

Note: Matches for this Iraqi-Indian kit. 12 markers: exact 12/12 match with 6 kits (1 Latvia, 1 Russian Federation, 1 Poland, 3 Ukraine). 25 markers: genetic distance of 2 (23/25) with 4 kits (1 Poland, 1 Ukraine, 1 Germany, 1 Luxembourg). 37 markers: no exact or allowed matches.
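The "genetic distance" quoted for STR matches can be illustrated as the total number of repeat-count steps separating two haplotypes. The sketch below uses a simple stepwise count over single-copy markers; this is an assumed simplification, since testing companies apply special rules to multi-copy markers such as DYS385 and DYS464. The kit values are taken from the 12-marker panel listed above; the second haplotype is invented.

```python
def genetic_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences over markers present in both."""
    shared = hap_a.keys() & hap_b.keys()
    return sum(abs(hap_a[m] - hap_b[m]) for m in shared)

# Single-copy markers from the kit's 12-marker panel.
kit = {"DYS393": 13, "DYS390": 25, "DYS19": 16, "DYS391": 10, "DYS426": 12}

# Hypothetical match differing by one repeat at two markers (invented values).
other = dict(kit, DYS390=24, DYS391=11)

gd = genetic_distance(kit, other)
```

Here the two haplotypes differ by one step at each of two markers, so the distance is 2, comparable to the 23/25 matches noted above. A small distance on few markers is weak evidence of recent kinship, which is why matches that hold at 12 markers often disappear at 37.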

Conclusion: The DNA matches between the Iraqi/Indian kit and individuals reporting the ancestral countries listed above may reflect the migration of ancient common R1a1 ancestors (carrying the M417 haplogroup) as opposed to a contribution by recent ancestors who lived in India. Carriers of this haplogroup bear the basal M420/M512 markers and existed 4-5 thousand years ago as the Yamnaya Culture in the North Caucasus region, from where they migrated to the steppes/Russia, East Europe, Central Asia and South Asia, and they are also found in the Middle East. The lack of STR data from Biradri individuals in India or Pakistan may be the reason why no DNA matches compatible with recent common ancestors, living 2-26 generations ago, have been found.



1.  http://en.wikipedia.org/wiki/Ghazipur

2.  Sayyid. Wikipedia.

3.  Taqi Husaini (S M). Syeds of Ghazipur, Vol. 1. Connemara Public Library, Call Number 297.64 TAQ.1; "Malikus Sadat" - Syed Masood al-Husaini.

4.  Official Website of Anjuman Iraqi Biradri.

5.  Human Y-chromosome DNA haplogroup. Wikipedia.


7.  European Journal of Human Genetics - The phylogenetic ...

8.  For what they were... we are: Y-DNA R1a spread from Iran.

9.  Y-STR. Wikipedia; Family Tree DNA - N Y-DNA Haplogroup Project.

10. Human mitochondrial DNA haplogroup. Wikipedia.

11. Terreros MC, Rowold D, Luis JR, Khan F, Agrawal S, Herrera RJ. North Indian Muslims: enclaves of foreign DNA or Hindu converts? Am J Phys Anthropol. 2007 Jul; 133(3):1004-1