[B]. IRAQI INDIAN GROUP ANCESTRY PROJECT
To understand the origin and history of the Iraqi population of Eastern U.P. accurately and in detail, we need to continue our efforts in the form of a common Ancestry Project. This means obtaining the information outlined below and characterizing the most recent common ancestors (TMRCAs) in a way that is consistent with ethnological data and the historical migration timeline. It may mean determining the actual time, 20-30 generations ago (more than 500 years ago), at which our ancestors lived in India following the migration reported in history (see Section A). This and the other items below are parts of the Ancestry Project. The actual success of the project will depend on members of the Society taking an interest and voluntarily ordering DNA tests. Participation is needed from across the whole Society in order to achieve the aims of the proposed Project.
Characterizing the gene pool of the extant Iraqi population of Eastern U.P. involves: identification of Y-haplogroups and subclades and their allele frequencies; haplogroup comparison with other populations, such as neighbouring upper caste Hindu and historical Muslim populations within U.P.; maternal ancestry and composition of mitochondrial haplogroups; gene admixture in the extant Iraqi population, i.e. identification of gene flow from Hindu castes and other ethnic Muslim populations found in U.P., India; Y-chromosome and autosomal gene comparison with West Asian and, more importantly, Near Eastern Iraqi populations or subpopulations; and similarity to DNA from ancient cultures. STR data on living Iraqi individuals will be used to study phylogenetic structure and the relationship with world populations. The proposed Iraqi Indian Ancestry Project therefore has the following aims.
(1) Link between living Iraqi families and Sunni Converts under the Genealogical tree of Syed Masud Al Hussaini (migration/post-migration period: 1330-1366 AD)
(2) Genetics of Extant Iraqi Population: Characterization of Y Haplogroups and autosomal SNPs
(3) Maternal Ancestry composition of Extant Iraqi Population
(4) Y-STR Data on individuals from Iraqi population
RESULTS: DNA Genealogy Data, Genealogical Tree Data
TIMELINE: (Milestone achievements, project completion in a reasonable time, 2-3 years)
PUBLICATIONS: Genetic composition, affinity and phylogenetic tree and genome similarity/sharing within community of known population groups of U.P. and worldwide.
Aim 1: Link between living Iraqi families and Sunni Converts under the Genealogical Tree of Syed Masud Al Hussaini (Sunni converts born circa 1517 AD)
Our objectives under this aim are to document the history of the early honorific Sunni Converts (born 1517 AD) as ancestors of the Iraqis of Eastern U.P., with respect to biography and civil records in Ghazipur and, more importantly, their genealogical links both upward to the Shia Syeds and downward to the present-day Iraqi families.
The background information from the genealogical tree of Syed Masud will be used because a few of his descendants converted to Sunni Islam in the early 1500s AD. Those Syed converts, together with other Arab and West Asian noblemen, arguably formed the foundation of the surviving Iraqis of Eastern U.P. (1-3). Early Sunni Converts in the genealogical tree of Syed Masud are shown in Figure 1, from which a link may be traced to or from the living Iraqi families in India and Pakistan.
It can be seen that the Syed population in Nonehra is predominantly descended from his son Syed Qutubuddin, from the early 1330s AD to date (Figure 1). The present population in Nonehra is known to include both Shia Syeds and Iraqi Muslims. In Nonehra, the descendants of Syed Masud in the 6th and 7th generations are, respectively, Syed Saloni and his sons Syed Piyare and Syed Ladle. All descendants of Syed Ladle settled in Nonehra town (Figure 1). It is probable that Syed Ladle and his children lived in the early to middle 1500s AD. As shown by the arrow, this is the period given in history in which the earliest descendants of Syed Masud Al Hussaini converted to Sunni Islam; some names similar to those of the Sunni descendants are shown in red in Figure 1 (3). To verify this honorific ancestor, the names of downstream descendants need to be investigated.
It is reported that Syed Nooruddin, son of Syed Masud Al Hussaini, settled in Gangauli town, Ghazipur. Although the names of his descendants are not yet available, at least one or two Sunni Convert families are expected (see Figure 1). It is worth adding that Gangauli town is portrayed as an imaginary mini India in the novel “Aadha Gaon”, which recounts the horrors of the partition of India in 1947; among its main characters is an Iraqi (“Raqi”) trader family. The novel is a famous work of the renowned writer Rahi Masoom Raza, who was born in 1927 into a Shia Syed (Zamindar) family descended from Syed Nooruddin/Syed Masud in Gangauli town.
Genetic drift is common after a population bottleneck, an event that drastically decreases the size of a population. Such random genetic drift can result in the loss of rare alleles, here the lineages of the Sunni Converts/Iraqi Syeds as opposed to the dominant Shia Syeds of Ghazipur. If this occurred, it would have reduced the number of Iraqi Syeds even within the Iraqi population of Eastern U.P. The Sunni descendants of Syed Masud Al Hussaini may have been rare within the Iraqi Biradri from the beginning, so the likelihood of losing the Sunni Convert allele was greater than for alleles of other descent, such as IE-West Asian/Persian or other Arab alleles. In view of random genetic drift, the growth of the Iraqi Biradri over the last 500 years can be imagined as follows. First, a loss of genetic diversity, or a reduction of the rare Sunni Convert (Syed) allele relative to other alleles, would not be surprising. On the one hand, this change makes the Iraqi population vastly distinct from the overlapping Shia descendant population of Syed Masud. On the other hand, it is unclear whether there is any probability of finding deep Sunni descendants of Syed Masud in the living Iraqi population and, if so, how many descendants of the earliest Sunni ancestor, the honorific Syed Abu Bakr, survive within the population after 500-700 years of growth.
It may be possible to find the Sunni descendants in the extant Iraqi population. Many Iraqi families are known to have deep roots in Nonehra, Gangauli and other places, but it is not known how many 18th-generation Iraqi families with ascending links to the honorific Converted Sunnis exist. Where nothing is written down, it becomes very hard to create a genealogical list. To the knowledge of most of us, these families have not been identified, and it is our objective to find them under this project. It remains to be seen whether such families exist in India or in Karachi, Pakistan (4). One approach is to seek feedback from families within the Iraqi Biradri whose history is akin to the Iraqi Syed, Hashmi, Quraishi, Shaikh or Iraqi titles. This feedback may lead us to a partial or complete link between Iraqis as descendants of Syed Al Hussaini and the latter's genealogical tree (see Figure 1 below, to be investigated under this section).
Alternatively, the small number of Sunni Converts and their later descendants, the Iraqi Syeds, may have been lost completely through random genetic drift in the relatively small Iraqi/Indian population, after many generations of growth over 500-700 years. Finally, the small effective size of the Iraqi population of Eastern U.P. means that it has been subject to inbreeding for some time and consequently has less paternal (and maternal) haplogroup diversity than the initial Iraqi population in India around 1330-1500 AD. This trend and its effects on our Society must be understood clearly, the sooner the better. It has consequences not only for the optimal growth rate but also for susceptibility to recessive autosomal hereditary diseases, i.e. a higher frequency of newborns with inherited disorders in the extant Iraqi Biradri in India.
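The loss of a rare paternal lineage through random drift can be illustrated with a minimal Wright-Fisher sketch. The population size, the starting number of carriers and the number of generations used here are hypothetical illustration values, not estimates for the Iraqi Biradri, and the model assumes a constant number of males and no selection.

```python
import random

def lineage_survives(n_males, n_carriers, generations, rng):
    """Simulate one paternal lineage under neutral Wright-Fisher drift.

    Every generation, each of the n_males sons draws his father at random
    from the previous generation. Returns True if the lineage is still
    present after the given number of generations.
    """
    carriers = n_carriers
    for _ in range(generations):
        if carriers == 0 or carriers == n_males:
            break  # lineage lost or fixed; nothing more can change
        p = carriers / n_males
        carriers = sum(1 for _ in range(n_males) if rng.random() < p)
    return carriers > 0

def survival_probability(n_males, n_carriers, generations, trials=300, seed=1):
    """Fraction of simulated histories in which the lineage survives."""
    rng = random.Random(seed)
    survived = sum(
        lineage_survives(n_males, n_carriers, generations, rng)
        for _ in range(trials)
    )
    return survived / trials

if __name__ == "__main__":
    # Hypothetical: 200 males, 25 generations, rare vs. common lineage
    for carriers in (2, 20):
        print(carriers, "carriers:", survival_probability(200, carriers, 25))
```

In such a sketch, a lineage that starts with only a few carriers is lost in most simulated histories, while a lineage that starts with many carriers usually persists, which is the pattern described above for the rare Sunni Convert lineage.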
Figure 1: The genealogical tree of Syed Masud Al-Hussaini. The names of the descendants of Syed Masud in India from 1330 AD to the early 1500s AD in the tree are taken from data published earlier (1-3).
Note: Progress on this goal is challenging and will certainly constitute a milestone under this project; tree form of genealogy of the postulated Arab ancestor from Larestan or non-Arab West Asian Iraqi ancestors corresponding to lines Y40/M560, and L657/Y6+ from origin 1500 AD to date are not included under this project for clarity.
Aim 2: Genetics of Iraqi Population of Eastern U.P. India
In Indian population history, four separate waves of migration into the subcontinent took place: (i) an ancient Palaeolithic migration by modern humans, (ii) an early Neolithic migration, probably of Proto-Dravidian speakers from the eastern horn of the Fertile Crescent, (iii) an influx of Indo-European speakers, and (iv) a migration from East and Southeast Asia, i.e. of Tibeto-Burman speakers. In addition to these, migrations of medieval Muslim preachers, invaders/warriors and travellers to India, and later, in the 18th century, of Europeans, have also contributed to the ethnic multiplicity. Furthermore, it has been reported that the Y-lineages of Indian castes are more closely related to those of Central Asians/East Europeans than to those of Indian tribal/native populations, suggesting that the former are primarily descendants of Indo-European migrants. Adequate genetic studies have not yet been conducted on ethnic Muslim populations in India, including the Iraqi Muslims of Eastern U.P.
Y-chromosome binary markers are uniparentally inherited and record human evolution over the past several thousand years in the form of a DNA family tree. Thus, single nucleotide polymorphisms (SNPs) in the non-recombining region of the Y chromosome (NRY), and likewise mitochondrial DNA and Y-chromosome STR markers, are used in DNA genealogy. These DNA features are free from the randomness that arises from not knowing which of the two autosomal chromosomes of each pair was inherited by the offspring during population growth, and from the randomness of recombination during meiosis in paternal sperm and maternal egg cells. A specific, stable, heritable combination of SNPs is known as a haplotype. Using Y-chromosome haplotypes, the ancestry of any male can be traced through his paternal lineage. A group of people who share similar haplotypes is known as a haplogroup (5).
Since a mutation at a single base is very rare compared to changes in STRs (short tandem repeats), Y-DNA SNP tests are able to trace both ancient anthropological migrations and more recent prehistoric/historic movements. A Y-DNA SNP test also identifies the haplogroup, which may represent deep ancestral origins (tens of thousands of years before present, ybp). The evolution of modern humans from the earliest times through prehistory to the present has been studied by identifying genetic families, or haplogroups, each of which originated at a particular time (in ybp) and geographic location. The haplogroups, denoted by letters, form the genealogical family of modern humans (Homo sapiens), with haplogroup A at the root of the family tree. Haplogroup A arose in Africa 60,000 ybp as the earliest lineage and, through evolution, gave rise to later branches denoted by the following letters, each thousands of years before present (Figure 2, Top Panel). When a defining SNP of a haplogroup is detected in an individual, it identifies the particular haplogroup, from which the origin, the age of the distant ancestor (TMRCA) and the geographic spread over the thousands of years since the origin can be learned. A large cohort of Y-chromosome haplogroup data, in conjunction with autosomal SNP data from a population group or family, may be used to identify DNA relatives, i.e. very old common ancestors (TMRCAs). Similar databases are used by ancestry companies to find DNA relatives.
Y-chromosome data from different population groups found in U.P., India are presented below as a relevant example for background information (6).
Iraqis of Eastern U.P., preliminary study, n=5: 5 R1a
Figure 2: Genetic families of modern humans in the form of a tree, and haplogroup data from a genetic study on different population groups from U.P. (India). The data are reproduced from the work described in detail elsewhere (6). Iraqi data are taken from Table 2.
The genetic study alluded to above was carried out on distinct population groups living in U.P., India, namely upper caste Hindu Brahmins, Awasthis, Chaturvedis, Syeds (Shia) and non-Syed Muslims with recent ancestry from outside India (i.e. Sunnis) (6). A total of 560 individuals from these five groups were examined for Y-chromosome binary markers, namely 32 Y-chromosome unique event polymorphisms (UEPs) comprising 27 single nucleotide polymorphisms (SNPs), four insertions/deletions and one Alu repeat (6). These mutations have been found to be stable, and the variants are known to segregate into distinct haplotypes (6). Figure 2 shows the human family tree and its haplogroup branches, together with the defining mutations under each haplogroup branch (Top Panel). The SNP and insertion/deletion mutations detected in the 560 Y chromosomes from the five groups constituted several haplotypes, which were grouped into 13 haplogroups as described in detail elsewhere (6). The Y-haplogroups found in the five population groups are also summarized in Figure 2 (Lower Panel).
The haplogroups observed in the five populations were J2, R1a, R2, H, F, C, K, O and P; all nine haplogroups were found in each population group, albeit in a unique composition. Although J2, R1a and R2 were the dominant haplogroups, the others, such as H, F, C, K, O and P, were present in small percentages in all populations. Haplogroups G, L, R1b and E1b1b1 were not observed in all populations but were present in some (6).
For example, haplogroup J2 was highest in the Shia Muslim group, along with haplogroup E1b1b1; the latter is a Shia-specific haplogroup, as the other groups lack it. Haplogroup J2 is observed in the Middle East and, in India, is linked to early Neolithic migration and to some extent to the historical migration of the Muslim Shia population group. More importantly, E1b1b1 is the most frequently observed haplogroup in the Africa/Middle East region. This observation therefore suggests that Shia Muslims might carry some North African and Middle Eastern ancestral alleles that were brought to India during Muslim rule.
Preliminary results on Iraqis of Eastern U.P. are shown in Figure 2 (23andMe and YSEQ, Table 2). However, haplogroup J (M267 and M172 branches) has yet to be detected in the extant Iraqi population of Eastern U.P. If it is found in the actual study, it may provide a further link to the existence of surviving descendant families of Syed Masud Al Hussaini (circa 1330-1360) or of descendants of the Arab Sunni converts who lived from circa 1500 AD (see Section A). It is interesting to note that all four individuals tested belonged to haplogroup R1a1a. As the R1a carriers clearly belonged to distinct clades, i.e. M560+ and L657/Y6+ (under Asian R1a-Z93-Z95), these individuals also belong to two separate family trees, as descendants of two distinct common ancestors within one Biradri.
More Haplogroup Background information:
Haplogroup: E, a subgroup of D/E
Age: 30,000 years
Region: Africa, Europe, Near East
Example Populations: Bantu-speakers, African Americans, Berbers
E-M215 has two known branches, E-V68 and E-Z827, which contain by far the majority of all modern E-M215 men. E-V68 and E-V257 have been found in highest numbers in North Africa and the Horn of Africa, but also in lower numbers in parts of the Middle East and Europe, and in isolated populations of Central Asia.
Paternal haplogroups are families of Y chromosomes that all trace back to a single mutation at a specific place and time. By looking at the geographic distribution of these related lineages, we learn how our ancient male ancestors migrated throughout the world.
Haplogroup: J, a subgroup of F
Age: 20,000 years
Region: Southern Europe, Near East, Northern Africa
Example Populations: Bedouins, Ashkenazi Jews, Greeks
Highlight: Haplogroup J was carried out of the Near East by Muslims and Jews during the first millennium AD.
Haplogroup J (P209) Branches
1. J1 (M267)
2. J2 (M172)
J 12f2.1, L134/PF4539, M304/Page16/PF4609, P209/PF4584, S6/L60, S34, S35
• J* -
• J1 L255, L321/PF4646, M267/PF4782
• • J1* -
• • J1a CTS5368/Z2215
• • • J1a* -
• • • J1a1 M365.1
• • • J1a2 L136
• • • • J1a2* -
• • • • J1a2a P56
• • • • J1a2b P58/Page8/PF4698
Haplogroup J1, which is also known as M267, is a subclade of Y-DNA haplogroup J-P209 (Haplogroup J). Haplogroup J1 separated from haplogroup J approximately 31,500 years ago (YFull, 2015). Today, haplogroup J1 is mostly seen in the Caucasus, Mesopotamia, the Levant and the Arabian Peninsula, but it is also present at moderate or low frequency in Turkey, Azerbaijan, Iran, Europe, Central Asia and the Indian subcontinent.
Distribution: Haplogroup J-M267
Haplogroup J-M267, defined by the M267 SNP, is in modern times most frequent in the Arabian Peninsula: Yemen (up to 76%), Saudi Arabia (up to 64%) (Alshamali 2009), Qatar (58%), and Dagestan (up to 56%). J-M267 is generally frequent among Arab Bedouins (62%), Ashkenazi Jews (20%) (Semino 2004), and in Algeria (up to 35%) (Semino 2004), Iraq (28%) (Semino 2004), Tunisia (up to 31%), Syria (up to 30%), Egypt (up to 20%) (Luis 2004), and the Sinai Peninsula. To some extent, the frequency of haplogroup J-M267 collapses at the borders between Arabic/Semitic-speaking territories and mainly non-Arabic/Semitic-speaking territories, such as Turkey (9%), Iran (5%), Sunni Indian Muslims (2.3%) and Northern Indian Shia (11%) (Eaaswarkhanth 2009). However, it should be noted that some of the figures above tend to be the larger ones obtained in some studies, while the smaller figures obtained in other studies are omitted. J-M267 is also highly frequent among Jews, especially in the Kohanim line (46%) (Hammer 2009). ISOGG states that J-M267 originated in the Middle East. It is found in parts of the Near East, Anatolia and North Africa, with a much sparser distribution on the southern Mediterranean flank of Europe and in Ethiopia. Not all studies agree on the point of origin, however. The Levant has been proposed, but a 2010 study concluded that the haplogroup had a more northern origin, possibly Anatolia.
The origin of the J-P58 subclade is likely in the more northerly populations, from which it then spread southward into the Arabian Peninsula. The high Y-STR variance of J-P58 in ethnic groups in Turkey, as well as in northern regions of Syria and Iraq, supports the inference of an origin of J-P58 in nearby eastern Anatolia. Moreover, the network analysis of J-P58 haplotypes shows that some of the populations with low diversity, such as Bedouins from Israel, Qatar, Sudan and the United Arab Emirates, are tightly clustered near high-frequency haplotypes. This suggests founder effects with a star-burst expansion into the Arabian Desert (Chiaroni 2010).
Haplogroup: J2, a subgroup of J (defining mutation M172)
Age: 18,000 years before present
Region: Southern Europe, Near East, Northern Africa
Example Populations: Ashkenazi Jews, Sephardic Jews, Lebanese
Highlight: Haplogroup J2 is found in nearly one-quarter of Sephardic Jewish men.
The Y chromosome haplogroup R is divided into two main components: R1-M173 and R2-M479. R1 is then divided into two major groups: R1a-M420 and R1b-M343. R1a is more common in Eastern Europe, West Asia and Central/South Asia, while R1b is prevalent in Western Europe. Previous studies have suggested that this division reflects population expansions after the last ice age, particularly during the Neolithic period. Over 10% of men living in a region extending from South Asia to Scandinavia share a common ancestor belonging to haplogroup R1a-M420 (5).
One sub-clade of R1a (haplogroup R1a1) is much more common than the others in all major geographical regions. R1a1, defined by the single-nucleotide polymorphism (SNP) mutation M17, (and sometimes alternatively defined as R-M198), is particularly common in a large region extending from South Asia and southern Siberia to Central Europe and Scandinavia (5). The R1a family is defined most broadly by the SNP mutation M420, which was discovered after M17. The discovery of M420 resulted in a reorganization of the lineages, in particular, establishing a new paragroup (designated R-M420*) and R-M420 clade as R1a that leads to R-SRY10831.2 branch as R1a1; next branch is R-M17/R-M198 as R1a1a (5).
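The nesting of these clades can be made concrete with a small sketch that assigns a tested man to the most downstream clade supported by his SNP results. The backbone below uses only the markers named in the text (M173 for R1, M420, SRY10831.2, M17/M198); the function name and the rule of stopping at the first missing marker are illustrative assumptions, not the procedure of any testing company.

```python
# Backbone of the R1a branch as named in the text:
# (clade label, defining SNPs; any one listed alias counts as a match).
R1A_BACKBONE = [
    ("R1", {"M173"}),
    ("R1a", {"M420"}),
    ("R1a1", {"SRY10831.2"}),
    ("R1a1a", {"M17", "M198"}),
]

def deepest_clade(derived_snps):
    """Return the most downstream clade whose whole upstream path is derived.

    derived_snps: SNP names for which the tested man carries the derived
    (mutated) state. Returns None if even R1 (M173) is not derived.
    """
    derived = set(derived_snps)
    assigned = None
    for clade, markers in R1A_BACKBONE:
        if markers & derived:
            assigned = clade
        else:
            break  # simplification: stop at the first marker not derived
    return assigned

print(deepest_clade({"M173", "M420"}))
print(deepest_clade({"M173", "M420", "SRY10831.2", "M198"}))
```

A man who is derived for M173 and M420 but not SRY10831.2 would fall in the paragroup R-M420*, while one who is derived for all four levels is placed in R1a1a.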
R1a and R2 were the most frequently observed haplogroups in the Sunni and upper caste Hindu groups. R1a, or haplogroup R-M420, is found especially in North India compared to other regions; it most likely has a West Asian origin and is linked to the migration of Indo-European speakers from West Asia to South Asia 4-5 K ybp (6).
Origin and Distribution of R1a and R1a1 (from paper by Underhill et al 2014)
This study is based on genetic tests of 16,244 men from 126 Eurasian populations. (Thirteen of these samples, belonging to haplogroup R, were completely sequenced over 9.99 million base pairs of the Y chromosome.) Of all the samples, 2923 individuals belong to haplogroup R1a-M420. These were then tested for the following SNPs: SRY10831.2, M17/M198, M417/Page7, Z282/Z280, Z284, M458, M558/CTS3607, Z93/Z94, Z95, Z2125, M434, M560, M780, M582, M746, M204 and L657 (see Figure 1, ref 7). The geographic frequency of R1a was calculated for all samples. Of the 2923 samples, 2893 belong to the sub-clade R1a-M417/Page7, of which 1693 are European and 1200 are Asian. The rare lineages are: 24 R1a-M420*(xSRY10831.2), 6 R1a1-SRY10831.2*(xM17/M198) and 12 R1a-M417/Page7*(xZ282, Z93). Among the 1693 European R1a-M417 samples, over 96% belong to the sub-clade R1a-Z282.
Among the 490 South Asian R1a-M417 samples, more than 98.4% belong to the sub-clade R1a-Z93. Both subclades, R1a-Z282 and R1a-Z93, occur among the 560 samples from the Near East, the Middle East and the Caucasus (7). The sub-clade R1a-Z282* is located mainly in northern Ukraine, Belarus and Russia. The sub-clade R1a-Z284 is confined to Scandinavia. R1a-M458 and R1a-M558 have similar distributions, with their highest frequencies in Central and Eastern Europe. However, R1a-M558 is also present in the Volga-Urals region, unlike R1a-M458 (7).
The paragroup R1a-Z93* is most common in southern Siberia, in the Altai region. The sub-clade R1a-Z2125 is found in Kyrgyzstan, among the Pashtuns of Afghanistan, and in populations of the Caucasus and Iran. The sub-clade R1a-M780 (or L657) is located in South Asia: India, Pakistan, Afghanistan and the Himalayas. This group is also found in Iran and among Roma in Croatia and Hungary. Finally, the rare sub-clade R1a-M560 was found in two Burushaski speakers in northern Pakistan, one Hazara individual from Afghanistan and one Iranian Azeri (7).
About 1335 R1a samples were then tested on 10-19 STR markers. Networks were built for R1a-Z282 and R1a-Z93. However, little substructure was identified in these networks. The lowest diversity appears in the sub-clades R1a-Z93*, R1a-M582 among Jews and R1a-M780 among Roma. These results are consistent with founder effects in these groups. A principal components analysis differentiated the European and Asian groups on the first component, and the Jewish groups (especially Ashkenazi) on the second component (7).
To look for a link between ancient cultures in Europe and the origin of haplogroup R1a, the authors considered ancient DNA: the oldest R1a sample found in Europe comes from individuals of the Corded Ware culture dated to about 2600 BC. Earlier samples, from the early Neolithic, are F* and G2a. This suggests a rapid spread of haplogroup R1a-Z282 in Europe in the Chalcolithic/Copper Age or Early Bronze Age, from the Volga to the Rhine. The equivalent ancient culture for R1a in the Middle East and South Asia is more obscure. However, samples from the Indus civilization could match this period (see in particular the geographical distribution of R1a-M780) (7).
To evaluate the potential of R1a clades for studying archaeological events, the authors used two approaches to estimate the age of the common ancestors of the different subclades of R1a. The first uses STR markers together with an assumed STR mutation rate. The dates depend strongly on the rate chosen: some rates give greatly underestimated dates, while the dates obtained with the Zhivotovsky rates are overestimated and should be considered upper bounds. The second approach is based on complete sequencing of the Y chromosome. For this, 13 individuals (8 R1a and 5 R1b) from younger European and Asian clades of R1a and R1b were tested, and 928 SNPs belonging to the R1 tree were used.
There is no consensus yet on the mutation rate to be used for the 9.99 million sequenced base pairs. Estimates vary from one mutation every 100 years to one mutation every 162 years. Using a rate of one mutation every 122 years, the authors estimated the separation of R1a and R1b at about 25,000 years ago and the age of R1a-M417 at 5,800 years. The shape of the tree obtained for the R1a clades suggests a rapid diffusion of these haplogroups. In conclusion, the authors of this study believe that haplogroup R1a spread from Iran and eastern Turkey to other geographic locations about 5,800 years ago. This implies dispersal and growth during the Bronze Age (7).
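As a quick consistency check, the sample counts quoted in this paragraph can be tallied in a few lines. The variable names are illustrative; the numbers are the ones given above.

```python
total_men = 16244          # all men tested
total_r1a = 2923           # men in haplogroup R1a-M420
basal_m420 = 24            # R1a-M420*(xSRY10831.2)
basal_sry = 6              # R1a1-SRY10831.2*(xM17/M198)
m417_total = 2893          # R1a-M417/Page7
m417_europe, m417_asia = 1693, 1200

# The two basal paragroups plus M417 account for every R1a sample,
# and the European and Asian M417 samples add up to the M417 total.
assert basal_m420 + basal_sry + m417_total == total_r1a
assert m417_europe + m417_asia == m417_total

share_m417 = m417_total / total_r1a
share_r1a = total_r1a / total_men
print(f"R1a-M417 share of all R1a: {share_m417:.1%}")   # 99.0%
print(f"R1a share of all men tested: {share_r1a:.1%}")  # 18.0%
```

The tally shows that about 99% of the R1a chromosomes fall under R1a-M417, which is why the basal lineages are described as rare.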
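The sensitivity of these ages to the assumed mutation rate can be sketched as follows. The SNP counts are back-calculated here from the published ages at one mutation per 122 years purely for illustration; they are an assumption of this sketch and not figures reported by the authors.

```python
def age_from_snps(n_snps, years_per_mutation):
    """Age of a node given the mean number of SNPs accumulated since it."""
    return n_snps * years_per_mutation

def snps_from_age(age_years, years_per_mutation):
    """Mean number of SNPs implied by an age at a given rate."""
    return age_years / years_per_mutation

# Rates quoted in the text (years per mutation over 9.99 million base pairs)
RATES = {"fastest": 100, "used in the study": 122, "slowest": 162}

# Illustrative back-calculation, not values from the paper
snps_split = snps_from_age(25_000, 122)   # R1a/R1b separation
snps_m417 = snps_from_age(5_800, 122)     # R1a-M417

for label, rate in RATES.items():
    print(
        f"{label:>18}: R1a-M417 about {age_from_snps(snps_m417, rate):,.0f} years, "
        f"R1a/R1b split about {age_from_snps(snps_split, rate):,.0f} years"
    )
```

Under these assumptions the age of R1a-M417 ranges from roughly 4,750 to 7,700 years across the quoted rates, so the choice of rate matters as much as the SNP counts themselves.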
R1a and Subclades
To infer the geographic origin of hg R1a-M420, the authors of the above paper identified populations harbouring at least one of the two most basal haplogroups and possessing high haplogroup diversity. Among the 120 populations with sample sizes of at least 50 individuals and with at least 10% occurrence of R1a, just 6 met these criteria, and 5 of these 6 populations reside in modern-day Iran. Haplogroup diversities among the six populations ranged from 0.78 to 0.86 (Supplementary Table 4) (7). Of the 24 R1a-M420*(xSRY10831.2) chromosomes in their data set, 18 were sampled in Iran and 3 were from Eastern Turkey. Similarly, five of the six observed R1a1-SRY10831.2*(xM417/Page7) chromosomes were also from Iran, with the sixth occurring in a Kabardin individual from the Caucasus. Owing to the prevalence of basal lineages and the high levels of haplogroup diversity in the region, the authors find a compelling case for the Middle East, possibly near present-day Iran, as the geographic origin of haplogroup R1a (7).
In Figure 3 and ref 8, it is pointed out that there is at least one West Asian sequence (from Turkey) within the M420 paragroup that appears to be an independent R1a1a lineage, shown as Branch B. Similarly, one Indian and one Norwegian sequence are shown with North European R1a1a as Branch A. However, these can be interpreted in terms of West Asian centrality within this key paragroup (see Figure 3) (8).
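Haplogroup diversity of the kind quoted above is commonly computed with Nei's unbiased estimator, h = n/(n-1) × (1 − Σ p_i²), where p_i is the frequency of haplogroup i among n sampled men. The sketch below assumes that estimator (the text does not state which formula was used) and applies it to a hypothetical sample of 50 men.

```python
from collections import Counter

def haplogroup_diversity(assignments):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(assignments)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(assignments)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)

# Hypothetical sample, n = 50 (not data from any study)
sample = ["R1a"] * 20 + ["J2"] * 15 + ["R2"] * 10 + ["H"] * 5
print(round(haplogroup_diversity(sample), 3))  # 0.714
```

A value of 0 means that every man carries the same haplogroup and a value of 1 means that every man carries a different one, so values of 0.78 to 0.86 indicate populations with several haplogroups at comparable frequencies.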
Branch A went back to West Asia from where it spread again to Eastern Europe and Central South Asia.
Branch B is actually at the origin of the two derived and highly spread subhaplogroups.
Whatever the case, there are good reasons to think that these first spread from West Asia: at the very least Z93, and very likely also the East European Z282 (Figure 3) (ref 8).
R1a1a1b2 Subclade (i.e. Z93, see Figure 3)
There is nothing European in this lineage: only some lesser terminal branches at the Southern Urals, roughly where the Kurgan phenomenon began some 6000 years ago.
This detail is indeed remarkable because, if, as often argued, R1a or some of its subclades spread from there, we should expect at least some basal diversity being retained. Instead all we see are some highly derived branches. So the main conclusion must be that the expansion of R1a does not seem related to the Kurgan phenomenon, except maybe in some secondary instances.
As mentioned before, this lineage is Central and South Asian and comprises the vast majority of R1a in those two regions.
The detailed haplotype network can be seen in Supp. Info fig. 2 (7).
Downstream Z93 or terminal clades:
Z93* has three apparent distinct branches stemming from West Asia (incl. Caucasus) and another one from South Asia/Altai (1).
Z95* has two apparent distinct branches:
A small one with presence in West Asia and Southern Europe
Another one (pre-M780?) stemming from South or West Asia
M780 (L657) has clear origins in South Asia (incl. most Roma lineages)
Z2125 also appears to originate in South Asia, even if it has a greater spread outside it, notably to Central Asia
M560 and M582 appear related and surely originated in West Asia
Therefore the origin of Z95 should be thought of as West-South Asian but undecided between the two regions. Say Afghanistan, for example.
In this case I would say that West Asia is almost certainly the origin, although tending to Central/South Asia. For example: Iran again.
So, regardless of whether the previous stage (M417) represents a stay in West Asia or a back-migration from Europe into West Asia, West Asia is clearly at the origin of Z93. It does not represent any Kurgan migration but an Asian phenomenon with origins towards the West (around Iran) (Figure 3; ref 8).
Figure 3: Schematic presentation of Origin of R1a and younger downstream clades of R1a. Migration of R1a-M417 clade along with downstream branches based on phylogenetic and geographic data on Y chromosome R1a haplogroup by Underhill (7-8).
R2 is linked to early prehistoric Neolithic migration to India; it was found in all populations studied but was not specifically linked to any one group in India (6).
Possible time of Origin 12000 ybp.
Possible place of origin South Asia or Central Asia
Defining mutation M479
Highest Frequency South Asia
G and L Haplogroups
Small numbers of haplogroups G and L are found in the Shia, Sunni and Hindu caste populations of U.P., India. Their geographic spread includes the north-eastern Middle East, Iran and Iraq, and in India they are linked to Neolithic migration (6).
Haplogroup R1b, also known as haplogroup R-M343, is the most frequently occurring Y-chromosome haplogroup in Western Europe, as well as in some parts of Russia (the Bashkir minority), Central Asia (e.g. Turkmenistan) and Central Africa (e.g. Chad and Cameroon) (5).
Haplogroups C, H, F, I, K, O and P
Haplogroups H and F are found in indigenous/tribal populations in India. Contributions from the above haplogroups were small, but present, in both Hindu caste and Muslim populations (6).
Aim 3. Y-STR haplotypes
Unlike the UEPs, the Y-STRs mutate much more easily, which allows them to be used to distinguish recent (also ancient) genealogy. For the same reason, i.e. STR mutations are not rare/stable, segregation of Y-STR haplotypes in population are likely to have spread apart, to form a cluster of more or less similar results. Typically, this cluster will have a definite most probable center, the modal haplotype (presumably similar to the haplotype of the original founding event), and also a haplotype diversity-the degree to which it has become spread out.
if the population growth has taken place earlier more in the past from the time of STR defining event i.e. modal haplotype, for a particular number of descendants, the haplotype diversity is greater. However, if the haplotype diversity is smaller for a particular number of descendants, this may indicate a more recent common ancestor, or a recent population expansion.
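The two quantities described above, the modal haplotype and the spread around it, can be computed from a set of Y-STR results as in the sketch below. The marker names are real Y-STR loci, but the repeat counts are hypothetical, and the mean pairwise step distance is used here as one simple, assumed measure of haplotype diversity.

```python
from collections import Counter
from itertools import combinations

def modal_haplotype(haplotypes):
    """Most common repeat count at each marker (ties go to the smaller value)."""
    modal = {}
    for marker in haplotypes[0]:
        counts = Counter(h[marker] for h in haplotypes)
        best = max(counts.values())
        modal[marker] = min(v for v, c in counts.items() if c == best)
    return modal

def step_distance(a, b):
    """Sum of absolute repeat-count differences across all markers."""
    return sum(abs(a[m] - b[m]) for m in a)

def mean_pairwise_distance(haplotypes):
    """Average step distance over all pairs; larger means a broader cluster."""
    pairs = list(combinations(haplotypes, 2))
    return sum(step_distance(a, b) for a, b in pairs) / len(pairs)

# Hypothetical results for four men at four markers
men = [
    {"DYS19": 16, "DYS390": 25, "DYS391": 10, "DYS393": 13},
    {"DYS19": 16, "DYS390": 25, "DYS391": 11, "DYS393": 13},
    {"DYS19": 16, "DYS390": 24, "DYS391": 10, "DYS393": 13},
    {"DYS19": 15, "DYS390": 25, "DYS391": 10, "DYS393": 13},
]
print(modal_haplotype(men))
print(mean_pairwise_distance(men))  # 1.5
```

In this toy example the first man carries the modal haplotype and each of the others differs from it by a single repeat at one marker, the pattern expected for a fairly recent common ancestor.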
It is important to note that, unlike for UEPs/SNPs, two individuals with a similar Y-STR haplotype may not necessarily share a similar ancestry. Y-STR events are not unique/stable. Instead, the clusters of Y-STR haplotype results inherited from different events and different histories tend to overlap. In most cases, it is a long time since the haplogroups' defining events, so typically the cluster of Y-STR haplotype results associated with descendents of that event has become rather broad. These results will tend to significantly overlap the (similarly broad) clusters of Y-STR haplotypes associated with other haplogroups. This makes it impossible for researchers to predict with absolute certainty to which Y-DNA haplogroup a Y-STR haplotype would point.
STR tests are done by Family Tree DNA and are able to trace a male lineage within genealogical and historic times. “Your genealogical connections will be shown on the Y-DNA – Matches page of your myFTDNA account. The Y-DNA – Ancestral Origins page of your myFTDNA account will point towards possible countries of origin” (9).
Aim 4: Maternal Ancestry
mtDNA tests can be used to test the direct maternal lineage, i.e. mother, mother's mother and so on. mtDNA mutates much more slowly than Y-DNA, so it is really only useful for determining distant maternal ancestry. mtDNA results are generally compared to a common reference sequence called the Cambridge Reference Sequence (CRS) to identify the specific haplotype, a set of closely linked alleles (variant forms of the same gene) that are inherited as a unit. People with the same haplotype share a common ancestor somewhere in the maternal line. This ancestor could be as recent as a few generations ago, or dozens of generations back in the family tree. mtDNA testing is generally done in two regions of the genome known as hyper-variable regions: HVR1 (16024-16569) and HVR2 (00001-00576). HVR1 and HVR2 test results can also point to the ethnic and geographic origin of the maternal line. The evolutionary tree of human mitochondrial DNA (mtDNA) haplogroups gives more information on the haplotypes/haplogroups discussed below (10).
The extant Iraqi Muslim population in Eastern U.P., like other Indo-Muslim groups in U.P. (i.e. Shia and Sunni), may represent descendants of Hindu converts and offspring of Hindu mothers. Alternatively, the Iraqis may be composed entirely of descendants of Middle Eastern (Arab/Iraq) or Central Asian (Iran, Turkic clans) migrants, without admixture from surrounding populations. This project is intended to test these and other hypotheses.
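To illustrate how mtDNA results are reported relative to a reference sequence, the following minimal sketch lists the positions at which a sample differs from a reference. Both sequences are short toy strings and are not the real Cambridge Reference Sequence; only the HVR1 start position (16024) is taken from the text. The sketch handles simple substitutions only and assumes the sequences are already aligned.

```python
HVR1_START = 16024  # first position of HVR1, as given in the text

# Toy sequences for illustration only; NOT the real Cambridge Reference Sequence.
reference = "TTCTTTCATG"
sample = "TTCTCTCATA"


def differences(ref, seq, start):
    """List substitutions as '<position><sample base>', e.g. '16028C'."""
    if len(ref) != len(seq):
        raise ValueError("sequences must be aligned to equal length")
    return [
        f"{start + offset}{s_base}"
        for offset, (r_base, s_base) in enumerate(zip(ref, seq))
        if r_base != s_base
    ]


variants = differences(reference, sample, HVR1_START)
print(variants)
```

The resulting list of differences is what would then be matched against known haplogroup-defining mutations.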
Genetic studies of maternal ancestry have not previously been carried out in the extant Iraqi population. However, some information on Shia and Sunni Muslim groups in U.P. and on other populations is summarized here (see also ref 11). The dominant maternal haplogroups found in the two Muslim populations are M (more than 50%), R and U, which together are known as South Asian haplogroups for the following reasons. High frequencies of these haplogroups are also found in Hindu castes (Brahmans, Bhargavas, Chaturvedis) in U.P. and in the Hunza people of Pakistan. These haplogroups are reported to be minimal to nonexistent in Middle Eastern, Central Asian, and North East African populations; West Asian maternal haplogroups include H, I, J, K and T.
It is also important to know that none of the Indo-Muslim groups possesses any of the ancient mitochondrial haplogroups M2, U2i, R5 and R6, which have a coalescence time of more than 50,000 ybp; these are found in 15% of the general Indian population. In addition, the Shia population shows a greater proportion of the Indian-specific mitochondrial haplogroup M and its subclades, e.g. M3, M4, M25, etc. By contrast, the Sunni group lacks M subhaplogroups, but maternal haplogroup R is more frequent in the Sunni population (11).
These data are consistent with the view that both Sunni and Shia Muslim groups in U.P. show a much greater mitochondrial haplogroup affinity to upper caste Hindu population groups. The finding that M and R haplogroup frequencies are minimal in Middle Eastern, Central Asian, and North East African populations supports the view that the M and R lineages carried by the Muslim population groups derive from the female population of South Asia rather than from outside.
The affinity in maternal ancestry between the Sunni and Shia populations and the Hindu caste groups suggests that the former two groups show gene admixture as a result of gene flow from Hindu groups. This may be the result of conversion of the Hindu population to the Sunni Islamic faith by both males and females, i.e. family oriented; the conversion to the Shia faith may have been female biased (see ref 11).
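The grouping of maternal haplogroups into South Asian (M, R, U) and West Asian (H, I, J, K, T) types can be tallied with a short sketch. It uses the maternal haplogroups listed in Table 2 and assigns each one by its first letter, which is a simplifying assumption made here for illustration: subclades of one letter can have different geographic distributions, and eight samples are far too few to support any conclusion.

```python
from collections import Counter

# Groupings as stated in the text; classifying by first letter is a simplification.
SOUTH_ASIAN = {"M", "R", "U"}
WEST_ASIAN = {"H", "I", "J", "K", "T"}

# Maternal haplogroups as listed in Table 2 (IDs 002 and 005 to 011).
maternal = ["A4b", "U7", "R8a", "R8a", "T2a1", "M3c", "U2c1", "H7"]


def classify(haplogroup):
    """Assign a haplogroup label to a broad group by its top-level letter."""
    top = haplogroup[0].upper()
    if top in SOUTH_ASIAN:
        return "South Asian"
    if top in WEST_ASIAN:
        return "West Asian"
    return "Other"


counts = Counter(classify(h) for h in maternal)
shares = {group: n / len(maternal) for group, n in counts.items()}
for group, n in counts.most_common():
    print(f"{group}: {n} of {len(maternal)} ({shares[group]:.0%})")
```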
Preliminary DNA Test Data
A. HAPLOGROUP DATA
Table 2: List of Haplogroup/subhaplogroup/ STR-Haplogroup/Maternal Haplogroup based on Genetic Tests performed on Iraqi individuals native to Eastern Uttar Pradesh, India.
|ID||Origin/Birth||Residence/Age||Y Haplogroup||Sub-Haplogroup and SNP||Y-DNA STR*||Maternal Haplogroup||DNA Tests*|
|002||Lar/Deoria||Canada/60||R1a1a||Y40+/M39+ or M560+/Y42-||+*||A4b||23andMe/YSEQ|
|005||India (F)||Karachi, Pakistan|| || || ||U7||23andMe|
|006||India (F)||Karachi, Pakistan|| || || ||R8a||23andMe|
|007||U.P. India||Karachi||R1a1a|| || ||R8a||23andMe|
|008||(50% South Asia, L Nawaz, F)||D/ID 7, SNAWAZ|| || || ||T2a1||23andMe|
|009||Nawanagar/Ballia, F||India/52|| || || ||M3c||23andMe|
|010||India||unknown||J1a|| || ||U2c1||23andMe|
|011||(50% South Asia, C Hobson, M)||unknown||R1a1a||L657+/Y6+|| ||H7||23andMe|