May 27th, 2012, 10:06 AM
|
#61 | | Historian
Joined: Apr 2008 From: Sodom and Begorrah Posts: 2,192 | Quote:
Originally Posted by authun And is that what you base it on? An assumption, a guess? It must be the same because they both use 25 sample centres?
The sample centres used in Blood of the Vikings, according to the BBC, include some which are the same, but also some which are not used by Capelli, e.g. London, Glasgow, Liverpool, Horncastle, Sheringham. Similarly, Capelli uses some not used in Blood of the Vikings, e.g. Southwell, Llangefni, Chippenham.
Chippenham in particular is not sampled by Blood of the Vikings and yet is the centre of the high frequency of continental input found in Capelli's MCMC admixture analysis 70.8% and shown on Thomas' map. Thomas's map is clearly not the same as you have guessed.
Moreover, you should not assume that even in those centres which are the same that the microsatellites used will be the same. Data which is collected is stored and re-analysed. I have never seen the raw data for Blood of the Vikings but if it analyses the same microsatellite data as Jim Wilson did, and he was involved, locus DYS388 was not counted. So, even where the same individuals contribute, it might be that different data is used.
Your accusation therefore, that Dr Mark Thomas' map is "bogus", is based on no more than a false assumption. I think you owe him an apology. | Ok my bad, I assumed wrong and the locations used for the 2 studies were not exactly the same but it does look like most of them were. Now in the case of Ireland it looks like Thomas has used the exact same 2 locations.
The reason I think the map is misleading is that in the Harke article it has the following title. Quote:
Modern densities of introgressive Y-chromosome DNA indicative of immigration from continental NW Europe.
Created by M Thomas, based on data from Capelli et al 2003
| Colouring Ireland black based on Castlerea and Rush is a massive oversimplification. If the samples had been taken in say south Dublin or Downpatrick the results would be a lot different. Also I seem to remember that even when they took the Irish samples they excluded people with English surnames. Since there are lots of modern Irish people with English surnames how can they call the map modern?
Perhaps the map would stand if they explained that it shows how Ireland looked a long time ago, but they couldn't even be sure of that because they are only using 2 locations. If they are only using the Irish samples as a base to represent indigenous Britons, they are still making a big assumption, as there is no guarantee that the indigenous Irish were the same as the indigenous Britons.
Another thing which struck me at first glance is Galloway. I mean they have coloured Galloway white based on what? The sample in the Isle of Man by the looks of it, and so on and so forth.
| | |
| |
May 27th, 2012, 11:08 AM
|
#62 | | Lecturer
Joined: Aug 2011 Posts: 332 | Quote:
Originally Posted by galteeman Another thing which struck me at first glance is Galloway. I mean they have coloured Galloway white based on what? The sample in the Isle of Man by the looks of it, and so on and so forth. | As I wrote earlier, those synthetic maps can be misleading because the clines fill in the blanks as you point out. However, in 2003, the time of Capelli, clines were the features being studied. What is Hg1 in Wilson in 2001 and Hg1 in Weale in 2002 is R1b with 3 haplotypes in 2003. Today we have dozens of markers. Things move on. Back around 2002, the question was, does the technique work at all? You are right though, Wilson's hypothesis that North Wales represented the British Isles as far as East Anglia is stretching it. However, Weale points this out as a caveat.
Harding and Jobling, cited earlier, who were involved in the collection of the data in Ireland and used Wilson's study, did so on the basis of Gaelic surnames. They have continued with their surname-based methods and, by using a combination of surnames found in medieval sources such as manor rolls, produced a follow-up to Blood of the Vikings with their Wirral Viking project, see Excavating Past Population Structures by Surname-Based Sampling: The Genetic Legacy of the Vikings in Northwest England (2008)
They are currently writing up the results of the Old Norway Project which will be published shortly.
The method of using surnames to create medieval proxies is proving popular. The Peoples of the British Isles Project have published the first of several papers, one which deals with this method: People of the British Isles: preliminary analysis of genotypes and surnames in a UK-control population
As you can see from the map below, one gets a much better coverage: Quote:
Originally Posted by galteeman if they are only using the Irish samples as a base to represent indigenous Britons they are still making a big assumption as there is no guarantee that the indigenous Irish were the same as the indigenous Britons. | Agreed, that's why it is important to read the studies carefully. As I wrote earlier, this is about verifiable and repeatable experiment as much as testing new hypotheses.
Wilson uses the population of North Wales as a signature for the indigenous population. Capelli uses a combination of Ireland and the Basques. In Capelli's MCMC analysis of Orkney, Shetland and the Western Isles, he finds Scandinavian paternal lineages at 55.3%, 68.3% and 61.6%. Goodacre et al, however, in Genetic evidence for a family-based Scandinavian settlement of Shetland and Orkney during the Viking periods, found a lower Scandinavian input, 31%, 44.5% and 22.5%, although this too uses the same Markov chain method. The differences may be due to one of three things: sampling, which is unlikely on these islands; Goodacre's inclusion of private alleles; or a different base for assessing the indigenous population, Ireland and parts of western Scotland.
Very rarely do these studies compare like with like. The early studies wanted to vary the parameters to see the effect. Usually this is explained in the study but sometimes it is necessary to ask the authors.
| | |
| |
May 27th, 2012, 12:18 PM
|
#63 | | Lecturer
Joined: Aug 2011 Posts: 332 | Quote:
Originally Posted by Frank81 The paper is so recent that I can't find new studies discussing the hypothesis, so imho we should wait until then. | It receives support in Per Sjödin et al. 2011 Wave-of-Advance Models of the Diffusion of the Y Chromosome Haplogroup R1b1b2
Balaresque's study was not presenting a new idea, it was using new dating techniques to settle an old argument where the date range error was too large to settle the matter. Sjödin compared Balaresque's use of Germline Mutation Rates with Evolutionary Mutation Rates proposed by Zhivotovsky and Morelli. Sjödin's conclusion was: "We report that a range expansion dating to the Paleolithic is unlikely to explain the observed geographical distribution of microsatellite diversity, and that whether the data is informative with respect to the spread of agriculture in Europe depends on the mutation rate assumption in a critical way." The caveat about the mutation rate is critical however. Published shortly after, Balaresque receives a challenge in Busby et al, The peopling of Europe and the cautionary tale of Y chromosome lineage R-M269 "We further investigate the young, STR-based time to the most recent common ancestor estimates proposed so far for R-M269-related lineages and find evidence for an appreciable effect of microsatellite choice on age estimates. As a consequence, the existing data and tools are insufficient to make credible estimates for the age of this haplogroup, and conclusions about the timing of its origin and dispersal should be viewed with a large degree of caution. "
Busby et al basically state that in order to explain the observed data, R1b lineages either had to spread out of Iberia during the palaeolithic or enter Europe very late, around the time of the chalcolithic. They think this is unlikely because of the very high frequencies of R1b found in some parts in the west. If it was a late arrival, it ought to be mixed with other Hgs. However, this view does not include the findings of population studies which suggest a large decline in the mid neolithic population. As stated earlier, early ancient DNA is largely G2, although this is relatively low in frequency now. R1b lineages are much higher but don't appear in the aDNA record until the late neolithic, so it is still very much a possibility.
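To see why the mutation rate assumption matters so much in this dating argument, here is a minimal sketch. It assumes a simple single-step symmetric mutation model with star-like descent from one founder, where the expected variance in repeat count grows as mutation rate times time, so the age estimate is T = V / mu. The variance value and both rates are illustrative assumptions of mine, not figures taken from Balaresque, Sjödin or Busby, and the function name is hypothetical.

```python
# Sketch: variance-based age estimate under a single-step symmetric model.
# Assumption: expected variance V = mu * t, so t = V / mu (in generations).

def tmrca_years(variance, mu_per_generation, years_per_generation=25):
    """Return an age estimate in years from per-locus repeat-count variance."""
    generations = variance / mu_per_generation
    return generations * years_per_generation

# Hypothetical average per-locus variance for one haplogroup sample.
observed_variance = 0.30

# Illustrative rates only: a faster "germline" style rate and a slower
# "evolutionary" style rate, both per locus per 25-year generation.
rates = {
    "germline (assumed 2.0e-3)": 2.0e-3,
    "evolutionary (assumed 6.9e-4)": 6.9e-4,
}

for label, mu in rates.items():
    print(f"{label}: about {tmrca_years(observed_variance, mu):.0f} years")
```

The same observed variance gives an age nearly three times older under the slower rate, which is the kind of gap that separates a late Neolithic date from a much earlier one.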
| | |
| |
May 27th, 2012, 11:41 PM
|
#64 | | Lecturer
Joined: Aug 2011 Posts: 332 | Quote:
Originally Posted by galteeman Another thing which struck me at first glance is Galloway. I mean they have coloured Galloway white based on what? The sample in the Isle of Man by the looks of it, and so on and so forth. | I've checked the Supplementary Data Sheet S1 for Capelli's Y Chromosome Census of the British Isles and it gives the following admixture figures:
Norway; 58.2%
Nrth Germany/Denmark; 75.7%
These are separate calculations using the MCMC method, see Estimation of Admixture Proportions: A Likelihood-Based Approach Using Markov Chain Monte Carlo
The sample centre was the Isle of Man. Comparison of the samples against 1. Indigenous vs Norway and 2. Indigenous vs North Germany/Denmark.
Galloway is simply the result of the synthetic gene frequency map influenced by neighbouring sample centres. All methods of displaying this type of information suffer from various drawbacks. To help with interpretation, you may like to read, Interpreting principal component analyses of spatial population genetic variation | | |
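As a rough illustration of how an unsampled area picks up its colour from neighbouring sample centres, here is a small sketch using inverse-distance weighting. This is a stand-in chosen for simplicity; I am not claiming it is the interpolation used for Thomas's map or in Capelli. The coordinates and admixture values are entirely hypothetical and sit on an abstract grid, not a real map.

```python
import math

def idw(point, centres, power=2):
    """Inverse-distance-weighted estimate at `point` from (x, y, value) centres."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in centres:
        distance = math.hypot(point[0] - x, point[1] - y)
        if distance == 0:
            return value  # the point is itself a sample centre
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical sample centres: (x, y, estimated continental input).
centres = [
    (0.0, 0.0, 0.75),  # a nearby centre with a high value
    (5.0, 0.0, 0.20),  # a distant centre with a low value
    (0.0, 6.0, 0.30),  # another distant centre
]

# An unsampled location close to the first centre.
unsampled = (1.0, 0.0)
print(f"interpolated value: {idw(unsampled, centres):.3f}")
```

The unsampled point ends up close to the value of its nearest centre even though nobody was sampled there, which is exactly the effect being complained about with Galloway.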
| |
May 27th, 2012, 11:59 PM
|
#65 | | Historian
Joined: Apr 2008 From: Sodom and Begorrah Posts: 2,192 |
It seems to me that firstly they should make a picture of today without excluding anyone of whatever origins. After that they could start to exclude people based on more recent migration using surnames perhaps or whatever and work their way back creating new pictures with clearly defined parameters. Now in order to get a true picture they will need a lot more centres and a lot more samples.
With so few samples they end up doing stuff like colouring Galloway white based on the Isle of Man results which gives a false picture which is worse than nothing.
| | |
| |
May 28th, 2012, 12:21 AM
|
#66 | | Lecturer
Joined: Aug 2011 Posts: 332 | Quote:
Originally Posted by galteeman It seems to me that firstly they should make a picture of today without excluding anyone of whatever origins. | You're not going to get coverage at that resolution yet due to costs. The Peoples of the British Isles project which gives the best coverage so far has a budget in excess of 3 million.
Costs are coming down dramatically. For example the first complete sequences for an individual cost around £1 million. Now it's more like £100,000. In another ten years it may be down to £1,000.
But, at the moment, you're only going to get analysis of short sequences and for small numbers of the population.
| | |
| |
May 28th, 2012, 02:48 AM
|
#67 | | Archivist
Joined: Dec 2011 From: Bucharest Posts: 133 | Quote:
Originally Posted by authun It receives support in Per Sjödin et al. 2011 Wave-of-Advance Models of the Diffusion of the Y Chromosome Haplogroup R1b1b2
Balaresque's study was not presenting a new idea, it was using new dating techniques to settle an old argument where the date range error was too large to settle the matter. Sjödin compared Balaresque's use of Germline Mutation Rates with Evolutionary Mutation Rates proposed by Zhivotovsky and Morelli. Sjödin's conclusion was: "We report that a range expansion dating to the Paleolithic is unlikely to explain the observed geographical distribution of microsatellite diversity, and that whether the data is informative with respect to the spread of agriculture in Europe depends on the mutation rate assumption in a critical way."
The caveat about the mutation rate is critical however. Published shortly after, Balaresque receives a challenge in Busby et al, The peopling of Europe and the cautionary tale of Y chromosome lineage R-M269 "We further investigate the young, STR-based time to the most recent common ancestor estimates proposed so far for R-M269-related lineages and find evidence for an appreciable effect of microsatellite choice on age estimates. As a consequence, the existing data and tools are insufficient to make credible estimates for the age of this haplogroup, and conclusions about the timing of its origin and dispersal should be viewed with a large degree of caution. "
Busby et al basically state that in order to explain the observed data, R1b lineages either had to spread out of Iberia during the palaeolithic or enter Europe very late, around the time of the chalcolithic. They think this is unlikely because of the very high frequencies of R1b found in some parts in the west. If it was a late arrival, it ought to be mixed with other Hgs. However, this view does not include the findings of population studies which suggest a large decline in the mid neolithic population. As stated earlier, early ancient DNA is largely G2, although this is relatively low in frequency now. R1b lineages are much higher but don't appear in the aDNA record until the late neolithic, so it is still very much a possibility. | I found the full paper by Busby et al., and I think this work is the most revealing. If I understood it correctly, they find that different sets of STRs give different values for T, the average coalescence time. Two properties of the STRs are necessary for the methods using av.st.dev.: the mutation rate and the range of possible alleles that the STR can take. The mutation rate itself was taken as either EMR or GMR in the previous papers, without taking into account the properties of the STRs. Most importantly, the cline in diversity for R-M269 is an artifact. The haplogroup M269 can be split by R-S127 into European and western Eurasian lineages which do not show a correlation with longitude. The cline in diversity was the central point of the previous papers you quote about Neolithic demic expansion. The bottom line is that it's still too early to draw conclusions, and that the late Neolithic age estimates of R-M269 are likely to be younger than the true values.
PS: I am not familiar with this, but is it normal in wave-of-advance models to have the cline of diversity exactly inverse to the cline of frequency? While I admit it can happen, to me this would be a second-order effect, with the first-order effect being that both clines run in the same direction. In other words, I would expect this not to happen too often.
| |
Last edited by Eugen; May 28th, 2012 at 03:14 AM.
|
| |
May 28th, 2012, 03:44 AM
|
#68 | | Lecturer
Joined: Aug 2011 Posts: 332 |
The difficulties in modelling microsatellite mutation have been known about since the beginning. They used to add caveats such as Weale's:
"Finally, we accept that our inferences are based on population genetic analyses that assume a particular model of microsatellite evolution under selective neutrality and growth and that departures from these assumptions may influence our results."
Although such caveats seem to be absent from many later studies, it has been more a case of not having to repeat it rather than it being solved. The school of thought was however that increasing the number of microsatellites studied would iron out errors associated with a small number of microsatellites. However, Busby concludes that choice of microsatellite is more important than number of microsatellites.
This then affects how one measures diversity. It can be unreliable, and diversity for the same group can appear different depending on which microsatellites one chooses. I've never been a fan because not only does each locus mutate at a different rate, and there are many to choose from, but the counts can go up or down. The type of mutation is called an indel, short for insertion or deletion. For a particular locus, two individuals may have the same STR count, say 13, but in the state before, one may have been 12 whilst the other was 14. When they are both 13, they have a genetic distance of 0 for that locus, but it is expressed as a probability because previous states could have been 12,13; 13,12; 12,12; 14,14; 12,14; 14,12, and in some cases we get an insertion or deletion of 2 counts, e.g. 15 mutates to 13. What appears to be less diverse can actually be masking a lot of diversity if the wrong microsatellites are chosen.
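The point about hidden previous states can be made concrete with a small enumeration. This is only a sketch under an assumption of mine: each lineage has changed by at most one single-step mutation (or not at all) since the previous state. It ignores the two-count jumps mentioned above and does not attach probabilities to any history.

```python
from itertools import product

# Both individuals show 13 repeats at this locus today.
current = 13

# Assumed possible previous states for each lineage: one step down,
# unchanged, or one step up.
previous_states = [current - 1, current, current + 1]

# Every combination of previous states for the two lineages.
histories = list(product(previous_states, repeat=2))

for a, b in histories:
    print(f"previous {a},{b}: distance then {abs(a - b)}, distance now 0")

# Largest earlier difference that is invisible in today's identical counts.
max_hidden_distance = max(abs(a - b) for a, b in histories)
print("largest hidden earlier distance:", max_hidden_distance)
```

All of these histories end with an observed distance of 0, even though some of them started with the two lineages two repeats apart.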
In addition to the large amount of uncertainty as to previous state, the mutation rate itself is far from certain, because we cannot base it on observations which we can test. It is a constant process of re-evaluating various hypotheses.
If you are interested in this aspect, have a look at this paper, now over ten years old: Genetic variation at twentythree microsatellite loci in sixteen human populations | | |
| |
May 28th, 2012, 10:29 PM
|
#69 | | Archivist
Joined: Dec 2011 From: Bucharest Posts: 133 | Quote:
Originally Posted by authun The difficulties in modelling microsatellite mutation have been known about since the beginning. They used to add caveats such as Weale's: "Finally, we accept that our inferences are based on population genetic analyses that assume a particular model of microsatellite evolution under selective neutrality and growth and that departures from these assumptions may influence our results."
Although such caveats seem to be absent from many later studies, it has been more a case of not having to repeat it rather than it being solved. The school of thought was however that increasing the number of microsatellites studied would iron out errors associated with a small number of microsatellites. However, Busby concludes that choice of microsatellite is more important than number of microsatellites.
This then affects how one measures diversity. It can be unreliable, and diversity for the same group can appear different depending on which microsatellites one chooses. I've never been a fan because not only does each locus mutate at a different rate, and there are many to choose from, but the counts can go up or down. The type of mutation is called an indel, short for insertion or deletion. For a particular locus, two individuals may have the same STR count, say 13, but in the state before, one may have been 12 whilst the other was 14. When they are both 13, they have a genetic distance of 0 for that locus, but it is expressed as a probability because previous states could have been 12,13; 13,12; 12,12; 14,14; 12,14; 14,12, and in some cases we get an insertion or deletion of 2 counts, e.g. 15 mutates to 13. What appears to be less diverse can actually be masking a lot of diversity if the wrong microsatellites are chosen.
In addition to the large amount of uncertainty as to previous state, the mutation rate itself is far from certain, because we cannot base it on observations which we can test. It is a constant process of re-evaluating various hypotheses.
If you are interested in this aspect, have a look at this paper, now over ten years old: Genetic variation at twentythree microsatellite loci in sixteen human populations |
Thanks for the paper, I read some of it. The second part, gene diversity analysis, was pretty educational. For my own reference, I'll write down here the points I don't want to forget. In my notation:
Ht, total gene diversity (entire population)
Hg_i, i=1..5, total gene diversity within group g_i
Hg_i_p, total gene diversity within population p inside group g_i
Gg, gene diversity between groups
Gg_i(p), gene diversity between populations within group g_i
Ht is a little larger than the averaged Hg_i_p; most of the genetic variation is formed by interindividual differences
Gg > average Gg_i(p), but the Gg_i(p) are not uniform
Smaller and isolated populations show a smaller Hg_i_p and much larger Gg_i(p) (this follows the predictions of the stepwise mutation-drift model of microsatellite variation)
Smaller and isolated populations might undergo population bottlenecks and genetic drift, thus generating high allele frequency differences in these populations
Disease-causing loci show higher Hg_i_p, therefore a higher mutation rate
Large populations show low variation in Gg_i(p), which are small and uniform
Small populations show high variation in Gg_i(p), which range from small to large
Some loci have a tendency to produce smaller Gg_i(p), possibly due to homogenization of populations by convergent forward-backward mutations of high rate
Hg_i is the highest in Africa
The mutational mechanisms at microsatellites are different from those of traditional serum protein markers. Mutations at microsatellite loci cause contraction as well as expansion of allele size.
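To fix the notation in my own head, here is a minimal sketch of how these quantities are usually computed for a single locus. It assumes Nei's standard definitions: gene diversity H = 1 minus the sum of squared allele frequencies, and a between-population coefficient G = (H_total - H_within) / H_total. Whether the paper weights populations equally, as I do here, is an assumption, and the allele frequencies are made up for the example.

```python
def gene_diversity(freqs):
    """Nei's gene diversity for one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())

def diversity_partition(populations):
    """Return (H_total, H_within, G_between) for a list of allele-frequency dicts.

    Assumption: populations are weighted equally (unweighted mean).
    """
    alleles = set().union(*populations)
    n = len(populations)
    mean_freqs = {
        a: sum(pop.get(a, 0.0) for pop in populations) / n for a in alleles
    }
    h_total = gene_diversity(mean_freqs)
    h_within = sum(gene_diversity(pop) for pop in populations) / n
    g_between = (h_total - h_within) / h_total
    return h_total, h_within, g_between

# Hypothetical allele frequencies (repeat count -> frequency) at one STR locus.
pop_a = {12: 0.5, 13: 0.5}
pop_b = {13: 0.5, 14: 0.5}

h_total, h_within, g_between = diversity_partition([pop_a, pop_b])
print(f"H_total={h_total:.3f}  H_within={h_within:.3f}  G_between={g_between:.3f}")
```

In my notation above, H_total plays the role of Ht, H_within the averaged Hg_i_p, and G_between the Gg or Gg_i(p) coefficient, depending on which level the populations are grouped at.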
| |
Last edited by Eugen; May 28th, 2012 at 10:44 PM.
|
| |
May 29th, 2012, 12:06 AM
|
#70 | | Lecturer
Joined: Aug 2011 Posts: 332 | Quote:
Originally Posted by Eugen Thanks for the paper, I read it a bit. The second part, gene diversity analysis, was pretty educative. | Things will have changed to some degree since that was written, as the population models have improved in things like founder effects, so don't take it all as given. It's best just to google microsatellite diversity in human populations and genetic diversity in human populations. By microsatellites we mean STRs and, mostly, genetic diversity means SNPs. So you could also google STR diversity and SNP diversity. Quote:
Originally Posted by Eugen Mutations at microsatellite loci cause contraction as well as expansion of allele size. | Yes, as I wrote above, these mutations are indels, insertions or deletions, so they can go down. If you have a look at this list of Y-STR markers, you can see what the repeat sequence is, GAAA for DYS385 for example, the range of the number of repeats, from 7 to 28 in the case of DYS385 and the mutation rate for that particular locus.
| | |
Copyright © 2006-2013 Historum. All rights reserved.
|  |