A Small Y-DNA Haplogroup:  E1b1b1a1b1a6 (L540)

21 Apr 2014

Peter Gwozdz

pete2g2@comcast.net

News

 

           21 Apr 2014 update of the L540 Neighborhood table.  There are 14 samples confirmed positive with the L540 test;  12 of these are predicted C type and 2 are not.  There are 9 additional samples confidently predicted C type based on STR marker values;  these need the L540 test for confirmation, and in some cases more STR markers to improve C type prediction.  In addition there are 23 more samples predicted C type with lower probability;  these need more STR markers to improve C type prediction and these also need the L540 test for confirmation.  In addition there are many more “neighbors” listed in the table where L540 C type probability is very low for each individual sample;  these would benefit from the L540 test;  some of these would probably come out positive meaning L540 but not C type and most of these would come out L540 negative meaning STR neighbors.  The table also lists 9 STR neighbors that have tested L540-.

 

Abstract

           Rewrite 13 Mar 2014:

           This web document is a summary of my information on a small haplogroup of Y-DNA based on an SNP mutation named L540.  The subject is genetic genealogy.

           There is a Neighborhood table below with a list of samples (men) predicted to belong to the L540 haplogroup, and also samples predicted to be in the Neighborhood just outside L540.  The samples near the cutoff (borderline fit) are the ones that should be tested for the L540 SNP to see if they belong to the L540 haplogroup;  probability of belonging decreases with the step number, as explained in the discussion below that table.

           This Abstract is for people reasonably familiar with the jargon of genetic genealogy.  If you are new to genetic genealogy you might prefer to first read an Introduction that I wrote for another of my web documents.

           L540 was discovered in my Walk Through the Y (My WTY).  I purchased WTY, a commercial product that reads more than 200,000 base pairs of the Y chromosome.

           The clade that we now call L540 was originally called cluster C, a hypothetical cluster proposed as a division of haplogroup E1b1b1a1b1a, which is defined by the SNP named V13.  Hence I coined the name V13C.html for this document about cluster C in early 2010.  I rewrote this document and renamed it L540.html on 30 Apr 2011.  E1b1b1a1b1a (V13) is the largest haplogroup division of haplogroup E, but cluster C is small.

           I am not planning a separate L540 project, because it is more convenient to run this informally through the E-M35 project.  Villarreal and Lancaster have been very helpful.

           My equivalent name for V13C was “C type”, or just “C”.  I independently verified C type on 9 Jan 2010 as a good candidate type.  I use the word “type” to mean an STR cluster with statistical validity as established by my Mountain Method.  I published my methods in the Fall 2009 issue of JoGG.

           Two of the L540 positive samples in the Neighborhood table do not belong to C type, so C type is a predicted branch of L540.  I use “L type” to mean samples predicted to be L540 using STR values.

           C type is quite young, perhaps less than 1,000 years old.  The L540 mutation is older, maybe more than twice the age of C.  Such age estimates are uncertain, particularly with so little data in this case.

           L540 seems to be roughly 90% C type plus 10% other, older branches, as explained in the next topic about L540.

           Watch this document.  I’ll add more information as data accumulates.

 

L540

           Update 21 Apr 2014:

           L540 is the code name for an SNP that was discovered in my WTY, announced 29 March 2011.  On 27 Apr 2011 it was demonstrated that L540 defines a new haplogroup within E1b1b1a1b1a (V13).

           I use the code name L540 for the SNP, for the associated haplogroup, and for the samples (men) in that haplogroup.

           The L540 haplogroup includes C type, and C type is most of L540.  The C type samples that have been tested are all L540+.

           The Neighborhood table below has my predictions for L540 and for C type.

           My sample has been tested negative for all the 7 confirmed branches of V13, so L540 is an 8th branch of V13, not a branch of one of the prior 7 branches.

           Three of the 7 branches are too small - few or no samples available on-line for testing:  M35.2, V27, and P65.  In April 2014, V27 and P65 were dropped from the ISOGG tree (those two considered “private”) so L540 is now considered the 6th known branch of V13.

           Samples from the other 4 main branches of V13 are available;  I recruited and paid for testing as needed:  L143, L250, L17, and L241.  All are L540-.  That means none of these are branches of L540.

           There are L241 samples in the Neighborhood table;  this is evidence (not proof) that L540 and L241 might be brothers, with a common undiscovered SNP branch of V13.

           Anyone in the Neighborhood table would benefit from ordering the L540 test.

           ISOGG names change as new SNP divisions are discovered.  L540 was officially added 10 Jul 2011.  The code name is now E1b1b1a1b1a6 (L540).  Code names change as new branches are discovered.  For example in early 2011 V13, the father of L540, was called E1b1b1a2, so the original code for L540 was E1b1b1a2h, but the codes for V13 and L540 changed as new older SNP branch nodes were discovered.

           There are 14 samples in that Neighborhood table confirmed positive for the L540 mutation, including 12 labeled “C” for predicted C type and 2 labeled “L” for L type L540 not predicted C.  An additional 9 samples without the L540 test are predicted C type with 80% or greater confidence based on STR values, labeled “C”.  An additional 23 samples in the table are predicted C type with low confidence, labeled “C?”.  One sample is predicted “L?”.  I suppose a couple of those 23 might end up being L “paratype” samples when more STRs are measured, and perhaps many of them will be confirmed C type.  I suppose a couple of the samples not predicted L540 might turn out to be L540 positive.  In other words, the Neighborhood table seems to represent roughly 35 L540 samples (men).  Watch that table for more as data accumulates.

 

Cluster C

           Rewrite 25 Mar 2013:

           Clusters are based on STR correlation.  There are 50 samples predicted C type in the Neighborhood table, labeled “C”.  Some of these are marginal, with fewer than 67 markers, with only 80% confidence of belonging.  26 of those are listed at the haplozone site, in the V13 + L540 branch.  Cluster C includes me and my 3rd cousin (Gwozdz).

           Friedman proposed cluster C more than 4 years ago, based on STR and SNP correlations, when the data was less than what is available today. 

           New samples appear when Friedman updates cluster C, or when I update the Neighborhood.

 

C Type

           Rewrite 24 Mar 2013:

           I use the word type for an STR cluster with statistical validity as established by my Mountain Method.  “Type” is my own term.  I chose the word “type” because it is not generally used in genetic genealogy and I wish to distinguish my types from haplogroups and from other clusters.  By “type” I mean the cluster data, the hypothetical clade, the modal haplotype, and the set of all possible haplotypes, at any number of markers.  Accordingly, by “C type” I mean any or all of these 4 things.  I sometimes use just “C” as short for “C type”.  I also have a previous C type identified in R1a;  unrelated;  please don’t get confused.

           My analysis files define C type.  Sorry, it can be a bit confusing because I have multiple STR definitions for C type, for various marker sets.  The number of markers in my definitions changes slightly when new samples show up with unusual STR values.  I hope the meanings are clear from the context of my discussions in this web document.

           I also provide STR definitions for L540, discussed below, treating L540 as L type.

           C type is roughly 90% of L540.  My evidence, considering only the independent samples with 67 or more STR markers in the Neighborhood:  there are 17 type L samples, and only 2 of them do not fit C type but have tested positive (L540+) for the L540 SNP.  I say “independent” because some C type samples have been recruited based on known genetic relationship to C type men (my Gwozdz cousin, Kargol, Svercel);  these should not be counted in this estimate.  I do not include samples with fewer than 67 markers in this estimate, because type prediction has lower confidence with fewer markers, so I have been actively encouraging the L540 test primarily for samples with 67 markers.  Notice in the table that most borderline samples (near the L type cutoff) have L540 test results.  I say “roughly” for this estimate because I have been recruiting samples for this table at 67 markers with equal emphasis on borderline samples, in order to properly sample the STR borderline, but it is difficult to prove no bias toward samples that fit well.  Also there is a chance of outliers showing up in the future:  either L540- among those samples that fit C type very well, or L540+ among those beyond the borderline.  Also, with only 2 L540+ outside C type, the sampling confidence is not good.
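
           The arithmetic behind the “roughly 90%” estimate can be sketched as follows.  The counts (17 and 2) come from the paragraph above.  The Wilson score interval is an assumption added here for illustration only;  it is not part of the Mountain Method, and it treats the 17 samples as a simple random sample, which may not hold given the recruiting described above.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half

L_SAMPLES = 17  # independent type L samples with 67 or more markers (from the text)
NOT_C = 2       # L540+ samples that do not fit C type (from the text)

c_fraction = (L_SAMPLES - NOT_C) / L_SAMPLES
low, high = wilson_interval(L_SAMPLES - NOT_C, L_SAMPLES)

print(f"C type fraction of L540: {c_fraction:.0%}")
print(f"approximate 95% range:   {low:.0%} to {high:.0%}")
```

           The point estimate is 15 of 17, about 88%.  The wide range is one way to picture why the sampling confidence is not good with so few samples outside C type.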

           The evidence today indicates that L540 is much older than C type - details below.

 

V13C

           Rewrite 12 Apr 2014:

           I coined the name V13C in 2010 to represent C type, cluster C, the hypothetical haplogroup, and the samples (men) in the hypothetical haplogroup.  This web document used to be named V13C.html.

           Now that C type is a subdivision of L540 I am editing away the name “V13C”, but I’ll continue to use “C type” for the hypothetical clade that is part of (most of) L540.

           V13 is the defining SNP for E1b1b1a1b1a, so I similarly use “V13” to mean the “father” haplogroup - the large branch in the Y-DNA tree from which L540 is a small twig.  I also use “V13” to mean the associated database of V13 samples at E-M35 or at Haplozone, or at other databases.

 

L67Type.xls Analysis File

C67Type.xls Analysis File

           Update 13 Mar 2014

           http://www.gwozdz.org/L67Type.xls is analysis of L type STRs, which are predicted equivalent to L540.

           http://www.gwozdz.org/C67Type.xls is analysis of C type STRs, which is a predicted branch of L540.

           L45(67) means a modal haplotype for the L540 haplogroup using 45 of the 67 standard markers.  My definition for L540 is L45(67), all samples less than the cutoff (genetic distance, or step) 6.  L45(67) was new on 21 Mar 2013.  I also typed L45 into Ysearch as 479H7.  My Mar 2014 analysis indicates that L45(67) is still my best STR definition for the L540 haplogroup.

           Similarly, C42(67) with cutoff 3 is my definition of C type, introduced 11 Mar 2014, Ysearch QAZ7P.
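
           A minimal sketch of how a definition of this kind can be applied:  a modal haplotype, a mask that selects which markers count, and a cutoff on the step (genetic distance).  The step here is assumed to be the sum of absolute differences over the masked markers;  the analysis files may count steps differently.  The modal values are marker values quoted elsewhere in this document, but the 4 marker mask, the cutoff of 2, and both samples are made up for illustration and are not L45(67) or C42(67).

```python
def step(sample, modal, mask):
    """Genetic distance: sum of absolute differences over the masked markers."""
    return sum(abs(sample[marker] - modal[marker]) for marker in mask)

def fits(sample, modal, mask, cutoff):
    """A sample fits the definition when its step is less than the cutoff."""
    return step(sample, modal, mask) < cutoff

# Toy modal haplotype (values quoted in this document).
MODAL = {"DYS390": 25, "DYS594": 12, "DYS444": 13,
         "DYS406": 11, "DYS456": 15, "DYS447": 25}

# Hypothetical mask and cutoff, for illustration only.
MASK = ["DYS390", "DYS594", "DYS444", "DYS406"]
CUTOFF = 2

# Made-up samples, not real kits.
near = {"DYS390": 25, "DYS594": 12, "DYS444": 12,
        "DYS406": 11, "DYS456": 16, "DYS447": 25}
far = {"DYS390": 24, "DYS594": 11, "DYS444": 13,
       "DYS406": 11, "DYS456": 15, "DYS447": 25}

for name, sample in (("near", near), ("far", far)):
    print(name, "step", step(sample, MODAL, MASK),
          "fits" if fits(sample, MODAL, MASK, CUTOFF) else "outside")
```

           The made-up “near” sample differs at DYS456, but that marker is not in the mask, so only its DYS444 difference counts toward the step.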

           Please refer to the Neighborhood table, where these definitions and others appear as columns.

           My previous definitions are available in those files, sheet “Haplotypes & Masks”.

 

37 Markers

 

           17 July 2011 comment:  This and the following 2 topics are based on my Feb 2011 analysis.  I update less often at fewer than 37 markers.  These 3 topics are C type only;  data is not good enough yet for L540 analysis below 67 markers.

           I also have a file using only 37 markers for analysis:  www.gwozdz.org/C37.xls.  189 samples.

           The SBP using all 37 markers is 28%.  That 37 marker column, cutoff 9, captures the 10 samples assigned to cluster C and none others.  However, the other columns, using the best markers, consistently capture only 8 of those samples, casting some doubt on the 2 marginal ones.  It seems 67 markers are statistically required for marginal cases.

           The C67.xls file does a correlation;  the data is copied here below:  The samples with all 67 markers are evaluated using only the first 37.

 

25 Markers

 

           This topic needs update.

           I also have a file using only 25 markers for analysis:  www.gwozdz.org/C25.xls.  228 samples.

           The standard 25 marker STR set is used by a number of Y-DNA testing companies.  The Haplozone data include Sorenson data that is not in the E3b data, and 3 of these land in cluster C.  Those samples have kit numbers starting with “S” in the Neighborhood table below.

           At 25 markers there is no valid C type.  SBP comes out greater than 100% for any combination of markers.  The SBP formula gives a result regardless, but the result is meaningless far above 50%.

           The modal haplotype C25 (all 25 markers) captures all 13 cluster C samples plus 7 others, cutoff 5.

           C18(25) is the best definition I found from the 25 marker set, but even that definition is not satisfactory, because it does not correlate with the results at 37 and 67 markers;  see the Excel analysis file.  Just about any definition using 25 markers captures the samples that fit well at 67 markers, but at 25 markers different definitions capture different marginal samples.

 

12 Markers

 

           I also have a file using only 12 markers for analysis:  www.gwozdz.org/C12.xls.  I used the full database but truncated that analysis to the closest 33 samples to keep the file small.  The 3 Sorenson samples in the table below are not in this database.

 

Best STR Markers

           New topic 14 July 2011:

           STR markers that mutate relatively slowly are statistical indicators for clades in which they are recently mutated, but they are not perfect because of subsequent independent mutations.  When a clade has a few such good STR markers those provide a signature set of STR markers.  A signature is statistically expected to be a more probable indicator of a clade than just one marker.  Indeed cluster C is characterized by the Friedman Signature.  The definitions of C type and L540 use other helpful markers, not just the signature.

           For example DYS389II is the best STR indicator for cluster C and C type because all but three of the cluster C and C type samples identified so far at the Haplozone site have the 32 value and very few other samples in the STR neighborhood have the 32.  The ancestral 30 value is most common in the neighborhood.  Those three exceptions have a 31 value, which is not common in the neighborhood.  We expect that subsequent mutations from 32 to 31 to 30 must occur rarely within C type, so eventually C type samples with 30 should show up, as data accumulates with time.

           My analysis files automatically rank markers using a method that I published.  The exact ranking of markers varies slightly from month to month due to the random nature of mutation values in new samples, and due to the somewhat arbitrary cutoff that I use to restrict the database to the neighborhood (using too many samples provides a ranking of the father clade instead of the clade of interest).  For example a marker that ranks 6th one month might come out 4th or 7th the next month.  As another example, 389II always comes out 1st for C type but ranks 5th (11 Jul 2011 analysis) for the father haplogroup L540.

           An SNP that defines a haplogroup is very unlikely to have happened exactly at the time of the most recent common ancestor (TMRCA) of a haplogroup.  Most likely the SNP is somewhat older, because usually there are many generations between nodes.  By definition an SNP cannot be younger than the haplogroup.  Similarly, we can consider a clade defined by a particular STR mutation, which is likely somewhat older than the TMRCA of that clade.  However, for clusters defined by signatures, and for types defined by definitions, one rare STR mutation that contributes to the signature might have happened shortly before or after the TMRCA of that cluster or type.

           Very slow mutators should make the best markers.  However the slowest are rarely mutated, so those with intermediate rank show up more often as signature markers.  My “Haplotypes & Mask” sheet in my analysis files has the mutation rate rank (slowest is 1st) for the 67 standard markers.  My publication has the Chandler reference.

           Usually it is silly to speculate about clusters defined by a single STR value.  In this case, however, we have a hypothetical haplogroup, C type, which seems quite young, with relatively little STR variation, so some speculation is in order:

 

DYS389II = 32  (389II minus 389I = 19);  Best Marker for C type

           Update 15 July 2011:

           DYS389II=32 is one of the Friedman markers for the C cluster.  It always ranks 1st in my C type analysis files.  The standard 12 marker set, used by all samples at most DNA companies, includes 389II.

           [Technical detail:  DYS389 is a compound marker, where 389I is the first STR chain and (389II minus 389I) is the second STR chain.  So the marker of interest here is really delta = 19 (389II minus 389I = 19).  However, 389I mutates more slowly and has the value 13 for all C samples and for most samples in the neighborhood.  At Ysearch or Haplozone, both 389 markers need to be used together;  if one is omitted both are ignored.  My analysis file allows 389-2 to be used alone, using 389-1 only to calculate the delta for comparison;  this is signaled by using a negative number in the “mask” in the analysis file.  In this discussion topic, by “32” I really mean 19 for the delta value.]
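
           The effect of using the delta can be sketched with a small calculation.  The C values (13, 32) and the ancestral 30 are from this document;  the sample (14, 33) is made up, and the two step functions are hypothetical helpers, not the formulas in the analysis file.

```python
def delta_389(dys389i, dys389ii):
    """Length of the second STR chain: 389II minus 389I."""
    return dys389ii - dys389i

def steps_raw(sample, modal):
    """Steps counted on the reported 389I and 389II values."""
    return abs(sample[0] - modal[0]) + abs(sample[1] - modal[1])

def steps_delta(sample, modal):
    """Steps counted on 389I and on the delta, so a 389I change counts once."""
    return (abs(sample[0] - modal[0])
            + abs(delta_389(*sample) - delta_389(*modal)))

C_MODAL = (13, 32)    # (389I, 389II) for C type, from the text
ANCESTRAL = (13, 30)  # ancestral 389II value in the neighborhood, from the text
MADE_UP = (14, 33)    # hypothetical sample with one extra repeat in 389I only

print(delta_389(*C_MODAL))            # delta for C type
print(delta_389(*ANCESTRAL))          # delta for the ancestral value
print(steps_raw(MADE_UP, C_MODAL))    # raw values count the 389I change twice
print(steps_delta(MADE_UP, C_MODAL))  # delta counts it once
```

           Because the reported 389II value includes the 389I chain, a single mutation in 389I shifts both reported numbers.  Counting on the delta (19 for C type, 17 for the ancestral value) avoids charging that one mutation as two steps.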

           The two L540+ samples that do not fit cluster C, Fredeen and Gebert, have the ancestral value 30.  Butman, the closest STR match with L540-, also has 30.  On this basis, it seems likely that the mutation(s) from 30 to 32 happened close to the TMRCA for C type, and some time after the L540 mutation.

           DYS389II is the only signature marker that distinguishes C type from the 2 known L540+ samples that are not C type.  None of the other 110 markers in the standard set at FTDNA does this.  The other signature markers work well for both C type and L540.  It is possible with more data that another marker might statistically distinguish (weak correlation due to relatively rapid mutations).  On this basis, it seems likely (not certain) that the clade of descendants of the initial 389II mutation is the same as the C type clade.

           Those two non C L540 samples differ from C type by other markers that are not signature markers.  C type has only this one very good marker.  Actually, a type does not need any very good markers if it is very young and very isolated, so that neighborhood samples all differ at a significant number of STR markers, even if not the same markers for each sample.

           The 32 value is rare throughout V13 but shows up in E-M35 branches outside V13.

           DYS389II (actually the delta value) ranks 43rd in Chandler mutation rate.  Near the middle.  So exceptions are expected, due to recent mutations.

           Speculation:

           Model A:  A mutation from 30 to 31 happened close to the TMRCA for C type.  A little later in the history of C type another mutation happened from 31 to 32.  Most C type samples with 31 represent the oldest nodes, and only a minority are back mutations from 32 to 31.  The samples in the STR neighborhood with 31 that do not match C are independent mutations.  This seems to me the simplest model, so I favor it, but only tentatively.

           Model B:  There was a double mutation from 30 to 32 in one man close to the TMRCA.  Or two single mutations too close in time to be distinguished.  Almost all 31 in C are back mutations, most of them from a single subclade.

           Model C:  The 32’s that do not match C belong to the same 389II=32 clade, but there was a population bottleneck.  C is only one of two or more nodes, from MRCA’s who survived to produce descendants with corresponding STR clusters today.  Only C is large enough to be noticed so far, due to a population expansion for C.  This model predicts at least one other small 389II=32 clade will be discovered as L540+ branches with STR values different than C.

           Model D:  I can think of more complicated models.

           The data is not good enough to distinguish these models.  Maybe more data in the future will show correlation with other markers to distinguish a model like A through D.

           Model I:  The initial mutation to 31 (or double to 32) is very close to the same age as C type, so the mutation(s) defines C type.

           Model II:  Mutation(s) younger than C type.  Eventually samples with the 30 value will show up, isolated in haplospace together with the C samples.

           Model III:  Mutation(s) older than C type.  Eventually L540+ samples with 31 or 32 will show up that are too old to fit C type.

           Models I vs II vs III cannot be distinguished from back mutations and outliers until a new SNP is discovered to distinguish them.

 

DYS594 = 12;  Best Marker for L540

           Update 15 July 2011:

           In my analysis, DYS594=12 is the best marker for L540, and is also a good marker for C type.  594 is not in the 37 marker set.  594 helps a lot in defining C type and L540 using the 67 marker set, but does not distinguish C from the rest of L540.

           The 11 L540 samples with 67 markers, including 2 that are not C type, all have the 594=12 value.  Butman, the closest STR match with L540-, has the ancestral 11.

           Two samples in the neighborhood have 594=12 but are L540-.  These are not a random sample;  I recruited them based on STR matches closest to, but beyond the 10 closest matches to, C type at 67 markers.  Other 12 values have not been tested for L540.

           The 594=12 value is more common in the L540 neighborhood than in the rest of the V13 data.

           DYS594 ranks 12th in Chandler mutation rate.  Quite slow, so independent recent mutations should be rare.

           Speculation:

           Model A:  The 11 to 12 mutation in DYS594 is significantly older than L540.

           Model A1:  Quite a few branches, both younger and older than the 11 to 12 mutation, survived the population bottlenecks.  Only C had a significant subsequent population expansion, so only C stands out today.  The other 11 vs 12 branches will not be distinguished by STR values because they are too small and too old to be isolated in STR haplospace.

           Model A2:  The 11 vs 12 branches in the neighborhood will be distinguished by STR values when they are all evaluated for the L540 SNP, and when enough STR data is available to identify the signatures.

           Model B:  The 11 to 12 mutation is not much older than L540.  There is only one other significant independent 12 mutation in the neighborhood outside L540.  By luck.  That clade is the reason there are more 12s in the neighborhood.

           Future data will probably eliminate one or more of these models, and perhaps suggest other models.

           Model C:  The 11 to 12 is younger than L540.  An old 11 branch with L540+ will be found as data accumulates.  If this happens, STR data will not likely have enough correlation to distinguish if such a branch is really due to a back mutation.  A new SNP would probably need to be discovered.

 

Other Good Markers:  DYS390=25, DYS444=13, DYS406=11, DYS456=15, CDYb=33, DYS447=25

           Update 15 July 2011:

           These typically rank among the best in my analysis files, usually in about the order listed in the title here.  Good for both C type and L540 prediction.  The exact ranking is very sensitive to the choice of database.  In the close neighborhood of L540, 390=25 and 406=11 do very well.  Using the entire E-M35 database, 444=13 does better;  by luck 444 does not have any major clades with the 11 value;  there is a cluster in E1b1b1c1a (M84) that has samples with the L540 signature (389II minus 389I)=19 and 594=12, but that interfering cluster in M84 has 444=11, two steps away from L540 at that 444 marker.  This is an example of why the database should be restricted to reasonably close STR samples for analysis.

           Chandler rank for these, in the same order:  47th, 49th, 35th, 60th, 67th, 45th.

           390 is tied with 594 for 1st place in my current L540 analysis file, but I suppose it will end up in 2nd place because it has a higher published mutation rate.

           Models for 390 are similar to the models for 594.

           None of the other markers in the title here are as good as 389II for C type or as good as 594 or 390 for L540.  Each clearly has confounding mutations in the data.  For example, 444 has one L540+ sample with the ancestral 12 indicating that L540 is older, but there is also one L540- sample with the signature 13 indicating L540 is younger;  one of those must be an independent mutation (or an error).  More data will help this get sorted out.

           Many complicated models can be constructed combining 2 or more good markers.  Complication comes from figuring out the age order of the markers.  More data might point to a compelling model.

 

Signature C4

           Update 15 July 2011:

           An excellent signature for C type is (389I, 389II, 594, 444) = (13, 32, 12, 13).  Seven of 9 C type samples with 67 markers have this signature, and the two that miss are at step 1;  no other samples in the neighborhood have step 1;  Gebert is the only one at step 2.  In the vast E-M35 Haplozone database there is only one confounding sample at step 1, but that one is from E1b1b1c1a;  all others differ from this signature by 2 or more steps.  In other words, this 4 marker signature, cutoff 2, extracts all the C type samples and none others from V13 data.  Eventually, of course, exceptions will turn up.

           There are better markers than 389I.  I included that one because it enables C4 in the search function at the Haplozone site.

 

Friedman Signature

           Update 29 Mar 2014:

           The signature is (390, 389-2, 447) = (25, 32, 25).

           Friedman had been calling this the “characteristic marker values” for cluster C at the Haplozone site before I started working on this, back in 2008, when there were only 9 samples available in cluster C, including mine.

           This original Friedman signature works surprisingly well by itself for samples with only 25 of the standard markers, but not with high confidence.  For more details, see the discussion about C3(25) below the Neighborhood Table.

           In early 2011 Friedman added 594=12 to the “characteristic marker values”, for 67 marker samples.  For more details, see the discussion about C4(67) below the Neighborhood Table.

           DYS389 is a compound marker, discussed above.

           Friedman used a more complicated analysis than just this simple signature in her C type assignments.  I do not know what her method was exactly, but most definitions (not all) that I tried, selecting well ranked markers, extracted the same samples that she did.

 

CDYb = 33;  Another Good Marker

           Update 16 July 2011:

           The marker CDYb is very unusual in L540;  10 of the 11 samples have the value 33.  See my 67 marker analysis file.  It ranks tied for 6th.  Just beyond L540 in STR step from the definition, less than half the samples have the 33 value, and some of them might turn out to be L540+ because most are not tested.

           The CDY pair is the most rapid mutator of the 67.  I have never seen a cluster or type that is so uniform for one of the CDY markers.

           It is possible but very unlikely this is a coincidence.  If it is just a lucky coincidence, then as data accumulates over the months lots of C type samples should show up with values other than CDYb = 33.  I originated this CDYb topic in April 2010, and the accumulated data since then has strengthened the evidence that CDYb=33 dominates what is now L540.

           Of the 5 tested among the 31 samples at steps 7 to 10 beyond L540:  2 are 33 and 3 are 34.

           My hypothesis:  There is a mutation within the CDYb chain.  Either a point mutation, or a foreign insertion, or a deletion (a deletion that is not a simple STR chain deletion of a motif, but a removal of only part of an STR motif).  It is known that a mutation within an STR that spoils the motif effectively splits that STR into two small STRs.  Smaller STRs have lower mutation rates.  A spoiler mutation near the middle of the main CDYb chain would turn that marker into a slower STR mutator.

           (By the way, 447 is known to have two such defects, so 447, which looks like a long STR, is really 3 short STRs, so 447 is not very rapid, and in fact 447 is a good marker, discussed above.)
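
           The splitting effect of a spoiler mutation can be pictured with a toy sequence.  The motif “GATA”, the chain length, and the position of the point mutation are all hypothetical;  this is not the real CDYb sequence, only a sketch of how one spoiled repeat turns one long chain into two shorter ones.

```python
import re

def motif_runs(sequence, motif):
    """Return the length, in repeats, of each uninterrupted run of the motif."""
    pattern = f"(?:{re.escape(motif)})+"
    return [len(match.group(0)) // len(motif)
            for match in re.finditer(pattern, sequence)]

MOTIF = "GATA"  # hypothetical motif, for illustration only

intact = MOTIF * 10                       # one chain of 10 repeats
spoiled = MOTIF * 4 + "GACA" + MOTIF * 5  # point mutation in the 5th repeat

print(motif_runs(intact, MOTIF))   # one run
print(motif_runs(spoiled, MOTIF))  # two shorter runs
```

           In the spoiled sequence the same stretch of DNA now reads as a run of 4 and a run of 5, which is the sense in which a spoiler mutation splits one STR into two smaller STRs.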

           CDY, also called DYS724, is a compound marker, so that means one of the pair often copies onto the other, providing equal values, CDYa = CDYb.  This is called recLOH;  my publication has references if you want to read more about it.  Or check Wiki, where there is an explanation that 459, 464, and CDY are all on the same “palindrome” P1, where P1 has two arms that are mirror copies.  Sometimes all three of these markers get the values copied from one arm of P1 onto the other arm.  A mutation at CDYb, making it unlike CDYa, would make copy mutations less common.

           Such seems to be the case, providing more evidence for a CDYb spoiler mutation.  There are no recLOH mutations in CDY or in 464 in the L540 data.  Of course there is not enough data yet to be compelling.  The 459 marker has both values = 9, so an recLOH would not be noticed there.

           It is not clear if the ancestral CDYb value is 33, or 34.  Both are common in the neighborhood.

           Model A:  More than a millennium ago, for a man in the L540 male line, not very long before or after the L540 mutation, a mutation destroyed the middle of the CDYb STR chain, turning CDYb into a unique marker, distinct from CDYa, and a much slower STR mutator than CDYa, and less likely to combine with CDYa in an recLOH event.  This rare mutation happened in a man who ended up with the equivalent net value of 33 at CDYb after the mutation.  A descendant of this man would be the most recent common ancestor (MRCA) for the clade corresponding to this mutation.  There were some normal STR mutations at CDYb in the descendants of that MRCA, and some of those mutated CDYb men were lucky enough to have male descendants living today, and one of them shows up in our C type data today with the 34 value.

           Although more data will add evidence to this model, I doubt the data will be good enough to determine if 33 or 34 was ancestral.  If 33, samples from old nodes might show up, which I doubt will be distinguished older or younger than that spoiler mutation, based on STR analysis.  If 34 is ancestral, I doubt the data will distinguish older branches from more recent back mutations.

           More models can be constructed along the lines of the models discussed in previous topics above.

           In a previous version of this web document I speculated that this CDYb mutation might define a new haplogroup.  I asked Thomas Krahn at FTDNA about sequencing my CDYb to prove there is an SNP in there.  Krahn explained that the P1 palindrome is very difficult to sequence with standard methods because the data is a mixture from the two arms of P1.  Krahn also pointed out that even if the SNP were proven it would not be accepted as a haplogroup division because an recLOH can still happen, and if CDYa gets copied onto CDYb that wipes out the SNP in the clade descending from that recLOH.

           My 2010 versions of this topic pointed out that the CDYb mutation is definitely older than C type, because the 33 value predominated in the very near neighborhood of C type.  That prediction has been validated.  The two L540+ from outside C type both have the 33 value.  With my new L540 definition, the 33 is not dominant beyond the L540 data, but of course a few of those that are there might end up in L540 when they are tested.

 

DYS636 = 12;  A New, Excellent Signature Marker for L540

           New topic 3 July 2011:

           DYS636 is not in the standard 67 STR marker set, but has been available this year as part of the extended 111 STR marker set.  I have been encouraging men in the L540 neighborhood to purchase the extension to 111, helping out with the cost where necessary.  I was hoping that there would be a slowly mutating marker among those extra 44 with a mutation unique to the L540 haplogroup.  Sure enough, DYS636 provides such a marker.

           Of the 64 samples (2 Jul 2011) in the E-M35 database with all 111 markers, 9 are L540, and all of them have DYS636 = 12.  The others are DYS636=11 with only two exceptions at DYS636=12, but those two both have many STR mutation differences from L540 and V13, and are not predicted to belong to the V13 parent, so those are obviously independent rare mutations.
           The two L540 samples that are not in C type have all 111 markers now, and indeed carry the DYS636=12 value (Gebert & Fredeen).

           The one nearest neighbor to C type that came out L540- has all 111 markers now, and indeed carries the ancestral DYS636=11 value (Butman).

           It is not a coincidence that the critical samples (all the samples from the 67 marker data near the L540 cutoff) have all 111 markers so soon - I recruited the data and paid for it as needed.

           13 of the 111 marker samples are confirmed V13 (in addition to the 9 from L540), plus a few more are predicted V13.  (2 Jul data - will increase quickly because there are several more with panels beyond 67 obviously in process for 111).

           Actually, my main motive for encouraging 111 markers was to better subdivide L540.  No luck yet.  There is no slowly mutating marker among those new 44 that obviously mutated during the history of L540, like 389II for C type.  Recall that DYS389II=32 is the best marker for C type, distinguishing C from the parent L540.  None of the new 44 does this.  With more data I might find a reasonable way to further subdivide C type;  there are several hints in the data on how to do this with combined markers, but none of them are compelling.

           Recall that DYS594=12 is also unique to L540, also with ancestral 11.  This pair of STRs provides a firm foundation from which to notice any new clusters in the L540 neighborhood as data accumulates.

 

P Cluster:  (385a,439,447,464c,445) = (16,13,26,17,11)

           Update 30 Sep 2011:

           C type includes “P cluster”, a Polish cluster defined by that 5 marker signature.  The cluster is not convincing, because there are only 3 samples, me and two others that I recruited.  I call it a cluster because I reserve the word type for clusters with statistical significance.

           DYS445 is one of the markers recently available in the extension to 111 markers.  The other 4 markers in the signature are part of the standard 25 marker set.  As luck would have it, there are no other markers that correlate with these in the 67 marker set, so only 25 markers are required to match a sample to P cluster within the L540 haplogroup, using 4 of those markers.

           Each of these markers individually is variable.  There are quite a few E1b1b1a1b(V13) samples that match all 4 of those out of 25.  There are no matches in the STR neighborhood of L540 and C type, although eventually a match should show up just due to the luck of random mutations.  Within L540 and C type, only one sample so far matches the signature at two of these markers, and 4 other samples match at only one.

           DYS385a=16 is ancestral to L540, so most samples in the neighborhood outside L540 match this signature.  This marker apparently mutated to 17 near the time of the origin of L540, then mutated back to 16 near the time of origin of C type.  Fredeen has the 16 value;  Gebert has the 17 value;  this is an inconclusive hint that Fredeen belongs to an older branch.  I say “inconclusive” because Fredeen also has the ancestral value at DYS444, but Gebert has the ancestral value at DYS406.  Both have numerous mutations so they are probably members of two separate old branches of L540.

 

The signature can only be used for L540+ samples, where the signature has no other matches in the data to date.  Fredeen matches at one of the two;  more such single matches are expected eventually with enough data, due to the luck of recent mutations.

           Because both values of the signature are ancestral, it is reasonable to wonder if cluster P corresponds to the oldest clade in the C type data.  However, Gebert has the mutated L540 values (17,25) for this pair;  Gebert is L540+ but does not match C type, so he seems to belong to a branch with node older than C.  It is not convincing to speculate that these two markers both mutated in C type close to the node for cluster P, and that Gebert by luck mutated to the same C type values in an older branch.  It is more convincing to assume that cluster P is not necessarily the oldest clade, and that these are the only 2 out of 111 STR markers where this young clade has unique markers, and by luck those 2 markers are both back mutations to the values ancestral to L540.  To add confusion, Fredeen, from another L540 branch older than C, has (16,25) for this pair, matching cluster P on the first but ancestral on the second;  it is reasonable to assume Fredeen has one independent mutation matching cluster P, because neither marker is slow;  both have variation of value in the neighborhood.

           Another unconvincing speculative model:  cluster P might be older than all the L540 data, with an independent 389II mutation of 2 steps matching C by coincidence.  This is unreasonable because cluster P matches C type at many markers;  the step count matches C even with 2 steps added.  Samples with older nodes than C each mismatch C at different markers, for higher step counts.

           Most C type men have German ancestry.  It makes sense that a C man moved to Poland and founded cluster P a few centuries ago.  Of course, future Polish samples need not match cluster P because another C type man may have also moved to Poland (or multiple male line ancestors may have each moved a short distance, diffusing into Poland).

 

L540 Neighborhood

           Update 21 Apr 2014:

           L540 is small enough that I can insert a complete table here, including neighbors just beyond in STR values.

           Those numbers are STR step, which is mutation count from the Modal Haplotype;  the columns are explained more in the notes below the table.

           Violet numbers are L “paratype” (L540 samples not in C type), where step less than the cutoff means predicted members of the L540 haplogroup.

           Boldface means confirmed:

                       + vs --- means confirmed positive vs negative by the L540 test.

                       L241 means positive for another haplogroup, implying negative for L540.

                       There are many more negative L540 results from outside this neighborhood (higher step).

           Red step numbers are predicted C type, a predicted branch of the L540 haplogroup.  Boldface means 80% or higher confidence that a future SNP will be discovered, confirming these samples in a future haplogroup branch.  C? means predicted C type at less than 80% confidence, based on the STR step number in boldface.

           Black step numbers are greater than the cutoff.  Without SNP testing, even with high step number, there is a low probability that a sample might be an outlier member of L540. 

           Data sources:  e = E-M35 project, h = Haplozone,  y = Ysearch

 

 

 

 

 

 

 

 

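| Kit | Ysearch | L540 | Ancestor | Origin | Data | Type | Markers | C90 (111) | C111 (111) | L77 (111) | L3 (111) | C42 (67) | C67 (67) | L45 (67) | C4 (67) | LnotC3 (67) | C15 (37) | C37 (37) | C12 (25) | C25 (25) | C3 (25) | C12 (12) | Note |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | Cutoff > | 7 | 16 | 8 | 2 | 3 | 12 | 6 | 2 | 1 | 6 | 8 | 2 | 3 | 2 | 1 | |
| N16800 | KFKGM | +(WTY) | Gwozdz | Poland | ehy | C | 111 | 3 | 9 | 3 | 0 | 0 | 7 | 1 | 1 | 5 | 3 | 5 | 1 | 4 | 1 | 2 | |
| N45041 | UQR4B | + | Hochreiter | Germany | ehy | C | 111 | 4 | 7 | 2 | 0 | 0 | 4 | 2 | 0 | 4 | 4 | 4 | 0 | 3 | 0 | 1 | |
| 155155 | | + | Svercl | Czech | eh | C | 111 | 4 | 15 | 5 | 0 | 1 | 8 | 1 | 0 | 5 | 1 | 5 | 0 | 2 | 0 | 2 | |
| N81304 | | (+) | Gwozdz | Poland | eh | C | 111 | 4 | 11 | 4 | 0 | 1 | 9 | 1 | 1 | 6 | 4 | 7 | 1 | 5 | 1 | 3 | 1 |
| 140927 | 9JM9U | + | Donovan | Prussia | ehy | C | 111 | 4 | 11 | 3 | 0 | 1 | 6 | 2 | 1 | 3 | 2 | 3 | 1 | 1 | 1 | 1 | |
| 51282 | | + | Wion | Germany | eh | C | 111 | 6 | 13 | 1 | 0 | 2 | 10 | 3 | 0 | 4 | 5 | 7 | 1 | 4 | 0 | 1 | |
| 199446 | TK98K | + | Kargol | Poland | ehy | C | 111 | 6 | 10 | 6 | 1 | 1 | 6 | 2 | 1 | 5 | 2 | 4 | 1 | 4 | 1 | 2 | |
| 225596 | 6S4J6 | + | Nowak | Poland | ehy | C | 111 | 6 | 12 | 5 | 0 | 1 | 9 | 4 | 0 | 4 | 2 | 5 | 0 | 0 | 0 | 0 | |
| 166692 | 8FTXT | + | Gebert | Germany | ehy | L | 111 | 13 | 17 | 7 | 0 | 6 | 10 | 5 | 2 | 0 | 2 | 6 | 2 | 4 | 2 | 3 | |
| 162917 | | + | Fredeen | Sweden | eh | L | 111 | 17 | 23 | 6 | 0 | 8 | 18 | 5 | 2 | 0 | 10 | 13 | 3 | 6 | 2 | 4 | |
| N91348 | | --- | Butman | England | e | X | 111 | 17 | 20 | 15 | 3 | 6 | 11 | 8 | 3 | 2 | 4 | 6 | 2 | 2 | 2 | 2 | |
| N39989 | 5N5MF | --- | Hohnloser | Germany | ehy | X | 111 | 20 | 28 | 20 | 4 | 11 | 18 | 12 | 3 | 2 | 7 | 9 | 2 | 3 | 2 | 3 | |
| 5960 | V93B3 | | Bartlett | England | ehy | X | 111 | 20 | 27 | 18 | 3 | 9 | 16 | 11 | 3 | 2 | 7 | 11 | 3 | 6 | 2 | 4 | |
| N58717 | CV7WB | | Bartlett | Unknown | ehy | X | 111 | 21 | 29 | 19 | 3 | 9 | 18 | 12 | 3 | 2 | 8 | 13 | 3 | 8 | 2 | 5 | |
| 98212 | 98212 | L241 | Baber | England | ehy | X | 111 | 22 | 33 | 21 | 3 | 10 | 22 | 11 | 5 | 3 | 11 | 15 | 5 | 9 | 4 | 5 | |
| 105741 | 3FVPX | | Malay | Slovakia | ehy | X | 111 | 22 | 32 | 20 | 4 | 9 | 20 | 15 | 5 | 1 | 11 | 16 | 4 | 8 | 3 | 4 | |
| | | | 5 samples | | e | X | 111 | 23 | | | | | | | | | | | | | | | |
| | | --- | 5 samples | | e | X | 111 | 24 | | | | | | | | | | | | | | | |
| | | | 3 more | | e | X | 111 | 24 | | | | | | | | | | | | | | | |
| | | | 2 more | | e | X | 111 | | | 19 | | | | | | | | | | | | | |
| | | | V13 Modal | | e | X | 111 | 23 | 25 | 21 | 3 | 9 | 17 | 14 | 6 | 3 | 8 | 12 | 4 | 7 | 4 | 4 | |
| | | | L241 Modal | | e | X | 111 | 25 | 33 | 25 | 3 | 10 | 24 | 14 | 5 | 3 | 13 | 17 | 4 | 8 | 4 | 5 | |
| | | | L143 Modal | | e | X | 111 | 29 | 34 | 25 | 3 | 10 | 20 | 15 | 5 | 2 | 11 | 15 | 4 | 7 | 4 | 4 | |
| 171456 | 79QF7 | | Glasser | Germany | ehy | C | 67 | | | | | 0 | 3 | 1 | 0 | 4 | 1 | 3 | 0 | 1 | 0 | 0 | |
| 320415 | | | Micek | | e | C | 67 | | | | | 0 | 4 | 1 | 0 | 4 | 1 | 4 | 0 | 2 | 0 | 1 | |
| 229581 | | | Zinin | Unknown | eh | C | 67 | | | | | 1 | 4 | 2 | 1 | 3 | 1 | 3 | 1 | 2 | 1 | 2 | |
| 262750 | | + | Svercel | Slovakia | eh | C | 67 | | | | | 2 | 8 | 2 | 0 | 4 | 1 | 4 | 0 | 1 | 0 | 1 | |
| 175213 | 5XP46 | | Burlik Stelz | Germany | ey | C | 67 | | | | | 2 | 7 | 2 | 0 | 4 | 2 | 5 | 1 | 2 | 0 | 0 | |
| 243901 | FSQXZ | | Stubblefield | Unknown | ehy | C | 67 | | | | | 2 | 10 | 2 | 0 | 3 | 5 | 9 | 1 | 6 | 0 | 2 | |
| E10751 | | | Schulz | Germany | | C | 67 | | | | | 2 | 7 | 3 | 2 | 5 | 5 | 7 | 2 | 5 | 2 | 4 | 3 |
| 6104 | 4HJ3D | | Boyd | Unknown | ehy | C | 67 | | | | | 2 | 10 | 3 | 0 | 4 | 4 | 9 | 0 | 1 | 0 | 0 | |
| 207878 | | | Frind | Germany | eh | C | 67 | | | | | 2 | 10 | 3 | 0 | 4 | 4 | 7 | 1 | 4 | 0 | 2 | |
| 70482 | 6HMRD | + | Ostholm | Sweden | ehy | C | 67 | | | | | 2 | 10 | 4 | 1 | 4 | 4 | 6 | 1 | 2 | 1 | 1 | |
| 226416 | | + | Sabieka | Belarus | eh | C | 67 | | | | | 2 | 11 | 5 | 2 | 5 | 7 | 10 | 3 | 7 | 2 | 5 | |
| 174240 | | | | Unknown | | C | 67 | | | | | 2 | 3 | 2 | 1 | 3 | 1 | 2 | 1 | 1 | 1 | 1 | 3 |
| | WHFQB | | Froetscher | Germany | y | C? | 67 | | | | | 3 | 15 | 6 | 1 | 3 | 8 | 12 | 1 | 4 | 1 | 2 | |
| 97005 | CBF87 | | Strejc | Austria | ehy | X? | 67 | | | | | 6 | 16 | 9 | 4 | 3 | 8 | 11 | 3 | 8 | 3 | 5 | |
| 310951 | | | Petrov | Russia | e | X? | 67 | | | | | 6 | 18 | 9 | 3 | 2 | 9 | 13 | 2 | 7 | 2 | 5 | |
| E7459 | 8K6VZ | | Casado | Croatia | ehy | X? | 67 | | | | | 6 | 17 | 11 | 5 | 2 | 9 | 14 | 5 | 9 | 4 | 6 | |
| E8272 | | L241 | Abdurrah | Kosovo | eh | X | 67 | | | | | 7 | 15 | 9 | 4 | 2 | 8 | 10 | 4 | 7 | 4 | 4 | |
| 25780 | 5DQ2B | | Wilson | England | ehy | X | 67 | | | | | 7 | 20 | 12 | 3 | 2 | 9 | 14 | 3 | 7 | 2 | 2 | |
| 143479 | | | Mastel | United K | eh | X | 67 | | | | | 7 | 17 | 10 | 4 | 3 | 8 | 12 | 4 | 8 | 4 | 3 | |
| 24437 | | | Harvison | Scotland | h | X | 67 | | | | | 7 | 23 | 16 | 3 | 3 | 14 | 17 | 2 | 6 | 2 | 3 | |
| 199300 | EJ4B6 | | McKrell | Unknown | ehy | X | 67 | | | | | 7 | 18 | 10 | 3 | 2 | 11 | 15 | 3 | 8 | 3 | 5 | |
| | | | 4 more | | y | X | 67 | | | | | 7 | | | | | | | | | | | |
| | | | 27 samples | | ey | X | 67 | | | | | 8 | | | | | | | | | | | |
| N109412 | BYHHR | | Howe | Unknown | eh | C? | 37 | | | | | | | | | | 1 | 3 | 0 | 2 | 0 | 0 | |
| 158091 | QHU8Y | + | Kline | Germany | eh | C | 37 | | | | | | | | | | 2 | 5 | 1 | 2 | 1 | 2 | |
| B3807 | | | Stavbom | Sweden | eh | C? | 37 | | | | | | | | | | 4 | 9 | 1 | 5 | 0 | 4 | |
| 141863 | W5JHS | | Pohl | Germany | eh | C? | 37 | | | | | | | | | | 5 | 8 | 1 | 3 | 1 | 3 | |
| B2670 | X2JH9 | | Bogdanski | Germany | e | C? | 37 | | | | | | | | | | 5 | 7 | 2 | 5 | 2 | 2 | |
| N106293 | GJNU6 | | Beasley | USA | e | X? | 37 | | | | | | | | | | 6 | 14 | 7 | 12 | 1 | 8 | |
| 275510 | 3K5CF | | Roider | Germany | e | C? | 37 | | | | | | | | | | 7 | 9 | 0 | 4 | 0 | 1 | |
| | | | 8 more | | e | X? | 37 | | | | | | | | | | 7 | | | | | | |
| 177898 | B6CUR | | Miller | Germany | e | L? | 37 | | | | | | | | | | 8 | 13 | 5 | 8 | 4 | 5 | |
| | | | 7 more | | e | | 37 | | | | | | | | | | 7 | | | | | | |
| | Q8JRJ | | Spooner | USA | y | C? | 37 | | | | | | | | | | 1 | 3 | 0 | 1 | 0 | 0 | |
| | 9P4Z5 | | Sager | Germany | y | C? | 37 | | | | | | | | | | 2 | 5 | 1 | 1 | 1 | 1 | |
| | 2N3UM | | Oppitz | Germany | y | C? | 37 | | | | | | | | | | 3 | 7 | 1 | 4 | 1 | 2 | |
| | EDS4E | | Haenicke | Germany | y | C? | 37 | | | | | | | | | | 3 | 5 | 1 | 2 | 1 | 1 | |
| | V6X4V | | Fitze | Germany | y | C? | 37 | | | | | | | | | | 3 | 6 | 0 | 1 | 0 | 0 | |
| | 3K4Y2 | | Lintner | Germany | y | C? | 37 | | | | | | | | | | 4 | 8 | 0 | 4 | 0 | 1 | |
| | 4Q933 | | Kephart | USA | y | C? | 37 | | | | | | | | | | 4 | 6 | 2 | 3 | 2 | 2 | |
| | A9FVE | | Weiand | Germany | y | C? | 37 | | | | | | | | | | 4 | 6 | 1 | 4 | 0 | 1 | |
| | WME5S | | Cervenka | Hungary | y | C? | 37 | | | | | | | | | | 5 | 10 | 1 | 6 | 0 | 2 | |
| | UF6K3 | | Spatz | Poland | y | X? | 37 | | | | | | | | | | 6 | 9 | 3 | 6 | 3 | 4 | |
| | | | More | | y | X? | 37 | | | | | | | | | | 7 | | 0 | | | | |
| S10193 | | | Engel | Germany | h | C? | 34 | | | | | | | | | | | | 0 | 1 | 0 | 1 | |
| S10194 | | | Kochtitizky | Hungary | h | C? | 34 | | | | | | | | | | | | 0 | 3 | 0 | 1 | |
| A10196451 | | | Stavbom | Sweden | h | C? | 34 | | | | | | | | | | | | 1 | 6 | 0 | 4 | |
| A2983 | | | Undisclosed | Austria | h | C? | 34 | | | | | | | | | | | | 1 | 4 | 1 | 1 | |
| | | | 22 more | | eh | X? | 25 | | | | | | | | | | | | 2 | | | | |
| N26163 | R38X2 | | Fritsch | Czech | ehy | C? | 12 | | | | | | | | | | | | | | | 0 | 2 |
| N39377 | | | Obendorf | Germany | eh | C? | 12 | | | | | | | | | | | | | | | 0 | 2 |
| N57225 | XKCE3 | | Livingston | Germany | ehy | C? | 12 | | | | | | | | | | | | | | | 0 | 2 |
| 284871 | | | Knotz | Austria | eh | C? | 12 | | | | | | | | | | | | | | | 0 | 2 |
| | Ysearch | | 6 more | | y | C? | 12 | | | | | | | | | | | | | | | 0 | 2 |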

 

Summary:  See the News topic at the top of this web page for a summary of this table.

 

Explanation of the modal haplotype columns in the table; update 29 Mar 2014:

           C90(111) is my new best modal haplotype definition for prediction of C type, a hypothetical branch of L540, using 90 of the 111 standard STR markers.  The cutoff is 7;  notice that there are no samples in the gap at steps 7 through 12.  No doubt future samples will show up in the gap, at which time the cutoff and the definition may need slight adjustment.  My analysis file http://www.gwozdz.org/C111Type.xls is available if you are interested in the details.

           C111 is the modal haplotype using the full 111 STR set.  It just barely works, with a gap of step one at the cutoff value of 16.

           L77(111) is my new best modal haplotype definition for prediction of L540 type, using 77 of the 111 standard STR markers.  L540 type is predicted equivalent to the haplogroup L540.  The cutoff for L540 type is 8;  notice that there are no samples in the gap at steps 8 through 14.  No doubt future samples will show up in the gap, at which time the cutoff and the definition may need slight adjustment.  My analysis file http://www.gwozdz.org/L111Type.xls is available if you are interested in the details.  Notice that two samples in the table meet L77 but not C90, so those two are predicted to belong to older branches of the Y-DNA tree - older than the node for C type.  C type samples do not differ much in STR values, which means the C type clade is quite young.  Those two L type samples that are not C type differ a lot more in STR values, meaning their nodes are probably older.  Future data may provide more such samples, at which time the L type definition will likely need adjustment.  In this respect, L type predictions are not as reliable as C type predictions;  there may be future samples with L77 step greater than 15 that turn out L540+.  Such a probability is low for each particular sample, so I used boldface “X” for samples in the table with step greater than 15, even without the L540 test, because each particular sample has greater than 80% confidence, in my estimation, of not belonging to L540.  The probability might be 1% or 20% that any one sample at L77 steps 15 to 20 might be an L540 outlier;  I do not know how to estimate such probability, except that I’m sure such probability is not very high and decreases with increasing step.
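
           The gap described above can be checked mechanically.  The following is only a sketch, not the method used in the analysis files:  it takes the C90 and L77 step values of the 16 named 111 marker samples in the table (Gwozdz N16800 through Malay) and looks for the widest run of step values that no sample occupies.  The function name widest_gap is my own label for this illustration.

```python
# Step values copied from the L540 Neighborhood table, 111-marker rows
# (Gwozdz N16800 through Malay), in table order.
c90_steps = [3, 4, 4, 4, 4, 6, 6, 6, 13, 17, 17, 20, 20, 21, 22, 22]
l77_steps = [3, 2, 5, 4, 3, 1, 6, 5, 7, 6, 15, 20, 18, 19, 21, 20]


def widest_gap(steps):
    """Return (low, high), the widest inclusive run of step values with no sample.

    Returns None when consecutive observed values leave no empty step.
    """
    values = sorted(set(steps))
    best_width, best = 0, None
    for a, b in zip(values, values[1:]):
        width = b - a - 1  # number of empty step values strictly between a and b
        if width > best_width:
            best_width, best = width, (a + 1, b - 1)
    return best


for name, steps in (("C90", c90_steps), ("L77", l77_steps)):
    low, high = widest_gap(steps)
    # The cutoff is taken here as the first empty step of the widest gap.
    print(f"{name}: empty steps {low} through {high}, cutoff {low}")
```

           With these values the sketch reproduces the gaps quoted in the text:  steps 7 through 12 for C90 and steps 8 through 14 for L77.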

           L3(111) is the L540 STR signature for L540 type, using only 3 markers, and it nicely selects only the confirmed L540+ samples.  L3 is (594,636,561) = (12,12,17).  That file L111Type.xls shows that any number of markers from 3 to 105 can be used to distinguish the L540 samples, but this will surely change in the future as L540 samples with unusual STR mutations show up in the database.  Actually, any one of those three L3 markers is sufficient to identify the L540 samples in this L540 Neighborhood table, but there are a few samples outside the V13 father haplogroup that match at any one of the three due to the large number of samples in the full E-M35 database.  Notice that the L3 signature markers are not among the standard 37 set.  The signature markers that strongly distinguish C type from the rest of L540 are all in the standard 67 set (see C4(67) below);  the data so far do not suggest more signature markers in the 111 set for C type, although a few are useful in the C90 definition discussed above.
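
           As an illustration of how a signature is applied, here is a minimal sketch that scores a sample against L3.  The signature values and the cutoff of 2 come from the text and the table;  the step rule (sum of absolute repeat differences) is an assumption for this example and may differ in detail from the Ysearch and Haplozone methods.  The two sample haplotypes are hypothetical.

```python
# L3 signature from the text; cutoff from the "Cutoff >" row of the table.
L3_SIGNATURE = {"DYS594": 12, "DYS636": 12, "DYS561": 17}
L3_CUTOFF = 2  # a step below the cutoff means predicted L540


def signature_step(sample, signature):
    """Step count under an ASSUMED rule: sum of absolute repeat differences."""
    return sum(abs(sample[marker] - value) for marker, value in signature.items())


def predicted_l540(sample):
    """True when the sample's L3 step is below the cutoff."""
    return signature_step(sample, L3_SIGNATURE) < L3_CUTOFF


# Hypothetical haplotypes, for illustration only (not real kits).
matching = {"DYS594": 12, "DYS636": 12, "DYS561": 17}
neighbor = {"DYS594": 11, "DYS636": 11, "DYS561": 16}

for name, sample in (("matching", matching), ("neighbor", neighbor)):
    print(name, signature_step(sample, L3_SIGNATURE), predicted_l540(sample))
```

           The hypothetical neighbor carries the ancestral 11 at both DYS594 and DYS636, so it lands at step 3, outside the cutoff.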

           L111, the modal haplotype using the full 111 STR set, is not in the table, because L111 differs from C111 at only two STR markers, which are very variable, so for most samples L111 differs from C111 by step < 2. L77 step numbers differ from C90, because the same markers are not used;  C90 markers are selected to best distinguish C type, while L77 markers are selected to best predict L540+.

           Nearest Neighbors at 111 markers:  The table includes a few samples just beyond the C90 cutoff of step 7.  These are all the samples I could find with C90 step less than 23, or with L77 step less than 19.  These “near neighbors” act as a calibration of the modal haplotypes using 67 or fewer markers - data on the right side of the table at 111 markers.  I hope the meanings of those code names are obvious in light of the 5 examples explained above.  For more discussion, see the topics C67Type.xls, L67Type.xls, C37Type.xls, C25, and C12.  Notice that the definitions C42(67) and L45(67) work, but without a wide gap.  As expected, 67 STR markers provide less confidence in assignments than 111 markers.  At 37 or fewer markers there is yet less confidence, and the definitions fail for a few samples.

           LnotC3(67) is a signature that seems to predict L “paratype” samples (L540 not C type):  (439, Δ389, 413b) = (11, 17, 25).  My confidence is not high in this signature, because it is based on only two samples, which might be just the luck of random mutations.

           C4(67) is the signature used by Haplozone cluster C since before L540 was discovered:  (390, Δ389, 447, 594) = (25, 19, 25, 12).

           C3(25) is the original Friedman signature, proposed years ago.  For this table I used the difference for DYS389: (390, Δ389, 447) = (25, 19, 25).  The table shows that it still works remarkably well.  However, there is selection bias, because some samples at C3 step 1 were not included in the table because so far all these have 37 or more markers and do not fit C type using the corresponding definition in the table;  some of these belong to V13, the father haplogroup of L540.  At C3 step 0 there are only a few more samples in the database but these are from outside V13 so these are not listed in the table.  Note that Sabieka is L540+ at C3 step 2, and a few others at step 2 are predicted C type based on more markers.  On the other hand, the table has at C3 step 2 two samples L540- and a few more predicted well outside C type.  The table indicates 22 more samples with only 25 markers at C3 step 2;  no doubt a few of these might eventually test L540+, but the probability for each one individually seems to be low.
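
           The Δ389 entry in these signatures is a derived value.  The sketch below assumes the usual convention that Δ389 is DYS389II minus DYS389I, which keeps a mutation in the DYS389I portion from being counted twice;  the helper names and the two haplotypes are hypothetical, and the step rule (sum of absolute differences) is again an assumption.

```python
# C3(25) signature from the text, with DYS389 expressed as the difference.
C3_SIGNATURE = {"DYS390": 25, "delta389": 19, "DYS447": 25}
C3_CUTOFF = 2  # from the "Cutoff >" row of the table


def with_delta389(sample):
    """Return a copy of the sample with delta389 = DYS389II - DYS389I added."""
    derived = dict(sample)
    derived["delta389"] = sample["DYS389II"] - sample["DYS389I"]
    return derived


def c3_step(sample):
    """Step against C3 under an ASSUMED rule: sum of absolute differences."""
    derived = with_delta389(sample)
    return sum(abs(derived[m] - v) for m, v in C3_SIGNATURE.items())


# Hypothetical haplotypes, for illustration only (not real kits).
c_like = {"DYS390": 25, "DYS389I": 13, "DYS389II": 32, "DYS447": 25}
older = {"DYS390": 25, "DYS389I": 13, "DYS389II": 30, "DYS447": 25}

print(c3_step(c_like), c3_step(c_like) < C3_CUTOFF)
print(c3_step(older), c3_step(older) < C3_CUTOFF)
```

           In this example the second haplotype differs only by DYS389II = 30 instead of 32, which gives Δ389 = 17 and puts it at C3 step 2, at the cutoff rather than below it.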

 

Explanation:  The simplest explanation for what this table means: Butman (N91348) has a male line with node in the Y-DNA tree slightly older than the L540 mutation, so Butman is the closest “neighbor” to L540 but not in the L540 haplogroup.  Gebert (166692) and Fredeen (162917) each have male lines with nodes in the Y-DNA tree younger than the L540 mutation.  The node for C type is much younger.  More complex explanations are possible;  for more discussion see the Structure topic.

 

Notes;  column in the table at the far right:

           1:  Sample from a 3rd cousin of an L540+ sample, so assumed L540+ without testing.

           2:  The 12 marker set at step zero (perfect match to the 12 marker modal haplotype) provides a low confidence prediction of which samples might benefit from the L540 test.

           3:  Two samples, E10751 and 174240, are not in the E-M35 Project;  these two were brought to my attention by Paul Svercl (in the table), who noticed them in an E haplogroup tree by Marko Heinila, but that tree is no longer on-line.

           I use the Ysearch method for calculating step, which gives a result slightly different than the Haplozone method.  There is also a one marker discrepancy mentioned in the Ysearch topic.

           If you are a neighbor and wish to be added to this table, please let me know.

 

Gwozdz

           My sample is kit N16800.  N81304 is my 3rd cousin Gwozdz.

 

Kargul

 

           Aloysius Kargul (Kargol) is my closest STR match available on the web.  Kit 199446.  In May 2010, his daughter noticed, on ancestry.com, that he and I are perfect matches at 12 markers.  I studied the LDS microfilms and located his 1820’s Kargul ancestor living in a village in Poland only 20 miles away from the village of my Gwozdz ancestor.  I paid for his FTDNA sample.  Kargul is in the table above.  His L540 test came out positive, placing him in that new haplogroup.  We are 5 steps apart at 67 markers;  9 at 111 (4 Jul 2011 update).

           For estimating the size of C type or L540, my cousin and Kargul should not be included, because I recruited them, paying for their tests.  Family sets such as these distort size estimates, compared to random data.  In other words, C type really has only 16 samples, not 18 (on 4 July 2011) if compared to other clusters (which should also be adjusted for family sets).

 

Butman

 

           New topic 13 May 2011.  Update 17 Jul 2011:  Butman’s L540 SNP test just came out negative;  that means he is not a member of the new L540 haplogroup.

           Raymond Butman, kit N91348, is right on the edge of the predicted C type using the old 61 marker definition.  This sample is a recent addition to the M35 database.  His step using the 61 marker definition is 8.  My cutoff for the definition before this sample showed up was 7 because of the gap - no samples from step 7 to 12 at that time.  When I wrote this topic after Butman’s sample showed up, I changed my cutoff to 9 and I pointed out here that this sample might land in V13C (L540), but it was a close call, not a confident prediction.

           Although this sample matches most of the markers of the definition, it misses at the two best signature markers, DYS389-2 and DYS594.

           I subsequently developed a new L540 definition that excludes Butman.

           What does this mean?  The simplest explanation:  Butman’s node in the Y-DNA tree (his male line common ancestor - branch joint) seems slightly older than the nodes of L540 members (in the database so far today).  His node is older than the L540 mutation, and also seems older than the DYS389-2 mutation, and also older than the DYS594 mutation.  Most of his other STR values match the L540 definition because his node is not much older, so there has not been much time for more mutations.

           This simplest explanation is a good statistical prediction, not a proof.  Other less likely explanations are possible.  For example Butman might be an outlier from a clade very distant from L540, where most men have different STR values, where he might have many matching STR values to L540 due to the luck of random mutations in his male line.  Yet another possible explanation:  Butman might belong to a very small clade with a much older node with L540, even older than the node for the men at steps 7 and 8, but the ancestor at that node might just have happened to have STR values very close to the values for the L540 ancestor, due to the luck of random mutations, and now Butman is the only sample available from that very small clade.

 

Gebert

           Update 30 Sep 2011:

           I noticed the Gebert sample on Ysearch and encouraged him to join the E-M35 project, which he did, kit 166692 in the table above.  I helped pay for the orders for the L540 test and for the 111 extension.  This data is important because his STR values place him near the predicted cutoff of C type.

 

Fredeen

           Update 30 Sep 2011:

           Kit 162917, Fredeen, came up L540+ in May 2011.  This was significant because this was the first L540+ sample outside C type, later joined by Gebert.

           Logically, Fredeen and Gebert might be outliers from cluster C, with back mutations in 389II and multiple other mutations just due to the luck of random mutations.  This is very unlikely.

           These two samples most likely represent two old branches of L540, with older nodes than the C type node.  They do not match each other well at 67 markers.  Their closest matches at 67 markers are each other and C type samples.  There are no close STR matches to either in the 67 marker data.  Any future close STR matches would of course be predicted L540+ in the corresponding branch.

           Three markers are of interest here:  (385a,406,444) = (17,11,13) are the values for L540 and C type.  Most samples in the STR neighborhood just outside L540 have the ancestral values (16,12,12).  Fredeen is ancestral for two of these, (16,11,12), while Gebert is ancestral for one of them, (17,12,13).  This is evidence that Fredeen’s node is older, but the evidence is statistically unconvincing.  We realize that these 3 markers may have experienced mutations after the two nodes of interest.  Both Fredeen and Gebert have 111 marker data, which does not help out for this question.

 

Hohnloser

 

           Hohnloser (kit N39989) fell into C type at 37 markers in 2010 (marginally), but not at 67 markers (not particularly close).  He is not a member of cluster C because his sample does not match the Friedman signature (originally at 25 markers).  He provides an interesting example of how statistics works - in this case, due to the luck of random STR mutations, the sample is close to C type only at 37 markers.

           Hohnloser does not belong to the L540 haplogroup because his SNP test came out negative.

           Hohnloser has extensive family tree research results.  He administers a Hohnloser project at FTDNA.  He exchanged helpful email discussions with me.

 

Structure of the L540 Haplogroup

           Complete rewrite 16 Jul 2011.

           C type is a hypothetical haplogroup within L540.  The evidence is presented throughout this web document, particularly in topics about my 67 marker analysis files and about DYS389II.  My estimate is about 99% confidence that my C type definition corresponds to a clade that will be proven to be a haplogroup by a newly discovered SNP - someday when tests for new SNPs are more comprehensive and lower cost - continuation of the current trend.  My estimate is that 98% (80% confidence range more than 90%) of the samples predicted C type by my definition will end up in such a haplogroup.

           This confidence is based on a combination of statistical calculation, plus judgment where calculation is not possible;  for more discussion see my confidence topic.

           C type seems to be about 80% of L540, based on only 2 L540+ not C type, compared to 7 independent C type samples, for 77.8%, in the 67 marker data.  Confidence in this 80% is not high because it is based on only two samples.  There may be more samples in the STR neighborhood that will test out L540+ in the future, with STR values quite different than my current L540 definition.  Also, there may be L540- samples in the future that match my current L540 definition.
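
           The 77.8% figure is simply 7 out of 9.  To show how loosely such a small count pins down the true fraction, the sketch below adds a Wilson score interval at 80% confidence.  This interval is a standard textbook formula that I am substituting for illustration;  it is not the calculation behind the confidence estimates in this document.

```python
from math import sqrt

c_type = 7       # independent C type samples in the 67 marker data
not_c_type = 2   # L540+ samples outside C type (Gebert and Fredeen)
n = c_type + not_c_type

fraction = c_type / n
print(f"C type share of L540: {100 * fraction:.1f}%")


def wilson_interval(successes, trials, z=1.2816):
    """Wilson score interval; z = 1.2816 corresponds to 80% two-sided confidence."""
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


low, high = wilson_interval(c_type, n)
print(f"illustrative 80% range: {100 * low:.0f}% to {100 * high:.0f}%")
```

           With only 9 samples the range is wide, which is consistent with the low confidence stated above.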

           In other words, I have high confidence that C type predictions are L540+, but not high confidence in the L540 predictions outside C type.

           I expect to update my L540 definition as data accumulates, thereby improving the confidence for L540 STR predictions outside C.

           For now, I recommend the L540 test to everyone in the neighborhood table, with particular emphasis on those with closest step to my current L540 definition, and with the caveat that there may be more L540 beyond that table.

           Cluster P probably corresponds to a small Polish clade within C type.  It may take some time to find an SNP to define such a small haplogroup.

           C type is close to the same as a clade defined by a particular mutation(s) at the DYS389II marker, from value 30 to 32.  We can even speculate that clade is identical to C type data.

           A particular mutation at the DYS594 marker, from 11 to 12, seems to define a “father” clade that is slightly older than L540.  The same might be true for a particular mutation at the DYS636 marker, although more 111 marker data is needed to estimate if 636 is younger or older than 594 and / or L540.  With more data, these two markers will provide a foundation from which the age of other mutations can be estimated, because independent mutations in other markers are unlikely to also have mutations in these two foundation markers.

           The L540 data is bimodal in a number of markers.  Each of these bimodal markers is evidence that there is a significant subclade division that might be determined for L540 structure.  However, these various suggestions point to different divisions;  so far no two of them are strongly correlated.  With more data, it may be possible to split L540 based on statistical correlation of STR markers, using my mountain method.

 

Call for 111 Marker Data

           Update 30 Sep 2011:

           FTDNA provides a 67 marker standard set of STR markers.  I have been using this 67 set for analysis.

           A standard set of 111 markers is now available.  For existing samples with 67 or fewer, an extension to 111 can be purchased from FTDNA.

           I am hopeful that 111 marker data will enable me to construct a high confidence family tree for the L540 haplogroup, as more data accumulates.  Indeed, the marker DYS636, one of the 111 extension markers, has already provided an additional signature marker for L540.

           Additional markers make it more likely to subdivide haplogroups and types with confidence.  Indeed, marker DYS445, one of the 111 extension markers, has already provided an additional signature marker for the P cluster division of C type.

           If you are in my neighborhood table above, please consider ordering the FTDNA panels of additional markers.  Some of us have already ordered, as indicated in the table.

           It is helpful for me to include the neighborhood just beyond L540 in this request, for better determination of ancestral STR values.  In addition, the marker CDYb may provide a definition for a “father” clade of L540 if additional markers correlate with CDYb.

           The current 67 marker data includes a few L540+ samples where I paid for the extra data, but it also includes one sample that was in the neighborhood table but has been removed because I paid for the extra data, which showed it is not an L540 neighbor.  So my help is causing a bias in that table above, but not a significant bias yet in my Excel analysis files.

 

Ysearch

 

           Update 17 Jul 2011:

           479H7 is a direct link to my modal haplotype for L540.

           QAZ7P is a direct link to my modal haplotype for C type.

           If you are not listed in the table above you can compare your data on Ysearch.  You can compare your step genetic distance to these modal haplotypes if you have the standard 12, 25, 37, or 67 markers.  The comparison may not work if you have a non standard marker set.  The cutoff for each marker set is given in the legend in the table above.
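
           As a rough illustration of what a step genetic distance comparison involves, here is a minimal Python sketch.  It assumes simple stepwise counting (the sum of absolute differences over the markers that both haplotypes share);  Ysearch may count some markers differently.  The marker values and the cutoff are hypothetical placeholders, not the real 479H7 or QAZ7P values and not the cutoffs from the table legend.

```python
# Sketch of a step genetic distance comparison against a modal haplotype.
# Assumption: plain stepwise counting over shared markers only.
# All values below are hypothetical, for illustration only.

def step_distance(sample, modal):
    """Return (distance, markers_compared) for the markers present in both."""
    shared = [marker for marker in modal if marker in sample]
    distance = sum(abs(sample[m] - modal[m]) for m in shared)
    return distance, len(shared)

# Hypothetical modal haplotype (NOT the actual 479H7 or QAZ7P data).
hypothetical_modal = {
    "DYS393": 13, "DYS390": 24, "DYS19": 13,
    "DYS391": 10, "DYS385a": 16, "DYS385b": 18,
}

# Hypothetical sample that differs by one step at two markers.
hypothetical_sample = {
    "DYS393": 13, "DYS390": 25, "DYS19": 13,
    "DYS391": 10, "DYS385a": 16, "DYS385b": 17,
}

HYPOTHETICAL_CUTOFF = 3  # placeholder; use the cutoff for your marker set

distance, compared = step_distance(hypothetical_sample, hypothetical_modal)
within_cutoff = distance <= HYPOTHETICAL_CUTOFF
print(f"step distance {distance} over {compared} markers; "
      f"within cutoff: {within_cutoff}")
```

           Comparing only the shared markers is one way to handle a sample with fewer markers than the modal, but the result is then only meaningful against the cutoff for that smaller marker set.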

           Brief description of Ysearch.  Link to the site home:  http://www.ysearch.org.

           To join Ysearch, click on the Create A New User tab, where you can upload your Y-DNA STR data from a number of testing services.  Or, you can type in your data.  You end up with a “User ID”.

 

           Instructions for comparison to V13C at Ysearch:

           Click here:  Research Tools (or click on the tab with that name)

           Copy the following line into the “UserIDs” bar at the Research Tools page:

                                  USEID, 479H7, QAZ7P

           Change USEID to your User ID.

           You need to type the Captcha puzzle for access.

           Click on “Show genetic distance report” to see your step genetic distance from C type and from L540.

 

Ancestry.com

 

           www.Ancestry.com  is the web page for a commercial DNA testing company.  Men with Y-DNA test results can choose to make results available for matching to others.  Kargul originally matched with me at this site.

           I last checked for matches 16 May 2011.  There are 9 close matches of Y-DNA to Kargul & me, but these are not close enough to include in my Neighborhood Table.

 

Age of C Type

Age of L540

 

           Comment 25 Mar 2013:  this topic needs to be updated using the latest data.  The new on-line Excel file versions do not yet have the ASD sheets.  Coming soon.

           Topic update 11 Jul 2011:

           The discussion in this topic is based on the sheet “ASD” in the two 67 marker analysis files.

           Average Squared Distance (ASD) is equivalent to variance of STR values.  Most people use ASD to calculate age in genetic genealogy, as I explain in my publications.  The ASD method has large known systematic uncertainties, discussed in my publications, which make age calculation uncertain.  It is not possible to calculate a confidence range because the systematic errors might be larger than the statistical errors, even for small samples of data.
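
           For readers who want to see the arithmetic, here is a minimal Python sketch of an ASD age estimate.  It is not a copy of the Excel sheets.  It assumes that squared distances are measured from a reference (modal) haplotype, that one average mutation rate applies to every marker, and that a generation has a fixed length.  The haplotypes, the rate, and the generation length are all hypothetical values chosen for illustration.

```python
# Sketch of an Average Squared Distance (ASD) age estimate.
# Assumptions (not from the source files): distances are taken from a
# reference haplotype, a single average mutation rate is used for all
# markers, and the generation length is fixed. All numbers are hypothetical.

def average_squared_distance(haplotypes, reference):
    """Return (overall ASD, list of per-marker ASD values), dividing by N."""
    n = len(haplotypes)
    per_marker = [
        sum((h[i] - ref) ** 2 for h in haplotypes) / n
        for i, ref in enumerate(reference)
    ]
    return sum(per_marker) / len(per_marker), per_marker

def asd_age_years(asd, rate_per_generation, years_per_generation):
    """Convert ASD to years: generations = ASD / rate."""
    return asd / rate_per_generation * years_per_generation

# Hypothetical 5-marker data for 4 samples.
reference = [13, 24, 10, 11, 30]
haplotypes = [
    [13, 24, 10, 11, 30],
    [13, 25, 10, 11, 30],
    [13, 24, 10, 11, 31],
    [13, 24, 10, 12, 30],
]

HYPOTHETICAL_RATE = 0.002     # mutations per marker per generation
HYPOTHETICAL_GENERATION = 30  # years per generation

asd, per_marker = average_squared_distance(haplotypes, reference)
zero_markers = sum(1 for v in per_marker if v == 0)
age = asd_age_years(asd, HYPOTHETICAL_RATE, HYPOTHETICAL_GENERATION)
print(f"ASD = {asd:.3f}, markers with zero ASD = {zero_markers}, "
      f"age about {age:.0f} years")
```

           The per-marker list is kept so that individual markers can be inspected or left out, which is the kind of variation the analysis files are meant to allow.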

           C type is quite young.  The age using all 67 markers comes out 805 years, cell N12 on the “ASD” sheet in C67.xls.

           Although I do not have high confidence in estimating the exact age of C type, there is additional evidence that C type and L540 are young.  The fact that C37, using the first 37 markers, provides a reasonable definition of C type is evidence of youth.  Old haplogroups do not provide reasonable modal definitions using all 37 markers, because of the wide variation in the rapidly mutating markers.  Another way of saying this:  using all 37 markers, there is a lot of overlap of old haplogroups.  Another way of saying it:  Isolation in STR values is evidence of youth.  My publication elaborates on this.

           The fact that C type samples can be extracted from V13 using only a 4 marker signature is also evidence of isolation and youth.

           Low SBP is evidence that C type and L540 are well isolated;  see the analysis for SBP.

           I expect the age to creep up somewhat as new data is discovered.  The well known statistical way to correct for this expectation:  divide by N-1 instead of N when figuring ASD.  My files do not use N-1 because that is not the common practice in genetic genealogy.  Using N-1 the age is 939 years instead of 805.  This estimate includes a best guess for future samples;  eventually samples will show up that fit C type but have more mutations than the samples so far - either because of true older nodes within the Y-DNA tree or just due to bad luck in random STR mutations.
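
           The N-1 adjustment amounts to multiplying by N/(N-1), assuming the age estimate is directly proportional to ASD.  The short Python sketch below reproduces the two corrected figures quoted in this topic.  The sample count of 7 is stated in the text;  the count of 9 is an inference (7 plus the two recruited samples).

```python
# Sketch of the N versus N-1 correction for an ASD based age.
# Assumption: age is directly proportional to ASD, so replacing the divisor
# N with N-1 multiplies the age by N / (N - 1).

def n_minus_1_correction(age_years, n_samples):
    """Rescale an age computed with divisor N to the divisor N-1."""
    if n_samples < 2:
        raise ValueError("need at least 2 samples for the N-1 correction")
    return age_years * n_samples / (n_samples - 1)

# 7 samples: the count stated in the text.
corrected_7 = round(n_minus_1_correction(805, 7))

# 9 samples: inferred as 7 plus the two recruited samples.
corrected_9 = round(n_minus_1_correction(806, 9))

print(corrected_7, corrected_9)  # 939 907
```

           The correction shrinks as N grows:  about 17% at 7 samples and about 12% at 9.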

           Conclusion:  My best guess for the age of C type is about 1,000 years. 

           Of course, this result has very low confidence, because it is based on only 7 samples.  The true age might be up to a factor of 4 older.  Or it might be a lot younger.  Although it is not possible to calculate a confidence range that includes systematic errors in age estimation, I suppose a reasonable high confidence range estimate would be 500 to 2000 years old.

           Discussion of C age estimation.  My calculation excludes two samples, Gwozdz and Kargul, whom I recruited, because a random selection is required.  Including them yields 806 years, corrected to 907, not significantly different.  All 67 markers are valid for use in a young clade like this, because there are no recLOH issues in the C data.  Issues in recLOH usually cause problems using all 67.  Those analysis files include comparison to a classic “Thomas” method, which I explain in my publication, and which yields 292 (vs 805) years, but I consider that just a demonstration that the old method uses too few markers, which by luck do not vary much in the C data.  My files allow the user to easily vary the markers and easily vary the samples.  Boyd seems to be the oldest sample;  removing him lowers 805 to 712, corresponding to the age without his hypothetically older clade.  The “oldest” two markers (far right of the ASD sheet has a sort by age) are DYS460 and DYS385a, with 3975 and 3058 years respectively;  this is evidence that those markers might correspond to clades within C type, but the evidence is not compelling yet because some markers are bound to come out old just by luck.  My definition of C type of course excludes such old markers, which is a proper procedure for a good definition.  But it is not fair to exclude old markers in age estimates because the zeros balance the old ones statistically;  C type has 45 markers with zero age - the lucky ones with no mutations.  Many zeros is evidence of youth but also evidence of few samples - only 7.  With more samples there will be fewer zeros.  Most published ASD age calculations include a correction making the result older, but the reason for that correction is population bottlenecks, which reduce ASD;  since C type is large for a very young clade, I’m guessing the age corresponds to a rapid population expansion after which there were no significant bottlenecks.  If I am right, no such correction would be appropriate.  Also, most published ASD ages use N instead of N-1, so a published correction method should be applied to my first number, 805.

           An age calculation for L540 is not appropriate, although L67.xls does it, getting 971 years in cell N12 of the ASD sheet.  That’s not fair.  C type dominates the data, but C type is really only one man, the MRCA.  Our data for L540 is really only 3 men:  the hypothetical C type MRCA, Fredeen, and Gebert.  It is well known that estimating the TMRCA of two men is highly uncertain, just due to the luck of random mutations.  For 3 men it is not much better.  The L540 age (TMRCA) is surely older than C type, just a bit older based on the meager data here of 2 men with STR values different than C type - but not very different at 67 markers than the C type samples with highest step.

           Age of a mutation is of course older than the TMRCA because there should almost always be multiple generations between nodes.  We know the L540 mutation happened after the node for Butman and before the nodes for Fredeen and Gebert.  That assessment will get better with more data.

           I said in the Abstract here that L540 might be twice as old as C type.  That is just a guess.

 

Origin of L540 and C type

           Update 17 Jul 2011:

           The neighborhood table shows that 9 of the 18 cluster C men indicated “Germany” in the “Origin” field of their data.  One of the two L540 men outside C indicates “Germany”.  That is very good (although not convincing) evidence that the ancestors (MRCA) of C type and L540 lived in what is now Germany.  A caveat:  men of German origin are more likely to purchase a DNA test and submit data to web databases.  It is obvious from data searches, in Ysearch for example, that men of east European ancestry are underrepresented.  This sample bias is difficult to measure, but I doubt the bias is sufficient to rule out a German origin as our best guess.

           The parent V13 haplogroup is concentrated in the Balkans, according to density maps on the web.  I’m guessing that our L540 MRCA lived in Central Europe, but I do not know that.  This cannot be checked with data available today, because the published Balkan Y-DNA data has too few markers to distinguish L540 or C type.  I look forward to the near future when data with more STR markers become available from the Balkans to verify my guess.  L540 SNP data would help, where I expect almost all to be negative from the Balkans.  On the other hand, if L540 is common in the Balkans, that would imply a probable MRCA origin in the Balkans, with subsequent expansion into Central Europe.

           Bird published evidence for a hypothesis that E1b1b1a1b (V13) appeared in England, concentrated at the two locations of ancient Roman garrisons, because of men from Moesia Superior who joined the Roman Legions when the Romans conquered the Balkans.

           Speculation:

           Model L540A:  V13 in Europe springs largely from Roman Legionnaires from the Balkans.  Due to the statistics of Y-DNA, most men do not form lasting clades, but many Balkan Legionnaires were lucky enough in their male line descendants so that many small V13 clades in Europe today correspond to individual Roman Legionnaires.  Because enlistment in Balkan armies, and subsequent enlistment in the Roman army, is largely random from the point of view of Y-DNA, these clades are a random selection from a much larger population, so the Legionnaire founders had very variable STR values.  The clades today have STRs quite different from each other.  Most clades are small enough that no samples, or only 1 or 2 samples, are present from each in the databases today.  The two L540 samples from outside C type, and the one sample just outside L540, and others not yet tested for the L540 SNP, represent such small clades.  C type is an exception, with 18 samples available today.  The MRCA of C type was a descendant of one of these Legionnaires, but that MRCA lived about 1,000 years later, in what is now Germany.

           Model L540A1:  C type is larger just by luck.  Statistically, some clades are necessarily larger than others.  It is not very surprising that one clade is unusually large.  We do not notice small clades, so of course we are now studying C type because it stands out.

           Model L540A2:  C type is larger because a descendant who lived about 1,000 years ago was a king or otherwise very prominent man, so his family grew much faster than others.  His ancestors were not prominent, so C type is isolated, but not more isolated than those other small clades from Roman times.

           Model L540A3:  C type is larger because of a local population expansion during Medieval times.  The C type MRCA was one of many who participated in these good times, but the others were from other haplogroups.  For example, the population expansion might have been associated with a Germanic R1b tribe, where the C type MRCA was an outsider who joined the tribe before the expansion, along with outsiders from other haplogroups.

           Model L540A4:  C type is larger for another population expansion reason.

           Model L540Aa:  The relatives of the Roman Legionnaires stayed in the Balkans.  In the near future L540+ samples will show up from the Balkans.  C type will show up, and it will not be particularly isolated in STR values, because the ancestors will not be the same as the C type MRCA.

           Model L540Ab:  There was a population bottleneck (or 2 or more bottlenecks) in the Balkans during the past 2,000 years.  War, famine, whatever.  There was a later population expansion in other haplogroups, so very few if any L540+ samples will show up in the Balkans.

           Model L540Ac:  That parent population did not get entirely wiped out.  It survives, in a remote area.  Maybe a group of villages in the Balkan mountains.  The population has not grown much over the centuries.  If we go there and test for Y-DNA we’ll find lots of L540 men, some of them C type.

           Model L540B:  Not Roman.  A tribe of barbarians showed up in Germany about 1,000 years ago.  Prior to that, they passed through another region, not necessarily the Balkans, where a lone L540 individual joined them.  This is similar to Model L540A3.  There are Ba, Bb, Bc variations similar to Aa, Ab, Ac.

           Model L540C:  He didn’t join a Roman army.  He was a medieval trader.  A very charming traveling salesman.  He fathered children all over central Europe, mostly in what is now Germany.  This model has similar variations to models A and B.

           Model L540D:  I can think of other speculative scenarios.  I’m sure you can, too.

           The point of these examples:  we don’t know the history, but C type is unusual in that it is young, small, and well isolated in STR values (a small mountain in haplospace).  It is not closely related to the rest of L540 or V13.  The migration history of the very large V13 parent haplogroup may or may not be relevant to the history of the relatively small C type hypothetical haplogroup.

 

Validity of C Type

 

           Update 10 Jul 2010.  Quite frankly, I was surprised by cluster C.  Friedman did a good job finding this one.  I admit I dismissed it when I first saw cluster C in 2007 because it was so small that statistical significance did not seem possible to me.  I postponed analysis until Jan 2010, independently verifying cluster C as C type.

           By “valid” I mean a cluster whereby most of the samples belong to a single clade, and whereby very few other samples in the database belong to that clade.  In other words, a valid cluster should eventually have a corresponding SNP discovered.  Throughout 2010 I confidently predicted such an SNP here in this topic, although I doubted it would be discovered soon.  L540 turned out to be almost the same as C type, although slightly larger and quite a bit older, as discussed elsewhere in this web page.  As samples predicted C type test L540+, this adds evidence that C type corresponds to a clade.

 

My WTY Analysis

 

           Update 23 Feb 2012:  Fifteen new SNPs were discovered in my “Walk Through the Y” (WTY):  L535 through L547, L614, and L618.  All 15 are available as commercial SNP tests from FTDNA.

           My WTY test read about 200,000 base pairs in Feb 2011.  By Feb 2012 the test had expanded to twice that many.  For details, here is a link for this "WTY" commercial product from FTDNA.

           I announced 8 new SNPs here on 29 Mar 2011.  The count on 30 Mar was 13 new SNPs in my WTY.  L614 was added in June.  L618 was added in August.  That was a lot more than I expected.  I now realize that’s because FTDNA expanded the number of DNA bases included in WTY just before my test.  Also, I seem to have been the first WTY from E-M78 in quite some time.  Since then, a few others from M78 and V13 have tested, so there are quite a few more new SNPs of interest recently discovered.

           I tracked the status on these 15 SNPs right here on this web page for a year, in detail.  Recently the positions in the Y-DNA tree have been determined for most of these SNPs of interest to me.  I recently removed most of the detail from this page.  I’m leaving the Summary, below, for a while because other people have links to that Summary.  I’ll remove most of this, including the Summary, later in 2012.

           In late 2011 the SNP Tracker was set up, as part of the E-M35 Project, to track all new SNPs of interest.  That’s another reason for me to drop my details here.  That SNP Tracker merges data from WTY, from the 1000 Genomes, and from SNP tests by members of the E-M35 Project.

 

SNP Summary

 

           Update 18 Jan 2013.  For a detailed SNP tree of the E-M35 haplogroup, see the SNP Tracker.  This topic used to have a summary of the SNPs found in my WTY, but the SNP Tracker is now a better place to find an update.  Only L540 defined a new haplogroup.  L542 is equivalent to V13.  The others are all equivalent to known haplogroup SNPs older than V13.

 

SNP Test Orders

 

           SNP tests cost about $39 each from FTDNA if your sample is already there from previous testing.  Click on “Order an Upgrade” from your FTDNA home page (top right), then click “Order an Advanced Test” (do not click on “Order Advanced SNP Test”).  In the box “Test Type” select “SNP”.  Type the SNP code (for example L540) into the “Find” box to search for it.

 

References & Sources

 

           E-M35, a project at FTDNA, is my main source of data.  Previously called E3b.  Link:  http://www.familytreedna.com/public/E3b.  The official name today would be E1b1b1.  ISOGG changes the name when new defining SNPs are discovered, so the name may change again in the future.  M35.1 is the name of the SNP that defines E1b1b1 within haplogroup E.

           Haplozone is a web site for analysis of data from the E-M35 project.  This site has not been updated since September 2013.  Link:  http://www.haplozone.net/e3b/project.  It holds the data from E-M35, plus some data added from sources other than FTDNA, so this database is larger than the E-M35 project data alone.  Page with a listing of proposed clusters:  http://www.haplozone.net/e3b/project/cluster/.  Page with L540 / C cluster samples:  http://www.haplozone.net/e3b/project/cluster/42.

           SNP Tracker is a web page added to the E-M35 project in late 2011, to keep track of all the new SNP branches in M35.  http://tinyurl.com/e-m35-snps

           The V13 data:  http://www.haplozone.net/e3b/project/cluster/10.  V13 is the defining SNP for E1b1b1a1b1a, a major branch haplogroup in E, and “father” of L540.  That page of data does not have the data for samples that have been assigned to clusters as subdivisions of V13, just the data that does not fit any downstream proposed cluster.  The number code for other clusters can be typed over that “10” to quickly get to other cluster data.

           Cluster C Data:  http://www.haplozone.net/e3b/project/cluster/42.

           Victor Villarreal is an administrator for the E-M35 (E3b) Project.

           Andrew Lancaster is an administrator for the E-M35 (E3b) Project.  Andrew has been particularly patient with me with long helpful email discussions.

           Elise Friedman is a co-administrator for the E-M35 (E3b) Project and is administrator for the Jewish E3b project.

           Peter Gwozdz.  That’s me.  pete2g2@comcast.net.

 

Revision History

2010 Jan 14 original draft version

2010 13 updates

2011 Feb - Jun, 12 updates

2011 Jul - 10 updates

2011 Aug - Nov, 6 updates

2012 Jan 1 update of my WTY and status of the SNPs - not finished

2012 Jan 3 update of L542 status

2012 Jan 4 update of the five M78 SNP candidates

2012 Jan 5 update of the seven M34 SNP candidates;  finished SNP update

2012 Feb 23 update of SNP Summary;  remove most of the details of SNP tracking

2012 Mar 16 update Neighborhood Table

2013 Jan 21 update Neighborhood Table, drop SNP Summary details

2013 Mar 9 add Svercl cousin to Table

2013 Mar 21 new L45(67) definition;  more update of the Neighborhood Table;  update not finished

2013 Mar 23 new C46(67) definition;  more update of the Neighborhood Table;  update not finished

2013 Mar 24 continue update of table;  rewrite of first 6 topics

2013 Mar 25 continue update of table;  edit several topics

2013 Mar 26 continue update of table;  C37 results; not uploaded

2014 Mar 9 update of table; not complete

2014 Mar 10 update of table; not complete

2014 Mar 13 update of table; not complete;  also update of abstract and L540 topics

2014 Mar 18 update of table; not complete

2014 Mar 21 update of table; not complete

2014 Mar 25 update of table; not complete

2014 Mar 26 update of table; not complete

2014 Mar 27 update of table; not complete

2014 Mar 28 update of table; lots of additions and removals at 37 and 25 markers

2014 Mar 29 update of table finished

2014 Apr 21 minor update of table;  update ISOGG code for L540;  change “new” to “small” in the title