Genetic genealogy: Y-DNA of the Oziembłowski (Oziębłowski) Family

Statistical investigation of STR Y-DNA haplotypes in the I1M253P (I1ASP) subclade using "multidimensional scaling" and "cluster analysis"
Author: Maciej Oziembłowski

[The ASP subclade means Anglo-Saxon in its Polish, Pomeranian or Prussian variant]

Two identical (at Y67) haplotypes of two men with the Oziembłowski surname (kits 169071 and 191847) belong to the ASP subclade (Anglo-Saxon in its Polish/Pomeranian/Prussian variant). These haplotypes are treated as one common haplotype, "Oziemblowski", in the rest of the text. More about the I1ASP subclade can be found in Polish on another website of mine, in the context of the FTDNA project "NormansCE".

Data preparation

1) Twenty 67-marker STR Y-DNA haplotypes were taken from the NormansCE project, together with 2 haplotypes from the Ysearch database (Matz and Reck) classified into the I1ASP subclade by Peter Gwozdz, so 22 haplotypes were taken for the study. The values of the 67 STR Y-DNA markers for all 22 haplotypes can be found in Tab. 1 (three parts: 1a, 1b, 1c).

2) The next step was the standardization of the data. The reason for this procedure was to prevent markers with values of e.g. 34-35 from being "stronger" in the model than markers with values of e.g. 8-9. The new values after standardization can be found in Tab. 2a and 2b. Twenty-two markers had identical values in all 22 haplotypes, so finally only 45 markers (67 - 22 = 45) were taken into account in the next step.
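The standardization step can be illustrated with a minimal sketch. The text does not state which formula was used, so the z-score (subtract the marker mean, divide by the population standard deviation) is an assumption here; the function name and the toy values are hypothetical and are not taken from Tab. 1.

```python
from statistics import mean, pstdev

def standardize_markers(haplotypes):
    """Z-score every marker column and drop markers that are constant in all haplotypes."""
    columns = list(zip(*haplotypes))           # one tuple of values per marker
    kept = []
    for col in columns:
        sd = pstdev(col)                       # population SD: an assumption, not stated in the text
        if sd == 0:                            # identical in every haplotype, so it cannot separate them
            continue
        mu = mean(col)
        kept.append([(v - mu) / sd for v in col])
    return [list(row) for row in zip(*kept)]   # back to one row per haplotype

# Toy values (NOT from Tab. 1): 3 haplotypes x 3 markers, the last marker is constant
toy = [
    [34, 8, 13],
    [35, 9, 13],
    [34, 9, 13],
]
z = standardize_markers(toy)
```

After this step a marker with values around 34-35 and a marker with values around 8-9 are on the same scale, and the constant third marker is gone, which mirrors the reduction from 67 to 45 markers.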

3) A "distance matrix" was created from Tab. 2 and is shown in Tab. 3. It contains one number for each pair of compared haplotypes. Higher numbers = less similarity between the compared haplotypes. Smaller numbers = more similarity between the compared haplotypes.
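A distance matrix of this kind can be sketched as follows. The distance measure is not named in the text, so Euclidean distance between the standardized rows is an assumption; the function name and the toy rows are hypothetical.

```python
import math

def distance_matrix(rows):
    """Symmetric matrix of Euclidean distances between all pairs of rows."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(rows[i], rows[j])   # Euclidean distance: an assumption
            d[i][j] = d[j][i] = dist             # the matrix is symmetric
    return d

# Toy standardized rows (NOT from Tab. 2)
toy_rows = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]
D = distance_matrix(toy_rows)
```

The diagonal is zero (a haplotype compared with itself), and a larger entry means a less similar pair.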


MULTIDIMENSIONAL SCALING

4) The matrix in Tab. 3 was the basis for the calculations in the multidimensional scaling method. According to Fig. 1, a 2- or 3-dimensional model would be adequate for the next calculations. The 2D model was finally chosen. The Shepard diagram (Fig. 2) confirmed that the 2D model correctly describes the similarity of the STR Y-DNA haplotype pairs (the points are close to the "stairs"). Each point in Fig. 2 represents one pair of compared haplotypes. Some points overlap, but there are 231 pairs of compared haplotypes, which can be calculated with the binomial coefficient or directly from the expression (22*21)/2.

Fig. 1
Fig. 2
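The idea of placing the haplotypes on a plane so that the plotted distances reproduce the distance matrix can be sketched in a few lines. The text does not name the MDS variant or the software used, so this sketch assumes plain metric MDS computed with repeated Guttman transforms (the SMACOF iteration); all function names and the toy matrix are hypothetical. `shepard_pairs` returns the (matrix distance, plotted distance) pairs that a Shepard diagram displays.

```python
import math
import random

def pairwise(points):
    """Euclidean distances between all points of a configuration."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def raw_stress(target, points):
    """Sum of squared differences between target and plotted distances over all pairs."""
    d = pairwise(points)
    n = len(points)
    return sum((target[i][j] - d[i][j]) ** 2 for i in range(n) for j in range(i + 1, n))

def smacof_mds(target, dims=2, iterations=200, seed=0):
    """Metric MDS by Guttman transforms; returns the final points and the stress history."""
    rng = random.Random(seed)                      # fixed seed so the sketch is repeatable
    n = len(target)
    x = [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]
    history = [raw_stress(target, x)]
    for _ in range(iterations):
        d = pairwise(x)
        new_x = []
        for i in range(n):
            row = [0.0] * dims
            for j in range(n):
                if i == j or d[i][j] == 0.0:       # skip self-pairs and coincident points
                    continue
                ratio = target[i][j] / d[i][j]
                for k in range(dims):
                    row[k] += ratio * (x[i][k] - x[j][k])
            new_x.append([v / n for v in row])     # Guttman transform with unit weights
        x = new_x
        history.append(raw_stress(target, x))
    return x, history

def shepard_pairs(target, points):
    """One (matrix distance, plotted distance) pair per pair of items."""
    d = pairwise(points)
    n = len(points)
    return [(target[i][j], d[i][j]) for i in range(n) for j in range(i + 1, n)]

# Toy target (NOT Tab. 3): distances between the corners of a unit square
target = pairwise([[0, 0], [1, 0], [1, 1], [0, 1]])
points, history = smacof_mds(target)
pairs = shepard_pairs(target, points)
```

Repeating the fit for `dims` = 1, 2, 3, ... and plotting the final stress against the number of dimensions gives a "scree"-type diagram from which the number of dimensions can be chosen.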

5) The similarity of the 22 haplotypes (persons) is shown in the 2-dimensional model (i.e. on a plane). Points closer to each other = higher similarity of haplotypes. Three haplotypes differ the most from the remaining 19 haplotypes. These 3 haplotypes are in the "Far outer space".

Fig. 3

6) Two groups of haplotypes can be manually "demarcated" when "zooming in" on the 19 haplotypes (Fig. 4): "Middle outer space" and "Inner space". These 19 points are shifted slightly to the right (relative to the 0,0 point) because of the influence of the 3 most different points (Fig. 3). Fig. 4 shows the 19 points on a selected area of the 2D model of 22 haplotypes.

Fig. 4

7) The three most different haplotypes were excluded from the next model. The same steps were repeated for the 19 haplotypes: the "scree" diagram and the choice of a 2D model (Fig. 5), followed by the Shepard diagram (Fig. 6), which appears satisfactory (the points representing all haplotype pairs are close to the "stairs"). There were only 171 pairs of haplotypes to compare [171 = (19*18)/2]. The 2D model was plotted for the 19 haplotypes (Fig. 7). Ten haplotypes were found in the nucleus cluster at the centre of the coordinate system ("Inner space"). The Bembnista and Pietruszewski haplotypes were closest to the 0,0 point. The Johnsen haplotype (one of the 9 haplotypes from the "Middle outer space") was closest to the "Inner space" haplotypes. Each haplotype added to (or removed from) the model influences the distribution of the points (haplotypes = persons) in the multidimensional model (for instance the 2D model = a plane). Each new haplotype "brings" something new into the model and each removed haplotype "takes something away" from it.

Fig. 5
Fig. 6
Fig. 7
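Excluding haplotypes before refitting, and the pair counts quoted in steps 4 and 7, can be sketched as below. Whether the exclusion was applied to the distance matrix or to the raw data is not stated in the text, so taking a sub-matrix is an assumption; the labels and distances are hypothetical toy values.

```python
import math

def drop_items(dist, labels, excluded):
    """Remove the rows and columns of the excluded labels from a distance matrix."""
    keep = [i for i, name in enumerate(labels) if name not in excluded]
    sub = [[dist[i][j] for j in keep] for i in keep]
    return sub, [labels[i] for i in keep]

def pair_count(n):
    """Number of unordered pairs among n items: n(n-1)/2."""
    return n * (n - 1) // 2

# Toy example (NOT Tab. 3): item "D" is far from the other three
labels = ["A", "B", "C", "D"]
toy_dist = [
    [0, 1, 2, 9],
    [1, 0, 1.5, 8],
    [2, 1.5, 0, 9.5],
    [9, 8, 9.5, 0],
]
sub, kept = drop_items(toy_dist, labels, {"D"})

pairs_22 = pair_count(22)   # pairs among 22 haplotypes
pairs_19 = pair_count(19)   # pairs among 19 haplotypes
```

Because every removed item changes the set of distances that the model has to reproduce, the remaining points generally move when the model is refitted, which is the effect described in step 7.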

CLUSTER ANALYSIS

8) The "cluster analysis" method was also chosen for the comparison of haplotypes in the I1M253P (I1ASP) subclade. The data were the same as in the "multidimensional scaling" method, i.e. the matrix in Tab. 3 obtained as described in steps 1-3 (data preparation).

9) Five tree diagrams were plotted (Fig. 8-12) as the results of 5 different linkage methods: single linkage (Fig. 8), complete linkage (Fig. 9), average linkage (Fig. 10), the centroid ("centre of gravity") method (Fig. 11) and Ward's method (Fig. 12). Fig. 8-12 generally confirmed the observations from the "multidimensional scaling" method, although the matrix used in both methods (MS and CA) was the same. Nevertheless, slightly different clusters were created by Ward's method of "cluster analysis".

Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
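How the choice of linkage changes a tree diagram can be shown with a naive agglomerative sketch working directly on a distance matrix. It covers only three of the five methods (single, complete and average linkage); the centroid and Ward's methods need additional update rules and are left out. The function name and the toy matrix are hypothetical, and the software actually used for Fig. 8-12 is not stated in the text.

```python
def agglomerate(dist, linkage="single"):
    """Naive agglomerative clustering; returns merges as (members_a, members_b, height)."""
    combine = {
        "single": min,                                         # nearest pair between clusters
        "complete": max,                                       # farthest pair between clusters
        "average": lambda values: sum(values) / len(values),   # mean over all cross pairs
    }[linkage]
    clusters = [[i] for i in range(len(dist))]   # start with every item in its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                values = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                height = combine(values)
                if best is None or height < best[0]:
                    best = (height, a, b)
        height, a, b = best
        merges.append((clusters[a], clusters[b], height))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

# Toy distances (NOT Tab. 3): four items at positions 0, 1, 3 and 7 on a line
toy = [
    [0, 1, 3, 7],
    [1, 0, 2, 6],
    [3, 2, 0, 4],
    [7, 6, 4, 0],
]
single_heights = [h for _, _, h in agglomerate(toy, "single")]
complete_heights = [h for _, _, h in agglomerate(toy, "complete")]
average_heights = [h for _, _, h in agglomerate(toy, "average")]
```

In this toy case the merge order is the same for all three linkages but the merge heights differ, so the trees have different shapes even though the same matrix is used.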

10) It can be concluded that both the MS and CA methods created clusters with different degrees of inner similarity. A "phylogenetic tree" created with specialised genetic software could help to "calibrate" more accurately the MS/CA methods used as a complementary way of clustering STR Y-DNA haplotypes in specific subclades.

Contact: maciej@oziemblowski.eu


This web site "oziemblowski.eu" was created in December 2009. Last modification: 12.12.2010. Maciej Oziembłowski (c)