### Transcription of Discrimination Among Groups - UMass Amherst

#### Discrimination Among Groups

- Are groups significantly different? (How valid are the groups?)
  - Multivariate Analysis of Variance [(NP)MANOVA]
  - Multi-Response Permutation Procedures [MRPP]
  - Analysis of Group Similarities [ANOSIM]
  - Mantel's Test [MANTEL]
- How do groups differ? (Which variables best distinguish among the groups?)
  - Discriminant Analysis [DA]
  - Classification and Regression Trees [CART]
  - Logistic Regression [LR]
  - Indicator Species Analysis [ISA]

#### 1. Important Characteristics of Discriminant Analysis

- Essentially a single technique consisting of a couple of closely related procedures.
- Operates on data sets for which pre-specified, well-defined groups already exist.
- Assesses dependent relationships between one set of discriminating variables and a single grouping variable; an attempt is made to define the relationship between independent and dependent variables.

#### 2. Important Characteristics of Discriminant Analysis

- Extracts dominant, underlying gradients of variation (canonical functions) among groups of sample entities (e.g., species, sites, observations) from a set of multivariate observations, such that variation among groups is maximized and variation within groups is minimized along the gradient.
- Reduces the dimensionality of a multivariate data set by condensing a large number of original variables into a smaller set of new composite dimensions (canonical functions) with a minimum loss of information.

#### 3. Important Characteristics of Discriminant Analysis

- Summarizes data redundancy by placing similar entities in proximity in canonical space and producing a parsimonious understanding of the data in terms of a few dominant gradients of variation.
- Describes maximum differences among pre-specified groups of sampling entities based on a suite of discriminating characteristics (i.e., canonical analysis of discrimination).

- Predicts the group membership of future samples, or samples from unknown groups, based on a suite of classification characteristics (i.e., classification).

#### 4. Important Characteristics of Discriminant Analysis

- Extension of Multiple Regression Analysis if the research situation defines the group categories as dependent upon the discriminating variables, and a single random sample (N) is drawn in which group membership is "unknown" prior to sampling.
- Extension of Multivariate Analysis of Variance if the values on the discriminating variables are defined as dependent upon the groups, and separate independent random samples (N1, N2, ...) of two or more distinct populations (i.e., groups) are drawn in which group membership is "known" prior to sampling.

#### 5. Analogy with Regression and ANOVA

Regression Extension Analogy:

- A linear combination of measurements for two or more independent (and usually continuous) variables is used to describe or predict the behavior of a single categorical dependent variable.

- Research situation defines the group categories as dependent upon the discriminating variables.
- Samples represent a single random sample (N) of a mixture of two or more distinct populations (i.e., groups).
- A single sample is drawn in which group membership is "unknown" prior to sampling.

#### 6. Analogy with Regression and ANOVA

ANOVA Extension Analogy:

- The independent variable is categorical and defines group membership (typically controlled by experimental design), and populations (i.e., groups) are compared with respect to a vector of measurements for two or more dependent (and usually continuous) variables.
- Research situation defines the discriminating variables to be dependent upon the groups.
- Samples represent separate independent random samples (N1, N2, ..., NG) of two or more distinct populations (i.e., groups).

- Group membership is "known" prior to sampling and samples are drawn from each population separately.

#### 7. Discriminant Analysis: Two Sides of the Same Coin

Canonical Analysis of Discriminance:

- Provides a test (MANOVA) of group differences and simultaneously describes how groups differ; that is, which variables best account for the group differences.

Classification:

- Provides a classification of the samples into groups, which in turn describes how well group membership can be predicted. The classification function can be used to predict group membership of additional samples for which group membership is unknown.

#### 8. Overview of Canonical Analysis of Discriminance

- CAD seeks to test and describe the relationships among two or more groups of entities based on a set of two or more discriminating variables (i.e., identify boundaries among groups of entities).
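As a rough illustration of the two sides described above, the sketch below derives a single canonical function for two groups measured on two variables, then classifies each sample by the nearest group centroid on that axis. Everything in it is an assumption made for illustration: the data are invented, the helper names (`canonical_weights`, `classify`) are hypothetical, and the closed-form two-group weights (inverse of the pooled within-group scatter matrix times the difference in group means) are a standard textbook shortcut rather than anything taken from the slides.

```python
from statistics import mean


def mean_vector(rows):
    """Column means of a list of equal-length numeric rows."""
    return [mean(col) for col in zip(*rows)]


def pooled_within_scatter(groups):
    """Sum over groups of (x - group mean)(x - group mean)^T, 2 variables only."""
    w = [[0.0, 0.0], [0.0, 0.0]]
    for rows in groups:
        m = mean_vector(rows)
        for x in rows:
            d = [x[0] - m[0], x[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    w[i][j] += d[i] * d[j]
    return w


def canonical_weights(group_a, group_b):
    """Two-group discriminant direction: W^-1 (mean_a - mean_b), 2 variables only."""
    w = pooled_within_scatter([group_a, group_b])
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    inv = [[w[1][1] / det, -w[0][1] / det],
           [-w[1][0] / det, w[0][0] / det]]
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def score(weights, x):
    """Canonical score: linear combination of the two variables."""
    return weights[0] * x[0] + weights[1] * x[1]


def classify(weights, centroids, x):
    """Assign x to the group whose centroid score is closest to its own score."""
    s = score(weights, x)
    return min(centroids, key=lambda g: abs(s - centroids[g]))


# Hypothetical measurements on two variables for two a priori groups.
occupied = [[2.0, 8.0], [3.0, 9.0], [4.0, 8.5], [3.5, 10.0]]
unoccupied = [[6.0, 4.0], [7.0, 5.0], [8.0, 4.5], [7.5, 6.0]]

weights = canonical_weights(occupied, unoccupied)
centroids = {
    "occupied": score(weights, mean_vector(occupied)),
    "unoccupied": score(weights, mean_vector(unoccupied)),
}
actual = ["occupied"] * len(occupied) + ["unoccupied"] * len(unoccupied)
predicted = [classify(weights, centroids, x) for x in occupied + unoccupied]
correct_rate = sum(p == a for p, a in zip(predicted, actual)) / len(actual)

print("weights:", weights)
print("centroids:", centroids)
print("correct classification rate:", correct_rate)
```

The weights describe how the groups differ, and the classification rate describes how well membership can be predicted. Because the same samples are used to derive and to evaluate the function, the rate reported here is likely to be optimistic.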

- CAD involves deriving the linear combinations (i.e., canonical functions) of the two or more discriminating variables that will discriminate "best" among the a priori defined groups (i.e., maximize the F-ratio).
- Each sampling entity has a single composite canonical score on each axis, and the group centroids indicate the most typical location of an entity from a particular group.

Hope for significant group separation and a meaningful ecological interpretation of the canonical axes.

#### 9. Overview of Classification

Parametric Methods: Valid criteria when each group is multivariate normal.

- (Fisher's) Linear discriminant functions: Under the assumption of equal multivariate normal distributions for all groups, derive linear discriminant functions and classify the sample into the group with the highest score.

Linear discriminant functions in R: [lda(); MASS].

- Quadratic discriminant functions: Under the assumption of unequal multivariate normal distributions among groups, derive quadratic discriminant functions and classify each entity into the group with the highest score. [qda(); MASS]
- Canonical Distance: Compute the canonical scores for each entity first, and then classify each entity into the group with the closest group mean canonical score (i.e., centroid).

#### 10. Overview of Classification

Nonparametric Methods: Valid criteria when no assumption about the distribution of each group can be made.

- Kernel: Estimate group-specific densities using a kernel of a specified form (several options), and classify each sample into the group with the largest local density. [ (); ks]
- K-Nearest Neighbor: Classify each sample into the group with the largest local density based on a user-specified number of nearest neighbors.
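The nearest-neighbor rule can be sketched in a few lines. This is a minimal illustration that assumes Euclidean distance and a simple majority vote among the k nearest training samples; the data and the function name `knn_classify` are invented for the example, and it is not the implementation behind `knn()` in R.

```python
from collections import Counter
from math import dist


def knn_classify(training, labels, x, k=3):
    """Majority vote among the k training samples nearest to x (Euclidean)."""
    order = sorted(range(len(training)), key=lambda i: dist(training[i], x))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training samples on two variables, with known group labels.
training = [[2.0, 8.0], [3.0, 9.0], [4.0, 8.5],
            [6.0, 4.0], [7.0, 5.0], [8.0, 4.5]]
labels = ["A", "A", "A", "B", "B", "B"]

# Samples whose group membership is treated as unknown.
new_samples = [[3.0, 8.0], [7.5, 5.0]]
predictions = [knn_classify(training, labels, x, k=3) for x in new_samples]
print(predictions)  # ['A', 'B']
```

The choice of k is left to the user, as the slide notes. An odd k avoids tied votes when there are only two groups.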

K-nearest neighbor in R: [knn(); class].

Different classification methods will not produce the same results, particularly if parametric assumptions are not met.

#### 11. Geometric View of Discriminant Analysis

- Canonical axes are derived to maximally separate the three groups on the first axis.
- The second axis is derived to provide additional separation for the blue and green groups, which overlap on the first axis.

[Figure: three groups plotted against the original variables X1, X2, and X3, with the canonical axes DF1 and DF2.]

#### 12. Discriminant Analysis: The Analytical Process

- Data set
- Assumptions
- Sample size requirements
- Deriving the canonical functions
- Assessing the importance of the canonical functions
- Interpreting the canonical functions
- Validating the canonical functions

#### 13. Discriminant Analysis: The Data Set

- One categorical grouping variable, and 2 or more continuous, categorical, and/or count discriminating variables.

- Continuous, categorical, or count variables (preferably all continuous).
- Groups of samples must be mutually exclusive.
- No missing data allowed.
- Group sample sizes need not be the same; however, efficacy decreases with increasing disparity in group sizes.
- Minimum of 2 samples per group and at least 2 more samples than the number of variables.

#### 14. Discriminant Analysis: The Data Set

- Common 2-way ecological data:
  - Species-by-environment
  - Species' presence/absence-by-environment
  - Behavior-by-environment
  - Sex/life stage-by-environment/behavior
  - Soil groups-by-environment
  - Breeding demes-by-morphology
  - Etc.

Layout of the data matrix (samples in rows, variables in columns):

| Sample | Group | X1 | X2 | .. | Xp |
|---|---|---|---|---|---|
| 1 | A | x11 | x12 | .. | x1p |
| 2 | A | x21 | x22 | .. | x2p |
| .. | | | | | |
| n | A | xn1 | xn2 | .. | xnp |
| n+1 | B | x11 | x12 | .. | x1p |
| n+2 | B | x21 | x22 | .. | x2p |
| .. | | | | | |
| N | B | xN1 | xN2 | .. | xNp |

#### 15. Discriminant Analysis: The Data Set

Hammond's flycatcher: occupied vs unoccupied sites

    1 1S0 NO 21 15 75 20 30 0 0 0 0 0 0 0 0 1 1 0 20 40 60
    2 1S1 NO 36 15 95 15 35 0 0 0 0 1 0 0 1 0 2 20 20 80 120
    3 1S2 NO 30 30 70 10 55 0 0 0 1 2 2 1 0 1 7 140 160 0 300
    4 1S3 NO 11 50 70 20 70 0 0 0 0 1 0 0 3 1 5 60 300 0 360
    5 1S4 NO 33 40 80 15 65 0 0 1 0 0 0 0 0 0 1 20 160 0 180

    49 1U0 YES 3 15 95 20 55 3 0 0 2 1 0 1 1 2 10 80 40 80 200
    50 1U1 YES 2 15 80 30 70 5 0 0 1 3 0 0 2 0 11 80 40 180 300
    51 1U2 YES 2 65 70 15 70 0 0 0 1 0 0 0 3 0 4 60 60 120 240
    52 1U3 YES 30 55 35 25 75 0 0 0 0 3 0 0 3 2 8 20 20 80 120
    53 1U4 YES 2 20 95 10 60 2 0 0 0 1 0 0 2 2 7 20 160 40 220
    ..

#### 16. DA: Assumptions

- Descriptive use of DA requires "no" assumptions!
  - However, efficacy of DA depends on how well certain assumptions are met.
- Inferential use of DA requires assumptions!
  - There is evidence that certain of these assumptions can be violated moderately without large changes in correct classification results.
  - The larger the sample size, the more robust the analysis is to violations of these assumptions.

#### 17. DA: Assumptions

1. Equality of Variance-Covariance Matrices: DA assumes that groups have equal dispersions (i.e., within-group variance-covariance structure is the same for all groups).
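One informal way to look at this assumption is to compute the variance-covariance matrix within each group and compare them side by side. The sketch below does only that, on invented data with a hypothetical helper `covariance_matrix`; it is a descriptive check under those assumptions, not a formal test of equality such as Box's M.

```python
from statistics import mean


def covariance_matrix(rows):
    """Sample variance-covariance matrix (divisor n - 1) of a list of rows."""
    n = len(rows)
    means = [mean(col) for col in zip(*rows)]
    p = len(means)
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


# Hypothetical groups: B is deliberately much more dispersed than A.
groups = {
    "A": [[2.0, 8.0], [3.0, 9.0], [4.0, 8.5], [3.5, 10.0]],
    "B": [[6.0, 24.0], [9.0, 27.0], [12.0, 25.5], [10.5, 30.0]],
}

cov = {name: covariance_matrix(rows) for name, rows in groups.items()}
for name, matrix in cov.items():
    print(name, matrix)

# Ratio of group variances on each variable; values far from 1 suggest
# unequal dispersions.
variance_ratios = [cov["B"][i][i] / cov["A"][i][i] for i in range(2)]
print("variance ratios (B / A):", variance_ratios)
```

In this made-up case the variances in group B are about nine times those in group A, the kind of disparity that would call the equal-dispersion assumption into question.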