Binge Drinking by Adults, Data from CDC: A Linear Discriminant Analysis Approach

What linear discriminant analysis (LDA) is, in theory:

  1. It is a generalization of Ronald Fisher's linear discriminant.
  2. It is widely used in statistics, pattern recognition and machine learning.
  3. It finds a linear combination of attributes that characterizes or separates two or more classes of objects or outcomes.
  4. The resulting combination can serve as a linear classifier or, more commonly, as a dimensionality (feature) reduction step before later classification.
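As a minimal sketch of points 3 and 4, the two-class case can be worked through by hand: the separating direction is w = Sw^{-1}(mean_b - mean_a), where Sw is the pooled within-class scatter matrix. The toy data, the function names, and the midpoint threshold (which assumes equal class priors and a shared covariance) are illustrative assumptions, not something taken from the CDC data.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """2x2 scatter matrix: sum of outer products of deviations from mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def project(point, w):
    """Reduce a 2-D point to a single score along direction w."""
    return point[0] * w[0] + point[1] * w[1]


def fisher_direction(class_a, class_b):
    """Return w = Sw^{-1} (mean_b - mean_a) and a midpoint threshold."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Assumption: equal priors, so the cut sits halfway between projected means.
    threshold = 0.5 * (project(ma, w) + project(mb, w))
    return w, threshold


def classify(point, w, threshold):
    """Linear classifier: compare the projected score with the threshold."""
    return "b" if project(point, w) > threshold else "a"


# Hypothetical toy data: two attributes per observation, two classes.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 5.0)]

w, threshold = fisher_direction(class_a, class_b)
print(classify((2.0, 2.0), w, threshold))
print(classify((7.0, 6.0), w, threshold))
```

Here project() plays the feature-reduction role (two attributes become one score) and classify() plays the linear-classifier role. A real analysis would normally use a tested library rather than hand-written matrix algebra.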

In addition, LDA is closely related to analysis of variance (ANOVA) and regression analysis, which also attempt to express one dependent variable as a linear combination of other attributes or measurements.

It is important to note that ANOVA, a branch of mathematical statistics, uses categorical independent variables and a continuous dependent variable, whereas discriminant analysis uses continuous independent variables and a categorical dependent variable (i.e. the class label).

Logistic regression and probit regression are more similar to LDA than ANOVA is, because they also explain a categorical variable from continuous independent variables. The key difference is that LDA assumes the independent variables are normally distributed, whereas these two methods do not.

LDA is also closely related to principal component analysis (PCA), a major technique for feature reduction in large datasets, and to factor analysis, in that all three look for linear combinations of variables that best explain the data. LDA explicitly attempts to model the difference between the classes of data, whereas PCA does not take class membership into account at all. Factor analysis, on the other hand, builds its feature combinations from differences rather than similarities. Discriminant analysis also differs from factor analysis in that it is not an interdependence technique: a distinction between independent variables and dependent variables must be made.
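The contrast between PCA and LDA can be illustrated with a small sketch. The toy data below is a made-up assumption: both classes spread widely along x but differ only along y. PCA, which ignores labels, should pick the high-variance x axis, while LDA should pick a direction close to the y axis that separates the classes. The function names are illustrative only.

```python
import math


def mean_xy(points):
    """Mean of the x and y coordinates of 2-D points."""
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n


def scatter(points):
    """Return (sxx, sxy, syy) of 2-D points around their own mean."""
    mx, my = mean_xy(points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxx, sxy, syy


def pca_direction(points):
    """Leading principal axis of all points; class labels are ignored."""
    sxx, sxy, syy = scatter(points)
    # Closed-form major-axis angle of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def lda_direction(class_a, class_b):
    """Unit vector along Sw^{-1} (mean_b - mean_a) for two classes."""
    axx, axy, ayy = scatter(class_a)
    bxx, bxy, byy = scatter(class_b)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    (ax, ay), (bx, by) = mean_xy(class_a), mean_xy(class_b)
    dx, dy = bx - ax, by - ay
    # Inverse of [[sxx, sxy], [sxy, syy]] applied to the mean difference.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)


# Hypothetical data: wide spread in x, class difference only in y.
class_a = [(-4.0, 0.0), (-2.0, 0.2), (0.0, -0.1), (2.0, 0.1), (4.0, -0.2)]
class_b = [(-4.0, 1.0), (-2.0, 1.2), (0.0, 0.9), (2.0, 1.1), (4.0, 0.8)]

pca = pca_direction(class_a + class_b)
lda = lda_direction(class_a, class_b)
print("PCA direction:", pca)
print("LDA direction:", lda)
```

Projecting onto the PCA direction would mix the two classes together, while projecting onto the LDA direction keeps them apart, which is the practical meaning of "PCA does not take class membership into account".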

LDA works when the measurements made on the independent variables for each observation are continuous quantities. When the independent variables are categorical, the equivalent technique is discriminant correspondence analysis.

Incremental LDA: A typical LDA implementation requires the whole dataset to be available in advance. In some situations, however, the input arrives as a data stream. It is then desirable for the feature extraction to update the estimated LDA features as new samples are observed, without rerunning the algorithm on the entire dataset. A typical example is face recognition in robotics, an area of artificial intelligence and computer vision, where it makes sense to update the LDA features as soon as new data becomes available. An LDA method that updates its features on the fly in this way is known as an incremental LDA algorithm, and it is an active area of data mining research.

Work in this area includes the following. Chatterjee and Roychowdhury proposed an incremental self-organized LDA algorithm for updating the LDA features. Demir and Ozmehmet proposed online local learning algorithms for updating LDA features incrementally using error-correcting and Hebbian learning rules. Aliyari et al. derived fast incremental algorithms that update the LDA features by observing the incoming samples.
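To make the streaming idea concrete, here is a minimal sketch that keeps running per-class means and scatter matrices and refreshes the two-class discriminant direction after each new sample. It is not the algorithm of any of the authors cited above; it is only a Welford-style running-statistics update, and the class name, labels, and toy samples are assumptions made for illustration.

```python
class RunningLDA2D:
    """Running statistics for two classes ("a", "b") of 2-D samples."""

    def __init__(self):
        self.n = {"a": 0, "b": 0}
        self.mean = {"a": [0.0, 0.0], "b": [0.0, 0.0]}
        self.scatter = {"a": [[0.0, 0.0], [0.0, 0.0]],
                        "b": [[0.0, 0.0], [0.0, 0.0]]}

    def update(self, point, label):
        """Fold one labeled sample into the statistics without revisiting old data."""
        self.n[label] += 1
        n = self.n[label]
        m = self.mean[label]
        before = [point[0] - m[0], point[1] - m[1]]  # deviation from old mean
        m[0] += before[0] / n
        m[1] += before[1] / n
        after = [point[0] - m[0], point[1] - m[1]]   # deviation from new mean
        s = self.scatter[label]
        for i in range(2):
            for j in range(2):
                s[i][j] += before[i] * after[j]

    def direction(self):
        """Current w = Sw^{-1} (mean_b - mean_a) from the running statistics."""
        sa, sb = self.scatter["a"], self.scatter["b"]
        sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
        det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
        if det == 0:
            raise ValueError("not enough data: within-class scatter is singular")
        dx = self.mean["b"][0] - self.mean["a"][0]
        dy = self.mean["b"][1] - self.mean["a"][1]
        return [(sw[1][1] * dx - sw[0][1] * dy) / det,
                (-sw[1][0] * dx + sw[0][0] * dy) / det]


# Hypothetical stream of (sample, label) pairs arriving one at a time.
stream = [((1.0, 2.0), "a"), ((6.0, 5.0), "b"), ((2.0, 3.0), "a"),
          ((7.0, 8.0), "b"), ((3.0, 3.0), "a"), ((8.0, 6.0), "b"),
          ((2.0, 1.0), "a"), ((7.0, 5.0), "b")]

model = RunningLDA2D()
for sample, label in stream:
    model.update(sample, label)

w = model.direction()
print(w)
```

Each update touches only the statistics of one class, so the cost per sample is constant. After the whole stream has been seen, the direction matches what a batch computation over the same samples would give.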

Applications of LDA in Software Engineering

Bankruptcy prediction: LDA is useful for predicting bankruptcy from accounting ratios and other financial variables.

Face recognition: In computerized face recognition, each face is represented as a matrix of pixel values. LDA reduces this large number of features to a more manageable number before classification. Each new dimension is a linear combination of pixel values, which forms a template. The combinations obtained with Fisher's linear discriminant are called Fisher faces, while those obtained with PCA are called eigenfaces.

Marketing: Discriminant analysis is used to identify the factors that distinguish different types of customers and/or products on the basis of user-generated data. This is achieved by:

  1. Formulating the problem and gathering useful data.
  2. Estimating the discriminant function coefficients and determining their statistical significance.
  3. Plotting the outcomes on a two-dimensional map, defining the dimensions, and interpreting the results.

Bioinformatics: In medical practice, discriminant analysis helps assess the severity of a patient's condition and predict the eventual outcome of the disease.

Earth Science: LDA has proven effective for pattern recognition within large datasets.