A new 5-step perturbation-based multicollinearity diagnostic package
The ordinary least squares method for estimating the unknown parameters of a multiple linear regression (MLR) model produces an ideal solution if the column vectors (regressors) of the design matrix X are linearly independent. In a typical MLR setting, however, true linear independence of the regressors is often unrealistic. Multicollinearity arises when two or more predictor variables depart from linear independence, which supplies the model with redundant information, causes problems in MLR parameter estimation, and leads to inaccurate statistical inference. The degree of multicollinearity directly reflects both the amount of redundancy or interdependence among the regressors and the inaccuracy of the MLR inference. Several statistical and analytical detection methods exist and are commonly used to diagnose multicollinearity. These diagnostic methods typically produce a measure that reflects the degree of multicollinearity present in the overall model or among the individual regressors. However, they generally fail to break down complex multicollinearity relationships among the regressors. There is also a lack of a methodology that combines perturbation analysis with the available diagnostic measures. In addition, several observational strategies are often overlooked and underutilized for diagnosing multicollinearity. Therefore, we develop a new R package, mcperturb, that encompasses several multicollinearity observational strategies and employs a new 5-step perturbation-based method. This package can identify the regressors that may be the main source of the multicollinearity problem. The outputs of the mcperturb package offer a comprehensible way to observe the relatedness between two or more variables at a deeper level than currently available multicollinearity diagnostic packages.
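To make the idea of pairing a diagnostic measure with perturbation concrete, the following is a minimal Python sketch. It is not the mcperturb package or its 5-step method, whose steps are not described here. It assumes a standard diagnostic, the variance inflation factor (VIF), computed as the diagonal of the inverse correlation matrix, and a simple additive Gaussian perturbation of one regressor. The data, the noise scale, and all function names (`pearson`, `invert`, `vif`, `perturb`) are hypothetical choices made for illustration.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def vif(cols):
    """VIF of each regressor: diagonal of the inverse correlation matrix."""
    k = len(cols)
    corr = [[pearson(cols[i], cols[j]) for j in range(k)] for i in range(k)]
    inv = invert(corr)
    return [inv[i][i] for i in range(k)]


def perturb(col, scale, rng):
    """Add Gaussian noise whose standard deviation is `scale` times that of `col`."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [v + rng.gauss(0.0, scale * sd) for v in col]


# Hypothetical data: x2 is nearly a copy of x1, x3 is unrelated to both.
rng = random.Random(0)
n = 200
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [a + 0.05 * rng.gauss(0.0, 1.0) for a in x1]
x3 = [rng.gauss(0.0, 1.0) for _ in range(n)]

vif_before = vif([x1, x2, x3])
vif_after = vif([x1, perturb(x2, 1.0, rng), x3])

print("VIF before perturbing x2:", [round(v, 2) for v in vif_before])
print("VIF after perturbing x2: ", [round(v, 2) for v in vif_after])
```

In this sketch, a regressor whose VIF falls sharply once another regressor is perturbed is a candidate member of the same collinear group, whereas the unrelated regressor should be largely unaffected. A real analysis would repeat the perturbation many times and over a range of noise levels rather than rely on a single draw.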