Confounding Variable Information

In statistics, a confounding variable (also confounding factor, lurking variable, confound, or confounder) is an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable. The methodologies of scientific studies therefore need to control for these factors to avoid a false positive (Type I) error: an erroneous conclusion that the dependent variable is in a causal relationship with the independent variable. Such a relation between two observed variables is termed a spurious relationship. Confounding is thus a major threat to the validity of inferences about cause and effect, i.e. internal validity, because observed effects should be attributable to the independent variable rather than to the confounder.

Example

For example, consider the statistical relationship between ice cream sales and drowning deaths. These two variables have a positive, and potentially statistically significant, correlation with each other.
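
The pattern can be illustrated with a small simulation. The sketch below is hypothetical throughout: the text above does not name the confounder, so daily temperature is assumed purely for illustration, and all coefficients and noise levels are invented. Both simulated quantities depend only on temperature, yet they correlate strongly with each other; once the linear effect of temperature is removed from each (a partial correlation), the association largely disappears.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, zs):
    """Residuals of a least-squares straight-line fit of ys on zs."""
    my, mz = mean(ys), mean(zs)
    slope = (sum((z - mz) * (y - my) for y, z in zip(ys, zs))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumed confounder (not named in the text): daily temperature.
temperature = [rng.gauss(20, 8) for _ in range(n)]

# Each variable depends on temperature only; neither depends on the other.
ice_cream_sales = [2.0 * t + rng.gauss(0, 10) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 3) for t in temperature]

raw_r = pearson(ice_cream_sales, drownings)
partial_r = pearson(residuals(ice_cream_sales, temperature),
                    residuals(drownings, temperature))

print(f"raw correlation:                         {raw_r:.3f}")
print(f"partial correlation (given temperature): {partial_r:.3f}")
```

The raw correlation is large even though the simulation contains no causal link between the two variables, which is what makes the relationship spurious.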

Experimental controls

There are various ways to modify a study design to actively exclude or control confounding variables:[1]

All these methods have their drawbacks:

  1. Case-control studies are feasible only when it is easy to find controls, i.e., persons whose status vis-à-vis all known potential confounding factors is the same as that of the case's patient: Suppose a case-control study attempts to find the cause of a given disease in a person who is 1) 45 years old, 2) African-American, 3) from Alaska, 4) an avid football player, 5) vegetarian, and 6) working in education. A theoretically perfect control would be a person who, in addition to not having the disease being investigated, matches all these characteristics and has no diseases that the patient does not also have — but finding such a control would be an enormous task.
  2. In cohort studies, the overexclusion of input data may lead researchers to define too narrowly the set of similarly situated persons for whom they claim the study to be useful, such that other persons to whom the causal relationship does in fact apply may lose the opportunity to benefit from the study's recommendations. Similarly, "over-stratification" of input data within a study may reduce the sample size in a given stratum to the point where generalizations drawn by observing the members of that stratum alone are not statistically significant.
  3. Both case-control studies and cohort studies are inevitably subject to the possibility of "residual confounding": if one or more unknown, improperly quantified, or unquantifiable confounding factors are present, the study's results will be biased without the researchers being aware of it.
    • The best available defense against this possibility is often to dispense with efforts at stratification and instead conduct a randomized study of a sufficiently large sample taken as a whole, such that all confounding variables (known and unknown) will be distributed by chance across all study groups.
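
The claim that randomization distributes unknown confounders across study groups can be checked with a simulation. The following is a minimal sketch under invented assumptions: a hidden variable called `severity`, a true treatment effect of 1.0, and a self-selection rule in which subjects with higher severity are more likely to be treated. None of these values come from the text above.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
N = 10000
TRUE_EFFECT = 1.0  # assumed for illustration

def simulate(assign):
    """Generate N subjects; `assign` maps hidden severity to treated (bool)."""
    rows = []
    for _ in range(N):
        severity = rng.gauss(0, 1)  # hidden confounder, unknown to the study
        treated = assign(severity)
        # The outcome depends on both the treatment and the hidden confounder.
        outcome = ((TRUE_EFFECT if treated else 0.0)
                   + 2.0 * severity + rng.gauss(0, 1))
        rows.append({"treated": treated, "severity": severity, "outcome": outcome})
    return rows

def diff_in_means(rows, key):
    """Mean of `key` among treated subjects minus mean among controls."""
    treated = [r[key] for r in rows if r["treated"]]
    control = [r[key] for r in rows if not r["treated"]]
    return sum(treated) / len(treated) - sum(control) / len(control)

# Self-selected groups: higher severity makes treatment more likely.
observational = simulate(lambda s: s + rng.gauss(0, 1) > 0)
# Randomized groups: a fair coin flip that ignores severity.
randomized = simulate(lambda s: rng.random() < 0.5)

for name, rows in (("observational", observational), ("randomized", randomized)):
    print(f"{name:>13}: confounder imbalance = {diff_in_means(rows, 'severity'):+.3f}, "
          f"estimated effect = {diff_in_means(rows, 'outcome'):+.3f}")
```

In the randomized arm the hidden confounder is nearly balanced between groups, so the simple difference in means lands close to the assumed true effect; in the self-selected arm it does not. The balance holds only on average and improves with sample size, which is why the text specifies a sufficiently large sample.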

Types of confounding

Confounding by indication[2]: Evaluating treatment effects from observational data is problematic. Prognostic factors may influence treatment decisions, producing a type of bias referred to as “confounding by indication”. Controlling for known prognostic factors may reduce this problem, but it is always possible that a forgotten or unknown factor was not included or that factors interact complexly. Confounding by indication has been described as the most important limitation of observational studies of treatment effects. Randomized trials are not affected by confounding by indication.
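
A minimal sketch of confounding by indication follows, with every number invented for illustration: a binary prognostic factor `severe`, a treatment that is assumed to lower the risk of death by 10 percentage points, and clinicians who treat severe cases more often. The crude comparison makes the treatment look harmful, while stratifying on the known prognostic factor recovers an estimate close to the assumed effect.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable
TRUE_EFFECT = -0.10     # assumed: treatment lowers death risk by 10 points

patients = []
for _ in range(20000):
    severe = rng.random() < 0.5  # known prognostic factor
    # The indication: severe cases are treated far more often.
    treated = rng.random() < (0.8 if severe else 0.2)
    risk = (0.5 if severe else 0.2) + (TRUE_EFFECT if treated else 0.0)
    died = rng.random() < risk
    patients.append((severe, treated, died))

def risk_difference(rows):
    """Death rate among treated patients minus death rate among untreated."""
    t = [died for _, treated, died in rows if treated]
    c = [died for _, treated, died in rows if not treated]
    return sum(t) / len(t) - sum(c) / len(c)

# Crude comparison ignores severity entirely.
crude = risk_difference(patients)

# Stratified comparison: estimate within each severity level, then
# average the stratum estimates weighted by stratum size.
strata = [[p for p in patients if p[0] == level] for level in (True, False)]
adjusted = sum(len(s) / len(patients) * risk_difference(s) for s in strata)

print(f"crude risk difference:    {crude:+.3f}")
print(f"adjusted risk difference: {adjusted:+.3f}")
```

The adjustment works here only because the single prognostic factor that drives treatment decisions is measured. As the paragraph above notes, a forgotten or unknown factor would leave the adjusted estimate biased as well.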

Confounding variables may also be categorised according to their source: the choice of measurement instrument (operational confound), situational characteristics (procedural confound), or inter-individual differences (person confound).

References

  1. Mayrent, Sherry L (1987). Epidemiology in Medicine. Lippincott Williams & Wilkins. ISBN 0-316-35636-0.
  2. Johnston, SC (2001). Identifying Confounding by Indication through Blinded Prospective Review. Am J Epidemiol 154:276–84.
  3. Pelham, Brett (2006). Conducting Research in Psychology. Belmont: Wadsworth Publishing. ISBN 0534532942.

The above information uses material from Wikipedia and is licensed under the GNU Free Documentation License.
This page was last archived by our server on Tue Jul 19 14:15:33 2011.