Behrens–Fisher problem

In statistics, the Behrens–Fisher problem is the problem of interval estimation and hypothesis testing concerning the difference between the means of two normally distributed populations when the variances of the two populations are not assumed to be equal, based on two independent samples.

The Behrens–Fisher problem has been solved; in fact, three solutions exist: that of Chapman (1950), that of Prokof'yev and Shishkin (1974), and that of Dudewicz and Ahmed (1998). Dudewicz, Ma, Mai, and Su compared the three solutions in 2007 and recommended the Dudewicz–Ahmed procedure for practical use.

Ronald Fisher in 1935 introduced fiducial inference in order to apply it to this problem. He referred to an earlier paper by W. V. Behrens from 1929. Behrens and Fisher proposed to find the probability distribution of

: T \equiv \frac{\bar x_1 - \bar x_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}

where \bar x_1 and \bar x_2 are the two sample means, and s_1 and s_2 are their standard deviations. Fisher approximated the distribution of this by ignoring the random variation of the relative sizes of the standard deviations,

: \frac{s_1 / \sqrt{n_1}}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}.
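
As a minimal sketch, the statistic T and the ratio of relative standard deviations can be computed from two samples as follows. The function name `behrens_fisher_statistic` and the data are illustrative assumptions, not part of the original papers; the sample variances use the usual n - 1 denominator.

```python
import math
import statistics


def behrens_fisher_statistic(x1, x2):
    """Return (T, ratio) for two independent samples x1 and x2.

    T     = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2)
    ratio = (s1 / sqrt(n1)) / sqrt(s1^2/n1 + s2^2/n2)
    """
    n1, n2 = len(x1), len(x2)
    # Sample variances s_i^2 (denominator n_i - 1).
    v1, v2 = statistics.variance(x1), statistics.variance(x2)
    # Estimated standard error of the difference of means.
    se = math.sqrt(v1 / n1 + v2 / n2)
    t_stat = (statistics.mean(x1) - statistics.mean(x2)) / se
    ratio = math.sqrt(v1 / n1) / se
    return t_stat, ratio


# Made-up data, for illustration only.
t_stat, ratio = behrens_fisher_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_stat, ratio)
```

Here both samples have variance 2.5 and size 5, so the standard error is 1 and T equals the raw difference of the means.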

Fisher's solution provoked controversy because it did not have the property that the hypothesis of equal means would be rejected with probability α if the means were in fact equal. Many other methods of treating the problem have been proposed since.

Welch's approximate t solution

The most widely used method (for example in statistical packages and in Microsoft Excel) is that of B. L. Welch, who, like Fisher, was at University College London. The variance of the mean difference

: \bar d = \bar x_1 - \bar x_2

is estimated by

: s_{\bar d}^2 = s_1^2/n_1 + s_2^2/n_2.

Welch (1938) approximated the distribution of s_{\bar d}^2 by the Type III Pearson distribution (a scaled chi-squared distribution) whose first two moments agree with those of s_{\bar d}^2. This corresponds to the following number of degrees of freedom (d.f.), which is generally non-integer:

: \nu = \frac{(\gamma_1 + \gamma_2)^2}{\gamma_1^2/(n_1-1) + \gamma_2^2/(n_2-1)} \text{ where } \gamma_i = \sigma_i^2/n_i.
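
The moment matching behind this formula can be sketched as follows. This is a reconstruction under the normality assumption, not a quotation from Welch's paper, and the scale constant c is a symbol introduced here for illustration.

```latex
\begin{align*}
\operatorname{E}[s_i^2] &= \sigma_i^2, \qquad
\operatorname{Var}[s_i^2] = \frac{2\sigma_i^4}{n_i - 1}, \\
\operatorname{E}[s_{\bar d}^2] &= \gamma_1 + \gamma_2, \qquad
\operatorname{Var}[s_{\bar d}^2] = \frac{2\gamma_1^2}{n_1 - 1} + \frac{2\gamma_2^2}{n_2 - 1}, \\
\operatorname{E}[c\,\chi^2_\nu] &= c\,\nu, \qquad
\operatorname{Var}[c\,\chi^2_\nu] = 2c^2\nu, \\
\nu &= \frac{2\,\operatorname{E}[s_{\bar d}^2]^2}{\operatorname{Var}[s_{\bar d}^2]}
     = \frac{(\gamma_1 + \gamma_2)^2}{\gamma_1^2/(n_1 - 1) + \gamma_2^2/(n_2 - 1)}.
\end{align*}
```

Equating the mean and variance of the scaled chi-squared variable with those of the variance estimate and eliminating c gives the stated degrees of freedom.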

Under the null hypothesis of equal expectations, \mu_1 = \mu_2, the distribution of the Behrens–Fisher statistic T, which also depends on the variance ratio \sigma_1^2/\sigma_2^2, can be approximated by Student's t distribution with \nu degrees of freedom. But \nu contains the population variances \sigma_i^2, which are unknown. The following estimate simply replaces the population variances with the sample variances:

: \hat\nu = \frac{(g_1 + g_2)^2}{g_1^2/(n_1-1) + g_2^2/(n_2-1)} \text{ where } g_i = s_i^2/n_i.
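
The estimated degrees of freedom can be computed directly from the sample variances and sample sizes. The sketch below assumes a hypothetical helper name, `welch_df`, and made-up inputs.

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Estimated degrees of freedom from sample variances and sample sizes."""
    g1 = s1_sq / n1  # g_1 = s_1^2 / n_1
    g2 = s2_sq / n2  # g_2 = s_2^2 / n_2
    return (g1 + g2) ** 2 / (g1 ** 2 / (n1 - 1) + g2 ** 2 / (n2 - 1))


# Equal variances and equal sample sizes give 2 * (n - 1) degrees of freedom.
nu_equal = welch_df(2.5, 5, 2.5, 5)
# Unequal variances and sizes generally give a non-integer value.
nu_unequal = welch_df(9.0, 10, 1.0, 20)
print(nu_equal, nu_unequal)
```

The result always lies between min(n_1 - 1, n_2 - 1) and n_1 + n_2 - 2, which can serve as a quick sanity check.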

This \hat\nu is a random variable. A t distribution with a random number of degrees of freedom does not exist. Nevertheless, the Behrens–Fisher T can be compared with the corresponding quantile of Student's t distribution with this estimated number of degrees of freedom, \hat\nu, which is generally non-integer. In this way, the boundary between the acceptance and rejection regions of the test statistic T is described by a smooth function of the empirical variances s_i^2.
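
Comparing T with a quantile is equivalent to comparing a p-value with the significance level. The following sketch uses only the Python standard library and obtains the two-sided p-value by integrating the Student t density numerically with Simpson's rule; statistical packages typically use the incomplete beta function instead. The function names and the default of 2000 integration steps are assumptions made for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution; df may be non-integer."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| > |t_stat|) via Simpson's rule on [0, |t_stat|] (steps must be even)."""
    a = abs(t_stat)
    if a == 0.0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2.0 * total * h / 3.0  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_mass)


def welch_reject(t_stat, nu_hat, alpha=0.05):
    """Reject equal means when the approximate p-value falls below alpha."""
    return two_sided_p_value(t_stat, nu_hat) < alpha


# Illustrative values: T = -1 with 8 estimated degrees of freedom.
p = two_sided_p_value(-1.0, 8.0)
print(p, welch_reject(-1.0, 8.0))
```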

This method also does not attain the nominal rejection rate exactly, but it is generally not far off. However, if the population variances are equal, or if the samples are rather small and the population variances can be assumed to be approximately equal, it is more accurate to use the standard method, which is the two-sample t-test.
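
How close the rejection rate comes to the nominal level can be explored by simulation. The sketch below draws normal samples with equal means and counts how often Welch's approximate test and the pooled two-sample t-test reject at level alpha. The sample sizes, standard deviations, seed, and number of replications are arbitrary assumptions, and the p-values come from a simple numerical integration rather than a library routine.

```python
import math
import random


def t_pdf(x, df):
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=200):
    """P(|T| > |t|) for Student's t, by Simpson's rule (steps must be even)."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def mean_var(x):
    """Sample mean and sample variance (denominator n - 1)."""
    n = len(x)
    m = sum(x) / n
    return m, sum((xi - m) ** 2 for xi in x) / (n - 1)


def welch_p(x1, x2):
    (m1, v1), (m2, v2) = mean_var(x1), mean_var(x2)
    n1, n2 = len(x1), len(x2)
    g1, g2 = v1 / n1, v2 / n2
    df = (g1 + g2) ** 2 / (g1 ** 2 / (n1 - 1) + g2 ** 2 / (n2 - 1))
    return two_sided_p((m1 - m2) / math.sqrt(g1 + g2), df)


def pooled_p(x1, x2):
    (m1, v1), (m2, v2) = mean_var(x1), mean_var(x2)
    n1, n2 = len(x1), len(x2)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / df  # pooled variance
    return two_sided_p((m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2)), df)


def rejection_rates(n1, n2, sigma1, sigma2, alpha=0.05, reps=2000, seed=0):
    """Fraction of null-hypothesis samples rejected by (Welch, pooled)."""
    rng = random.Random(seed)
    welch = pooled = 0
    for _ in range(reps):
        x1 = [rng.gauss(0.0, sigma1) for _ in range(n1)]
        x2 = [rng.gauss(0.0, sigma2) for _ in range(n2)]
        welch += welch_p(x1, x2) < alpha
        pooled += pooled_p(x1, x2) < alpha
    return welch / reps, pooled / reps


# Smaller sample paired with the larger variance (arbitrary illustration).
welch_rate, pooled_rate = rejection_rates(10, 20, sigma1=3.0, sigma2=1.0)
print(welch_rate, pooled_rate)
```

In a configuration like this one, where the smaller sample has the larger variance, the pooled test tends to reject more often than the nominal level; calling `rejection_rates` with sigma1 equal to sigma2 shows the opposite case described above, where the pooled test is the appropriate choice.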

References and external links

*W. V. Behrens, "Ein Beitrag zur Fehlerberechnung bei wenigen Beobachtungen", 'Landwirtschaftliche Jahrbücher' 68 (1929), pp. 807–37.

* [http://eric.ed.gov/ERICDocs/data/ericdocs2/content_storage_01/0000000b/80/25/e5/02.pdf "On the Behrens–Fisher Problem: A Review"], by Seock-Ho Kim and Allan Cohen, University of Wisconsin-Madison, 1995. Paper presented at the annual meeting of the Psychometric Society, Minneapolis.

* [http://www.stat.wisc.edu/Department/techreports/tr1111r.pdf "Distributional Property of the Generalized p-value for the Behrens–Fisher Problem with Applications to Multiple Testing"], by Kam-Wah Tsui and Shijie Tang, University of Wisconsin-Madison, October 31, 2005.

* [http://sankhya.isical.ac.in/search/servlet/SSearch?s_order=2&choice1=author&text1=Ruben&opt1=And&choice2=title&text2=&opt2=And&choice3=title&text3=&opt3=And&choice4=keyword&text4=&rel_yr=equalto&yearsrch=2002&rel_vol=equalto&volumesrch=64&series=on&part=on&amssrch=&num=20&cntr=0 "A simple conservative and robust solution of the Behrens–Fisher problem"], by Harold Ruben, 'The Indian Journal of Statistics', Series A, 64 (1) (2002), pp. 139–155.

* [http://www.jstor.org.ezp1.harvard.edu/view/00063444/di992284/99p0179u/0?currentResult=00063444%2bdi992284%2b99p0179u%2b0%2c4F31&searchUrl=http%3A%2F%2Fwww.jstor.org%2Fsearch%2FAdvancedResults%3Fhp%3D25%26si%3D1%26All%3DWelch%26Exact%3D%26One%3D%26None%3D%26sd%3D1938%26ed%3D1938%26jt%3DBiometrika "The significance of the difference between two means when the population variances are unequal"] by B. L. Welch, 1938, 'Biometrika' 29, pp. 350–62.

* [http://web.uvic.ca/econ/ewp0404.pdf A solution using empirical likelihood]

*Dudewicz, E.J., S.U. Ahmed (1998) New exact and asymptotically optimal solution to the Behrens–Fisher problem, with tables. American Journal of Mathematical and Management Sciences, 18, 359–426.

*Dudewicz, E.J., S.U. Ahmed (1999) New exact and asymptotically optimal heteroscedastic statistical procedures and tables, II. American Journal of Mathematical and Management Sciences, 19, 157–180.

*Dudewicz, E.J., Y. Ma, S.E. Mai, and H. Su (2007) Exact solutions to the Behrens–Fisher problem: Asymptotically optimal and finite sample efficient choice among. Journal of Statistical Planning and Inference, 137, 1584–1605.

*Fraser, D.A.S., Rousseau, J. (2008) Studentization and deriving accurate p-values. Biometrika, 95 (1), 1–16. doi:10.1093/biomet/asm093

*Sawilowsky, Shlomo S. (2002). [http://tbf.coe.wayne.edu/jmasm/sawilowsky_behrens_fisher.pdf Fermat, Schubert, Einstein, and Behrens–Fisher: The Probable Difference Between Two Means When σ1 ≠ σ2] "Journal of Modern Applied Statistical Methods", 1(2).


Wikimedia Foundation. 2010.
