The Z-test is a statistical test used in inference which determines if the difference between a sample mean and the population mean is large enough to be
statistically significant, that is, if it is unlikely to have occurred by chance.
The Z-test is used primarily with standardized testing to determine whether the test scores of a particular sample of test takers are within or outside of the standard performance of test takers.
Notation and mathematics
In order for the Z-test to be reliable, certain conditions must be met. The most important is that the population standard deviation must be known, since the Z-test uses it. The sample must be a simple random sample of the population. If the sample came from a different sampling method, a different formula must be used. It must also be known that the population varies normally (i.e., the distribution of values in the population follows a normal curve). If it is not known that the population varies normally, it suffices to have a sufficiently large sample; a common rule of thumb is a sample size of at least 30 or 40.
In actuality, knowing the true σ of a population is unrealistic except for cases such as standardized testing in which the entire population is known. In cases where it is impossible to measure every member of a population it is more realistic to use a t-test, which uses the standard error obtained from the sample along with the t-distribution.
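As a rough illustration of the t-test alternative, the sketch below computes a one-sample t statistic from the sample's own standard deviation. The function name `t_statistic` and the five scores are made up for illustration and do not come from the article; the resulting statistic would be compared against a t-distribution with n − 1 degrees of freedom, which is not done here.

```python
import math
import statistics


def t_statistic(sample, mu):
    """Return (t, degrees_of_freedom) for a one-sample t-test.

    Unlike the Z-test, the standard error is estimated from the sample
    itself, because the population standard deviation is not known.
    """
    n = len(sample)
    s = statistics.stdev(sample)        # sample standard deviation (n - 1 denominator)
    se = s / math.sqrt(n)               # estimated standard error of the mean
    t = (statistics.mean(sample) - mu) / se
    return t, n - 1


# Hypothetical scores, for illustration only.
scores = [98, 102, 95, 101, 99]
t, df = t_statistic(scores, mu=100)
print(f"t = {t:.3f} with {df} degrees of freedom")
```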
The test requires the following to be known:
* σ (the standard deviation of the population)
First calculate the standard error (SE) of the mean:
$$\mathrm{SE} = \frac{\sigma}{\sqrt{n}}$$
where n is the size of the sample.
The formula for calculating the z score for the Z-test is as follows:
$$z = \frac{x - \mu}{\mathrm{SE}}$$
where:
* x is a mean score to be standardized
* μ is the mean of the population
Finally, the "z" score is compared to a Z table, a table which contains the percent of area under the normal curve between the mean and the "z" score. Using this table will indicate whether the calculated "z" score is within the realm of chance or if the "z" score is so different from the mean that the sample mean is unlikely to have happened by chance.
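The steps described above (standard error, "z" score, table lookup) can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function names are assumptions, the numbers in the usage lines are made up, and the table lookup is replaced by the standard normal distribution computed from `math.erf`.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean when the population sigma is known."""
    return sigma / math.sqrt(n)


def z_score(sample_mean, mu, sigma, n):
    """Distance of the sample mean from mu, in units of the standard error."""
    return (sample_mean - mu) / standard_error(sigma, n)


def area_mean_to_z(z):
    """Area under the standard normal curve between the mean and |z|.

    This is the quantity listed in the kind of Z table described above.
    """
    return 0.5 * math.erf(abs(z) / math.sqrt(2.0))


# Hypothetical inputs: sample mean 52, population mean 50, sigma 10, n = 25.
z = z_score(52.0, 50.0, 10.0, 25)
print(f"z = {z:.2f}, area between mean and z = {area_mean_to_z(z):.4f}")
```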
Let's take a look at using the Z-test with standardized testing.
In a U.S. school district, a standardized reading test is used to test the performance of fifth grade students in an elementary school against the national norm for fifth grade students. The number of fifth grade students in this elementary school taking the test is 55.
The national norm test score, the population mean, for this particular standardized test is 100 points. The population standard deviation for the year under study is 12.
The scores of the fifth grade students of the elementary school in this school district are a sample of the total population of fifth grade students in the U.S. who have also taken the test.
The school district is told that the mean for their particular school is 96, which is lower than the national mean. Parents of the students become upset when they learn their school is below the national norm for the reading test. The school district administration points out that the test scores are actually pretty close to the population mean though they are lower.
The real question is this: is the school's mean test score sufficiently lower than the national norm to indicate a problem, or is it within acceptable parameters? We will use the Z-test to see.
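First, calculate the standard error of the mean:
$$\mathrm{SE} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{55}} = \frac{12}{7.42} = 1.62$$
Next, calculate the "z" score:
$$z = \frac{x - \mu}{\mathrm{SE}} = \frac{96 - 100}{1.62} = -2.47$$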
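The arithmetic for this example can be checked with a few lines of Python. The variable names are chosen here for illustration; the values are the ones given in the example (population mean 100, population standard deviation 12, 55 students, sample mean 96).

```python
import math

mu = 100.0           # national norm (population mean)
sigma = 12.0         # population standard deviation
n = 55               # number of fifth grade students in the sample
sample_mean = 96.0   # mean score reported for the school

se = sigma / math.sqrt(n)       # standard error of the mean
z = (sample_mean - mu) / se     # "z" score of the sample mean

print(f"SE = {se:.2f}")   # SE = 1.62
print(f"z  = {z:.2f}")    # z  = -2.47
```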
Remember that the "z" score here is the distance of the sample mean from the population mean, in units of the standard error. This means that in our example, a mean score of 96 is 2.47 standard errors below the population mean; the negative sign indicates that the sample mean is less than the population mean. Since the normal curve is symmetric, the Z table is always expressed in positive "z" scores, so if the calculated "z" score is negative, look up its absolute value in the table.
Next we look the "z" score up in a Z table and find that the entry for a "z" score of 2.47 is 49.32%. This means that the area under the normal curve between the population mean and our sample mean is 49.32%.
What this tells us is that 49.32% plus 50%, or 99.32% of the time, a randomly selected group of 55 students would have a higher average score than these 55 students had. This is because our "z" score is negative, so we are below the population mean: we include not only the area between our sample mean and the population mean, but also the half of the normal curve that lies above the population mean.
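The table lookup can be reproduced without a printed Z table. The sketch below assumes the rounded "z" score of −2.47 from the example and computes the standard normal distribution from `math.erf`; the helper name `phi` is an assumption made for illustration.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z = -2.47

# Area between the population mean and the sample mean (the Z table entry).
area_between = phi(abs(z)) - 0.5

# Share of random groups of 55 expected to score higher than this group:
# the area between the means plus the half of the curve above the mean.
share_higher = area_between + 0.5

print(f"{area_between:.4f}")  # 0.4932
print(f"{share_higher:.4f}")  # 0.9932
```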
If our sample mean had been 104 rather than 96, then our "z" score would have been 2.47, indicating that our sample mean was above the population mean. That would have placed the mean score of the fifth grade students in our sample in roughly the top 0.7% of mean scores for groups of 55 students nationwide.
But let's get back to our original question. Is there a problem with the reading program at our elementary school? Our question can be reformulated to say, is the mean from our elementary school, a sample from the general population of fifth grade students, far enough outside of the norm that we need to take a corrective action to improve the reading program?
Let's put this in the form of a hypothesis which we are going to test with our statistical analysis. Our alternative hypothesis is that our sample mean is significantly different from the population mean and that corrective action is necessary. Our null hypothesis is that the difference is purely attributable to chance and no action is necessary.
To answer this question, we need to choose the significance level we want to use (equivalently, the confidence level, which is one minus the significance level). Typically a 0.05 significance level is used, meaning that if the null hypothesis is true we stand only a 5% chance of rejecting it anyway.
In the case of our sample mean, the "z" score of −2.47, which gives a table value of 49.32%, means that 49.32% plus 50%, or 99.32% of the time, a randomly selected group of 55 students would have a higher average score than the 55 students in our sample had. Only about 0.68% of such groups would score as low or lower. To test our null hypothesis, we have to conduct a two-sided test, so we also count the equally extreme results in the upper tail: 2 × 0.68%, or about 1.4%. Since this is less than the 5% level chosen above, we have to reject the null hypothesis. Equivalently, the magnitude of our "z" score, 2.47, exceeds the two-sided critical value of 1.96 that corresponds to a 5% level.
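The two-sided decision can be sketched as follows. This is an illustrative check under the example's assumptions (rounded "z" score of −2.47, a 5% threshold); the names `phi`, `alpha` and `reject_null` are chosen here and are not part of any particular library.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z = -2.47
alpha = 0.05   # threshold for rejecting the null hypothesis

one_tail = phi(-abs(z))         # area beyond |z| in a single tail
p_two_sided = 2.0 * one_tail    # both tails count in a two-sided test
reject_null = p_two_sided < alpha

print(f"two-sided p-value = {p_two_sided:.4f}")  # two-sided p-value = 0.0135
print("reject null hypothesis:", reject_null)    # reject null hypothesis: True
```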
Therefore we can conclude, at the 5% significance level (95% confidence), that the test performance of the students in our sample was not within the normal variation.
* [http://groups.google.ca/groups?selm=3757C73F.45B4675F%40geog.uu.nl Code/pseudo-code for Z-test at Google Groups]
* Sprinthall, Richard C. Basic Statistical Analysis: Seventh Edition, copyright 2003, Pearson Education Group
Wikimedia Foundation. 2010.