Powerful Hypothesis Testing

Age 16 to 18
Challenge Level: 2 stars

Why do this problem?


This problem is designed to help students understand that the power of a test depends on a variety of factors.  It is thus a far more intricate question than that of handling the significance level of a test.  It can also lead to an understanding that interpreting the result of a hypothesis test is not straightforward: what does a non-significant result actually mean?  Is it that the null hypothesis is true, or that the experiment was simply not powerful enough to discover that it is false?  The distinction between these possibilities is crucial in many areas where hypothesis testing is performed: it is too easy to assert incorrectly that the null hypothesis is true (or likely to be true).  This links in well with the activity Hypothetical Shorts.

As an extension, it is also possible to work out algebraically the probability of rejecting the null hypothesis if it is false; it is important, though, to also develop a sense of how different factors affect the answer.

In this resource, we use a binomial hypothesis test for the simplicity of description, but the principles are applicable more generally.

Possible approach


Students would benefit from having some exposure to hypothesis testing before looking at this simulation.  It would also be very helpful for them to have access to the simulation themselves so that they can explore it.

The problem could be posed in a real-world context as opposed to picking balls from a bag: you could ask students to suggest real-life contexts where we would be interested in distinguishing between two competing hypotheses.  For example, we could be trying to find out whether a new drug is better than the standard one, or whether eating certain foods for breakfast or doing a certain amount of exercise improves students' chances of passing a particular test.  The former would lead to a decision about whether to use the drug in future, while the latter might affect advice on how best to prepare for tests.  Nevertheless, the theoretical ideas are subtle enough that it is probably simpler to work with abstract coloured balls for the actual activity.

You could then explain that Robin, the experimenter, wants to know how likely it is that the experiment will successfully reject the null hypothesis if it is false.  (Robin already knows that if the null hypothesis is true, the experiment will reject it with a probability of 5%, the significance level.)

Students may require guidance as to how to use the simulation.  For example, they could begin with the default of 2 red balls, 3 green balls, $H_0\colon \pi=\frac{1}{2}$ and 50 trials, and note the proportion of the experiments in which $H_0$ is rejected after doing some large number of experiments.  (The simulation provides this figure for students.)  They could then do this again with a different proportion of red and green balls and note what changes.  It would be good to ask students to make a prediction before they rerun the simulation, and to compare their prediction with the actual results.
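
For teachers who would like to reproduce this kind of experiment away from the simulation, the following is a minimal Python sketch rather than the code behind the NRICH simulation.  It rests on several assumptions that are not stated in the resource: that $\pi$ is the probability of drawing a red ball, that balls are drawn with replacement, and that the test is two-sided with at most 2.5% in each tail.  The function names `critical_region` and `rejection_rate` are hypothetical.

```python
import random
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def critical_region(n, p0, alpha=0.05):
    """Return (lower, upper): reject H0 when X <= lower or X >= upper.

    Assumption: an equal-tailed two-sided test, with each tail having
    probability at most alpha / 2 when H0: pi = p0 is true.
    """
    pmf = [binom_pmf(k, n, p0) for k in range(n + 1)]

    lower, cum = -1, 0.0  # lower = -1 means the lower tail is empty
    for k in range(n + 1):
        cum += pmf[k]
        if cum > alpha / 2:
            break
        lower = k

    upper, cum = n + 1, 0.0  # upper = n + 1 means the upper tail is empty
    for k in range(n, -1, -1):
        cum += pmf[k]
        if cum > alpha / 2:
            break
        upper = k

    return lower, upper


def rejection_rate(red, green, n_trials, p0=0.5,
                   n_experiments=10_000, alpha=0.05, seed=0):
    """Proportion of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)
    p_true = red / (red + green)  # assumed: pi is the chance of drawing red
    lower, upper = critical_region(n_trials, p0, alpha)

    rejections = 0
    for _ in range(n_experiments):
        # One experiment: count reds in n_trials draws with replacement.
        reds = sum(rng.random() < p_true for _ in range(n_trials))
        if reds <= lower or reds >= upper:
            rejections += 1
    return rejections / n_experiments


if __name__ == "__main__":
    # Default setting from the text: 2 red, 3 green, 50 trials.
    print(rejection_rate(red=2, green=3, n_trials=50))
    # Same bag, more trials: students can predict the change first.
    print(rejection_rate(red=2, green=3, n_trials=200))
```

Changing one argument at a time (the numbers of balls, the number of trials, or `p0`) mirrors the systematic exploration suggested for the simulation itself.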

Students could then go on to change some of the parameters in a systematic fashion and consider the questions provided.

Key questions

  • What does a significant result (one with the p-value below 0.05) tell us?
  • What factors affect the probability of obtaining a significant result if the null hypothesis is false?
  • What does a non-significant result (one with the p-value above 0.05) tell us?

Possible extension

  • Can you theoretically work out the probability of obtaining a significant result if the null hypothesis is false?
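
One way to approach this extension is to list the outcomes that lead to rejection and add up their probabilities under the true proportion: if $R$ is the set of counts for which $H_0$ is rejected and $\pi_1$ is the true probability, the power is $\sum_{k\in R}\binom{n}{k}\pi_1^k(1-\pi_1)^{n-k}$.  The Python sketch below carries out this sum.  It assumes an equal-tailed two-sided binomial test (at most 2.5% in each tail), which may differ from the rule used in the simulation; the names `rejection_set` and `power` are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_set(n, p0, alpha=0.05):
    """Counts k for which an equal-tailed two-sided test rejects H0: pi = p0.

    Assumption: k is in the rejection set when P(X <= k) or P(X >= k),
    computed under H0, is at most alpha / 2.
    """
    def lower_tail(k):
        return sum(binom_pmf(j, n, p0) for j in range(k + 1))

    def upper_tail(k):
        return sum(binom_pmf(j, n, p0) for j in range(k, n + 1))

    return [k for k in range(n + 1)
            if lower_tail(k) <= alpha / 2 or upper_tail(k) <= alpha / 2]


def power(n, p0, p_true, alpha=0.05):
    """Probability of rejecting H0: pi = p0 when the true probability is p_true."""
    return sum(binom_pmf(k, n, p_true) for k in rejection_set(n, p0, alpha))


if __name__ == "__main__":
    # True probability 2/5 (2 red balls out of 5), H0: pi = 1/2.
    for n in (50, 100, 200):
        print(n, round(power(n, 0.5, 0.4), 3))
    # With p_true equal to p0 this gives the actual size of the test,
    # which cannot exceed alpha for this rejection rule.
    print(round(power(50, 0.5, 0.5), 3))
```

Comparing these exact values with the proportions observed in the simulation gives students a check on both their algebra and their experimental results.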

Possible support


Students will benefit from being systematic when working with the simulation and recording their results as they go.  There are several factors involved, so it is wise to adjust just one factor at a time.

To work out what a significant result means, students may need prompting to use a tree diagram.
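
The tree diagram has a first branching on whether the null hypothesis is true or false, and a second on whether the result is significant.  The short Python sketch below multiplies along the two branches that end in a significant result and compares them.  It is only an illustration: the prior probability that the null hypothesis is false and the power values used here are made-up numbers, not figures from the resource, and the function name is hypothetical.  It also treats "the null hypothesis is false" as a single alternative with a single power, which is a simplification.

```python
def prob_null_false_given_significant(prior_false, power, alpha=0.05):
    """P(H0 is false | significant result), read off a two-level tree diagram.

    prior_false: assumed probability, before the experiment, that H0 is false
    power:       P(significant | H0 false)
    alpha:       P(significant | H0 true), the significance level
    """
    false_and_significant = prior_false * power      # branch: H0 false, significant
    true_and_significant = (1 - prior_false) * alpha  # branch: H0 true, significant
    return false_and_significant / (false_and_significant + true_and_significant)


if __name__ == "__main__":
    # Illustrative numbers only: H0 is assumed false in 1 case out of 10.
    print(round(prob_null_false_given_significant(0.1, 0.2), 2))  # 0.31
    print(round(prob_null_false_given_significant(0.1, 0.8), 2))  # 0.64
```

With these illustrative figures, a significant result from the low-powered experiment is more likely to be a false alarm than a genuine discovery, which is one way of showing students why power matters when interpreting significance.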

The NRICH Project aims to enrich the mathematical experiences of all learners. To support this aim, members of the NRICH team work in a wide range of capacities, including providing professional development for teachers wishing to embed rich mathematical tasks into everyday classroom practice.

NRICH is part of the family of activities in the Millennium Mathematics Project.