Robin's Hypothesis Testing

Age 16 to 18
Challenge Level: 2 stars

Why do this problem?


This problem is designed to help students understand what hypothesis tests mean, and in particular why the experiment, including its sample size, must be fully specified before we begin; otherwise the results may be meaningless.  An important technique called sequential testing does allow an experiment to be stopped early while the results remain valid, but it requires significant care, as this resource shows.  (Bayesian inference takes an alternative approach to this, but that is another story entirely.)

In this resource, we use a binomial test, but the principles are more generally applicable.  The solution section provides a more detailed explanation of these ideas.
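
As a small worked illustration (assuming a two-sided exact test, which this page does not specify), suppose the first 6 balls drawn are all red under $H_0\colon \pi=\frac{1}{2}$.  Counting the equally extreme outcome of 6 greens as well, the p-value is

$$p = 2 \times \left(\tfrac{1}{2}\right)^{6} = \tfrac{1}{32} \approx 0.031 < 0.05.$$

So a short run of luck early in an experiment can already push the p-value below 0.05, which is why the point at which we stop matters.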

Possible approach


Students would benefit from having some exposure to hypothesis testing before looking at this simulation.  It would also be very helpful for them to have access to the simulation themselves so that they can explore it.

To set the problem in a real-world context rather than picking balls from a bag, you could ask students to suggest situations in which we would want, or be forced, to limit the number of trials in an experiment.  For example, we could be doing laboratory experiments in which all of the materials are expensive.  Or we might be trialling a new drug, where it costs a large amount to test it on each person, or where only a limited number of people have the condition the drug is designed to treat.  The experiment might involve animals, and we wish to limit the number used for ethical reasons.  Another reason, related to cost, is that each trial might take a long time, perhaps a day or two, so that very large numbers of trials are not feasible.

You could then explain that Robin, the experimenter, has suggested a way of saving money, as described in the problem.  Your students, as budding statisticians, will need to consider Robin's proposed method, and explain why it is good and will save money, or why it is broken and will potentially give a misleading answer.

Students may require guidance as to how to use the simulation.  For example, they could begin with 2 red balls, 2 green balls, $H_0\colon \pi=\frac{1}{2}$ and 50 trials, hide the p-values graph, and just note the proportion of the experiments in which $H_0$ is rejected based on the final p-value.  They could then repeat this but note the proportion of the experiments in which the p-value ever drops below 0.05.  What does this suggest?
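
The comparison described above can also be sketched in code for students who want to check their observations.  The following Python is a minimal sketch and not the code behind the NRICH simulation: it assumes a two-sided exact binomial test, sampling with replacement (so 2 red and 2 green balls give a true proportion of red of 0.5), and rejection whenever the p-value is below 0.05.  All function names are illustrative.

```python
import random
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_value_table(max_n, p0):
    """Two-sided exact p-values for every (n, k) with 1 <= n <= max_n.

    Assumed definition: the total probability under H0 of all outcomes
    that are no more likely than the observed count k.
    """
    table = {}
    for n in range(1, max_n + 1):
        pmf = [binom_pmf(k, n, p0) for k in range(n + 1)]
        for k in range(n + 1):
            # Small relative tolerance so that ties are counted despite rounding.
            limit = pmf[k] * (1 + 1e-9)
            table[n, k] = min(1.0, sum(q for q in pmf if q <= limit))
    return table


def simulate(num_experiments=2000, num_trials=50, true_p=0.5,
             p0=0.5, alpha=0.05, seed=0):
    """Return (final_rate, ever_rate).

    final_rate: proportion of experiments rejecting H0 on the final p-value.
    ever_rate:  proportion in which the p-value was below alpha after
                at least one trial (Robin's stop-early rule).
    """
    rng = random.Random(seed)
    table = p_value_table(num_trials, p0)
    final_rejections = 0
    ever_rejections = 0
    for _ in range(num_experiments):
        reds = 0
        ever_below = False
        for n in range(1, num_trials + 1):
            if rng.random() < true_p:
                reds += 1
            if table[n, reds] < alpha:
                ever_below = True
        if table[num_trials, reds] < alpha:
            final_rejections += 1
        if ever_below:
            ever_rejections += 1
    return final_rejections / num_experiments, ever_rejections / num_experiments


if __name__ == "__main__":
    final_rate, ever_rate = simulate()
    print("Rejected on final p-value:", final_rate)
    print("p-value ever below 0.05:  ", ever_rate)
```

Because $H_0$ is true in this set-up, every rejection is a false alarm.  Comparing the two printed proportions shows how often Robin's rule of stopping as soon as the p-value dips below 0.05 would mislead us, relative to a test with the number of trials fixed in advance.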

Students could then go on to change some of the parameters in a systematic fashion, exploring whether their initial ideas hold true more generally.

Key questions

  • Is it necessary to specify the number of trials in advance?
  • What would happen if we didn't?

Possible extension

  • Is there any way of stopping the experiment early and still obtaining useful results?
  • What is the benefit of doing more trials?  Surely we would still only reject $H_0$ 5% of the time?  You can use the simulation to explore this.
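
One way to explore the second extension question away from the simulation is to compute, for a fixed number of trials, the exact probability of rejecting $H_0$ when the true proportion differs from the hypothesised one.  The sketch below makes the same assumptions as a two-sided exact binomial test with rejection when the p-value is below 0.05; the function names and the example true proportion of 0.7 are illustrative choices, not taken from the resource.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(n, true_p, p0=0.5, alpha=0.05):
    """Exact chance that a fixed-size test with n trials rejects H0: pi = p0
    when the true proportion is true_p."""
    null = [binom_pmf(k, n, p0) for k in range(n + 1)]
    total = 0.0
    for k in range(n + 1):
        # Two-sided p-value: outcomes no more likely under H0 than k.
        limit = null[k] * (1 + 1e-9)
        p_value = min(1.0, sum(q for q in null if q <= limit))
        if p_value < alpha:
            total += binom_pmf(k, n, true_p)
    return total


if __name__ == "__main__":
    for n in (20, 50, 200):
        print(n,
              "reject when H0 is true:", round(rejection_probability(n, 0.5), 3),
              "reject when pi = 0.7:", round(rejection_probability(n, 0.7), 3))
```

When $H_0$ is true, the rejection probability stays at or below 5% however many trials we use.  When $H_0$ is false, the chance of detecting this grows with the number of trials, which is the benefit the question is pointing towards.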

Possible support


There are several things which can be changed in the simulation, and it is easy to get lost.  Students will benefit from being systematic, and guiding them to structure their exploration and recording of results will help them to understand what is happening.


The NRICH Project aims to enrich the mathematical experiences of all learners. To support this aim, members of the NRICH team work in a wide range of capacities, including providing professional development for teachers wishing to embed rich mathematical tasks into everyday classroom practice.

NRICH is part of the family of activities in the Millennium Mathematics Project.
