Do you believe that a crowd can be more intelligent than any individual in the crowd?
Find out with an easy experiment!
Fill a jar with jelly beans (or similar small items) and ask as many people as possible to guess how many there are.
Why do this problem?
This problem provides an experimental context in which students can compare the advantages of the median and the mean as data summaries, while investigating an interesting phenomenon - that in some cases, a crowd of people guessing independently makes better decisions than the individuals of which it is made.
Possible approach
Provide a transparent container which is full of small sweets or other small items - there should be too many for anyone to count or estimate easily.
Tell the students to survey as many people as possible, asking them how many sweets they think the container contains. Students should keep a record of the guesses (with names, if sweets are involved, so that the winner can receive their prize!), then calculate the median and mean average.
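The two calculations can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the list of guesses is made up for illustration and is not real survey data.

```python
from statistics import mean, median

# Hypothetical guesses, invented purely for illustration
guesses = [250, 300, 320, 400, 450, 500, 620, 800, 1500]

mean_guess = mean(guesses)      # sum of the guesses divided by how many there are
median_guess = median(guesses)  # middle value once the guesses are sorted

print(f"Number of guesses: {len(guesses)}")
print(f"Mean:   {mean_guess:.1f}")
print(f"Median: {median_guess}")
```

Students can replace the list with their own survey results and compare the output with their hand calculations.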
Key questions
How close are the averages to the actual number of sweets in the container?
How many people guessed closer than the averages?
Which is the best estimate - an individual guess or an average? If an average, which one?
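One way to explore these questions with a larger data set is sketched below. The guesses and the actual count of 480 are assumptions chosen for illustration, and `count_closer` is a hypothetical helper name, not part of any library.

```python
from statistics import mean, median

# Hypothetical data: both the guesses and the true count are invented
guesses = [250, 300, 320, 400, 450, 500, 620, 800, 1500]
actual = 480

def count_closer(guesses, estimate, actual):
    """Count the guesses that are strictly closer to the actual value than the estimate is."""
    estimate_error = abs(estimate - actual)
    return sum(1 for g in guesses if abs(g - actual) < estimate_error)

for name, estimate in [("mean", mean(guesses)), ("median", median(guesses))]:
    error = abs(estimate - actual)
    beaten_by = count_closer(guesses, estimate, actual)
    print(f"The {name} is off by {error:.1f}; {beaten_by} guesses were closer.")
```

With real data the picture may differ, so the results are best treated as a prompt for discussion rather than a rule.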
Possible extension
Students could build up a simple histogram to display the guesses graphically. They should then consider what the distribution of guesses looks like.
What is the overall shape? How do you explain this shape?
Which intervals received the most guesses, and which the fewest?
Are there any particularly extreme guesses?
How symmetrical is the distribution?
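If students are working on a laptop, a rough text histogram can help them check a hand-drawn one. The sketch below is only one possible approach; the guesses, the interval width of 250 and the function name `text_histogram` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical guesses, invented purely for illustration
guesses = [250, 300, 320, 400, 450, 500, 620, 800, 1500]

def text_histogram(guesses, width):
    """Group guesses into equal-width intervals and return one text bar per interval."""
    # Label each guess with the lower bound of the interval it falls into
    counts = Counter((g // width) * width for g in guesses)
    lines = []
    # Walk through every interval, including empty ones, so gaps stay visible
    for start in range(min(counts), max(counts) + width, width):
        bar = "#" * counts[start]
        lines.append(f"{start:>5}-{start + width - 1:<5} {bar}")
    return lines

for line in text_histogram(guesses, 250):
    print(line)
```

Keeping the empty intervals in the output makes any isolated extreme guesses easy to spot.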
The distribution is likely to be skewed, because people are less likely to make extreme under-estimates than over-estimates when guessing like this: a guess cannot fall below zero, but there is no upper limit.
This means that the distribution may well not be symmetric, and that the median and mean will therefore probably differ - a point worth drawing to the students' attention.
The median is not affected by the value of extreme guesses, simply by the number of them, whereas the mean is affected by their value as well as their number.
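This difference can be demonstrated by making one extreme guess even more extreme and recalculating. The sketch below uses made-up guesses; multiplying the largest guess by ten is an arbitrary choice for illustration.

```python
from statistics import mean, median

# Hypothetical guesses, invented purely for illustration
guesses = [250, 300, 320, 400, 450, 500, 620, 800, 1500]

# Replace the largest guess with one ten times bigger
stretched = sorted(guesses)[:-1] + [max(guesses) * 10]

print(f"Original:  median = {median(guesses)}, mean = {mean(guesses):.1f}")
print(f"Stretched: median = {median(stretched)}, mean = {mean(stretched):.1f}")
```

The number of extreme guesses is the same in both lists, so the median does not move, while the mean is pulled upwards by the larger value.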
The geometric mean often gives the best estimate for the actual number of sweets in the container when the guesses are skewed in this way, and investigating it could be a further extension.
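For students who take up this extension, the geometric mean can be computed as sketched below. The guesses are made up for illustration, and `statistics.geometric_mean` requires Python 3.8 or later. The second calculation shows one equivalent way of arriving at the same value: averaging the logarithms of the guesses, then undoing the logarithm.

```python
import math
from statistics import geometric_mean, mean, median

# Hypothetical guesses, invented purely for illustration
guesses = [250, 300, 320, 400, 450, 500, 620, 800, 1500]

geo = geometric_mean(guesses)

# Equivalent calculation: average the logs of the guesses, then exponentiate
log_average = math.exp(sum(math.log(g) for g in guesses) / len(guesses))

print(f"Mean:           {mean(guesses):.1f}")
print(f"Median:         {median(guesses)}")
print(f"Geometric mean: {geo:.1f}")
print(f"Via logarithms: {log_average:.1f}")
```

Because it works on the logarithms, the geometric mean is pulled less strongly than the ordinary mean by a few very large guesses; it only makes sense when every guess is greater than zero.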
Possible support
The most difficult aspect is ensuring that students don't make mistakes in calculating the mean and median if there is a lot of data. It may help to provide a tablet or laptop so that data can be entered directly into a spreadsheet, and any calculations which are done by hand can then be checked against the spreadsheet answers.
Alternatively, students could be given a small subset of the data to analyse by hand, then the data set as a whole analysed with the spreadsheet for further discussion.