Chapter 6: Rejection of Data

Audio Overview

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Picture this.

It is 11:30 p.m.

You are sitting in the glow of your laptop in a, well, a mostly empty college library.

Oh, I know that feeling all too well.

Right.

You have just spent three grueling hours in the physics lab measuring the swing of a pendulum.

You are exhausted.

You just want to finish the lab report and go to sleep.

Exactly.

And you are staring at your column of data.

Now most of your numbers look fantastic.

They are perfectly clustered together validating all your hard work.

But then you see it.

Yeah, right there at the bottom of the spreadsheet.

A single number so wildly, spectacularly different from the rest that it just stops you in your tracks.

It looks like a glaring, undeniable mistake, and you are instantly faced with a choice.

Do you keep that weird number and let it completely ruin your carefully calculated average? Or do you just, you know, highlight it, quietly hit the delete key, and pretend it never happened? It is the ultimate temptation.

Welcome to a special last minute lecture deep dive.

Consider this your personalized one on one tutoring session.

If you are a college student tackling error analysis right now, you are in the exact right place.

Our mission today is to help you master Chapter 6 of your textbook, Introduction to Error Analysis.

The chapter is called Rejection of Data.

And we are going to chart a really clear path through this dilemma.

We'll start by looking at the fundamental problem of anomalous data and, you know, why the scientific community argues so passionately about it.

Because it's a big deal.

It really is.

Then we'll introduce a very specific, objective mathematical tool called Chauvenet's criterion.

Chauvenet's criterion.

Oh, God.

Yeah, it acts as a sort of statistical bouncer to help you make the decision to keep or toss a number.

And finally, we will navigate the gray areas.

The caveats, the controversies and those weird edge cases of throwing out your hard earned data.

Exactly.

Okay.

So let's unpack this.

Before we even touch the math, we really need to talk about why hitting that delete key is so scientifically dangerous.

Right.

Let's use the first example right from the textbook.

Imagine you are timing the period of a pendulum in seconds.

Okay.

In the classic physics lab.

Yep.

You take six measurements.

Your first five tries are incredibly consistent.

They all land between 3.4 and 3.9 seconds.

Which is a great spread.

But on your sixth try, you get a baffling 1.8 seconds.

Wow.

Yeah, that 1.8 is startlingly different from the rest of your cluster.

It really is.

Now, if you are lucky, there's an easy way out of this predicament.

The absolute best case scenario for rejecting a data point is having concrete, external evidence that a physical mistake actually occurred in the room.

Okay, wait.

What does a physical mistake look like in this context?

Well, it really comes down to meticulous lab notes.

You look back at your notebook and realize that during that final pendulum swing, your electric timer briefly stopped.

Oh.

Like because of a momentary power failure in the building or something?

Exactly.

Or perhaps you switched to a different stopwatch for that last run and a subsequent check proves that the second watch is defective and runs slow.

If you can establish a definite external cause like that, the textbook is incredibly clear.

That anomalous measurement should absolutely be rejected.

So you don't even need statistics for that?

No, you just need good record keeping.

Okay.

But here's where it gets really interesting and, well, where the frustration really sets in for a student.

Oh, I know what you're going to say.

What if you do not have an excuse?

Your stopwatch was fine.

The power did not flicker.

Your lab partner didn't bump the table.

And you have absolutely no idea why you got a 1.8.

The unexplained anomaly.

Tossing it out just because it looks weird feels a lot like, well, it feels like fixing the data.

It is fixing the data, essentially.

It's like looking at your monthly bank statement, seeing a massive unexplained withdrawal that totally ruins your budget, and just crossing it out with a black marker.

That's a great analogy.

You cannot just pretend it didn't happen to make your finances look better.

And that gets right to the heart of the scientific stakes here.

The decision to reject data without external evidence is deeply subjective.

Because you're just guessing.

Right.

Let's look at the actual mathematical impact on your pendulum experiment.

If you average all six measurements, including that wild 1.8, your mean is 3.4 seconds.

But if you throw out the 1.8 and only average the first five, your mean jumps to 3.7 seconds.

That is a huge difference in the world of physics.

It's massive.

You are fundamentally altering the outcome of your experiment and shifting reality on paper based on nothing but a hunch that one number is bad.
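The shift in the average can be checked with a few lines of Python. This is only a sketch: the dialogue gives the 3.4 to 3.9 range and the 1.8, but not the five individual readings, so the values below are illustrative assumptions chosen to reproduce the quoted means.

```python
from statistics import mean

# Illustrative readings (assumed, not quoted in the dialogue): five values
# between 3.4 and 3.9 seconds, plus the suspect 1.8.
periods = [3.8, 3.5, 3.9, 3.9, 3.4, 1.8]  # seconds

mean_all = mean(periods)           # all six measurements
mean_without = mean(periods[:-1])  # the suspect 1.8 left out

print(f"mean with the 1.8:    {mean_all:.1f} s")
print(f"mean without the 1.8: {mean_without:.1f} s")
```

With these assumed readings the two means round to 3.4 and 3.7 seconds, matching the figures in the dialogue.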

Right.

Because if we constantly throw out data that doesn't fit our neat little expectations, we are just confirming what we already believe.

Exactly.

We stop discovering new things and just start echoing our own assumptions.

And that danger is the driving force behind this entire chapter.

The textbook explicitly points out that many of the most important scientific discoveries in history first appeared as anomalous measurements.

Wait, really?

They just look like mistakes.

Yep.

They look like mistakes.

If you just throw out that 1.8 because it's annoying your average, you might be throwing out the most interesting part of the data.

Oh, wow.

You might be missing a real physical effect, like a secondary force or some strange resonance that you haven't accounted for.

So what does this all mean for you sitting in the library with your lab report due in a few hours?

What is the genuinely right thing to do?

Well, the only truly honest course of action, mathematically speaking, is to go back to the lab and repeat that measurement many, many times.

How many is many?

Say a hundred times.

A hundred?

Yeah.

Because if the 1.8 was just a bizarre fluke or a subtle mistake you didn't notice, it won't show up again.

And by the time you average a hundred measurements, that single 1.8 will be drowned out entirely.

Okay.

That makes sense.

On the other hand, if it does keep showing up every tenth swing, you know you have a real phenomenon on your hands.

Let's be real, though.

If you are a student in a teaching laboratory, the lab is locked, the equipment is put away, and your report is due at 8 a.m.

Yeah, it's not happening.

Doing a pendulum swing a hundred more times is totally impractical.

I can't just break into the science building at midnight.

Which is exactly why we need a mathematical rule.

Since you do not have the time or resources to take a hundred measurements, you need an objective criterion to justify rejecting a suspect result.

A rule that takes the decision out of my hands.

Right.

You need a statistical tool that takes the human subjectivity, the temptation to just make the data look pretty, entirely out of the equation.

We need a bouncer for our data set.

A bouncer?

I like that.

Yeah, someone objective at the door to look at the numbers and say, you belong in the club, but you, buddy, are way too far out of line.

You're out.

That bouncer is called Chauvenet's criterion.

It provides a simple instructive application of the Gauss distribution.

The Gauss distribution.

That's what most people know as the normal bell curve, right?

Exactly, the bell curve.

And it uses that curve to solve exactly this problem.

Okay, let's get into the mechanics of this.

I want to understand how Chauvenet's criterion actually works.

I don't want to just plug numbers into a formula blindly.

Sure.

Walk me through the logic so I can apply this to my own homework.

The underlying logic is pure probability.

To understand it, we first have to understand standard deviation.

Imagine you are throwing darts at a dartboard aiming for the bullseye.

The bullseye represents your mean, your average measurement.

Makes sense.

Most of your darts will hit the bullseye or the rings right next to it.

A few might stray further out.

The standard deviation is basically a measure of how spread out your darts generally are.

Okay, so a small standard deviation means I'm a pro and all my darts are tightly packed near the center.

And a large standard deviation means my darts are scattered all over the board.

Precisely.

Chauvenet's criterion looks at the overall spread of your darts, including the weird one, and asks a simple question.

Based on how you normally throw, what is the probability of a dart landing that far away from the bullseye by pure chance?

Okay, so how do we calculate that?

To answer that, we start by calculating the mean and standard deviation of all your measurements.

You do not leave the weird one out yet.

So for the pendulum data, our average swing is 3.4 seconds.

And let's say our standard deviation, our average spread, is 0.8 seconds.

We have our bullseye, and we know how wide our dart groupings usually are.

What does the bouncer do next?

The bouncer needs to measure exactly how far out of line the suspect number is.

We find the difference between your weird number and the average, and we see how many standard deviations fit into that gap.

Okay, let me try this.

Your weird measurement was 1.8.

The average was 3.4, so the gap between them is 1.6 seconds.

Since one standard deviation is 0.8 seconds, that gap is exactly two standard deviations wide.

You got it.

I see.

So we are standardizing the distance.

Instead of saying it is 1.6 seconds away, which means nothing out of context, we say it is two standard deviations away.

Exactly.
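As a minimal sketch of that standardizing step, using only the numbers quoted in the dialogue (the function and variable names are illustrative, not from the textbook):

```python
def standardized_distance(suspect, mean, std_dev):
    """Number of standard deviations between the suspect value and the mean."""
    return abs(suspect - mean) / std_dev

# Values quoted in the dialogue: suspect 1.8 s, mean 3.4 s, standard deviation 0.8 s.
t_sus = standardized_distance(suspect=1.8, mean=3.4, std_dev=0.8)
print(f"the suspect value is {t_sus:.1f} standard deviations from the mean")
```

Dividing by the standard deviation is what makes the distance comparable across experiments with different units and spreads.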

Once we know that distance, we consult the probability tables for a normal bell curve, which you can usually find in the appendix of your textbook.

Oh, the big table of numbers at the back.

That's the one.

We look up the probability of getting a measurement that falls two standard deviations away from the mean.

The table tells us that for a distance of exactly two standard deviations, the probability is 0.05.

Meaning 5%.

Yes.

It means that if you took an infinite number of measurements, you would expect about 5% of them to randomly land that far away from the center just by pure dumb luck.

Okay, I'm following so far.

But a 5% chance doesn't tell me whether to keep it or delete it.
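If the table is not at hand, the same probability can be computed directly. The sketch below assumes a Gaussian distribution and uses the standard identity that the two-sided tail probability beyond t standard deviations is erfc(t / sqrt(2)); the function name is illustrative.

```python
import math

def prob_outside(t):
    """Probability that a Gaussian measurement lands more than t standard
    deviations from the mean, counting both sides of the bell curve."""
    return math.erfc(t / math.sqrt(2))

p = prob_outside(2.0)
print(f"P(outside 2 standard deviations) = {p:.4f}")
```

The computed value is about 0.0455, which the table lookup in the dialogue rounds to 0.05.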

How do we make the final call?

This is where we calculate the expected number of deviant measurements.

The book labels this with a lowercase n.

You simply multiply the total number of measurements you actually took by that probability we just found.

Wait, why are we multiplying the probability by our total number of measurements?

What does that actually represent in the real world?

Well, think about flipping a coin.

The probability of getting heads is 50% or 0.5.

If you flip the coin 10 times, you multiply 10 by 0.5 and you expect to get heads 5 times.

It is the exact same logic here.

Our probability of a wild pendulum swing is 5% or 0.05.

We took a total of six measurements.

Right, the six swings.

So if we multiply 6 by 0.05, we get an expected number of 0.3.

So out of our six little swings, we should only expect to see a number this wild 0.3 times, meaning not even once.

Exactly.

And this brings us to the golden rule of Chauvenet's criterion.

The threshold for rejection is set at one half, or 0.5.

If your expected number is less than 0.5, the criterion dictates that the measurement is ridiculously improbable for the sample size you have, and you are officially permitted to reject it.

So since our expected number is 0.3 and 0.3 is less than 0.5, the bouncer kicks the 1.8 out the door.

It fails the test.

You get to cross it out without feeling like you are cheating.
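The multiplication and the comparison against one half can be sketched as follows. The names are illustrative, and the tail probability is computed with the complementary error function rather than read from a table, so the expected number comes out slightly below the rounded 0.3.

```python
import math

def expected_count(n_measurements, t):
    """Expected number of measurements at least t standard deviations from
    the mean, assuming Gaussian scatter."""
    return n_measurements * math.erfc(t / math.sqrt(2))

THRESHOLD = 0.5  # Chauvenet's conventional boundary of one half

n_expected = expected_count(n_measurements=6, t=2.0)
reject = n_expected < THRESHOLD

print(f"expected number = {n_expected:.2f}, permitted to reject: {reject}")
```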

That is so satisfying.

To make sure this is totally clear, let's look at another scenario from the text.

Sure, let's do the length one.

Yeah.

Imagine a student is measuring a length in millimeters, and they take 10 measurements.

Nine of those measurements are tightly clustered right around 45 millimeters.

But the 10th measurement is a whopping 58 millimeters.

Just conceptually, that 58 is a massive outlier.

So the student checks their equipment, finds no physical mistake, and calls in Chauvenet's bouncer.

Right.

They calculate the mean of all 10 numbers, which comes to 45.8.

Then they calculate the standard deviation, which is 5.1.

Next, they find out how many standard deviations away that 58 is.

The gap between 58 and the average of 45.8 is 12.2.

When you divide that gap of 12.2 by the standard deviation of 5.1, you find the suspect value is about 2.4 standard deviations away.

Perfect.

Then they go to the probability table.

The chance of a dart landing 2.4 standard deviations away from the bullseye is roughly 1.6%, or 0.016.

They multiply their 10 total measurements by that 0.016 probability.

The expected number of times a measurement this weird should happen in a sample of 10 is 0.16.

And since 0.16 is way, way below our golden threshold of 0.5, the 58 is rejected.

Exactly.
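The length example can be reproduced from the quoted summary numbers alone. This sketch also compares the table-style lookup, with the distance rounded to 2.4, against the unrounded distance; the variable names are illustrative.

```python
import math

def prob_outside(t):
    """Two-sided Gaussian tail probability beyond t standard deviations."""
    return math.erfc(t / math.sqrt(2))

N = 10                       # number of length measurements
t_exact = (58 - 45.8) / 5.1  # unrounded distance in standard deviations
t_table = round(t_exact, 1)  # 2.4, as one would look it up in a table

n_table = N * prob_outside(t_table)
n_exact = N * prob_outside(t_exact)

print(f"t rounded to {t_table}: expected number = {n_table:.3f}")
print(f"t unrounded ({t_exact:.3f}): expected number = {n_exact:.3f}")
```

Either way the expected number stays far below 0.5, so the rounding does not change the verdict in this case.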

But here's the part that really matters for the student's lab report.

Once you reject that number, you don't just leave a blank space.

You have to recalculate your final answer using only the remaining nine legitimate measurements.

Yes.

And the cause and effect of that recalculation is striking.

When the student recalculates without the 58, the new average shifts slightly.

Which is expected.

Right.

But the standard deviation, which represents the uncertainty of the entire experiment, plummets.

It drops from 5.1 all the way down to 2.9.

Wow.

That uncertainty is slashed almost in half just by removing that one bad dart from the board.
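The recalculation can be sketched with Python's `statistics` module. The dialogue quotes only the summary statistics, so the ten readings below are illustrative assumptions chosen to reproduce the quoted mean of 45.8 and standard deviations of 5.1 and 2.9.

```python
from statistics import mean, stdev

# Illustrative readings in millimeters (assumed, not quoted in the dialogue).
lengths = [46, 48, 44, 38, 45, 47, 58, 44, 45, 43]
kept = [x for x in lengths if x != 58]  # the nine values left after rejection

# stdev() is the sample standard deviation (N - 1 in the denominator).
print(f"all ten:  mean = {mean(lengths):.1f}, sd = {stdev(lengths):.1f}")
print(f"58 removed: mean = {mean(kept):.1f}, sd = {stdev(kept):.1f}")
```

With these assumed readings the mean moves only from 45.8 to about 44.4, while the standard deviation drops from 5.1 to 2.9.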

It makes the data look incredibly clean and precise.

It does make it look pristine.

However, we need to read the fine print on this magical formula.

I am so glad you brought that up because I have a major pushback here.

What's your?

Setting the absolute boundary for rejection at exactly one half, 0.5, feels incredibly arbitrary.

Why 0.5?

That is the big question.

Right.

Why not 0.3 or 0.8?

If my calculation comes out to 0.49, I throw the data away.

But if it is 0.51, I am forced to keep it.

That feels like a totally made up rule.

Well, what's fascinating here is that many prominent scientists share your exact discomfort.

You are entirely correct.

The boundary of one half is arbitrary.

Yes.

It is reasonable, serving as a tipping point between more likely to happen than not and less likely to happen than not.

But it is absolutely not a fundamental law of physics.

It is just a convention.

It feels risky to build our concept of reality on a mere convention.

Plus, aren't we basing this entire probability curve on a tiny amount of data in the first place?

Yes.

If we connect this to the bigger picture of statistics, that is the most critical flaw in the entire process.

Chauvenet's criterion relies completely on your standard deviation to calculate probability.

But when you only take a few measurements, your standard deviation is a highly uncertain, deeply flawed estimate of the true spread of the data.

Because we haven't thrown enough darts to know what our actual spread looks like.

Exactly.

You are just a college student doing a lab in three hours, not a researcher running a ten-year study.

Right.

The textbook actually provides a sobering statistic on this.

If you only take six measurements, your estimate for standard deviation is about 30% uncertain.

Wait.

30%?

Yes.

That is a massive margin of error for a tool that is supposed to be an objective bouncer.

And it gets worse.

Because the probability in the lookup table is extremely sensitive to that standard deviation, a 30% error in your standard deviation causes a massive, cascading error in your probability.
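A short sketch makes the sensitivity concrete. It assumes the standard result for Gaussian data that the fractional uncertainty in a standard deviation estimated from N measurements is about 1 / sqrt(2(N - 1)); the dialogue quotes only the 30% figure, so the formula and the function names here are assumptions.

```python
import math

def fractional_sd_uncertainty(n):
    """Approximate fractional uncertainty in a standard deviation estimated
    from n Gaussian measurements (assumed formula: 1 / sqrt(2(n - 1)))."""
    return 1 / math.sqrt(2 * (n - 1))

def expected_count(n, t):
    """Expected number of measurements at least t standard deviations out."""
    return n * math.erfc(t / math.sqrt(2))

N, gap, sd = 6, 1.6, 0.8  # pendulum numbers quoted in the dialogue
frac = fractional_sd_uncertainty(N)
print(f"fractional uncertainty in sd for N = {N}: {frac:.2f}")

# Repeat the test with the standard deviation 30% smaller and 30% larger.
results = {}
for label, s in [("sd 30% smaller", 0.7 * sd), ("quoted sd", sd), ("sd 30% larger", 1.3 * sd)]:
    results[label] = expected_count(N, gap / s)
    verdict = "reject" if results[label] < 0.5 else "keep"
    print(f"{label}: expected number = {results[label]:.3f} -> {verdict}")
```

Under these assumptions the verdict flips: with the larger standard deviation the expected number rises above 0.5 and the 1.8 would be kept.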

Oh jeez.

This casts serious doubt on the whole procedure when your sample size is small.

You should really regard Chauvenet's criterion with considerable skepticism unless you have a robust number of measurements, say 50 or more.

Well, if the math is that fragile, what is the best practice?

What should you actually do when you are writing your report and you want to be as scientifically honest as possible?

The textbook offers a very elegant compromise here.

Do not just blindly use Chauvenet's criterion as a weapon to erase data.

Use it as an alarm bell to identify data that might be problematic.

Okay.

Then do all your final calculations twice.

Run the numbers once with the weird dart included and once with it kicked off the board.

Show your work for both realities.

See how much the questionable value actually affects your final conclusion and present both outcomes in your report.

That is really smart.

Chauvenet's criterion should be used as an absolute last resort, only when you cannot check your measurements by repeating the experiment.

I love that.

It shows the professor you really understand the nuance.

But I have an edge case for you.

Bring it on.

What happens if you just have a really bad day in the lab and two of your measurements look terrible?

Can Chauvenet's bouncer kick out two people at once?

This raises an important question and it is where things get even more problematic.

If using this math to reject one measurement is open to doubt, using it to reject several is playing with fire.

But if you absolutely must, the textbook outlines a modified rule.

How does it work?

First, you focus exclusively on the most deviant value, the absolute worst offender.

The dart furthest from the bullseye.

Right.

You calculate the expected number for that worst offender just like before.

But because you have two suspect measurements in your data set, the rejection boundary changes.

Oh, it does?

Yeah.

You double the standard boundary.

Instead of 0.5, your new threshold is 1.

Okay, so the boundary effectively doubles.

If the calculation for my worst offender is less than 1, what happens?

If it is less than 1, you can reject both suspect measurements simultaneously.

A double eviction.

Exactly.

But what if the number for the worst offender is more than 1?

If it is more than 1, you certainly cannot reject both.

In that case, you keep the first most deviant one.

It stays in the data.

Okay.

But you are not quite done.

You then look at the second most deviant value and you test it all by itself using the standard 0.5 boundary.

Oh, I see.

If the expected number for that second value is less than 0.5, you can reject just the second one.

That is a very specific, careful sequence of operations.

It sounds like they really don't want you throwing out multiple numbers unless it is undeniably justified.

Which leads to the final, absolute, non-negotiable warning in the textbook.

We talked about how if you reject a data point, you naturally recalculate your mean and standard deviation with the remaining cleaner data.

Right.

Your uncertainty drops and everything looks much tighter.

Well, because your new standard deviation is suddenly so much smaller, some of your remaining measurements might now look like outliers compared to this new super tight spread.

Oh no.

I see where this is going.

You might be strongly tempted to run Chauvenet's criterion again to clean up the data even more.

It is like peeling an onion.

You strip away the worst layer, but then the next layer suddenly looks like the outside, so you peel that one off too.

You could just keep going until you have nothing left.

Exactly.

And you must not do it.

Agreement is absolute among experts on this point.

Chauvenet's criterion should never be applied a second time using the recalculated values.

Never.

You run the test exactly once on your original data set.

Period.

If you keep reapplying it, you will artificially shrink your uncertainty until your data looks perfectly, impossibly precise, which is a total fabrication of reality.

Okay.

Let's bring this all together.

You are sitting in the library staring at your spreadsheet.

Here is your game plan, your tutoring summary.

Let's hear it.

Step 1.

Check your notes.

Was there a physical mistake like a stopwatch glitch?

If yes, reject the number.

If no, you move to statistics.

Good.

Step 2.

Find your bullseye and your spread by calculating the mean and standard deviation of all the data.

Step 3.

Find how many standard deviations away your suspect number is.

Spot on.

Step 4.

Use the table to find the probability and multiply that by your total number of measurements to get your expected number.

Finally, if that expected number is less than 0.5, the bouncer says reject it.

You recalculate your final answers without it, but you never repeat the process.
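The statistical steps of that game plan can be put together in one small function. This is a sketch under stated assumptions, not the textbook's own procedure: the function name is hypothetical, the tail probability is computed with `math.erfc` instead of a table, and the six readings are illustrative values consistent with the pendulum numbers quoted earlier. The test is applied exactly once, to the original data.

```python
import math
from statistics import mean, stdev

def chauvenet_check(values, threshold=0.5):
    """Apply Chauvenet's criterion once to the most deviant value.

    Returns (suspect, expected_number, may_reject). The mean and sample
    standard deviation are computed from ALL values, suspect included.
    """
    m, s = mean(values), stdev(values)
    suspect = max(values, key=lambda x: abs(x - m))  # most deviant value
    t = abs(suspect - m) / s                         # distance in standard deviations
    expected = len(values) * math.erfc(t / math.sqrt(2))
    return suspect, expected, expected < threshold

# Illustrative pendulum readings in seconds (assumed, not quoted in the dialogue).
periods = [3.8, 3.5, 3.9, 3.9, 3.4, 1.8]
suspect, expected, may_reject = chauvenet_check(periods)
print(f"suspect = {suspect}, expected number = {expected:.2f}, may reject: {may_reject}")

# Report the result both ways, and do not run the check again on the reduced data.
kept = [x for x in periods if x != suspect]
print(f"mean with suspect: {mean(periods):.1f} s, without: {mean(kept):.1f} s")
```

Printing both means follows the dialogue's advice to show how much the questionable value affects the final answer.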

That is the complete, scientifically sound procedural flow for Chapter 6.

But as we close out this session, I want to leave you with something entirely new to ponder.

We've spent this whole time talking about a human student painstakingly deciding whether to manually delete a bad number on a spreadsheet.

Right.

A very manual, deliberate choice.

But think about modern science today.

In massive fields like genomics, particle physics, or climate modeling, human beings aren't looking at spreadsheets of 10 numbers.

No, of course not.

Algorithms and AI software are processing billions of data points every second, and many of those have automated statistical bouncers, rules far more complex than Chauvenet's, built right into their code.

Oh, absolutely.

They are silently scrubbing anomalous data before a human scientist ever even looks at a screen.

The machine is making the philosophical choice for us.

Exactly.

So the next time you wonder about the integrity of data, ask yourself: in our rush to automate and clean up our massive modern data sets, how many brilliant, world-changing anomalies are being quietly deleted by software in the dark?

It's a scary thought.

What mysterious new realities are we blinding ourselves to, simply because a machine was programmed to keep the average looking neat?

It is a profound shift in how we observe the universe.

The math gives us a clean answer, but a clean answer is not always the truth.

So the next time you see a jagged weird number at the bottom of your lab report, don't just see it as a nuisance.

See it as a question.

Beautifully said.

From all of us at The Last Minute Lecture Team, thank you so much for joining us on this deep dive.

We wish you the absolute best of luck on your error analysis studies.

You are now fully equipped to tackle your data and to know exactly when and how to throw it away.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers

Anomalous data points present researchers with a fundamental methodological challenge that sits at the intersection of statistical rigor and scientific integrity. When measurements deviate substantially from the bulk of a dataset, the investigator must decide whether to exclude the outlying value or incorporate it into subsequent analyses. The distinction between justified and unjustified rejection hinges primarily on causation: data can be confidently discarded when a concrete external source for the anomaly exists, such as instrument failure or environmental contamination. In the absence of such identifiable causes, however, the decision becomes ethically fraught, as removing data without clear justification risks accusations of manipulating results toward desired conclusions, while retaining anomalies may obscure genuine patterns or be misleading if the anomaly reflects actual measurement error rather than meaningful variation. Chauvenet's criterion offers a statistically grounded alternative to purely subjective judgment by establishing quantitative thresholds for rejection based on the Gaussian distribution. The method operates by calculating the number of standard deviations separating a suspect observation from the dataset mean, converting this distance into a probability, and determining how many values at that extreme level should theoretically appear in a dataset of the given size. When this expected frequency drops below 0.5, the criterion permits rejection. Despite its mathematical foundation, Chauvenet's criterion carries important limitations that practitioners must recognize. The rejection threshold itself remains somewhat arbitrary, and the reliability of standard deviation estimates deteriorates significantly when sample sizes are small, undermining the criterion's validity in restrictive settings. 
Rather than adopting a binary approach of applying the criterion and then discarding identified outliers, many methodologists advocate for a balanced strategy: use the criterion to flag questionable measurements, then conduct the complete analysis twice, once with and once without the suspect data, explicitly comparing how conclusions shift based on inclusion or exclusion. This dual-analysis framework reveals whether the anomalous point meaningfully alters results or proves inconsequential. The chapter emphasizes avoiding iterative application of the criterion, which progressively tightens datasets toward artificially clean results, and addresses practical situations involving multiple suspect measurements. Ultimately, Chauvenet's criterion serves as a defensible tool for situations where direct measurement verification or replication is genuinely impossible, providing statistical justification while maintaining transparency about the inherent subjectivity underlying all data rejection decisions.
