Chapter 14: Partial Derivatives

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

What if the world isn't just one straight line?

What if the things you care about, like say the weather or maybe the economy, depend on multiple interconnected moving parts, all influencing each other?

Welcome to the deep dive.

We plunge into complex ideas and try to pull out some genuinely surprising insights.

Today we're navigating the multidimensional landscapes of calculus,

specifically how we apply its tools to functions of several variables.

We're drawing heavily from Chapter 14 of Calculus: Early Transcendentals, which really lays the groundwork well.

Our mission?

Let's cut through the complexity.

We want to show you how calculus extends way beyond simple curves, offering a path to understanding, maybe even optimizing really complex stuff.

Think climate modeling, engineering designs, economic systems. Hopefully we'll have some aha moments that make this abstract stuff feel concrete.

Let's dive in.

So you're probably used to functions like f(x) = x^2. One input, one output, gives you a nice curve. But reality,

that's rarely that neat, is it?

Think about the temperature in a room.

It depends on where you are, right?

Not just one thing.

Exactly.

And that's where functions of several variables really come into their own.

We're moving beyond just f(x). We're talking f(x, y), maybe f(x, y, z).

And the big shift: their graphs aren't curves anymore.

They're surfaces in 3D, or even these higher-dimensional things, hypersurfaces. Much closer to how the world actually works.

Okay.

But visualizing those sounds tricky, like trying to grab smoke. How do mathematicians help us see these things?

Right.

Yeah.

Visualization is definitely a hurdle.

One of the main tools we use, as you hinted, is the level curve, sometimes called a contour line.

Basically, these trace out where the function's output, f(x, y), is just a constant value.

Let's call it k.

Ah, okay.

So exactly like a topographic map showing elevation lines.

Precisely the analogy.

If you've ever looked at one of those maps, you're already reading level curves.

They connect points of equal height.

And what's really useful isn't just where the lines are, but how close they are together.

Tightly packed lines.

That means a steep slope.

The function's changing fast there.

Right.

And spread out lines mean it's flatter.

Exactly.

A more gradual change.

Yeah.

And this isn't just for mountains, I assume.

Where else do we see this in action?

Maybe daily life.

Oh, absolutely.

Think about weather maps.

You see isothermals all the time.

Isothermals.

Constant temperature.

Yep.

Lines connecting points with the same temperature.

Like in figure 13 in the source, showing average July temperatures.

Or you see isobars.

Constant pressure lines.

Right.

And looking at how close those isobars are tells you about wind speed.

Wind flows from high to low pressure.

And where those lines are tightest, that's where the wind is strongest.

Hmm.

It's like a visual shortcut to understanding the dynamics.

Pretty cool.

It really is.

Okay.

So beyond weather,

what about other functions where multiple inputs are key?

I'm thinking of like windchill.

Something many of us feel.

It's not just the temperature.

The wind makes a huge difference.

Perfect example.

The wind chill index, let's call it W, is definitely a function of the actual air temperature T and the wind speed v.

So W = f(T, v).

And the National Weather Service, for instance, they have tables for this.

They might say if it's -5 degrees C but you've got a 50 km/h wind, it feels like -15 degrees.

So f(-5, 50) = -15.

That directly tells you how many layers to put on.

It's practical.

And then shifting gears completely, you have economics.

The Cobb-Douglas production function.

Right.

I've heard of that one.

It's fundamental.

Cobb and Douglas came up with P(L, K) = b L^α K^(1-α), where b is a constant, to model a country's production.

P is total production, L is labor, K is capital investment.

And they even used real data back in the day to refine it.

They landed on something like P(L, K) = 1.01 L^0.75 K^0.25 for the U.S. economy then.
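As a rough illustration of the fitted model quoted above, here is a minimal Python sketch. The function name `cobb_douglas` and the sample inputs of 100 units each of labor and capital are assumptions chosen for illustration, not figures from the source. Because the exponents 0.75 and 0.25 add up to 1, doubling both inputs should double the output.

```python
def cobb_douglas(labor, capital, b=1.01, alpha=0.75):
    """P(L, K) = b * L**alpha * K**(1 - alpha)."""
    return b * labor**alpha * capital**(1 - alpha)

# Hypothetical input levels, chosen only to make the arithmetic easy to follow.
base = cobb_douglas(100.0, 100.0)
doubled = cobb_douglas(200.0, 200.0)

print("output at (100, 100):", base)
print("ratio after doubling both inputs:", doubled / base)
```

The ratio comes out at 2 up to floating-point rounding, which is the constant-returns-to-scale behavior built into this form of the model.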

Wow.

So it shows how output depends on both workforce and investment.

These functions capture complex interactions.

Exactly.

They model those relationships.

Okay.

We've covered two variables.

But what if there are more?

Three.

Four inputs.

Does calculus just give up then?

Not at all.

For three variables, say f(x, y, z).

You're right.

We can't directly graph it.

That would need four dimensions.

But we just extend the idea.

Instead of level curves, we talk about level surfaces.

Level surfaces.

Think of f(x, y, z) = x^2 + y^2 + z^2.

If you set that equal to different constants K, what do you get?

Spheres centered at the origin.

Exactly.

Concentric spheres.

Each sphere represents all the points (x, y, z) where the function has the same value k.

It's how we conceptualize these higher dimensions even if we can't draw them perfectly.

Okay.

That makes sense.

So functions can have lots of inputs interacting.

But how do we measure change?

Like if the temperature depends on location (x, y, z), and maybe even time t, how do we figure out how fast it's changing just because of, say, moving along the x-axis while everything else stays put?

That is the exact question that leads us to partial derivatives.

They're the tool for that job.

They let us measure how a function changes when only one input variable changes while we hold all the others constant, frozen, essentially.

So you're saying if I have f(x, y) and want the change with respect to x, I just pretend y is a number?

Yeah.

Like five or something and differentiate normally.

That's basically it.

It's that straightforward conceptually.

To find the partial derivative with respect to x, we write f_x or that curly-d thing, ∂f/∂x.

For f_x, you just differentiate like you always do for x, but you treat y and any other variables like z as if they were fixed constants.

Same process.

If you want f_y, treat x as constant.
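The "freeze the other variable" recipe can be sketched numerically. This is a minimal Python sketch, assuming a sample function f(x, y) = x^3 + x^2 y^3 - 2y^2 that does not come from the transcript; the helper names `partial_x` and `partial_y` are likewise hypothetical. Each helper nudges one input with a central difference while the other input stays fixed.

```python
def f(x, y):
    # Hypothetical sample function, used only for illustration.
    return x**3 + x**2 * y**3 - 2 * y**2

def partial_x(func, x, y, h=1e-6):
    """Estimate f_x: vary x by a small step, hold y constant."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, h=1e-6):
    """Estimate f_y: vary y by a small step, hold x constant."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

fx_estimate = partial_x(f, 2.0, 1.0)
fy_estimate = partial_y(f, 2.0, 1.0)
print(fx_estimate, fy_estimate)
```

Differentiating by hand gives f_x = 3x^2 + 2xy^3 and f_y = 3x^2 y^2 - 4y, so at (2, 1) the estimates should land near 16 and 8.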

Okay.

That sounds manageable.

It brings it back to single variable calculus rules we already know.

Precisely.

But what does it mean geometrically?

What are we looking at on the surface?

Good question.

Geometrically, f_x at a point (a, b) is the slope of the tangent line to the surface right at that point,

if you slice the surface with a vertical plane that's parallel to the x-axis, going through y = b.

Okay.

Like you're walking on the surface but only allowed to move east-west.

Exactly.

And f_x is the steepness of your path in that specific direction, and f_y is the steepness if you're only allowed to move north-south, parallel to the y-axis.

Let's make it concrete.

Back to the heat index, I = f(T, H).

If we calculate these partials, what can they tell us?

Okay.

So using the estimates from the text,

we find that f_T(96, 70) is about 3.75, meaning when it's 96 degrees Fahrenheit and 70 percent humidity,

the feels-like temperature goes up by roughly 3.75 degrees F for every single degree the actual temperature increases.

Wow.

Okay.

And the humidity part.

Right.

f_H(96, 70) is about 0.9.

So under those same conditions, the heat index climbs by only about 0.9 degrees F for each percentage point increase in humidity.

So at that specific point, a change in the actual temperature has a much bigger impact on how hot it feels than a change in humidity does.

It gives you that precise sensitivity under those specific conditions.

Very practical.

What about something like BMI, body mass index? B(m, h) = m/h^2, with mass m and height h.

What do the partials tell us there?

Okay.

Let's look at ∂B/∂m.

That's how BMI changes with mass, keeping height fixed.

For someone at, say, 64 kilograms and 1.68 meters tall, ∂B/∂m is about 0.35.

So if that person gains 1 kilogram, their BMI goes up by roughly 0.35 points.

Makes sense.

And the height part, ∂B/∂h?

That tells you how BMI changes with height, keeping mass constant.

For the same person, ∂B/∂h is about -27.

Whoa.

Negative.

Yeah, negative.

It means if that person grew taller, say by 1 centimeter, 0.01 meter, while staying 64 kilograms, their BMI would decrease by about 0.27 points: that's -27 times 0.01.

Height makes you relatively slimmer, holding weight constant.
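The BMI figures above can be reproduced in a few lines. This is a minimal Python sketch under the formula B(m, h) = m/h^2; the function names are hypothetical, and the partial derivatives are worked out by hand as ∂B/∂m = 1/h^2 and ∂B/∂h = -2m/h^3.

```python
def bmi(m, h):
    """Body mass index for mass m in kilograms and height h in meters."""
    return m / h**2

def bmi_partial_m(m, h):
    """Rate of change of BMI with mass, height held fixed."""
    return 1 / h**2

def bmi_partial_h(m, h):
    """Rate of change of BMI with height, mass held fixed."""
    return -2 * m / h**3

m0, h0 = 64.0, 1.68
rate_m = bmi_partial_m(m0, h0)
rate_h = bmi_partial_h(m0, h0)

# Compare the linear estimate for a 1 cm height gain with the exact change.
estimated_change = rate_h * 0.01
exact_change = bmi(m0, h0 + 0.01) - bmi(m0, h0)
print(rate_m, rate_h, estimated_change, exact_change)
```

The linear estimate and the exact change both come out close to -0.27, which is why the partial derivative is a fair stand-in for small changes.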

That really clarifies the impact of each variable.

Cool.

Can we keep differentiating?

Can you take the partial derivative of a partial derivative, like f_xy?

Is that a thing?

Oh, absolutely.

Those are called higher order partial derivatives.

You just differentiate again, following the same rules.

And there's something really elegant here called Clairaut's theorem.

It basically says that under pretty normal conditions, like if the second partials are continuous, the order you differentiate in doesn't matter.

Wait, really?

So f_xy, doing x then y, is the same as f_yx, doing y then x?

Yep.

f_xy = f_yx. It's a nice symmetry, and it can simplify calculations quite a bit sometimes.
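A quick numerical spot check of that symmetry, as a minimal Python sketch. The sample function f(x, y) = sin(x) y^2 + x e^y and the evaluation point are assumptions for illustration. The first partials are written out by hand, then each is differentiated numerically with respect to the other variable.

```python
import math

def f_x(x, y):
    # Hand-derived partial of f(x, y) = sin(x) * y**2 + x * exp(y) with respect to x.
    return math.cos(x) * y**2 + math.exp(y)

def f_y(x, y):
    # Hand-derived partial of the same function with respect to y.
    return 2 * math.sin(x) * y + x * math.exp(y)

def f_xy(x, y, h=1e-6):
    """Differentiate f_x with respect to y (x first, then y)."""
    return (f_x(x, y + h) - f_x(x, y - h)) / (2 * h)

def f_yx(x, y, h=1e-6):
    """Differentiate f_y with respect to x (y first, then x)."""
    return (f_y(x + h, y) - f_y(x - h, y)) / (2 * h)

x0, y0 = 0.7, 1.3  # arbitrary sample point
mixed_xy = f_xy(x0, y0)
mixed_yx = f_yx(x0, y0)
exact = 2 * y0 * math.cos(x0) + math.exp(y0)
print(mixed_xy, mixed_yx, exact)
```

Both orders agree with each other and with the hand-computed mixed partial 2y cos(x) + e^y, up to the error of the finite-difference step.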

Okay.

Thinking back to single variable calc, the tangent line was super useful for approximating a curve near a point.

Is there an equivalent for surfaces?

Can we approximate a bumpy surface with something flatter?

That's the perfect analogy.

Yes, we use a tangent plane.

Just like zooming in on a smooth curve makes it look like a line, if you zoom in really close to a smooth surface, it starts to look very much like a flat plane.

That flat plane is the tangent plane.

And I bet there's an equation for it.

You bet correctly.

For a surface z = f(x, y) at a point (x0, y0, z0),

the tangent plane equation is z - z0 = f_x(x0, y0)(x - x0) + f_y(x0, y0)(y - y0).

Notice how it uses those partial derivatives we just talked about.

Oh, okay.

They determine the tilt of the plane in the x and y directions.

Exactly.

And this equation is the foundation for linear approximation in multiple dimensions.

So the tangent plane itself is the linear approximation.

Pretty much.

The function defined by that tangent plane equation, we often call it L(x, y), the linearization of f at (x0, y0).

And the key idea is that f(x, y) is approximately equal to L(x, y) when your point (x, y) is close to (x0, y0).

It's a linear stand-in for the potentially complex function f.

Let's bring back the heat index again.

f(T, H).

We knew f(96, 70) was 125.

And we had those estimates, f_T at 3.75 and f_H at 0.9 at that point.

How could we estimate the heat index if the temp nudged up to 97 degrees Fahrenheit and humidity to 72%?

Okay.

Using the linear approximation formula, we build it around T = 96 and H = 70.

So f(T, H) ≈ f(96, 70) + f_T(96, 70)(T - 96) + f_H(96, 70)(H - 70).

Plugging in the numbers.

Right.

f(T, H) ≈ 125 + 3.75(T - 96) + 0.9(H - 70).

Now we want f(97, 72), so plug in T = 97, H = 72. That's T - 96 = 1 and H - 70 = 2, so f(97, 72) ≈ 125 + 3.75 + 1.8.

Carry the 1.

130.55.

Exactly.

130.55 degrees Fahrenheit.

It gives you a quick estimate without needing the full, possibly very complex, original function f.

Super useful for quick checks.
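The estimate above is just the linearization evaluated at the new point. A minimal Python sketch, using only the three numbers quoted in the discussion (f(96, 70) = 125, f_T = 3.75, f_H = 0.9); the function name and its parameter names are hypothetical.

```python
def heat_index_linearization(T, H, f0=125.0, f_T=3.75, f_H=0.9, T0=96.0, H0=70.0):
    """L(T, H) = f(T0, H0) + f_T * (T - T0) + f_H * (H - H0)."""
    return f0 + f_T * (T - T0) + f_H * (H - H0)

estimate = heat_index_linearization(97.0, 72.0)
print(round(estimate, 2))
```

The further (T, H) drifts from (96, 70), the less this estimate should be trusted, since it ignores any curvature in the real heat index function.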

That is a powerful shortcut.

What about estimating small errors?

Like if my measurements for x and y are slightly off, how much might my output z be off?

That's where differentials come in handy.

For z = f(x, y), the total differential is written dz = f_x dx + f_y dy.

Here dx and dy represent small changes or potential errors in x and y, and dz gives you the approximate resulting change or error in z.

How would that work?

Give me an example.

Okay.

Imagine calculating the volume of a rectangular box.

V = xyz.

Suppose your measurements for x, y, z could each be off by a small amount, say ε.

We want to estimate the maximum possible error in the calculated volume, dV.

Using differentials, dV = V_x dx + V_y dy + V_z dz. The partials are V_x = yz, V_y = xz, V_z = xy.

So dV = yz dx + xz dy + xy dz.

The maximum error in each is ε, so dx = dy = dz = ε.

The source calculates that the potential volume error dV could be up to (60)(40)ε + (75)(40)ε + (75)(60)ε, which adds up to 9900ε.

Wow.

So even small measurement errors can add up to a significant uncertainty in the results.

Absolutely.

That's crucial for experimental science and engineering, understanding how errors propagate.
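To see how well the differential tracks the true worst case, here is a minimal Python sketch. The dimensions 75, 60, and 40 are read off from the products in the sum quoted above; the error size `eps = 0.2` is a hypothetical value chosen for illustration and is not stated in the transcript.

```python
def volume(x, y, z):
    return x * y * z

def volume_error_estimate(x, y, z, eps):
    """Total differential dV = yz dx + xz dy + xy dz with dx = dy = dz = eps."""
    return y * z * eps + x * z * eps + x * y * eps

x, y, z = 75.0, 60.0, 40.0  # dimensions implied by the products in the sum
eps = 0.2                    # hypothetical measurement error

estimate = volume_error_estimate(x, y, z, eps)
worst_case = volume(x + eps, y + eps, z + eps) - volume(x, y, z)
print(estimate, worst_case)
```

With this choice of `eps`, the differential gives 9900 times 0.2, and the exact worst case is only slightly larger, because the terms the differential drops involve products of two or three small errors.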

All right.

We've tackled measuring change in specific directions and approximating functions.

But reality is often dynamic, isn't it?

What if the variables themselves, like x and y, are changing over time?

Or what if the temperature at a point depends on its location, but the location itself is changing?

How do we track change through these layers?

Now, you're talking about the chain rule for multivariable functions.

It's the tool designed precisely for following these interconnected paths of change.

Let's start with case one.

Suppose z depends on x and y, so z = f(x, y), but both x and y are themselves changing with respect to a single parameter, let's say time t, so x = x(t) and y = y(t). Then z also indirectly depends on t.

The chain rule tells us how to find dz/dt.

And the formula is?

It's dz/dt = (∂z/∂x)(dx/dt) + (∂z/∂y)(dy/dt).

See how it combines the rates?

The first term accounts for how z changes via x, and the second term accounts for how z changes via y as t changes.

Okay, let's try it.

How about the ideal gas law?

PV = 8.31T.

Pressure P, volume V, temperature T.

Suppose we have some gas.

Its temperature is rising, and maybe its container is expanding at the same time.

How fast is the pressure changing?

Perfect application.

Let's rearrange to solve for pressure.

P = 8.31T/V.

So P is a function of T and V.

Now, let's say at some instant T = 300 K, and it's increasing at 0.1 K/s, so dT/dt = 0.1, and maybe V = 100 L, and it's expanding at 0.2 L/s, so dV/dt = 0.2.

So we use the chain rule: dP/dt = (∂P/∂T)(dT/dt) + (∂P/∂V)(dV/dt). We need the partial derivatives of P.

∂P/∂T is 8.31/V, and ∂P/∂V is -8.31T/V^2.

Right.

Now, plug in all the values, T = 300, V = 100, dT/dt = 0.1, dV/dt = 0.2: dP/dt = (8.31/100)(0.1) + (-8.31 · 300/100^2)(0.2).

You crunch those numbers, and it comes out to roughly -0.04155 kPa/s.

Negative.

So the pressure is decreasing even though the temperature is rising.

Yes.

In this case, the effect of the volume increasing is overpowering the effect of the temperature increasing, causing the pressure to drop.

The chain rule lets us quantify that net effect.

It's essential for dynamic systems.
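The chain-rule arithmetic for the gas example can be checked directly. This is a minimal Python sketch using the values from the discussion; the function names are hypothetical, and the cross-check assumes T and V drift at their stated constant rates over a very short time window.

```python
def pressure(T, V):
    """Pressure in kPa from PV = 8.31T, with T in kelvins and V in liters."""
    return 8.31 * T / V

def pressure_rate(T, V, dT_dt, dV_dt):
    """Chain rule: dP/dt = (dP/dT)(dT/dt) + (dP/dV)(dV/dt)."""
    dP_dT = 8.31 / V
    dP_dV = -8.31 * T / V**2
    return dP_dT * dT_dt + dP_dV * dV_dt

rate = pressure_rate(300.0, 100.0, 0.1, 0.2)

# Cross-check: follow T(t) and V(t) directly over a short time step h.
h = 1e-4
numeric = (pressure(300.0 + 0.1 * h, 100.0 + 0.2 * h)
           - pressure(300.0 - 0.1 * h, 100.0 - 0.2 * h)) / (2 * h)
print(rate, numeric)
```

Both numbers come out near -0.04155, with the temperature term contributing +0.00831 and the volume term contributing -0.04986.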

That really shows its power.

And that was just case one.

There's case two, where x and y might depend on multiple parameters, like s and t.

Then you'd find partial derivatives, like ∂z/∂s and ∂z/∂t, using a similar but expanded chain rule formula.

It scales up.

OK, got it.

So partial derivatives nail down the change in the cardinal directions: east-west, that's f_x; north-south, that's f_y.

But what if I'm standing on a hillside, and I want to know the slope if I walk, say, southeast,

or how fast the temperature on a metal plate changes if I move diagonally across it?

We need change in any direction, right?

Exactly.

We need directional derivatives.

And asking that question leads us straight to one of the absolute cornerstone concepts of multivariable calculus, the gradient vector.

The gradient vector, usually written ∇f, where the upside-down delta symbol is called nabla, is simply a vector built from the first partial derivatives.

For f(x, y), ∇f = (f_x, f_y). For f(x, y, z), it's (f_x, f_y, f_z).

It's like packing all the first-order change information into one vector.
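The hillside question can be made concrete with the standard formula D_u f = ∇f · u, where u is a unit vector. The transcript does not state this formula, and the hill z = 1000 - 0.005x^2 - 0.01y^2 and the standing point (60, 40) below are hypothetical values chosen for illustration. East is taken as the positive x direction and north as positive y, matching the east-west and north-south convention used above.

```python
import math

def hill_gradient(x, y):
    """Gradient of the hypothetical hill z = 1000 - 0.005x^2 - 0.01y^2."""
    return (-0.01 * x, -0.02 * y)

def directional_derivative(grad, direction):
    """D_u f = grad f . u, where u is the unit vector along `direction`."""
    norm = math.hypot(direction[0], direction[1])
    ux, uy = direction[0] / norm, direction[1] / norm
    return grad[0] * ux + grad[1] * uy

grad = hill_gradient(60.0, 40.0)
slope_southeast = directional_derivative(grad, (1.0, -1.0))
steepest = math.hypot(grad[0], grad[1])
print(grad, slope_southeast, steepest)
```

Walking southeast from this point climbs gently (a slope of about 0.14), while the steepest possible slope, the length of the gradient, is about 1.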

And why is this gradient vector so important?

You called it a cornerstone.

What are the aha moments with the gradient?

OK, there are three huge properties you need to know.

They're really insightful.

First,

the gradient vector ∇f at a point always, always points in the direction where the function f increases most rapidly.

The direction of steepest ascent, like pointing straight uphill.

Exactly.

If you want to get higher faster, follow the gradient.

Second property: the magnitude of the gradient vector, |∇f|, tells you what that maximum rate of increase is.

So it gives you direction and the steepness in that direction.

Precisely.

And third, this one connects everything visually.

The gradient vector ∇f is always perpendicular, or normal, to the level curve or level surface of f passing through that point.

Perpendicular to the level curve? Why does that make sense?

Well, think about it.

If you walk along a level curve, the function's value isn't changing at all, right?

The rate of change is zero.

So the direction where it changes most, the gradient, must be at a right angle to the direction where it doesn't change at all, along the level curve.

It creates this beautiful geometric relationship that clicks.

Okay, let's test it.

Suppose the temperature in space is given by T(x, y, z) = 80/(1 + x^2 + 2y^2 + 3z^2).

If I'm at the point (1, 1, -2), where should I go to warm up the fastest?

And how fast would the temperature increase?

All right, we need the gradient.

We calculate the partial derivatives T_x, T_y, T_z and evaluate them at (1, 1, -2).

The source does this calculation and finds that the gradient vector there points in the direction of (-1, -2, 6), or technically is proportional to it after simplification.

So I should head in the direction -i - 2j + 6k.

That's the direction of fastest temperature increase.

Now, for the rate of increase, we need the magnitude of that gradient vector at the point.

The calculation shows |∇T| at (1, 1, -2) is (5/8)√41, which is roughly 4.

So the temperature increases at about 4 degrees C per meter if I move in that specific optimal direction.
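The gradient calculation for this temperature field is short enough to verify by machine. A minimal Python sketch, assuming the field T(x, y, z) = 80/(1 + x^2 + 2y^2 + 3z^2) and the point (1, 1, -2); the function name is hypothetical and the partial derivatives are worked out by hand with the quotient rule.

```python
import math

def temperature_gradient(x, y, z):
    """Gradient of T = 80 / (1 + x^2 + 2y^2 + 3z^2), derived by hand."""
    d = 1 + x**2 + 2 * y**2 + 3 * z**2
    c = -160 / d**2          # shared factor: T_x = c*x, T_y = 2*c*y, T_z = 3*c*z
    return (c * x, 2 * c * y, 3 * c * z)

grad = temperature_gradient(1.0, 1.0, -2.0)
max_rate = math.sqrt(sum(g * g for g in grad))

# Rescale so the first component is -1, to compare with the direction (-1, -2, 6).
direction = tuple(g / abs(grad[0]) for g in grad)
print(grad, direction, max_rate)
```

The gradient comes out as (5/8) times (-1, -2, 6), and its length is (5/8)√41, a little over 4.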

Exactly.

It shows the power of the gradient for optimization, finding the fastest way up, or the path of greatest change. And that perpendicularity property?

It's also how we find tangent planes to surfaces defined implicitly, like f(x, y, z) = k.

The gradient ∇f is the normal vector to that tangent plane.

It ties it all together.

Fantastic.

So we can measure change, approximate functions, find the steepest direction.

What about the ultimate goal in many applications?

Finding the highest peaks or the lowest valleys, the maximums and minimums?

Right.

That brings us to optimization in multiple dimensions.

Similar to single-variable calculus, we're looking for local maxima (hilltops) and local minima (valley bottoms).

But in higher dimensions, we also get something new.

Saddle points.

Think of the shape of a horse saddle.

It curves up in one direction and down in another.

OK, so how do we find these points, maxima, minima, saddles?

The first step is finding critical points.

These are points where the tangent plane is horizontal, meaning all the first partial derivatives are zero simultaneously.

So f_x = 0 and f_y = 0, and f_z = 0 too if you have three variables.

Just like setting f'(x) = 0 in single-variable calc.

But once we find a critical point, how do we know if it's a max, min, or one of those saddle points?

We need a second derivatives test for multiple variables.

It's a bit more involved than the single variable version.

You calculate a quantity D, sometimes called the discriminant, using the second partial derivatives: D = f_xx f_yy - (f_xy)^2, evaluated at the critical point.

Then, the sign of D and the sign of f_xx tell you what kind of critical point you have.

If D > 0 and f_xx > 0, it's a local minimum.

If D > 0 and f_xx < 0, it's a local maximum.

If D < 0, it's a saddle point.

If D = 0, the test is inconclusive.
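The four cases above translate directly into a small decision function. A minimal Python sketch; the function name `classify_critical_point` and the sample functions in the comments are hypothetical illustrations, each with a critical point at the origin.

```python
def classify_critical_point(f_xx, f_yy, f_xy):
    """Second derivatives test using D = f_xx * f_yy - f_xy**2."""
    D = f_xx * f_yy - f_xy**2
    if D > 0 and f_xx > 0:
        return "local minimum"
    if D > 0 and f_xx < 0:
        return "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

# Second partials at the origin for a few simple sample functions.
bowl = classify_critical_point(2.0, 2.0, 0.0)     # f = x^2 + y^2
dome = classify_critical_point(-2.0, -2.0, 0.0)   # f = -x^2 - y^2
saddle = classify_critical_point(2.0, -2.0, 0.0)  # f = x^2 - y^2
flat = classify_critical_point(0.0, 0.0, 0.0)     # f = x^4 + y^4
print(bowl, dome, saddle, flat)
```

The last case shows why "inconclusive" matters: x^4 + y^4 clearly has a minimum at the origin, but its second partials all vanish there, so the test cannot see it.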

OK, that sounds like a clear procedure.

Let's try that classic problem.

You have 12 square meters of cardboard, and you want to make an open top rectangular box with the maximum possible volume.

Right, a standard optimization problem.

Let the base dimensions be x and y, and the height be z.

Volume V = xyz.

The constraint is the surface area of the cardboard used.

Area of base plus four sides: xy + 2xz + 2yz = 12.

How do we solve it using the derivative tests?

The source shows one way.

Solve the constraint for z.

Substitute it into the volume formula v to get v as a function of only x and y.

Then, find the critical points by setting ∂V/∂x = 0 and ∂V/∂y = 0.

Solve that system.

Finally, use the second derivatives test D to confirm it's a maximum. The result?

The maximum volume is four cubic meters, and it happens when the dimensions are x = 2, y = 2, z = 1.

A perfect square base and height half the side length.

Neat.

This process clearly demonstrates finding extrema in a practical design scenario.
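The substitution approach can be sanity-checked with a brute-force search. This is a minimal Python sketch, not the method the source uses: it solves the constraint xy + 2xz + 2yz = 12 for z, then scans a coarse grid of base dimensions (step 0.05, an arbitrary choice) for the largest volume. The function name is hypothetical.

```python
def box_volume(x, y):
    """Volume of the open-top box after eliminating z via xy + 2xz + 2yz = 12."""
    z = (12 - x * y) / (2 * (x + y))
    return x * y * z

# Coarse grid over positive base dimensions from 0.05 to 3.4.
grid = [i / 20 for i in range(1, 69)]
candidates = [(x, y) for x in grid for y in grid if x * y < 12]

best_x, best_y = max(candidates, key=lambda p: box_volume(*p))
best_z = (12 - best_x * best_y) / (2 * (best_x + best_y))
best_volume = box_volume(best_x, best_y)
print(best_x, best_y, best_z, best_volume)
```

The grid search lands on the same point as the calculus, a 2 by 2 base with height 1, which is reassuring but no substitute for the critical-point argument.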

And one more thing on extrema, if you need the absolute max or min on a specific closed and bounded region, like a rectangle or a disk, a continuous function is guaranteed to have them.

How do you find those?

You have to check two places.

First, find the value of f at all critical points inside the region.

Second, find the maximum and minimum values of f along the boundary of the region.

The absolute maximum and minimum will be the largest and smallest values among all those points you checked.

Makes sense.

Check inside, check the edges.
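"Check inside, check the edges" can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function f(x, y) = x^2 - 2xy + 2y and the rectangle 0 ≤ x ≤ 3, 0 ≤ y ≤ 2 are hypothetical examples, the interior critical point is found by hand, and the boundary is sampled at evenly spaced points rather than analyzed exactly.

```python
def f(x, y):
    # Hypothetical example function on the rectangle 0 <= x <= 3, 0 <= y <= 2.
    return x**2 - 2 * x * y + 2 * y

# Interior critical point, by hand: f_x = 2x - 2y = 0 and f_y = -2x + 2 = 0 give (1, 1).
interior = [(1.0, 1.0)]

# Sample each of the four edges of the rectangle at n + 1 points.
n = 300
bottom = [(3 * i / n, 0.0) for i in range(n + 1)]
top = [(3 * i / n, 2.0) for i in range(n + 1)]
left = [(0.0, 2 * i / n) for i in range(n + 1)]
right = [(3.0, 2 * i / n) for i in range(n + 1)]

candidates = interior + bottom + top + left + right
values = [f(x, y) for x, y in candidates]
absolute_max = max(values)
absolute_min = min(values)
print(absolute_max, absolute_min)
```

Here the extremes both sit on the boundary (a maximum of 9 at the corner (3, 0), a minimum of 0 at (0, 0) and (2, 2)), while the interior critical point only reaches the value 1, which is exactly why the edges cannot be skipped.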

But what if the constraint is more complicated?

Or what if you can't easily solve for one variable and substitute,

like maximize profit subject to a fixed budget across multiple inputs?

That box example felt a bit too neat.

Ah, you've hit on the exact reason we need another, often more powerful, technique.

Lagrange multipliers.

They are designed for optimization with constraints, especially when substitution is hard or impossible.

The core idea is actually quite beautiful geometrically.

At the point where you achieve the maximum or minimum value of your function f, subject to staying on the constraint surface g(x, y, z) = k, the gradient vector of f must be parallel to the gradient vector of the constraint, ∇g.

They have to point in the same or exactly opposite directions.

Parallel gradients.

∇f parallel to ∇g? Why?

Think about it.

If ∇f had any component along the constraint surface, you could move a tiny bit along the surface in that direction and increase f without violating the constraint.

So at the maximum, ∇f must be pointing purely perpendicular to the constraint surface.

And we already know ∇g is always perpendicular to the constraint surface g(x, y, z) = k.

If both vectors are perpendicular to the same surface at the same point, they must be parallel to each other.

That's a really neat insight.

So ∇f is parallel to ∇g.

How does that help us find the point?

If two vectors are parallel, one must be a scalar multiple of the other.

That gives us the central Lagrange multiplier equation.

∇f = λ∇g, where λ is just some scalar, the Lagrange multiplier.

This vector equation, along with the original constraint equation g(x, y, z) = k, gives you a system of equations to solve for x, y, z, and λ.

The solutions (x, y, z) are your candidate points for extrema.

OK, let's test this elegance.

Can we redo the open top box problem?

Maximize V = xyz subject to g(x, y, z) = xy + 2xz + 2yz = 12.

Absolutely.

We set up ∇V = λ∇g. ∇V is (yz, xz, xy).

And ∇g is (y + 2z, x + 2z, 2x + 2y).

So the system becomes: one, yz = λ(y + 2z).

Two, xz = λ(x + 2z). Three, xy = λ(2x + 2y). Four, xy + 2xz + 2yz = 12, the constraint itself.

Now you solve this system of four equations for x, y, z, and λ.

Takes some algebra, but the source confirms.

You arrive at the same solution: x = 2, y = 2, z = 1.

Maximum volume is four cubic meters.

Lagrange multipliers deliver the same answer, often more systematically.
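One way to trust a Lagrange solution is to plug it back into the system. A minimal Python sketch for the box problem with V = xyz and constraint xy + 2xz + 2yz = 12; the function name is hypothetical, and the multiplier value 0.5 is worked out by hand from the first equation (2 = λ · 4) rather than quoted from the source.

```python
def lagrange_residuals(x, y, z, lam):
    """Left side minus right side for grad V = lam * grad g, plus the constraint."""
    return (
        y * z - lam * (y + 2 * z),            # V_x = lam * g_x
        x * z - lam * (x + 2 * z),            # V_y = lam * g_y
        x * y - lam * (2 * x + 2 * y),        # V_z = lam * g_z
        x * y + 2 * x * z + 2 * y * z - 12,   # constraint g = 12
    )

residuals = lagrange_residuals(2.0, 2.0, 1.0, 0.5)
volume = 2.0 * 2.0 * 1.0
print(residuals, volume)
```

All four residuals vanish at x = 2, y = 2, z = 1 with λ = 1/2, so the point satisfies every equation in the system and gives a volume of 4.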

Wow.

It handles the constraint directly without substitution.

Exactly.

And it easily extends to problems with multiple constraints, too.

You just introduce more multipliers, like ∇f = λ∇g + μ∇h.

It's a very versatile and powerful method for real world constrained optimization.

What an incredible journey through this chapter.

Seriously.

We started by just acknowledging that things depend on multiple factors.

Then we learned to visualize these dependencies using level curves.

Then partial derivatives let us measure change in specific directions.

Tangent planes gave us approximations.

The chain rule handled dynamic changes.

And the gradient showed us the path of steepest ascent.

And finally, we learned how to find the absolute peaks and valleys, both freely using critical points and the second derivatives test, and under constraints using the elegance of Lagrange multipliers.

That's a great summary.

And the key takeaway, I think, is empowerment.

These tools let you model, analyze, and truly optimize systems that are far more complex than simple one variable functions, whether it's climate modeling, engineering design, economic forecasting, biological processes.

The principles we cover are fundamental for understanding and improving things in our multifaceted world.

It really opens up a new way of seeing how interconnected things are.

So the question for you, our listener, is, now that you've navigated these higher dimensions with us,

what complex system in your world are you ready to maybe deep dive into and optimize?

Thank you so much for joining us on this deep dive from the entire Last Minute Lecture team.

We really appreciate you tuning in.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers
Extending calculus from single variables to multiple dimensions requires developing new tools for analyzing how functions change along different directions and axes. Functions of two or more variables form the foundation of multivariable calculus, with their domains and level curves providing geometric intuition for understanding behavior and relationships. Partial derivatives measure how a function changes with respect to one variable while keeping all others fixed, and computing them involves applying familiar differentiation rules to each variable independently. Higher-order partial derivatives arise by differentiating again, and mixed partial derivatives—obtained by differentiating with respect to different variables—are typically equal under continuity conditions, a result formalized by Clairaut's Theorem. Tangent planes generalize the concept of tangent lines to surfaces, enabling linear approximations that accurately describe function behavior in small neighborhoods around a point. The multivariable chain rule governs how composite functions are differentiated when intermediate variables themselves depend on multiple parameters, requiring careful tracking of dependencies. Directional derivatives quantify the rate of change in any chosen direction, while the gradient vector encodes the direction of steepest ascent and reveals how functions increase most rapidly through space. These tools prove essential for optimization problems where finding maximum and minimum values requires identifying critical points and analyzing their nature using the second partial derivative test. When optimization must satisfy constraints, Lagrange multipliers provide a systematic method for locating extrema on surfaces defined by equation restrictions. Implicit differentiation extends to multivariable contexts where functions are defined indirectly through equations relating multiple variables. 
Understanding saddle points—locations where function behavior differs across different directions—completes the picture of how multivariable functions behave locally. The concepts developed here form the foundation for studying vector fields, line integrals, and surface integrals in subsequent vector calculus courses.
