The Chain Rule (in Several Real Variables)

This page is a sub-page of our page on Calculus of Several Real Variables.

///////

Other relevant sources of information:

• …

///////

The interactive simulations on this page can be navigated with the Free Viewer
of the Graphing Calculator.

///////

THE CHAIN RULE

Armed with the concepts of differentiation and affine approximation, one is well prepared to engage with the chain rule in its full, multidimensional glory. The chain rule is one of the most dazzling jewels of calculus, and it reveals a fundamental connection between calculus and linear algebra: the derivative of a composition of functions is the product of their derivatives, viewed as linear maps.

/////// Quoting Wikipedia/Chain rule:

The simplest generalization of the chain rule to higher dimensions uses the total derivative, which is a linear transformation that captures the function’s rates of change along each one of its coordinate axes.

/////// End of quote from Wikipedia/Chain rule

The chain rule expressed through a combination of affine approximations

Let $\, X, Y \,$ and $\, Z \,$ be linear spaces and let $\, g : Z \leftarrow Y \,$ and $\, f : Y \leftarrow X \,$ be two differentiable functions. Select a point $\, \textcolor{blue}{p} \in X \,$ as well as its successive image points $\, f(\textcolor{blue}{p}) \in Y \,$ and $\, g(f(\textcolor{blue}{p})) \in Z$. We then have

$\, Z_{g(f(\textcolor{blue}{p}))} \, \xleftarrow{\, g} \, Y_{f(\textcolor{blue}{p})} \, \xleftarrow{\, f} \, X_{\textcolor{blue}{p}} \,$

where $\, X_{\textcolor{blue}{p}} \, , \, Y_{f(\textcolor{blue}{p})} \, , \, Z_{g(f(\textcolor{blue}{p}))} \,$ are the maximal affine subspaces of $\, X \, , Y \, , \, Z \,$
that correspond to the chosen points $\, \textcolor{blue}{p}, f(\textcolor{blue}{p}),$ and $\, g(f(\textcolor{blue}{p}))$.

We can think of these maximal subspaces as the affine spaces that arise when we choose a fixed point in the corresponding linear spaces. This choice brings with it a linear space of displacement vectors that act by displacement (or translation) on the vectors of the original linear space.

The corresponding affine approximations

$\, Y_{f(\textcolor{blue}{p})} \ni f(\textcolor{blue}{p}) + {f'(x)}_{\textcolor{blue}{p}} \, \textcolor{red}{\Delta x} \, \xleftarrow{\, f_{\textcolor{blue}{p}}} \, \textcolor{blue}{p} + \textcolor{red}{\Delta x} \in X_{\textcolor{blue}{p}} \,$

and

$\, Z_{g(f(\textcolor{blue}{p}))} \ni g(f(\textcolor{blue}{p})) + {g'(f)}_{f(\textcolor{blue}{p})} \, \textcolor{red}{\Delta f} \, \xleftarrow{\, g_{f(\textcolor{blue}{p})}} \, f(\textcolor{blue}{p}) + \textcolor{red}{\Delta f} \in Y_{f(\textcolor{blue}{p})} \,$

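combine, after substituting $\, \textcolor{red}{\Delta f} = {f'(x)}_{\textcolor{blue}{p}} \, \textcolor{red}{\Delta x} \,$ into the second one, into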

$\, Z_{g(f(\textcolor{blue}{p}))} \ni (g \circ f)(\textcolor{blue}{p}) + {(g \circ f)'(x)}_{\textcolor{blue}{p}} \, \textcolor{red}{\Delta x} \, \xleftarrow{\, {(g \circ f)}_{\textcolor{blue}{p}}} \, \textcolor{blue}{p} + \textcolor{red}{\Delta x} \in X_{\textcolor{blue}{p}} \,$,

and their linear parts compose into a product of linear maps (a matrix product once bases have been chosen):

$\, {(g \circ f)'(x)}_{\textcolor{blue}{p}} = {g'(f)}_{f(\textcolor{blue}{p})} \, {f'(x)}_{\textcolor{blue}{p}}$.

This relationship is called the chain rule.
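
The relationship above can be checked numerically. The following is a minimal sketch, not part of the original page: the maps `f` and `g`, the point `p`, and the helper names `jacobian` and `matmul` are all hypothetical choices made for illustration, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

def jacobian(func, point, h=1e-6):
    """Approximate the derivative matrix of func at point by central differences."""
    n_out = len(func(point))
    cols = []
    for j in range(len(point)):
        fwd = list(point)
        bwd = list(point)
        fwd[j] += h
        bwd[j] -= h
        # Column j holds the partial derivatives with respect to coordinate j.
        cols.append([(a - b) / (2 * h) for a, b in zip(func(fwd), func(bwd))])
    # Transpose the list of columns into a list of rows.
    return [[cols[j][i] for j in range(len(point))] for i in range(n_out)]

def matmul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Hypothetical example maps (assumptions, chosen only for illustration).
def f(x):  # f : Y <- X with X = R^2, Y = R^3
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x1 + x2 ** 2]

def g(y):  # g : Z <- Y with Z = R^2
    y1, y2, y3 = y
    return [y1 + y2 * y3, math.exp(y1) - y3]

p = [0.5, -1.2]

# Left-hand side: derivative of the composition at p.
lhs = jacobian(lambda x: g(f(x)), p)
# Right-hand side: derivative of g at f(p) times derivative of f at p.
rhs = matmul(jacobian(g, f(p)), jacobian(f, p))

max_diff = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
print("largest entrywise difference:", max_diff)
```

Both sides are 2-by-2 matrices here, and the printed difference should be tiny, limited only by the accuracy of the finite-difference approximation.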

///////

The chain rule for functions from 1D to 1D

The Chain Rule for functions from $\, \mathbb{R}^1 \,$ to $\, \mathbb{R}^1$:

$\, \begin{matrix} \mathbb{R}^1 & \xleftarrow{\qquad f \qquad} & \mathbb{R}^1 \\ \uparrow & & \uparrow \\ f(x) & \xleftarrow{\qquad \qquad}\shortmid & x \\ & & & & & \\ {\mathbb{R}^1}_{f(\textcolor{blue} {p})} & \xleftarrow{\;\;\;\; {f'(x)}_{\textcolor{blue}{p}} \;\;\;\; } & {\mathbb{R}^1}_{\textcolor{blue}{p}} \\ \uparrow & & \uparrow \\ df = {f'(x)}_{\textcolor{blue}{p}} dx & \xleftarrow{\qquad \qquad }\shortmid & dx \end{matrix} \,$.

///////

$\, {\mathbb{R}^1}_{f(\textcolor{blue} {p})} \, \ni df = f'(x)_{\textcolor{blue} {p}} \, dx \,$

/////// Expand the diagram to the right through the pullback substitution $\, x = x(u) \,$:

$\, \begin{matrix} \mathbb{R}^1 & \xleftarrow{\qquad \qquad f \qquad \qquad} & \mathbb{R}^1 & \xleftarrow{\qquad \;\;\;\; x \;\;\;\; \qquad} & \mathbb{R}^1 \\ \uparrow & & \uparrow & & \uparrow \\ f(x(u)) & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & x(u) & \xleftarrow{\qquad \qquad \qquad}\shortmid & u \\ & & & & & \\ {\mathbb{R}^1}_ {f(x(\textcolor{blue} {p}))} & \xleftarrow{{\qquad f'(x)}_{x(\textcolor{blue} {p})} \qquad} & {\mathbb{R}^1}_{x(\textcolor{blue} {p})} & \xleftarrow{\qquad x'(u)_{\textcolor{blue}{p}} \qquad} & {\mathbb{R}^1}_{\textcolor{blue}{p}} \\ \uparrow & & \uparrow & & \uparrow \\ df = {f'(x)}_{x(\textcolor{blue} {p})} \, dx & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & dx = x'(u)_{\textcolor{blue}{p}} \, du & \xleftarrow{\qquad \qquad \qquad}\shortmid & du \\ & & & & & \\ {\mathbb{R}^1}_{f(x(\textcolor{blue} {p}))} & & \xleftarrow{\qquad \qquad \qquad (f \circ x)'(u)_{\textcolor{blue}{p}} \qquad \qquad \qquad} & & {\mathbb{R}^1}_{\textcolor{blue}{p}} \\ \uparrow & & & & \uparrow \\ d(f \circ x) = (f \circ x)'(u)_{\textcolor{blue}{p}} \, du = {f'(x)}_{x(\textcolor{blue} {p})} \, x'(u)_{\textcolor{blue}{p}} \, du & & \xleftarrow{\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad}\shortmid & & du \end{matrix} \,$

/////// The chain rule for pullback:

$\, (f \circ x)'(u)_{\textcolor{blue}{p}} = {f'(x)}_{x(\textcolor{blue} {p})} \, x'(u)_{\textcolor{blue} {p}}$.
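
As a small worked example of this formula, take $\, f(x) = \sin x \,$ and $\, x(u) = u^2$. These two functions are chosen here purely for illustration and do not come from the text above. With these choices we have

$\, {f'(x)}_{x(\textcolor{blue}{p})} = \cos(\textcolor{blue}{p}^2) \,$ and $\, x'(u)_{\textcolor{blue}{p}} = 2 \textcolor{blue}{p} \,$,

so that

$\, (f \circ x)'(u)_{\textcolor{blue}{p}} = \cos(\textcolor{blue}{p}^2) \cdot 2 \textcolor{blue}{p} \,$ and $\, d(f \circ x) = 2 \textcolor{blue}{p} \, \cos(\textcolor{blue}{p}^2) \, du \,$,

which agrees with differentiating $\, \sin(u^2) \,$ directly with respect to $\, u \,$ at $\, u = \textcolor{blue}{p}$.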

///////

The chain rule for functions from 2D to 1D

The Chain Rule for functions from $\, \mathbb{R}^2 \,$ to $\, \mathbb{R}^1$:

///// First discuss gradients, i.e., the linear part of an affine approximation of functions from $\, \mathbb{R}^n \,$ to $\, \mathbb{R}^1$.

/////// DESCRIBE THIS DIAGRAM (which will be expanded to the left and to the right below)

$\, \begin{matrix} \mathbb{R}^1 & \xleftarrow{\qquad f \qquad} & \mathbb{R}^2 \\ \uparrow & & \uparrow \\ f(x) & \xleftarrow{\qquad \qquad}\shortmid & x \\ & & & & & \\ {\mathbb{R}^1}_{f(\textcolor{blue} {p})} & \xleftarrow{\;\;\;\; {f'(x)}_{\textcolor{blue}{p}} \;\;\;\; } & {\mathbb{R}^2}_{\textcolor{blue}{p}} \\ \uparrow & & \uparrow \\ df = {f'(x)}_{\textcolor{blue}{p}} \, dx & \xleftarrow{\qquad \qquad }\shortmid & dx \end{matrix} \,$.

///////

$\, {\mathbb{R}^1}_{f(\textcolor{blue} {p})} \, \ni df = f'(x)_{\textcolor{blue} {p}} \, dx = \begin{pmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} \end{pmatrix}_{\textcolor{blue} {p}} \, {\begin{pmatrix} dx_1 \\ dx_2 \end{pmatrix}} = {\frac{\partial f}{\partial x_1}}_{\textcolor{blue} {p}} dx_1 + {\frac{\partial f}{\partial x_2}}_{\textcolor{blue} {p}} dx_2 \,$

/////// Expand the diagram to the right through the pullback substitution $\, x = x(u) \,$:

$\, \begin{matrix} \mathbb{R}^1 & \xleftarrow{\qquad \qquad f \qquad \qquad} & \mathbb{R}^2 & \xleftarrow{\qquad \;\;\;\; x \;\;\;\; \qquad} & \mathbb{R}^2 \\ \uparrow & & \uparrow & & \uparrow \\ f(x(u)) & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & x(u) & \xleftarrow{\qquad \qquad \qquad}\shortmid & u \\ & & & & & \\ {\mathbb{R}^1}_ {f(x(\textcolor{blue} {p}))} & \xleftarrow{{\qquad f'(x)}_{x(\textcolor{blue} {p})} \qquad} & {\mathbb{R}^2}_{x(\textcolor{blue} {p})} & \xleftarrow{\qquad x'(u)_{\textcolor{blue}{p}} \qquad} & {\mathbb{R}^2}_{\textcolor{blue}{p}} \\ \uparrow & & \uparrow & & \uparrow \\ df = {f'(x)}_{x(\textcolor{blue} {p})} \, dx & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & dx = x'(u)_{\textcolor{blue}{p}} \, du & \xleftarrow{\qquad \qquad \qquad}\shortmid & du \\ & & & & & \\ {\mathbb{R}^1}_{f(x(\textcolor{blue} {p}))} & & \xleftarrow{\qquad \qquad \qquad (f \circ x)'(u)_{\textcolor{blue}{p}} \qquad \qquad \qquad} & & {\mathbb{R}^2}_{\textcolor{blue}{p}} \\ \uparrow & & & & \uparrow \\ d(f \circ x) = (f \circ x)'(u)_{\textcolor{blue}{p}} \, du = {f'(x)}_{x(\textcolor{blue} {p})} \, x'(u)_{\textcolor{blue}{p}} \, du & & \xleftarrow{\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad}\shortmid & & du \end{matrix} \,$

/////// The chain rule:

$\, (f \circ x)'(u)_{\textcolor{blue}{p}} = {f'(x)}_{x(\textcolor{blue} {p})} \, x'(u)_{\textcolor{blue}{p}} \,$

///////

$\, {\mathbb{R}^1}_ {f(x(\textcolor{blue} {p}))} \, \ni \, d(f \circ x) = (f \circ x)'(u)_{\textcolor{blue}{p}} \, du = f'(x)_{x(\textcolor{blue} {p})} \, x'(u)_{\textcolor{blue} {p}} \, du = \begin{pmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} \end{pmatrix}_{x(\textcolor{blue} {p})} \begin{pmatrix} \frac{\partial x_1}{\partial u_1} & \frac{\partial x_1}{\partial u_2} \\ \frac{\partial x_2}{\partial u_1} & \frac{\partial x_2}{\partial u_2} \end{pmatrix}_{\textcolor{blue} {p}} {\begin{pmatrix} du_1 \\ du_2 \end{pmatrix}} =$

$\, = \begin{pmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} \end{pmatrix}_{x(\textcolor{blue} {p})} \begin{pmatrix} \frac{\partial x_1}{\partial u_1}_{\textcolor{blue} {p}} du_1 + \frac{\partial x_1}{\partial u_2}_{\textcolor{blue} {p}} du_2 \\ \frac{\partial x_2}{\partial u_1}_{\textcolor{blue} {p}} du_1 + \frac{\partial x_2}{\partial u_2}_{\textcolor{blue} {p}} du_2 \end{pmatrix} = \,$

$\, = {\frac{\partial f}{\partial x_1}}_{x(\textcolor{blue} {p})} ( \frac{\partial x_1}{\partial u_1}_{\textcolor{blue} {p}} du_1 + \frac{\partial x_1}{\partial u_2}_{\textcolor{blue} {p}} du_2 ) + {\frac{\partial f}{\partial x_2}}_{x(\textcolor{blue} {p})} ( \frac{\partial x_2}{\partial u_1}_{\textcolor{blue} {p}} du_1 + \frac{\partial x_2}{\partial u_2}_{\textcolor{blue} {p}} du_2 ) = \,$

$\, = ( {\frac{\partial f}{\partial x_1}}_{x(\textcolor{blue} {p})} \frac{\partial x_1}{\partial u_1}_{\textcolor{blue} {p}} + {\frac{\partial f}{\partial x_2}}_{x(\textcolor{blue} {p})} \frac{\partial x_2}{\partial u_1}_{\textcolor{blue} {p}} ) du_1 + ( {\frac{\partial f}{\partial x_1}}_{x(\textcolor{blue} {p})} \frac{\partial x_1}{\partial u_2}_{\textcolor{blue} {p}} + {\frac{\partial f}{\partial x_2}}_{x(\textcolor{blue} {p})} \frac{\partial x_2}{\partial u_2}_{\textcolor{blue} {p}} ) du_2 \,$
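
The expansion above can be illustrated with a concrete substitution. The sketch below is not from the original page: it assumes the hypothetical choices $\, f(x_1, x_2) = x_1^2 + x_2^2 \,$ and the polar substitution $\, x_1 = u_1 \cos u_2, \; x_2 = u_1 \sin u_2$, for which $\, (f \circ x)(u) = u_1^2$, so the coefficients of $\, du_1 \,$ and $\, du_2 \,$ should come out as $\, 2 u_1 \,$ and $\, 0$.

```python
import math

def grad_f(x):
    """Row matrix (df/dx1, df/dx2) for the assumed f(x1, x2) = x1**2 + x2**2."""
    x1, x2 = x
    return [2 * x1, 2 * x2]

def x_of_u(u):
    """Assumed polar substitution: x1 = u1*cos(u2), x2 = u1*sin(u2)."""
    u1, u2 = u
    return [u1 * math.cos(u2), u1 * math.sin(u2)]

def jac_x(u):
    """Matrix of partial derivatives dx_i/du_k for the polar substitution."""
    u1, u2 = u
    return [[math.cos(u2), -u1 * math.sin(u2)],
            [math.sin(u2),  u1 * math.cos(u2)]]

p = [2.0, 0.7]

row = grad_f(x_of_u(p))   # partial derivatives of f, evaluated at x(p)
J = jac_x(p)              # partial derivatives of x, evaluated at p

# Coefficient of du_k: sum over i of (df/dx_i at x(p)) * (dx_i/du_k at p).
coeffs = [sum(row[i] * J[i][k] for i in range(2)) for k in range(2)]

# Direct differentiation of (f o x)(u) = u1**2 gives (2*u1, 0).
expected = [2 * p[0], 0.0]
print("chain rule:", coeffs)
print("direct:    ", expected)
```

The two printed rows should agree up to floating-point rounding, which is exactly the statement that the row of partial derivatives of $\, f \,$ times the matrix of partial derivatives of $\, x \,$ gives the coefficients of $\, du_1 \,$ and $\, du_2$.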

///////

The chain rule for functions from 2D to 2D

The Chain Rule for functions from $\, \mathbb{R}^2 \,$ to $\, \mathbb{R}^2$:

$\, \begin{matrix} \mathbb{R}^2 & \xleftarrow{\qquad f \qquad} & \mathbb{R}^2 \\ \uparrow & & \uparrow \\ f(x) & \xleftarrow{\qquad \qquad}\shortmid & x \\ & & & & & \\ {\mathbb{R}^2}_{f(\textcolor{blue} {p})} & \xleftarrow{\;\;\;\; {f'(x)}_{\textcolor{blue}{p}} \;\;\;\; } & {\mathbb{R}^2}_{\textcolor{blue}{p}} \\ \uparrow & & \uparrow \\ df = {f'(x)}_{\textcolor{blue}{p}} \, dx & \xleftarrow{\qquad \qquad }\shortmid & dx \end{matrix} \,$.

/////// Expand the diagram to the left by operating with a function $\, g \,$:

$\, \begin{matrix} \mathbb{R}^2 & \xleftarrow{\qquad \qquad g \qquad \qquad} & \mathbb{R}^2 & \xleftarrow{\qquad \;\;\;\; f \;\;\;\; \qquad} & \mathbb{R}^2 \\ \uparrow & & \uparrow & & \uparrow \\ g(f(x)) & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & f(x) & \xleftarrow{\qquad \qquad \qquad}\shortmid & x \\ & & & & & \\ {\mathbb{R}^2}_{g(f(\textcolor{blue} {p}))} & \xleftarrow{{\qquad g'(f)}_{f(\textcolor{blue} {p})} \qquad} & {\mathbb{R}^2}_{f(\textcolor{blue} {p})} & \xleftarrow{\qquad f'(x)_{\textcolor{blue}{p}} \qquad} & {\mathbb{R}^2}_{\textcolor{blue}{p}} \\ \uparrow & & \uparrow & & \uparrow \\ dg = {g'(f)}_{f(\textcolor{blue} {p})} \, df & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & df = f'(x)_{\textcolor{blue}{p}} \, dx & \xleftarrow{\qquad \qquad \qquad}\shortmid & dx \\ & & & & & \\ {\mathbb{R}^2}_{g(f(\textcolor{blue} {p}))} & & \xleftarrow{\qquad \qquad \qquad (g \circ f)'(x)_{\textcolor{blue}{p}} \qquad \qquad \qquad} & & {\mathbb{R}^2}_{\textcolor{blue}{p}} \\ \uparrow & & & & \uparrow \\ d(g \circ f) = (g \circ f)'(x)_{\textcolor{blue}{p}} \, dx = {g'(f)}_{f(\textcolor{blue} {p})} \, f'(x)_{\textcolor{blue}{p}} \, dx & & \xleftarrow{\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad}\shortmid & & dx \end{matrix} \,$

/////// The chain rule:

$\, {(g \circ f)'(x)}_{\textcolor{blue}{p}} = {g'(f)}_{f(\textcolor{blue}{p})} \, {f'(x)}_{\textcolor{blue}{p}}$.

///////

${\mathbb{R}^2}_{g(f(\textcolor{blue} {p}))} \, \ni \, d(g \circ f) = (g \circ f)'(x)_{\textcolor{blue}{p}} \, dx = {g'(f)}_{f(\textcolor{blue} {p})} \, f'(x)_{\textcolor{blue} {p}} \, dx = \begin{pmatrix} \frac{\partial g_1}{\partial f_1} & \frac{\partial g_1}{\partial f_2} \\ \frac{\partial g_2}{\partial f_1} & \frac{\partial g_2}{\partial f_2} \end{pmatrix}_{f(\textcolor{blue} {p})} \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} \end{pmatrix}_{\textcolor{blue} {p}} \begin{pmatrix} dx_1 \\ dx_2 \end{pmatrix}$

///////

$\, \begin{matrix} \mathbb{R}^2 & \xleftarrow{\qquad f \qquad} & \mathbb{R}^2 \\ \uparrow & & \uparrow \\ f(x) & \xleftarrow{\qquad \qquad}\shortmid & x \\ & & & & & \\ {\mathbb{R}^2}_{f(\textcolor{blue} {p})} & \xleftarrow{\;\;\;\; {f'(x)}_{\textcolor{blue}{p}} \;\;\;\; } & {\mathbb{R}^2}_{\textcolor{blue}{p}} \\ \uparrow & & \uparrow \\ df = {f'(x)}_{\textcolor{blue}{p}} \, dx & \xleftarrow{\qquad \qquad }\shortmid & dx \end{matrix} \,$.

///////

Expand the diagram to the right through the pullback substitution $\, x = x(u) \,$:

$\, \begin{matrix} \mathbb{R}^2 & \xleftarrow{\qquad \qquad f \qquad \qquad} & \mathbb{R}^2 & \xleftarrow{\qquad \;\;\;\; x \;\;\;\; \qquad} & \mathbb{R}^2 \\ \uparrow & & \uparrow & & \uparrow \\ f(x(u)) & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & x(u) & \xleftarrow{\qquad \qquad \qquad}\shortmid & u \\ & & & & & \\ {\mathbb{R}^2}_ {f(x(\textcolor{blue} {p}))} & \xleftarrow{{\qquad f'(x)}_{x(\textcolor{blue} {p})} \qquad} & {\mathbb{R}^2}_{x(\textcolor{blue} {p})} & \xleftarrow{\qquad x'(u)_{\textcolor{blue}{p}} \qquad} & {\mathbb{R}^2}_{\textcolor{blue}{p}} \\ \uparrow & & \uparrow & & \uparrow \\ df = {f'(x)}_{x(\textcolor{blue} {p})} \, dx & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & dx = x'(u)_{\textcolor{blue}{p}} \, du & \xleftarrow{\qquad \qquad \qquad}\shortmid & du \\ & & & & & \\ {\mathbb{R}^2}_{f(x(\textcolor{blue} {p}))} & & \xleftarrow{\qquad \qquad \qquad (f \circ x)'(u)_{\textcolor{blue}{p}} \qquad \qquad \qquad} & & {\mathbb{R}^2}_{\textcolor{blue}{p}} \\ \uparrow & & & & \uparrow \\ d(f \circ x) = (f \circ x)'(u)_{\textcolor{blue}{p}} \, du = {f'(x)}_{x(\textcolor{blue} {p})} \, x'(u)_{\textcolor{blue}{p}} \, du & & \xleftarrow{\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad}\shortmid & & du \end{matrix} \,$

/////// The chain rule:

$\, (f \circ x)'(u)_{\textcolor{blue}{p}} = {f'(x)}_{x(\textcolor{blue} {p})} \, x'(u)_{\textcolor{blue}{p}} \,$

///////

$\, {\mathbb{R}^2}_{f(x(\textcolor{blue} {p}))} \, \ni \, d(f \circ x) = (f \circ x)'(u)_{\textcolor{blue}{p}} \, du = f'(x)_{x(\textcolor{blue} {p})} \, x'(u)_{\textcolor{blue} {p}} \, du = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} \end{pmatrix}_{x(\textcolor{blue} {p})} \begin{pmatrix} \frac{\partial x_1}{\partial u_1} & \frac{\partial x_1}{\partial u_2} \\ \frac{\partial x_2}{\partial u_1} & \frac{\partial x_2}{\partial u_2} \end{pmatrix}_{\textcolor{blue} {p}} \begin{pmatrix} du_1 \\ du_2 \end{pmatrix}$

/////// /////// ///////

Going back and forth between the macro-cosmos (where $\, \Delta x \,$ lives) and the micro-cosmos (where $\, dx \,$ lives):

///////

The chain rule in arbitrary finite dimensions:

$\, \begin{matrix} \mathbb{R}^s & \xleftarrow{\qquad \qquad f \qquad \qquad} & \mathbb{R}^n & \xleftarrow{\qquad \;\;\;\; x \;\;\;\; \qquad} & \mathbb{R}^m \\ \uparrow & & \uparrow & & \uparrow \\ f(x(u)) & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & x(u) & \xleftarrow{\qquad \qquad \qquad}\shortmid & u \\ & & & & & \\ {\mathbb{R}^s}_ {f(x(\textcolor{blue} {p}))} & \xleftarrow{{\qquad f'(x)}_{x(\textcolor{blue} {p})} \qquad} & {\mathbb{R}^n}_{x(\textcolor{blue} {p})} & \xleftarrow{\qquad x'(u)_{\textcolor{blue}{p}} \qquad} & {\mathbb{R}^m}_{\textcolor{blue}{p}} \\ \uparrow & & \uparrow & & \uparrow \\ df = {f'(x)}_{x(\textcolor{blue} {p})} \, dx & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & dx = x'(u)_{\textcolor{blue}{p}} \, du & \xleftarrow{\qquad \qquad \qquad}\shortmid & du \\ & & & & & \\ {\mathbb{R}^s}_{f(x(\textcolor{blue} {p}))} & & \xleftarrow{\qquad \qquad \qquad (f \circ x)'(u)_{\textcolor{blue}{p}} \qquad \qquad \qquad} & & {\mathbb{R}^m}_{\textcolor{blue}{p}} \\ \uparrow & & & & \uparrow \\ d(f \circ x) = (f \circ x)'(u)_{\textcolor{blue}{p}} \, du = {f'(x)}_{x(\textcolor{blue} {p})} \, x'(u)_{\textcolor{blue}{p}} \, du & & \xleftarrow{\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad}\shortmid & & du \end{matrix} \,$
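
Written out in components, the product in the bottom row of this diagram can be sketched as follows (the index names $\, i, j, k \,$ are introduced here only for illustration). The matrix $\, {f'(x)}_{x(\textcolor{blue}{p})} \,$ has $\, s \,$ rows and $\, n \,$ columns, and the matrix $\, x'(u)_{\textcolor{blue}{p}} \,$ has $\, n \,$ rows and $\, m \,$ columns, so their product $\, (f \circ x)'(u)_{\textcolor{blue}{p}} \,$ has $\, s \,$ rows and $\, m \,$ columns, with entries

$\, {\frac{\partial (f \circ x)_i}{\partial u_k}}_{\textcolor{blue}{p}} = \sum\limits_{j=1}^{n} {\frac{\partial f_i}{\partial x_j}}_{x(\textcolor{blue}{p})} \, {\frac{\partial x_j}{\partial u_k}}_{\textcolor{blue}{p}} \,$ for $\, i = 1, \dots, s \,$ and $\, k = 1, \dots, m$.

The 2D cases above are the special cases $\, m = n = 2 \,$ with $\, s = 1 \,$ or $\, s = 2$.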

///////

The $\, macro \, \longrightarrow \, micro \, \longrightarrow \, macro \,$ perspective induced by differentiation and affine approximation:

$\, \begin{matrix} macro & \xleftarrow{\qquad \;\;\;\; f \;\;\;\; \qquad} & macro \\ \uparrow & & \uparrow \\ f(x) & \xleftarrow{\qquad \qquad \qquad}\shortmid & x \\ & & & & & \\ {micro}_{f(\textcolor{blue} {p})} & \xleftarrow{\qquad f'(x)_{\textcolor{blue}{p}} \qquad} & {micro}_{\, \textcolor{blue}{p}} \\ \uparrow & & \uparrow \\ \textcolor{red}{df} = f'(x)_{\textcolor{blue}{p}} \, \textcolor{red}{dx} & \xleftarrow{\qquad \qquad \qquad}\shortmid & \textcolor{red}{dx} \\ & & & & & \\ {macro}_{f(\textcolor{blue} {p})} & \xleftarrow{\qquad f'(x)_{\textcolor{blue}{p}} \qquad} & {macro}_{\, \textcolor{blue}{p}} \\ \uparrow & & \uparrow \\ \textcolor{red}{\Delta f} = f'(x)_{\textcolor{blue}{p}} \, \textcolor{red}{\Delta x} & \xleftarrow{\qquad \qquad \qquad}\shortmid & \textcolor{red}{\Delta x} \end{matrix} \,$

///////

The affine approximation theorem for differentiable functions:

Definition: Let $\, \overrightarrow {O,P} \,$ be the translation taking $\, (0, 0) \,$ to $\, (\textcolor{blue}{p}, f(\textcolor{blue}{p}))$, where $\, \textcolor{blue}{p} \in \mathbb{R}^n \,$ and $\, f(\textcolor{blue}{p}) \in \mathbb{R}^m$. The affine approximation of $\, f \,$ at the point $\, \textcolor{blue}{p}$
is the function $\, f_{\textcolor{blue}{p}} : \mathbb{R}^n \rightarrow \mathbb{R}^m \,$ defined by $\, f_{\textcolor{blue}{p}}(x) = f(\textcolor{blue}{p}) + {f'(x)}_{\textcolor{blue}{p}} \, (x - \textcolor{blue}{p}) \,$.

The function $\, f_{\textcolor{blue}{p}} \,$ deserves the uniquely identifying name
the affine approximation of $\, f$ at $\, \textcolor{blue}{p} \,$
because of the following:

Theorem: If the function $\, f \,$ is differentiable at the point $\, \textcolor{blue}{p}$, then we have
$\, f(x) = f_{\textcolor{blue}{p}}(x) + o(x - \textcolor{blue}{p}) \,$ when $\, x \rightarrow \textcolor{blue}{p}$.
Moreover, $\, f_{\textcolor{blue}{p}} \,$ is the only affine map that behaves in this way,
i.e., that agrees with $\, f \,$ to first order at the point $\, \textcolor{blue}{p}$.

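NOTE: The notation $\, o( \; ) \,$ refers to the so-called little-o notation.
It says that $\, o(w_{hatever}) \,$ shrinks faster than $\, w_{hatever} \,$ when $\, w_{hatever} \,$ shrinks to zero, i.e., that the quotient $\, \| o(w_{hatever}) \| \, / \, \| w_{hatever} \| \,$ tends to zero.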
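
The little-o statement in the theorem can be made tangible with a short numerical experiment. This is a minimal sketch under stated assumptions: the function `f`, its hand-computed derivative matrix `f_prime`, the point `p`, and the approach direction are hypothetical choices, not taken from the page. The code builds the affine approximation $\, f_{\textcolor{blue}{p}} \,$ from the definition above and prints the ratio of the approximation error to the distance from $\, \textcolor{blue}{p}$.

```python
import math

def f(x):
    """Assumed example map from R^2 to R^2."""
    x1, x2 = x
    return [x1 * math.exp(x2), x1 ** 2 + math.sin(x2)]

def f_prime(x):
    """Derivative matrix of the assumed f, computed by hand."""
    x1, x2 = x
    return [[math.exp(x2), x1 * math.exp(x2)],
            [2 * x1,       math.cos(x2)]]

def affine_approx(p):
    """Return f_p with f_p(x) = f(p) + f'(p) (x - p), as in the definition."""
    fp, A = f(p), f_prime(p)
    def f_p(x):
        d = [xi - pi for xi, pi in zip(x, p)]
        return [fp[i] + sum(A[i][j] * d[j] for j in range(len(d)))
                for i in range(len(fp))]
    return f_p

def norm(v):
    return math.sqrt(sum(c * c for c in v))

p = [1.0, 0.5]
f_p = affine_approx(p)
direction = [0.6, -0.8]  # a unit vector along which x approaches p

ratios = []
for k in range(1, 6):
    t = 10.0 ** (-k)
    x = [pi + t * di for pi, di in zip(p, direction)]
    error = norm([a - b for a, b in zip(f(x), f_p(x))])
    ratios.append(error / norm([xi - pi for xi, pi in zip(x, p)]))
    print(f"|x - p| = {t:.0e}   error / |x - p| = {ratios[-1]:.3e}")
```

The printed ratios should shrink as $\, x \,$ approaches $\, \textcolor{blue}{p}$, which is what it means for the error to be $\, o(x - \textcolor{blue}{p})$. For a smooth function such as this one the ratio shrinks roughly in proportion to the distance itself.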

///////

The chain rule from a macro and micro perspective:

$\, \begin{matrix} macro & \xleftarrow{\qquad \qquad g \qquad \qquad} & macro & \xleftarrow{\qquad \;\;\;\; f \;\;\;\; \qquad} & macro \\ \uparrow & & \uparrow & & \uparrow \\ g(f(x)) & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & f(x) & \xleftarrow{\qquad \qquad \qquad}\shortmid & x \\ & & & & & \\ {micro}_{\, g(f(\textcolor{blue} {p}))} & \xleftarrow{{\qquad g'(f)}_{f(\textcolor{blue} {p})} \qquad} & {micro}_{f(\textcolor{blue} {p})} & \xleftarrow{\qquad f'(x)_{\textcolor{blue}{p}} \qquad} & {micro}_{\, \textcolor{blue}{p}} \\ \uparrow & & \uparrow & & \uparrow \\ dg = {g'(f)}_{f(\textcolor{blue} {p})} \, df & \xleftarrow{\qquad \qquad \qquad \qquad}\shortmid & df = f'(x)_{\textcolor{blue}{p}} \, dx & \xleftarrow{\qquad \qquad \qquad}\shortmid & dx \\ & & & & & \\ {micro}_{\, g(f(\textcolor{blue} {p}))} & & \xleftarrow{\qquad \qquad \qquad (g \circ f)'(x)_{\textcolor{blue}{p}} \qquad \qquad \qquad} & & {micro}_{\,\textcolor{blue}{p}} \\ \uparrow & & & & \uparrow \\ d(g \circ f) = (g \circ f)'(x)_{\textcolor{blue}{p}} \, dx = {g'(f)}_{f(\textcolor{blue}{p})} f'(x)_{\textcolor{blue}{p}} \, dx & & \xleftarrow{\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad}\shortmid & & dx \end{matrix} \,$

///////

Important: In “ordinary calculus” the micro-space can only be approached via limits, since the infinitesimals $\, dx \,$ and $\, df \,$ do not exist as real numbers. However, in the early 1960s Abraham Robinson developed non-standard analysis, which includes the hyperreal numbers, where $\, dx \,$ and $\, df \,$ exist as entities in themselves.

///////
