# Substitution in integrals

This is a sub-page of our page on Integration (of functions of one real variable).

///////

The interactive simulations on this page can be navigated with the Free Viewer
of the Graphing Calculator.

///////

Affine approximation in $\, {\mathbf{R}}^1$ :

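The affine approximation of a function $\, f : \mathbf{R}^1 \longrightarrow \mathbf{R}^1 \,$
that is differentiable at the point $\, p \in \mathbf{R}^1 \,$
is given, in a neighborhood of $\, p \,$, by

$\, f(p + \Delta p) = f(p) + f'(p) \Delta p + o(\Vert \Delta p \Vert) \,$,

where $\, o(\Vert \Delta p \Vert) \,$ denotes a remainder that tends to zero faster than $\, \Vert \Delta p \Vert \,$ as $\, \Delta p \to 0 \,$.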

///////

$\, f(\textcolor{blue}{x_0} + \textcolor{red} {\Delta x}) = f(\textcolor{blue}{x_0}) + \textcolor{blue}{f'(\textcolor{black} {x})_{\textcolor{blue}{x_0}} \textcolor{red} {\Delta x}} + o(\Vert \textcolor{red} {\Delta x} \Vert) \,$.

///////

$\, f(\textcolor{blue} {p} + \textcolor{red} {\Delta x}) = f(\textcolor{blue} {p}) + \textcolor{blue}{f'(\textcolor{black} {x})_{\textcolor{blue} {p}}} \textcolor{red} {\Delta x} + o(\Vert \textcolor{red} {\Delta x} \Vert) \,$.

$\, f(\textcolor{blue} {p} - \textcolor{red} {\Delta x}) = f(\textcolor{blue} {p}) - \textcolor{blue}{f'(\textcolor{black} {x})_{\textcolor{blue} {p}}} \textcolor{red} {\Delta x} + o(\Vert \textcolor{red} {\Delta x} \Vert) \,$.
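
The meaning of the remainder term can be checked numerically. The following is a minimal sketch, assuming the illustrative choice $\, f = \sin \,$, $\, f' = \cos \,$ and $\, p = 1 \,$ (none of which come from the text above); the helper name `affine_error_ratio` is hypothetical. It computes the remainder divided by the step for both the forward and the backward formula, and this ratio should shrink as the step shrinks.

```python
import math

def affine_error_ratio(f, df, p, dx):
    """Return |f(p + dx) - (f(p) + f'(p) dx)| / |dx|, the remainder relative to the step."""
    remainder = f(p + dx) - (f(p) + df(p) * dx)
    return abs(remainder) / abs(dx)

# Illustrative assumption: f = sin, f' = cos, base point p = 1.
p = 1.0
steps = [10.0 ** (-k) for k in range(1, 6)]

# Forward step p + dx and backward step p - dx, as in the two formulas above.
forward = [affine_error_ratio(math.sin, math.cos, p, dx) for dx in steps]
backward = [affine_error_ratio(math.sin, math.cos, p, -dx) for dx in steps]

for dx, fw, bw in zip(steps, forward, backward):
    print(f"dx = {dx:.0e}   forward ratio = {fw:.3e}   backward ratio = {bw:.3e}")
```

If the remainder really is $\, o(\Vert \Delta x \Vert) \,$, both columns of ratios decrease toward zero as the step gets smaller.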

/////// Quoting Courant, R., Differential and Integral Calculus, Vol.I, 1937 (1934):

(Courant, 1937, Vol.I, p.106):

The equation $\, \lim\limits_{h \to 0} \frac{f(x + h) - f(x)} {h} = f'(x) \,$ defining the derivative
is equivalent to the equations

$\, f(x + h) - f(x) = hf'(x) + \epsilon h \,$
or
$\, y + \Delta y = f(x + \Delta x) = f(x) + f'(x) \Delta x + \epsilon \Delta x \,$,

where $\, \epsilon \,$ is a quantity which tends to zero with $\, h = \Delta x \,$. If for the moment we think of the point $\, x \,$ as fixed and the increment $\, \Delta x \,$ as variable, then by this formula the increment of the function, that is, the quantity $\, \Delta y \,$, consists of two terms, namely, a part $\, hf'(x) \,$ which is proportional to $\, h \,$, and an “error” which can be made as small as we please relative to $\, h \,$ by making $\, h \,$ itself small enough. Thus the smaller the interval about the point $\, x \,$ which we consider, the more accurately is the function $\, f(x + h) \,$ (which is a function of $\, h \,$) represented by its linear part $\, f(x) + hf'(x) \,$.

This approximate representation of the function $\, f(x + h) \,$ by a linear function of $\, h \,$ is expressed geometrically by the substitution for the curve of its tangent at the point $\, x \,$. Later, in Chapter VII, we shall consider the practical application of these ideas to the performance of approximate calculations.

/////// End of Quote from Courant (1937)

One term in the above quote needs comment: Courant’s use of the word “linear” when he says that the function $\, f(x + h) \,$ is represented more and more accurately by its linear part $\, f(x) + hf'(x) \,$.

Strictly speaking, the function $\, f(x) + hf'(x) \,$ is not linear as a function of $\, h \,$, since it maps $\, h = 0 \,$ to $\, f(x) \,$, which is in general not zero, whereas a linear function must map zero to zero.

In fact, under the assumption that $\, x \,$ is fixed and $\, h \,$ is variable, the function $\, f(x) + hf'(x) \,$ is affine as a function of $\, h \,$.

The function $\, hf'(x) \,$ on the other hand, is linear as a function of $\, h \,$.

Hence what Courant is presenting in the passage quoted above is in fact the affine approximation of a function $\, f(x) \,$ around a (temporarily) fixed point $\, x \,$.
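
The distinction between linear and affine can be made concrete with a small test. The sketch below assumes the illustrative choice $\, f = \exp \,$ at the fixed point $\, x = 1 \,$ (not taken from Courant); the helper names `linear_part`, `affine_part` and `is_additive` are hypothetical. It checks whether each function of $\, h \,$ maps zero to zero and whether it is additive.

```python
import math

def linear_part(df_at_x):
    """Return the map h -> h * f'(x), the linear part of the increment."""
    return lambda h: h * df_at_x

def affine_part(f_at_x, df_at_x):
    """Return the map h -> f(x) + h * f'(x), the affine approximation of f(x + h)."""
    return lambda h: f_at_x + h * df_at_x

def is_additive(g, a, b, tol=1e-12):
    """Check g(a + b) == g(a) + g(b) up to rounding error."""
    return abs(g(a + b) - (g(a) + g(b))) <= tol

# Illustrative assumption: f = exp at x = 1, so f(x) = f'(x) = e.
x = 1.0
linear = linear_part(math.exp(x))
affine = affine_part(math.exp(x), math.exp(x))

print("linear: value at h = 0:", linear(0.0), " additive:", is_additive(linear, 0.3, 0.5))
print("affine: value at h = 0:", affine(0.0), " additive:", is_additive(affine, 0.3, 0.5))
```

Only the map $\, h \mapsto hf'(x) \,$ passes both checks; the map $\, h \mapsto f(x) + hf'(x) \,$ fails them whenever $\, f(x) \neq 0 \,$.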

Of course, Courant is well aware of the difference between a linear function and an affine function. If we continue the above quote, we find him using the term linear in a correct way:

/////// Quoting (Courant, 1937, Vol.I, p.106):

Here we merely remark in passing that it is possible to use this approximate representation of the increment $\, \Delta y \,$ by the linear expression $\, h f'(x) \,$ to construct a logically satisfactory definition of the notion of a “differential”, as was done by Cauchy in particular.

While the idea of the differential as an infinitely small quantity has no meaning, and it is accordingly futile to define the derivative as the quotient of two such quantities, we may still try to assign a sense to the equation $\, f'(x) = dy / dx \,$ in such a way that the expression $\, dy / dx \,$ need not be thought of as purely symbolic, but as the actual quotient of two quantities $\, dy \,$ and $\, dx \,$.

For this purpose we first define the derivative $\, f'(x) \,$ by our limiting process, then think of $\, x \,$ as fixed and consider the increment $\, h = \Delta x \,$ as the independent variable. This quantity $\, h \,$ we call the differential of $\, x \,$, and write $\, h = dx \,$. We now define the expression $\, dy = y' dx = h f'(x) \,$ as the differential of the function $\, y \,$; $\, dy \,$ is therefore a number which has nothing to do with infinitely small quantities.

So the derivative $\, y' = f'(x) \,$ is now really the quotient of the two differentials $\, dy \,$ and $\, dx \,$; but in this statement there is nothing remarkable; it is, in fact, merely a tautology, a restatement of the verbal definition. The differential $\, dy \,$ is accordingly the linear part of the increment $\, \Delta y \,$ (see fig. 16).

/////// End of Quote from Courant (1937)
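
Courant's definition of the differential can be tried out on numbers. The following is a minimal sketch, assuming the illustrative function $\, y = x^3 \,$ at the fixed point $\, x = 2 \,$ (an assumption, not Courant's own example); the helper name `increment_and_differential` is hypothetical. It computes the increment $\, \Delta y \,$ and the differential $\, dy = f'(x) \, dx \,$ for several finite values of $\, h = dx \,$.

```python
def increment_and_differential(f, df, x, h):
    """Return (Delta y, dy) for the increment h = dx at the fixed point x."""
    delta_y = f(x + h) - f(x)   # actual increment of the function
    dy = df(x) * h              # differential: the linear part of the increment
    return delta_y, dy

# Illustrative assumption: y = x**3, so y' = 3 x**2.
f = lambda x: x ** 3
df = lambda x: 3 * x ** 2

for h in (0.5, 0.1, 0.01):
    delta_y, dy = increment_and_differential(f, df, 2.0, h)
    print(f"h = {h}: Delta y = {delta_y:.6f}, dy = {dy:.6f}, dy / dx = {dy / h:.6f}")
```

The quotient $\, dy / dx \,$ equals $\, f'(2) = 12 \,$ (up to rounding) for every $\, h \,$, however large, which is the tautology Courant mentions, while $\, \Delta y - dy \,$ shrinks faster than $\, h \,$.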

Affine approximation in one dimension:

///////

The Chain Rule in $\, \mathbf{R}^1 \,$:

///////

Slightly different notation, adapted to the pullback $\, x = x(u) \,$:

$\, \begin{matrix} \mathbf{R}^1 & \xleftarrow{f} & \mathbf{R}^1 \\ & & \\ f(x) & \longleftarrow & x \\ & & \\ {\mathbf{R}^1}_{f(\textcolor{blue}{p})} & \xleftarrow{{f'(x)}_{\textcolor{blue}{p}}} & {\mathbf{R}^1}_{\textcolor{blue}{p}} \\ & & \\ df_{f(\textcolor{blue}{p})} = {f'(x)}_{\textcolor{blue}{p}} dx_{\textcolor{blue}{p}} & \longleftarrow & dx_{\textcolor{blue}{p}} \end{matrix} \,$.

Expand the diagram through the substitution $\, x = x(u) \,$:

$\, \begin{matrix} \mathbf{R}^1 & \xleftarrow{f} & \mathbf{R}^1 & \xleftarrow{x} & \mathbf{R}^1 \\ & & & & \\ f(x(u)) & \longleftarrow & x(u) & \longleftarrow & u \\ & & & & \\ {\mathbf{R}^1}_{f(x(\textcolor{blue}{p}))} & \xleftarrow{{f'(x)}_{x(\textcolor{blue}{p})}} & {\mathbf{R}^1}_{x(\textcolor{blue}{p})} & \xleftarrow{{x'(u)}_{\textcolor{blue}{p}}} & {\mathbf{R}^1}_{\textcolor{blue}{p}} \\ & & & & \\ df_{f(x(\textcolor{blue}{p}))} = {f'(x)}_{x(\textcolor{blue}{p})} dx_{x(\textcolor{blue}{p})} & \longleftarrow & dx_{x(\textcolor{blue}{p})} = {x'(u)}_{\textcolor{blue}{p}} du_{\textcolor{blue}{p}} & \longleftarrow & du_{\textcolor{blue}{p}} \end{matrix} \,$.
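
Reading the bottom row of the expanded diagram from right to left, $\, du \,$ is first multiplied by $\, x'(u) \,$ at $\, p \,$ and the result is then multiplied by $\, f'(x) \,$ at $\, x(p) \,$. The sketch below compares this product with a finite-difference derivative of the composite function. It assumes the illustrative choices $\, f = \sin \,$, $\, x(u) = u^2 \,$ and $\, p = 0.7 \,$, none of which come from the text; `central_difference` is a hypothetical helper.

```python
import math

def central_difference(g, p, h=1e-6):
    """Approximate g'(p) by the symmetric difference quotient."""
    return (g(p + h) - g(p - h)) / (2 * h)

# Illustrative assumptions: f(x) = sin x and the substitution x(u) = u**2.
f, df = math.sin, math.cos
x = lambda u: u ** 2
dx = lambda u: 2 * u

p = 0.7
# Composition of the two linear maps in the diagram: f'(x) at x(p) times x'(u) at p.
chain_rule_value = df(x(p)) * dx(p)
# Direct numerical derivative of u -> f(x(u)) at p.
numerical_value = central_difference(lambda u: f(x(u)), p)

print("chain rule :", chain_rule_value)
print("numerical  :", numerical_value)
```

The two printed values should agree to within the accuracy of the difference quotient.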

///////

///////

Substitution in integrals

Functions of one variable:

In one variable, we are mapping from a line to a line. A line segment (such as $\, dx \,$) can be given an “intrinsic” direction (that is, a direction that does not depend on any coordinate representation). In “ordinary” (that is, commutative) algebra, no such directional ordering is possible for higher-dimensional manifolds (such as planes or solids). This is the reason behind the slight (but very important) difference between the variable-substitution formula for one variable and the formula for several variables, which is described below.

In 2 dimensions, $\, dx \,$ represents a surface element with the area $\, dx = dx_1 dx_2 \,$,
and in 3 dimensions, $\, dx \,$ represents a solid element with the volume $\, dx = dx_1 dx_2 dx_3 \,$.

We normally write

$\, \begin{matrix} \mathbf{R} & \xleftarrow{{{\int} \atop {D}}} & \mathbf{R}^1 & \xleftarrow{f} & \mathbf{R}^1 & \xleftarrow{x} & \mathbf{R}^1 \\ & & & & & & & \\ {{\int} \atop {D}} f(x) dx & \longleftarrow & f(x) & \longleftarrow & x & & \\ & & & & & & & \\ & & & & D & & \\ & & & & & & & \\ {{\int} \atop {x^{-1}(D)}} f(x(u)) \det x'(u) du & \longleftarrow & f(x(u)) & \longleftarrow & x(u) & \longleftarrow & u \\ & & & & & & & \\ & & & & & & {x^{-1}(D)} \\ \end{matrix} \,$

$\, {{\int} \atop {D}} f(x) dx \, \equiv \, {{\int} \atop {x^{-1}(D)}} f(x(u)) \det x'(u) du \, \equiv \, \, {{\int} \atop {x^{-1}(D)}} f(x(u)) x'(u) du \,$.
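
The one-variable formula can be checked numerically. The following is a minimal sketch, assuming the illustrative integrand $\, f = \cos \,$ on $\, D = [0, 1] \,$ and two illustrative substitutions, $\, x(u) = u^2 \,$ (increasing) and $\, x(u) = 1 - u \,$ (decreasing); `midpoint_integral` is a hypothetical helper implementing an oriented midpoint rule. In the decreasing case $\, x'(u) = -1 \,$ is used without a modulus, and the sign is compensated by letting $\, u \,$ run from 1 down to 0.

```python
import math

def midpoint_integral(g, a, b, n=20000):
    """Oriented midpoint rule from a to b; swapping a and b flips the sign."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

f = math.cos  # illustrative integrand; the exact integral over [0, 1] is sin(1)

# Left-hand side: integral of f(x) dx over D = [0, 1].
lhs = midpoint_integral(f, 0.0, 1.0)

# Increasing substitution x(u) = u**2, x'(u) = 2u, with u running from 0 to 1.
rhs_increasing = midpoint_integral(lambda u: f(u ** 2) * 2 * u, 0.0, 1.0)

# Decreasing substitution x(u) = 1 - u, x'(u) = -1 (no modulus),
# with u running from 1 down to 0 as x runs from 0 up to 1.
rhs_decreasing = midpoint_integral(lambda u: f(1 - u) * (-1.0), 1.0, 0.0)

print(lhs, rhs_increasing, rhs_decreasing, math.sin(1.0))
```

All three computed values should agree with $\, \sin 1 \,$ up to the discretization error of the midpoint rule.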

///////

Functions of several variables:

$\, \begin{matrix} \mathbf{R} & \xleftarrow{{{\int} \atop {D}}} & \mathbf{R}^n & \xleftarrow{f} & \mathbf{R}^m & \xleftarrow{x} & \mathbf{R}^m \\ & & & & & & & \\ {{\int} \atop {D}} f(x) |dx| & \longleftarrow & f(x) & \longleftarrow & x & & \\ & & & & & & & \\ & & & & D & & \\ & & & & & & & \\ {{\int} \atop {x^{-1}(D)}} f(x(u)) |\det x'(u)| |du| & \longleftarrow & f(x(u)) & \longleftarrow & x(u) & \longleftarrow & u \\ & & & & & & & \\ & & & & & & {x^{-1}(D)} \\ \end{matrix} \,$

$\, {{\int} \atop {D}} f(x)|dx| = {{\int} \atop {x^{-1}(D)}} f(x(u)) |\det x'(u)| | du | \,$.
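
As a sketch of the several-variable formula, the code below integrates over the unit disk $\, D \,$ in the plane using polar coordinates. Polar coordinates, the integrand $\, f(x_1, x_2) = x_1^2 + x_2^2 \,$ and the helper names are illustrative assumptions, not taken from the text. Here $\, u = (r, \theta) \,$, $\, x(u) = (r \cos\theta, r \sin\theta) \,$, and $\, x^{-1}(D) \,$ is taken to be the rectangle $\, 0 < r \le 1 \,$, $\, 0 \le \theta < 2\pi \,$.

```python
import math

def polar_map(r, theta):
    """The substitution x(u) with u = (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)

def polar_jacobian_det(r, theta):
    """Determinant of x'(u) = [[cos t, -r sin t], [sin t, r cos t]], which equals r."""
    return (math.cos(theta) * (r * math.cos(theta))
            - (-r * math.sin(theta)) * math.sin(theta))

def integrate_over_disk(f, n=200):
    """Midpoint rule on the (r, theta) rectangle, weighted by |det x'(u)|."""
    dr = 1.0 / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            x1, x2 = polar_map(r, theta)
            total += f(x1, x2) * abs(polar_jacobian_det(r, theta)) * dr * dtheta
    return total

disk_area = integrate_over_disk(lambda x1, x2: 1.0)
disk_integral = integrate_over_disk(lambda x1, x2: x1 ** 2 + x2 ** 2)

print("area of unit disk   :", disk_area, "(exact: pi)")
print("integral of |x|**2  :", disk_integral, "(exact: pi / 2)")
```

On this rectangle the determinant equals $\, r > 0 \,$, so the modulus changes nothing here; it matters for substitutions whose determinant is negative.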

///////

$\, x = x(u) \,$:

$\, x(\textcolor{blue} {p} + \textcolor{red} {\Delta u}) = x(\textcolor{blue}{p}) + \textcolor{blue} {x'(\textcolor{black} {u})_p} \textcolor{red} {\Delta u} + o(\Vert \textcolor{red} {\Delta u} \Vert) \,$.

$\, f(x) = f(x(u)) \,$:

$\, f(x(\textcolor{blue} {p} + \textcolor{red} {\Delta u})) = f(x(\textcolor{blue} {p})) + \textcolor{blue} {f'(\textcolor{black} {x})_{x(p)} \textcolor{blue} {x'(\textcolor{black} {u})_p}} \textcolor{red} {\Delta u} + o(\Vert \textcolor{red} {\Delta u} \Vert) \,$.

NOTE: There is a subtle but very important difference between the substitution formulas in one and in several variables. In the case of one variable the determinant (which is then simply the derivative $\, x'(u) \,$) appears WITHOUT a modulus, whereas in the case of several variables the determinant appears WITH a modulus.

The modulus around the determinant reflects the fact that, using ordinary (= commutative) polynomial algebra, it is impossible to assign a direction to objects of more than one dimension, such as planes and solids, in a coordinate-free way. As a consequence, we must know in advance where the determinant changes its sign, since the formula may not be applied to a surface or a solid across which this happens. Such a region is usually split first into parts on each of which the determinant keeps one sign.
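
What goes wrong when the determinant changes sign can already be seen in a one-dimensional analogue, chosen here only for simplicity. The sketch assumes the illustrative substitution $\, x(u) = u^2 \,$ on $\, -1 \le u \le 1 \,$, whose derivative $\, 2u \,$ changes sign at $\, u = 0 \,$, and the illustrative integrand $\, f = \exp \,$ on $\, D = [0, 1] \,$; `midpoint_integral` is a hypothetical helper.

```python
import math

def midpoint_integral(g, a, b, n=20000):
    """Oriented midpoint rule from a to b."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

f = math.exp
x = lambda u: u * u      # maps [-1, 1] onto D = [0, 1], covering it twice
dx = lambda u: 2 * u     # changes sign at u = 0

# Integral of f over D = [0, 1]; the exact value is e - 1.
target = midpoint_integral(f, 0.0, 1.0)

# Without modulus over all of [-1, 1]: the two halves cancel each other.
signed = midpoint_integral(lambda u: f(x(u)) * dx(u), -1.0, 1.0)

# With modulus over all of [-1, 1]: D is counted twice.
unsigned = midpoint_integral(lambda u: f(x(u)) * abs(dx(u)), -1.0, 1.0)

# With modulus on the part [0, 1] only, where the derivative keeps one sign.
restricted = midpoint_integral(lambda u: f(x(u)) * abs(dx(u)), 0.0, 1.0)

print("target     :", target)
print("signed     :", signed)
print("unsigned   :", unsigned)
print("restricted :", restricted)
```

Over the whole interval neither version reproduces the target: the signed version gives approximately zero and the version with modulus gives approximately twice the target. Only after restricting to a part on which the derivative keeps one sign does the formula with modulus return the integral over $\, D \,$.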

///////