Multivariable Integral · advanced · 55 min read

Surface Integrals & the Divergence Theorem

Integrating functions and vector fields over surfaces in ℝ³ — flux through oriented surfaces, the 3D curl and divergence, Stokes' theorem generalizing Green's from 2D to 3D, and the divergence theorem relating boundary flux to interior divergence.

Abstract. Surface integrals extend integration from curves to two-dimensional surfaces embedded in ℝ³. Given a parameterized surface S defined by r(u,v) for (u,v) in a parameter domain D*, the tangent vectors r_u and r_v span the tangent plane at each point, and their cross product r_u × r_v yields the normal vector whose magnitude is the surface area element dS = ‖r_u × r_v‖ du dv. For a scalar function f, the scalar surface integral ∬_S f dS = ∬_{D*} f(r(u,v)) ‖r_u × r_v‖ du dv sums f over S weighted by area. For a vector field F, the flux integral ∬_S F · dS = ∬_{D*} F(r(u,v)) · (r_u × r_v) du dv measures the net flow of F through the oriented surface. These constructions culminate in two fundamental theorems. Stokes' theorem ∮_C F · dr = ∬_S (∇ × F) · dS generalizes Green's theorem from 2D to 3D: the circulation of F around the boundary curve C of a surface S equals the flux of the curl ∇ × F through S. The divergence theorem (Gauss's theorem) ∬_S F · dS = ∭_E ∇ · F dV relates the net outward flux of F through a closed surface S to the total divergence of F in the enclosed volume E. Together, these theorems unify Green's theorem, the Gradient Theorem, and the Fundamental Theorem of Calculus under a single framework — the generalized Stokes' theorem ∫_M dω = ∫_{∂M} ω — connecting boundary integrals to interior derivatives at every dimension. In machine learning, the divergence theorem appears in conservation laws for gradient flow trajectories, in Stein's identity (the foundation of Stein variational gradient descent), in physics-informed neural networks enforcing PDE constraints, and in the flow-matching framework for generative models where the continuity equation ∂_t p + ∇ · (p v) = 0 governs density evolution under a learned velocity field.

Where this leads → formalML

  • formalML The divergence theorem quantifies how gradient flow trajectories converge or diverge through surfaces in parameter space. The divergence ∇ · (−∇L) = −ΔL determines whether trajectories focus into or spread from a region — connecting the Laplacian of the loss to the convergence geometry of optimization.
  • formalML The divergence theorem yields Stein's identity: E[∇ · F(X)] = −E[F(X) · ∇ log p(X)] for a smooth density p with suitable boundary conditions. This is the foundation of Stein variational gradient descent (SVGD) and kernel Stein discrepancy.
  • formalML Stokes' theorem on manifolds ∫_M dω = ∫_{∂M} ω subsumes Green's theorem, the classical Stokes' theorem, and the divergence theorem as dimension-specific instances. Surface integrals are integrals of 2-forms, and the surface area element dS is the pullback of the area 2-form.
  • formalML The Fisher-Rao volume form on the statistical manifold induces surface integrals that measure 'statistical area.' The divergence theorem on the statistical manifold connects natural gradient flow conservation to boundary flux in parameter space.

1. Overview & Motivation

You’re building a flow-matching generative model. The core idea: learn a velocity field v(x,t)\mathbf{v}(\mathbf{x}, t) that transforms a simple base distribution (say, a Gaussian) into your target data distribution over time t[0,1]t \in [0, 1]. The evolving probability density p(x,t)p(\mathbf{x}, t) obeys the continuity equation:

pt+(pv)=0.\frac{\partial p}{\partial t} + \nabla \cdot (p\,\mathbf{v}) = 0.

This equation says that probability is neither created nor destroyed — it flows. But how do we verify that no probability mass leaks through the boundaries of a region? We integrate the continuity equation over a volume EE and apply the divergence theorem:

ddtEpdV=EpvdS.\frac{d}{dt} \iiint_E p\,dV = -\oiint_{\partial E} p\,\mathbf{v} \cdot d\mathbf{S}.

The left side is the rate of change of total probability inside EE. The right side is the net flux of the probability current pvp\,\mathbf{v} through the boundary surface E\partial E. The divergence theorem converts the volume integral of (pv)\nabla \cdot (p\,\mathbf{v}) into this boundary flux — verifying mass conservation without tracking individual particles.

This is why we need surface integrals. They measure flow through surfaces — the net amount of a vector quantity passing through a two-dimensional membrane embedded in R3\mathbb{R}^3. The divergence theorem then connects that surface measurement to what happens in the interior. Together with Stokes’ theorem (which connects surface integrals to line integrals around the boundary), these results complete the hierarchy that started with the Fundamental Theorem of Calculus: interior derivatives determine boundary integrals, at every dimension.

2. Parameterized Surfaces & the Area Element

A surface in R3\mathbb{R}^3 is the two-dimensional analog of a curve. Where a curve is traced by one parameter tt, a surface is swept out by two parameters (u,v)(u, v). At each point, the partial derivatives ru\mathbf{r}_u and rv\mathbf{r}_v are tangent vectors that span the tangent plane. Their cross product ru×rv\mathbf{r}_u \times \mathbf{r}_v is perpendicular to the surface — it is the normal vector — and its magnitude gives the area of the infinitesimal parallelogram spanned by rudu\mathbf{r}_u\,du and rvdv\mathbf{r}_v\,dv. This magnitude is the surface area element dSdS.

The connection to the Jacobian (Topic 14) is precise: the parameterization r:DR3\mathbf{r}: D^* \to \mathbb{R}^3 is a map from a 2D parameter domain to 3D space. Its Jacobian matrix Jr=[rurv]J_{\mathbf{r}} = [\mathbf{r}_u \mid \mathbf{r}_v] is 3×23 \times 2. In the change-of-variables theorem, the scaling factor was detJϕ|\det J_\phi|, the absolute value of the determinant of a square Jacobian. Here the Jacobian is not square, so we use the Gram determinant det(JrTJr)\sqrt{\det(J_{\mathbf{r}}^T J_{\mathbf{r}})} instead — and this turns out to equal ru×rv\|\mathbf{r}_u \times \mathbf{r}_v\|.

📐 Definition 1 (Parameterized Surface)

A parameterized surface in R3\mathbb{R}^3 is a C1C^1 function r:DR3\mathbf{r}: D^* \to \mathbb{R}^3, where DR2D^* \subseteq \mathbb{R}^2 is a bounded, closed region (the parameter domain). We write r(u,v)=(x(u,v),  y(u,v),  z(u,v))\mathbf{r}(u, v) = (x(u,v),\; y(u,v),\; z(u,v)).

The surface is regular (or smooth) if the cross product ru×rv0\mathbf{r}_u \times \mathbf{r}_v \neq \mathbf{0} for all (u,v)(u, v) in the interior of DD^*. This ensures the tangent plane is well-defined at every point — the two tangent vectors are linearly independent, and the surface has no cusps or self-intersections locally.

📐 Definition 2 (Cross Product)

For vectors a=(a1,a2,a3)\mathbf{a} = (a_1, a_2, a_3) and b=(b1,b2,b3)\mathbf{b} = (b_1, b_2, b_3) in R3\mathbb{R}^3, the cross product is:

a×b=(a2b3a3b2,  a3b1a1b3,  a1b2a2b1).\mathbf{a} \times \mathbf{b} = (a_2 b_3 - a_3 b_2,\; a_3 b_1 - a_1 b_3,\; a_1 b_2 - a_2 b_1).

Equivalently, using the determinant mnemonic:

a×b=i^j^k^a1a2a3b1b2b3.\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}.

Key properties:

  1. Orthogonality: (a×b)a=0(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{a} = 0 and (a×b)b=0(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{b} = 0 — the cross product is perpendicular to both factors.
  2. Area: a×b=absinθ\|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\|\,\|\mathbf{b}\|\sin\theta — the magnitude equals the area of the parallelogram spanned by a\mathbf{a} and b\mathbf{b}.
  3. Anti-commutativity: b×a=(a×b)\mathbf{b} \times \mathbf{a} = -(\mathbf{a} \times \mathbf{b}) — swapping the factors reverses the direction.

📐 Definition 3 (Surface Area Element)

For a regular parameterized surface r:DR3\mathbf{r}: D^* \to \mathbb{R}^3, the surface area element is:

dS=ru×rvdudv.dS = \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv.

The surface area of SS is:

Area(S)=Dru×rvdudv.\text{Area}(S) = \iint_{D^*} \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv.

The factor ru×rv\|\mathbf{r}_u \times \mathbf{r}_v\| is the area of the infinitesimal parallelogram spanned by the tangent vectors — it measures how much the parameterization stretches area at each point.

💡 Remark 1 (Connection to the Jacobian)

The Jacobian matrix of the parameterization is the 3×23 \times 2 matrix Jr=[rurv]J_{\mathbf{r}} = [\mathbf{r}_u \mid \mathbf{r}_v], with columns ru\mathbf{r}_u and rv\mathbf{r}_v. The Gram matrix is the 2×22 \times 2 matrix:

JrTJr=(rurururvrvrurvrv)=(ru2rurvrurvrv2).J_{\mathbf{r}}^T J_{\mathbf{r}} = \begin{pmatrix} \mathbf{r}_u \cdot \mathbf{r}_u & \mathbf{r}_u \cdot \mathbf{r}_v \\ \mathbf{r}_v \cdot \mathbf{r}_u & \mathbf{r}_v \cdot \mathbf{r}_v \end{pmatrix} = \begin{pmatrix} \|\mathbf{r}_u\|^2 & \mathbf{r}_u \cdot \mathbf{r}_v \\ \mathbf{r}_u \cdot \mathbf{r}_v & \|\mathbf{r}_v\|^2 \end{pmatrix}.

By the Lagrange identity:

ru×rv2=ru2rv2(rurv)2=det(JrTJr).\|\mathbf{r}_u \times \mathbf{r}_v\|^2 = \|\mathbf{r}_u\|^2 \|\mathbf{r}_v\|^2 - (\mathbf{r}_u \cdot \mathbf{r}_v)^2 = \det(J_{\mathbf{r}}^T J_{\mathbf{r}}).

So dS=det(JrTJr)dudvdS = \sqrt{\det(J_{\mathbf{r}}^T J_{\mathbf{r}})}\,du\,dv. This is the natural generalization of the 1D Jacobian detJϕ|\det J_\phi| from Topic 14: when the map goes from Rk\mathbb{R}^k to Rn\mathbb{R}^n with knk \le n, the area scaling factor is det(JTJ)\sqrt{\det(J^T J)}.

📝 Example 1 (Sphere of Radius R)

Parameterize the sphere of radius RR using spherical coordinates (θ,ϕ)(\theta, \phi) with θ[0,2π]\theta \in [0, 2\pi] (azimuthal) and ϕ[0,π]\phi \in [0, \pi] (polar):

r(θ,ϕ)=(Rsinϕcosθ,  Rsinϕsinθ,  Rcosϕ).\mathbf{r}(\theta, \phi) = (R\sin\phi\cos\theta,\; R\sin\phi\sin\theta,\; R\cos\phi).

The tangent vectors are:

rθ=(Rsinϕsinθ,  Rsinϕcosθ,  0),\mathbf{r}_\theta = (-R\sin\phi\sin\theta,\; R\sin\phi\cos\theta,\; 0),

rϕ=(Rcosϕcosθ,  Rcosϕsinθ,  Rsinϕ).\mathbf{r}_\phi = (R\cos\phi\cos\theta,\; R\cos\phi\sin\theta,\; -R\sin\phi).

We compute the cross product component by component:

rθ×rϕ=i^j^k^RsinϕsinθRsinϕcosθ0RcosϕcosθRcosϕsinθRsinϕ.\mathbf{r}_\theta \times \mathbf{r}_\phi = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ -R\sin\phi\sin\theta & R\sin\phi\cos\theta & 0 \\ R\cos\phi\cos\theta & R\cos\phi\sin\theta & -R\sin\phi \end{vmatrix}.

The i^\hat{\mathbf{i}}-component: (Rsinϕcosθ)(Rsinϕ)(0)(Rcosϕsinθ)=R2sin2ϕcosθ(R\sin\phi\cos\theta)(-R\sin\phi) - (0)(R\cos\phi\sin\theta) = -R^2\sin^2\phi\cos\theta.

The j^\hat{\mathbf{j}}-component: (0)(Rcosϕcosθ)(Rsinϕsinθ)(Rsinϕ)=R2sin2ϕsinθ(0)(R\cos\phi\cos\theta) - (-R\sin\phi\sin\theta)(-R\sin\phi) = -R^2\sin^2\phi\sin\theta.

The k^\hat{\mathbf{k}}-component: (Rsinϕsinθ)(Rcosϕsinθ)(Rsinϕcosθ)(Rcosϕcosθ)=R2sinϕcosϕ(-R\sin\phi\sin\theta)(R\cos\phi\sin\theta) - (R\sin\phi\cos\theta)(R\cos\phi\cos\theta) = -R^2\sin\phi\cos\phi.

So rθ×rϕ=R2sinϕ(Rsinϕcosθ,  sinϕsinθ,  cosϕ)\mathbf{r}_\theta \times \mathbf{r}_\phi = -R^2\sin\phi\,(R\sin\phi\cos\theta,\; \sin\phi\sin\theta,\; \cos\phi). Wait — let us be more careful. Factoring:

rθ×rϕ=(R2sin2ϕcosθ,  R2sin2ϕsinθ,  R2sinϕcosϕ).\mathbf{r}_\theta \times \mathbf{r}_\phi = (-R^2\sin^2\phi\cos\theta,\; -R^2\sin^2\phi\sin\theta,\; -R^2\sin\phi\cos\phi).

=R2sinϕ(sinϕcosθ,  sinϕsinθ,  cosϕ).= -R^2\sin\phi\,(\sin\phi\cos\theta,\; \sin\phi\sin\theta,\; \cos\phi).

The vector (sinϕcosθ,  sinϕsinθ,  cosϕ)(\sin\phi\cos\theta,\; \sin\phi\sin\theta,\; \cos\phi) is the outward unit normal n^\hat{\mathbf{n}} on the sphere, so rθ×rϕ=R2sinϕn^\mathbf{r}_\theta \times \mathbf{r}_\phi = -R^2\sin\phi\,\hat{\mathbf{n}}. (The negative sign means this particular parameterization gives the inward normal; we can reverse it by swapping the order of the cross product.) The magnitude is:

rθ×rϕ=R2sinϕ(since sinϕ0 for ϕ[0,π]).\|\mathbf{r}_\theta \times \mathbf{r}_\phi\| = R^2\sin\phi \quad (\text{since } \sin\phi \ge 0 \text{ for } \phi \in [0, \pi]).

The surface area is:

Area(SR2)=02π0πR2sinϕdϕdθ=R22π[cosϕ]0π=R22π2=4πR2.\text{Area}(S^2_R) = \int_0^{2\pi}\int_0^\pi R^2\sin\phi\,d\phi\,d\theta = R^2 \cdot 2\pi \cdot [-\cos\phi]_0^\pi = R^2 \cdot 2\pi \cdot 2 = 4\pi R^2.

📝 Example 2 (Cylinder)

The cylinder of radius RR and height hh is parameterized by:

r(θ,z)=(Rcosθ,  Rsinθ,  z),θ[0,2π],  z[0,h].\mathbf{r}(\theta, z) = (R\cos\theta,\; R\sin\theta,\; z), \quad \theta \in [0, 2\pi],\; z \in [0, h].

The tangent vectors are rθ=(Rsinθ,  Rcosθ,  0)\mathbf{r}_\theta = (-R\sin\theta,\; R\cos\theta,\; 0) and rz=(0,0,1)\mathbf{r}_z = (0, 0, 1). The cross product is:

rθ×rz=(Rcosθ,  Rsinθ,  0),\mathbf{r}_\theta \times \mathbf{r}_z = (R\cos\theta,\; R\sin\theta,\; 0),

with magnitude rθ×rz=R\|\mathbf{r}_\theta \times \mathbf{r}_z\| = R. The lateral surface area is:

Area=02π0hRdzdθ=2πRh.\text{Area} = \int_0^{2\pi}\int_0^h R\,dz\,d\theta = 2\pi R h.

📝 Example 3 (Graph Surface z = g(x,y))

A surface defined as the graph of a function z=g(x,y)z = g(x, y) over a region DR2D \subseteq \mathbb{R}^2 has the natural parameterization r(x,y)=(x,y,g(x,y))\mathbf{r}(x, y) = (x, y, g(x,y)). The tangent vectors are:

rx=(1,0,gx),ry=(0,1,gy).\mathbf{r}_x = (1, 0, g_x), \quad \mathbf{r}_y = (0, 1, g_y).

The cross product is:

rx×ry=(gx,gy,1),\mathbf{r}_x \times \mathbf{r}_y = (-g_x, -g_y, 1),

with magnitude rx×ry=1+gx2+gy2\|\mathbf{r}_x \times \mathbf{r}_y\| = \sqrt{1 + g_x^2 + g_y^2}. So the surface area element for a graph is:

dS=1+gx2+gy2dA,dS = \sqrt{1 + g_x^2 + g_y^2}\,dA,

where dA=dxdydA = dx\,dy. This is a formula worth memorizing: for a graph surface, the area element is the Euclidean area dAdA scaled by the factor 1+g2\sqrt{1 + \|\nabla g\|^2}, which measures how much the surface tilts away from horizontal.

🔷 Proposition 1 (Parameterization Independence)

The surface area Area(S)=Dru×rvdudv\text{Area}(S) = \iint_{D^*} \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv is independent of the parameterization. If r~=rϕ\tilde{\mathbf{r}} = \mathbf{r} \circ \phi is a reparameterization via a C1C^1 diffeomorphism ϕ:D~D\phi: \tilde{D}^* \to D^*, then Area\text{Area} computed via r~\tilde{\mathbf{r}} equals Area\text{Area} computed via r\mathbf{r}.

Proof.

By the chain rule, the Jacobian of r~=rϕ\tilde{\mathbf{r}} = \mathbf{r} \circ \phi is Jr~=JrJϕJ_{\tilde{\mathbf{r}}} = J_{\mathbf{r}} \cdot J_\phi, where JϕJ_\phi is the 2×22 \times 2 Jacobian of the reparameterization. The tangent vectors transform as:

r~s×r~t=(ru×rv)detJϕ.\tilde{\mathbf{r}}_s \times \tilde{\mathbf{r}}_t = (\mathbf{r}_u \times \mathbf{r}_v) \cdot \det J_\phi.

Taking magnitudes and applying the change of variables theorem (Topic 14):

D~r~s×r~tdsdt=D~ru×rvdetJϕdsdt=Dru×rvdudv.\iint_{\tilde{D}^*} \|\tilde{\mathbf{r}}_s \times \tilde{\mathbf{r}}_t\|\,ds\,dt = \iint_{\tilde{D}^*} \|\mathbf{r}_u \times \mathbf{r}_v\| \cdot |\det J_\phi|\,ds\,dt = \iint_{D^*} \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv.

The detJϕ|\det J_\phi| from the cross product formula is absorbed by the detJϕ1|\det J_\phi|^{-1} from the change-of-variables substitution, leaving the original integral. \square

Parameterized surface with tangent vectors, normal vector, and infinitesimal area element

Selected (u, v): (3.142, 1.571)
r(u,v): (-1.000, 0.000, 0.000)
‖r_u × r_v‖: 1.000
r_u: (-0.000, -1.000, 0.000)
r_v: (-0.000, 0.000, -1.000)
n̂: (1.000, -0.000, -0.000)

Click on the parameter domain (left) to select a point. Drag to rotate the 3D view (right).

3. Scalar Surface Integrals

The scalar surface integral SfdS\iint_S f\,dS sums the values of a function ff over a surface SS, weighted by the surface area element. This is the 2D analog of the scalar line integral Cfds\int_C f\,ds from Topic 15 — where the wire becomes a thin shell.

Imagine a thin hemispherical dome with a density that varies from point to point — thicker at the base, thinner at the top. The total mass is SfdS\iint_S f\,dS, where ff is the density (mass per unit area) at each point on the surface. We weight by dSdS rather than dudvdu\,dv because the physical mass depends on the geometry of the surface, not on the parameterization. Just as a wire’s mass depended on arc length, a shell’s mass depends on surface area.

📐 Definition 4 (Scalar Surface Integral)

Let SS be a regular parameterized surface r:DR3\mathbf{r}: D^* \to \mathbb{R}^3, and let f:SRf: S \to \mathbb{R} be continuous. The scalar surface integral of ff over SS is:

SfdS=Df(r(u,v))ru×rvdudv.\iint_S f\,dS = \iint_{D^*} f(\mathbf{r}(u,v))\,\|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv.

This is a double Riemann integral (Topic 13) of the composite function (u,v)f(r(u,v))ru×rv(u,v) \mapsto f(\mathbf{r}(u,v)) \cdot \|\mathbf{r}_u \times \mathbf{r}_v\| over the parameter domain DD^*.

💡 Remark 2 (Parameterization Independence)

The scalar surface integral SfdS\iint_S f\,dS is independent of the parameterization, including orientation. The proof is identical to Proposition 1: the detJϕ|\det J_\phi| from the area element cancels with the detJϕ1|\det J_\phi|^{-1} from the change of variables in the double integral. The absolute value ensures the result holds regardless of whether the reparameterization preserves or reverses orientation.

📝 Example 4 (Mass of a Hemispherical Shell)

Let SS be the upper hemisphere of radius RR (i.e., x2+y2+z2=R2x^2 + y^2 + z^2 = R^2 with z0z \ge 0) and let f(x,y,z)=zf(x,y,z) = z be the density. From Example 1, we use the spherical parameterization with ϕ[0,π/2]\phi \in [0, \pi/2] (upper hemisphere only) and rθ×rϕ=R2sinϕ\|\mathbf{r}_\theta \times \mathbf{r}_\phi\| = R^2\sin\phi. On the sphere, z=Rcosϕz = R\cos\phi, so:

SzdS=02π0π/2(Rcosϕ)(R2sinϕ)dϕdθ=R302πdθ0π/2sinϕcosϕdϕ.\iint_S z\,dS = \int_0^{2\pi}\int_0^{\pi/2} (R\cos\phi)(R^2\sin\phi)\,d\phi\,d\theta = R^3 \int_0^{2\pi} d\theta \int_0^{\pi/2} \sin\phi\cos\phi\,d\phi.

The inner integral: 0π/2sinϕcosϕdϕ=120π/2sin(2ϕ)dϕ=12[cos(2ϕ)2]0π/2=121+12=12\int_0^{\pi/2} \sin\phi\cos\phi\,d\phi = \frac{1}{2}\int_0^{\pi/2}\sin(2\phi)\,d\phi = \frac{1}{2}\left[-\frac{\cos(2\phi)}{2}\right]_0^{\pi/2} = \frac{1}{2}\cdot\frac{1+1}{2} = \frac{1}{2}.

So SzdS=R32π12=πR3\iint_S z\,dS = R^3 \cdot 2\pi \cdot \frac{1}{2} = \pi R^3.

📝 Example 5 (Average Temperature on a Surface)

The average value of ff over SS is:

fˉ=1Area(S)SfdS,\bar{f} = \frac{1}{\text{Area}(S)}\iint_S f\,dS,

directly analogous to fˉ=1baabf(x)dx\bar{f} = \frac{1}{b-a}\int_a^b f(x)\,dx from single-variable calculus (Topic 7) and fˉ=1L(C)Cfds\bar{f} = \frac{1}{L(C)}\int_C f\,ds for curves (Topic 15, Example 5). For the hemispherical shell with f=zf = z, the average height is zˉ=πR32πR2=R2\bar{z} = \frac{\pi R^3}{2\pi R^2} = \frac{R}{2} — the average height on the hemisphere is half the radius, which matches geometric intuition (the hemisphere is “top-heavy” in the zz-direction but the area element weights the equatorial region more heavily).

Scalar surface integral as mass of a thin shell with varying density

4. Oriented Surfaces & Flux Integrals

We now move from scalar functions to vector fields. The question changes from “how much density sits on the surface?” to “how much fluid flows through the surface?”

Think of F\mathbf{F} as the velocity field of a fluid and SS as a fishing net stretched across the flow. At each point on the net, only the component of F\mathbf{F} normal to the surface passes through — the tangential component slides along the net without crossing it. The flux is the integral of Fn^\mathbf{F} \cdot \hat{\mathbf{n}} over the surface: the total rate at which fluid passes through from one side to the other.

To define flux, we need a consistent notion of “which side is which” — a choice of orientation, meaning a continuous choice of unit normal vector n^\hat{\mathbf{n}} at each point.

📐 Definition 5 (Oriented Surface)

A surface SS is orientable if it admits a continuous unit normal vector field n^:SR3\hat{\mathbf{n}}: S \to \mathbb{R}^3 with n^=1\|\hat{\mathbf{n}}\| = 1 and n^\hat{\mathbf{n}} perpendicular to the tangent plane at every point. An oriented surface is an orientable surface together with a specific choice of n^\hat{\mathbf{n}}.

For a parameterized surface, the two orientations correspond to n^=ru×rvru×rv\hat{\mathbf{n}} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{\|\mathbf{r}_u \times \mathbf{r}_v\|} and n^=ru×rvru×rv\hat{\mathbf{n}} = -\frac{\mathbf{r}_u \times \mathbf{r}_v}{\|\mathbf{r}_u \times \mathbf{r}_v\|}.

Not every surface is orientable. The Mobius strip is the classic counterexample: if you start with a normal vector and slide it continuously around the strip, it returns pointing the opposite way. There is no globally consistent choice of “inside” and “outside.”

💡 Remark 3 (Orientation Conventions)

Two standard conventions govern orientation:

  1. Closed surfaces (surfaces that enclose a volume, like a sphere or cube): the outward-pointing normal is the positive orientation. Flux with the outward normal measures net outflow.

  2. Surfaces with boundary (for Stokes’ theorem): the right-hand rule determines the orientation. If you curl the fingers of your right hand in the direction of traversal along the boundary curve CC, your thumb points in the direction of n^\hat{\mathbf{n}}. Equivalently: walking along CC with n^\hat{\mathbf{n}} pointing up from your head, the surface is on your left.

📐 Definition 6 (Flux Integral)

Let SS be an oriented surface parameterized by r:DR3\mathbf{r}: D^* \to \mathbb{R}^3 (with the orientation given by ru×rv\mathbf{r}_u \times \mathbf{r}_v), and let F:SR3\mathbf{F}: S \to \mathbb{R}^3 be a continuous vector field. The flux integral (or surface integral of a vector field) is:

SFdS=DF(r(u,v))(ru×rv)dudv.\iint_S \mathbf{F} \cdot d\mathbf{S} = \iint_{D^*} \mathbf{F}(\mathbf{r}(u,v)) \cdot (\mathbf{r}_u \times \mathbf{r}_v)\,du\,dv.

Equivalently, writing dS=n^dSd\mathbf{S} = \hat{\mathbf{n}}\,dS:

SFdS=S(Fn^)dS.\iint_S \mathbf{F} \cdot d\mathbf{S} = \iint_S (\mathbf{F} \cdot \hat{\mathbf{n}})\,dS.

The flux measures the net flow of F\mathbf{F} through SS in the direction of n^\hat{\mathbf{n}}. Positive flux means net flow in the n^\hat{\mathbf{n}} direction; negative flux means net flow opposite to n^\hat{\mathbf{n}}.

📝 Example 6 (Flux Through a Hemisphere)

Let F=(0,0,z)\mathbf{F} = (0, 0, z) (a vertical field, stronger at greater heights) and let SS be the upper hemisphere of the unit sphere (R=1R = 1) with the outward normal. From Example 1, rθ×rϕ=sinϕ(sinϕcosθ,  sinϕsinθ,  cosϕ)\mathbf{r}_\theta \times \mathbf{r}_\phi = -\sin\phi\,(\sin\phi\cos\theta,\; \sin\phi\sin\theta,\; \cos\phi).

Since we want the outward normal, we use rϕ×rθ=sinϕ(sinϕcosθ,  sinϕsinθ,  cosϕ)\mathbf{r}_\phi \times \mathbf{r}_\theta = \sin\phi\,(\sin\phi\cos\theta,\; \sin\phi\sin\theta,\; \cos\phi). On the sphere, z=cosϕz = \cos\phi, so F(r(θ,ϕ))=(0,0,cosϕ)\mathbf{F}(\mathbf{r}(\theta,\phi)) = (0, 0, \cos\phi).

F(rϕ×rθ)=(0,0,cosϕ)sinϕ(sinϕcosθ,  sinϕsinθ,  cosϕ)=sinϕcos2ϕ.\mathbf{F} \cdot (\mathbf{r}_\phi \times \mathbf{r}_\theta) = (0, 0, \cos\phi) \cdot \sin\phi\,(\sin\phi\cos\theta,\; \sin\phi\sin\theta,\; \cos\phi) = \sin\phi\cos^2\phi.

The flux is:

SFdS=02π0π/2sinϕcos2ϕdϕdθ=2π0π/2sinϕcos2ϕdϕ.\iint_S \mathbf{F} \cdot d\mathbf{S} = \int_0^{2\pi}\int_0^{\pi/2} \sin\phi\cos^2\phi\,d\phi\,d\theta = 2\pi \int_0^{\pi/2} \sin\phi\cos^2\phi\,d\phi.

With the substitution u=cosϕu = \cos\phi, du=sinϕdϕdu = -\sin\phi\,d\phi:

2π10u2(du)=2π01u2du=2π13=2π3.2\pi \int_1^0 u^2\,(-du) = 2\pi \int_0^1 u^2\,du = 2\pi \cdot \frac{1}{3} = \frac{2\pi}{3}.

📝 Example 7 (Flux of the Position Field Through a Sphere)

Let F(x,y,z)=(x,y,z)\mathbf{F}(x,y,z) = (x, y, z) — the position (or radial) field — and let SS be the sphere of radius RR with the outward normal. On the sphere, n^=1R(x,y,z)\hat{\mathbf{n}} = \frac{1}{R}(x, y, z), so:

Fn^=1R(x2+y2+z2)=R2R=R.\mathbf{F} \cdot \hat{\mathbf{n}} = \frac{1}{R}(x^2 + y^2 + z^2) = \frac{R^2}{R} = R.

The normal component of F\mathbf{F} is the constant RR on the entire sphere. The flux is:

SFdS=SRdS=RArea(S)=R4πR2=4πR3.\iint_S \mathbf{F} \cdot d\mathbf{S} = \iint_S R\,dS = R \cdot \text{Area}(S) = R \cdot 4\pi R^2 = 4\pi R^3.

We can verify this using the divergence theorem (Theorem 3, Section 7): F=xx+yy+zz=3\nabla \cdot \mathbf{F} = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 3, so E3dV=343πR3=4πR3\iiint_E 3\,dV = 3 \cdot \frac{4}{3}\pi R^3 = 4\pi R^3. The two sides match — this is the divergence theorem at work.

💡 Remark 4 (Orientation Reversal)

Reversing the orientation of SS — replacing n^\hat{\mathbf{n}} by n^-\hat{\mathbf{n}} — negates the flux integral:

SFdS=SFdS.\iint_{-S} \mathbf{F} \cdot d\mathbf{S} = -\iint_S \mathbf{F} \cdot d\mathbf{S}.

This is the surface analog of the line integral identity CFdr=CFdr\int_{-C} \mathbf{F} \cdot d\mathbf{r} = -\int_C \mathbf{F} \cdot d\mathbf{r} from Topic 15, Remark 2. The scalar surface integral SfdS\iint_S f\,dS is orientation-independent (it uses ru×rv\|\mathbf{r}_u \times \mathbf{r}_v\|, which is always positive), but the flux integral is orientation-sensitive (it uses ru×rv\mathbf{r}_u \times \mathbf{r}_v directly, including its sign).

Flux integral: vector field arrows passing through an oriented surface, normal component highlighted

Total flux S F · dS ≈ 12.5696
Surface: Unit sphere x² + y² + z² = 1, outward normal
Field: Uniform expansion from origin. div = 3 everywhere, curl = 0.
Div thm: E ∇·F dV (predicted)

Drag to rotate. Green = positive flux (outward), Red = negative flux (inward).

5. The 3D Curl and Divergence

Before stating Stokes’ theorem and the divergence theorem, we need the two differential operators that generalize the 2D curl from Topic 15 to three dimensions.

The geometric intuition is direct. The curl of a vector field F\mathbf{F} at a point measures the rotation — the axis and angular velocity of the infinitesimal “paddlewheel” that F\mathbf{F} would spin. The divergence measures the source strength — how much F\mathbf{F} is “spreading out” or “converging” at that point. If F\mathbf{F} is a fluid velocity field, ×F\nabla \times \mathbf{F} points along the local rotation axis, and F\nabla \cdot \mathbf{F} is the rate of volume expansion per unit volume.

📐 Definition 7 (3D Curl)

For a C1C^1 vector field F=(P,Q,R):DR3\mathbf{F} = (P, Q, R): D \to \mathbb{R}^3, the curl of F\mathbf{F} is:

×F=(RyQz)i^+(PzRx)j^+(QxPy)k^.\nabla \times \mathbf{F} = \left(\frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}\right)\hat{\mathbf{i}} + \left(\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}\right)\hat{\mathbf{j}} + \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\hat{\mathbf{k}}.

Using the determinant mnemonic:

×F=i^j^k^xyzPQR.\nabla \times \mathbf{F} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ P & Q & R \end{vmatrix}.

The k^\hat{\mathbf{k}}-component of ×F\nabla \times \mathbf{F} is QxPy\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} — exactly the 2D curl from Topic 15, Definition 9. The 3D curl extends the 2D curl by adding components for rotation about the xx- and yy-axes.

📐 Definition 8 (3D Divergence)

For a C1C^1 vector field F=(P,Q,R):DR3\mathbf{F} = (P, Q, R): D \to \mathbb{R}^3, the divergence of F\mathbf{F} is the scalar field:

F=Px+Qy+Rz.\nabla \cdot \mathbf{F} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}.

The divergence is the “scalar product” of the formal vector =(x,y,z)\nabla = (\partial_x, \partial_y, \partial_z) with F\mathbf{F}. It measures the net outflow per unit volume at each point.

🔷 Proposition 2 (Divergence as Infinitesimal Flux)

Let F\mathbf{F} be C1C^1 at a point p\mathbf{p}. Let Br(p)B_r(\mathbf{p}) be the ball of radius rr centered at p\mathbf{p} and Sr(p)S_r(\mathbf{p}) its boundary sphere (with outward normal). Then:

F(p)=limr0143πr3Sr(p)FdS.\nabla \cdot \mathbf{F}(\mathbf{p}) = \lim_{r \to 0} \frac{1}{\frac{4}{3}\pi r^3} \oiint_{S_r(\mathbf{p})} \mathbf{F} \cdot d\mathbf{S}.

The divergence is the flux per unit volume in the limit of infinitesimally small enclosing surfaces.

Proof.

By the divergence theorem (Theorem 3, which we will prove in Section 7), SrFdS=BrFdV\oiint_{S_r} \mathbf{F} \cdot d\mathbf{S} = \iiint_{B_r} \nabla \cdot \mathbf{F}\,dV. By the Mean Value Theorem for triple integrals (Topic 13), there exists prBr(p)\mathbf{p}_r \in B_r(\mathbf{p}) such that:

BrFdV=(F)(pr)43πr3.\iiint_{B_r} \nabla \cdot \mathbf{F}\,dV = (\nabla \cdot \mathbf{F})(\mathbf{p}_r) \cdot \frac{4}{3}\pi r^3.

Dividing by the volume and taking r0r \to 0: as r0r \to 0, prp\mathbf{p}_r \to \mathbf{p}, and continuity of F\nabla \cdot \mathbf{F} gives (F)(pr)(F)(p)(\nabla \cdot \mathbf{F})(\mathbf{p}_r) \to (\nabla \cdot \mathbf{F})(\mathbf{p}). \square

🔷 Proposition 3 (Curl as Infinitesimal Circulation (3D))

Let F\mathbf{F} be C1C^1 at a point p\mathbf{p}, and let n^\hat{\mathbf{n}} be a unit vector. Let CrC_r be the circle of radius rr centered at p\mathbf{p} in the plane perpendicular to n^\hat{\mathbf{n}}, oriented by the right-hand rule. Then:

(×F)(p)n^=limr01πr2CrFdr.(\nabla \times \mathbf{F})(\mathbf{p}) \cdot \hat{\mathbf{n}} = \lim_{r \to 0} \frac{1}{\pi r^2} \oint_{C_r} \mathbf{F} \cdot d\mathbf{r}.

The component of the curl along n^\hat{\mathbf{n}} is the circulation per unit area in the plane perpendicular to n^\hat{\mathbf{n}}, in the limit of infinitesimally small loops. This generalizes Topic 15, Proposition 2 from 2D (where n^=k^\hat{\mathbf{n}} = \hat{\mathbf{k}} always) to arbitrary directions in 3D.

🔷 Theorem 1 (Key Vector Identities)

Let ff be a C2C^2 scalar field and F\mathbf{F} a C2C^2 vector field on an open domain in R3\mathbb{R}^3. Then:

  1. ×(f)=0\nabla \times (\nabla f) = \mathbf{0} — the curl of a gradient is always zero.

  2. (×F)=0\nabla \cdot (\nabla \times \mathbf{F}) = 0 — the divergence of a curl is always zero.

  3. ×(×F)=(F)2F\nabla \times (\nabla \times \mathbf{F}) = \nabla(\nabla \cdot \mathbf{F}) - \nabla^2 \mathbf{F} — the curl-curl identity, where 2F\nabla^2 \mathbf{F} is the vector Laplacian (Laplacian applied component-wise).

Identity (1) says gradient fields are curl-free — this is the 3D version of the exactness condition P/y=Q/x\partial P/\partial y = \partial Q/\partial x from Topic 15. Identity (2) says curl fields are divergence-free. Together, they form an exact sequence:

C(D)Vec(D)×Vec(D)C(D),C^\infty(D) \xrightarrow{\nabla} \text{Vec}(D) \xrightarrow{\nabla \times} \text{Vec}(D) \xrightarrow{\nabla \cdot} C^\infty(D),

where the composition of any two consecutive arrows is zero. In the language of differential forms, this is the de Rham complex Ω0dΩ1dΩ2dΩ3\Omega^0 \xrightarrow{d} \Omega^1 \xrightarrow{d} \Omega^2 \xrightarrow{d} \Omega^3 with d2=0d^2 = 0 (→ Smooth Manifolds on formalML).

📝 Example 8 (Computing Curl and Divergence)

Let F(x,y,z)=(yz,  xz,  xy)\mathbf{F}(x, y, z) = (yz,\; xz,\; xy). We compute:

×F=((xy)y(xz)z)i^+((yz)z(xy)x)j^+((xz)x(yz)y)k^\nabla \times \mathbf{F} = \left(\frac{\partial(xy)}{\partial y} - \frac{\partial(xz)}{\partial z}\right)\hat{\mathbf{i}} + \left(\frac{\partial(yz)}{\partial z} - \frac{\partial(xy)}{\partial x}\right)\hat{\mathbf{j}} + \left(\frac{\partial(xz)}{\partial x} - \frac{\partial(yz)}{\partial y}\right)\hat{\mathbf{k}}

=(xx)i^+(yy)j^+(zz)k^=0.= (x - x)\hat{\mathbf{i}} + (y - y)\hat{\mathbf{j}} + (z - z)\hat{\mathbf{k}} = \mathbf{0}.

The curl vanishes because F=(xyz)\mathbf{F} = \nabla(xyz) — it is a gradient field, and Identity (1) from Theorem 1 guarantees ×(f)=0\nabla \times (\nabla f) = \mathbf{0}.

The divergence is F=(yz)x+(xz)y+(xy)z=0+0+0=0\nabla \cdot \mathbf{F} = \frac{\partial(yz)}{\partial x} + \frac{\partial(xz)}{\partial y} + \frac{\partial(xy)}{\partial z} = 0 + 0 + 0 = 0. This field is both curl-free and divergence-free — it has no rotation and no sources or sinks.

📝 Example 9 (Non-Trivial Curl)

Let F(x,y,z)=(y,x,0)\mathbf{F}(x, y, z) = (-y, x, 0) — the 3D extension of the rotation field from Topic 15.

×F=(0yxz)i^+((y)z0x)j^+(xx(y)y)k^=(0,0,2).\nabla \times \mathbf{F} = \left(\frac{\partial 0}{\partial y} - \frac{\partial x}{\partial z}\right)\hat{\mathbf{i}} + \left(\frac{\partial(-y)}{\partial z} - \frac{\partial 0}{\partial x}\right)\hat{\mathbf{j}} + \left(\frac{\partial x}{\partial x} - \frac{\partial(-y)}{\partial y}\right)\hat{\mathbf{k}} = (0, 0, 2).

The curl is (0,0,2)(0, 0, 2), pointing in the k^\hat{\mathbf{k}} direction with magnitude 2. The k^\hat{\mathbf{k}}-component is xx(y)y=1+1=2\frac{\partial x}{\partial x} - \frac{\partial(-y)}{\partial y} = 1 + 1 = 2, exactly the 2D curl from Topic 15, Example 15. The 3D curl encodes the same rotation information, plus the fact that the rotation axis is vertical.

The divergence is F=0+0+0=0\nabla \cdot \mathbf{F} = 0 + 0 + 0 = 0 — rotation without expansion.

3D curl as rotation axis with paddlewheel, divergence as source/sink strength

6. Stokes’ Theorem

Green’s theorem (Topic 15, Theorem 5) says CFdr=DcurlFdA\oint_C \mathbf{F} \cdot d\mathbf{r} = \iint_D \text{curl}\,\mathbf{F}\,dA — the circulation around a closed curve CC equals the integral of the curl over the enclosed region DD. This worked in 2D because CC bounds a flat region DD in the plane.

Stokes’ theorem is the 3D generalization. The “enclosed region” is now a surface SS bounded by the curve CC, and the “integral of the curl” becomes the flux of the curl through SS. The flat 2D region becomes a potentially curved surface — the boundary is still a curve, but the “inside” can be any surface spanning that curve.

🔷 Theorem 2 (Stokes' Theorem)

Let SS be an oriented, piecewise-smooth surface in R3\mathbb{R}^3 bounded by a simple, closed, piecewise-smooth curve C=SC = \partial S, with orientation induced by the right-hand rule. Let F=(P,Q,R)\mathbf{F} = (P, Q, R) be a C1C^1 vector field on an open region containing SS. Then:

CFdr=S(×F)dS.\oint_C \mathbf{F} \cdot d\mathbf{r} = \iint_S (\nabla \times \mathbf{F}) \cdot d\mathbf{S}.

The circulation of F\mathbf{F} around the boundary CC equals the flux of the curl ×F\nabla \times \mathbf{F} through the surface SS.

Proof.

We prove Stokes’ theorem for the case where SS is the graph of a C2C^2 function z=g(x,y)z = g(x, y) over a region DR2D \subseteq \mathbb{R}^2. The general case follows by decomposing an arbitrary surface into graph patches using a partition of unity.

Setup. Parameterize SS as r(x,y)=(x,y,g(x,y))\mathbf{r}(x, y) = (x, y, g(x,y)) for (x,y)D(x, y) \in D. The boundary curve C=SC = \partial S lies above the boundary D\partial D. If D\partial D is parameterized by (x(t),y(t))(x(t), y(t)) for t[a,b]t \in [a, b], then CC is parameterized by (x(t),y(t),g(x(t),y(t)))(x(t), y(t), g(x(t), y(t))).

Left side (line integral). We compute CPdx+Qdy+Rdz\oint_C P\,dx + Q\,dy + R\,dz. On CC, dz=gxdx+gydydz = g_x\,dx + g_y\,dy (by the chain rule), so:

CPdx+Qdy+Rdz=C(P+Rgx)dx+(Q+Rgy)dy,\oint_C P\,dx + Q\,dy + R\,dz = \oint_C (P + Rg_x)\,dx + (Q + Rg_y)\,dy,

where all functions are evaluated at (x,y,g(x,y))(x, y, g(x,y)). This is now a 2D line integral around D\partial D.

Right side (surface integral). We compute S(×F)dS\iint_S (\nabla \times \mathbf{F}) \cdot d\mathbf{S}. From Example 3, rx×ry=(gx,gy,1)\mathbf{r}_x \times \mathbf{r}_y = (-g_x, -g_y, 1). Writing ×F=(RyQz,  PzRx,  QxPy)\nabla \times \mathbf{F} = (R_y - Q_z,\; P_z - R_x,\; Q_x - P_y), the flux of the curl is:

D[(RyQz)(gx)+(PzRx)(gy)+(QxPy)]dA,\iint_D [(R_y - Q_z)(-g_x) + (P_z - R_x)(-g_y) + (Q_x - P_y)]\,dA,

where all partial derivatives of PP, QQ, RR are evaluated at (x,y,g(x,y))(x, y, g(x,y)).

Showing equality via Green’s theorem. We apply Green’s theorem (Topic 15, Theorem 5) to the 2D line integral:

D(P+Rgx)dx+(Q+Rgy)dy=D[(Q+Rgy)x(P+Rgx)y]dA.\oint_{\partial D} (P + Rg_x)\,dx + (Q + Rg_y)\,dy = \iint_D \left[\frac{\partial(Q + Rg_y)}{\partial x} - \frac{\partial(P + Rg_x)}{\partial y}\right]\,dA.

We now expand the integrand on the right. We must use the chain rule carefully, because PP, QQ, RR depend on z=g(x,y)z = g(x,y).

Expanding x(Q+Rgy)\frac{\partial}{\partial x}(Q + Rg_y):

Qx=Qx+Qzgx,(Rgy)x=Rxgy+Rzgxgy+Rgyx,\frac{\partial Q}{\partial x} = Q_x + Q_z g_x, \quad \frac{\partial(Rg_y)}{\partial x} = R_x g_y + R_z g_x g_y + R g_{yx},

so x(Q+Rgy)=Qx+Qzgx+Rxgy+Rzgxgy+Rgyx\frac{\partial}{\partial x}(Q + Rg_y) = Q_x + Q_z g_x + R_x g_y + R_z g_x g_y + R g_{yx}.

Expanding y(P+Rgx)\frac{\partial}{\partial y}(P + Rg_x):

Py=Py+Pzgy,(Rgx)y=Rygx+Rzgygx+Rgxy,\frac{\partial P}{\partial y} = P_y + P_z g_y, \quad \frac{\partial(Rg_x)}{\partial y} = R_y g_x + R_z g_y g_x + R g_{xy},

so y(P+Rgx)=Py+Pzgy+Rygx+Rzgygx+Rgxy\frac{\partial}{\partial y}(P + Rg_x) = P_y + P_z g_y + R_y g_x + R_z g_y g_x + R g_{xy}.

Subtracting. Since gxy=gyxg_{xy} = g_{yx} (by C2C^2 regularity), the RgxyRg_{xy} and RgyxRg_{yx} terms cancel. The RzgxgyR_z g_x g_y terms also cancel. We are left with:

(Q+Rgy)x(P+Rgx)y=(QxPy)+(QzgxPzgy)+(RxgyRygx).\frac{\partial(Q + Rg_y)}{\partial x} - \frac{\partial(P + Rg_x)}{\partial y} = (Q_x - P_y) + (Q_z g_x - P_z g_y) + (R_x g_y - R_y g_x).

Rearranging:

=(QxPy)+(gx)(RyQz)+(gy)(PzRx).= (Q_x - P_y) + (-g_x)(R_y - Q_z) + (-g_y)(P_z - R_x).

Wait — let us verify. We have (QzgxPzgy)+(RxgyRygx)(Q_z g_x - P_z g_y) + (R_x g_y - R_y g_x). Grouping by the gg factors:

=gx(RyQz)gy(PzRx)+(QxPy).= -g_x(R_y - Q_z) - g_y(P_z - R_x) + (Q_x - P_y).

This is exactly (×F)(gx,gy,1)=(×F)(rx×ry)(\nabla \times \mathbf{F}) \cdot (-g_x, -g_y, 1) = (\nabla \times \mathbf{F}) \cdot (\mathbf{r}_x \times \mathbf{r}_y).

So by Green’s theorem:

CFdr=D(×F)(rx×ry)dA=S(×F)dS.\oint_C \mathbf{F} \cdot d\mathbf{r} = \iint_D (\nabla \times \mathbf{F}) \cdot (\mathbf{r}_x \times \mathbf{r}_y)\,dA = \iint_S (\nabla \times \mathbf{F}) \cdot d\mathbf{S}.

For a general oriented surface SS that is not a single graph, we decompose SS into graph patches S1,,SkS_1, \ldots, S_k using a partition of unity. The interior boundary contributions from adjacent patches cancel (their normals point in opposite directions along the shared edge), leaving only the exterior boundary C=SC = \partial S. \square

📝 Example 10 (Verifying Stokes' on a Hemisphere)

Let F=(y,x,0)\mathbf{F} = (-y, x, 0) and let SS be the upper hemisphere of the unit sphere, bounded by CC: the unit circle in the xyxy-plane.

Line integral. The unit circle is r(t)=(cost,sint,0)\mathbf{r}(t) = (\cos t, \sin t, 0) for t[0,2π]t \in [0, 2\pi], traversed counterclockwise. Then F(r(t))=(sint,cost,0)\mathbf{F}(\mathbf{r}(t)) = (-\sin t, \cos t, 0) and r(t)=(sint,cost,0)\mathbf{r}'(t) = (-\sin t, \cos t, 0):

CFdr=02π[sin2t+cos2t]dt=2π.\oint_C \mathbf{F} \cdot d\mathbf{r} = \int_0^{2\pi} [\sin^2 t + \cos^2 t]\,dt = 2\pi.

Surface integral. From Example 9, ×F=(0,0,2)\nabla \times \mathbf{F} = (0, 0, 2). The flux of (0,0,2)(0, 0, 2) through the upper hemisphere with outward normal: using dS=sinϕ(sinϕcosθ,  sinϕsinθ,  cosϕ)dϕdθd\mathbf{S} = \sin\phi\,(\sin\phi\cos\theta,\; \sin\phi\sin\theta,\; \cos\phi)\,d\phi\,d\theta:

S(0,0,2)dS=02π0π/22sinϕcosϕdϕdθ=22π12=2π.\iint_S (0, 0, 2) \cdot d\mathbf{S} = \int_0^{2\pi}\int_0^{\pi/2} 2\sin\phi\cos\phi\,d\phi\,d\theta = 2 \cdot 2\pi \cdot \frac{1}{2} = 2\pi.

Both sides equal 2π2\pi. Stokes’ theorem confirmed.

📝 Example 11 (Stokes' Theorem as a Computation Tool)

Compute CFdr\oint_C \mathbf{F} \cdot d\mathbf{r} where F=(y2,z2,x2)\mathbf{F} = (y^2, z^2, x^2) and CC is the triangle with vertices (1,0,0)(1, 0, 0), (0,1,0)(0, 1, 0), (0,0,1)(0, 0, 1), oriented counterclockwise when viewed from the direction of the outward normal.

Strategy: computing the line integral directly would require parameterizing three edges and adding three integrals. Instead, we use Stokes’ theorem with the flat triangular surface SS spanning CC.

The triangle lies in the plane x+y+z=1x + y + z = 1, which we can write as z=1xyz = 1 - x - y. So gx=1g_x = -1 and gy=1g_y = -1, and rx×ry=(1,1,1)\mathbf{r}_x \times \mathbf{r}_y = (1, 1, 1) (pointing outward, since the components are all positive — this is the outward normal to the plane x+y+z=1x + y + z = 1).

The curl is:

×F=(RyQz,  PzRx,  QxPy)=(02z,  02x,  02y)=(2z,2x,2y).\nabla \times \mathbf{F} = (R_y - Q_z,\; P_z - R_x,\; Q_x - P_y) = (0 - 2z,\; 0 - 2x,\; 0 - 2y) = (-2z, -2x, -2y).

The flux of the curl through the triangle:

S(×F)dS=D(2z,2x,2y)(1,1,1)dA=D(2z2x2y)dA.\iint_S (\nabla \times \mathbf{F}) \cdot d\mathbf{S} = \iint_D (-2z, -2x, -2y) \cdot (1, 1, 1)\,dA = \iint_D (-2z - 2x - 2y)\,dA.

On the plane z=1xyz = 1 - x - y: 2z2x2y=2(1xy)2x2y=2-2z - 2x - 2y = -2(1 - x - y) - 2x - 2y = -2. So:

D(2)dA=2Area(D),\iint_D (-2)\,dA = -2 \cdot \text{Area}(D),

where DD is the projection of the triangle onto the xyxy-plane: the triangle with vertices (1,0)(1,0), (0,1)(0,1), (0,0)(0,0). Its area is 12\frac{1}{2}. Therefore:

CFdr=212=1.\oint_C \mathbf{F} \cdot d\mathbf{r} = -2 \cdot \frac{1}{2} = -1.

💡 Remark 5 (Surface Independence)

Stokes’ theorem implies that if two oriented surfaces S1S_1 and S2S_2 share the same boundary curve CC (with compatible orientations), then:

S1(×F)dS=CFdr=S2(×F)dS.\iint_{S_1} (\nabla \times \mathbf{F}) \cdot d\mathbf{S} = \oint_C \mathbf{F} \cdot d\mathbf{r} = \iint_{S_2} (\nabla \times \mathbf{F}) \cdot d\mathbf{S}.

The flux of the curl depends only on the boundary, not on which surface spans it. In Example 10, we could replace the hemisphere with any surface bounded by the unit circle — a flat disk, a paraboloid, a cone — and the curl flux would still be 2π2\pi.

This is the surface analog of path independence for conservative fields: for a curl field (one of the form ×G\nabla \times \mathbf{G}), the flux integral is “surface-independent” in the same way that a gradient field’s line integral is “path-independent.”

Stokes' theorem: boundary curve C, spanning surface S, curl vectors through the surface

Line integral C F·dr ≈ 0.0000
Curl flux S (∇×F)·dS ≈ 6.2848
Difference 0.001615

Stokes' theorem: ∮C F·dr = ∬S (∇×F)·dS. Drag to rotate.

7. The Divergence Theorem

The divergence theorem is the 3D analog of the 2D divergence form of Green’s theorem. It relates the flux of a vector field through a closed surface to the integral of the divergence over the enclosed volume.

The geometric intuition is the same telescoping argument that underlies the Fundamental Theorem of Calculus. Chop the volume EE into tiny cubes. Each tiny cube contributes its flux through its six faces. But adjacent cubes share faces, and the flux through a shared face is counted out of one cube and in to its neighbor — these contributions cancel. The only faces that survive are those on the outer boundary E\partial E. So the total flux through E\partial E equals the sum of the divergences (the infinitesimal net outflows) over all the tiny cubes — which in the limit is EFdV\iiint_E \nabla \cdot \mathbf{F}\,dV.

🔷 Theorem 3 (The Divergence Theorem (Gauss's Theorem))

Let ER3E \subseteq \mathbb{R}^3 be a bounded region with piecewise-smooth boundary surface S=ES = \partial E, oriented with the outward-pointing normal. Let F=(P,Q,R)\mathbf{F} = (P, Q, R) be a C1C^1 vector field on an open region containing E\overline{E}. Then:

SFdS=EFdV.\oiint_S \mathbf{F} \cdot d\mathbf{S} = \iiint_E \nabla \cdot \mathbf{F}\,dV.

The net outward flux of F\mathbf{F} through the closed surface SS equals the total divergence of F\mathbf{F} in the enclosed volume EE.

Proof.

We prove the divergence theorem for regions that are simultaneously Type I, Type II, and Type III — that is, regions that can be described as bounded above and below by graphs in each coordinate direction. It suffices to show:

SPdydz=EPxdV,SQdxdz=EQydV,SRdxdy=ERzdV,\oiint_S P\,dy\,dz = \iiint_E \frac{\partial P}{\partial x}\,dV, \quad \oiint_S Q\,dx\,dz = \iiint_E \frac{\partial Q}{\partial y}\,dV, \quad \oiint_S R\,dx\,dy = \iiint_E \frac{\partial R}{\partial z}\,dV,

and then add all three. We prove the third identity; the first two are analogous.

Proof of SRdxdy=ERzdV\oiint_S R\,dx\,dy = \iiint_E \frac{\partial R}{\partial z}\,dV. Let EE be a Type I region in the zz-direction:

E={(x,y,z):(x,y)D,  g1(x,y)zg2(x,y)},E = \{(x, y, z) : (x, y) \in D,\; g_1(x, y) \le z \le g_2(x, y)\},

where DD is the projection of EE onto the xyxy-plane, and z=g1(x,y)z = g_1(x,y) is the bottom surface, z=g2(x,y)z = g_2(x,y) is the top surface.

Right side (volume integral). By Fubini’s theorem (Topic 13):

ERzdV=D[g1(x,y)g2(x,y)Rzdz]dA=D[R(x,y,g2(x,y))R(x,y,g1(x,y))]dA.\iiint_E \frac{\partial R}{\partial z}\,dV = \iint_D \left[\int_{g_1(x,y)}^{g_2(x,y)} \frac{\partial R}{\partial z}\,dz\right]\,dA = \iint_D [R(x, y, g_2(x,y)) - R(x, y, g_1(x,y))]\,dA.

The inner integral is evaluated by the Fundamental Theorem of Calculus.

Left side (surface integral). The boundary S=ES = \partial E consists of three pieces:

  • Top surface S2S_2: z=g2(x,y)z = g_2(x, y) with upward (outward) normal. Parameterized as r(x,y)=(x,y,g2(x,y))\mathbf{r}(x,y) = (x, y, g_2(x,y)), so rx×ry=((g2)x,(g2)y,1)\mathbf{r}_x \times \mathbf{r}_y = (-(g_2)_x, -(g_2)_y, 1). The RR-component of the flux uses only the k^\hat{\mathbf{k}}-component of dSd\mathbf{S}, which is 1dA1\,dA. So S2Rdxdy=DR(x,y,g2(x,y))dA\iint_{S_2} R\,dx\,dy = \iint_D R(x, y, g_2(x,y))\,dA.

  • Bottom surface S1S_1: z=g1(x,y)z = g_1(x, y) with downward (outward) normal. The outward normal points downward, so rx×ry=((g1)x,(g1)y,1)\mathbf{r}_x \times \mathbf{r}_y = ((g_1)_x, (g_1)_y, -1) (we reverse the cross product). The k^\hat{\mathbf{k}}-component is 1dA-1\,dA. So S1Rdxdy=DR(x,y,g1(x,y))dA\iint_{S_1} R\,dx\,dy = -\iint_D R(x, y, g_1(x,y))\,dA.

  • Lateral surface S3S_3: the vertical sides. On vertical surfaces, the normal is horizontal — the k^\hat{\mathbf{k}}-component of dSd\mathbf{S} is zero. So S3Rdxdy=0\iint_{S_3} R\,dx\,dy = 0.

Adding all three pieces:

SRdxdy=DR(x,y,g2)DR(x,y,g1)=D[R(x,y,g2)R(x,y,g1)]dA.\oiint_S R\,dx\,dy = \iint_D R(x, y, g_2) - \iint_D R(x, y, g_1) = \iint_D [R(x, y, g_2) - R(x, y, g_1)]\,dA.

This matches the volume integral computed above.

The proofs for the PP and QQ components are identical, using the Type II and Type III descriptions of EE respectively. Adding all three:

SFdS=E(Px+Qy+Rz)dV=EFdV.\oiint_S \mathbf{F} \cdot d\mathbf{S} = \iiint_E \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}\right)\,dV = \iiint_E \nabla \cdot \mathbf{F}\,dV.

For general regions that are not simultaneously Type I/II/III, we decompose EE into finitely many subregions E1,,EkE_1, \ldots, E_k, each of which is Type I/II/III. The divergence theorem holds on each piece. The flux contributions from shared interior faces cancel (they have opposite outward normals on adjacent subregions), leaving only the flux through the external boundary E\partial E. \square

📝 Example 12 (Verification on a Cube)

Let F=(x2,y2,z2)\mathbf{F} = (x^2, y^2, z^2) and let E=[0,1]3E = [0, 1]^3 be the unit cube.

Divergence: F=2x+2y+2z\nabla \cdot \mathbf{F} = 2x + 2y + 2z.

Volume integral:

E(2x+2y+2z)dV=010101(2x+2y+2z)dzdydx.\iiint_E (2x + 2y + 2z)\,dV = \int_0^1\int_0^1\int_0^1 (2x + 2y + 2z)\,dz\,dy\,dx.

By symmetry (each variable contributes equally), this is 30101012xdzdydx=321211=33\int_0^1\int_0^1\int_0^1 2x\,dz\,dy\,dx = 3 \cdot 2 \cdot \frac{1}{2} \cdot 1 \cdot 1 = 3.

Surface integral (direct computation). The cube has six faces. On x=1x = 1: n^=(1,0,0)\hat{\mathbf{n}} = (1,0,0), Fn^=1\mathbf{F} \cdot \hat{\mathbf{n}} = 1, integral =1= 1. On x=0x = 0: n^=(1,0,0)\hat{\mathbf{n}} = (-1,0,0), Fn^=0\mathbf{F} \cdot \hat{\mathbf{n}} = 0, integral =0= 0. By symmetry, the yy-faces contribute 1+0=11 + 0 = 1 and the zz-faces contribute 1+0=11 + 0 = 1.

Total flux: 1+0+1+0+1+0=31 + 0 + 1 + 0 + 1 + 0 = 3.

Both sides equal 3. Divergence theorem confirmed.

📝 Example 13 (The Inverse-Square Field)

Let F(r)=rr3=(x,y,z)(x2+y2+z2)3/2\mathbf{F}(\mathbf{r}) = \frac{\mathbf{r}}{r^3} = \frac{(x, y, z)}{(x^2 + y^2 + z^2)^{3/2}} — the gravitational/electrostatic field of a point source at the origin.

Away from the origin, F=0\nabla \cdot \mathbf{F} = 0. We can verify this by direct computation:

xx(x2+y2+z2)3/2=(x2+y2+z2)3/2x32(x2+y2+z2)1/22x(x2+y2+z2)3=r23x2r5.\frac{\partial}{\partial x}\frac{x}{(x^2+y^2+z^2)^{3/2}} = \frac{(x^2+y^2+z^2)^{3/2} - x \cdot \frac{3}{2}(x^2+y^2+z^2)^{1/2} \cdot 2x}{(x^2+y^2+z^2)^3} = \frac{r^2 - 3x^2}{r^5}.

Summing the three components: 3r23(x2+y2+z2)r5=0\frac{3r^2 - 3(x^2+y^2+z^2)}{r^5} = 0.

On a sphere of radius RR centered at the origin, n^=rR\hat{\mathbf{n}} = \frac{\mathbf{r}}{R} and Fn^=RR3=1R2\mathbf{F} \cdot \hat{\mathbf{n}} = \frac{R}{R^3} = \frac{1}{R^2}. The flux is:

SRFdS=1R24πR2=4π.\oiint_{S_R} \mathbf{F} \cdot d\mathbf{S} = \frac{1}{R^2} \cdot 4\pi R^2 = 4\pi.

The flux is 4π4\pi regardless of RR. If the divergence theorem applied to the ball containing the origin, we would get FdV=0\iiint \nabla \cdot \mathbf{F}\,dV = 0, contradicting the nonzero flux. The resolution: F\mathbf{F} is not C1C^1 at the origin. The divergence theorem does not apply to regions containing the singularity. In the distributional sense, F=4πδ(r)\nabla \cdot \mathbf{F} = 4\pi\delta(\mathbf{r}), where δ\delta is the Dirac delta — the “divergence” is zero everywhere except at the origin, where it is infinite in a precise measure-theoretic sense.

💡 Remark 6 (Conservation Laws)

The divergence theorem is the mathematical engine behind conservation laws. Consider a quantity with density ρ(x,t)\rho(\mathbf{x}, t) that flows with flux density J(x,t)\mathbf{J}(\mathbf{x}, t). If the quantity is conserved (neither created nor destroyed), the amount in any region EE changes only by flow through the boundary:

ddtEρdV=EJdS.\frac{d}{dt}\iiint_E \rho\,dV = -\oiint_{\partial E} \mathbf{J} \cdot d\mathbf{S}.

Applying the divergence theorem to the right side and moving the time derivative inside the integral (assuming sufficient smoothness):

EρtdV=EJdV.\iiint_E \frac{\partial \rho}{\partial t}\,dV = -\iiint_E \nabla \cdot \mathbf{J}\,dV.

Since this holds for every region EE, the integrands must be equal pointwise:

ρt+J=0.\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J} = 0.

This is the continuity equation — the local form of conservation. For a fluid with density ρ\rho and velocity v\mathbf{v}, J=ρv\mathbf{J} = \rho\mathbf{v}, giving ρt+(ρv)=0\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\mathbf{v}) = 0, which is exactly the equation we started with in Section 1.

💡 Remark 7 (The Generalized Stokes' Theorem)

The Fundamental Theorem of Calculus, the Gradient Theorem, Green’s theorem, Stokes’ theorem, and the divergence theorem are all instances of a single result — the generalized Stokes’ theorem:

Mdω=Mω,\int_M d\omega = \int_{\partial M} \omega,

where MM is an oriented manifold with boundary M\partial M, ω\omega is a differential form, and dd is the exterior derivative.

Dimension of MMMMM\partial Mω\omegaTheorem
1 (interval)[a,b][a, b]{a,b}\{a, b\}ff (0-form)FTC: abfdx=f(b)f(a)\int_a^b f'\,dx = f(b) - f(a)
1 (curve)Curve CCEndpointsff (0-form)Gradient Theorem: Cfdr=f(b)f(a)\int_C \nabla f \cdot d\mathbf{r} = f(\mathbf{b}) - f(\mathbf{a})
2 (region)Region DR2D \subseteq \mathbb{R}^2Curve D\partial D1-form Pdx+QdyP\,dx + Q\,dyGreen’s theorem
2 (surface)Surface SR3S \subseteq \mathbb{R}^3Curve S\partial S1-form Fdr\mathbf{F} \cdot d\mathbf{r}Stokes’ theorem
3 (volume)Volume ER3E \subseteq \mathbb{R}^3Surface E\partial E2-form FdS\mathbf{F} \cdot d\mathbf{S}Divergence theorem

In every case: integrating a derivative (dωd\omega) over the interior equals integrating the original (ω\omega) over the boundary. The exterior derivative dd unifies the gradient, curl, and divergence as the same operation on forms of different degrees.

This unification is the starting point for differential geometry and the theory of integration on manifolds — see Smooth Manifolds on formalML for the full treatment.

Divergence theorem: volume E, boundary surface S, divergence as net outflow per unit volume

Surface flux S F·dS ≈ 12.5696
Volume integral E ∇·F dV ≈ 12.6103
Difference 0.0407

Divergence theorem: ∬S F·dS = ∭E ∇·F dV. Drag to rotate.

8. Graphs, Applications & Computation

For surfaces defined as graphs z=g(x,y)z = g(x, y), the formulas simplify considerably. These are the most common surfaces in applications, and the simplified formulas are worth isolating.

🔷 Proposition 4 (Surface Integrals over a Graph)

Let SS be the graph of z=g(x,y)z = g(x, y) for (x,y)D(x, y) \in D, and let F=(P,Q,R)\mathbf{F} = (P, Q, R). Then:

Scalar integral:

SfdS=Df(x,y,g(x,y))1+gx2+gy2dA.\iint_S f\,dS = \iint_D f(x, y, g(x,y))\,\sqrt{1 + g_x^2 + g_y^2}\,dA.

Flux integral (with upward-pointing normal):

SFdS=D[PgxQgy+R]dA,\iint_S \mathbf{F} \cdot d\mathbf{S} = \iint_D \left[-P\,g_x - Q\,g_y + R\right]\,dA,

where all functions are evaluated at (x,y,g(x,y))(x, y, g(x,y)). This follows from rx×ry=(gx,gy,1)\mathbf{r}_x \times \mathbf{r}_y = (-g_x, -g_y, 1) (Example 3).

📝 Example 14 (Area of a Saddle Surface)

Compute the surface area of z=x2y2z = x^2 - y^2 over the unit disk D:x2+y21D: x^2 + y^2 \le 1.

We have gx=2xg_x = 2x, gy=2yg_y = -2y, so 1+gx2+gy2=1+4x2+4y2=1+4r2\sqrt{1 + g_x^2 + g_y^2} = \sqrt{1 + 4x^2 + 4y^2} = \sqrt{1 + 4r^2} in polar coordinates.

Area=D1+4r2dA=02π011+4r2  rdrdθ.\text{Area} = \iint_D \sqrt{1 + 4r^2}\,dA = \int_0^{2\pi}\int_0^1 \sqrt{1 + 4r^2}\;r\,dr\,d\theta.

The inner integral: let u=1+4r2u = 1 + 4r^2, du=8rdrdu = 8r\,dr, so rdr=du/8r\,dr = du/8:

01r1+4r2dr=1815udu=1823[u3/2]15=112(551).\int_0^1 r\sqrt{1 + 4r^2}\,dr = \frac{1}{8}\int_1^5 \sqrt{u}\,du = \frac{1}{8}\cdot\frac{2}{3}[u^{3/2}]_1^5 = \frac{1}{12}(5\sqrt{5} - 1).

So Area=2π112(551)=π6(551)5.33\text{Area} = 2\pi \cdot \frac{1}{12}(5\sqrt{5} - 1) = \frac{\pi}{6}(5\sqrt{5} - 1) \approx 5.33.

(Note: the saddle surface warps significantly over the unit disk — its area is nearly 70% larger than the flat disk’s area of π3.14\pi \approx 3.14.)

📝 Example 15 (Flux via the Divergence Theorem)

Compute the flux of F=(x3,y3,z3)\mathbf{F} = (x^3, y^3, z^3) through the unit sphere SS.

Direct computation would require parameterizing the sphere and evaluating a complicated surface integral. Instead, we use the divergence theorem:

F=3x2+3y2+3z2=3r2.\nabla \cdot \mathbf{F} = 3x^2 + 3y^2 + 3z^2 = 3r^2.

Converting to spherical coordinates (rr, θ\theta, ϕ\phi) with dV=r2sinϕdrdϕdθdV = r^2\sin\phi\,dr\,d\phi\,d\theta:

E3r2dV=02π0π013r2r2sinϕdrdϕdθ=302πdθ0πsinϕdϕ01r4dr.\iiint_E 3r^2\,dV = \int_0^{2\pi}\int_0^\pi\int_0^1 3r^2 \cdot r^2\sin\phi\,dr\,d\phi\,d\theta = 3\int_0^{2\pi}d\theta\int_0^\pi\sin\phi\,d\phi\int_0^1 r^4\,dr.

Evaluating each factor: 02πdθ=2π\int_0^{2\pi}d\theta = 2\pi, 0πsinϕdϕ=2\int_0^\pi\sin\phi\,d\phi = 2, 01r4dr=15\int_0^1 r^4\,dr = \frac{1}{5}.

SFdS=32π215=12π5.\oiint_S \mathbf{F} \cdot d\mathbf{S} = 3 \cdot 2\pi \cdot 2 \cdot \frac{1}{5} = \frac{12\pi}{5}.

Graph surface z = g(x,y) with area element and normal vector

9. Computational Notes

In practice, surface integrals are computed by reducing to double integrals over the parameter domain, then applying numerical quadrature. Here are the key patterns.

Scalar surface integral:

import numpy as np
from scipy.integrate import dblquad

def scalar_surface_integral(f, r, r_u, r_v, u_bounds, v_bounds):
    """Compute ∬_S f dS via parameterization."""
    def integrand(v, u):
        point = r(u, v)
        cross = np.cross(r_u(u, v), r_v(u, v))
        return f(*point) * np.linalg.norm(cross)
    result, _ = dblquad(integrand, *u_bounds,
                        lambda u: v_bounds[0], lambda u: v_bounds[1])
    return result

Flux integral:

def flux_integral(F, r, r_u, r_v, u_bounds, v_bounds):
    """Compute ∬_S F · dS via parameterization."""
    def integrand(v, u):
        point = r(u, v)
        F_val = np.array(F(*point))
        cross = np.cross(r_u(u, v), r_v(u, v))
        return np.dot(F_val, cross)
    result, _ = dblquad(integrand, *u_bounds,
                        lambda u: v_bounds[0], lambda u: v_bounds[1])
    return result

Numerical curl and divergence:

def curl_3d(F, x, y, z, h=1e-7):
    """Compute ∇ × F at (x, y, z) via central differences."""
    Px, Qx, Rx = F(x + h, y, z)
    Pmx, Qmx, Rmx = F(x - h, y, z)
    Py, Qy, Ry = F(x, y + h, z)
    Pmy, Qmy, Rmy = F(x, y - h, z)
    Pz, Qz, Rz = F(x, y, z + h)
    Pmz, Qmz, Rmz = F(x, y, z - h)
    dRdy = (Ry - Rmy) / (2 * h)
    dQdz = (Qz - Qmz) / (2 * h)
    dPdz = (Pz - Pmz) / (2 * h)
    dRdx = (Rx - Rmx) / (2 * h)
    dQdx = (Qx - Qmx) / (2 * h)
    dPdy = (Py - Pmy) / (2 * h)
    return (dRdy - dQdz, dPdz - dRdx, dQdx - dPdy)

def divergence_3d(F, x, y, z, h=1e-7):
    """Compute ∇ · F at (x, y, z) via central differences."""
    dPdx = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dQdy = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dRdz = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dPdx + dQdy + dRdz

Verifying Stokes’ theorem numerically:

from scipy.integrate import quad

# F = (-y, x, 0), hemisphere bounded by unit circle
# Line integral around unit circle
def stokes_line(t):
    F = (-np.sin(t), np.cos(t), 0)
    dr = (-np.sin(t), np.cos(t), 0)
    return sum(f * d for f, d in zip(F, dr))

circulation, _ = quad(stokes_line, 0, 2 * np.pi)  # → 2π

# Surface integral of curl through hemisphere
def stokes_surface(phi, theta):
    # curl F = (0, 0, 2), outward normal on hemisphere
    sin_phi = np.sin(phi)
    cos_phi = np.cos(phi)
    curl_dot_n = 2 * sin_phi * cos_phi  # (0,0,2) · n̂ * |r_θ × r_φ|
    return curl_dot_n

curl_flux, _ = dblquad(stokes_surface, 0, 2 * np.pi,
                        0, np.pi / 2)  # → 2π

Verifying the divergence theorem numerically:

# F = (x², y², z²) on unit cube [0,1]³
from scipy.integrate import tplquad

# Volume integral of divergence
div_vol, _ = tplquad(
    lambda z, y, x: 2*x + 2*y + 2*z,
    0, 1, 0, 1, 0, 1
)  # → 3.0

# Flux through each face (computed analytically: 3.0)
# Face x=1: ∫∫ 1 dy dz = 1, face x=0: 0
# Face y=1: ∫∫ 1 dx dz = 1, face y=0: 0
# Face z=1: ∫∫ 1 dx dy = 1, face z=0: 0
# Total flux = 3.0

10. Connections to ML

Surface integrals and the divergence theorem are not abstract curiosities — they are the mathematical backbone of several active areas in modern machine learning.

10.1 Stein’s Identity and SVGD

The divergence theorem yields one of the most powerful tools in computational statistics: Stein’s identity. Let p(x)p(\mathbf{x}) be a smooth probability density on Rn\mathbb{R}^n with p(x)0p(\mathbf{x}) \to 0 as x\|\mathbf{x}\| \to \infty, and let F:RnRn\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n be a smooth test function. Applying the divergence theorem to pFp\mathbf{F} over a large ball BRB_R and taking RR \to \infty:

Rn(pF)dV=limRSRpFdS=0,\iiint_{\mathbb{R}^n} \nabla \cdot (p\mathbf{F})\,dV = \lim_{R \to \infty} \oiint_{S_R} p\mathbf{F} \cdot d\mathbf{S} = 0,

since p0p \to 0 on the boundary. Expanding (pF)=pF+Fp=pF+pFlogp\nabla \cdot (p\mathbf{F}) = p\,\nabla \cdot \mathbf{F} + \mathbf{F} \cdot \nabla p = p\,\nabla \cdot \mathbf{F} + p\,\mathbf{F} \cdot \nabla \log p:

Ep[F(X)+F(X)logp(X)]=0.\mathbb{E}_p[\nabla \cdot \mathbf{F}(\mathbf{X}) + \mathbf{F}(\mathbf{X}) \cdot \nabla \log p(\mathbf{X})] = 0.

This is Stein’s identity. It is the foundation of the kernel Stein discrepancy — a measure of how far a distribution qq is from pp — and of Stein variational gradient descent (SVGD), which transports particles to approximate a target distribution by following the steepest descent direction in the Stein discrepancy.

The entire machinery rests on the divergence theorem: the boundary flux vanishes, converting a volume integral of a divergence into a useful identity involving expectations.

-> Measure-Theoretic Probability on formalML

10.2 Physics-Informed Neural Networks (PINNs)

Physics-informed neural networks enforce PDE constraints as loss terms. Many of these PDEs are conservation laws, and the divergence theorem is the link between the differential (pointwise) and integral (global) forms.

For example, consider training a neural network to satisfy the heat equation ut=κ2u\frac{\partial u}{\partial t} = \kappa\nabla^2 u on a domain Ω\Omega. The integral form, obtained via the divergence theorem, is:

ddtΩudV=κΩudS.\frac{d}{dt}\iiint_\Omega u\,dV = \kappa\oiint_{\partial\Omega} \nabla u \cdot d\mathbf{S}.

The PINN loss enforces both the pointwise PDE (at interior collocation points) and the boundary conditions (at boundary points). The divergence theorem guarantees that satisfying the pointwise PDE implies the integral conservation law — and in practice, adding integral conservation constraints (derived via the divergence theorem) as additional loss terms improves training stability and physical fidelity.

10.3 Flow-Matching Generative Models

As introduced in Section 1, flow-matching models learn a time-dependent velocity field v(x,t)\mathbf{v}(\mathbf{x}, t) that transports a base distribution p0p_0 to a target distribution p1p_1. The continuity equation:

pt+(pv)=0\frac{\partial p}{\partial t} + \nabla \cdot (p\mathbf{v}) = 0

governs the evolution of p(x,t)p(\mathbf{x}, t). The divergence theorem ensures probability conservation: for any region EE:

ddtEpdV=EpvdS.\frac{d}{dt}\iiint_E p\,dV = -\oiint_{\partial E} p\mathbf{v} \cdot d\mathbf{S}.

Probability only enters or leaves EE through boundary flux — it is never created or destroyed. This conservation property is what makes flow-matching models well-defined as generative models: the learned flow preserves total probability mass exactly, so the output is a valid probability distribution.

-> Gradient Descent on formalML

10.4 Gradient Flow Conservation

For the gradient flow θ˙=L(θ)\dot{\theta} = -\nabla L(\theta), the divergence of the velocity field is:

(L)=ΔL=tr(HL),\nabla \cdot (-\nabla L) = -\Delta L = -\text{tr}(H_L),

where ΔL=i2Lθi2\Delta L = \sum_i \frac{\partial^2 L}{\partial \theta_i^2} is the Laplacian of the loss and HLH_L is the Hessian (Topic 11). The divergence theorem gives:

S(L)dS=E(ΔL)dV.\oiint_S (-\nabla L) \cdot d\mathbf{S} = \iiint_E (-\Delta L)\,dV.

In regions where ΔL>0\Delta L > 0 (the Hessian has positive trace — the loss is “subharmonic”), the right side is negative, meaning there is net inflow of gradient trajectories: trajectories converge. In regions where ΔL<0\Delta L < 0, trajectories diverge. The Laplacian of the loss — the trace of the Hessian — controls the focusing and defocusing of optimization trajectories, and the divergence theorem makes this connection precise.

-> Gradient Descent on formalML

ML connections: Stein's identity, PINNs conservation, flow-matching continuity, gradient flow divergence

11. Connections & Further Reading

Prerequisites Used

  • Multiple Integrals & Fubini’s Theorem — Surface integrals reduce to double integrals over the parameter domain. The divergence theorem converts surface integrals to triple integrals. The Fubini evaluation strategy is used in every computation.
  • Change of Variables & the Jacobian Determinant — The surface area element dS=ru×rvdudvdS = \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv is the Gram determinant, the 2D generalization of the Jacobian determinant. Spherical and cylindrical coordinates from Topic 14 evaluate the volume integrals.
  • Line Integrals & Conservative Fields — Green’s theorem is the 2D case of Stokes’ theorem. The 2D curl generalizes to the full 3D curl. Stokes’ theorem relates a line integral to a surface integral of the curl, extending the boundary-interior principle.
  • The Gradient & Directional Derivatives — The gradient, curl, and divergence are all built from the same differential operator \nabla. The exact sequence ×\nabla \to \nabla\times \to \nabla\cdot with d2=0d^2 = 0 unifies the theory.
  • The Jacobian & Multivariate Chain Rule — The Jacobian matrix JrJ_{\mathbf{r}} of the parameterization determines the surface normal and area element via the Gram determinant det(JrTJr)\sqrt{\det(J_{\mathbf{r}}^T J_{\mathbf{r}})}.
  • The Inverse & Implicit Function Theorems — Implicit surfaces F(x,y,z)=0F(x,y,z) = 0 have normal F/F\nabla F / \|\nabla F\|, providing an alternative to parameterization for surface integrals on level sets.
  • The Riemann Integral — After parameterization, every surface integral reduces to a double Riemann integral. Existence follows from the integrability theory of Topic 7.

What Comes Next

  • Series Convergence & Tests and Uniform Convergence — Fourier coefficients are computed via inner products that involve surface integrals in higher dimensions. The divergence theorem appears in the theory of Fourier series on domains.
  • Stability & Dynamical Systems — Lyapunov stability analysis uses the divergence theorem to establish energy conservation and dissipation: the flux of energy through a surface bounds the rate of change of energy in the interior.
  • Sigma-Algebras & Measures (coming soon) — Surface measures and the co-area formula generalize the surface integral to the measure-theoretic setting.
  • Inner Product & Hilbert Spaces — Hilbert space structure for Sobolev spaces. Weak derivatives and Sobolev spaces are developed in Calculus of Variations.
  • Calculus of Variations — The Euler-Lagrange equation for area-minimizing surfaces, Sobolev spaces as the natural domain for weak solutions, and the connection to minimal surfaces.
  • -> Gradient Descent — The divergence theorem connects the Laplacian of the loss to the convergence geometry of gradient flow.
  • -> Measure-Theoretic Probability — Stein’s identity via the divergence theorem; kernel Stein discrepancy and SVGD.
  • -> Smooth Manifolds — The generalized Stokes’ theorem Mdω=Mω\int_M d\omega = \int_{\partial M}\omega unifies all classical integral theorems.
  • -> Information Geometry — Fisher-Rao volume form, surface integrals on the statistical manifold, natural gradient flow conservation.

Prerequisite DAG

This topic has three inbound prerequisite edges — it is the only topic in the curriculum with this property:

multiple-integrals ──┐

change-of-variables ─┼──→ surface-integrals

line-integrals ──────┘

Each predecessor contributes specific machinery: double/triple integrals and Fubini (Topic 13), Jacobian area scaling and coordinate changes (Topic 14), and Green’s theorem as the 2D prototype of Stokes’ (Topic 15). Surface integrals synthesize all three into the capstone results of multivariable calculus.

This topic completes the Multivariable Integral Calculus track (4/4).

References

  1. book Spivak (1965). Calculus on Manifolds Chapter 5 — the generalized Stokes' theorem in the language of differential forms, unifying all classical integral theorems
  2. book Hubbard & Hubbard (2015). Vector Calculus, Linear Algebra, and Differential Forms Chapter 6 — surface integrals, flux, and the divergence theorem with geometric exposition and careful orientation treatment
  3. book Schey (2005). Div, Grad, Curl, and All That Chapters 3-4 — physical motivation for surface integrals via flux, the divergence theorem as a conservation law
  4. book Munkres (1991). Analysis on Manifolds Chapters 6-7 — rigorous treatment of surface integrals, the classical Stokes' and divergence theorems
  5. book Rudin (1976). Principles of Mathematical Analysis Chapter 10 — integration of differential forms, Stokes' theorem in ℝⁿ
  6. paper Liu, Lee & Jordan (2016). “A Kernelized Stein Discrepancy for Goodness-of-fit Tests” Stein's identity via the divergence theorem — the foundation of kernel Stein discrepancy and SVGD
  7. paper Lipman, Chen, Ben-Hamu, Nickel (2023). “Flow Matching for Generative Modeling” The continuity equation ∂_t p + ∇ · (pv) = 0 governs density evolution — a direct application of the divergence theorem
  8. paper Raissi, Perdikaris & Karniadakis (2019). “Physics-Informed Neural Networks” PDE constraints enforced via the divergence theorem — conservation laws as loss terms