Multivariable Integral · intermediate · 50 min read

Line Integrals & Conservative Fields

Integrating functions and vector fields along curves — the Gradient Theorem as the FTC for paths, conservative fields and potential functions, path independence, Green's theorem connecting circulation to double integrals, and gradient flow as continuous-time optimization.

Abstract. Line integrals extend integration from intervals and regions to curves. Given a parameterized curve C in ℝⁿ and a scalar function f, the scalar line integral ∫_C f ds = ∫_a^b f(r(t)) ‖r'(t)‖ dt sums f along C weighted by arc length. Given a vector field F, the vector line integral (work integral) ∫_C F · dr = ∫_a^b F(r(t)) · r'(t) dt measures the total work done by F along C. A vector field F is conservative if F = ∇f for some potential function f. The Gradient Theorem — the Fundamental Theorem of Calculus for line integrals — states that ∫_C ∇f · dr = f(r(b)) − f(r(a)): the integral depends only on the endpoints, not the path. This is equivalent to path independence: the integral of a conservative field between two points is the same regardless of which curve connects them. In ℝ², the exactness criterion ∂P/∂y = ∂Q/∂x characterizes conservative fields on simply connected domains. On domains with holes, closed fields need not be exact — the topology of the domain matters, as demonstrated by the vortex field. Green's theorem provides the bridge between line integrals and double integrals: the circulation ∮_C F · dr around a closed curve equals the double integral ∬_D (∂Q/∂x − ∂P/∂y) dA over the enclosed region. The integrand is the 2D curl of F, measuring infinitesimal rotation. Green's theorem also yields the area formula A = ½ ∮_C (x dy − y dx). In machine learning, gradient flow dθ/dt = −∇L(θ) traces a curve in parameter space along which the loss decreases monotonically — the Gradient Theorem guarantees L(θ(T)) − L(θ(0)) = −∫₀ᵀ ‖∇L(θ(t))‖² dt ≤ 0. Energy-based models define a scalar potential whose gradient field governs the model's dynamics. The natural gradient follows geodesics on the statistical manifold rather than straight lines in parameter space.

Where this leads → formalML

  • formalML Gradient flow dθ/dt = −∇L(θ) traces a curve in parameter space. The Gradient Theorem gives L(θ(T)) − L(θ(0)) = −∫₀ᵀ ‖∇L‖² dt ≤ 0, proving the loss decreases monotonically along the flow. Discrete gradient descent approximates this continuous path, and the integral quantifies convergence.
  • formalML Line integrals are integrals of differential 1-forms ω = P dx + Q dy along curves. Conservative fields are exact forms (ω = df). The gap between closed and exact forms — measured by de Rham cohomology H¹ — is the topological obstruction to conservativeness. Green's theorem is the 2D Stokes' theorem.
  • formalML Geodesics on the statistical manifold minimize the Fisher-Rao length functional — a line integral of the metric tensor. The natural gradient follows these geodesics, and path length in the Fisher-Rao metric measures statistical distinguishability.

1. Overview & Motivation

You’re training a neural network. At each step, gradient descent moves the parameter vector $\theta$ a small distance in the direction $-\nabla L(\theta)$. Over many steps, the parameters trace a curve through parameter space: a winding path from initialization to (hopefully) a minimum. How much does the loss decrease along that entire path?

The answer is a line integral: $\Delta L = \int_C \nabla L \cdot d\mathbf{r}$. The Gradient Theorem, the subject of this topic, says this integral equals $L(\theta_{\text{final}}) - L(\theta_{\text{init}})$, regardless of the path’s shape. This is the Fundamental Theorem of Calculus, generalized from intervals to curves in $\mathbb{R}^n$.

But not every vector field is a gradient. When a field is a gradient — when it is conservative — integration becomes dramatically simpler: the integral depends only on the endpoints, not the path. When a field is not conservative, the path matters, and the distinction between “conservative” and “not conservative” becomes a topological question about the domain. That question — when does the shape of the path matter? — is the central thread of this topic.

2. Parameterized Curves

Before we can integrate along curves, we need to say precisely what a “curve” is and how to measure length along it.

A curve in $\mathbb{R}^n$ is a path traced out by a moving point. We describe it by giving the position at each “time” $t$: the parameterization $\mathbf{r}(t) = (x_1(t), \ldots, x_n(t))$ for $t \in [a, b]$. The velocity vector $\mathbf{r}'(t)$ points along the curve, and its magnitude $\|\mathbf{r}'(t)\|$ is the speed. Arc length, the total distance traveled, is the integral of speed.

📐 Definition 1 (Parameterized Curve)

A parameterized curve in $\mathbb{R}^n$ is a continuous function $\mathbf{r}: [a, b] \to \mathbb{R}^n$. The curve is smooth if $\mathbf{r}$ is $C^1$ and $\mathbf{r}'(t) \neq \mathbf{0}$ for all $t \in (a, b)$: the velocity never vanishes, so the particle never stops. The curve is piecewise smooth if $[a, b]$ can be partitioned into finitely many subintervals on each of which $\mathbf{r}$ is smooth.

📐 Definition 2 (Arc Length)

The arc length of a smooth curve $\mathbf{r}: [a, b] \to \mathbb{R}^n$ is

$$L(C) = \int_a^b \|\mathbf{r}'(t)\|\,dt.$$

The arc length element is $ds = \|\mathbf{r}'(t)\|\,dt$, representing the infinitesimal distance traveled in an infinitesimal time $dt$.

💡 Remark 1 (Reparameterization Invariance)

If $\phi: [\alpha, \beta] \to [a, b]$ is a $C^1$ bijection with $\phi'(\tau) > 0$ (orientation-preserving), then $\tilde{\mathbf{r}}(\tau) = \mathbf{r}(\phi(\tau))$ traces the same curve in the same direction. By the substitution rule (Topic 14, Theorem 1):

$$\int_\alpha^\beta \|\tilde{\mathbf{r}}'(\tau)\|\,d\tau = \int_a^b \|\mathbf{r}'(t)\|\,dt.$$

Arc length does not depend on how fast we traverse the curve — it is a geometric property of the curve itself.

📝 Example 1 (Circle of Radius R)

Let $\mathbf{r}(t) = (R\cos t,\; R\sin t)$ for $t \in [0, 2\pi]$. Then $\mathbf{r}'(t) = (-R\sin t,\; R\cos t)$ and $\|\mathbf{r}'(t)\| = R$. The arc length is

$$L = \int_0^{2\pi} R\,dt = 2\pi R.$$

The constant speed $R$ means the particle moves uniformly, so the arc length is simply speed times time.

📝 Example 2 (Helix)

A helix $\mathbf{r}(t) = (\cos t, \sin t, t)$ for $t \in [0, 2\pi]$ climbs one full turn. We have $\|\mathbf{r}'(t)\| = \sqrt{\sin^2 t + \cos^2 t + 1} = \sqrt{2}$, giving $L = 2\pi\sqrt{2}$. The vertical climb adds length to the horizontal circle.

📝 Example 3 (Parabolic Arc)

The parabola $\mathbf{r}(t) = (t, t^2)$ for $t \in [0, 1]$ has $\|\mathbf{r}'(t)\| = \sqrt{1 + 4t^2}$. The arc length integral $\int_0^1 \sqrt{1 + 4t^2}\,dt$ requires the $\sinh^{-1}$ formula or numerical quadrature; not every arc length computation is elementary.
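As a rough numerical check of Example 3, the sketch below integrates the speed with `scipy.integrate.quad` and compares it with the closed form obtained from the antiderivative $\tfrac{t}{2}\sqrt{1+4t^2} + \tfrac{1}{4}\sinh^{-1}(2t)$. The helper name `arc_length` is an illustrative choice, not something defined elsewhere in this topic.

```python
import numpy as np
from scipy.integrate import quad

def arc_length(r_prime, a, b):
    """Arc length: integrate the speed ||r'(t)|| over [a, b]."""
    speed = lambda t: np.linalg.norm(r_prime(t))
    value, _ = quad(speed, a, b)
    return value

# Parabola r(t) = (t, t^2), so r'(t) = (1, 2t).
numeric = arc_length(lambda t: np.array([1.0, 2.0 * t]), 0.0, 1.0)

# Closed form: evaluate (t/2) sqrt(1 + 4 t^2) + (1/4) asinh(2t) at t = 1.
exact = np.sqrt(5.0) / 2.0 + np.arcsinh(2.0) / 4.0

print(numeric, exact)  # both approximately 1.4789
```

The same helper works for the circle and helix of Examples 1 and 2 by swapping in their velocity vectors.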

Parameterized curves: circle, helix (3D projection), and parabolic arc with velocity vectors

3. Scalar Line Integrals

The scalar line integral $\int_C f\,ds$ sums the values of a function $f$ along a curve $C$, weighted by arc length. If $f = 1$, we recover the arc length itself. If $f$ represents density (mass per unit length), the integral gives total mass.

Imagine a wire bent into the shape of $C$, with density $f(x, y)$ at each point. The total mass is $\int_C f\,ds$. The wire analogy makes clear why we weight by $ds$ rather than $dt$: the physical mass depends on the curve’s geometry, not on how fast we parameterize it.

📐 Definition 3 (Scalar Line Integral)

Let $C$ be a smooth curve parameterized by $\mathbf{r}: [a, b] \to \mathbb{R}^n$, and let $f: C \to \mathbb{R}$ be continuous. The scalar line integral of $f$ over $C$ is:

$$\int_C f\,ds = \int_a^b f(\mathbf{r}(t))\,\|\mathbf{r}'(t)\|\,dt.$$

This is a Riemann integral (Topic 7) of the composite function $t \mapsto f(\mathbf{r}(t)) \cdot \|\mathbf{r}'(t)\|$ over $[a, b]$.

🔷 Proposition 1 (Parameterization Independence)

The scalar line integral $\int_C f\,ds$ is independent of the parameterization of $C$ (including orientation). Any two smooth parameterizations of the same curve give the same value.

Proof.

Let $\mathbf{r}: [a, b] \to \mathbb{R}^n$ and $\tilde{\mathbf{r}} = \mathbf{r} \circ \phi: [\alpha, \beta] \to \mathbb{R}^n$ with $\phi$ a $C^1$ bijection. By the chain rule, $\tilde{\mathbf{r}}'(\tau) = \mathbf{r}'(\phi(\tau)) \cdot \phi'(\tau)$, so $\|\tilde{\mathbf{r}}'(\tau)\| = \|\mathbf{r}'(\phi(\tau))\| \cdot |\phi'(\tau)|$. Then:

$$\int_\alpha^\beta f(\tilde{\mathbf{r}}(\tau))\,\|\tilde{\mathbf{r}}'(\tau)\|\,d\tau = \int_\alpha^\beta f(\mathbf{r}(\phi(\tau)))\,\|\mathbf{r}'(\phi(\tau))\|\,|\phi'(\tau)|\,d\tau.$$

By the substitution rule (Topic 14, Theorem 1) with $t = \phi(\tau)$, this equals $\int_a^b f(\mathbf{r}(t))\,\|\mathbf{r}'(t)\|\,dt$. The absolute value $|\phi'(\tau)|$ ensures the result holds regardless of whether $\phi$ preserves or reverses orientation.

📝 Example 4 (Mass of a Semicircular Wire)

A wire follows $C: \mathbf{r}(t) = (\cos t, \sin t)$ for $t \in [0, \pi]$ with density $f(x, y) = y$. Then:

$$\int_C f\,ds = \int_0^\pi \sin t \cdot 1\,dt = [-\cos t]_0^\pi = 2.$$

The wire is heaviest at the top ($y = 1$) and weightless at the endpoints ($y = 0$). The total mass is 2.
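The following sketch computes the mass from Example 4 numerically and repeats the computation with a second, hypothetical parameterization that runs clockwise at twice the speed, as an illustration of Proposition 1. The function name `line_integral_scalar` is an assumption made for this example.

```python
import numpy as np
from scipy.integrate import quad

def line_integral_scalar(f, r, r_prime, a, b):
    """Compute ∫_C f ds = ∫_a^b f(r(t)) ||r'(t)|| dt for a planar curve."""
    def integrand(t):
        x, y = r(t)
        return f(x, y) * np.hypot(*r_prime(t))
    value, _ = quad(integrand, a, b)
    return value

density = lambda x, y: y

# Parameterization from Example 4: counterclockwise, unit speed.
mass_ccw = line_integral_scalar(
    density,
    r=lambda t: (np.cos(t), np.sin(t)),
    r_prime=lambda t: (-np.sin(t), np.cos(t)),
    a=0.0, b=np.pi,
)

# Same semicircle traversed clockwise at speed 2: r(t) = (-cos 2t, sin 2t).
mass_cw = line_integral_scalar(
    density,
    r=lambda t: (-np.cos(2 * t), np.sin(2 * t)),
    r_prime=lambda t: (2 * np.sin(2 * t), 2 * np.cos(2 * t)),
    a=0.0, b=np.pi / 2,
)

print(mass_ccw, mass_cw)  # both approximately 2.0
```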

📝 Example 5 (Average Value Along a Curve)

The average value of $f$ over $C$ is $\bar{f} = \frac{1}{L(C)} \int_C f\,ds$, directly analogous to $\bar{f} = \frac{1}{b-a} \int_a^b f(x)\,dx$ from single-variable calculus (Topic 7).

Scalar line integral: wire density f(x,y) = y along semicircle, with ds elements shown

4. Vector Line Integrals — The Work Integral

The vector line integral $\int_C \mathbf{F} \cdot d\mathbf{r}$ measures the work done by a force field $\mathbf{F}$ on a particle moving along $C$. Unlike the scalar line integral, this integral is orientation-sensitive: reversing the direction of traversal negates the result.

At each point on $C$, the vector field $\mathbf{F}$ has a component tangent to the curve and a component perpendicular to it. Only the tangent component contributes to work. The dot product $\mathbf{F} \cdot \mathbf{r}'(t)$ extracts exactly this tangent component (times the speed). Integrating over $t$ sums up the infinitesimal contributions $\mathbf{F} \cdot d\mathbf{r}$ along the entire path.

📐 Definition 4 (Vector Line Integral)

Let $C$ be a smooth curve parameterized by $\mathbf{r}: [a, b] \to \mathbb{R}^n$ and $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n$ a continuous vector field. The vector line integral (or work integral) of $\mathbf{F}$ along $C$ is:

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt.$$

In $\mathbb{R}^2$, writing $\mathbf{F} = (P, Q)$ and $d\mathbf{r} = (dx, dy)$, this becomes $\int_C P\,dx + Q\,dy = \int_a^b [P(\mathbf{r}(t))\,x'(t) + Q(\mathbf{r}(t))\,y'(t)]\,dt$.

💡 Remark 2 (Orientation Matters)

Reversing the curve $C$, that is, traversing it from $\mathbf{r}(b)$ to $\mathbf{r}(a)$, negates the integral: $\int_{-C} \mathbf{F} \cdot d\mathbf{r} = -\int_C \mathbf{F} \cdot d\mathbf{r}$. This is because $\mathbf{r}'(t)$ reverses sign under orientation reversal, and the dot product is linear. By contrast, the scalar line integral $\int_C f\,ds$ is orientation-independent because the speed $\|\mathbf{r}'(t)\|$ is unchanged when the velocity flips sign.

💡 Remark 3 (Parameterization Independence)

The vector line integral is independent of the parameterization as long as orientation is preserved: any two parameterizations that traverse $C$ in the same direction yield the same value. The proof is the same substitution argument as Proposition 1 with $\phi'(\tau) > 0$, so no absolute value is needed. If instead $\phi'(\tau) < 0$, the factor $\phi'(\tau)$ still appears without an absolute value, and the substitution $t = \phi(\tau)$ swaps the limits of integration, which produces the sign change of Remark 2.

🔷 Theorem 1 (Properties of Line Integrals)

Let $C$, $C_1$, $C_2$ be piecewise-smooth curves, $\mathbf{F}$, $\mathbf{G}$ continuous vector fields, and $\alpha, \beta \in \mathbb{R}$.

  1. Linearity: $\int_C (\alpha\mathbf{F} + \beta\mathbf{G}) \cdot d\mathbf{r} = \alpha\int_C \mathbf{F} \cdot d\mathbf{r} + \beta\int_C \mathbf{G} \cdot d\mathbf{r}$.

  2. Additivity over path concatenation: If $C = C_1 + C_2$ (the endpoint of $C_1$ is the start of $C_2$), then $\int_C \mathbf{F} \cdot d\mathbf{r} = \int_{C_1} \mathbf{F} \cdot d\mathbf{r} + \int_{C_2} \mathbf{F} \cdot d\mathbf{r}$.

  3. Orientation reversal: $\int_{-C} \mathbf{F} \cdot d\mathbf{r} = -\int_C \mathbf{F} \cdot d\mathbf{r}$.
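The three properties can be spot-checked numerically. The sketch below uses two hypothetical fields, $\mathbf{F} = (y, x^2)$ and $\mathbf{G} = (x, -y)$, and the parabolic arc $\mathbf{r}(t) = (t, t^2)$; these choices and the helper `work` are assumptions made for illustration only.

```python
import numpy as np
from scipy.integrate import quad

def work(F, r, r_prime, a, b):
    """Work integral ∫_C F · dr along the path r(t), t in [a, b]."""
    integrand = lambda t: np.dot(F(*r(t)), r_prime(t))
    return quad(integrand, a, b)[0]

F = lambda x, y: np.array([y, x**2])
G = lambda x, y: np.array([x, -y])

# Parabolic arc from (0, 0) to (1, 1), and the same arc traversed backwards.
r = lambda t: (t, t**2)
r_prime = lambda t: (1.0, 2.0 * t)
r_rev = lambda t: (1 - t, (1 - t)**2)
r_rev_prime = lambda t: (-1.0, -2.0 * (1 - t))

whole = work(F, r, r_prime, 0.0, 1.0)

# 1. Linearity with alpha = 2, beta = -3.
lin_lhs = work(lambda x, y: 2 * F(x, y) - 3 * G(x, y), r, r_prime, 0.0, 1.0)
lin_rhs = 2 * whole - 3 * work(G, r, r_prime, 0.0, 1.0)

# 2. Additivity: split the arc at t = 0.4.
split = work(F, r, r_prime, 0.0, 0.4) + work(F, r, r_prime, 0.4, 1.0)

# 3. Orientation reversal.
backward = work(F, r_rev, r_rev_prime, 0.0, 1.0)

print(whole, lin_lhs - lin_rhs, split - whole, backward + whole)
```

For this arc the integrand is $t^2 + 2t^3$, so `whole` should be close to $5/6$ and the three differences close to zero.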

📝 Example 6 (Work by a Constant Force)

Let $\mathbf{F} = (3, 4)$ and $C$ be the line segment from $(0, 0)$ to $(2, 1)$: $\mathbf{r}(t) = (2t, t)$ for $t \in [0, 1]$. Then $\mathbf{r}'(t) = (2, 1)$ and:

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_0^1 (3 \cdot 2 + 4 \cdot 1)\,dt = 10.$$

For a constant field, the work equals $\mathbf{F} \cdot \Delta\mathbf{r} = (3, 4) \cdot (2, 1) = 10$: the integral is just a dot product.

📝 Example 7 (Work by a Radial Field)

Let $\mathbf{F}(x, y) = (x, y)$ and $C$ be the upper semicircle from $(1, 0)$ to $(-1, 0)$: $\mathbf{r}(t) = (\cos t, \sin t)$ for $t \in [0, \pi]$. Then $\mathbf{r}'(t) = (-\sin t, \cos t)$:

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_0^\pi [(\cos t)(-\sin t) + (\sin t)(\cos t)]\,dt = \int_0^\pi 0\,dt = 0.$$

On a circle centered at the origin, the radial field is everywhere perpendicular to the tangent direction, so it does zero work along any such circular arc.

📝 Example 8 (Work by a Non-Conservative Field)

Let $\mathbf{F}(x, y) = (-y, x)$. Compute the work along two different paths from $(1, 0)$ to $(0, 1)$:

Path $C_1$: Line segment $\mathbf{r}(t) = (1 - t, t)$ for $t \in [0, 1]$. Then $\mathbf{r}'(t) = (-1, 1)$ and $\mathbf{F}(\mathbf{r}(t)) = (-t, 1 - t)$:

$$\int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_0^1 [(-t)(-1) + (1-t)(1)]\,dt = \int_0^1 1\,dt = 1.$$

Path $C_2$: Quarter-circle $\mathbf{r}(t) = (\cos t, \sin t)$ for $t \in [0, \pi/2]$. Then $\mathbf{F}(\mathbf{r}(t)) = (-\sin t, \cos t) = \mathbf{r}'(t)$:

$$\int_{C_2} \mathbf{F} \cdot d\mathbf{r} = \int_0^{\pi/2} (\sin^2 t + \cos^2 t)\,dt = \frac{\pi}{2}.$$

Different paths, different integrals ($1 \neq \pi/2$). This field is not conservative.
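Example 8 can be reproduced numerically with a few lines. This is a minimal sketch; the helper `work` is an illustrative name, and `quad` stands in for any one-dimensional quadrature rule.

```python
import numpy as np
from scipy.integrate import quad

def work(F, r, r_prime, a, b):
    """Work integral ∫_C F · dr for a planar field and a parameterized path."""
    integrand = lambda t: np.dot(F(*r(t)), r_prime(t))
    value, _ = quad(integrand, a, b)
    return value

rotation = lambda x, y: (-y, x)

# Path C1: line segment from (1, 0) to (0, 1).
work_segment = work(rotation, lambda t: (1 - t, t), lambda t: (-1.0, 1.0), 0.0, 1.0)

# Path C2: quarter circle from (1, 0) to (0, 1).
work_arc = work(
    rotation,
    lambda t: (np.cos(t), np.sin(t)),
    lambda t: (-np.sin(t), np.cos(t)),
    0.0, np.pi / 2,
)

print(work_segment, work_arc)  # approximately 1.0 and 1.5708 (= pi/2)
```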

Vector field with curve, tangent component projection at sample points

5. Conservative Fields & the Gradient Theorem

This section contains the most important result in the topic — the Fundamental Theorem of Calculus for line integrals. It explains why “gradient” and “conservative” are the same concept.

If $\mathbf{F} = \nabla f$, then the work integral $\int_C \mathbf{F} \cdot d\mathbf{r}$ is just the total change in $f$ along the curve: the difference between the “heights” at the endpoints. Think of $f$ as elevation: a hiker following a trail gains elevation $f(\text{end}) - f(\text{start})$ regardless of the trail’s shape. The gradient field $\nabla f$ always points uphill, so walking along a contour (level curve of $f$) does zero work, because the gradient is perpendicular to level sets (Topic 9).

📐 Definition 5 (Conservative Vector Field)

A vector field $\mathbf{F}: D \to \mathbb{R}^n$ (where $D \subseteq \mathbb{R}^n$ is open and connected) is conservative if there exists a $C^1$ function $f: D \to \mathbb{R}$ such that $\mathbf{F} = \nabla f$ on $D$. The function $f$ is called a potential function (or scalar potential) for $\mathbf{F}$.

💡 Remark 4 (Potential Functions Are Unique Up to a Constant)

If $f$ and $g$ are both potential functions for $\mathbf{F}$ on a connected domain $D$, then $\nabla(f - g) = \mathbf{0}$ on $D$, so $f - g$ is constant. This follows from the fact that a function with zero gradient on a connected domain must be constant, a consequence of the Mean Value Theorem (Topic 6).

🔷 Theorem 2 (The Gradient Theorem (FTC for Line Integrals))

Let $f: D \to \mathbb{R}$ be a $C^1$ function on an open set $D \subseteq \mathbb{R}^n$, and let $C$ be a piecewise-smooth curve in $D$ from $\mathbf{a}$ to $\mathbf{b}$. Then:

$$\int_C \nabla f \cdot d\mathbf{r} = f(\mathbf{b}) - f(\mathbf{a}).$$

Proof.

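Suppose first that $C$ is smooth, parameterized by $\mathbf{r}: [a, b] \to D$ with $\mathbf{r}(a) = \mathbf{a}$ and $\mathbf{r}(b) = \mathbf{b}$. Define $g(t) = f(\mathbf{r}(t))$ for $t \in [a, b]$. By the chain rule (Topic 5 for scalar functions, Topic 10 for the multivariable version):

$$g'(t) = \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t).$$

This is the key identity: the integrand of the line integral is exactly $g'(t)$. By the Fundamental Theorem of Calculus (Topic 7, Theorem 2):

$$\int_C \nabla f \cdot d\mathbf{r} = \int_a^b g'(t)\,dt = g(b) - g(a) = f(\mathbf{r}(b)) - f(\mathbf{r}(a)).$$

For a piecewise-smooth curve, apply this to each smooth piece and add the results: the values of $f$ at the intermediate junction points cancel in pairs, leaving $f(\mathbf{b}) - f(\mathbf{a})$. $\square$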

The proof is strikingly short — it’s the chain rule plus the FTC. The chain rule converts the multivariable line integral into a single-variable integral, and the FTC evaluates it. This is why the Gradient Theorem is the “FTC for line integrals.”

📝 Example 9 (Gravitational Potential)

Let $\mathbf{F}(x, y) = (2x, 2y)$ with potential $f(x, y) = x^2 + y^2$. For any curve $C$ from $(1, 0)$ to $(0, 3)$:

$$\int_C \mathbf{F} \cdot d\mathbf{r} = f(0, 3) - f(1, 0) = 9 - 1 = 8.$$

No parameterization needed — just endpoint evaluation.
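To see the endpoint-only behavior concretely, the sketch below integrates the field of Example 9 along a straight segment and along a deliberately wiggly path between the same endpoints. The wiggly path is a hypothetical choice made for this illustration; any smooth path from $(1, 0)$ to $(0, 3)$ would do.

```python
import numpy as np
from scipy.integrate import quad

f = lambda x, y: x**2 + y**2
grad_f = lambda x, y: (2 * x, 2 * y)

def work(F, r, r_prime):
    """Work integral ∫_C F · dr for a path parameterized on [0, 1]."""
    integrand = lambda t: np.dot(F(*r(t)), r_prime(t))
    return quad(integrand, 0.0, 1.0)[0]

# Straight segment from (1, 0) to (0, 3).
straight = work(grad_f, lambda t: (1 - t, 3 * t), lambda t: (-1.0, 3.0))

# Wiggly path with the same endpoints (the sine terms vanish at t = 0 and t = 1).
wiggly = work(
    grad_f,
    lambda t: (1 - t + 0.5 * np.sin(np.pi * t), 3 * t + 0.7 * np.sin(2 * np.pi * t)),
    lambda t: (-1 + 0.5 * np.pi * np.cos(np.pi * t), 3 + 1.4 * np.pi * np.cos(2 * np.pi * t)),
)

endpoint_difference = f(0, 3) - f(1, 0)
print(straight, wiggly, endpoint_difference)  # all approximately 8
```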

📝 Example 10 (Verifying Example 7 via the Gradient Theorem)

The radial field is a gradient: $\mathbf{F}(x, y) = (x, y) = \nabla\!\left(\frac{x^2 + y^2}{2}\right)$. With $f(x, y) = \frac{x^2 + y^2}{2}$, the curve from $(1, 0)$ to $(-1, 0)$ gives:

$$f(-1, 0) - f(1, 0) = \frac{1}{2} - \frac{1}{2} = 0.$$

The Gradient Theorem reproduces Example 7’s result — zero work — without any integration.

Potential surface z = f(x,y) with two paths between same endpoints, height difference labeled

The surface shows φ(x, y). The gradient field ∇φ is projected onto the floor plane. For any curve C from point A to point B, the Gradient Theorem gives ∫C ∇φ · dr = φ(B) − φ(A).

6. Path Independence & the Exactness Criterion

When is a vector field conservative? The Gradient Theorem shows that conservative fields have path-independent integrals. The converse is also true: path independence implies conservativeness. And there is a practical, computable test.

📐 Definition 6 (Path Independence)

A vector field $\mathbf{F}: D \to \mathbb{R}^n$ has path-independent line integrals if $\int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_{C_2} \mathbf{F} \cdot d\mathbf{r}$ for every pair of piecewise-smooth curves $C_1, C_2$ in $D$ that share the same endpoints.

📐 Definition 7 (Closed Curve)

A curve $C$ parameterized by $\mathbf{r}: [a, b] \to \mathbb{R}^n$ is closed if $\mathbf{r}(a) = \mathbf{r}(b)$. We write $\oint_C$ for integrals over closed curves.

🔷 Theorem 3 (Equivalence of Conservative, Path-Independent, and Zero-Circulation)

Let $\mathbf{F}: D \to \mathbb{R}^n$ be a continuous vector field on an open connected domain $D$. The following are equivalent:

  1. $\mathbf{F}$ is conservative ($\mathbf{F} = \nabla f$ for some $C^1$ function $f$).
  2. $\int_C \mathbf{F} \cdot d\mathbf{r}$ is path-independent in $D$.
  3. $\oint_C \mathbf{F} \cdot d\mathbf{r} = 0$ for every piecewise-smooth closed curve $C$ in $D$.

Proof.

(1) $\Rightarrow$ (2): Immediate from the Gradient Theorem: the integral equals $f(\mathbf{b}) - f(\mathbf{a})$, which depends only on the endpoints.

(2) $\Rightarrow$ (3): If $C$ is closed, its start and end points coincide: $\mathbf{a} = \mathbf{b}$. Split $C$ at any interior point $\mathbf{p}$ into two curves $C_1$ (from $\mathbf{a}$ to $\mathbf{p}$) and $C_2$ (from $\mathbf{p}$ to $\mathbf{a}$). By path independence, $\int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_{-C_2} \mathbf{F} \cdot d\mathbf{r} = -\int_{C_2} \mathbf{F} \cdot d\mathbf{r}$, so $\oint_C = \int_{C_1} + \int_{C_2} = 0$.

(3) $\Rightarrow$ (1): Fix a base point $\mathbf{a} \in D$ and define $f(\mathbf{x}) = \int_C \mathbf{F} \cdot d\mathbf{r}$ where $C$ is any path from $\mathbf{a}$ to $\mathbf{x}$. The zero-circulation condition ensures this is well-defined: if $C$ and $C'$ both run from $\mathbf{a}$ to $\mathbf{x}$, then $C$ followed by $-C'$ is a closed curve, so the two integrals agree.

To show $\nabla f = \mathbf{F}$: compute $\frac{\partial f}{\partial x_i}(\mathbf{x})$ by choosing the path to $\mathbf{x} + h\mathbf{e}_i$ as any path from $\mathbf{a}$ to $\mathbf{x}$, then a straight segment from $\mathbf{x}$ to $\mathbf{x} + h\mathbf{e}_i$. The difference is $f(\mathbf{x} + h\mathbf{e}_i) - f(\mathbf{x}) = \int_0^h F_i(\mathbf{x} + s\mathbf{e}_i)\,ds$. By the FTC (Topic 7), dividing by $h$ and taking $h \to 0$ gives $\frac{\partial f}{\partial x_i}(\mathbf{x}) = F_i(\mathbf{x})$. $\square$

📐 Definition 8 (Simply Connected Domain)

An open connected domain $D \subseteq \mathbb{R}^2$ is simply connected if every closed curve in $D$ can be continuously shrunk to a point without leaving $D$. Informally: $D$ has no holes. The full plane $\mathbb{R}^2$ is simply connected; the punctured plane $\mathbb{R}^2 \setminus \{(0,0)\}$ is not.

🔷 Theorem 4 (Exactness Criterion)

Let $\mathbf{F} = (P, Q): D \to \mathbb{R}^2$ be a $C^1$ vector field on an open, simply connected domain $D \subseteq \mathbb{R}^2$. Then $\mathbf{F}$ is conservative if and only if:

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \quad \text{on } D.$$

💡 Remark 5 (Why 'Simply Connected'?)

The condition $\partial P / \partial y = \partial Q / \partial x$ says $\mathbf{F}$ is closed, meaning its 1-form $P\,dx + Q\,dy$ is closed. On simply connected domains, closed = exact (= conservative). On domains with holes, closed $\neq$ exact. The gap is topological, not analytical: it is measured by de Rham cohomology $H^1_{\text{dR}}$ (→ Smooth Manifolds on formalML).

📝 Example 11 (Testing Conservativeness)

Let $\mathbf{F}(x, y) = (2xy + y^2,\; x^2 + 2xy)$. Check: $\frac{\partial P}{\partial y} = 2x + 2y = \frac{\partial Q}{\partial x}$. Conservative.

Find $f$: from $f_x = 2xy + y^2$ we get $f(x, y) = x^2 y + xy^2 + g(y)$. Then $f_y = x^2 + 2xy + g'(y) = x^2 + 2xy$ forces $g'(y) = 0$, so $f(x, y) = x^2 y + xy^2 + C$.
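A quick sanity check on the potential found in Example 11 is to compare a finite-difference gradient of $f$ against $\mathbf{F}$ at random points. The sketch below does this with central differences; the step size, the sampling box $[-2, 2]^2$, and the helper name `numerical_gradient` are assumptions chosen for illustration.

```python
import numpy as np

def F(x, y):
    """The field from Example 11."""
    return np.array([2 * x * y + y**2, x**2 + 2 * x * y])

def f(x, y):
    """Candidate potential (taking the constant C = 0)."""
    return x**2 * y + x * y**2

def numerical_gradient(func, x, y, h=1e-5):
    """Central-difference approximation of the gradient of func at (x, y)."""
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return np.array([df_dx, df_dy])

rng = np.random.default_rng(0)
points = rng.uniform(-2.0, 2.0, size=(100, 2))

# Largest componentwise mismatch between the numerical gradient and F.
max_error = max(
    np.max(np.abs(numerical_gradient(f, x, y) - F(x, y))) for x, y in points
)
print(max_error)  # small, limited by finite-difference roundoff
```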

📝 Example 12 (The Vortex Field — Topology Matters)

Consider the vortex field $\mathbf{F}(x, y) = \left(\frac{-y}{x^2+y^2},\; \frac{x}{x^2+y^2}\right)$ on $D = \mathbb{R}^2 \setminus \{(0,0)\}$.

Check the exactness condition: $\frac{\partial P}{\partial y} = \frac{y^2 - x^2}{(x^2+y^2)^2} = \frac{\partial Q}{\partial x}$. The condition holds, yet the circulation around the unit circle $C$ is:

$$\oint_C \mathbf{F} \cdot d\mathbf{r} = 2\pi \neq 0.$$

The catch: $D$ is not simply connected, because it has a hole at the origin. The “potential function” is the polar angle $\theta$, given locally by $\arctan(y/x)$; it is multi-valued and gains $2\pi$ each time we circle the origin. The vortex field is the canonical example showing that topology matters.
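The role of the hole can be seen numerically by comparing the circulation of the vortex field around two circles: one enclosing the origin and one that does not. The second circle, centered at $(3, 0)$, is a hypothetical choice for this sketch, as is the helper name `circulation_on_circle`.

```python
import numpy as np
from scipy.integrate import quad

def circulation_on_circle(F, center, radius):
    """Circulation ∮ F · dr around a counterclockwise circle."""
    cx, cy = center
    def integrand(t):
        x = cx + radius * np.cos(t)
        y = cy + radius * np.sin(t)
        P, Q = F(x, y)
        # r'(t) = (-radius sin t, radius cos t)
        return P * (-radius * np.sin(t)) + Q * (radius * np.cos(t))
    value, _ = quad(integrand, 0.0, 2 * np.pi)
    return value

def vortex(x, y):
    r2 = x**2 + y**2
    return (-y / r2, x / r2)

around_origin = circulation_on_circle(vortex, (0.0, 0.0), 1.0)
away_from_origin = circulation_on_circle(vortex, (3.0, 0.0), 1.0)

print(around_origin, away_from_origin)  # approximately 6.2832 (= 2*pi) and 0.0
```

The second circle lies in a simply connected piece of $D$, where the closed field is exact, so its circulation vanishes.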

Four-panel: conservative field with three paths (same integral), non-conservative field with three paths (different integrals)

Vortex field with circulation 2π around origin, highlighting the hole

Line-integral readouts for the path comparison above. Conservative field: straight line 1.5000, parabolic arc 1.5000, circular arc 1.5000 (all paths give the same value). Non-conservative field: straight line 0.0000, parabolic arc 0.7500, circular arc 1.2843 (different paths, different values).

7. Green’s Theorem

Green’s theorem converts a line integral around a closed curve into a double integral over the enclosed region. This is the 2D special case of the generalized Stokes’ theorem — the single most powerful identity in vector calculus.

Walk around the boundary of a region $D$. At each point, the vector field $\mathbf{F}$ pushes you along (or against) your direction of travel. The total work around the loop, the circulation, equals the integral of the “rotation” of $\mathbf{F}$ over the interior. That “rotation” is $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}$, the 2D curl.

🔷 Theorem 5 (Green's Theorem)

Let $D \subseteq \mathbb{R}^2$ be a bounded region with piecewise-smooth boundary $\partial D$ oriented counterclockwise. Let $\mathbf{F} = (P, Q): \bar{D} \to \mathbb{R}^2$ be a $C^1$ vector field. Then:

$$\oint_{\partial D} P\,dx + Q\,dy = \iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\,dA.$$

Proof.

We show $\oint_{\partial D} P\,dx = -\iint_D \frac{\partial P}{\partial y}\,dA$ and $\oint_{\partial D} Q\,dy = \iint_D \frac{\partial Q}{\partial x}\,dA$ separately, then add.

Proof that $\oint P\,dx = -\iint \frac{\partial P}{\partial y}\,dA$: Let $D$ be a Type I region: $a \le x \le b$, $g_1(x) \le y \le g_2(x)$. The right side is:

$$-\iint_D \frac{\partial P}{\partial y}\,dA = -\int_a^b \int_{g_1(x)}^{g_2(x)} \frac{\partial P}{\partial y}(x, y)\,dy\,dx = -\int_a^b \bigl[P(x, g_2(x)) - P(x, g_1(x))\bigr]\,dx.$$

The boundary $\partial D$ traversed counterclockwise consists of: the bottom curve $C_1: y = g_1(x)$ from $x = a$ to $x = b$, the right side, the top curve $C_3: y = g_2(x)$ from $x = b$ to $x = a$, and the left side. On $C_1$: $\int_{C_1} P\,dx = \int_a^b P(x, g_1(x))\,dx$. On $C_3$ (traversed right to left): $\int_{C_3} P\,dx = -\int_a^b P(x, g_2(x))\,dx$. On the vertical sides, $dx = 0$, so their contributions vanish. Adding:

$$\oint_{\partial D} P\,dx = \int_a^b P(x, g_1(x))\,dx - \int_a^b P(x, g_2(x))\,dx = -\int_a^b [P(x, g_2(x)) - P(x, g_1(x))]\,dx.$$

The proof for $Q\,dy$ is analogous using a Type II description of $D$. For general regions, decompose into Type I and Type II pieces; interior boundary contributions cancel in pairs. $\square$

📝 Example 13 (Circulation of F = (−y, x) Around the Unit Circle)

Direct computation: $\oint_C (-y\,dx + x\,dy)$ with $\mathbf{r}(t) = (\cos t, \sin t)$ gives:

$$\int_0^{2\pi} [\sin^2 t + \cos^2 t]\,dt = 2\pi.$$

Via Green’s theorem: $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = 1 - (-1) = 2$, so:

$$\iint_D 2\,dA = 2 \cdot \pi = 2\pi.$$

Both give $2\pi$. The rotation field $(-y, x)$ has constant curl 2, so every point in the disk contributes equally to the circulation.

📝 Example 14 (Area via Green's Theorem)

Setting $P = -y/2$, $Q = x/2$ gives $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = \frac{1}{2} + \frac{1}{2} = 1$, so:

$$A(D) = \iint_D dA = \frac{1}{2}\oint_{\partial D} (x\,dy - y\,dx).$$

When $\partial D$ is a polygon, this reduces to the shoelace formula for polygonal areas. It is also the principle behind mechanical planimeters.
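For a polygon, the boundary integral in the area formula can be evaluated edge by edge: on the segment from $(x_i, y_i)$ to $(x_{i+1}, y_{i+1})$, the integral of $x\,dy - y\,dx$ equals $x_i y_{i+1} - x_{i+1} y_i$. The sketch below sums these terms; the function name `polygon_area` and the test polygons are illustrative choices.

```python
import numpy as np

def polygon_area(vertices):
    """Signed area via A = 1/2 ∮ (x dy - y dx).

    Positive when the vertices are listed counterclockwise,
    negative when they are listed clockwise.
    """
    v = np.asarray(vertices, dtype=float)
    x, y = v[:, 0], v[:, 1]
    # Shift by one so each vertex is paired with its successor (wrapping around).
    x_next, y_next = np.roll(x, -1), np.roll(y, -1)
    return 0.5 * np.sum(x * y_next - x_next * y)

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # counterclockwise
triangle = [(0, 0), (4, 0), (0, 3)]             # counterclockwise

print(polygon_area(unit_square))        # 1.0
print(polygon_area(triangle))           # 6.0
print(polygon_area(triangle[::-1]))     # -6.0 (clockwise orientation)
```

The sign flip under reversed vertex order is the orientation sensitivity of the vector line integral showing up in a discrete setting.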

💡 Remark 6 (Green's Theorem as a Conservation Law)

Green’s theorem says that the “total rotation inside $D$” equals the “total circulation around $\partial D$.” The interior quantity (curl) and the boundary quantity (circulation) are related by an exact balance. This is the prototype of all conservation laws in physics, and the 2D instance of Stokes’ theorem, which will be generalized to surfaces and volumes in Surface Integrals & the Divergence Theorem.

Region D with boundary traversal and interior curl heatmap, both sides computed

8. Curl & Circulation

The integrand in Green’s theorem, $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}$, is the 2D curl. It measures how much the vector field “rotates” around each point. We can formalize this by taking a limit of circulations over shrinking loops.

📐 Definition 9 (2D Curl (Scalar Curl))

For $\mathbf{F} = (P, Q): D \to \mathbb{R}^2$ of class $C^1$, the 2D curl (or scalar curl) is:

$$\operatorname{curl}\mathbf{F}(x, y) = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}.$$

This is the $\hat{\mathbf{k}}$-component of the 3D curl $\nabla \times \mathbf{F}$, with $\mathbf{F}$ viewed as the 3D field $(P, Q, 0)$.

🔷 Proposition 2 (Curl as Infinitesimal Circulation)

Let $\mathbf{F}$ be $C^1$ on a neighborhood of $\mathbf{p}$, and let $C_r$ be the circle of radius $r$ centered at $\mathbf{p}$, oriented counterclockwise. Then:

$$\operatorname{curl}\mathbf{F}(\mathbf{p}) = \lim_{r \to 0} \frac{1}{\pi r^2} \oint_{C_r} \mathbf{F} \cdot d\mathbf{r}.$$

The curl is the circulation per unit area in the limit of infinitesimally small loops.

Proof.

Let $D_r$ be the closed disk bounded by $C_r$. By Green’s theorem, $\oint_{C_r} \mathbf{F} \cdot d\mathbf{r} = \iint_{D_r} \operatorname{curl}\mathbf{F}\,dA$. By the Mean Value Theorem for double integrals (Topic 13), $\iint_{D_r} \operatorname{curl}\mathbf{F}\,dA = \operatorname{curl}\mathbf{F}(\mathbf{p}_r) \cdot \pi r^2$ for some $\mathbf{p}_r \in D_r$. As $r \to 0$, $\mathbf{p}_r \to \mathbf{p}$ and continuity of $\operatorname{curl}\mathbf{F}$ gives the limit. $\square$

💡 Remark 7 (Conservative ⟺ Curl-Free (on Simply Connected Domains))

Theorem 4 restated: on a simply connected domain, $\mathbf{F}$ is conservative if and only if $\operatorname{curl}\mathbf{F} = 0$ everywhere. Green’s theorem explains why: if $\operatorname{curl}\mathbf{F} = 0$ on $D$, then $\oint_C \mathbf{F} \cdot d\mathbf{r} = \iint_R \operatorname{curl}\mathbf{F}\,dA = 0$ for every closed curve $C$ bounding a region $R$ contained in $D$. On simply connected domains, every simple closed curve bounds such a region, so the zero-circulation condition (Theorem 3) is satisfied.

📝 Example 15 (Identifying Rotation)

Three vector fields, three curl values:

  • Rotation field $\mathbf{F} = (-y, x)$: $\operatorname{curl}\mathbf{F} = 1 - (-1) = 2$. Constant positive curl: rigid counterclockwise rotation.
  • Shear field $\mathbf{F} = (y, 0)$: $\operatorname{curl}\mathbf{F} = 0 - 1 = -1$. Constant negative curl: clockwise shearing.
  • Expansion field $\mathbf{F} = (x, y)$: $\operatorname{curl}\mathbf{F} = 0 - 0 = 0$. Curl-free: pure expansion, no rotation. This field is conservative (it’s a gradient field).
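Proposition 2 can be illustrated on a field whose curl is not constant. The sketch below uses the hypothetical field $\mathbf{F} = (-y^3, x^3)$, with $\operatorname{curl}\mathbf{F} = 3x^2 + 3y^2$, and evaluates circulation per unit area on shrinking circles around $\mathbf{p} = (1, 0.5)$. The circulation is computed with the periodic trapezoid rule rather than `quad`; that is a choice made here because the rule is very accurate for smooth periodic integrands.

```python
import numpy as np

def F(x, y):
    """Hypothetical test field with curl F = 3x^2 + 3y^2."""
    return -y**3, x**3

def circulation_per_area(F, center, radius, n=64):
    """Approximate (1 / (pi r^2)) ∮ F · dr on a counterclockwise circle."""
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    x = center[0] + radius * np.cos(t)
    y = center[1] + radius * np.sin(t)
    P, Q = F(x, y)
    # F · r'(t) with r'(t) = (-radius sin t, radius cos t).
    integrand = P * (-radius * np.sin(t)) + Q * (radius * np.cos(t))
    circulation = (2 * np.pi / n) * np.sum(integrand)  # periodic trapezoid rule
    return circulation / (np.pi * radius**2)

p = (1.0, 0.5)
curl_at_p = 3 * p[0]**2 + 3 * p[1]**2  # 3.75

ratios = {r: circulation_per_area(F, p, r) for r in (1.0, 0.1, 0.01)}
for r, ratio in ratios.items():
    print(r, ratio)  # approaches 3.75 as r shrinks
```

For this particular field the ratio equals the average of the curl over the disk, which works out to $3.75 + 1.5\,r^2$, so the convergence to $\operatorname{curl}\mathbf{F}(\mathbf{p})$ is quadratic in $r$.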

Three-panel: positive curl (rotation), negative curl (shear), zero curl (expansion) with paddlewheels

9. Computational Notes

In practice, line integrals are computed by reducing to single-variable integrals via parameterization, then applying numerical quadrature. Here are the key patterns:

Computing $\int_C \mathbf{F} \cdot d\mathbf{r}$ given $\mathbf{F}$ and $\mathbf{r}(t)$:

import numpy as np
from scipy.integrate import quad

def line_integral_vector(F, r, r_prime, a, b):
    """Compute ∫_C F · dr via parameterization."""
    def integrand(t):
        x, y = r(t)
        Fx, Fy = F(x, y)
        dx, dy = r_prime(t)
        return Fx * dx + Fy * dy
    result, _ = quad(integrand, a, b)
    return result
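
The scalar line integral $\int_C f\,ds$ reduces to a single-variable integral in the same way, with the speed $\|\mathbf{r}'(t)\|$ as the weight. The following is a sketch for planar curves only; `line_integral_scalar` is a name chosen here for illustration and does not come from the original text.

```python
import numpy as np
from scipy.integrate import quad

def line_integral_scalar(f, r, r_prime, a, b):
    """Compute ∫_C f ds = ∫_a^b f(r(t)) ‖r'(t)‖ dt for a planar curve."""
    def integrand(t):
        x, y = r(t)
        dx, dy = r_prime(t)
        return f(x, y) * np.hypot(dx, dy)  # hypot gives the speed ‖r'(t)‖
    result, _ = quad(integrand, a, b)
    return result

# With f = 1 the integral is the arc length; for the unit circle that is 2π
circumference = line_integral_scalar(
    f=lambda x, y: 1.0,
    r=lambda t: (np.cos(t), np.sin(t)),
    r_prime=lambda t: (-np.sin(t), np.cos(t)),
    a=0.0, b=2.0 * np.pi,
)

# f = x^2 over the upper half of the unit circle: ∫_0^π cos²(t) dt = π/2
half_circle = line_integral_scalar(
    f=lambda x, y: x**2,
    r=lambda t: (np.cos(t), np.sin(t)),
    r_prime=lambda t: (-np.sin(t), np.cos(t)),
    a=0.0, b=np.pi,
)
print(circumference, half_circle)
```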

Testing conservativeness via finite differences:

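def is_conservative(F, domain, grid_size=50, tol=1e-6):
    """Check ∂P/∂y ≈ ∂Q/∂x on a grid using central differences.

    This tests the closedness condition only. It implies conservativeness
    when the domain is simply connected, and F must be defined at every
    grid point (and within h of it).
    """
    h = 1e-5  # near eps**(1/3), which balances truncation and round-off error
    xs = np.linspace(*domain[0], grid_size)
    ys = np.linspace(*domain[1], grid_size)
    max_dev = 0.0
    for x in xs:
        for y in ys:
            dP_dy = (F(x, y + h)[0] - F(x, y - h)[0]) / (2 * h)
            dQ_dx = (F(x + h, y)[1] - F(x - h, y)[1]) / (2 * h)
            max_dev = max(max_dev, abs(dQ_dx - dP_dy))
    return max_dev < tol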
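
A pointwise derivative test of this kind cannot detect holes in the domain. The sketch below illustrates this with the vortex field $\mathbf{F} = \left(\frac{-y}{x^2+y^2}, \frac{x}{x^2+y^2}\right)$ on an annulus around the origin: the finite-difference curl is numerically zero at every sample point, yet the circulation around the unit circle is $2\pi$. The helper `curl_fd` and the sampling grid are assumptions made for this illustration.

```python
import numpy as np
from scipy.integrate import quad

def vortex(x, y):
    """Vortex field, defined everywhere except the origin."""
    r2 = x**2 + y**2
    return (-y / r2, x / r2)

def curl_fd(F, x, y, h=1e-5):
    """Central-difference estimate of ∂Q/∂x - ∂P/∂y at (x, y)."""
    dQ_dx = (F(x + h, y)[1] - F(x - h, y)[1]) / (2 * h)
    dP_dy = (F(x, y + h)[0] - F(x, y - h)[0]) / (2 * h)
    return dQ_dx - dP_dy

# Sample the annulus 0.5 <= r <= 2, which avoids the singularity at the origin
radii = np.linspace(0.5, 2.0, 10)
angles = np.linspace(0.0, 2.0 * np.pi, 24, endpoint=False)
max_curl = max(
    abs(curl_fd(vortex, r * np.cos(a), r * np.sin(a)))
    for r in radii for a in angles
)

def integrand(t):
    # F(r(t)) · r'(t) on the counterclockwise unit circle
    Fx, Fy = vortex(np.cos(t), np.sin(t))
    return Fx * (-np.sin(t)) + Fy * np.cos(t)

circulation, _ = quad(integrand, 0.0, 2.0 * np.pi)
print(f"max |curl| on annulus: {max_curl:.2e}")
print(f"circulation around unit circle: {circulation:.6f}")
```

The field passes the curl test on the punctured plane but fails the zero-circulation test, so a grid check should be read as evidence of conservativeness only when the domain is known to be simply connected.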

Recovering a potential function:

def find_potential(F, x, y):
    """Recover φ(x, y), normalized so φ(0, 0) = 0, along an L-shaped path.

    Assumes F is conservative and that the path (0,0) → (x,0) → (x,y)
    stays inside the domain of F.
    """
    # Horizontal leg: ∫₀ˣ P(s, 0) ds
    phi_x, _ = quad(lambda s: F(s, 0)[0], 0, x)
    # Vertical leg: ∫₀ʸ Q(x, s) ds
    phi_y, _ = quad(lambda s: F(x, s)[1], 0, y)
    return phi_x + phi_y
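
One way to sanity-check a recovered potential is to start from a field whose potential is known. The sketch below assumes the example $f(x, y) = x^2 y + \sin y$, so $\mathbf{F} = \nabla f = (2xy,\; x^2 + \cos y)$, and integrates along two different L-shaped paths from the origin. All function names are chosen for this illustration.

```python
import numpy as np
from scipy.integrate import quad

def F(x, y):
    """Gradient of the assumed potential f(x, y) = x^2 y + sin(y)."""
    return (2.0 * x * y, x**2 + np.cos(y))

def f_exact(x, y):
    return x**2 * y + np.sin(y)

def potential_horizontal_first(F, x, y):
    """Integrate along (0,0) → (x,0) → (x,y)."""
    leg1, _ = quad(lambda s: F(s, 0.0)[0], 0.0, x)
    leg2, _ = quad(lambda s: F(x, s)[1], 0.0, y)
    return leg1 + leg2

def potential_vertical_first(F, x, y):
    """Integrate along (0,0) → (0,y) → (x,y)."""
    leg1, _ = quad(lambda s: F(0.0, s)[1], 0.0, y)
    leg2, _ = quad(lambda s: F(s, y)[0], 0.0, x)
    return leg1 + leg2

points = [(1.0, 2.0), (-0.5, 0.7), (2.0, -1.5)]
errors_h = [abs(potential_horizontal_first(F, x, y) - f_exact(x, y)) for x, y in points]
errors_v = [abs(potential_vertical_first(F, x, y) - f_exact(x, y)) for x, y in points]
print(max(errors_h), max(errors_v))
```

Both paths reproduce the same potential up to quadrature error, which is path independence in action. For a non-conservative field the two paths would generally disagree.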

Verifying Green’s theorem numerically:

# Line integral around unit circle
circulation = line_integral_vector(
    F=lambda x, y: (-y, x),
    r=lambda t: (np.cos(t), np.sin(t)),
    r_prime=lambda t: (-np.sin(t), np.cos(t)),
    a=0, b=2 * np.pi
)  # → 2π

# Double integral of curl over unit disk
from scipy.integrate import dblquad
curl_integral, _ = dblquad(
    lambda y, x: 2,  # curl = 2 everywhere
    -1, 1,
    lambda x: -np.sqrt(1 - x**2),
    lambda x: np.sqrt(1 - x**2)
)  # → 2π
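
Green's theorem also gives the area formula $A = \tfrac{1}{2}\oint_C (x\,dy - y\,dx)$. The sketch below applies it to an ellipse with semi-axes 3 and 2, an example chosen here for illustration, and compares the result with the known area $\pi a b$. The helper name `area_by_green` is hypothetical.

```python
import numpy as np
from scipy.integrate import quad

def area_by_green(r, r_prime, a, b):
    """Area enclosed by a counterclockwise closed curve: ½ ∮ (x dy - y dx)."""
    def integrand(t):
        x, y = r(t)
        dx, dy = r_prime(t)
        return 0.5 * (x * dy - y * dx)
    area, _ = quad(integrand, a, b)
    return area

semi_a, semi_b = 3.0, 2.0  # assumed semi-axes of the example ellipse
ellipse_area = area_by_green(
    r=lambda t: (semi_a * np.cos(t), semi_b * np.sin(t)),
    r_prime=lambda t: (-semi_a * np.sin(t), semi_b * np.cos(t)),
    a=0.0, b=2.0 * np.pi,
)
print(ellipse_area, np.pi * semi_a * semi_b)
```

Reversing the orientation of the curve flips the sign of the result, so the formula assumes a counterclockwise parameterization.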

10. Connections to ML

Line integrals appear in machine learning in three distinct ways. These are not afterthoughts — they are the mathematical backbone of how optimization, energy models, and natural gradients work.

10.1 Gradient Flow as Continuous-Time Gradient Descent

The ODE $\dot{\theta}(t) = -\nabla L(\theta(t))$ defines a curve $\theta(t)$ in parameter space. The total loss change along this curve is:

$$L(\theta(T)) - L(\theta(0)) = \int_0^T \nabla L(\theta(t)) \cdot \dot{\theta}(t)\,dt = -\int_0^T \|\nabla L(\theta(t))\|^2\,dt \le 0.$$

The first equality is the Gradient Theorem (Theorem 2) applied to $f = L$ along the curve $\theta(t)$: the chain rule gives $\frac{d}{dt}L(\theta(t)) = \nabla L(\theta(t)) \cdot \dot{\theta}(t)$, and integrating from $0$ to $T$ expresses the loss difference as a line integral of $\nabla L$. The second equality substitutes $\dot{\theta} = -\nabla L$. The integral $\int_0^T \|\nabla L\|^2\,dt$ is the accumulated squared gradient norm along the path, and it equals the total loss decrease exactly.

Discrete gradient descent $\theta_{t+1} = \theta_t - \eta\nabla L(\theta_t)$ is the forward Euler discretization of this flow with time step $\eta$. The step size $\eta$ controls how closely the discrete path follows the continuous one. When $\eta$ is small, the discrete path stays near the continuous flow, and convergence analysis borrows from the continuous theory.
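
The dissipation identity and the link to discrete gradient descent can be checked numerically. The sketch below assumes a simple quadratic loss $L(\theta) = \tfrac{1}{2}(\theta_1^2 + 10\,\theta_2^2)$, chosen only for illustration. It integrates the flow with `scipy.integrate.solve_ivp`, accumulating $\int_0^T \|\nabla L\|^2\,dt$ as an extra state variable, and then runs gradient descent with a small step size over the same time horizon.

```python
import numpy as np
from scipy.integrate import solve_ivp

def loss(theta):
    """Assumed example loss: an anisotropic quadratic bowl."""
    return 0.5 * (theta[0]**2 + 10.0 * theta[1]**2)

def grad(theta):
    return np.array([theta[0], 10.0 * theta[1]])

def flow_rhs(t, state):
    """State = (theta_1, theta_2, s), where s accumulates ∫ ‖∇L‖² dt."""
    g = grad(state[:2])
    return np.concatenate([-g, [g @ g]])

theta0 = np.array([2.0, 1.0])
T = 2.0
sol = solve_ivp(flow_rhs, (0.0, T), np.concatenate([theta0, [0.0]]),
                rtol=1e-10, atol=1e-12)
theta_T = sol.y[:2, -1]
dissipation = sol.y[2, -1]
loss_change = loss(theta_T) - loss(theta0)

# Forward Euler (gradient descent) with step eta over the same horizon T
eta = 0.01
theta = theta0.copy()
for _ in range(int(round(T / eta))):
    theta = theta - eta * grad(theta)
gap = np.linalg.norm(theta - theta_T)

print(f"loss change:        {loss_change:.8f}")
print(f"-∫‖∇L‖² dt:         {-dissipation:.8f}")
print(f"GD vs flow endpoint gap: {gap:.2e}")
```

The loss change and the negative dissipation integral agree up to the ODE solver tolerance, and the gap between the gradient descent iterate and the flow endpoint shrinks as `eta` is reduced.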

Gradient Descent on formalML

10.2 Energy-Based Models

An energy-based model defines a scalar potential $E(\mathbf{x}; \theta)$ over input space. The negative gradient $-\nabla_{\mathbf{x}} E$ pushes inputs toward low-energy configurations. The dynamics $\dot{\mathbf{x}} = -\nabla_{\mathbf{x}} E$ are a gradient flow in input space. The driving field is conservative, so the line integral of $\nabla_{\mathbf{x}} E$ along any path equals the energy change $E(\mathbf{x}_{\text{final}}) - E(\mathbf{x}_{\text{init}})$, independent of the path taken; equivalently, the work done by the force $-\nabla_{\mathbf{x}} E$ is the energy drop $E(\mathbf{x}_{\text{init}}) - E(\mathbf{x}_{\text{final}})$. Hopfield networks, Boltzmann machines, and score-based diffusion models all define energy landscapes whose gradient fields govern inference and generation.
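
Path independence for an energy gradient can be verified directly. The sketch below assumes a toy double-well energy $E(\mathbf{x}) = (x_1^2 - 1)^2 + x_2^2$, which is not taken from any particular model, and integrates $\nabla E$ along a straight path and a curved path between the same endpoints.

```python
import numpy as np
from scipy.integrate import quad

def energy(x):
    """Assumed toy energy: a double well with minima at (±1, 0)."""
    return (x[0]**2 - 1.0)**2 + x[1]**2

def grad_energy(x):
    return np.array([4.0 * x[0] * (x[0]**2 - 1.0), 2.0 * x[1]])

def line_integral(field, path, path_prime):
    """∫_0^1 field(path(t)) · path'(t) dt."""
    value, _ = quad(lambda t: field(path(t)) @ path_prime(t), 0.0, 1.0)
    return value

start = np.array([-1.5, 0.5])
end = np.array([1.0, 0.0])
bump = np.array([0.0, 2.0])  # sideways detour used by the curved path

def straight(t):
    return start + t * (end - start)

def straight_prime(t):
    return end - start

def curved(t):
    return start + t * (end - start) + np.sin(np.pi * t) * bump

def curved_prime(t):
    return (end - start) + np.pi * np.cos(np.pi * t) * bump

integral_straight = line_integral(grad_energy, straight, straight_prime)
integral_curved = line_integral(grad_energy, curved, curved_prime)
energy_change = energy(end) - energy(start)
print(integral_straight, integral_curved, energy_change)
```

All three printed values coincide up to quadrature error, as the Gradient Theorem predicts.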

10.3 Natural Gradient & Geodesic Paths

Standard gradient descent follows the direction $-\nabla L$ in Euclidean parameter space. The natural gradient follows $-I(\theta)^{-1}\nabla L$, where $I(\theta)$ is the Fisher information matrix. This corresponds to steepest descent in the Fisher-Rao metric on the statistical manifold: the direction that maximally decreases the loss per unit of statistical distance.
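
A one-parameter example makes the difference concrete. The sketch below assumes a Bernoulli model with parameter $p$, whose Fisher information is $I(p) = 1/(p(1-p))$, and the average negative log-likelihood of data with empirical mean `p_hat`. The numbers are chosen only for illustration.

```python
import math

def nll(p, p_hat):
    """Average negative log-likelihood of Bernoulli(p) for empirical mean p_hat."""
    return -(p_hat * math.log(p) + (1.0 - p_hat) * math.log(1.0 - p))

def grad_nll(p, p_hat):
    """d/dp of the average negative log-likelihood."""
    return (p - p_hat) / (p * (1.0 - p))

def fisher(p):
    """Fisher information of the Bernoulli family."""
    return 1.0 / (p * (1.0 - p))

p, p_hat, eta = 0.2, 0.7, 1.0

euclidean_update = p - eta * grad_nll(p, p_hat)
natural_update = p - eta * grad_nll(p, p_hat) / fisher(p)

print("Euclidean update:", euclidean_update)
print("natural update:  ", natural_update)
```

In this example the preconditioning by $I(p)^{-1}$ cancels the $1/(p(1-p))$ factor in the gradient, so the natural gradient step with $\eta = 1$ lands on the maximum-likelihood estimate `p_hat`, while the Euclidean step of the same size leaves the interval $(0, 1)$.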

The length of a curve $\theta(t)$ in the Fisher-Rao metric is:

$$\int_a^b \sqrt{\dot{\theta}(t)^T I(\theta(t))\, \dot{\theta}(t)}\,dt$$

This is a scalar line integral (Definition 3) with the arc length element of the Fisher-Rao metric replacing the Euclidean one. Geodesics are curves that minimize this length integral — the calculus of variations provides the Euler-Lagrange equation for finding them.
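
For a one-parameter family the length integral can be evaluated directly. The sketch below assumes the Bernoulli family with $I(p) = 1/(p(1-p))$, for which the integral has the closed form $2\,|\arcsin\sqrt{p_1} - \arcsin\sqrt{p_0}|$. It computes the length of the segment from $p_0$ to $p_1$ under two different parameterizations. The helper names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad

def fisher_bernoulli(p):
    """Fisher information of the Bernoulli family."""
    return 1.0 / (p * (1.0 - p))

def fisher_rao_length(theta, theta_dot, a, b):
    """∫_a^b sqrt(theta'(t)^2 · I(theta(t))) dt for a one-parameter family."""
    integrand = lambda t: np.sqrt(theta_dot(t)**2 * fisher_bernoulli(theta(t)))
    length, _ = quad(integrand, a, b)
    return length

p0, p1 = 0.1, 0.9

# Constant-speed parameterization of the segment from p0 to p1
length_linear = fisher_rao_length(
    theta=lambda t: p0 + t * (p1 - p0),
    theta_dot=lambda t: p1 - p0,
    a=0.0, b=1.0,
)

# Same segment traversed with a different speed profile
length_quadratic = fisher_rao_length(
    theta=lambda t: p0 + t**2 * (p1 - p0),
    theta_dot=lambda t: 2.0 * t * (p1 - p0),
    a=0.0, b=1.0,
)

closed_form = 2.0 * (np.arcsin(np.sqrt(p1)) - np.arcsin(np.sqrt(p0)))
print(length_linear, length_quadratic, closed_form)
```

The two parameterizations give the same length, which is the parameterization independence of scalar line integrals carried over to the Fisher-Rao metric.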

Information Geometry on formalML

Four-panel: gradient flow path, energy-based model landscape, natural gradient vs. Euclidean gradient, discrete vs. continuous paths

11. Connections & Further Reading

Prerequisites Used

  • Multiple Integrals & Fubini’s Theorem: Green’s theorem converts line integrals to double integrals using the Fubini machinery.
  • The Gradient & Directional Derivatives: The Gradient Theorem is $\int_C \nabla f \cdot d\mathbf{r} = f(\mathbf{b}) - f(\mathbf{a})$. Conservative fields are gradient fields.
  • The Derivative & the Chain Rule: The chain rule $\frac{d}{dt}f(\mathbf{r}(t)) = \nabla f \cdot \mathbf{r}'(t)$ is the key step in the Gradient Theorem proof.
  • The Jacobian & Multivariate Chain Rule: Parameterization independence of line integrals follows from the substitution rule, the one-dimensional case of the Jacobian change-of-variables formula.
  • The Riemann Integral: After parameterization, every line integral is a 1D Riemann integral.
  • Epsilon-Delta Continuity: Continuity of $\mathbf{F}$ along the curve is the hypothesis that makes the integral well-defined.
  • Completeness & Compactness: Compactness of the curve image $\mathbf{r}([a,b])$ ensures $\mathbf{F}$ is bounded on $C$.

What Comes Next

  • Surface Integrals & the Divergence Theorem: Stokes’ theorem generalizes Green’s theorem from 2D to 3D, $\oint_C \mathbf{F} \cdot d\mathbf{r} = \iint_S (\nabla \times \mathbf{F}) \cdot d\mathbf{S}$. The divergence theorem relates surface integrals to volume integrals.
  • First-Order ODEs & Existence Theorems: A differential equation $M\,dx + N\,dy = 0$ is exact when $M_y = N_x$ (on a simply connected domain), the same criterion as for conservative fields. Solving an exact equation amounts to finding a potential function, and the integrating factor technique rescales a non-exact equation into an exact one.
  • Metric Spaces & Topology: The topological vocabulary (open sets, continuity via preimages, homeomorphism, fundamental groups) underlying simply connected vs. non-simply-connected domains.
  • Calculus of Variations: Functionals $J[\gamma] = \int_a^b L(\gamma, \gamma', t)\,dt$ assign a number to each path $\gamma$, generalizing line integrals. Extremal paths satisfy the Euler-Lagrange equation.
  • Gradient Descent: Gradient flow as continuous-time optimization.
  • Smooth Manifolds: Differential 1-forms, closed vs. exact, de Rham cohomology.
  • Information Geometry: Fisher-Rao metric, natural gradient, geodesics on the statistical manifold.
