Training Courses for Graduate Students

Basic Mathematics for Machine Learning





Date: 2021-08-07
Presented by Hao-Ting Li (李皓庭)

Outline

  • Linear Algebra
  • Calculus
  • Probability and Information Theory
  • Optimization

Linear Algebra

  • Scalars, vectors, matrices and tensors
  • Identity, Transpose, and Inverse
  • Inner product, element-wise product, and matrix product
  • Transformations
  • Norms
  • References

Scalars, Vectors, Matrices, and Tensors

  • A scalar is just a single number.
    • s \in \mathbb{R}
  • A vector is an array of numbers.
    • \mathbf{v} = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}, \mathbf{v} \in \mathbb{R}^{n}
    • shape: (n,)
  • A matrix is a 2-D array of numbers.
    • \mathbf{M} = \begin{bmatrix} M_{1,1} & \cdots & M_{1, n} \\ \vdots & \ddots & \vdots \\ M_{m, 1} & \cdots & M_{m, n} \end{bmatrix}, \mathbf{M} \in \mathbb{R}^{m \times n}
    • shape: (m, n)
  • A tensor is a multi-dimensional array, denoted e.g. \mathsf{T} (a minimal NumPy sketch follows this list).
    • Scalars are 0th-order tensors.
    • Vectors are 1st-order tensors.
    • Matrices are 2nd-order tensors.
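
As a quick illustration, here is a minimal NumPy sketch (assuming NumPy is available; the values and shapes are arbitrary examples) of these objects and their shapes:

    import numpy as np

    s = 3.0                                # scalar: a single number
    v = np.array([1.0, 2.0, 3.0])          # vector, shape (3,)
    M = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])        # matrix, shape (2, 3)
    T = np.zeros((2, 3, 4))                # 3rd-order tensor, shape (2, 3, 4)

    print(v.shape, M.shape, T.shape)       # (3,) (2, 3) (2, 3, 4)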

Identity, Transpose, and Inverse

  • The identity matrix of size n is the n \times n square matrix with 1 on the main diagonal and 0 elsewhere.

    \mathbf{I}_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}

    • \mathbf{I}_n \mathbf{A}_n = \mathbf{A}_n \mathbf{I}_n = \mathbf{A}_n
  • The transpose of a matrix is an operator which flips a matrix over its diagonal.

(\mathbf{A}^\top)_{i,j} = \mathbf{A}_{j,i}

  • The inverse of a matrix \mathbf{A} \in \mathbb{R}^{n \times n}, denoted \mathbf{A}^{-1}, is defined as the matrix such that

    \mathbf{A} \mathbf{A}^{-1} = \mathbf{A}^{-1} \mathbf{A} = \mathbf{I}_n
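
A minimal NumPy sketch of these three operations (assuming NumPy; the matrix A below is an arbitrary invertible example):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])

    I = np.eye(2)                     # identity matrix I_2
    print(np.allclose(I @ A, A))      # I A = A  -> True
    print(A.T)                        # transpose: (A^T)_{i,j} = A_{j,i}

    A_inv = np.linalg.inv(A)          # inverse matrix
    print(np.allclose(A @ A_inv, I))  # A A^{-1} = I  -> True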

Inner Product, Element-wise Product, and Matrix Product

  • The inner product (or dot product) of two vectors \mathbf{a} \in \mathbb{R}^{n} and \mathbf{b} \in \mathbb{R}^{n} is defined as:

\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n

  • The element-wise product (or Hadamard product) of two matrices (or vectors) is defined as:

    \mathbf{C} = \mathbf{A} \odot \mathbf{B}

    • C_{i,j} = A_{i,j} B_{i,j}
  • The matrix product of two matrices \mathbf{A} \in \mathbb{R}^{m \times p} and \mathbf{B} \in \mathbb{R}^{p \times n} is defined as:

    \mathbf{C} = \mathbf{A} \mathbf{B}

    • \mathbf{C} \in \mathbb{R}^{m \times n}
    • C_{i,j} = \sum_{k} A_{i,k} B_{k,j}
      • each entry C_{i,j} can be seen as the dot product of the i-th row of \mathbf{A} and the j-th column of \mathbf{B} (see the NumPy sketch after this list)
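
The three products map directly onto NumPy operations; a minimal sketch (assuming NumPy, with small arbitrary example arrays):

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 5.0, 6.0])
    print(np.dot(a, b))        # inner product: 1*4 + 2*5 + 3*6 = 32.0

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    B = np.array([[5.0, 6.0],
                  [7.0, 8.0]])
    print(A * B)               # element-wise (Hadamard) product
    print(A @ B)               # matrix product, C_{i,j} = sum_k A_{i,k} B_{k,j}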

Linear Transformation

Linear transformation (or linear mapping) is a mapping T: \mathbf{V} \rightarrow \mathbf{W} between two vector spaces that preserves the operations of vector addition and scalar multiplication.

  • Additivity: T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})
  • Scalar multiplication: T(c \mathbf{u}) = c T(\mathbf{u})

Linear transformations can be represented by matrices. If T: \mathbb{R}^{n} \rightarrow \mathbb{R}^{m} is a linear mapping and \mathbf{x} \in \mathbb{R}^{n} is a column vector, then

T(\mathbf{x}) = \mathbf{A}\mathbf{x}

for some matrix \mathbf{A} \in \mathbb{R}^{m \times n}, called the transformation matrix of T.
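
For instance, a 2-D rotation is a linear transformation. The sketch below (assuming NumPy; the angle is an arbitrary example) builds its transformation matrix and applies it as T(x) = Ax:

    import numpy as np

    theta = np.pi / 2                                 # rotate by 90 degrees
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # transformation matrix

    x = np.array([1.0, 0.0])
    y = A @ x                                         # T(x) = A x
    print(np.round(y, 6))                             # [0. 1.]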

Affine Transformation

Affine transformation is the composition of a linear transformation (with matrix \mathbf{A}) and a translation (by vector \mathbf{b}). It can be represented as:

\mathbf{y} = T(\mathbf{x}) = \mathbf{A} \mathbf{x} + \mathbf{b}

It can also be represented in homogeneous coordinates as:

\begin{bmatrix} \mathbf{y} \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{A} & \mathbf{b} \\ \mathbf{0} & 1 \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ 1 \end{bmatrix}
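
A minimal sketch (assuming NumPy; A and b are arbitrary examples) of the same affine map computed both directly and via homogeneous coordinates:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [0.0, 1.0]])          # linear part (an arbitrary shear)
    b = np.array([3.0, 4.0])            # translation vector
    x = np.array([1.0, 1.0])

    y_direct = A @ x + b                # y = A x + b

    H = np.block([[A, b[:, None]],                       # [[A, b],
                  [np.zeros((1, 2)), np.ones((1, 1))]])  #  [0, 1]]
    y_homog = (H @ np.append(x, 1.0))[:2]

    print(y_direct, y_homog)            # both give [6. 5.]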

Affine Transformation Examples in 2-D

  • Stretching
  • Squeezing
  • Rotation
  • Shearing
  • Reflection
  • Orthogonal projection

Illustration of the effect of applying various 2D affine transformation matrices on a unit square. Image by Cmglee, licensed under CC BY-SA 3.0.

Norms

p-norm: let p \ge 1 be a real number. The p-norm (also called \ell_p-norm) of a vector \mathbf{x} = [x_1, \ldots, x_n] is defined as:

\| \mathbf{x} \|_p \coloneqq \left( \sum_{i=1}^{n} | x_i |^p \right)^{1/p}

  • Taxicab norm or Manhattan norm: also called \ell_{1}-norm.

    \| \mathbf{x} \|_1 = \sum_{i=1}^{n} | x_i |

  • Euclidean norm: also called \ell_{2}-norm.

    \| \mathbf{x} \|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}

Norms

  • Maximum norm (a special case of the infinity norm, also called uniform norm or supremum norm):

\| \mathbf{x} \|_{\infty} = \max \left( |x_1|, \ldots, |x_n| \right)

  • \ell_0 "norm": defining 0^0 = 0, the zero "norm" counts the number of non-zero entries of a vector (see the NumPy sketch below).

\| \mathbf{x} \|_0 = |x_1|^0 + |x_2|^0 + \cdots + |x_n|^0
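
These norms are available through np.linalg.norm; a minimal sketch (assuming NumPy, with an arbitrary example vector):

    import numpy as np

    x = np.array([3.0, -4.0, 0.0])

    print(np.linalg.norm(x, 1))       # l1 norm: |3| + |-4| + |0| = 7.0
    print(np.linalg.norm(x, 2))       # l2 (Euclidean) norm: 5.0
    print(np.linalg.norm(x, np.inf))  # maximum norm: 4.0
    print(np.count_nonzero(x))        # l0 "norm": number of non-zero entries = 2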

References

Calculus

  • Derivative and differentiation
  • Chain rule
  • Gradient
  • Automatic differentiation

Derivative and Differentiation

Crowell and Slesnick's Calculus with Analytic Geometry. Copyright (C) 2008 Peter G. Doyle

Derivative and Differentiation

The slope of the tangent line to the graph of f at P can be expressed as the limit of the slope of the secant line L_t:

\lim_{t \rightarrow 0} m(P, Q(t)) = \lim_{t \rightarrow 0} \frac{f(a+t)-f(a)}{t}

If the limit exists, then f is differentiable at a.

The derivative of f at a:

f'(a) \coloneqq \lim_{t \rightarrow 0} \frac{f(a+t)-f(a)}{t}

Derivative rules

Notations

  • Lagrange's notation: f'
  • Leibniz's notation: \frac{df}{dx}
  • Newton's notation: \dot{f}
  • Euler's notation: D_x f(x)

Chain Rule

If a variable y depends on the variable u, which itself depends on the variable x, then y depends on x as well, via the intermediate variable u. The derivative is:

\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}

e.g. Consider the function:

y = e^{-x}

It can be decomposed as:

y = e^{u}, \quad u = -x

And the derivatives are:

\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = e^{u} \cdot (-1) = -e^{-x}
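
The same decomposition can be checked symbolically; a minimal SymPy sketch (assuming SymPy is installed; the symbol names are arbitrary):

    import sympy as sp

    x, u = sp.symbols('x u')

    y_outer = sp.exp(u)                      # y = e^u
    u_inner = -x                             # u = -x

    dy_du = sp.diff(y_outer, u)              # e^u
    du_dx = sp.diff(u_inner, x)              # -1
    dy_dx = dy_du.subs(u, u_inner) * du_dx   # chain rule: -e^{-x}

    print(dy_dx)                             # -exp(-x)
    print(sp.diff(sp.exp(-x), x))            # same result computed directly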

Gradient

For a scalar-valued differentiable function f: \mathbb{R}^{n} \rightarrow \mathbb{R}, its gradient
\nabla f: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n} is defined at the point p = (x_1, \ldots, x_n) in n-dimensional space as the vector:

\nabla f(p) = \left\langle \frac{\partial f}{\partial x_1}(p), \ldots, \frac{\partial f}{\partial x_n}(p) \right\rangle.

e.g. Consider the function f(x_1, x_2) = x_1^2 + 2x_2; the gradient function is:

\nabla f(p) = \left\langle \frac{\partial f}{\partial x_1}(p), \frac{\partial f}{\partial x_2}(p) \right\rangle = \langle 2x_1, 2 \rangle

At the point p = (0, 1), the gradient is:

\nabla f(p) = \langle 2 \cdot 0, 2 \rangle = \langle 0, 2 \rangle
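
A minimal sketch (assuming NumPy; the helper names are arbitrary) that evaluates this gradient and checks it against a finite-difference approximation at p = (0, 1):

    import numpy as np

    def f(x):
        return x[0] ** 2 + 2 * x[1]           # f(x1, x2) = x1^2 + 2 x2

    def grad_f(x):
        return np.array([2 * x[0], 2.0])      # analytic gradient <2 x1, 2>

    p = np.array([0.0, 1.0])
    print(grad_f(p))                          # [0. 2.]

    # finite-difference check (forward differences with a small step t)
    t = 1e-6
    approx = np.array([(f(p + t * e) - f(p)) / t for e in np.eye(2)])
    print(np.round(approx, 4))                # approximately [0. 2.]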

Computing Derivatives

To compute derivatives, there are some ways:

  • Numerical differentiation
  • Symbolic differentiation
  • Automatic differentiation

Numerical Differentiation

The derivative of f: \mathbb{R}^n \rightarrow \mathbb{R}:

\begin{aligned} \frac{\partial f(\mathbf{x})}{\partial x_i} & \coloneqq \lim_{t \rightarrow 0} \frac{f(\mathbf{x} + t \mathbf{e}_i) - f(\mathbf{x})}{t} \\ & \approx \frac{f(\mathbf{x} + t \mathbf{e}_i) - f(\mathbf{x})}{t} \quad \text{(choose a small } t \text{)} \end{aligned}

  • Approximate solution
  • Drawbacks
    • O(n) evaluations
    • Truncation error (if t is too large)
    • Rounding error (if t is too small)
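
A minimal sketch (assuming NumPy) of the forward-difference formula above, illustrating how the choice of t trades off the two error sources, using sin as an arbitrary test function:

    import numpy as np

    def forward_diff(f, x, t):
        return (f(x + t) - f(x)) / t          # finite-difference approximation

    x0 = 1.0
    exact = np.cos(x0)                        # true derivative of sin at x0

    for t in (1e-1, 1e-6, 1e-12):
        approx = forward_diff(np.sin, x0, t)
        print(f"t={t:.0e}  error={abs(approx - exact):.2e}")
    # large t -> truncation error dominates
    # tiny t  -> rounding (cancellation) error dominates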

Symbolic Differentiation

  • Computer Algebra Systems (CAS)
  • Exact solution
  • Drawbacks
    • Need to implement in CAS
    • Cannot handle functions that include control-flow statements
    • Expression swell, e.g.
      ([(x0, x**2),
        (x1, (3*x + x0 + 4)**2),
        (x2, (9*x + 3*x0 + x1 + 16)**2),
        (x3, (27*x + 9*x0 + 3*x1 + x2 + 52)**2),
        (x4, (81*x + 27*x0 + 9*x1 + 3*x2 + x3 + 160)**2)],
      [729*x + 243*x0 + 81*x1 + 27*x2 + 9*x3 + 3*x4 + (243*x + 81*x0 + 27*x1 + 9*x2 + 3*x3 + x4 + 484)**2 + 1456])
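
The nested listing above is typical of such swell. As a rough illustration, here is a hedged SymPy sketch (assuming SymPy; the repeatedly composed quadratic is an arbitrary choice, not the exact expression behind the listing) showing how a symbolic derivative blows up, with sympy.cse used to expose shared subexpressions:

    import sympy as sp

    x = sp.symbols('x')

    expr = x
    for _ in range(6):
        expr = expr ** 2 + 3 * expr + 4     # compose a simple quadratic repeatedly

    d = sp.diff(expr, x)                    # symbolic derivative: very large expression
    print(sp.count_ops(d))                  # operation count grows quickly

    replacements, reduced = sp.cse(d)       # common-subexpression elimination
    print(len(replacements))                # number of shared subexpressions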
      

Automatic Differentiation

Two modes:

  • Forward mode
  • Reverse mode

Three-part Notation

A function f: \mathbb{R}^{n} \rightarrow \mathbb{R}^{m} is constructed using intermediate variables v_i such that

\begin{cases} v_{i-n} = x_i, & i = 1, \ldots, n & \text{are the input variables} \\ v_i, & i = 1, \ldots, l & \text{are the working (intermediate) variables} \\ y_{m-i} = v_{l-i}, & i = m-1, \ldots, 0 & \text{are the output variables} \end{cases}

Forward Mode

For computing the derivative of f with respect to x_1, we start by associating with each intermediate variable v_i a derivative

\dot{v}_i = \frac{\partial v_i}{\partial x_1}

Example

Consider the evaluation trace of the function

y = f(x_1, x_2) = \ln(x_1) + x_1 x_2 - \sin(x_2)

Baydin, A. G., Pearlmutter, B. A., Radul, A. A., & Siskind, J. M. (2018). Automatic Differentiation in Machine Learning: a Survey. Journal of Machine Learning Research, 18, 1-43.
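
A minimal Python sketch of this forward-mode trace (the variable numbering is simplified relative to the v_{-1}, v_0, ... convention in Baydin et al., and the evaluation point (x1, x2) = (2, 5) is an assumed example), computing y and the tangent dy/dx1 in one pass:

    import math

    x1, x2 = 2.0, 5.0
    dx1, dx2 = 1.0, 0.0                 # tangents: differentiate w.r.t. x1

    # primal trace                      # tangent trace (dot variables)
    v1 = math.log(x1);   dv1 = dx1 / x1
    v2 = x1 * x2;        dv2 = dx1 * x2 + x1 * dx2
    v3 = math.sin(x2);   dv3 = math.cos(x2) * dx2
    v4 = v1 + v2;        dv4 = dv1 + dv2
    v5 = v4 - v3;        dv5 = dv4 - dv3

    y, dy_dx1 = v5, dv5
    print(y)                            # ln(2) + 10 - sin(5)
    print(dy_dx1)                       # 1/x1 + x2 = 5.5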

Computation Graph

Baydin, A. G., Pearlmutter, B. A., Radul, A. A., & Siskind, J. M. (2018). Automatic Differentiation in Machine Learning: a Survey. Journal of Machine Learning Research, 18, 1-43.

Efficiency

For a function f: \mathbb{R}^{n} \rightarrow \mathbb{R}^{m}, the forward mode costs n evaluations (one pass per input variable).

  • It is efficient when n \ll m.

Reverse Mode

The reverse mode propagates derivatives backward from a given output by complementing each intermediate variable v_i with an adjoint

\bar{v}_i = \frac{\partial y_j}{\partial v_i}

e.g., the variable v_0 can affect y through affecting v_2 and v_3, so its contribution to the change in y is given by

\frac{\partial y}{\partial v_{0}} = \frac{\partial y}{\partial v_{2}} \frac{\partial v_{2}}{\partial v_{0}} + \frac{\partial y}{\partial v_{3}} \frac{\partial v_{3}}{\partial v_{0}} \quad \text{or} \quad \bar{v}_{0} = \bar{v}_{2} \frac{\partial v_{2}}{\partial v_{0}} + \bar{v}_{3} \frac{\partial v_{3}}{\partial v_{0}}.

Example

Consider the evaluation trace of the function

y = f(x_1, x_2) = \ln(x_1) + x_1 x_2 - \sin(x_2)
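
A minimal Python sketch of the reverse-mode computation for this function (again at the assumed point (x1, x2) = (2, 5), with simplified variable numbering): a forward sweep records the intermediate values, then a backward sweep accumulates the adjoints of both inputs in a single pass:

    import math

    x1, x2 = 2.0, 5.0

    # forward sweep: record intermediate values
    v1 = math.log(x1)
    v2 = x1 * x2
    v3 = math.sin(x2)
    v4 = v1 + v2
    v5 = v4 - v3        # y

    # backward sweep: adjoints bar(v) = dy/dv
    bv5 = 1.0
    bv4 = bv5           # v5 = v4 - v3
    bv3 = -bv5
    bv1 = bv4           # v4 = v1 + v2
    bv2 = bv4
    bx2 = bv2 * x1 + bv3 * math.cos(x2)   # v2 = x1*x2, v3 = sin(x2)
    bx1 = bv1 / x1 + bv2 * x2             # v1 = ln(x1), v2 = x1*x2

    print(bx1, bx2)     # dy/dx1 = 1/x1 + x2 = 5.5, dy/dx2 = x1 - cos(x2)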

Efficiency

For a function f: \mathbb{R}^{n} \rightarrow \mathbb{R}^{m}, the reverse mode costs m evaluations (one backward pass per output variable).

  • It is efficient when m \ll n.
  • The storage cost grows in proportion to the number of operations in the evaluated function.

References