Supervised Learning
Given a set of data points {x(1),…,x(m)} associated to a set of outcomes {y(1),…,y(m)}, we want to build a classifier fθ that learns how to predict y from x.
- y≈f(x;θ)
- y^=f(x;θ) and y^≈y
- Loss function: L(y,y^)
- Optimization: minθL(y,y^)