Through algorithms, machines can learn rules from large amounts of data and use them to make decisions on new samples.
Four Elements in Machine Learning
- Data
- Model
- Learning Rule
- Optimization Algorithm
Learning Rules
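A good model $f(x; \theta^*)$ should be consistent with the real mapping function for every input $x$ and its true label $y$, up to a small tolerance $\epsilon$:
$$|f(x; \theta^*) - y| < \epsilon, \quad \forall (x, y) \in \mathcal{X} \times \mathcal{Y}$$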
Loss Function
A loss function is a non-negative real-valued function that quantifies the difference between the model's prediction and the true label.
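For example, the quadratic loss function, where $y$ is the true label and $f(x; \theta)$ is the model's prediction:
$$\mathcal{L}(y, f(x; \theta)) = \frac{1}{2}\,(y - f(x; \theta))^2$$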
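As a minimal sketch, the quadratic loss can be written as a small Python function. The factor of 1/2 is a common convention assumed here (it simplifies the gradient), and the name `quadratic_loss` is illustrative only.

```python
def quadratic_loss(y_true: float, y_pred: float) -> float:
    """Half the squared difference between the true label and the prediction."""
    return 0.5 * (y_true - y_pred) ** 2


# The loss is zero for a perfect prediction and grows with the error.
print(quadratic_loss(3.0, 3.0))  # 0.0
print(quadratic_loss(3.0, 1.0))  # 2.0
```

The loss is symmetric: over-predicting and under-predicting by the same amount are penalized equally.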
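Empirical Risk Minimization
After selecting an appropriate risk function, we look for a parameter $\theta^*$ that minimizes the empirical risk $\mathcal{R}^{emp}_{\mathcal{D}}(\theta)$ on the training set $\mathcal{D}$:
$$\theta^* = \arg\min_{\theta} \mathcal{R}^{emp}_{\mathcal{D}}(\theta)$$
The machine learning problem is thus transformed into an optimization problem.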
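To make the optimization view concrete, here is a small sketch that treats learning as a search for the parameter with the lowest average training loss. The one-parameter model `f(x) = theta * x`, the toy data, and the candidate grid are all assumptions made for illustration. A real optimizer would not use a grid, but it makes the "argmin" explicit.

```python
def empirical_risk(theta: float, data: list[tuple[float, float]]) -> float:
    """Average quadratic loss of the model f(x) = theta * x on the data."""
    return sum(0.5 * (y - theta * x) ** 2 for x, y in data) / len(data)


# Hypothetical training set, roughly following y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

# Candidate parameters 0.00, 0.01, ..., 4.00.
candidates = [i / 100 for i in range(0, 401)]

# Learning as optimization: pick the candidate with the lowest empirical risk.
theta_star = min(candidates, key=lambda t: empirical_risk(t, data))
print(theta_star, empirical_risk(theta_star, data))
```

For this model the exact minimizer is `sum(x*y) / sum(x*x)`, about 2.036, so the grid search should return the nearest grid point.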
Expected Risk
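Expected risk (true risk):
$$\mathcal{R}(\theta) = \mathbb{E}_{(x, y) \sim p_r(x, y)}\left[\mathcal{L}(y, f(x; \theta))\right]$$
$p_r(x, y)$: the real data distribution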
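Empirical Risk
The expected risk cannot be computed because the real data distribution is unknown, so it is approximated by the empirical risk, the average loss over the $N$ training samples:
$$\mathcal{R}^{emp}_{\mathcal{D}}(\theta) = \frac{1}{N} \sum_{n=1}^{N} \mathcal{L}(y^{(n)}, f(x^{(n)}; \theta))$$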
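The following sketch illustrates the approximation on a toy problem where, unlike in practice, the data distribution is known, so the expected risk has a closed form to compare against. The distribution (x uniform on [0, 1], y = 2x plus Gaussian noise with standard deviation 0.5), the model `f(x) = theta * x`, and all function names are assumptions for illustration.

```python
import random


def empirical_risk(theta: float, samples: list[tuple[float, float]]) -> float:
    """Average quadratic loss of f(x) = theta * x over the given samples."""
    return sum(0.5 * (y - theta * x) ** 2 for x, y in samples) / len(samples)


def draw_samples(n: int, rng: random.Random) -> list[tuple[float, float]]:
    """Toy distribution (assumed): x ~ U(0, 1), y = 2x + noise, noise ~ N(0, 0.5^2)."""
    samples = []
    for _ in range(n):
        x = rng.random()
        samples.append((x, 2.0 * x + rng.gauss(0.0, 0.5)))
    return samples


def expected_risk(theta: float) -> float:
    """Closed form for the toy distribution: 0.5 * ((2 - theta)^2 * E[x^2] + sigma^2)."""
    return 0.5 * ((2.0 - theta) ** 2 / 3.0 + 0.5 ** 2)


rng = random.Random(0)
theta = 1.0
for n in (10, 1_000, 100_000):
    # The empirical risk should approach the expected risk as n grows.
    print(n, empirical_risk(theta, draw_samples(n, rng)), expected_risk(theta))
```

With few samples the empirical risk can be far from the expected risk. The estimate tightens as the sample size grows.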
Stochastic Gradient Descent
SGD: sample one training example in each iteration and update the parameters using the gradient of the loss on that example.
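A minimal sketch of this update loop, assuming a one-parameter model `f(x) = theta * x` with quadratic loss. The learning rate, step count, and noise-free toy data are illustrative choices, not values from the notes.

```python
import random


def sgd(data: list[tuple[float, float]], lr: float = 0.1,
        steps: int = 500, seed: int = 0) -> float:
    """Fit f(x) = theta * x by stochastic gradient descent on the quadratic loss."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(steps):
        x, y = rng.choice(data)      # sample ONE training example per iteration
        grad = (theta * x - y) * x   # d/dtheta of 0.5 * (theta * x - y)^2
        theta -= lr * grad           # step against the gradient
    return theta


# Hypothetical noise-free data generated by y = 2x.
data = [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0), (2.0, 4.0)]
print(sgd(data))  # close to 2.0
```

Because each step uses a single example, individual updates are noisy but cheap. On this noise-free data every update moves `theta` toward 2.0.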
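Generalization Error
Generalization error: the difference between the expected risk and the empirical risk on the training set $\mathcal{D}$:
$$\mathcal{G}_{\mathcal{D}}(\theta) = \mathcal{R}(\theta) - \mathcal{R}^{emp}_{\mathcal{D}}(\theta)$$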
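Since the expected risk is not available in practice, one common workaround is to estimate it on held-out data and look at the gap to the training risk. The sketch below assumes this held-out estimate and uses a 1-nearest-neighbor predictor, which memorizes the training set. The data and all names are hypothetical.

```python
def risk(predict, samples: list[tuple[float, float]]) -> float:
    """Average quadratic loss of a predictor over the given samples."""
    return sum(0.5 * (y - predict(x)) ** 2 for x, y in samples) / len(samples)


def make_nearest_neighbor(train: list[tuple[float, float]]):
    """Return a predictor that outputs the label of the closest training input."""
    def predict(x: float) -> float:
        _, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest_y
    return predict


# Hypothetical samples from y = 2x, split into training and held-out sets.
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
held_out = [(1.4, 2.8), (2.6, 5.2)]

predict = make_nearest_neighbor(train)
train_risk = risk(predict, train)        # 0.0: every training point is memorized
held_out_risk = risk(predict, held_out)  # stands in for the unknown expected risk
gap = held_out_risk - train_risk         # estimate of the generalization error
print(train_risk, held_out_risk, gap)
```

The held-out risk is only an estimate of the expected risk, so the gap is an estimate too. It becomes more reliable as the held-out set grows.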
Regularization
The principle of empirical risk minimization can easily lead to a low error rate on the training set but a high error rate on unseen data, that is, overfitting. Regularization counters this by constraining the model's complexity.
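As one illustration, a common form of regularization adds an L2 penalty on the parameters to the empirical risk. The sketch below assumes that form for a one-parameter model `f(x) = theta * x`, where the regularized minimizer has a closed form. The penalty weight `lam`, the data, and the function name are assumptions for illustration.

```python
def ridge_theta(data: list[tuple[float, float]], lam: float) -> float:
    """Minimizer of (1/N) * sum(0.5 * (theta*x - y)^2) + 0.5 * lam * theta^2."""
    n = len(data)
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    # Setting the derivative to zero gives theta = sxy / (sxx + n * lam).
    return sxy / (sxx + n * lam)


# Hypothetical training set, roughly following y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

for lam in (0.0, 0.1, 1.0, 10.0):
    # A larger penalty weight shrinks the parameter toward zero.
    print(lam, ridge_theta(data, lam))
```

With `lam = 0` this reduces to plain empirical risk minimization. Increasing `lam` trades some training fit for a smaller parameter, and in practice the weight is usually chosen on held-out data.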