Machine Learning (6): Evaluating Models

Evaluate model

1 test set

split the data set into a training set and a test set

the test set is used to evaluate the model

1. linear regression

compute test error

$J_{test}(\vec w, b) = \frac{1}{2m_{test}}\sum_{i=1}^{m_{test}} \left [ (f(x_{test}^{(i)}) - y_{test}^{(i)})^2 \right ]$
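As a minimal sketch of the formula above, the test error can be computed directly from the model's predictions on the test set. The helper name `squared_test_error`, the fitted parameters `w = 2.0`, `b = 1.0`, and the three test examples are all made up for illustration. Note that no regularization term is included when evaluating.

```python
def squared_test_error(predict, x_test, y_test):
    """J_test = 1 / (2 * m_test) * sum((f(x) - y)^2), without a regularization term."""
    m_test = len(x_test)
    total = sum((predict(x) - y) ** 2 for x, y in zip(x_test, y_test))
    return total / (2 * m_test)


# Hypothetical model f(x) = w * x + b, assumed to be already fitted on the training set.
w, b = 2.0, 1.0


def f(x):
    return w * x + b


# Hypothetical held-out test examples.
x_test = [1.0, 2.0, 3.0]
y_test = [3.0, 5.5, 6.0]

j_test = squared_test_error(f, x_test, y_test)
print(j_test)
```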

2. classification (logistic regression)

compute test error

$J_{test}(\vec w, b) = -\frac{1}{m_{test}}\sum_{i=1}^{m_{test}} \left [ y_{test}^{(i)}\log(f(x_{test}^{(i)})) + (1 - y_{test}^{(i)})\log(1 - f(x_{test}^{(i)})) \right ]$
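The logistic test error above can be sketched the same way. The helper name `logistic_test_error`, the sigmoid model with `w = 1.5`, `b = -3.0`, and the test examples are hypothetical. Clipping the predicted probability with `eps` is an added safeguard against `log(0)`, not part of the formula.

```python
import math


def logistic_test_error(predict_proba, x_test, y_test, eps=1e-15):
    """J_test = -1 / m_test * sum(y * log(f(x)) + (1 - y) * log(1 - f(x)))."""
    m_test = len(x_test)
    total = 0.0
    for x, y in zip(x_test, y_test):
        # Clip the probability so that log() never receives exactly 0 (assumption).
        p = min(max(predict_proba(x), eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / m_test


# Hypothetical logistic model, assumed to be already fitted on the training set.
w, b = 1.5, -3.0


def f(x):
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


# Hypothetical held-out test examples with labels in {0, 1}.
x_test = [0.0, 1.0, 3.0, 4.0]
y_test = [0, 1, 1, 1]

j_test = logistic_test_error(f, x_test, y_test)
print(j_test)
```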

2 cross-validation set

split the data set into a training set, a cross-validation set and a test set

the cross-validation set is used to choose the best of several candidate models automatically, and the test set is used to evaluate the chosen model
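The selection step can be sketched as follows: compute $J_{cv}$ for every candidate, keep the one with the lowest value, and only then compute $J_{test}$ for that single model. The three candidates here are hypothetical functions standing in for models that were already fitted on the training set, and all data values are made up.

```python
def squared_error(predict, xs, ys):
    """1 / (2m) * sum((f(x) - y)^2) on the given split."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))


# Hypothetical candidates, assumed to be already fitted on the training set.
candidates = {
    "degree 1": lambda x: 2.0 * x,
    "degree 2": lambda x: x ** 2 + 1.0,
    "degree 3": lambda x: x ** 3 - x ** 2 + 1.0,
}

# Hypothetical cross-validation and test splits.
x_cv, y_cv = [0.0, 1.0, 2.0], [1.1, 1.9, 5.2]
x_test, y_test = [0.5, 1.5, 2.5], [1.2, 3.4, 7.0]

# Choose the model with the lowest cross-validation error.
j_cv = {name: squared_error(model, x_cv, y_cv) for name, model in candidates.items()}
best_name = min(j_cv, key=j_cv.get)

# The test set is touched only once, for the chosen model.
j_test_best = squared_error(candidates[best_name], x_test, y_test)
print(best_name, j_cv[best_name], j_test_best)
```

Because the test set plays no part in the choice, $J_{test}$ of the chosen model is not biased by the selection, whereas its $J_{cv}$ tends to be optimistic.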

3 bias and variance

high bias: $J_{train}$ and $J_{cv}$ are both high

high variance: $J_{train}$ is low, but $J_{cv}$ is high
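The two rules above can be turned into a rough diagnostic. What counts as "high" depends on the problem, so this sketch assumes a hypothetical `baseline` error (for example, a level of performance considered acceptable) and a hypothetical tolerance `tol`. Neither value comes from the notes.

```python
def diagnose(j_train, j_cv, baseline, tol=0.1):
    """Label a model from its training and cross-validation errors.

    `baseline` and `tol` are illustrative assumptions, not fixed rules.
    """
    labels = []
    if j_train - baseline > tol:
        # Training error is far above the acceptable level: underfitting.
        labels.append("high bias")
    if j_cv - j_train > tol:
        # Cross-validation error is far above training error: overfitting.
        labels.append("high variance")
    return labels or ["ok"]


print(diagnose(j_train=0.50, j_cv=0.55, baseline=0.10))
print(diagnose(j_train=0.12, j_cv=0.60, baseline=0.10))
```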

if high bias: getting more training data alone is unlikely to help

if high variance: getting more training data is likely to help

4 regularization

if $\lambda$ is too small, regularization has little effect and the model may overfit (high variance)

if $\lambda$ is too large, the parameters are pushed toward zero and the model underfits (high bias)
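The large-$\lambda$ case can be seen in a deliberately tiny sketch: a one-parameter model $f(x) = wx$ with no intercept, fitted by minimizing $\frac{1}{2m}\sum (wx - y)^2 + \frac{\lambda}{2m}w^2$, which has the closed-form solution $w = \sum xy / (\sum x^2 + \lambda)$. The model, the helper names and the data are assumptions chosen for illustration. With a single parameter the small-$\lambda$ overfitting side does not show up, so only the underfitting side is illustrated here.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form minimizer of 1/(2m) * sum((w*x - y)^2) + lam/(2m) * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def train_error(w, xs, ys):
    """J_train without the regularization term."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))


# Hypothetical training data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

lambdas = [0.0, 10.0, 1000.0]
ws = [fit_ridge_1d(xs, ys, lam) for lam in lambdas]
j_trains = [train_error(w, xs, ys) for w in ws]

for lam, w, j in zip(lambdas, ws, j_trains):
    # As lam grows, w shrinks toward 0 and the training error rises.
    print(lam, w, j)
```

In practice $\lambda$ would be chosen like any other model setting: fit once per candidate value and keep the one with the lowest $J_{cv}$.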

5 methods

fix high variance:

get more training examples

try a smaller set of features

remove some of the higher-order terms

increase $\lambda$

fix high bias:

get additional features

add polynomial features

decrease $\lambda$