## Cost Function

• L = total number of layers in the network
• $s_l$ = number of units (not counting bias unit) in layer l
• K = number of output units/classes

$$J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[y_k^{(i)}\log\left((h_\Theta(x^{(i)}))_k\right)+(1-y_k^{(i)})\log\left(1-(h_\Theta(x^{(i)}))_k\right)\right] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\left(\Theta_{j,i}^{(l)}\right)^2$$

• the double sum simply adds up the logistic regression costs calculated for each unit in the output layer, over all m training examples
• the triple sum simply adds up the squares of all the individual Θs in the entire network (bias weights are not regularized)
• the i in the triple sum does not refer to training example i — it indexes the units of layer l
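The cost above can be sketched in NumPy. This is a minimal illustration, not from the original notes: `nn_cost` and the layer shapes are assumptions, with each weight matrix $\Theta^{(l)}$ stored as an $(s_{l+1}, s_l+1)$ array whose first column multiplies the bias unit.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_cost(thetas, X, Y, lam):
    """Regularized cost J(Theta) for a feed-forward network (illustrative sketch).

    thetas: list of weight matrices Theta^(l), shape (s_{l+1}, s_l + 1);
            column 0 multiplies the bias unit.
    X: (m, n) inputs; Y: (m, K) one-hot labels; lam: regularization strength.
    """
    m = X.shape[0]
    # Forward propagation: prepend a bias column before every layer.
    a = X
    for theta in thetas:
        a = np.hstack([np.ones((m, 1)), a])
        a = sigmoid(a @ theta.T)
    h = a  # (m, K) output activations

    # Double sum: logistic cost over every training example and output unit.
    cost = -np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h)) / m

    # Triple sum: squares of all non-bias weights in the network.
    reg = lam / (2 * m) * sum(np.sum(theta[:, 1:] ** 2) for theta in thetas)
    return cost + reg
```

With all weights zero, every output activation is sigmoid(0) = 0.5, so the unregularized cost reduces to K·log 2 regardless of the labels — a handy sanity check for an implementation.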