Loss of Classification

For classification, the network's softmax output is compared against the (one-hot) label using a loss function, typically either Mean Square Error (MSE) or cross-entropy. Minimizing cross-entropy is equivalent to maximizing likelihood.
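Why the equivalence holds, as a short derivation not spelled out on the slide (notation assumed here: one-hot label \hat{y} whose correct class is r, softmax output y, training examples x^{(n)}):

    L = -\sum_i \hat{y}_i \ln y_i = -\ln y_r

    \sum_n L^{(n)} = -\sum_n \ln y_{r_n}^{(n)} = -\ln \prod_n P(r_n \mid x^{(n)})

The total cross-entropy over the training set is the negative log-likelihood of the labels, so minimizing one is exactly maximizing the other.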
http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/Deep%20More%20(v2).ecm.mp4/index.html

[Figure: total-loss surfaces for the same softmax network over two parameters, each ranging -10 ~ 10, under MSE and cross-entropy. The MSE surface is nearly flat where the loss is large, so gradient descent gets stuck; the cross-entropy surface keeps a slope from large loss toward small loss.]

Changing the loss function can change the difficulty of optimization.
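A minimal numerical sketch of the figure's point (not from the slides; the two-logit setup and the specific values are assumptions chosen to mirror the slide's -10 ~ 10 axes): far from the optimum, MSE taken through a softmax yields a vanishing gradient, while cross-entropy still gives a usable one.

    import numpy as np

    def softmax(z):
        z = z - z.max()  # shift for numerical stability
        e = np.exp(z)
        return e / e.sum()

    target = np.array([1.0, 0.0])  # one-hot label: true class is 0

    def grads(z):
        """Gradients of cross-entropy and MSE w.r.t. the logits z."""
        y = softmax(z)
        g_ce = y - target  # softmax + cross-entropy simplifies to y - target
        J = np.diag(y) - np.outer(y, y)  # Jacobian of softmax (symmetric)
        g_mse = J @ (2.0 * (y - target))  # chain rule through the softmax
        return g_ce, g_mse

    for z in (np.array([-10.0, 10.0]), np.array([-1.0, 1.0])):
        g_ce, g_mse = grads(z)
        print(f"z={z}  CE grad={g_ce}  MSE grad={g_mse}")
    # At z = [-10, 10] (very wrong, large loss) the MSE gradient is ~1e-8:
    # the flat plateau where training gets stuck. The cross-entropy
    # gradient there is ~[-1, 1] and still points toward the minimum.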