1. What is Gradient Descent?
Gradient Descent (GD) is an optimization algorithm used for minimizing the cost function in Machine Learning and Deep Learning.
An easy way to understand it: Gradient Descent is one of the algorithms used to train your model in Machine Learning and Deep Learning.
2. Gradient Descent Algorithm
As I said, Gradient Descent is an algorithm for minimizing a cost function. For linear regression with one feature, the cost function looks like this:

J(θ) = (1/2m) * Σ_{i=1..m} (h_θ(x^(i)) − y^(i))²

and we can define the Gradient Descent update like this:

θ_j := θ_j − α * ∂J(θ)/∂θ_j

where j = 0 and j = 1, and α is the learning rate.
3. How Gradient Descent works
First, we initialize θ_j, plug it into the update rule to compute a new θ_j, and repeat until we find θ such that the cost function J(θ) is small enough.
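The loop above can be sketched for simple linear regression. The data, learning rate, and iteration count here are made-up toy values for illustration:

```python
import numpy as np

# Toy data for a linear model h(x) = theta0 + theta1 * x.
# The true parameters (2.0 and 3.0) are made up for this example.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * X

def cost(theta0, theta1):
    """J(theta) = (1/2m) * sum((h(x) - y)^2)."""
    m = len(X)
    error = theta0 + theta1 * X - y
    return np.sum(error ** 2) / (2 * m)

def gradient_descent(alpha=0.1, n_iters=1000):
    theta0, theta1 = 0.0, 0.0              # step 1: initialize theta_j
    m = len(X)
    for _ in range(n_iters):               # step 2: repeat the update
        error = theta0 + theta1 * X - y
        grad0 = np.sum(error) / m          # dJ/dtheta0
        grad1 = np.sum(error * X) / m      # dJ/dtheta1
        theta0 -= alpha * grad0            # theta_j := theta_j - alpha * dJ/dtheta_j
        theta1 -= alpha * grad1
    return theta0, theta1

theta0, theta1 = gradient_descent()
```

On this toy data the loop recovers parameters close to 2.0 and 3.0, and the cost ends up near zero.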
4. Types of Gradient Descent
- Batch Gradient Descent: computes the gradient on the whole training set for each update
- Stochastic Gradient Descent: updates the parameters using one training example at a time
- Mini Batch Gradient Descent: updates using a small batch of examples, a compromise between the other two
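The three variants differ only in how many training examples each update uses, so one loop can sketch all of them. The data and hyperparameters here are illustrative toy values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made-up values): h(x) = theta0 + theta1 * x, true params 2 and 3.
X = np.linspace(0.0, 1.0, 20)
y = 2.0 + 3.0 * X

def cost(theta):
    """Mean squared error J(theta)."""
    error = theta[0] + theta[1] * X - y
    return np.mean(error ** 2) / 2

def minibatch_gd(batch_size, alpha=0.5, n_epochs=500):
    """One loop covering all three variants:
    batch_size == len(X) -> Batch GD (whole training set per update),
    batch_size == 1      -> Stochastic GD (one example per update),
    otherwise            -> Mini Batch GD."""
    theta = np.zeros(2)                      # [theta0, theta1]
    m = len(X)
    for _ in range(n_epochs):
        order = rng.permutation(m)           # shuffle once per epoch
        for start in range(0, m, batch_size):
            idx = order[start:start + batch_size]
            error = theta[0] + theta[1] * X[idx] - y[idx]
            grad = np.array([error.mean(), (error * X[idx]).mean()])
            theta = theta - alpha * grad     # theta_j := theta_j - alpha * grad_j
    return theta

batch = minibatch_gd(batch_size=len(X))      # Batch Gradient Descent
sgd = minibatch_gd(batch_size=1)             # Stochastic Gradient Descent
mini = minibatch_gd(batch_size=5)            # Mini Batch Gradient Descent
```

All three reach a low cost on this toy problem; in practice Stochastic and Mini Batch updates are noisier per step but much cheaper on large datasets.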
5. Tips for Gradient Descent
- Choose the learning rate: a good learning rate is often in one of the ranges 0.0001~0.0003, 0.001~0.003, 0.01~0.03, or 0.1~0.3
- Debugging Gradient Descent: plot the cost function against the number of iterations and check it; if the cost value increases, you probably need to decrease the learning rate
- To make Gradient Descent run faster: Feature Normalization. We can speed up Gradient Descent by bringing each of our input values into roughly the same range. Two techniques that help with this are feature scaling and mean normalization.
- Sometimes using Gradient Descent with Momentum works better; for example, look at the graph:
Graph source : https://machinelearningcoban.com/2017/01/16/gradientdescent2/
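The debugging tip above can be sketched by recording J(θ) at every iteration and comparing a reasonable learning rate with one that is too large. The data and the two alpha values are made up for illustration:

```python
import numpy as np

# Toy data (made-up values): a simple linear fit h(x) = theta0 + theta1 * x.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * X
m = len(X)

def cost_history(alpha, n_iters=50):
    """Run gradient descent and record J(theta) at every iteration."""
    theta0, theta1 = 0.0, 0.0
    history = []
    for _ in range(n_iters):
        error = theta0 + theta1 * X - y
        history.append(np.sum(error ** 2) / (2 * m))
        theta0 -= alpha * np.sum(error) / m
        theta1 -= alpha * np.sum(error * X) / m
    return history

good = cost_history(alpha=0.1)   # cost decreases at every iteration
bad = cost_history(alpha=0.5)    # too large: cost blows up -> decrease alpha
```

Plotting `good` and `bad` against the iteration number makes the difference obvious: the first curve falls steadily, the second explodes.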
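Feature scaling and mean normalization from the tip above can be sketched like this; the two feature columns (house size and bedroom count) are made-up illustrative values:

```python
import numpy as np

# Two illustrative features on very different scales.
sizes = np.array([2104.0, 1416.0, 1534.0, 852.0])     # e.g. square feet
bedrooms = np.array([5.0, 3.0, 3.0, 2.0])             # e.g. bedroom count

def mean_normalize(x):
    """Mean normalization + feature scaling: (x - mean) / range.
    Puts every feature roughly into [-0.5, 0.5] with zero mean."""
    return (x - x.mean()) / (x.max() - x.min())

sizes_n = mean_normalize(sizes)
bedrooms_n = mean_normalize(bedrooms)
```

After normalization both features live on the same scale, so Gradient Descent no longer zig-zags along the stretched dimension and can use a larger learning rate.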
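A minimal sketch of Gradient Descent with Momentum, assuming the common update v := γ·v + α·∇J(θ), θ := θ − v; the data, γ, and α values are illustrative:

```python
import numpy as np

# Same kind of toy linear-regression setup as above (made-up values).
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * X
m = len(X)

def gd_momentum(alpha=0.1, gamma=0.9, n_iters=300):
    """Gradient Descent with Momentum.
    The velocity v accumulates past gradients, which damps oscillation
    and speeds up movement along directions where gradients agree."""
    theta = np.zeros(2)                  # [theta0, theta1]
    v = np.zeros(2)                      # velocity
    for _ in range(n_iters):
        error = theta[0] + theta[1] * X - y
        grad = np.array([np.sum(error) / m, np.sum(error * X) / m])
        v = gamma * v + alpha * grad     # v := gamma * v + alpha * grad
        theta = theta - v                # theta := theta - v
    return theta

theta = gd_momentum()
```

With γ = 0 this reduces to plain Gradient Descent; γ around 0.9 is a common choice.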