Help with gradient calculation

mctaff

New member
Joined
Mar 19, 2017
Messages
1
Hi all,


I've not touched calculus since my maths degree many years ago.


I need to calculate a gradient to use in back-propagation in a neural network, but it's making my head hurt!


I have the formula


Code:
δ(t) = tanh[ <w, f(t)> + b + uδ(t−1) ]


where t and t−1 denote time steps


and need to calculate


Code:
dδ(t)/dθ = ∂δ(t)/∂θ + ∂δ(t)/∂δ(t−1) * dδ(t-1)/dθ


where θ comprises the parameter set [w, b, u]. b and u are scalars, but w is a vector with multiple elements, e.g. w1, w2, etc.
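For concreteness, here's a minimal forward pass of this recurrence. All the numbers (the weight vector w, the scalars b and u, the feature vectors f(t), and the initial state δ(0) = 0) are made-up test data, not anything from a real network:

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(size=4)          # hypothetical weight vector [w1, ..., w4]
b, u = 0.1, 0.5                 # hypothetical scalars
fs = rng.normal(size=(3, 4))    # hypothetical feature vectors f(1)..f(3)

d = 0.0                          # δ(0), assumed zero initial state
for f_t in fs:
    # δ(t) = tanh( <w, f(t)> + b + u·δ(t−1) )
    d = np.tanh(w @ f_t + b + u * d)

# tanh keeps the state in (−1, 1)
assert -1.0 < d < 1.0
```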




The derivative of tanh(x) is 1 - tanh^2(x), so I assume the partial derivative is


Code:
∂δ(t)/∂δ(t−1) = u (1 - tanh^2[ <w, f(t) > + b + uδ(t−1) ])

              = u (1 - δ(t)^2)
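To sanity-check that partial numerically, here's a sketch comparing it against a central finite difference (w, f(t), b, u and δ(t−1) below are all made-up test values):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)        # hypothetical weight vector
b, u = 0.5, 0.8               # hypothetical scalars
f_t = rng.normal(size=3)      # hypothetical feature vector f(t)

def delta(prev):
    # δ(t) = tanh( <w, f(t)> + b + u·δ(t−1) )
    return np.tanh(w @ f_t + b + u * prev)

prev = 0.3                    # hypothetical δ(t−1)
d_t = delta(prev)

# analytic partial: ∂δ(t)/∂δ(t−1) = u (1 − δ(t)²)
analytic = u * (1.0 - d_t**2)

# central finite-difference approximation
eps = 1e-6
numeric = (delta(prev + eps) - delta(prev - eps)) / (2 * eps)

assert abs(analytic - numeric) < 1e-8
```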


The final term, dδ(t-1)/dθ, comes from recursively applying the same formula at the previous time step, so I'm happy with that.
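To illustrate the recursion for one scalar parameter, here's a sketch for u (using the direct partial ∂δ(t)/∂u = δ(t−1)·(1 − δ(t)²), which follows the same chain-rule pattern as the partial above; all numbers are made-up test data), checked against a finite difference of the unrolled recurrence:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=3)          # hypothetical weight vector
b, u = 0.2, 0.7                 # hypothetical scalars
fs = rng.normal(size=(5, 3))    # hypothetical features f(1)..f(5)

def forward(u_val):
    # unroll δ(t) = tanh(<w, f(t)> + b + u·δ(t−1)), with δ(0) = 0
    d = 0.0
    for f_t in fs:
        d = np.tanh(w @ f_t + b + u_val * d)
    return d

# recursive accumulation: dδ(t)/du = ∂δ(t)/∂u + ∂δ(t)/∂δ(t−1) · dδ(t−1)/du
d_prev, grad_prev = 0.0, 0.0
for f_t in fs:
    d = np.tanh(w @ f_t + b + u * d_prev)
    direct = d_prev * (1.0 - d**2)     # ∂δ(t)/∂u, with δ(t−1) held fixed
    through = u * (1.0 - d**2)         # ∂δ(t)/∂δ(t−1)
    grad_prev = direct + through * grad_prev
    d_prev = d

# compare against a central finite difference of the full unrolled network
eps = 1e-6
numeric = (forward(u + eps) - forward(u - eps)) / (2 * eps)
assert abs(grad_prev - numeric) < 1e-6
```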




I'm less sure about the first term, ∂δ(t)/∂θ, given that θ decomposes into its constituent elements w, b and u.


Any ideas?!
 