Gradients and Derivatives of Matrices

kephera (New member, joined Oct 10, 2022)
I'm just confused about the transpose aspect. Let's say [imath]A \in \mathbb{R}^{m \times n}[/imath] and we want to find the gradient of [imath]f(x) = ||Ax||^2[/imath]. Then by the chain rule we get [imath]2(Ax) \cdot \frac{d}{dx}(Ax)[/imath]. Does [imath]\frac{d}{dx}(Ax)[/imath] equal [imath]A[/imath] or [imath]A^T[/imath], and why? Also, just for clarity, since I'm confused about it: does something like [imath]2(Ax+b)A[/imath] equal [imath]2A^T(Ax+b)[/imath]? That is, if you change the order of a matrix multiplication, do you need to change [imath]A[/imath] to [imath]A^T[/imath]? I know matrix multiplication is not commutative, but I'm confused beyond that.
 
I agree that [imath]\frac{d}{dx} (Ax) = A[/imath].
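If it helps to see this numerically, here is a minimal sketch that compares a finite-difference Jacobian of [imath]x \mapsto Ax[/imath] with [imath]A[/imath] itself. The sizes, the random matrix, and the helper name `numerical_jacobian` are assumptions made up for this illustration, and the Jacobian is laid out with one row per output component.

```python
import numpy as np

def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian with J[i, j] = d func_i / d x_j."""
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(func(x))
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        # Perturb only coordinate j and record how every output moves.
        J[:, j] = (func(x + step) - func(x - step)) / (2 * eps)
    return J

# Hypothetical example data: A is 3x2, so x has 2 entries and Ax has 3.
rng = np.random.default_rng(0)
m, n = 3, 2
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)

J = numerical_jacobian(lambda v: A @ v, x)
jacobian_matches_A = np.allclose(J, A, atol=1e-6)
print(J.shape, jacobian_matches_A)
```

With this layout the Jacobian has the same shape as [imath]A[/imath], namely [imath]m \times n[/imath]. A transpose only shows up if you choose to store derivatives the other way round, or when you turn a row-vector derivative into a column-vector gradient.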

The rule for transposing products is relatively simple:
[math](AB)^T = B^T A^T[/math]
To differentiate [imath]||Ax||^2[/imath] I would use the fact that the squared norm of a vector is the dot product of the vector with itself, i.e.:
[math]f(x) = ||Ax||^2 = (Ax)^T (Ax) = x^T A^T A x[/math]
Can you take it from there?
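Once you have worked it through, a numerical check can confirm the result. The sketch below assumes the candidate gradient [imath]2A^T A x[/imath], which is what the expansion above leads to, and compares it with a finite-difference gradient. The matrix sizes, random data, and the helper name `numerical_gradient` are assumptions for illustration only.

```python
import numpy as np

# Hypothetical example data: A is 4x3, x has 3 entries.
rng = np.random.default_rng(1)
m, n = 4, 3
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)

def f(v):
    """f(v) = ||A v||^2, written as the dot product of A v with itself."""
    return float((A @ v) @ (A @ v))

def numerical_gradient(func, v, eps=1e-6):
    """Central-difference gradient of a scalar function."""
    grad = np.zeros_like(v)
    for j in range(v.size):
        step = np.zeros_like(v)
        step[j] = eps
        grad[j] = (func(v + step) - func(v - step)) / (2 * eps)
    return grad

grad_numeric = numerical_gradient(f, x)

# Column-vector form of the candidate gradient: 2 A^T (A x).
grad_formula = 2 * A.T @ (A @ x)

# Row-vector form from the chain rule: 2 (A x)^T A.
# NumPy 1-D arrays do not distinguish rows from columns, so the
# entries can be compared directly.
row_form = 2 * (A @ x) @ A

gradient_matches = np.allclose(grad_numeric, grad_formula, atol=1e-5)
row_equals_column = np.allclose(row_form, grad_formula)
print(gradient_matches, row_equals_column)
```

The two forms hold the same numbers because [imath](2(Ax)^T A)^T = 2A^T(Ax)[/imath] by the product rule for transposes. On paper, [imath]2(Ax)A[/imath] without the transpose is not even a valid product when [imath]m \neq n[/imath], since [imath]Ax[/imath] is [imath]m \times 1[/imath] and [imath]A[/imath] is [imath]m \times n[/imath].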
 