I just need some help, which is why I can't show any working. I'm just not sure where to begin....
Perhaps taking a step backwards . . .
Suppose you have a MODEL for a process, but you can't predict the coefficients of the model. For instance, you believe \(\displaystyle y\) should be proportional to A, but with a correction proportional to A^2, and another term depending on Z. Your model is
\(\displaystyle y = \beta_1 \times A + \beta_2 \times A^2 + \beta_3 \times Z\)
So you set up an experiment to measure y for as wide a range of the parameters (A,Z) as you can manage. Note that A and Z are either controlled, or are found as part of the measurement. In either case they are known "perfectly" without any uncertainty. All of the uncertainty of each measurement is associated with \(\displaystyle y\), and is usually expressed as a standard deviation, \(\displaystyle u\).
1st measurement: \(\displaystyle y_1 = \beta_1 A_1 + \beta_2 A_1^2 + \beta_3 Z_1 + u_1\)
2nd measurement: \(\displaystyle y_2 = \beta_1 A_2 + \beta_2 A_2^2 + \beta_3 Z_2 + u_2\)
. . .
jth measurement: \(\displaystyle y_j = \beta_1 A_j + \beta_2 A_j^2 + \beta_3 Z_j + u_j\)
The next task is to find a set of coefficients \(\displaystyle (\beta_1\ \beta_2\ \beta_3)\) that make a "best fit" of the model to the data. One of the most frequently used procedures is "ordinary least squares," in which the sum of the squares of the differences between the measured and predicted values of \(\displaystyle y\) is minimized. ["Ordinary" means the uncertainties \(\displaystyle u\) are not propagated - my own preference is to do a "weighted" least squares with each datum weighted as \(\displaystyle 1/u_j^2\).]
Let
\(\displaystyle S = \sum_{j=1}^N (y_j - \hat y_j)^2 = \sum_{j=1}^N(y_j - \beta_1 A_j - \beta_2 A_j^2 - \beta_3 Z_j )^2\)
then set
\(\displaystyle \dfrac{\partial S}{\partial \beta_i} = 0\ \ \ \text{ for }i = 1 \text{ to }k\)
which is now a set of \(\displaystyle k\) equations in \(\displaystyle k\) unknowns (here \(\displaystyle k = 3\), one equation for each coefficient \(\displaystyle \beta_i\)). The questions you are being asked have to do with the properties of this system of equations.
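If it helps to see that system concretely: differentiating with respect to \(\displaystyle \beta_1\), for instance, gives \(\displaystyle \sum_{j=1}^N A_j\,(y_j - \beta_1 A_j - \beta_2 A_j^2 - \beta_3 Z_j) = 0\), and the other two equations have the same form with \(\displaystyle A_j^2\) or \(\displaystyle Z_j\) as the leading factor. Stacking the regressors as the columns of a matrix \(\displaystyle X\), the three equations together read \(\displaystyle X^T X \beta = X^T y\). Below is a minimal sketch of solving them numerically, assuming NumPy; the function name `fit_least_squares`, the made-up "true" coefficients, and the noise-free synthetic data are my own illustration, not part of your problem.

```python
import numpy as np

def fit_least_squares(A, Z, y, u=None):
    """Solve the normal equations for y = b1*A + b2*A**2 + b3*Z.

    If u (per-point standard deviations) is given, each datum is
    weighted by 1/u**2; otherwise every weight is 1 (ordinary least squares).
    """
    A, Z, y = (np.asarray(v, dtype=float) for v in (A, Z, y))
    X = np.column_stack([A, A**2, Z])          # N x 3 matrix of regressors
    if u is None:
        w = np.ones_like(y)
    else:
        w = 1.0 / np.asarray(u, dtype=float) ** 2
    lhs = X.T @ (w[:, None] * X)               # 3 x 3 matrix X^T W X
    rhs = X.T @ (w * y)                        # length-3 vector X^T W y
    return np.linalg.solve(lhs, rhs)           # k equations, k unknowns

# Hypothetical data: coefficients chosen arbitrarily, no noise added.
rng = np.random.default_rng(0)
A = rng.uniform(1.0, 10.0, size=20)
Z = rng.uniform(1.0, 5.0, size=20)
true_beta = np.array([2.0, -0.1, 0.5])
y = true_beta[0] * A + true_beta[1] * A**2 + true_beta[2] * Z

beta_ols = fit_least_squares(A, Z, y)
print(beta_ols)
```

With noise-free data the fit should reproduce the coefficients used to generate it, up to rounding. Passing an array `u` of standard deviations switches to the weighted version. For badly conditioned problems a routine such as `np.linalg.lstsq` is usually a safer choice than forming \(\displaystyle X^T X\) explicitly.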
Does it help to start with a somewhat more explicit model?