Need help with a (simple) transformation function, please.

meth222

New member
Joined
Oct 16, 2013
Messages
7
x          y
0          0
-0.04      -0.05
-0.06667   -0.1
-0.1       -0.2
-0.13      -0.4
0.1        0.2
0.12       0.3
0.14286    0.5

So I have these values for X and Y. As you can see, as X decreases past 0, Y decreases at an increasing rate. As X increases past 0, Y increases at an increasing rate. What is the mathematical function that relates X to Y?
 
You can go to http://graph.tk/ and plot the function "y=x^3". It is very similar to what you describe!

More specifically, perhaps it is close to "y = 220x^3 - 18x^2 + 1.62x".
 
Is there a way to determine the exact relationship?

Yes. First, I guessed that it has the form y = a.x^3 + b.x^2 + c.x + d.
With x = 0, y = 0 => 0 = a.0 + b.0 + c.0 + d => d = 0.
With x = -0.1, y = -0.2 and the other value pairs, I created a system of equations in a, b, c, d.
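The idea above can be sketched in Python. This is only an illustration under stated assumptions: the three points in `chosen` are picked by hand (the post does not say which pairs were used), and `det3` and `cramer3` are hypothetical helper names. With d = 0 there are three unknowns, so three points determine a, b and c exactly, and a different choice of points gives a different cubic.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def cramer3(A, rhs):
    """Solve the 3x3 system A z = rhs with Cramer's rule."""
    D = det3(A)
    solution = []
    for col in range(3):
        Ai = [row[:] for row in A]      # copy A, then swap in rhs as one column
        for r in range(3):
            Ai[r][col] = rhs[r]
        solution.append(det3(Ai) / D)
    return solution


# Assumption: these three points are an arbitrary choice from the posted table.
chosen = [(-0.1, -0.2), (0.1, 0.2), (0.12, 0.3)]

# Each point gives one equation a*x^3 + b*x^2 + c*x = y (d = 0 is already known).
A = [[x**3, x**2, x] for x, _ in chosen]
rhs = [y for _, y in chosen]
a, b, c = cramer3(A, rhs)


def cubic(x):
    return a * x**3 + b * x**2 + c * x


print("a, b, c =", a, b, c)
for x, y in chosen:
    print(x, y, cubic(x))
```

A cubic found this way passes through the chosen points and the origin, but nothing forces it through the remaining points of the table.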

Sorry, my English is not good.
 
There are various methods for finding various relationships which may fit the listed data points to varying degrees of exactitude.

Your subject line refers to "transformation function". What do you mean by this?

What generated the listed data points? What methods of data-fitting have you been studying? What method(s) do you believe you are expected to apply to this exercise?

Please be complete. Thank you! ;)
 
Given any n data points (no two points having the same x value), there exists a polynomial of degree at most n-1 that passes exactly through those points. The "Lagrange formula" gives a way to calculate that polynomial: if the points are \(\displaystyle (x_1, y_1)\), \(\displaystyle (x_2, y_2)\), ..., \(\displaystyle (x_n, y_n)\), the formula is
\(\displaystyle \sum_{i= 1}^n y_i\dfrac{(x- x_1)\cdot\cdot\cdot(x- x_{i-1})(x- x_{i+1})\cdot\cdot\cdot (x- x_n)}{(x_i- x_1)\cdot\cdot\cdot(x_i- x_{i-1})(x_i- x_{i+1})\cdot\cdot\cdot (x_i- x_n)}\)
Notice that the "ith" term does NOT include "\(\displaystyle x- x_i\)" in the numerator but every other term does. If x is equal to one of the \(\displaystyle x_i\), then every term except that "ith" term will include \(\displaystyle x_i- x_i= 0\), while in the "ith" term the fraction will be equal to 1, so the sum is just \(\displaystyle 0+ 0+ ...+ y_i+ ...+ 0+ 0= y_i\).

For this particular list of points, (0, 0), (-0.04, -0.05), (-0.06667, -0.1), (-0.1, -0.2), (-0.13, -0.4), (0.1, 0.2), (0.12, 0.3), (0.14286, 0.5), 8 points, this gives the 7th degree polynomial (the term for (0, 0) drops out because its y value is 0)
\(\displaystyle -0.05\dfrac{x(x+ 0.06667)(x+ 0.1)(x+ 0.13)(x- 0.1)(x- 0.12)(x- 0.14286)}{(-0.04)(-0.04+ 0.06667)(-0.04+ 0.1)(-0.04+ 0.13)(-0.04- 0.1)(-0.04- 0.12)(-0.04- 0.14286)}- 0.1\dfrac{x(x+ 0.04)(x+ 0.1)(x+ 0.13)(x- 0.1)(x- 0.12)(x- 0.14286)}{(-0.06667)(-0.06667+ 0.04)(-0.06667+ 0.1)(-0.06667+ 0.13)(-0.06667- 0.1)(-0.06667- 0.12)(-0.06667- 0.14286)}+ \) etc.
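Rather than expanding every term by hand, the interpolation formula above (commonly known as the Lagrange interpolation formula) can be evaluated directly. The following is a minimal Python sketch; the function name `lagrange` and the layout of `points` are illustrative choices, and the data are the eight pairs listed in the thread.

```python
def lagrange(points, x):
    """Evaluate the interpolating polynomial through `points` at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                # The i-th term skips the factor (x - x_i), as in the formula.
                term *= (x - xj) / (xi - xj)
        total += term
    return total


points = [
    (0.0, 0.0), (-0.04, -0.05), (-0.06667, -0.1), (-0.1, -0.2),
    (-0.13, -0.4), (0.1, 0.2), (0.12, 0.3), (0.14286, 0.5),
]

# At each data point the polynomial reproduces the listed y value.
for xi, yi in points:
    print(xi, yi, lagrange(points, xi))

# Between data points the value is whatever the polynomial happens to give.
print(lagrange(points, 0.05))
```

Evaluating at a few x values between the data points is a quick way to see how the polynomial behaves away from the points it was built on.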
 
Yes, the Lagrange polynomial is very handy for interpolation.
But it behaves poorly (e.g. possibly wild swings between points) if the number of points is large or when there is "noise" in the data. In cases where a smooth approximation is desired, regression using the generalized inverse of the Vandermonde model matrix yields better approximating polynomials.

P.S. If a square Vandermonde matrix is used then the result is identical to using the Lagrange method.
 

meth222,

I don't think you should even be concerning yourself with that question. You got your idea of the shape from post # 2.
If you already have (or can use) an appropriate TI-8x graphing calculator, for example, you would see that
entering the data in two lists and then using these keys:

STAT

CALC

6: CubicReg

Enter

- - - - - - - - - - - -

gives


[Display]

\(\displaystyle y = ax^3+bx^2+cx+d\)

a = 140.7391002

b = .1747190332

c = .626117262

d = -.0092006397

\(\displaystyle R^2 \ \)= .9991219702


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -



With that last line of the display, that means that \(\displaystyle \ R \approx \ .99956.\)

So, for all intents and purposes, it is a cubic function. Seek out simplicity.



Edit: I will tell you that the third x-value is a rounded value for the exact value of \(\displaystyle \ \) -1/15,
and that the last x-value is a rounded value for the exact value of \(\displaystyle \ \) 1/7.
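The same kind of cubic least-squares fit can be sketched without a calculator. This is a minimal pure-Python sketch, assuming the eight posted (rounded) points are the data entered into the lists. It builds the normal equations for the model y = ax^3 + bx^2 + cx + d and solves them with a small hand-written elimination routine (`solve_linear` is an illustrative helper, not a library function). The printed coefficients and R^2 should be comparable to the display above, though small differences from rounding are possible.

```python
def solve_linear(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]   # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):                    # back substitution
        s = sum(M[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


xs = [0.0, -0.04, -0.06667, -0.1, -0.13, 0.1, 0.12, 0.14286]
ys = [0.0, -0.05, -0.1, -0.2, -0.4, 0.2, 0.3, 0.5]

# Design matrix rows are [x^3, x^2, x, 1]; normal equations (X^T X) coef = X^T y.
X = [[x**3, x**2, x, 1.0] for x in xs]
XtX = [[sum(row[i] * row[j] for row in X) for j in range(4)] for i in range(4)]
Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(4)]
a, b, c, d = solve_linear(XtX, Xty)


def fit(x):
    return a * x**3 + b * x**2 + c * x + d


# Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
mean_y = sum(ys) / len(ys)
ss_res = sum((y - fit(x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r2 = 1.0 - ss_res / ss_tot

print("a, b, c, d =", a, b, c, d)
print("R^2 =", r2)
```

Unlike an interpolating polynomial, this fit does not pass exactly through the points; it minimizes the sum of squared vertical distances to them.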
 
True. But nothing was said in the initial post about approximation. If I wanted to avoid "wild swings", I would use a cubic spline that would pass through the given data points rather than an approximating polynomial.
 
Ok, I get the relationship but it doesn't help me with my cause. I guess I have to fully explain myself. (This is for a fantasy basketball team I'm doing). I'm trying to determine a measure for an NBA player's ability to shoot free throws with an appropriate weight placed on number of free throw attempts per game (FTA) and FT percentage (FT%).

For simplicity, let's assume the average FT% in the entire league is 0.5 and the average attempts per game is 100. Suppose I have two players on my team, A and B. Player A has 100 FTA and a FT% of .4. Player B has 200 FTA and a FT% of .6. This means my total free throws made for my team is 160 out of 300 attempts, or 53.33% for the Group %. How much better is this than average? About 6.66% [(.5333-.5)/.5]. This is labeled Group PMI in the excel file.

Now I want to create an individual measure (Ind. PMI) for player A and another one for player B where I can add these two values and get to 6.66%. (Perhaps it's not possible)

Here is what I have done: For player A, I took his (FT% (.4) minus the league average (.5)) times his FTA (100), divided by the average attempts in the league. (.4-.5)*100/100 = -.1 Similarly for player B: (.6-.5)*200/100 = .2. Adding or stacking the Ind. PMI for A and B, I get .1 or 10% which is a close approximation of 6.66%.
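The arithmetic in this example can be reproduced with a short Python sketch. The variable names are illustrative, and the league averages (0.5 and 100) are the simplifying assumptions stated in the post, not real NBA figures.

```python
LEAGUE_FT_PCT = 0.5      # assumed league-average FT%
LEAGUE_AVG_FTA = 100     # assumed league-average attempts

# player -> (FTA, FT%)
players = {"A": (100, 0.4), "B": (200, 0.6)}

made = sum(fta * pct for fta, pct in players.values())
attempts = sum(fta for fta, _ in players.values())

group_pct = made / attempts
group_pmi = (group_pct - LEAGUE_FT_PCT) / LEAGUE_FT_PCT

# Individual measure as described in the post:
# (FT% - league FT%) * FTA / league-average FTA
ind_pmi = {
    name: (pct - LEAGUE_FT_PCT) * fta / LEAGUE_AVG_FTA
    for name, (fta, pct) in players.items()
}
stacked = sum(ind_pmi.values())

print("Group %:", group_pct)
print("Group PMI:", group_pmi)
print("Ind. PMI:", ind_pmi)
print("Stacked Ind. PMI:", stacked)
```

Running it shows the gap described in the post: the stacked individual values come to about 0.1, while the Group PMI is about 0.067.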

Changing the values for FTA for each player, I generated a list of X and Y in my original post. So what I really would like to know is if there is a way to create an Ind. PMI for each player so I can stack it and get the true Group PMI.

Please see the attached Excel file.
 


Thanks for everyone's help in advance!
 
I am not sure that I understand your question.

In general, \(\displaystyle b \ne d \implies\dfrac{a}{b} + \dfrac{c}{d} \ne \dfrac{a + c}{b + d}.\)

Or more generally

\(\displaystyle v_i \ne v_k\ and\ 1 \le i \le n\ and\ 1 \le k \le n \implies \dfrac{u_1}{v_1} +\ ...\ + \dfrac{u_n}{v_n} \ne \dfrac{u_1 +\ ...\ + u_n }{v_1 +\ ...\ + v_n}.\)

Using your example, where it is assumed that the team had 300 FTA and made 160 for a FTP of about 53%:

Player 1 had 100 FTA (about 33% of the total) and made 40 for a FTP of 40%.

Player 2 had 200 FTA (about 67% of the total) and made 120 for a FTP of 60%.

(0.33 * 0.4) + (0.67 * 0.6) = 0.132 + 0.402 = 0.534, which is a decent approximation. When you divide (0.534 - 0.5) by 0.5 you get 0.068, which is a whole lot better estimate than 0.1.

Because you are using excel it is easy to use more decimal places to get greater accuracy.

The proper formula generally is

\(\displaystyle \left(\dfrac{v_1}{v_1 +\ ...\ + v_n} * \dfrac{u_1}{v_1}\right) +\ ...\ + \left(\dfrac{v_n}{v_1 +\ ...\ + v_n} * \dfrac{u_n}{v_n}\right) = \dfrac{u_1 +\ ...\ + u_n}{v_1 +\ ...\ + v_n}.\)

Do you follow?
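A small Python sketch of the weighted formula above, using the two-player example. Here `v` is a player's attempts and `u` is a player's makes. The last step, splitting the result into per-player shares, is not stated in the thread. It is one consequence of the identity, added as an illustration, and each share depends on the team's total attempts, so it is not a team-independent rating.

```python
LEAGUE_FT_PCT = 0.5              # assumed league average from the example

# (attempts v_i, makes u_i) for each player on the team
players = [(100, 40), (200, 120)]

total_attempts = sum(v for v, _ in players)
total_makes = sum(u for _, u in players)

# Weighted sum: each player's FT% times that player's share of team attempts.
weighted = sum((v / total_attempts) * (u / v) for v, u in players)
direct = total_makes / total_attempts

group_pmi = (direct - LEAGUE_FT_PCT) / LEAGUE_FT_PCT

# Illustrative per-player shares: attempt share times relative FT% difference.
shares = [
    (v / total_attempts) * ((u / v) - LEAGUE_FT_PCT) / LEAGUE_FT_PCT
    for v, u in players
]

print("weighted:", weighted, "direct:", direct)
print("group PMI:", group_pmi)
print("shares:", shares, "sum:", sum(shares))
```

With unrounded weights the weighted sum matches the direct team ratio up to floating-point error, which is why using more decimal places than 0.33 and 0.67 tightens the estimate.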
 
Careful Jeff! What if \(\displaystyle a=c=0\)

edit or

\(\displaystyle c=-a\left(\dfrac{d}{b}\right)^2\)

just being silly though.
 

Yes, I do follow up to the last formula. And yes, I realized that A/B + C/D does not equal (A+C)/(B+D). That's why I wasn't sure if there was a way to solve my problem.

The problem is I'm trying to create an index for the entire list of players in the NBA, not just for the players on my team. My goal is to cross-sectionally compare the players with those players shooting below average as having a negative value and those players shooting above average to have a positive value, all weighted by the number of attempts.
 
Regarding \(\displaystyle \dfrac{a + c}{b + d}\):

If b=d then it is the arithmetic mean of a/b and c/d.
If a=c then it is the harmonic mean of a/b and c/d.
 
LOL Why I said "generally." Can't be too careful around here.
 
The team ratio equals the sum of the products of (a) the success ratio of each player who is on the team and has made at least one attempt and (b) the ratio of that player's attempts to the total attempts made by all the players on the team. It is very easy to compute using excel.

The league ratio equals the sum of the products of (a) the success ratio of each player who is in the league and has made at least one attempt and (b) the ratio of that player's attempts to the total attempts made by all players in the league. It too is very easy to compute using excel.

Because the products are truncated, the sums may not be exactly equal, but they will be very close.
 
I'm not too sure what you mean by the success ratio of each player. Are you referring to FT%? Also remember I'm talking about attempts per game, not total attempts.

I'm also not too sure if what you describe is what I need. The team % is calculated as the sum of shots made divided by the sum of shots attempted for all players on my team. What I need is some type of metric, calculated for each player in the entire league, such that the sum of individual metrics for each player on my team would inform me how much better my team % is compared to the league mean. This metric would have to account for FT% and the magnitude of shot attempts.

Thanks again.
 