Not sure what statistical formula to use (election results)

green66

New member
Joined
May 27, 2006
Messages
1
I have compiled some previous election data showing:

election year, D(emocratic party) candidate, R(epublican party) candidate, D percentage, R percentage, D yes, R yes, D win, R win

Basically, I'm tracking the year of the election, the name of each candidate (Dem/Rep), the percentage of the vote each got (Dem/Rep), whether each was supportive of an issue (1 for yes, 0 for no) (Dem/Rep), and whether each won (two columns, 1 for yes and 0 for no).

An example from the governors race in New York with my actual data:
Code:
Year  R              D               R %    D %    R DP  D DP  R WIN  D WIN
2002  George Pataki  H. Carl McCall  49.4   33.5   1     0     1      0
1998  George Pataki  Pete Vallone    54.32  33.16  1     1     1      0
I am trying to figure out a correlation that basically answers this question:

If a candidate is supportive of issue X (1 in the issue column), do they have a statistically better chance of being elected (1 in the win column), based on historical data?

I have the voter percentage in my xls to work with if needed, but I was hoping to not use it since ultimately a win or loss is all that matters for my project (and not by how much)

I've searched for the past five hours through my old intro stats material and am not sure how to work with this type of binary data (yes or no, as opposed to actual interval data). Can anyone help? If Excel 2003 does not have a function for binary data, I can go to a lab that has SPSS and run the analysis there.

Thanks.
 
Hi. I recommend a logit analysis (logistic regression), which is designed for a binary outcome such as win/lose. This link seems to do a good job of describing the logit model even though it is documentation for the SHAZAM software. Its example is an analysis of voting for school decisions. This link describes the SPSS Logistic Regression procedure you can use.

The simplest model to capture your question would be this. All variables have the values 1=Yes and 0=No. The dependent variable could be either Dem Wins or Rep Wins. The independent variables would be Dem Supports Issue, Rep Supports Issue and Both Dem and Rep Support Issue. You'll test whether the Issues variables are statistically significant. If the interaction variable Both Dem and Rep Support Issue is not statistically significant, you can drop it from the model.
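If it helps to see the mechanics, here is a minimal sketch of that model in plain Python. It is only an illustration under stated assumptions: the `races` data is made up (it is not your spreadsheet), the function names `design_row`, `solve` and `fit_logit` are my own, and the fit uses a basic Newton-Raphson loop rather than whatever SPSS does internally. The dependent variable here is Dem Wins.

```python
import math

# Hypothetical toy data (NOT the poster's data): one tuple per race,
# (dem_supports, rep_supports, dem_wins), all coded 1 = yes, 0 = no.
races = (
    [(0, 0, 1)] * 4 + [(0, 0, 0)] * 6 +
    [(1, 0, 1)] * 7 + [(1, 0, 0)] * 3 +
    [(0, 1, 1)] * 2 + [(0, 1, 0)] * 8 +
    [(1, 1, 1)] * 5 + [(1, 1, 0)] * 5
)

def design_row(d, r):
    # Columns: intercept, Dem supports, Rep supports, both support (interaction).
    return [1.0, float(d), float(r), float(d * r)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logit(X, y, iterations=25):
    """Maximum likelihood logit fit by Newton-Raphson.

    Returns (coefficients, standard errors).
    """
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for row, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, row))))
            for i in range(k):
                grad[i] += (yi - p) * row[i]
                for j in range(k):
                    info[i][j] += p * (1 - p) * row[i] * row[j]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    # Standard errors: square roots of the diagonal of the inverse
    # information matrix from the last iteration (converged by then).
    se = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        se.append(math.sqrt(solve(info, unit)[i]))
    return beta, se

X = [design_row(d, r) for d, r, _ in races]
y = [win for _, _, win in races]
beta, se = fit_logit(X, y)

names = ["intercept", "dem_supports", "rep_supports", "both_support"]
for name, b, s in zip(names, beta, se):
    z = b / s                                    # Wald z statistic
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    print(f"{name:13s} coef={b:+.3f} se={s:.3f} z={z:+.2f} p={p_value:.3f}")
```

A small p-value on `dem_supports` or `rep_supports` would suggest that supporting the issue is associated with winning; a large p-value on `both_support` is the case where the interaction term can be dropped. One caveat: if some combination of the variables contains only wins or only losses, the estimates will not settle down, and this simple loop will not warn you about it.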

I'd run the model twice, once for each of the two dependent variables, to see if the results are similar after taking into account that the estimated coefficients should have the opposite signs. The Issues variables should have the same statistical significance for the two models. If they don't, I'd have to think about why.
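For what it's worth, here is why the signs should flip, as a short derivation. It assumes every race in the data has exactly one winner among the two candidates, so that Rep Wins = 1 - Dem Wins in every row (that is my assumption, not something I can check from the two sample rows). Here x stands for the row of independent variables and beta for the coefficients of the Dem Wins model.

```latex
\begin{align*}
P(\text{DemWins}=1 \mid x) &= \frac{1}{1+e^{-x^{\top}\beta}} \\
P(\text{RepWins}=1 \mid x) &= 1 - P(\text{DemWins}=1 \mid x)
  = \frac{e^{-x^{\top}\beta}}{1+e^{-x^{\top}\beta}}
  = \frac{1}{1+e^{-x^{\top}(-\beta)}}
\end{align*}
```

So the Rep Wins model is the same logit model with coefficients -beta. Both models have the same likelihood, which means the estimated coefficients should be exact negatives of each other and the standard errors and p-values should match. If the two runs disagree on significance, the first thing I'd check is whether some rows have both win columns equal to 0 or both equal to 1.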

Have a look at the documentation. If you have any questions, post them here.

Have fun!
 