Question:
Why do you need to use a dummy variable in statistical analysis?
anonymous
2009-03-18 18:58:10 UTC
Hi. I'm a bit confused about why people use dummy variables when running a statistical analysis. In my data I have race as a variable. It was originally defined with the following value labels:

1=Asian/Pacific Islander
2=Hispanic
3=Black
4=White
5=American Indian

I recoded them into dummy variables. I know how to do the procedure, but I'm unsure why I cannot just run the original variable instead of creating more data. Is there a difference between running the dummy variables and running the original variable, which already accounts for all ethnicities?

As far as I can tell, the only benefit of dummy coding is that you can look at just some of the groups if you want, rather than being forced to look at all of them.

Any ideas?

Background info:

I used SPSS. I know how to recode to dummy variables; I'm just curious why this is done.
Three answers:
Kal'El
2009-03-18 19:15:36 UTC
Hey -

I'm a little confused, but if you help me sort out your question, I hope I can sort out an answer for you. =)

Labelling ethnicities 1 - 5 IS turning them into dummy variables, isn't it? That is, the numbers themselves have no meaning (e.g., a rank-ordering or value). When you say you recoded them AGAIN into dummy variables, what are you saying you did with them? The only thing I can imagine is creating 4 columns and placing a one in the single appropriate column and zeroes in all the others (the ethnicity you're comparing against, probably Caucasians, would just get all zeroes). Is this what you're describing? If so, this is generally used for multiple regression...

If I'm totally off-base, please explain to me what your 5 dummy variables now look like instead of 1 = Asian... 5 = American Indian. I'm pretty sure that if I understand what you did, I can help explain why. =)
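
The coding described in this answer (four 0/1 columns, with the comparison group getting all zeroes) can be sketched in a few lines of Python. This is only an illustration, not SPSS syntax: the sample values, the helper name dummy_code, and the choice of 4 = White as the reference group are assumptions made for the example.

```python
# Value labels copied from the question.
LABELS = {1: "Asian/Pacific Islander", 2: "Hispanic", 3: "Black",
          4: "White", 5: "American Indian"}
REFERENCE = 4  # assumed reference group; any one level could be used


def dummy_code(codes, reference=REFERENCE, levels=(1, 2, 3, 4, 5)):
    """Return one dict of 0/1 indicators per observation.

    The reference level gets no column of its own, so an observation
    from the reference group has zeroes in every column.
    """
    kept = [lv for lv in levels if lv != reference]
    return [{lv: int(code == lv) for lv in kept} for code in codes]


race = [1, 4, 3, 2, 5, 4]  # made-up sample of race codes
rows = dummy_code(race)

# Print each observation's label next to its four indicator values.
for code, row in zip(race, rows):
    print(LABELS[code], [row[lv] for lv in sorted(row)])
```

Each observation ends up with at most a single 1, and the reference group is identified by having none.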
Jeremy Miles
2009-03-19 15:13:01 UTC
I don't think the other answer is correct. These are not coded as dummies when they are coded 1-5, although if you use the GLM procedure in SPSS, it can do the dummy coding for you.

You code them as dummies because if you enter them as they are, you assume that the scores mean something and are in an order, i.e. that Hispanic is higher than Asian, and White is higher than Black.

By making dummies, you compare the means of the groups, which is what you are interested in. I'd suggest you read a book on regression to help you understand this. One example would be Applying Regression and Correlation by Jeremy Miles and Mark Shevlin. :)
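
To make the difference concrete, here is a minimal Python sketch that fits the same outcome both ways. Everything in it is assumed for illustration: the data are made up, White (code 4) is arbitrarily taken as the reference group, and the small ols helper is a hand-rolled least-squares solver rather than anything SPSS does internally.

```python
from statistics import mean

# Made-up data: (race code, outcome score), two observations per group.
data = [(1, 10.0), (1, 12.0), (2, 20.0), (2, 22.0), (3, 14.0),
        (3, 16.0), (4, 30.0), (4, 32.0), (5, 18.0), (5, 20.0)]
codes = [c for c, _ in data]
y = [v for _, v in data]


def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination. Suitable for tiny examples only."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] +
         [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# 1) Race entered as one numeric predictor: intercept + slope * code.
b0, b1 = ols([[1.0, float(c)] for c in codes], y)

# 2) Race entered as four dummies, with code 4 as the reference group.
others = [1, 2, 3, 5]
coefs = ols([[1.0] + [float(c == lv) for lv in others] for c in codes], y)
intercept, effects = coefs[0], dict(zip(others, coefs[1:]))

group_means = {lv: mean(v for c, v in data if c == lv) for lv in range(1, 6)}

# Columns: code, observed group mean, numeric-code fit, dummy fit.
for lv in range(1, 6):
    dummy_fit = intercept + effects.get(lv, 0.0)
    print(lv, group_means[lv], round(b0 + b1 * lv, 2), round(dummy_fit, 2))
```

With the dummies, the intercept is the reference group's mean and each coefficient is that group's mean minus the reference mean, so the fitted values reproduce the group means. The single numeric predictor instead forces the five groups onto one straight line in the order 1 to 5, which generally does not match the group means.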
wichern
2016-09-10 07:56:28 UTC
You're looking at a higher-order ANOVA, since you are evaluating separate IVs' effects on a DV, and you typically want to examine the interaction effects as well as the main effect of each (friends/age/gender). Correlation/regression is what you would use when you want to examine how 2 IVs work together (or don't) to affect the dependent variable. You want to use ANOVA here, since you are examining the separate and joint effects of 3 different independent variables on materialism.


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.