For this example, we'll be using some made-up data about asthma symptoms. The data include three variables: asthma_sx, which gives a rating of severity of symptoms; mutation_present, which indicates whether each patient has a genetic mutation associated with asthma (yes or no); and hazards, which is a rating of the presence of environmental hazards in the patient's home.

    # This is the code to generate the made-up data
    mutation_present <- factor(sample(c("Y", "N"), size = n, replace = TRUE))
    5*(hazards - mean(hazards)), 3 + 3*(hazards - mean(hazards)))
    asthma <- data.frame(asthma_sx, mutation_present, hazards)

[Table: first five cases in the data]

To estimate a linear model in R, use the function lm. It's in the stats package, which is included in base R and loaded at the beginning of every R session, so you don't need to install anything or run library(stats) to use its functions. They're all available by default when you open R or RStudio.

Important note: The lm function pays attention to the types of variables you include (e.g. numeric vs. factor) as they are encoded in your data frame. Factors will automatically be dummy-coded with the first level as the reference group. If you have categorical variables in your model, check first that they are correctly labeled as factors in your data with the str command.

    lm(formula = asthma_sx ~ hazards * mutation_present, data = asthma)

They just provide a quick visual summary.

The first coefficient will always be the intercept. The interpretation of the intercept depends on what predictors are in the model and how they are coded, but it is always the outcome estimate when all predictors = 0. In this case, that means we would predict an asthma severity score of 48.98 for a patient with a hazard score of 0 (since we centered hazards, that is a patient with average environmental hazards) and without the genetic mutation (mutation_present = "N"). The significance test tells us that this estimate is significantly different from zero. In this situation, as is often the case, the estimate for the intercept is not particularly meaningful: the ratings of asthma severity run from 31.89 to 73.64, so pretty much any estimate would be significantly different from zero.

The next coefficient in our model is hazards. Because we have a dummy-coded categorical variable in the model, this is the estimate for the effect of environmental hazards for the reference group, i.e. those without the genetic mutation. We can interpret it as follows in APA style: For each unit increase in the presence of environmental hazards, there is an estimated 0.7 unit increase in patients' asthma symptom severity for the group without the genetic predisposition, t(96)=2.51, p=.014.

The next coefficient is mutation_presentY. Because lm uses dummy coding for factors by default, R prints the name of the variable (mutation_present) with the level being tested (Y) against the reference group. Here we only have two levels in our factor, but if there were more, we would see a coefficient for each level other than the reference group.

Because there is an interaction between mutation_present and hazards in our model, the estimate for mutation_present is for patients with hazards=0, which is patients with an average amount of environmental hazards, since we centered hazards before estimating the model. We can interpret this coefficient as follows: For patients with an average amount of environmental hazards, there is a 3.67 unit increase in asthma symptoms associated with having the genetic mutation, t(96)=3.58, p<.001.
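The workflow described in this post (check that the categorical variable is a factor, center the continuous predictor, fit the model with lm, then read the coefficient table) can be sketched end to end as follows. This is only an illustration under stated assumptions: the seed, the sample size of 100 (chosen so the residual degrees of freedom come out to 96), the hazards_raw variable, and the effect sizes are all made up here and are not the post's original simulation, so the estimates will not match the numbers reported in the text.

```r
# Stand-in data: every numeric value below is an assumption for illustration,
# not the original simulation from this post.
set.seed(1)
n <- 100
hazards_raw <- rnorm(n, mean = 5, sd = 2)          # hypothetical raw hazard rating
mutation_present <- factor(sample(c("N", "Y"), size = n, replace = TRUE))

# Center hazards so that hazards = 0 means "average environmental hazards".
hazards <- hazards_raw - mean(hazards_raw)

# Hypothetical outcome with main effects, an interaction, and noise.
asthma_sx <- 50 + 1 * hazards + 4 * (mutation_present == "Y") +
  0.5 * hazards * (mutation_present == "Y") + rnorm(n, sd = 5)
asthma <- data.frame(asthma_sx, mutation_present, hazards)

# Check variable types: mutation_present should be a factor.
str(asthma)
# The first level is the reference group under lm's default dummy coding.
levels(asthma$mutation_present)

# Fit the model with an interaction between hazards and mutation_present.
model <- lm(asthma_sx ~ hazards * mutation_present, data = asthma)

# Coefficient table: estimate, standard error, t value, and p value per term.
coefs <- coef(summary(model))
print(coefs)

# Residual degrees of freedom, the number reported in t(df).
df.residual(model)
```

The row names of the coefficient table show the pattern discussed above: the factor's name is pasted together with the non-reference level (mutation_presentY), and the interaction term appears as hazards:mutation_presentY.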