# loading packages
library(mlogit)
library(tidyverse)
library(stargazer)
library(rms)
library(readxl)

Ordered Logit

# loading data for ordered logit model
ordered<-data.frame(read_excel('ordered_health.xlsx'))
# checking the unique values for health status
unique(ordered$healthstatus)
## [1] "good"      "excellent" "fair"

We have to assign numbers to the different levels of health status before we can estimate the ordered logit model. We will let fair=1, good=2, and excellent=3. Order is important so think carefully about how you assign numbers to the responses.

# this is an alterantive to using the ifelse function multiple times
ordered$healthnumber<- as.numeric(
  recode(ordered$healthstatus, 
         'fair'='1', 
         'good'='2', 
         'excellent'='3'
         )
  )

We estimate the orderd logit model using the lrm() function.

# estimating the ordered logit model
# you need to rms package for this

ologit<-lrm(healthnumber~age+logincome+numberdiseases, data = ordered)

# viewing the output and coverting the coefficients to odds ratios
# the coefficients for 'y>=2' and 'y>=3' can be ignored
stargazer(ologit, coef = list(exp(ologit$coefficients)), type='text')
## 
## ==========================================
##                    Dependent variable:    
##                ---------------------------
##                       healthnumber        
## ------------------------------------------
## y> =2                   4.039***          
##                          (0.206)          
##                                           
## y> =3                    0.386*           
##                          (0.205)          
##                                           
## age                     0.971***          
##                          (0.002)          
##                                           
## logincome               1.328***          
##                          (0.023)          
##                                           
## numberdiseases          0.946***          
##                          (0.004)          
##                                           
## ------------------------------------------
## Observations              5,574           
## R2                        0.148           
## chi2               740.388*** (df = 3)    
## ==========================================
## Note:          *p<0.1; **p<0.05; ***p<0.01

We compare the odds ratios to 1. A person is less likely to report good or excellent health as age and the number of diseases increase. A person is more likely to report good or excellent health when their income increases.

Multinomial Logit

This data describes a person’s choice to fish at a pier, on a charter boat, on the beach, or on private land. The variables price and catch describe the price and catch rate of the different options. We call these alternative specific variables. The income variable describes the person making the choice. This is an alternative invariant variable.

# loading the data
FishingWide <- data.frame(read_excel('multinomial.xlsx')) 

# viewing the first few rows
head(FishingWide)
##      mode price.beach price.pier price.private price.charter catch.beach
## 1 charter     157.930    157.930       157.930       182.930      0.0678
## 2 charter      15.114     15.114        10.534        34.534      0.1049
## 3 private     161.874    161.874        24.334        59.334      0.5333
## 4    pier      15.134     15.134        55.930        84.930      0.0678
## 5 private     106.930    106.930        41.514        71.014      0.0678
## 6 charter     192.474    192.474        28.934        63.934      0.5333
##   catch.pier catch.private catch.charter   income
## 1     0.0503        0.2601        0.5391 7.083332
## 2     0.0451        0.1574        0.4671 1.250000
## 3     0.4522        0.2413        1.0266 3.750000
## 4     0.0789        0.1643        0.5391 2.083333
## 5     0.0503        0.1082        0.3240 4.583332
## 6     0.4522        0.1665        0.3975 4.583332

This data is in wide form. This means we have one observation per choice. The mlogit function we will use below requires the data to be in long form. Since there are four alternatives, there will be four rows per individual in the long form data. We can reshape the data ising the dfidx() function.

FishingLong <- FishingWide %>%
  dfidx(shape='wide', # this tells R the data are currently in wide form
        choice='mode', # this is the variable that describes the choice
        varying=2:9) # these are the variables that are alternative specific

# viewing the data
head(FishingLong)
## ~~~~~~~
##  first 10 observations out of 4728 
## ~~~~~~~
##     mode   income   price  catch    idx
## 1  FALSE 7.083332 157.930 0.0678 1:each
## 2   TRUE 7.083332 182.930 0.5391 1:rter
## 3  FALSE 7.083332 157.930 0.0503 1:pier
## 4  FALSE 7.083332 157.930 0.2601 1:vate
## 5  FALSE 1.250000  15.114 0.1049 2:each
## 6   TRUE 1.250000  34.534 0.4671 2:rter
## 7  FALSE 1.250000  15.114 0.0451 2:pier
## 8  FALSE 1.250000  10.534 0.1574 2:vate
## 9  FALSE 3.750000 161.874 0.5333 3:each
## 10 FALSE 3.750000  59.334 1.0266 3:rter
## 
## ~~~ indexes ~~~~
##    id1     id2
## 1    1   beach
## 2    1 charter
## 3    1    pier
## 4    1 private
## 5    2   beach
## 6    2 charter
## 7    2    pier
## 8    2 private
## 9    3   beach
## 10   3 charter
## indexes:  1, 2

We now have four rows for each individual. The mode variable now identifies the choice the person made. The id2 variable indices the different choices that were presented.

We estimate the multinomial logit model using the mlogit function. Since we do not include alternative specific variable in the multinomial logit model, we put “1” to the left of the “|”. Alternative invariant variables go to the right of the “|”. We indicate the reference group using the reflevel option. Since we want to use the logistic model we set probit equal to FALSE. If we set this equal to TRUE R would estimate the multinomial probit model.

# estimating the model
mlogit<-mlogit(mode ~ 1 | income, 
               data=FishingLong, 
               reflevel="charter",
               probit = FALSE)
# displaying the results and showing relative risk ratios.
stargazer(mlogit, coef=list(exp(mlogit$coefficients)), type='text')
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                                mode            
## -----------------------------------------------
## (Intercept):beach              0.262           
##                               (0.195)          
##                                                
## (Intercept):pier             0.590***          
##                               (0.178)          
##                                                
## (Intercept):private          0.548***          
##                               (0.136)          
##                                                
## income:beach                 1.032***          
##                               (0.042)          
##                                                
## income:pier                  0.894***          
##                               (0.044)          
##                                                
## income:private               1.132***          
##                               (0.028)          
##                                                
## -----------------------------------------------
## Observations                   1,182           
## R2                             0.014           
## Log Likelihood              -1,477.151         
## LR Test                 41.145*** (df = 6)     
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

Again, we compare the coefficients to 1. Individuals are more likely to fish at the beach or on private land compared to fishing on a charter boat as income increases. Increases in income make an individual less likely to fish at a pier relative to a charter boat.

Conditional Logit

The conditional logit model is also estiamted using the mlogit function. The only difference between the multinomial and conditional models is that the conditional logit includes alternative specific variables. We add these to the left side of the “|”.

# estimating the model
clogit<-mlogit(mode ~ price+catch | income, 
               data=FishingLong, 
               reflevel="charter",
               probit = FALSE)
# displaying the results and showing relative risk ratios.
stargazer(clogit, coef=list(exp(clogit$coefficients)), type='text')
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                                mode            
## -----------------------------------------------
## (Intercept):beach              0.184           
##                               (0.224)          
##                                                
## (Intercept):pier              0.400*           
##                               (0.207)          
##                                                
## (Intercept):private           0.311*           
##                               (0.159)          
##                                                
## price                        0.975***          
##                               (0.002)          
##                                                
## catch                        1.430***          
##                               (0.110)          
##                                                
## income:beach                 1.034***          
##                               (0.050)          
##                                                
## income:pier                  0.910***          
##                               (0.050)          
##                                                
## income:private               1.131***          
##                               (0.029)          
##                                                
## -----------------------------------------------
## Observations                   1,182           
## R2                             0.189           
## Log Likelihood              -1,215.138         
## LR Test                 565.171*** (df = 8)    
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

The coefficients for income are interpreted as they were in the multinomial logit model except we are now holding constant the price and catch rate.

Notice we only have one coefficient for price and catch. This is because these variables describe the alternatives. Individuals are less likely to choose an option when it becomes more expensive and are more likely to choose an option when the catch rate increases.