# loading packages
library(readxl)
library(tidyverse)
library(stargazer)
library(AER)
# loading data
wagedata<-data.frame(read_excel('Wages.xlsx')) %>%
  mutate(expersq=exper^2, # squared term for the polynomial regression
         exper_female=exper*female # interaction term
         )

I will model the relationship between wages and experience using the following regression equations:

  1. \(wage=\beta_0+\beta_1exper+\epsilon\)
    • This assumes \(\Delta E[wage]/ \Delta exper\) is constant. For example, the first year of experience will have the same impact on wages as the 25th.
  2. \(wage=\beta_0+\beta_1exper+\beta_2exper^2+\epsilon\)
    • This allows \(\Delta E[wage]/ \Delta exper\) to depend on the level of experience. For example, the first year of experience may have a larger impact on wages than the 10th. This functional form is also flexible because it allows for four possible relationships between the independent and dependent variables. As we increase experience, wages could: increase at an increasing rate (\(\beta_1>0, \beta_2>0\)), increase at a decreasing rate (\(\beta_1>0, \beta_2<0\)), decrease at a decreasing rate (\(\beta_1<0,\beta_2>0\)), or decrease at an increasing rate (\(\beta_1<0, \beta_2<0\)). For any level of experience, \(\Delta E[wage]/ \Delta exper=\beta_1+2*\beta_2*exper\).
  3. \(wage=\beta_0+\beta_1ln(exper)+\epsilon\)
    • \(\beta_1\) is no longer \(\Delta E[wage]/ \Delta exper\). Since we have included the natural log of experience, \(\Delta wage \approx (\beta_1/100)*\%\Delta exper\) for small percentage changes in experience.
  4. \(ln(wage)=\beta_0+\beta_1exper+\epsilon\)
    • Since the dependent variable is now \(ln(wage)\), \(\%\Delta wage \approx \beta_1*100*\Delta exper\).
  5. \(ln(wage)=\beta_0+\beta_1ln(exper)+\epsilon\)
    • \(\beta_1\) is now an elasticity because we have \(ln(wage)\) and \(ln(exper)\): \(\%\Delta wage / \%\Delta exper=\beta_1\).
  6. \(wage=\beta_0+\beta_1exper+\beta_2female+\beta_3exper*female+\epsilon\)
    • This allows for different slopes and intercepts for males and females. When female=0, the intercept is \(\beta_0\) and \(\Delta E[wage]/ \Delta exper=\beta_1\). When female=1, the intercept is \(\beta_0+\beta_2\) and \(\Delta E[wage]/ \Delta exper=\beta_1+\beta_3\). \(\beta_2\) is the difference in expected wages between females and males who each have zero experience, and \(\beta_3\) is the difference in their slopes.
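The squared and interaction terms in equations 2 and 6 can also be written directly in the model formula instead of being built as separate columns. The sketch below checks that the two approaches give the same estimates. It uses simulated data: the data frame `sim` and the coefficients used to generate it are made up for illustration and are not the Wages.xlsx sample.

```r
# Simulated data for illustration only; not the Wages.xlsx sample
set.seed(1)
n <- 200
sim <- data.frame(exper  = sample(1:40, n, replace = TRUE),
                  female = rbinom(n, 1, 0.5))
sim$wage <- 4 + 0.3 * sim$exper - 0.006 * sim$exper^2 -
  1.5 * sim$female - 0.05 * sim$exper * sim$female + rnorm(n)

# Precomputed columns, as in the data-loading step
sim$expersq <- sim$exper^2
sim$exper_female <- sim$exper * sim$female

# Quadratic: manual column vs. I() inside the formula
quad_manual  <- lm(wage ~ exper + expersq, data = sim)
quad_formula <- lm(wage ~ exper + I(exper^2), data = sim)

# Interaction: manual column vs. exper * female,
# which expands to exper + female + exper:female
int_manual  <- lm(wage ~ exper + female + exper_female, data = sim)
int_formula <- lm(wage ~ exper * female, data = sim)

# Coefficient names differ, so compare the unnamed values
same_quad <- all.equal(unname(coef(quad_manual)), unname(coef(quad_formula)))
same_int  <- all.equal(unname(coef(int_manual)), unname(coef(int_formula)))
```

Both `same_quad` and `same_int` should come back `TRUE`. The formula versions save the `mutate` step, while the precomputed columns give shorter row labels in the stargazer table.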
# estimating regressions and combining results

# Regression 1
reg1<- lm(wage~exper, data=wagedata)
# Regression 2
reg2<-lm(wage~exper+expersq, data=wagedata)
# Regression 3
reg3<-lm(wage~log(exper), data=wagedata)
# Regression 4
reg4<-lm(log(wage)~exper, data=wagedata)
# Regression 5
reg5<-lm(log(wage)~log(exper), data=wagedata)
# Regression 6
reg6<-lm(wage~exper+female+exper_female, data=wagedata)


# combining output
stargazer(reg1,reg2,reg3,reg4,reg5,reg6, type='text', omit.stat=c('f','ser'))
## 
## ====================================================================
##                                Dependent variable:                  
##              -------------------------------------------------------
##                         wage                 log(wage)       wage  
##                (1)       (2)      (3)      (4)      (5)       (6)   
## --------------------------------------------------------------------
## exper        0.031*** 0.298***           0.004**           0.054*** 
##              (0.012)   (0.041)           (0.002)            (0.015) 
##                                                                     
## expersq               -0.006***                                     
##                        (0.001)                                      
##                                                                     
## log(exper)                      0.742***          0.117***          
##                                 (0.148)           (0.021)           
##                                                                     
## female                                                     -1.547***
##                                                             (0.482) 
##                                                                     
## exper_female                                               -0.055** 
##                                                             (0.022) 
##                                                                     
## Constant     5.373*** 3.725***  4.120*** 1.549*** 1.343*** 6.158*** 
##              (0.257)   (0.346)  (0.388)  (0.037)  (0.056)   (0.342) 
##                                                                     
## --------------------------------------------------------------------
## Observations   526       526      526      526      526       526   
## R2            0.013     0.093    0.046    0.012    0.055     0.136  
## Adjusted R2   0.011     0.089    0.044    0.011    0.053     0.131  
## ====================================================================
## Note:                                    *p<0.1; **p<0.05; ***p<0.01

The estimates in column 1 are from regression (1) above. On average, each additional year of experience increases wages by $0.03. \(exper^2\) was included in column 2, so \(\Delta \widehat{wage}/ \Delta exper=0.298-0.012exper\). You can set exper to different values to see how wages evolve at each level of experience. For example, the slope when exper=1 is 0.298-0.012*1=0.286 \(\approx\) 0.29. This says that, on average, one additional year of experience increases wages by about $0.29 for a worker who has one year of experience.
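The slope calculation for column 2 can be repeated for any level of experience. The sketch below uses the rounded coefficients copied from the table; the function name `marginal_effect` and the experience levels are illustrative choices. Working from `coef(reg2)` instead would avoid the rounding.

```r
# Rounded estimates copied from column 2 of the table
b1 <- 0.298   # coefficient on exper
b2 <- -0.006  # coefficient on expersq

# Slope of predicted wage with respect to experience: b1 + 2*b2*exper
marginal_effect <- function(exper) b1 + 2 * b2 * exper

# Slopes at a few illustrative experience levels
slopes <- marginal_effect(c(1, 10, 25))
print(slopes)

# Experience level at which the slope is zero (peak of the fitted parabola)
turning_point <- -b1 / (2 * b2)
print(turning_point)
```

With these rounded values the slope is 0.286 at one year, 0.178 at ten years, and -0.002 at 25 years, so predicted wages peak at roughly 24.8 years of experience.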

The natural log of experience was included in column 3. On average, a 1% increase in experience increases wages by 0.742/100 \(\approx\) $0.0074. This approximation is only accurate for small percentage changes: a 100% increase in experience (a doubling) raises predicted wages by 0.742*ln(2) \(\approx\) $0.51, not $0.742. The natural log of wage was used in column 4. On average, a one-year increase in experience increases wages by 0.004*100=0.4%.
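The percentage interpretations for columns 3 and 4 are first-order approximations. As a rough check, the sketch below compares them with the changes implied exactly by the fitted equations, using the rounded coefficients from the table. The function names and the set of percentage increases are illustrative choices, not part of the original analysis.

```r
# Rounded estimates copied from the table
b_levellog <- 0.742  # column 3: wage on log(exper)
b_loglevel <- 0.004  # column 4: log(wage) on exper

# Level-log: change in predicted wage for a p% increase in experience
approx_levellog <- function(p) b_levellog / 100 * p
exact_levellog  <- function(p) b_levellog * log(1 + p / 100)

# Log-level: % change in predicted wage for one more year of experience
approx_loglevel <- 100 * b_loglevel
exact_loglevel  <- 100 * (exp(b_loglevel) - 1)

# Side-by-side comparison for small and large increases in experience
comparison <- data.frame(
  pct_increase = c(1, 10, 100),
  approx = approx_levellog(c(1, 10, 100)),
  exact  = exact_levellog(c(1, 10, 100))
)
print(comparison)
print(c(approx = approx_loglevel, exact = exact_loglevel))
```

The two level-log columns differ by less than 0.0001 for a 1% increase, but the gap grows with the size of the change. For the log-level model the coefficient is small enough that the approximate and exact percentage changes are nearly identical.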

We estimated the elasticity of wages with respect to experience in column 5. On average, a 1% increase in experience increases wages by 0.12%.

Lastly, column 6 allows for different slopes and intercepts for males and females. On average, females with no experience earn $1.55 less per hour than males with no experience. For males, each additional year of experience increases wages by $0.054. For females, each additional year of experience changes wages by 0.054-0.055=-0.001 \(\approx\) $0, on average.
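The two regression lines implied by column 6 can be written out from the reported coefficients. The sketch below uses the rounded values from the table; the helper `predict_wage` and the experience levels 0, 10, and 20 are illustrative assumptions.

```r
# Rounded estimates copied from column 6 of the table
b0 <- 6.158   # Constant
b1 <- 0.054   # exper
b2 <- -1.547  # female
b3 <- -0.055  # exper_female

# Intercept and slope implied for each group
lines_by_group <- data.frame(
  group     = c("male", "female"),
  intercept = c(b0, b0 + b2),
  slope     = c(b1, b1 + b3)
)
print(lines_by_group)

# Predicted hourly wage at a given experience level and female indicator
predict_wage <- function(exper, female) {
  b0 + b1 * exper + b2 * female + b3 * exper * female
}

# Predicted male-female wage gap at 0, 10, and 20 years of experience
gap <- predict_wage(c(0, 10, 20), 0) - predict_wage(c(0, 10, 20), 1)
print(gap)
```

Because the female slope is close to zero while the male slope is positive, the predicted gap widens with experience: about $1.55 at zero years, $2.10 at ten years, and $2.65 at twenty years.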

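The figure below plots wages against experience along with the fitted regression lines from columns 1-5. Among the models with wage as the dependent variable (columns 1-3), the polynomial has the highest \(R^2\) (0.093). The \(R^2\) values in columns 4 and 5 measure fit for \(ln(wage)\) rather than wage, so they are not directly comparable. Based on the fitted values and \(R^2\), I would choose the polynomial functional form.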

ggplot(wagedata, aes(x=exper, y=wage)) +
  geom_point() +
  geom_line(aes(y=reg1$fitted.values, color='Linear')) +
  geom_line(aes(y=reg2$fitted.values, color='Polynomial')) +
  geom_line(aes(y=reg3$fitted.values, color='ln(exper)')) +
  geom_line(aes(y=exp(reg4$fitted.values), color='ln(wage)')) +
  geom_line(aes(y=exp(reg5$fitted.values), color='ln(wage),ln(exper)')) +
  theme_minimal() +
  labs(x='Experience',
       y='Wage')

The figure below plots wages against experience along with the fitted values from column 6, which allows for different slopes and intercepts. Even though we only estimated one equation, the functional form produces separate regression lines for males and females.

ggplot(wagedata, aes(x=exper, y=wage, color=as.factor(female), group=as.factor(female))) +
  geom_point() +
  geom_line(aes(y=reg6$fitted.values)) +
  theme_minimal() +
  labs(x='Experience',
       y='Wage',
       color='Female')