4 min read

Estimating seed production with mixed effects

Warning: I need to see what the “seeds” here are—they may only be home pot seeds, in which case I need to adjust the intercept (and perhaps its variance) by the fraction dispersing from the dispersal analyses.

Returning to parameter estimation for density dependent seed production… This is based primarily on the post from 5 June 2017. The main change is that I want to use mixed effects to estimate the variance due to a random effect of generation, as well as a random effect of replicate within generation.

Note that I could also estimate generation as a fixed effect to recreate the observed sequence of temporal variability in seed production.

I will use glmer in the lme4 package. To get the nesting, the reps need to have unique values within each gen; to do that I create a genID variable.

The prepared data are in Ler_fecundity, precalculated from a munge script. See “Updated Ler fecundity data” on 10/23/17.

OK, now load the library and try fitting a model. The first lesson we learn is that “quasi” families cannot be used.

library(lme4)
seeds_DD <- try(glmer(Seedlings ~ log(Nm1) + (1 | Gen) + (1 | GenID), 
                  data = Ler_fecundity, family = quasipoisson))
print(seeds_DD, TRUE)
[1] "Error in lme4::glFormula(formula = Seedlings ~ log(Nm1) + (1 | Gen) +  : \n  \"quasi\" families cannot be used in glmer\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in lme4::glFormula(formula = Seedlings ~ log(Nm1) + (1 | Gen) +     (1 | GenID), data = Ler_fecundity, family = quasipoisson): "quasi" families cannot be used in glmer>
seeds_DD <- try(glmer(Seedlings ~ log(Nm1) + (1 | Gen) + (1 | GenID), 
                  data = Ler_fecundity, family = poisson))
summary(seeds_DD) 
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: poisson  ( log )
Formula: Seedlings ~ log(Nm1) + (1 | Gen) + (1 | GenID)
   Data: Ler_fecundity

     AIC      BIC   logLik deviance df.resid 
 21339.0  21354.9 -10665.5  21331.0      397 

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-20.5930  -3.6700  -0.3992   3.0677  26.8679 

Random effects:
 Groups Name        Variance Std.Dev.
 GenID  (Intercept) 0.48458  0.6961  
 Gen    (Intercept) 0.07978  0.2825  
Number of obs: 401, groups:  GenID, 134; Gen, 5

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  4.24444    0.14157   29.98   <2e-16 ***
log(Nm1)     0.23680    0.00245   96.66   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
         (Intr)
log(Nm1) -0.051

The estimates for the intercept and the effects of density are in the same ballpark as the prior fixed effects analysis (recall that to get the Gompertz parameter we need to subtract 1 from the coefficient on log(Nm1)). Interestingly, the among-rep variation is much larger than the among-generation variation. Gazing at the fixed effects coefficients from the prior analysis, this seems right.

Now, about the overdispersion. This post describes how to calculate the Poisson overdispersion parameter. There is also a function glmer.nb() for estimating using a negative binomial distribution; recall that is what we are using to generate quasipoisson random numbers. But back on 10/23/17 I found a paper saying that using negative binomial directly gives a quadratic mean-variance relationship, whereas the data are closer to linear.

So here’s the overdispersion calculation:

sum(residuals(seeds_DD, type="pearson")^2)/df.residual(seeds_DD)
[1] 48.37515

This value is quite a bit lower than the prior analysis.

Now we need to extract the coefficients of the model. A raw call to coef gives every intercept, not so useful.

fixef(seeds_DD)
(Intercept)    log(Nm1) 
  3.8990398   0.3218015 
fixef(seeds_DD)[1] # Extract the first fixed coefficient
(Intercept) 
    3.89904 
VarCorr(seeds_DD)
 Groups Name        Std.Dev.
 GenID  (Intercept) 0.64374 
 Gen    (Intercept) 0.24759 
as.data.frame(VarCorr(seeds_DD))
    grp        var1 var2       vcov     sdcor
1 GenID (Intercept) <NA> 0.41439496 0.6437352
2   Gen (Intercept) <NA> 0.06130243 0.2475933
as.data.frame(VarCorr(seeds_DD))$sdcor[1] # Extract the first RE sd.
[1] 0.6437352

So here’s the full code that will be needed in the parameter estimation script:

# Fit model
seeds_DD <- glmer(Seedlings ~ log(Nm1) + (1 | Gen) + (1 | GenID), 
                  data = Ler_fecundity, family = poisson)

# Extract parameters
Ler_params <- list(
  a_Gompertz = fixef(seeds_DD)[1],
  b_Gompertz = fixef(seeds_DD)[2] - 1,
  sigma_seed_time = as.data.frame(VarCorr(seeds_DD))$sdcor[2],
  sigma_seed_rep = as.data.frame(VarCorr(seeds_DD))$sdcor[1],
  theta_seed = sum(residuals(seeds_DD, type="pearson")^2)/df.residual(seeds_DD)
)

Ler_params
$a_Gompertz
(Intercept) 
   4.244445 

$b_Gompertz
  log(Nm1) 
-0.7632042 

$sigma_seed_time
[1] 0.2824573

$sigma_seed_rep
[1] 0.6961171

$theta_seed
[1] 43.42342