Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binomial regression #9

Open
sidravi1 opened this issue Aug 16, 2020 · 6 comments
Open

Binomial regression #9

sidravi1 opened this issue Aug 16, 2020 · 6 comments

Comments

@sidravi1
Copy link

Hi,

Thank you for putting this package together.
I'm trying to fit a binomial model but unsure how to specify the number of failures (or total trials).

data:

> d_sample
  y2                        id      lat      lon   N
1 25 IND0070940639070004_Del_U 28.83111 77.16795 100
2 37 IND0070940639100003_Del_U 28.80293 77.12073 120
3 25 IND0070940639110007_Del_U 28.78427 77.20626 130
4 39 IND0070940639130005_Del_U 28.77438 77.16210 100
5 31 IND0070940639140018_Del_U 28.76010 77.14313 110
6 23 IND0070940639200027_Del_U 28.76169 77.01645  90

models:

I'm not specifying N here.

m1 <- glmmfields(y2 ~ as.factor(id), 
           data = d_sample,  family = binomial(), lat = "lat", lon = "lon", 
           nknots = 2, iter = 500,  
           prior_sigma = half_t(3, 0, 3),
           prior_gp_theta = half_t(3, 0, 10),
           prior_gp_sigma = half_t(3, 0, 3),)

gives

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: bernoulli_logit_lpmf: n[1] is 25, but must be in the interval [0, 1]  (in 'model_glmmfields' at line 224)

Surprised that is using the bernoulli_logit_lpmf instead of binomial_logit_lpmf.

Here's another model

> m2 <- glmmfields(cbind(y2, N - y2) ~ as.factor(id), 
              data = d_sample, family = binomial(), lat = "lat", lon = "lon", 
              nknots = 2, iter = 500,  
              prior_sigma = half_t(3, 0, 3),
              prior_gp_theta = half_t(3, 0, 10),
              prior_gp_sigma = half_t(3, 0, 3),)

gives

Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) : 
  Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=stationID; position=0; dims declared=(12); dims found=(6)  (in 'model_glmmfields' at line 6)

Session info

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Mojave 10.14.5

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] yaml_2.2.1       pool_0.1.4.3     DBI_1.1.0        lubridate_1.7.9  readxl_1.3.1     lme4_1.1-23      Matrix_1.2-18    here_0.1        
 [9] styler_1.3.2     cowplot_1.0.0    broom_0.7.0      geojsonio_0.9.2  dplyr_1.0.0      ggplot2_3.3.2    glmmfields_0.1.4 Rcpp_1.0.5  
@seananderson
Copy link
Owner

Yes, it looks like we used bernoulli_logit_lpmf instead of binomial_logit_lpmf and never set up a way of passing in the number of trials. That should be a relatively easy fix, although I don't have time at this very moment. Feel free to make a pull request and if not I will try to get to this shortly.

@sidravi1
Copy link
Author

Thanks! I'll let you know if I find time to put together a PR. It probably won't be till mid-Sept.

@ericward-noaa
Copy link
Collaborator

ericward-noaa commented Aug 26, 2020 via email

@ericward-noaa
Copy link
Collaborator

Leaving this issue open for the time being, because we might want to set this up slightly differently. Two more common options are to (1) rename the 'binomial_N' argument to glmmfields to be 'weights' -- and potentially allow for weights to be used elsewhere in the GLMMs, and (2) pass in arguments in terms of a matrix where each row is cbind(successes,failures)~ ...
Looking into glmmTMB, I think they allow both approaches.

@m-moinuddin
Copy link

Hi, I have tried with binomial_N but it giving me an error "Error: passing unknown arguments: binomial_N.".

@ericward-noaa
Copy link
Collaborator

Sorry you're having trouble. Here's a simple example that hopefully shows how to set up the data. We should add something like this to the vignette to make it clear,

library(glmmfields)
set.seed(1)
# simulate data -- spatial, because just 1 draw
s <- sim_glmmfields(n_draws = 1, n_knots = 12, gp_theta = 1.5,
gp_sigma = 0.2, sd_obs = 0.2)

simulate some binomial variables, and let N vary a bit by observation

s$dat$p = plogis(s$dat$y)
s$dat$N = sample(10:14, size=nrow(s$dat), replace=TRUE)
s$dat$n = rbinom(n = nrow(s$dat), size=s$dat$N, prob = s$dat$p)

fit the model

m <- glmmfields(n ~ 0, time = "time",binomial_N = "N",
  family = binomial(),
 lat = "lat", lon = "lon", data = s$dat,
 nknots = 12, iter = 1000, chains = 2, seed = 1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants