Bayesian Multivariate Logistic Regression
by Sean O’Brien and David Dunson (Biometrics, 2004)
Presented by Lihan He, ECE, Duke University, May 16, 2008
Outline
- Univariate logistic regression
- Multivariate logistic regression
- Prior specification and convergence
- Posterior computation
- Experimental result
- Conclusions
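Univariate Logistic Regression Model
Model: $\mathrm{logit}\,\Pr(y_i = 1 \mid x_i) = x_i'\beta$
Equivalent: $y_i = 1(z_i > 0)$, with $z_i \sim L(z_i \mid x_i'\beta)$
z_i: latent variable
L(·): logistic density: $L(z \mid \mu) = \exp\{-(z-\mu)\} / [1 + \exp\{-(z-\mu)\}]^2$
CDF: $F(z \mid \mu) = 1 / [1 + \exp\{-(z-\mu)\}]$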
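The latent-variable equivalence can be checked numerically. The following is a minimal sketch, assuming a single linear predictor value (the name `eta` and the value 0.8 are illustrative, not from the slides): it draws z as the predictor plus standard logistic noise, sets y = 1 when z > 0, and compares the frequency of y = 1 with the logistic CDF.

```python
import math
import random


def logistic_cdf(x):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_fraction_of_ones(eta, n, rng):
    """Fraction of draws with y = 1, where y = 1(z > 0) and z = eta + logistic noise."""
    ones = 0
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # avoid log(0) in the inverse-CDF transform
            u = rng.random()
        eps = math.log(u / (1.0 - u))  # standard logistic draw via inverse CDF
        if eta + eps > 0.0:
            ones += 1
    return ones / n


rng = random.Random(0)
eta = 0.8  # illustrative value of the linear predictor
freq = simulate_fraction_of_ones(eta, 100_000, rng)
print(freq, logistic_cdf(eta))
```

The two printed numbers should agree up to Monte Carlo error, which is what makes the thresholded latent variable a valid restatement of the logistic regression model.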
Univariate Logistic Regression Model
Approximation using a t distribution: the logistic density is approximated by a t density with suitably set degrees of freedom and scale.
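How close a t density can get to the logistic density can be inspected directly. The sketch below makes two assumptions that did not survive on the slide: the degrees of freedom `NU = 7.3` (the value recalled from O'Brien and Dunson's paper) and a scale chosen so that the t variance matches the logistic variance π²/3. Both should be checked against the paper before being relied on.

```python
import math

NU = 7.3  # assumed degrees of freedom (recalled from the paper, not on the slide)
# Assumed scale: match the t variance, scale^2 * NU / (NU - 2), to pi^2 / 3.
SCALE = math.sqrt(math.pi ** 2 * (NU - 2.0) / (3.0 * NU))


def logistic_pdf(z):
    """Standard logistic density, written symmetrically to avoid overflow."""
    e = math.exp(-abs(z))
    return e / (1.0 + e) ** 2


def scaled_t_pdf(z, nu=NU, scale=SCALE):
    """Density of scale * T, where T is Student t with nu degrees of freedom."""
    x = z / scale
    log_c = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - 0.5 * (nu + 1.0) * math.log1p(x * x / nu)) / scale


# Largest pointwise gap between the two densities on a grid over [-8, 8].
grid = [i / 10.0 for i in range(-80, 81)]
max_gap = max(abs(logistic_pdf(z) - scaled_t_pdf(z)) for z in grid)
print(max_gap)
```

Under these assumed settings the largest gap on the grid is small relative to the peak density of 0.25, which is the sense in which the t density "approximates" the logistic one.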
Multivariate Logistic Regression Model
Binary variable for each output.
The marginal pdf of each latent variable has univariate logistic density, where $F^{-1}(\cdot)$ is an inverse CDF.
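The density on this slide did not survive extraction. One construction consistent with the stated properties (logistic marginals, a correlation matrix R, and a close multivariate t approximation) is sketched below as recalled from O'Brien and Dunson's paper; the notation is an assumption and should be verified against the paper. Here $T_{p,\nu}(\cdot \mid 0, R)$ is taken to be a p-variate t density with ν degrees of freedom, $T_{\nu}(\cdot \mid 0, 1)$ its univariate counterpart, $F_{\nu}^{-1}$ the inverse CDF of the latter, and $L(\cdot \mid \mu_j)$ the univariate logistic density with location $\mu_j$.

```latex
L_p(z \mid \mu, R)
  = T_{p,\nu}(t \mid 0, R)\,
    \prod_{j=1}^{p} \frac{L(z_j \mid \mu_j)}{T_{\nu}(t_j \mid 0, 1)},
\qquad
t_j = F_{\nu}^{-1}\!\left( \frac{e^{z_j - \mu_j}}{1 + e^{z_j - \mu_j}} \right)
```

In this form each $z_j$ is mapped through the logistic CDF and then through the inverse t CDF, so the dependence comes entirely from the multivariate t term while each marginal stays logistic.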
Multivariate Logistic Regression Model
Properties
- The marginal univariate densities of z_j, for j = 1, …, p, have univariate logistic form.
- For p = 1, the density reduces to the univariate logistic density.
- R is a correlation matrix (with 1's on the diagonal), reflecting the correlations between the z_j, and hence the correlations between the y_j.
- For R = diag(1, …, 1), the density reduces to a product of univariate logistic densities, and the elements of z are uncorrelated.
- Good convergence property for MCMC sampling.
Multivariate Logistic Regression Model
Likelihood
M-ary (ordered) variable for each output.
Assume: [equation not recovered]
Define: [equation not recovered]
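The equations for the ordered M-ary case are missing here, so the following is only a sketch of the standard cutpoint construction for an ordered outcome with a logistic latent variable, not necessarily the slide's exact definition. It assumes y = m when the latent z falls between consecutive cutpoints; the function name `category_probs` and the example cutpoints are hypothetical.

```python
import math


def logistic_cdf(x):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(eta, cutpoints):
    """Pr(y = m) for m = 1..M, with z ~ logistic(location eta).

    cutpoints: increasing list of M - 1 thresholds (assumed construction);
    y = m when z lies between the (m-1)th and mth threshold.
    """
    # Pr(z <= a) = F(a - eta); append 1.0 for the final threshold at +infinity.
    cumulative = [logistic_cdf(a - eta) for a in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


probs = category_probs(0.5, [-1.0, 0.0, 2.0])  # M = 4 ordered categories
print(probs)
```

With a single cutpoint at 0 this reduces to the binary case: the probability of the upper category equals the logistic CDF evaluated at the linear predictor.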
Prior Specification and Convergence
Prior: [equation not recovered] or [equation not recovered]
R: uniform density on [-1, 1] for each off-diagonal element
Posterior Computation
Posterior: the prior and likelihood are not conjugate.
Importance sampling: draw samples from a proposal distribution to approximate samples from the posterior, and use importance weights for exact inference.
Proposal distribution: use a multivariate t distribution to approximate the multivariate logistic density in the likelihood part.
Posterior Computation
Introduce latent variables, including z, through which the proposal is expressed.
Sample z and the other conjugate quantities from their full conditionals, since the likelihood is conjugate to the prior.
Update R using a Metropolis step (accept/reject): set R to the candidate value with the acceptance probability; otherwise keep the current value.
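The accept/reject update for R can be illustrated in the simplest setting, a single correlation r between two latent variables with a uniform prior on (-1, 1). The sketch below is a generic random-walk Metropolis step, not the paper's sampler: as a stand-in target it uses a bivariate normal likelihood for standardized pairs rather than the t-based proposal density, and the step size and simulated data are illustrative assumptions.

```python
import math
import random


def log_target(r, n, sum_sq, sum_cross):
    """Log posterior of r (up to a constant): bivariate normal stand-in likelihood
    for n standardized pairs, with a uniform prior on (-1, 1)."""
    if not -1.0 < r < 1.0:
        return -math.inf  # outside the prior support
    v = 1.0 - r * r
    return -0.5 * n * math.log(v) - (sum_sq - 2.0 * r * sum_cross) / (2.0 * v)


def metropolis_step(r, stats, rng, step=0.05):
    """One random-walk Metropolis update: accept the candidate or keep r."""
    candidate = r + rng.uniform(-step, step)
    log_alpha = log_target(candidate, *stats) - log_target(r, *stats)
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return candidate
    return r


# Illustrative data: 2000 latent pairs with true correlation 0.7.
rng = random.Random(2)
true_r = 0.7
pairs = []
for _ in range(2000):
    a = rng.gauss(0.0, 1.0)
    b = true_r * a + math.sqrt(1.0 - true_r ** 2) * rng.gauss(0.0, 1.0)
    pairs.append((a, b))
stats = (len(pairs),
         sum(a * a + b * b for a, b in pairs),
         sum(a * b for a, b in pairs))

r = 0.0
draws = []
for iteration in range(3000):
    r = metropolis_step(r, stats, rng)
    if iteration >= 500:  # discard burn-in
        draws.append(r)
post_mean = sum(draws) / len(draws)
print(post_mean)
```

For a full p × p correlation matrix the candidate must also keep R positive definite; that check is omitted here because a single correlation in (-1, 1) always satisfies it.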
Posterior Computation
Importance weights for inference
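The role of the importance weights can be shown in one dimension. This is a minimal sketch of self-normalized importance sampling, assuming the target is the standard logistic density and the proposal is a scaled t density with `NU = 7.3` and variance-matched scale (assumed values, not recovered from the slides). Each draw from the proposal is weighted by the ratio of target to proposal density, and the weighted average estimates an expectation under the target.

```python
import math
import random

NU = 7.3  # assumed degrees of freedom for the t proposal
SCALE = math.sqrt(math.pi ** 2 * (NU - 2.0) / (3.0 * NU))  # assumed variance-matched scale


def log_logistic_pdf(z):
    """Log of the standard logistic density (target)."""
    a = abs(z)
    return -a - 2.0 * math.log1p(math.exp(-a))


def log_t_pdf(z):
    """Log density of SCALE * T, with T Student t on NU degrees of freedom (proposal)."""
    x = z / SCALE
    log_c = (math.lgamma((NU + 1.0) / 2.0) - math.lgamma(NU / 2.0)
             - 0.5 * math.log(NU * math.pi))
    return log_c - 0.5 * (NU + 1.0) * math.log1p(x * x / NU) - math.log(SCALE)


def draw_from_proposal(rng):
    """Draw from the scaled t as a normal divided by sqrt(chi-square / NU)."""
    chi_sq = rng.gammavariate(NU / 2.0, 2.0)  # chi-square with NU degrees of freedom
    return SCALE * rng.gauss(0.0, 1.0) / math.sqrt(chi_sq / NU)


def importance_estimate(h, n, rng):
    """Self-normalized importance sampling estimate of E[h(z)] under the target."""
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n):
        z = draw_from_proposal(rng)
        w = math.exp(log_logistic_pdf(z) - log_t_pdf(z))  # importance weight
        weighted_sum += w * h(z)
        weight_total += w
    return weighted_sum / weight_total


rng = random.Random(1)
estimate = importance_estimate(lambda z: z * z, 50_000, rng)
print(estimate, math.pi ** 2 / 3.0)
```

Because the t proposal has heavier tails than the logistic target, the weights stay bounded, which keeps the weighted estimate stable; the printed estimate should be close to the logistic variance π²/3.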
Application
Subjects: 584 twin pregnancies
Output: small for gestational age (SGA), defined as a birthweight below the 10th percentile for a given gestational age in a reference population.
Binary output: y_ij ∈ {0, 1}, i = 1, …, 584, j = 1, 2
Covariates: x_ij for the ith pregnancy and the jth infant
Application
- The estimates of the regression coefficients are nearly identical to those in the study of AP.
- Female gender (β1), prior preterm delivery (β4, β5) and smoking (β8) are associated with an increased risk of SGA.
- Outcomes for twins are highly correlated, as represented by R.
Conclusions
- The paper proposes a multivariate logistic density for the multivariate logistic regression model.
- The proposed multivariate logistic density is closely approximated by a multivariate t distribution.
- It has properties that facilitate efficient sampling and guaranteed convergence.
- The marginals are univariate logistic densities.
- The correlation structure is embedded within the model.