Abstract
A “health disparity” refers to a higher burden of illness, injury, disability, or mortality experienced by one group relative to another. These disparities may arise from many factors, including age, income, and race. This book focuses on their estimation, ranging from classical approaches, including the quantification of a disparity, to more formal modelling, to modern, more flexible computational techniques.
Features:
• Presents an overview of methods and applications of health disparity estimation
• First book to synthesize research in this field in a unified statistical framework
• Covers classical approaches, and builds to more modern computational techniques
• Includes many worked examples and case studies using real data
• Discusses available software for estimation
The book is designed primarily for researchers and graduate students in biostatistics, data science, and computer science. It will also be useful to many quantitative modelers in genetics, biology, sociology, and epidemiology.
Foreword
Preface
1 Basic Concepts
1.1 What is a health disparity
1.2 A brief historical perspective
1.3 Some examples
1.4 Determinants of Health
1.4.1 Biology and genetics
1.4.2 Individual behavior
1.4.3 Health services
1.4.4 Social determinants of health
1.4.5 (Health) policies
The challenging issue of race
Racial segregation as a social determinant of health
Racism, segregation, and inequality
Role of data visualization in health disparities research
A note on notation adopted in this book
Overall Estimation of Health Disparities
Data and Measurement
Disparity indices
Total disparity indices
Disparity indices measuring differences between groups
Disparity indices from complex surveys
Randomized experiments: an idealized estimate of disparity
Model-based estimation: adjusting for confounders
Regression approach
Model-assisted survey regression
Peters-Belson approach
Peters-Belson approach for complex survey
2.4.2.2 Peters-Belson approach for clustered data
2.4.3 Disparity drivers
2.4.3.1 Disparity drivers for complex survey data
Matching and propensity scoring
Discrete outcomes
Binary outcomes
Nominal and ordinal outcomes
Poisson regression and log-linear models
Survival analysis
Survivor and hazard functions
Common parametric models
Estimation
Inference
Non-parametric estimation of S(y)
Cox proportional hazards model
Multi-level modeling
Estimation and inference
Generalized estimating equations
Pseudo GEE for complex survey data
Bayesian methods
Intuitive motivation and practical advantages to Bayesian analyses
An overview of Bayesian inference
Markov Chain Monte Carlo (MCMC)
Domain-specific Estimates
What is a domain?
Direct estimates
Indirect estimates
Small area model-based estimates
Small area estimation models
Estimation
Inference
Bayesian approaches
Observed best prediction (OBP)
3.6.0.1 OBP versus the BLUP
Nonparametric/semi-parametric small area estimation
Model selection and diagnostics
A simplified adaptive fence procedure
Causality, Moderation and Mediation
Socioecological framework for health disparities
Causal inference in health disparities
Experimental versus observational studies
Challenges with certain variables to be treated as causal factors
Average treatment effects
What are we trying to estimate and are these identifiable?
Estimation of ATE and related quantities
Regression estimators
Matching estimators
Propensity score methods
Combination methods
Bayesian methods
Uncertainty estimation for ATE estimators
Assessing the assumptions
Use of instrumental variables
Verifying the assumptions
Estimation
Traditional mediation versus causal mediation
Effect identification
Mediation analysis for health disparities
Traditional moderation versus causal moderation
Parallel estimation framework
Causal moderation without randomized treatments
Machine Learning Based Approaches to Disparity Estimation
What is machine learning (ML)?
Supervised versus unsupervised machine learning
Why is ML relevant for health disparity research?
Tree-based models
Understanding the decision boundary
Bagging trees
Tree-based models for health disparity research
Tree-based models for complex survey data
Random forests
Hypothesis testing for feature significance
Shrinkage estimation
Generalized Ridge Regression (GRR)
Geometrical and theoretical properties of the GRR estimator in high dimensions
Ideal variable selection for GRR
Spike and slab regression
Selective shrinkage and the oracle property
The elastic net (enet) and lasso
Model assisted lasso for complex survey data
Deep Learning
Deep architectures
Forward and backpropagation to train the ANN
Proofs
Health Disparity Estimation Under a Precision Medicine Paradigm
What is precision medicine?
The role of genomic data
Disparity subtype identification using tree-based methods
PRISM approximation
Level set estimation (LSE) for disparity subtypes
Classified mixed model prediction
Prediction of mixed effects associated with new observations
CMMP without matching assumption
Prediction of responses of future observations
Some simulations
Assessing the uncertainty in classified mixed model predictions
Overview of Sumca
Implementation of Sumca to CMMP
A simulation study of Sumca
Proofs
7 Extended Topics
7.1 Correcting for sampling bias in disease surveillance studies
7.1.1 The model
7.1.2 Bias correction
7.1.2.1 Large values of s
7.1.2.2 Small values of s
7.1.2.3 Prevalence
7.1.2.4 Middle values of s
7.1.2.5 M = 2
7.1.3 Estimated variance
7.2 Geocoding
7.2.1 The Public Health Geocoding Project
7.2.2 Geocoding considerations
7.2.3 Pseudo-Bayesian classified mixed model prediction for imputing area-level covariates
7.2.3.1 Consistency and asymptotic optimality of MPPM search
7.3 Differential privacy and the impact on health disparities
7.4 R software for health disparity research
7.4.1 Typical R software development and testing/debugging workflow
7.4.2 The need for a health disparity R research repository
7.4.3 R packages relevant to Chapter 3
R packages relevant to Chapter 4
R packages relevant to Chapter 5
R packages relevant to Chapter 6
R packages relevant to Chapter 7
Bibliography
Index
J. Sunil Rao, Ph.D., is Professor of Biostatistics in the School of Public Health at the University of Minnesota, Twin Cities, as well as Professor and Founding Director Emeritus in the Division of Biostatistics at the Miller School of Medicine, University of Miami.
He has published widely on methods for complex data modeling, including high-dimensional model selection, mixed model prediction, small area estimation, and bump-hunting machine learning, as well as statistical methods for applied cancer biostatistics.
He is a Fellow of the American Statistical Association and an elected member of the International Statistical Institute.