Statistical Methods in Health Disparity Research

J. Sunil Rao

doi:10.1201/9781003119449

A “health disparity” refers to a higher burden of illness, injury, disability, or mortality experienced by one group relative to another. These disparities may be due to many factors, including age, income, and race. This book focuses on their estimation, moving from classical approaches, including the quantification of a disparity, through more formal modelling, to modern and more flexible computational methods.

Features:
• Presents an overview of methods and applications of health disparity estimation
• First book to synthesize research in this field in a unified statistical framework
• Covers classical approaches, and builds to more modern computational techniques
• Includes many worked examples and case studies using real data
• Discusses available software for estimation

The book is designed primarily for researchers and graduate students in biostatistics, data science, and computer science. It will also be useful to many quantitative modelers in genetics, biology, sociology, and epidemiology.
Foreword
Preface

1 Basic Concepts
1.1 What is a health disparity
1.2 A brief historical perspective
1.3 Some examples
1.4 Determinants of Health
1.4.1 Biology and genetics
1.4.2 Individual behavior
1.4.3 Health services
1.4.4 Social determinants of health
1.4.5 (Health) policies
The challenging issue of race
Racial segregation as a social determinant of health
Racism, segregation, and inequality
Role of data visualization in health disparities research
A note on notation adopted in this book

2 Overall Estimation of Health Disparities
Data and Measurement
Disparity indices
Total disparity indices
Disparity indices measuring differences between groups
Disparity indices from complex surveys
Randomized experiments: an idealized estimate of disparity
Model-based estimation: adjusting for confounders
Regression approach
Model-assisted survey regression
Peters-Belson approach
Peters-Belson approach for complex survey
2.4.2.2 Peters-Belson approach for clustered data
2.4.3 Disparity drivers
2.4.3.1 Disparity drivers for complex survey data
Matching and propensity scoring
Discrete outcomes
Binary outcomes
Nominal and ordinal outcomes
Poisson regression and log-linear models
Survival analysis
Survivor and hazard functions
Common parametric models
Estimation
Inference
Non-parametric estimation of S(y)
Cox proportional hazards model
Multi-level modeling
Estimation and inference
Generalized estimating equations
Pseudo GEE for complex survey data
Bayesian methods
Intuitive motivation and practical advantages to Bayesian analyses
An overview of Bayesian inference
Markov Chain Monte Carlo (MCMC)

3 Domain-specific Estimates
What is a domain?
Direct estimates
Indirect estimates
Small area model-based estimates
Small area estimation models
Estimation
Inference
Bayesian approaches
Observed best prediction (OBP)
3.6.0.1 OBP versus the BLUP
Nonparametric/semi-parametric small area estimation
Model selection and diagnostics
A simplified adaptive fence procedure

4 Causality, Moderation and Mediation
Socioecological framework for health disparities
Causal inference in health disparities
Experimental versus observational studies
Challenges with certain variables to be treated causal factors
Average treatment effects
What are we trying to estimate and are these identifiable?
Estimation of ATE and related quantities
Regression estimators
Matching estimators
Propensity score methods
Combination methods
Bayesian methods
Uncertainty estimation for ATE estimators
Assessing the assumptions
Use of instrumental variables
Verifying the assumptions
Estimation
Traditional mediation versus causal mediation
Effect identification
Mediation analysis for health disparities
Traditional moderation versus causal moderation
Parallel estimation framework
Causal moderation without randomized treatments

5 Machine Learning Based Approaches to Disparity Estimation
What is machine learning (ML)?
Supervised versus unsupervised machine learning
Why is ML relevant for health disparity research?
Tree-based models
Understanding the decision boundary
Bagging trees
Tree-based models for health disparity research
Tree-based models for complex survey data
Random forests
Hypothesis testing for feature significance
Shrinkage estimation
Generalized Ridge Regression (GRR)
Geometrical and theoretical properties of the GRR estimator in high dimensions
Ideal variable selection for GRR
Spike and slab regression
Selective shrinkage and the oracle property
The elastic net (enet) and lasso
Model assisted lasso for complex survey data
Deep Learning
Deep architectures
Forward and backpropagation to train the ANN
Proofs

6 Health Disparity Estimation Under a Precision Medicine Paradigm
What is precision medicine?
The role of genomic data
Disparity subtype identification using tree-based methods
PRISM approximation
Level set estimation (LSE) for disparity subtypes
Classified mixed model prediction
Prediction of mixed effects associated with new observations
CMMP without matching assumption
Prediction of responses of future observations
Some simulations
Assessing the uncertainty in classified mixed model predictions
Overview of Sumca
Implementation of Sumca to CMMP
A simulation study of Sumca
Proofs

7 Extended Topics
7.1 Correcting for sampling bias in disease surveillance studies
7.1.1 The model
7.1.2 Bias correction
7.1.2.1 Large values of s
7.1.2.2 Small values of s
7.1.2.3 Prevalence
7.1.2.4 Middle values of s
7.1.2.5 M = 2
7.1.3 Estimated variance
7.2 Geocoding
7.2.1 The Public Health Geocoding Project
7.2.2 Geocoding considerations
7.2.3 Pseudo-Bayesian classified mixed model prediction for imputing area-level covariates
7.2.3.1 Consistency and asymptotic optimality of MPPM search
7.3 Differential privacy and the impact on health disparities
7.4 R software for health disparity research
7.4.1 Typical R software development and testing/debugging workflow
7.4.2 The need for a health disparity R research repository
7.4.3 R packages relevant to Chapter 3
R packages relevant to Chapter 4
R packages relevant to Chapter 5
R packages relevant to Chapter 6
R packages relevant to Chapter 7

Bibliography
Index

J. Sunil Rao, Ph.D., is Professor of Biostatistics in the School of Public Health at the University of Minnesota, Twin Cities, as well as Professor and Founding Director Emeritus in the Division of Biostatistics at the Miller School of Medicine, University of Miami. He has published widely about methods for complex data modeling including high dimensional model selection, mixed model prediction, small area estimation, and bump hunting machine learning, as well as statistical methods for applied cancer biostatistics. He is a Fellow of the American Statistical Association and an elected member of the International Statistical Institute.
