
Title: Living Standards Analytics: Development through the Lens of Household Survey Data
Language: English
File Size: 4.8 MB
Total Pages: 337
Table of Contents
Cover
Statistics for Social and Behavioral Sciences
Living Standards Analytics
ISBN 9781461403845
Preface
About the Authors
Contents
Chapter 1 Graphical Methods
	1.1 Introduction
	1.2 Exploratory Graphical Methods
		1.2.1 Histograms
		1.2.2 Kernel Densities
		1.2.3 Boxplots
		1.2.4 Scatterplots
		1.2.5 Bagplots
	1.3 Presentational Graphics
	1.4 Statistics with Maps
	1.5 Conclusion
	References
Chapter 2 Regression
	2.1 Introduction
	2.2 Basics
		2.2.1 Inference
	2.3 Addressing Regression Problems
		2.3.1 Measurement Error
		2.3.2 Omitted Variable Bias
		2.3.3 Multicollinearity
		2.3.4 Heteroscedasticity
			2.3.4.1 Aside: Log Transformations
			2.3.4.2 Testing for Heteroscedasticity
		2.3.5 Clustering
		2.3.6 Outliers
			2.3.6.1 Quantile Regression
		2.3.7 Simultaneity
			2.3.7.1 Instrumental Variables
	2.4 Conclusion
	References
Chapter 3 Sampling
	3.1 Introduction
	3.2 Types of Sampling
		3.2.1 Simple Random Sampling
		3.2.2 Stratified Sampling
		3.2.3 Cluster Sampling
	3.3 Sample Size
		3.3.1 Sampling vs. Nonsampling Errors
	3.4 Incorporating Sample Design
	3.5 Design vs. Model-Based Sampling
		3.5.1 Illustration: Design-Based vs. Model-Based Means
	3.6 Weights or Not?
		3.6.1 Illustration: Weighting Regression
	3.7 Sampling Hard-to-Reach Groups: Vietnam
	3.8 Respondent-Driven Sampling: Hard-to-Reach Groups
	References
Chapter 4 Beyond Linear Regression
	4.1 Introduction
	4.2 Flexibility in Linear Regression Models
	4.3 Nonlinear Models
	4.4 Nonparametric Models
	4.5 Higher-Dimension Models
		4.5.1 MARS Models
		4.5.2 MARS Application: Changes in Expenditure in Vietnam, 1993-1998
		4.5.3 CART Models
		4.5.4 CART as a Preprocessor: A Nutrition Example
		4.5.5 CART as a Classifier: An Expenditure Example
	4.6 Beyond Regression
	References
Chapter 5 Causality
	5.1 Introduction
	5.2 The Experimentalist School
	5.3 The Structuralist School
	5.4 The Causal Inference School
		5.4.1 Establishing Causality
	5.5 Creating Directed Acyclic Graphs
		5.5.1 Basic Tools: Probability
		5.5.2 Directed Acyclic Graphs
		5.5.3 d-Separation
		5.5.4 Illustrative Example of Measuring Causality
		5.5.5 Tetrad, and the Partial Correlation Algorithm
	5.6 A DAG to Explain World Poverty
		5.6.1 DAGs and Theory: Publishing Productivity
	5.7 Conclusion
	References
Chapter 6 Grouping Methods
	6.1 Introduction
	6.2 Hierarchical Cluster Analysis
	6.3 Nonhierarchical Clustering
	6.4 Examples of Cluster Analysis
		6.4.1 Regions of Slovakia
		6.4.2 Households in South Africa
	6.5 Model-Based Clustering: Latent Class Models
		6.5.1 Applications
	6.6 Case Study: Vietnamese Households, 2002
	6.7 Kohonen Maps
		6.7.1 Building Kohonen Maps
		6.7.2 An Illustration
	6.8 Conclusion
	References
Chapter 7 Bayesian Analysis
	7.1 Introduction
	7.2 A Worked Example
		7.2.1 Assuming i.i.d. Observations
		7.2.2 Taking Survey Sampling into Account
		7.2.3 An Illustration
	7.3 Prior Distributions and Implications of Their Choice
		7.3.1 Eliciting Priors
		7.3.2 Noninformative Priors
		7.3.3 Priors in a Linear Regression Context
	7.4 Bayes Factors and Posterior Predictive Checking
		7.4.1 Bayes Factors
		7.4.2 Posterior Predictive Checking
		7.4.3 Example: Fixed vs. Random Effects
		7.4.4 Example: Modeling Diagnostic Tests
	7.5 Combining Models: Bayesian Model Averaging
		7.5.1 Practical Issues
	7.6 Bayesian Approach to Sample Size Determination
	7.7 Conclusion
	Appendix A Winbugs Code for the Ho Chi Minh City Urban Example
		A.1 Program hcmcurban
		A.2 Model A Data
	References
Chapter 8 Spatial Models
	8.1 Introduction
	8.2 The Starting Point: Including Spatial Variables
		8.2.1 Exploratory Spatial Data Analysis
		8.2.2 Including Spatial Variables
	8.3 Spatial Models
		8.3.1 Spatial Dependence
		8.3.2 Spatial Heterogeneity
	8.4 Classifying Spatial Models
		8.4.1 Measuring Spatial Contiguity
		8.4.2 Types of Spatial Model
		8.4.3 Illustrating the Choice of Spatial Model
	8.5 Other Spatial Models
		8.5.1 Spatial Expansion Models
		8.5.2 Geographically Weighted Regression
		8.5.3 Spatial Effects as Random Effects
	8.6 Conclusion
		8.6.1 Estimating Spatial Models
	References
Chapter 9 Panel Data
	9.1 Introduction
	9.2 Types of Panel Data
	9.3 Why Panel Data?
	9.4 Why Not Panel Data?
	9.5 Application: The Birth and Growth of NFHEs
	9.6 Statistical Analysis of Panel Data
	9.7 Illustration: Thai Microcredit
	References
Chapter 10 Measuring Poverty and Vulnerability
	10.1 Introduction
	10.2 What and Why?
	10.3 Basic Measurement
		10.3.1 Measuring Well-Being
		10.3.2 Adult Equivalents
		10.3.3 Choosing a Poverty Line
			10.3.3.1 Theoretical Considerations
			10.3.3.2 Practical Considerations
			10.3.3.3 Subjective Poverty Lines
			10.3.3.4 Shortcuts to Measuring Poverty
		10.3.4 Summarizing Poverty Information
	10.4 Robustness
		10.4.1 Sampling Error
		10.4.2 Measurement Error
		10.4.3 Equivalence Scales
		10.4.4 Choice of Poverty Line
	10.5 International Poverty Comparisons
	10.6 Vulnerability to Poverty
	References
Chapter 11 Bootstrapping
	11.1 Introduction
	11.2 Bootstrap: Mechanics
		11.2.1 Further Considerations
	11.3 Applications to Living Standards
		11.3.1 SST Index for Vietnam
		11.3.2 Measuring Vulnerability
	11.4 Bootstrapping Inequality and Regression
		11.4.1 Regression
	11.5 Has Poverty Changed?
	11.6 Conclusion
	References
Chapter 12 Impact Evaluation
	12.1 Introduction
	12.2 General Principles
		12.2.1 Case: The Thailand Village Fund
		12.2.2 A More Formal Treatment
	12.3 Experimental Design
		12.3.1 Case Study: Flip Charts in Kenya
		12.3.2 Partial Randomization
		12.3.3 Randomization Evaluated
	12.4 Quasi-Experimental Methods
		12.4.1 Solution 1. Matching Comparisons
		12.4.2 Propensity Score Matching
			12.4.2.1 An Illustration
			12.4.2.2 Matching with Propensity Scores
			12.4.2.3 Propensity Score Matching Illustrated
			12.4.2.4 Propensity Score Matching Cases
		12.4.3 Covariate Matching
		12.4.4 Solution 2. Double Differences
			12.4.4.1 Illustrations: Schools in Indonesia, Subsidies in Mexico
		12.4.5 Solution 3. Instrumental Variables
			12.4.5.1 An Illustration: Thai Microcredit
		12.4.6 Other Solutions
	12.5 Impact Evaluation: Macro Projects
		12.5.1 Time-Series Data Analysis: Deviations from Trend
		12.5.2 CGE and Simulation Models
		12.5.3 Household Panel Impact Analysis
		12.5.4 Self-Rated Retrospective Evaluation
	12.6 In Conclusion
	References
Chapter 13 Multilevel Models and Small-Area Estimation
	13.1 Introduction
	13.2 Simple Small-Area Models
	13.3 Synthetic Regression Models
		13.3.1 An Illustration: Poverty Mapping in Vietnam
	13.4 Random Effects and Multilevel Models
		13.4.1 Basic Idea
		13.4.2 Specifying and Estimating a Two-Level Model
		13.4.3 An Example: Expenditures in Vietnam
		13.4.4 Rationale for Using Multilevel Models for Small-Area Estimation
	13.5 Conclusion
	References
Chapter 14 Duration Models
	14.1 Introduction
	14.2 Basics
	14.3 An Exploratory Analysis of Duration Data
		14.3.1 The Kaplan-Meier Estimator
	14.4 Cox Proportional Hazards Model
	14.5 Parametric Regression Models
		14.5.1 Weibull Regression Models
		14.5.2 A Mixture of Two-Weibull Regression Models
	14.6 Other Applications
	References
Index
                        
Document Text Contents
Page 2

Statistics for Social and Behavioral Sciences

Advisors:
S.E. Fienberg
W.J. van der Linden

For further volumes:
http://www.springer.com/series/3463

Page 168

negative if the patient is not ill) both equal to 1. The result for each test is either positive (+) or negative (−). For each subject, the outcome consists of a 4-tuple of test results; for instance, −−++ would represent a negative result on the first two tests and a positive result on the last two. There are 16 possible patterns, for which Menten et al. specify a number of alternative models. For example, their Model 4 assumes correlated outcomes for tests 1 and 2, and for tests 3 and 4; and their conditional independence model assumes (implausibly) that the outcomes of the four tests are independent.

Figure 7.5 shows three posterior predictive check histograms; in each case a vertical line indicates where the observed frequency lies. The left panel shows that the observed frequency of ++−− patterns is far higher than the conditional independence model would lead one to expect, while the center panel shows that it is in line with what Model 4 would predict. On the other hand, the right panel in Fig. 7.5 shows that Model 4 does a relatively poor job of predicting the −++− pattern. If it is important to use a model that gets the ++−− pattern right, then the diagnostic results in Fig. 7.5 suggest that Model 4 would be appropriate, or at least certainly a lot better than the conditional independence model.
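
The logic of such a check can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used by Menten et al.: the function names, the Beta-distributed stand-in for posterior draws, the sample size of 200, and the observed count of 20 are all hypothetical. The sketch simulates one replicated pattern count per posterior draw and then reports how often the replicated count is at least as large as the observed one, which is the numerical counterpart of locating the vertical line within a histogram.

```python
import random


def replicated_counts(prob_draws, n_subjects, rng):
    """Simulate one replicated pattern count per posterior draw of its probability."""
    counts = []
    for p in prob_draws:
        # Each subject shows the pattern with probability p (a binomial draw).
        count = sum(1 for _ in range(n_subjects) if rng.random() < p)
        counts.append(count)
    return counts


def upper_tail_share(replicated, observed):
    """Share of replicated counts at least as large as the observed count."""
    return sum(1 for c in replicated if c >= observed) / len(replicated)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical posterior draws of the probability of one pattern under some model.
prob_draws = [rng.betavariate(3, 97) for _ in range(2000)]
counts = replicated_counts(prob_draws, n_subjects=200, rng=rng)
share = upper_tail_share(counts, observed=20)  # 20 is a made-up observed count
```

A share close to 0 or 1 would suggest that the model struggles to reproduce the observed frequency of that pattern; a histogram of `counts` plays the role of one panel in a figure of this kind.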

The examples in this section involve the use of posterior predictive checking in the context of data related to living standards. For a further discussion of this method, and examples in other areas of application, we refer the reader to Gelman et al. (2003), Sect. 6.3.

Fig. 7.5 Posterior predictive histograms for selected outcomes of tests for visceral leishmaniasis in East Africa. (Source: Menten et al. 2006)

7.5 Combining Models: Bayesian Model Averaging

Suppose we are interested in finding the determinants of poverty, and we have at hand a long list of K plausible explanatory variables. The classical econometric approach assumes that we know the fundamental structure of the model that generates this poverty; the main task is then to use the data to measure the strength of the effects of the different variables.

In contrast, the Bayesian Model Averaging approach treats the models themselves (within a set M of possible models) as random variables: we do not necessarily know which variables to include in the analysis, and furthermore, we often do not know how to combine these variables in an appropriate model $M_j$. We therefore need to use the data to help resolve some of the ambiguity about model selection; this will reduce the precision with which the size of effects is measured, but this also implies that the classical approach, by ignoring model uncertainty, generates statistically significant results too often.

Bayesian Model Averaging is a method for dealing with model uncertainty that generates a posterior distribution for the effects or parameters of interest ($\theta$) given data $y$, the left hand side of (7.22), as a weighted average of the posterior distributions of the parameters under each model $M_j$. More formally, if we have a total of K covariates that could be used in linear combinations, we have

\[
P(\theta \mid y) = \sum_{j=1}^{2^K} P(\theta \mid y, M_j) \times P(M_j \mid y), \tag{7.23}
\]

since there are $2^K$ potential models, corresponding to the inclusion or not of each covariate. The weights here are the posterior model probabilities, that is, the probabilities of the models given the data, and are given by

\[
P(M_j \mid y) = \frac{P(y \mid M_j) \times P(M_j)}{\sum_{i=1}^{2^K} P(y \mid M_i) \times P(M_i)}. \tag{7.24}
\]

Here the $P(M_i)$ terms are the prior probabilities of the models $M_i$ belonging to M; and $P(y \mid M_i)$ is the marginal likelihood of model $M_i$, that is, the likelihood of the data given the parameters and the model $M_i$, integrated with respect to the priors of the parameters.
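
As a rough illustration of how the weights in (7.24) feed the average in (7.23), the following Python sketch enumerates the inclusion patterns for K covariates, turns log marginal likelihoods and prior model probabilities into posterior model probabilities, and averages a per-model posterior mean. The function names and all numerical inputs are hypothetical; in practice the marginal likelihoods would have to be computed or approximated for each model, which this sketch does not attempt.

```python
import itertools
import math


def enumerate_models(k):
    """Return all 2**k inclusion patterns; 1 means the covariate is included."""
    return list(itertools.product([0, 1], repeat=k))


def posterior_model_probs(log_marginal_liks, priors):
    """Posterior model probabilities as in (7.24), computed on the log scale."""
    log_terms = [ll + math.log(p) for ll, p in zip(log_marginal_liks, priors)]
    shift = max(log_terms)  # subtract the maximum to avoid underflow in exp()
    unnormalized = [math.exp(t - shift) for t in log_terms]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def bma_average(per_model_values, model_probs):
    """Weighted average across models, mirroring (7.23) for a posterior mean."""
    return sum(v * w for v, w in zip(per_model_values, model_probs))


# Hypothetical inputs for K = 2 covariates (4 models); the numbers are made up.
models = enumerate_models(2)  # (0, 0), (0, 1), (1, 0), (1, 1)
log_marginal_liks = [-105.2, -101.7, -103.9, -101.1]
priors = [1.0 / len(models)] * len(models)  # equal prior weight on each model
# Posterior mean of the second covariate's coefficient; zero where it is excluded.
per_model_means = [0.0, 0.42, 0.0, 0.38]

probs = posterior_model_probs(log_marginal_liks, priors)
averaged_mean = bma_average(per_model_means, probs)
```

Working on the log scale is a design choice rather than part of the method: marginal likelihoods are often extremely small numbers, and subtracting the largest log term before exponentiating leaves the normalized weights unchanged.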

It appears, both in theory and in practice, that Bayesian averaging over all the models has greater predictive ability than using any single model $M^*$ (Hoeting et al. 1999), even a single model that has been chosen using, for instance, the Bayesian Information Criterion (BIC). This provides a powerful argument for using this technique, at least in principle.

7.5.1 Practical Issues

If the number of variables under consideration is small, then Bayesian Model Averaging (BMA) is relatively straightforward: for each of the $2^K$ models, find


Page 336

Shortcuts to measuring poverty. See Poverty
Simple random sample, 51–53, 56, 57, 129,

130, 132

Simultaneity, 24, 43–47

Slovakia, 112–114

Small area estimation, 151, 198, 273–286

Snowball sampling. See Sampling
Social capital, 46, 47, 260

Social Capital and Poverty Survey.

See Tanzania
Social network, 64

Social Weather Stations [in the Philippines],

198, 199

Son preference, 292, 293, 295, 300–302

South Africa, 19, 113–116, 204, 267

South Asia, 19, 145, 212

South Korea, 178, 268

SpaceStat, 173
Spatial autoregressive model. See Spatial

models

Spatial contiguity

contiguity matrix, 163, 164

queen contiguity, 163

rook contiguity, 163

Spatial dependence. See Spatial models
Spatial errors model. See Spatial models
Spatial expansion model. See Spatial models
Spatial heterogeneity. See Spatial models
Spatial lag model. See Spatial models
Spatial models

first order spatial autoregressive model, 164

general spatial model, 166

mixed autoregressive-regressive model, 164

spatial autoregressive model, 164, 166

spatial dependence, 159–160, 165, 166

spatial errors model, 165, 167

spatial expansion model, 169–170

spatial heterogeneity, 159–161, 170

spatial lag model, 164, 166, 167, 169

Spline, 68, 70, 71, 76–78, 83, 151

Squared poverty gap index (P2). See Poverty
measure

Statistical control. See Instrumental variables (IV)
Strata, 36, 51, 52, 54, 58, 61, 252, 291

Stratification, 10, 23, 55, 206

Stratified sampling, 51

Structuralists, 91, 93–94

Stunting. See Malnutrition
Subjective poverty line. See Poverty
Sub-Saharan Africa, 212, 213

Sudan, 145

Survey design, 52, 132

Survival function, 289–294, 296, 297, 301

“Synthetic” estimators, 274

“Synthetic regression” estimator, 275

T

Tableau, 18

Tanzania

Human Resource Development Survey, 46

Social Capital and Poverty Survey, 46

Tetrad, 79, 97, 101–104, 106

Thailand, socio-economic survey, 9, 11, 42, 44,

183–185, 237, 246, 248, 253, 260, 266

Thailand Village Fund, 42–44, 183, 236–239,

245, 246, 249–251, 253, 256. See also
Microcredit

Time taken to exit. See Poverty measure
Törnqvist index. See Price deflator [used in measuring poverty]

Total survey error, 53

Trabajar II [Argentina], 255

Transfer axiom. See Poverty measure
Transient poor, 177

Transition matrix, 176, 177, 214

Treatment, 24, 43, 92, 94, 145, 183, 185, 186,

189, 235, 238–244, 246–252, 254,

256–262, 264, 267, 289

Trellis plot, 10

Tri-cube function, 74

Triple differences, 247, 263

t-test, 10, 231–233, 256

U

Uganda, 204

U-matrix. See Kohonen map
Unconfoundedness, 240, 249

United States, 50, 57, 84, 144, 192, 196, 212,

213, 250, 274

Unit non-response error, 54

Unobserved area heterogeneity, 245, 247

Unobserved heterogeneity [in panel data], 178

Unobserved household heterogeneity, 245,

247, 266

Unobserved individual heterogeneity, 245

Urban Poverty Survey. See Vietnam
U.S. National Longitudinal Survey of

Youth, 144

V

Variance inflation factor, 33

VCE. See Asymptotic variance–covariance
matrix of the estimator


Page 337

Vietnam

Household Living Standards Survey, 17,

24, 42, 53, 54, 59, 60, 73–75, 118,

209, 276, 278, 280

Living Standards Survey, 3, 4, 7, 8,

11–14, 17, 24, 25, 29, 30, 36, 37,

42, 53, 54, 59, 60, 73–75, 79, 84,

118, 132, 135, 175, 176, 178, 179,

191, 197, 209, 214, 227, 231, 233,

275, 276, 278, 280, 291, 293, 295,

300, 302

Urban Poverty Survey, 61, 62

Vietnam Living Standards Survey, 3, 4, 7, 8,

11–14, 17, 24, 25, 29, 30, 36, 37, 42, 53,

54, 59, 60, 73–75, 79, 84, 118, 132, 135,

175, 176, 178, 179, 191, 197, 209, 214,

227, 231, 233, 275, 276, 278, 280, 291,

293, 295, 300, 302

Village Fund. See Thailand Village Fund
Violin plot, 10, 11

Visceral leishmaniasis, 144, 145

Vulnerability, 114–116, 189–218, 227–230

to poverty, 214–218, 227

W

Wasting. See Malnutrition
Watts index. See Poverty measure
Weibull model, 296, 299, 300, 303, 304

Weighted regression. See Regression
Weight for height. See Malnutrition
“Welfarist” approach [to poverty], 189

White’s robust estimator, 35, 36

White’s test, 37

Wild bootstrap, 231

WinBugs, 130, 134, 136, 137, 151–152

Women and Love, 49

Z

Zambia, 151

Zero-stage rule, 5

