Title: Handbook of Computational Econometrics
File Size: 4.0 MB
Total Pages: 516
Table of Contents
                            Handbook of Computational Econometrics
	List of Contributors
	1 Econometric software
		1.1 Introduction
		1.2 The nature of econometric software
			1.2.1 The characteristics of early econometric software
			1.2.2 The expansive development of econometric software
			1.2.3 Econometric computing and the microcomputer
		1.3 The existing characteristics of econometric software
			1.3.1 Software characteristics: broadening and deepening
			1.3.2 Software characteristics: interface development
			1.3.3 Directives versus constructive commands
			1.3.4 Econometric software design implications
		1.4 Conclusion
	2 The accuracy of econometric software
		2.1 Introduction
		2.2 Inaccurate econometric results
			2.2.1 Inaccurate simulation results
			2.2.2 Inaccurate GARCH results
			2.2.3 Inaccurate VAR results
		2.3 Entry-level tests
		2.4 Intermediate-level tests
			2.4.1 NIST Statistical Reference Datasets
			2.4.2 Statistical distributions
			2.4.3 Random numbers
		2.5 Conclusions
	3 Heuristic optimization methods in econometrics
		3.1 Traditional numerical versus heuristic optimization methods
			3.1.1 Optimization in econometrics
			3.1.2 Optimization heuristics
			3.1.3 An incomplete collection of applications of optimization heuristics in econometrics
			3.1.4 Structure and instructions for use of the chapter
		3.2 Heuristic optimization
			3.2.1 Basic concepts
			3.2.2 Trajectory methods
			3.2.3 Population-based methods
			3.2.4 Hybrid metaheuristics
		3.3 Stochastics of the solution
			3.3.1 Optimization as stochastic mapping
			3.3.2 Convergence of heuristics
			3.3.3 Convergence of optimization-based estimators
		3.4 General guidelines for the use of optimization heuristics
			3.4.1 Implementation
			3.4.2 Presentation of results
		3.5 Selected applications
			3.5.1 Model selection in VAR models
			3.5.2 High breakdown point estimation
		3.6 Conclusions
	4 Algorithms for minimax and expected value optimization
		4.1 Introduction
		4.2 An interior point algorithm
			4.2.1 Subgradient of Φ(x) and basic iteration
			4.2.2 Primal–dual step size selection
			4.2.3 Choice of c and μ
		4.3 Global optimization of polynomial minimax problems
			4.3.1 The algorithm
		4.4 Expected value optimization
			4.4.1 An algorithm for expected value optimization
		4.5 Evaluation framework for minimax robust policies and expected value optimization
	5 Nonparametric estimation
		5.1 Introduction
			5.1.1 Comments on software
		5.2 Density estimation
			5.2.1 Some illustrations
		5.3 Nonparametric regression
			5.3.1 An illustration
			5.3.2 Multiple predictors
			5.3.3 Some illustrations
			5.3.4 Estimating conditional associations
			5.3.5 An illustration
		5.4 Nonparametric inferential techniques
			5.4.1 Some motivating examples
			5.4.2 A bootstrap-t method
			5.4.3 The percentile bootstrap method
			5.4.4 Simple ordinary least squares regression
			5.4.5 Regression with multiple predictors
	6 Bootstrap hypothesis testing
		6.1 Introduction
		6.2 Bootstrap and Monte Carlo tests
		6.3 Finite-sample properties of bootstrap tests
		6.4 Double bootstrap and fast double bootstrap tests
		6.5 Bootstrap data generating processes
			6.5.1 Resampling and the pairs bootstrap
			6.5.2 The residual bootstrap
			6.5.3 The wild bootstrap
			6.5.4 Bootstrap DGPs for multivariate regression models
			6.5.5 Bootstrap DGPs for dependent data
		6.6 Multiple test statistics
			6.6.1 Tests for structural change
			6.6.2 Point-optimal tests
			6.6.3 Non-nested hypothesis tests
		6.7 Finite-sample properties of bootstrap supF tests
		6.8 Conclusion
	7 Simulation-based Bayesian econometric inference: principles and some recent computational advances
		7.1 Introduction
		7.2 A primer on Bayesian inference
			7.2.1 Motivation for Bayesian inference
			7.2.2 Bayes’ theorem as a learning device
			7.2.3 Model evaluation and model selection
			7.2.4 Comparison of Bayesian inference and frequentist approach
		7.3 A primer on simulation methods
			7.3.1 Motivation for using simulation techniques
			7.3.2 Direct sampling methods
			7.3.3 Indirect sampling methods yielding independent draws
			7.3.4 Markov chain Monte Carlo: indirect sampling methods yielding dependent draws
		7.4 Some recently developed simulation methods
			7.4.1 Adaptive radial-based direction sampling
			7.4.2 Adaptive mixtures of t distributions
		7.5 Concluding remarks
	8 Econometric analysis with vector autoregressive models
		8.1 Introduction
			8.1.1 Integrated variables
			8.1.2 Structure of the chapter
			8.1.3 Terminology and notation
		8.2 VAR processes
			8.2.1 The levels VAR representation
			8.2.2 The VECM representation
			8.2.3 Structural forms
		8.3 Estimation of VAR models
			8.3.1 Estimation of unrestricted VARs
			8.3.2 Estimation of VECMs
			8.3.3 Estimation with linear restrictions
			8.3.4 Bayesian estimation of VARs
		8.4 Model specification
			8.4.1 Choosing the lag order
			8.4.2 Choosing the cointegrating rank of a VECM
		8.5 Model checking
			8.5.1 Tests for residual autocorrelation
			8.5.2 Tests for non-normality
			8.5.3 ARCH tests
			8.5.4 Stability analysis
		8.6 Forecasting
			8.6.1 Known processes
			8.6.2 Estimated processes
		8.7 Causality analysis
			8.7.1 Intuition and theory
			8.7.2 Testing for Granger-causality
		8.8 Structural VARs and impulse response analysis
			8.8.1 Levels VARs
			8.8.2 Structural VECMs
			8.8.3 Estimating impulse responses
			8.8.4 Forecast error variance decompositions
		8.9 Conclusions and extensions
	9 Statistical signal extraction and filtering: a partial survey
		9.1 Introduction: the semantics of filtering
		9.2 Linear and circular convolutions
			9.2.1 Kernel smoothing
		9.3 Local polynomial regression
		9.4 The concepts of the frequency domain
			9.4.1 The periodogram
			9.4.2 Filtering and the frequency domain
			9.4.3 Aliasing and the Shannon–Nyquist sampling theorem
			9.4.4 The processes underlying the data
		9.5 The classical Wiener–Kolmogorov theory
		9.6 Matrix formulations
			9.6.1 Toeplitz matrices
			9.6.2 Circulant matrices
		9.7 Wiener–Kolmogorov filtering of short stationary sequences
		9.8 Filtering nonstationary sequences
		9.9 Filtering in the frequency domain
		9.10 Structural time-series models
		9.11 The Kalman filter and the smoothing algorithm
			9.11.1 The smoothing algorithms
			9.11.2 Equivalent and alternative procedures
	10 Concepts of and tools for nonlinear time-series modelling
		10.1 Introduction
		10.2 Nonlinear data generating processes and linear models
			10.2.1 Linear and nonlinear processes
			10.2.2 Linear representation of nonlinear processes
		10.3 Testing linearity
			10.3.1 Weak white noise and strong white noise testing
			10.3.2 Testing linearity against a specific nonlinear model
			10.3.3 Testing linearity when the model is not identified under the null
		10.4 Probabilistic tools
			10.4.1 A strict stationarity condition
			10.4.2 Second-order stationarity and existence of moments
			10.4.3 Mixing coefficients
			10.4.4 Geometric ergodicity and mixing properties
		10.5 Identification, estimation and model adequacy checking
			10.5.1 Consistency of the QMLE
			10.5.2 Asymptotic distribution of the QMLE
			10.5.3 Identification and model adequacy
		10.6 Forecasting with nonlinear models
			10.6.1 Forecast generation
			10.6.2 Interval and density forecasts
			10.6.3 Volatility forecasting
			10.6.4 Forecast combination
		10.7 Algorithmic aspects
			10.7.1 MCMC methods
			10.7.2 Optimization algorithms for models with several latent processes
		10.8 Conclusion
	11 Network economics
		11.1 Introduction
		11.2 Variational inequalities
			11.2.1 Systems of equations
			11.2.2 Optimization problems
			11.2.3 Complementarity problems
			11.2.4 Fixed point problems
		11.3 Transportation networks: user optimization versus system optimization
			11.3.1 Transportation network equilibrium with travel disutility functions
			11.3.2 Elastic demand transportation network problems with known travel demand functions
			11.3.3 Fixed demand transportation network problems
			11.3.4 The system-optimized problem
		11.4 Spatial price equilibria
			11.4.1 The quantity model
			11.4.2 The price model
		11.5 General economic equilibrium
		11.6 Oligopolistic market equilibria
			11.6.1 The classical oligopoly problem
			11.6.2 A spatial oligopoly model
		11.7 Variational inequalities and projected dynamical systems
			11.7.1 Background
			11.7.2 The projected dynamical system
		11.8 Dynamic transportation networks
			11.8.1 The path choice adjustment process
			11.8.2 Stability analysis
			11.8.3 Discrete-time algorithms
			11.8.4 A dynamic spatial price model
		11.9 Supernetworks: applications to telecommuting decision making and teleshopping decision making
		11.10 Supply chain networks and other applications
Document Text Contents
Page 258


sampling points below the graph of cQ, and accepting the horizontal positions of the
points falling below the graph of P. The remaining points are rejected. The coordinates
of the points below the cQ graph are sampled as follows. The horizontal position θ
is obtained by drawing it from the candidate distribution with density Q. Next, the
vertical position ũ is uniformly sampled from the interval (0, cQ(θ)), that is ũ = cQ(θ)u
with u ∼ U(0, 1). As the point (θ, ũ) is accepted if and only if ũ is located in the
interval (0, P(θ)), the acceptance probability for this point is given by P(θ)/(cQ(θ)). The
following rejection algorithm collects a sample of size n from the target distribution with
density P.

Initialize the algorithm:
    The set of accepted draws S is empty: S = ∅.
    The number of accepted draws i is zero: i = 0.

Do while i < n:
    Obtain θ from candidate distribution with density Q.
    Obtain u from uniform distribution U(0, 1).
    If u < P(θ)/(cQ(θ)) then accept θ:
        Add θ to the set of accepted draws: S = S ∪ {θ}.
        Update the number of accepted draws: i = i + 1.
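
The rejection algorithm can be sketched in Python as follows. This is a minimal illustration under assumptions that are not part of the text: the target kernel P is taken to be the unnormalized standard normal density, the candidate Q is the standard Cauchy density, and the function names are hypothetical. For this pair the ratio P(θ)/Q(θ) attains its maximum 2π·exp(−1/2) at θ = ±1, which is used as c.

```python
import math
import random


def target_kernel(theta):
    """Assumed target kernel P: unnormalized standard normal density."""
    return math.exp(-0.5 * theta * theta)


def candidate_density(theta):
    """Assumed candidate density Q: standard Cauchy."""
    return 1.0 / (math.pi * (1.0 + theta * theta))


def draw_candidate(rng):
    """Draw from the standard Cauchy by inverting its CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


# c = max over theta of P/Q for this particular pair, attained at theta = +/-1.
C = 2.0 * math.pi * math.exp(-0.5)


def rejection_sample(n, rng):
    """Collect n accepted draws; also count how many candidates were needed."""
    accepted = []          # the set S of accepted draws
    n_candidates = 0
    while len(accepted) < n:                      # do while i < n
        theta = draw_candidate(rng)               # theta from Q
        u = rng.random()                          # u from U(0, 1)
        n_candidates += 1
        if u < target_kernel(theta) / (C * candidate_density(theta)):
            accepted.append(theta)                # accept theta
    return accepted, n_candidates


rng = random.Random(12345)
draws, n_candidates = rejection_sample(20000, rng)
acceptance_rate = len(draws) / n_candidates
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)
print(acceptance_rate, sample_mean, sample_var)
```

With these choices the accepted draws should behave like a standard normal sample, and the observed acceptance rate should be close to the ratio of the area under P to the area under cQ.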

Although rejection sampling is based on using an approximating candidate distribution,
the method yields an exact sample for the target distribution. However, the big
drawback of the rejection approach is that many candidate draws might be required to
obtain an accepted sample of moderate size, making the method inefficient. For example,
in Figure 7.9 it is seen that most points are located above the P graph, so that many
draws are thrown away. For large n, the fraction of accepted draws tends to the ratio of
the area below the P graph and the area below the cQ graph. As the candidate density
Q integrates to 1, this acceptance rate is given by $\int P(\theta)\,d\theta / c$, so that a smaller value
for c results in more efficiency. Clearly, c is optimized by setting it at

$$c = \max_{\theta} \frac{P(\theta)}{Q(\theta)}, \tag{7.29}$$

implying that the optimal c is small if variation in the ratio P(θ)/Q(θ) is small. This
explains why a candidate density that closely approximates the target density is
desirable.
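
As a worked illustration of the optimal c, consider an example that is not taken from the text: the target kernel is assumed to be P(θ) = exp(−θ²/2) (an unnormalized standard normal) and the candidate Q is assumed to be the standard Cauchy density, Q(θ) = 1/(π(1 + θ²)). Under these assumptions the maximization can be carried out by hand:

```latex
\begin{align*}
\frac{P(\theta)}{Q(\theta)} &= \pi\,(1+\theta^2)\,e^{-\theta^2/2}, \\
\frac{d}{d\theta}\,\frac{P(\theta)}{Q(\theta)} &= \pi\,\theta\,(1-\theta^2)\,e^{-\theta^2/2} = 0
  \quad\Longrightarrow\quad \theta \in \{-1,\,0,\,1\}, \\
c &= \max_{\theta}\frac{P(\theta)}{Q(\theta)} = 2\pi e^{-1/2} \approx 3.81
  \quad (\text{attained at } \theta = \pm 1;\ \theta = 0 \text{ is a local minimum}), \\
\text{acceptance rate} &= \frac{\int P(\theta)\,d\theta}{c}
  = \frac{\sqrt{2\pi}}{2\pi e^{-1/2}} = \sqrt{\frac{e}{2\pi}} \approx 0.658 .
\end{align*}
```

So even with a fat-tailed candidate that covers the target everywhere, roughly one in three candidate draws is discarded in this example.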

Clever rejection sampling methods have been developed for simulating draws from
the univariate standard normal distribution, which serve as building blocks for more
involved algorithms. However, in higher dimensions, it may be nontrivial to determine
the maximum of the ratio P(θ)/Q(θ). Moreover, it may be difficult to find a candidate
density that has small c in (7.29), so that the acceptance rate may be very low. Therefore,
in higher dimensions, rejection sampling is not so popular; one mostly prefers one of the
other sampling methods that will be discussed later in this chapter.

Page 259

Importance sampling

Importance sampling is another indirect approach to obtain an estimate for E[g(θ)],
where θ is a random variable from the target distribution. It was initially discussed by
Hammersley and Handscomb (1964) and introduced in econometrics by Kloek and Van
Dijk (1978). The method is related to rejection sampling. The rejection method either
accepts or rejects candidate draws, that is, either draws receive full weight or they do not
get any weight at all. Importance sampling is based on this notion of assigning weights
to draws. However, in contrast with the rejection method, these weights are not based
on an all-or-nothing situation. Instead, they can take any possible value, representing the
relative importance of draws. If Q is the candidate density (= importance function) and
P is a kernel of the target density, importance sampling is based on the relationship

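$$E[g(\theta)] = \frac{\int g(\theta) P(\theta)\,d\theta}{\int P(\theta)\,d\theta} = \frac{\int g(\theta) w(\theta) Q(\theta)\,d\theta}{\int w(\theta) Q(\theta)\,d\theta} = \frac{E[w(\tilde{\theta})\, g(\tilde{\theta})]}{E[w(\tilde{\theta})]}, \tag{7.30}$$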

where θ̃ is a random variable from the candidate distribution, and w(θ̃) = P(θ̃)/Q(θ̃) is
the weight function, which should be bounded. It follows from (7.30) that a consistent
estimate of E[g(θ)] is given by the weighted mean

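$$\widehat{E}[g(\theta)]_{IS} = \frac{\sum_{j=1}^{n} w(\tilde{\theta}_j)\, g(\tilde{\theta}_j)}{\sum_{j=1}^{n} w(\tilde{\theta}_j)}, \tag{7.31}$$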

where θ̃1, . . . , θ̃n are realizations from the candidate distribution and w(θ̃1), . . . , w(θ̃n)
are the corresponding weights. As relationship (7.30) would still hold after redefining
the weight function as w(θ̃) = P(θ̃)/(cQ(θ̃)), yielding the acceptance probability of θ̃,
there exists a clear link between rejection sampling and importance sampling, that is, the
importance sampling method weights draws with the acceptance probabilities from the
rejection approach. Figure 7.10 provides a graphical illustration of the method. Points for
which the graph of the target density is located above the graph of the candidate density
are not sampled often enough. In order to correct for this, such draws are assigned
relatively large weights (weights larger than unity). The reverse holds in the opposite
case. Although importance sampling can be used to estimate characteristics of the target
density (such as the mean), it does not provide a sample according to this density, as
draws are generated from the candidate distribution. So, in a strict sense, importance
sampling should not be called a sampling method; rather, it should be called a pure
integration method.
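
The weighted mean described above can be sketched in Python as follows. This is an illustration under assumptions not found in the text: the target kernel P is taken to be the unnormalized standard normal density, the candidate Q is the standard Cauchy density (so the weight function is bounded), and the helper names are hypothetical. The sketch estimates E[θ] and E[θ²], whose true values under this target are 0 and 1.

```python
import math
import random


def target_kernel(theta):
    """Assumed target kernel P: unnormalized standard normal density."""
    return math.exp(-0.5 * theta * theta)


def candidate_density(theta):
    """Assumed importance function Q: standard Cauchy density."""
    return 1.0 / (math.pi * (1.0 + theta * theta))


def draw_candidate(rng):
    """Draw from the standard Cauchy by inverting its CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def importance_sampling_mean(g, n, rng):
    """Weighted mean of g over n candidate draws, with weights w = P/Q."""
    weighted_sum = 0.0
    weight_sum = 0.0
    for _ in range(n):
        theta = draw_candidate(rng)
        w = target_kernel(theta) / candidate_density(theta)
        weighted_sum += w * g(theta)
        weight_sum += w
    # Dividing by the sum of weights removes the unknown normalizing
    # constant of the target kernel.
    return weighted_sum / weight_sum


rng = random.Random(2024)
first_moment = importance_sampling_mean(lambda t: t, 50000, rng)
second_moment = importance_sampling_mean(lambda t: t * t, 50000, rng)
print(first_moment, second_moment)
```

Note that no draw is ever discarded: every candidate contributes, but in proportion to its weight.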

The performance of the importance sampler is greatly affected by the choice of the
candidate distribution. If the importance function Q is inappropriate, the weight function
w(θ̃) = P(θ̃)/Q(θ̃) varies a lot, and it might happen that only a few draws with extreme
weights almost completely determine the estimate $\widehat{E}[g(\theta)]_{IS}$. This estimate would be
very unstable. In particular, a situation such that the tails of the target density are fatter
than the tails of the candidate density is concerning, as this would imply that the weight
function might even tend to infinity. In such a case, E[g(θ)] does not exist – see Equation
(7.30). Roughly stated, it is much less harmful for the importance sampling results if the
candidate density’s tails are too fat than if the tails are too thin, as compared with the
target density. It is for this reason that a fat-tailed Student’s t importance function is
usually preferred over a normal candidate density.
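
The effect of the candidate's tails can be illustrated with a small experiment. The sketch below is not from the text and rests on several assumptions: the target kernel is that of a Student's t density with 3 degrees of freedom, the thin-tailed candidate is a standard normal, and the fat-tailed candidate is a standard Cauchy (a Student's t with 1 degree of freedom). It reports two common weight diagnostics that the text does not discuss: the effective sample size as a fraction of n, computed as (Σw)²/(nΣw²), and the share of the total weight carried by the single largest weight.

```python
import math
import random


def target_kernel(theta):
    """Assumed target: kernel of a Student's t density with 3 degrees of freedom."""
    return (1.0 + theta * theta / 3.0) ** -2


def normal_density(theta):
    """Thin-tailed candidate: standard normal density."""
    return math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)


def cauchy_density(theta):
    """Fat-tailed candidate: standard Cauchy density."""
    return 1.0 / (math.pi * (1.0 + theta * theta))


def weight_diagnostics(draws, candidate_density):
    """Return (effective sample size / n, largest weight / sum of weights)."""
    weights = [target_kernel(t) / candidate_density(t) for t in draws]
    total = sum(weights)
    ess = total * total / sum(w * w for w in weights)
    return ess / len(weights), max(weights) / total


rng = random.Random(7)
n = 20000
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
cauchy_draws = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

normal_ess_fraction, normal_max_share = weight_diagnostics(normal_draws, normal_density)
cauchy_ess_fraction, cauchy_max_share = weight_diagnostics(cauchy_draws, cauchy_density)
print("normal candidate:", normal_ess_fraction, normal_max_share)
print("Cauchy candidate:", cauchy_ess_fraction, cauchy_max_share)
```

With the normal candidate the weights are unbounded, so a few draws far out in the tails dominate; with the Cauchy candidate the weights are bounded and the total weight is spread much more evenly across draws.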

Page 515


absolute, 348

summation operator
seasonal, 358, 364

supF statistic, 202, 204, 205
supremum test statistic, 393
switching regression, 85
symmetric bootstrap P-value, 186
Symposium on Large-Scale Digital Calculating Machinery, 1

tabu list, 89
tabu search, 88, 89
test statistic

asymptotically pivotal, 187, 191
pivotal, 185, 186

tests for heteroskedasticity, 188
tests for structural change, 201
TESTU01, 67, 73, 75
threshold accepting, 85, 89, 99, 113
threshold autoregressive models, 85
threshold methods, 88
threshold sequence, 89, 104
threshold vector error correction models,

time domain, 322, 346
time-reversibility, 250
Toeplitz matrix, 346
top Lyapunov exponent, 396
traffic assignment, 443
trajectory methods, 88, 94
TRAMO–SEATS program, 331, 344, 363, 367, 368, 373
transition equation, 368
trend of data sequence, 325, 326, 329, 334, 336, 344, 354, 355, 361
trend/cycle component of data sequence,

trigonometric basis, 332
trigonometric identity, 332
two-stage least squares, 197, 198

unguided search, 94
unitary matrix, 349
University of Auckland, 24
University of Cambridge, 1, 20, 24

Department of Applied Economics, 1, 2, 12

University of Chicago, 24
University of London, 20

London School of Economics and Political Science, 12, 24

University of Michigan, 2
University of Minnesota, 24
University of Pennsylvania, 2, 8, 20, 24
University of Warwick

ESRC Macroeconomic Modelling Bureau, 12

University of Wisconsin, 24
unobserved components, 351, 363, 368
updating equation of Kalman filter, 369

VAR model
Bayesian estimation, 294
reduced form, 288
structural form, 288

VAR models, 85
VAR order selection, 295
VAR process, 285
variance decomposition, 63
variational inequalities, 432, 435, 438, 448–450, 456, 458, 460, 461
VEC model, 286
VEC models, 85
VECM, 286
vector autoregression, 62–64, 75
vector autoregressive process, 285, 365
vector error correction model, 286
volatility forecast, 414
von Neumann, John

first computer program, 5
first stored-program computer, 5
game theory, 5

Wald test, 389–395, 406
wavelet analysis, 322, 340
wavelets, 154
wavepacket, 340
Wharton Econometric Forecasting Associates, 11, 16
white noise

strong, 382
testing, 386–388
weak, 382

white noise process, 340

Page 516


white noise process (continued)
band-limited, 340

Wiener process, 340
Wiener–Kolmogorov filter, 341, 342, 359, 362, 372
initial conditions, 355, 360

wild bootstrap, 196, 197, 204, 205
worst case analysis, 121
worst case strategy, 122
wrapped filter, 350

z transform, 322, 345, 348
