Title: BEST PRACTICES FOR BUILDING HARDWARE DESIGNS FOR LIVING COMPUTATIONAL ...
Language: English
File Size: 2.5 MB
Total Pages: 128
Table of Contents
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1: INTRODUCTION
	1.1 Computer Simulations
	1.2 Hardware Accelerators
	1.3 Cost of Refactoring
	1.4 Evaluation
		1.4.1 Experiment 1: Effectiveness
		1.4.2 Experiment 2: Broad Applicability
CHAPTER 2: BACKGROUND
	2.1 Field-Programmable Gate Arrays
		2.1.1 Configurable Logic Blocks
		2.1.2 Digital Clock Managers
		2.1.3 Block RAMs
		2.1.4 PPC 405 Processor
		2.1.5 XtremeDSP Tile
		2.1.6 Ethernet MAC Block
	2.2 Related
		2.2.1 Hardware/Software Co-Design
		2.2.2 Scientific Application Design Methodologies
		2.2.3 HDL Coding and Design Guidelines
		2.2.4 C-to-HDL Conversion Tools
CHAPTER 3: SCOPE AND METHODOLOGY
	3.1 Key Idea
	3.2 Scope of the Work
	3.3 Analysis of Sequential Code
		3.3.1 Example: Electrodynamics Application
	3.4 Hardware Design
CHAPTER 4: EVALUATION AND VALIDATION
	4.1 Effectiveness of design guidelines
		4.1.1 Design Guideline Evaluation Metrics
		4.1.2 Applications and Kernel Under Test
	4.2 Communicability of the Design Guidelines
	4.3 Broad Applicability of the Design Guidelines
		4.3.1 Guideline Fitness Plot
		4.3.2 Computational Fluid Dynamics
		4.3.3 Computational Molecular Dynamics
		4.3.4 Quantum Monte Carlo Simulations
		4.3.5 Hessenberg Reduction
		4.3.6 Gaxpy - BLAS Routine
		4.3.7 N-Body Simulations
	4.4 Validation
CHAPTER 5: RESULTS
	5.1 Effectiveness of design guidelines
		5.1.1 P-V System Modeling using Neural Networks (NN)
		5.1.2 2D-Finite Difference Time Domain
		5.1.3 Sparse Matrix Vector Multiplication
	5.2 Broad Applicability of the Design Guidelines
		5.2.1 Computational Fluid Dynamics
		5.2.2 Computational Molecular Dynamics
		5.2.3 Quantum Monte Carlo Simulations
		5.2.4 Hessenberg Reduction
		5.2.5 Gaxpy - BLAS Routine
		5.2.6 N-Body Simulations
CHAPTER 6: CONCLUSION
REFERENCES
                        
BEST PRACTICES FOR BUILDING HARDWARE DESIGNS FOR LIVING
COMPUTATIONAL SCIENCE APPLICATIONS

by

Robin Jacob Pottathuparambil

A dissertation submitted to the faculty of
The University of North Carolina at Charlotte

in partial fulfillment of the requirements
for the degree of Doctor of Philosophy in

Electrical Engineering

Charlotte

2013

Approved by:

Dr. Ronald R. Sass

Dr. James M. Conrad

Dr. Bharat S. Joshi

Dr. Ryan Adams

Dr. Taghi Mostafavi

© 2013
Robin Jacob Pottathuparambil

ALL RIGHTS RESERVED

Table 4.6: Hardware design details for version 3.0 electromagnetic application

Details                 Values
FPGA                    Xilinx Virtex-II Pro
Resources used          47% of slices, 28% of multipliers, and 50% of BRAM blocks
Frequency of operation  120 MHz
Performance             481×481 grid: 3,026 iterations in 13.1 seconds (53.4 MNodes/second)
Computation precision   33 bits after the binary point and 2 bits for the integer part
Input excitation        Electromagnetic wave from ground-penetrating radar
Test cases              Free space
Boundary conditions     UPML boundary conditions
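
The throughput entry in Table 4.6 can be cross-checked against the grid size, iteration count, and runtime in the same row. The sketch below assumes that "MNodes/second" means millions of grid-node updates per second, computed as grid nodes times iterations divided by runtime. The function name is illustrative and does not come from the dissertation.

```python
def mnodes_per_second(nx, ny, iterations, seconds):
    """Return millions of grid-node updates per second (assumed definition)."""
    return nx * ny * iterations / seconds / 1e6


# Values taken from the Performance row of Table 4.6.
rate = mnodes_per_second(481, 481, 3026, 13.1)
print(f"{rate:.1f} MNodes/second")
```

Under that assumed definition, the result agrees with the 53.4 MNodes/second reported in the table.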

Figure 4.17: Version 3.0: 2D-FDTD UPML hardware design [5]

Figure 4.18: Version 3.0: 2D-FDTD UPML hardware design [5]

Figure 4.19: Version 3.0: 2D-FDTD UPML overall hardware design [5]

parameters will be stored in off-chip memory.

3. Input Excitation: The input excitation wave will be stored as look-up table values and will be loaded at each computation cycle.

4.1.2.3 Sparse Matrix-Vector Multiplication (SpMV)

The sparse matrix-vector multiplication (SpMV) routine multiplies a sparsely filled matrix by a vector, and the complexity of the routine is O(n²). SpMV is an important routine in many scientific applications that deal with matrix computations. As better computation methods are discovered, computational scientists implement those methods to reduce the execution time of their applications. To evaluate the design guidelines, three versions (version 1.0, 2.0, and 3.0) of SpMV are taken from the literature. Each version reduces the computation latency and improves throughput. Figure 4.20 shows the versions of the SpMV hardware design. These three models are then used to evaluate the set of design guidelines. The following subsections explain each version.
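
For readers unfamiliar with the routine, the following is a minimal software sketch of SpMV. It assumes the compressed sparse row (CSR) storage format, which this section does not specify and which is chosen here only for illustration. The names `spmv_csr`, `values`, `col_idx`, and `row_ptr` are hypothetical and are not taken from the hardware designs discussed.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed format)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Visit only the stored nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example matrix (illustrative data only):
# A = [[4, 0, 0],
#      [0, 0, 2],
#      [1, 0, 3]]
values = [4.0, 2.0, 1.0, 3.0]   # nonzero entries, row by row
col_idx = [0, 2, 0, 2]          # column index of each nonzero
row_ptr = [0, 1, 2, 4]          # start offset of each row in values
x = [1.0, 2.0, 3.0]

y = spmv_csr(values, col_idx, row_ptr, x)
print(y)  # [4.0, 6.0, 10.0]
```

The inner loop performs one multiply-accumulate per stored nonzero. This is the operation that a hardware design would typically pipeline or replicate to reduce latency and raise throughput.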

The version 1.0 design computes the sparse matrix-vector product as described by Zhuo et al. [6]. The work discusses a high-throughput sparse matrix-vector
