Title: A Publish-Subscribe System for Data Replication and Synchronization Among Integrated Person-Centric Information Systems
Language: English
File Size: 2.4 MB
Total Pages: 125
Table of Contents
                            A Publish-Subscribe System for Data Replication and Synchronization Among Integrated Person-Centric Information Systems
	Recommended Citation
ABSTRACT
LIST OF TABLES
	INTRODUCTION
	CHAPTER 2
	DATABASE HETEROGENEITY, DATABASE INTEGRATION AND REPLICATION PROBLEM
		2.1 Integrated Information System
			2.1.1 Central Index
			2.1.2 Peer-to-Peer
			2.1.3 Arm’s Length Information Broker
			2.1.4 Central Database
			2.1.5 Partitioned Central Database
		2.2 Heterogeneity in Information Systems
			factors, both technical and nontechnical. This section gives a classification of heterogeneities that can exist in information systems. Figure 2.1 shows one perspective on heterogeneity [11].
			2.2.1 System Heterogeneity
			2.2.2 Syntactic Heterogeneity
			2.2.3 Schematic Heterogeneity
			2.2.4 Semantic Heterogeneity
		2.3 Data Integration Issues
			2.3.1 Schema Matching and Schema Mappings
			2.3.2 Data Reconciliation Problem
			2.3.3 Data Transformation and Data Cleansing Problem
		2.4 Data Replication among Heterogeneous Databases
			Synchronous Replication
			Asynchronous Replication
			Master-slave Replication
			Multi-master Replication
			Log-based Change Capture
			Trigger-based Change Capture
	WHAT DOES SYNC ENGINE DO?
		3.1 Support for Multiple Types of Integration Architecture
		3.2 Dealing with Heterogeneities of Integrated Systems at Different Levels
		3.3 It’s Not All about Data Integration
		3.4 Support for Database Replication
		3.5 Major Advantages of the Sync Engine and Relevant Work
	CHAPTER 4
	ARCHITECTURAL DESIGN OF SYNC ENGINE
		Requirements for the Sync Engine
		Overall Architectural Design
			4.2.1 Publish-Subscribe Model
			4.2.2 Sync Message
			4.2.3 Event Channel
			4.2.4 Sync Agent
			4.2.5 Sync Server
		Data Transformation Process
			Schematic Data Transformation with Detecting Queries
			Data Translation with Dynamic Translator
			Data Formatting with Formatter
		Other Implementation Issues
		Discussion
			Flexibility of the Sync Engine Design
			Application of the Sync Engine
	CHAPTER 5
	APPLICATION OF SYNC ENGINE TO CHARM
		Overview of CHARM Environment
			CHARM’s Master Person Index (MPI)
			Participating Programs
			Core Agent
			Matcher
			Address Cleaner
			Data Loaders
		5.2 Heterogeneous Databases in CHARM
			Platform Differences
			multiple forms of connectivity, including JDBC, and provide sufficient support for standalone SQL processing.
			Structural Differences
			Data Differences
		5.3 Adaptation of the Sync Engine to CHARM
			Adapting to a Child-Centric CHARM Environment
			Other Adaptation Issues
		5.4 Sync Engine for CHARM Work Summary
			Work Summary
			Evaluation
		5.5 Future Work with CHARM
	CHAPTER 6
	SUMMARY AND FUTURE WORK
	REFERENCES
                        
Utah State University

All Graduate Theses and Dissertations
Graduate Studies

5-2010

A Publish-Subscribe System for Data Replication and Synchronization Among Integrated Person-Centric Information Systems

Xiangbin Qiu
Utah State University

Follow this and additional works at: https://digitalcommons.usu.edu/etd

Part of the Computer Sciences Commons

Recommended Citation
Qiu, Xiangbin, "A Publish-Subscribe System for Data Replication and Synchronization Among Integrated
Person-Centric Information Systems" (2010). All Graduate Theses and Dissertations. 620.
https://digitalcommons.usu.edu/etd/620

This Thesis is brought to you for free and open access by the Graduate Studies at DigitalCommons@USU. It has been accepted for inclusion in All Graduate Theses and Dissertations by an authorized administrator of DigitalCommons@USU. For more information, please contact digitalcommons@usu.edu.



Figure 4.12. Class diagram for Sync Server package.

4.3 Data Transformation Process

As mentioned above, the data extracted from a participating database by scanning the registered channels is raw data in a purely local representation. For another participating database to comprehend and use the data, the data has to be transformed into a format consistent with the subscribing database’s format. To avoid coupling between the publishing database and the subscribing database, the data is first transformed to a global format that meets the standard of the integrated system. At the receiving end, the Sync Agent associated with each subscribing system is responsible for transforming the data into that system’s local representation. After this step, the data is handed over to the participating program by either the Sync Message Pusher or the Update Executor. The participating program need not be aware of the difference between its own data representation and that of other participating programs. In this two-step transformation process, schematic and semantic heterogeneities are hidden from each of the participating programs. Figure 4.13 illustrates this process.

[Figure 4.13 labels: Publisher, Publisher Format, Sync Agent, Global Format, Sync Agent, Subscriber Format, Subscriber]

Figure 4.13. Two-step data transformation of the Sync Engine.
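The two-step idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: records are plain dictionaries, each participating program supplies a field mapping to or from the global format, and all field and function names (`child_fname`, `first_name`, `FirstNm`, `rename_fields`, and so on) are hypothetical rather than taken from the Sync Engine.

```python
# Hypothetical field mappings; a real deployment would load these from
# each participating program's configuration.
PUBLISHER_TO_GLOBAL = {"child_fname": "first_name", "child_dob": "birth_date"}
GLOBAL_TO_SUBSCRIBER = {"first_name": "FirstNm", "birth_date": "BirthDt"}


def rename_fields(record, mapping):
    """Return a new record whose keys are renamed according to mapping.

    Fields without a mapping entry are dropped, so only declared fields
    ever leave or enter a participating program.
    """
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def to_global(publisher_record):
    # Step 1: the publisher side converts local data to the global format.
    return rename_fields(publisher_record, PUBLISHER_TO_GLOBAL)


def to_subscriber(global_record):
    # Step 2: the subscriber side converts global data to its local format.
    return rename_fields(global_record, GLOBAL_TO_SUBSCRIBER)


local = {"child_fname": "Ann", "child_dob": "2009-05-01", "internal_id": 17}
global_record = to_global(local)
subscriber_record = to_subscriber(global_record)
```

Because each side maps only to and from the global format, adding a new participating program in this sketch requires one pair of mappings rather than one mapping per existing program.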

Within the Sync Agent, each step of the two-step transformation above is carried out by three components: Channel, Translator, and Formatter. This section describes how data is transformed by each of the three components.
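One way to picture the division of labor among the three components is as a small pipeline. The sketch below is an assumption about how such stages might compose, not the Sync Engine's actual interfaces: the function names, the gender code table, and the date layout are all hypothetical and serve only to show restructuring, value translation, and formatting as separate steps.

```python
from datetime import datetime

# Hypothetical code table used by the translation stage.
GENDER_CODES = {"1": "M", "2": "F"}


def channel(row):
    """Schematic step: select and rename the fields a channel exposes."""
    return {"gender": row["sex_cd"], "birth_date": row["dob"]}


def translator(record):
    """Translation step: map local code values to global ones."""
    translated = dict(record)
    translated["gender"] = GENDER_CODES.get(record["gender"], "U")
    return translated


def formatter(record):
    """Formatting step: normalize the date layout to ISO 8601."""
    formatted = dict(record)
    parsed = datetime.strptime(record["birth_date"], "%m/%d/%Y")
    formatted["birth_date"] = parsed.strftime("%Y-%m-%d")
    return formatted


def transform(row):
    # Apply the three stages in order: Channel, Translator, Formatter.
    return formatter(translator(channel(row)))


result = transform({"sex_cd": "2", "dob": "05/01/2009", "mrn": "X1"})
```

Keeping the stages separate means a change in one concern, such as a new code table, touches only the corresponding stage.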

4.3.1 Schematic Data Transformation with Detecting Queries

As mentioned above, detecting queries are written as SQL queries that comply with the standard of each participating database’s DBMS. SQL can restructure data by joining tables and applying query conditions, rename fields by aliasing them, and cast some simple data types with built-in functions. Therefore, the data transformation
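The three SQL capabilities named here (restructuring through joins and conditions, renaming through aliases, and simple type casts) can be shown in one query. The following is a minimal sketch: the tables, columns, and the query itself are hypothetical, and SQLite is used through Python's `sqlite3` module only so the example is self-contained. A real detecting query would be written in the dialect of the participating database's own DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patient (pid INTEGER PRIMARY KEY, fname TEXT, zip INTEGER);
    CREATE TABLE visit (vid INTEGER PRIMARY KEY, pid INTEGER, weight_kg TEXT);
    INSERT INTO patient VALUES (1, 'Ann', 84322);
    INSERT INTO visit VALUES (10, 1, '3.4');
    """
)

# Hypothetical detecting query: restructures data (JOIN plus WHERE),
# renames fields (AS), and casts simple types (CAST) in one statement.
DETECTING_QUERY = """
    SELECT p.fname                   AS first_name,
           CAST(p.zip AS TEXT)       AS postal_code,
           CAST(v.weight_kg AS REAL) AS weight_kg
    FROM patient AS p
    JOIN visit AS v ON v.pid = p.pid
    WHERE v.weight_kg IS NOT NULL
"""

# sqlite3.Row lets each result row be converted to a dict keyed by alias.
conn.row_factory = sqlite3.Row
rows = [dict(row) for row in conn.execute(DETECTING_QUERY)]
conn.close()
```

Each resulting dictionary is keyed by the aliased names, so downstream steps in this sketch never see the local column names `fname` or `zip`.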
