Enterprise DB Systems Architecture
-
System Architecture Concepts
-
Repositories
-
Centralized Processing
-
Distributed Processing
-
DB Application Integration Mechanisms
System Architecture Concepts
Enterprise System Architecture
-
A significant Enterprise System Architecture example
taken from DB2 Magazine, appearing in the following article
(optional material)
Components
-
Executable software systems, programs, functions,
or computational methods (in object-oriented programs) with an interface
(a composition of I/O resources).
Interfaces
-
Composition of I/O resources that Require or Provide
application data object types, instance values, or component control signals
I/O Resources
-
Application resources -- data object types, instance values,
or component control signals (start, stop, suspend, resume, error,
etc.) -- Input to, or Output from, a component
-
(Optional) Functions or logical predicates that specify the
range
of acceptable values for each resource, or special values (e.g., "singularities")
that signify component processing exceptions.
Connectors
-
Data communication mechanisms that enable the exchange
or interchange of messages (data or control signals) between communicating
components
-
Point-to-Point Application Program (Component) Interfaces
(APIs) that link a Component's Outputs to other Components' Inputs
-
Messaging system or "information bus" for broadcasting
messages from one sending component to many other listening components.
Configuration and Versions
-
Configuration: An arrangement of components ("floorplan")
whose interfaces and I/O resources are interconnected to other components
using connectors.
-
Version: An instance of a configuration, component
or connector, with a unique interface
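The vocabulary above (components, interfaces, I/O resources, connectors, configuration) can be sketched as plain data structures. This is only an illustrative model, not a notation from these notes: the class names, the `unconnected_inputs` check, and the "order_entry"/"billing" components are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interface:
    """I/O resources a component requires (inputs) and provides (outputs)."""
    requires: frozenset = frozenset()
    provides: frozenset = frozenset()


@dataclass(frozen=True)
class Component:
    name: str
    interface: Interface


@dataclass(frozen=True)
class Connector:
    """Point-to-point link: one component's output feeds another's input."""
    source: str
    target: str
    resource: str


@dataclass
class Configuration:
    """An arrangement ("floorplan") of components wired by connectors."""
    components: dict = field(default_factory=dict)
    connectors: list = field(default_factory=list)

    def add(self, component):
        self.components[component.name] = component

    def connect(self, source, target, resource):
        # Only wire resources that the source provides and the target requires.
        src, dst = self.components[source], self.components[target]
        if resource not in src.interface.provides:
            raise ValueError(f"{source} does not provide {resource!r}")
        if resource not in dst.interface.requires:
            raise ValueError(f"{target} does not require {resource!r}")
        self.connectors.append(Connector(source, target, resource))

    def unconnected_inputs(self):
        """Return (component, resource) pairs that no connector supplies."""
        supplied = {(c.target, c.resource) for c in self.connectors}
        return sorted(
            (name, res)
            for name, comp in self.components.items()
            for res in comp.interface.requires
            if (name, res) not in supplied
        )


config = Configuration()
config.add(Component("order_entry", Interface(provides=frozenset({"order"}))))
config.add(Component("billing", Interface(requires=frozenset({"order", "price_list"}))))
config.connect("order_entry", "billing", "order")
print(config.unconnected_inputs())  # [('billing', 'price_list')]
```

Checking for unconnected inputs is one simple way a configuration can be validated before it is deployed as a version.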
An (Advanced) Architectural Description Language (xArch)
-
For your information only :)
-
(Optional)
xArch Notation: an XML data description schema (in progress!)
-
XML schemas can be sent/received over the Web
-
This enables portable, reusable or mobile specification of
System Architectures
Repositories
Client-side
-
Pervasive, dominant approach to information and data storage
-
Data content is not organized and administered using
a DBMS
-
Lack explicit database data model, lack standard query or
data definition languages, updates are ad hoc, end-user acts as data administrator.
-
PC/Workstation file systems (e.g., "Windows Explorer"), directories,
files, and others (e.g., "Bookmark" or "Favorites" hierarchical lists for
Web browsers) are pervasive data storage mechanisms, with little support
for consistent data organization, access, or update.
-
Centralized system architecture
Server-side
-
Pervasive, dominant approach to information and data storage
-
Data content is not organized and administered using a DBMS
-
Lack explicit database data model, lack standard query or
data definition languages, updates are ad hoc, end-user acts as data administrator.
-
Network file systems (e.g., NFS, NTFS, NDS), directories
and files are pervasive data storage mechanisms, with little support for
consistent data organization, access, or update.
-
Storage Area Network (SAN) vs. Network Attached
Storage (NAS)
-
SAN -- a network of storage repository servers
-
NAS -- a storage repository server attached to a network
-
Centralized server, distributed client system architecture
Centralized Processing Architectures
Centralized Servers
-
Mainframe (centralized processor, like an IBM
System/390)
-
Star/Hub (e.g., centralized processor connected
to networked PCs and other processors)
-
Cluster (physically centralized, logically distributed)
-
Multiple, tightly-coupled servers, all identical
-
Provides reliability through redundancy
-
DB may be replicated across multiple processors, each
with a copy of the same DBMS.
-
DBMS servers can be configured to "take turns" (e.g., round-robin
processing) in processing a stream of DB transactions.
-
Redundancy provides reliability through a fail-safe
configuration of transaction processing.
-
If a single processor fails, then data processing can be
automatically migrated to other processors.
-
Provides scalability through workload distribution
(load-sharing)
-
Enterprises may acquire two or more mainframes, then add
more as transaction processing load increases. Well-suited choice for scaling
large DBMS processors.
-
Enterprises may acquire 100s of PC-class processors as a
cluster, then buy more as workload increases. Well-suited choice
for scaling many small DBMS processors.
-
Note: Redundancy is not the same as scalability.
Having one does not imply the existence of the other. They are distinct
concepts and capabilities.
-
Clusters are increasingly being brought together to form
application/computing
service grids.
-
Grids may be physically distributed, but logically centralized
(i.e., act together as if a single system or repository server).
-
Grids are implemented using Web-based application services
(described below)
-
Centralized processing is most often employed when large
numbers of transactions must be processed at high rates (e.g., 100s-1000s
transactions per second), in a highly reliable manner.
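The "take turns" (round-robin) processing and the automatic migration on failure described for clusters can be sketched as follows. This is a toy in-memory dispatcher under stated assumptions: the `Cluster` class and the server names "db1".."db3" are hypothetical, and a real cluster would also have to keep the replicated databases consistent.

```python
from itertools import cycle


class Cluster:
    """Identical servers that take turns handling transactions."""

    def __init__(self, server_names):
        self.servers = list(server_names)
        self.failed = set()
        self._turns = cycle(self.servers)

    def mark_failed(self, name):
        self.failed.add(name)

    def dispatch(self, transaction):
        # Try each server at most once per transaction, skipping failed ones.
        for _ in range(len(self.servers)):
            server = next(self._turns)
            if server not in self.failed:
                return server, transaction
        raise RuntimeError("no server available")


cluster = Cluster(["db1", "db2", "db3"])
first = [cluster.dispatch(t)[0] for t in ("t1", "t2", "t3")]
cluster.mark_failed("db2")
after = [cluster.dispatch(t)[0] for t in ("t4", "t5", "t6")]
print(first)  # ['db1', 'db2', 'db3']
print(after)  # ['db1', 'db3', 'db1']
```

Note that the failover path gives redundancy only: after "db2" fails, the two remaining servers carry the same load with less capacity, which illustrates why redundancy and scalability are distinct.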
Distributed Processing Architectures
Client-Server
-
Two-tier (centralized server, distributed clients)
-
Least expensive to configure
-
Vulnerable
-
Generally a legacy solution
-
Three-tier (clients, proxy/gateway, servers)
-
Most common contemporary solution
-
Proxy/gateway accommodates, hides/isolates, and protects multiple
servers and multiple clients
-
Well-suited to small-medium size enterprises that are not
in a DBM-specific business
-
N-tier or multi-tier (clients, proxy, application
server,…, data servers)
-
Most common future solution
-
Proxy/gateway networks accommodate, hide/isolate, and protect
multiple servers and multiple clients
-
Well-suited to medium-very large size enterprises that can
be in a DBM-specific business
-
N-tier architectures enable massively decentralized
systems (Freenet, SETI@Home, etc.)
-
All tiers are increasingly being configured as clusters
or multi-processors.
Client-side Platforms
-
Processors that request or provide data on-demand
-
Desktop/Laptop PC
-
Web/Browser User Interface and associated "helper applications"
-
Mobile computers: Remote, wireless/disconnected data bases
-
Handheld Personal Digital Assistants (PDAs -- e.g., Palm
Pilot)
-
Internet Devices (PCS/Internet Phone with local "address
book" database)
-
Smart Cards
-
Coming attractions in 2-5 years:
-
Mobile DBs and Virtual DBs constituted on-demand
from ad-hoc network or virtual private network (VPN) of mobile computers!
Server-side Platforms
-
Processors that continuously wait for and service requests
for data or application processing from clients
-
DBMSs operate on servers.
-
PCs, laptops, or mobile devices can be configured to operate
as a client, server, or both!
-
Servers are organized according to
-
Centralized Processing Architectures (described above), as well as
-
Their data communication strategy.
-
Serverless messaging/communication
-
Peer-to-Peer (i.e., Client-to-Client, Client-to-Client-to-Client-to…)
-
Instant messaging systems like "ICQ" employ peer-to-peer
communication
-
Supports high bandwidth messaging, but doesn't scale up to
large numbers of peers.
-
Best for highly interactive or "bursty" applications
-
Also called Point-to-Point or "Narrowcasting"
-
Multicast (i.e., shared messaging "trunks" which can
be hierarchically organized)
-
Shares messaging bandwidth, so scaling up is manageable
-
Clients "subscribe" to messages "publish(ed)" by other clients/servers
-
No central server bottleneck
-
Difficult to manage
-
Still seems to consume substantial network bandwidth as subscriber
base grows
-
Seems to work best when "proxy" servers subscribe to multicast
message server, and clients communicate with these proxy servers
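The publish/subscribe idea behind multicast messaging can be sketched in a few lines. This models only the subscription semantics (one publish, many listeners); the `MessageBus` class and the "prices" topic are hypothetical, and a single in-process object says nothing about network bandwidth or the absence of a central server.

```python
from collections import defaultdict


class MessageBus:
    """In-memory stand-in for a multicast channel keyed by topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # One publish call fans out to every subscriber of the topic.
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


bus = MessageBus()
inbox_a, inbox_b = [], []
bus.subscribe("prices", inbox_a.append)
bus.subscribe("prices", inbox_b.append)
delivered = bus.publish("prices", {"sku": "A1", "price": 9.5})
print(delivered)           # 2
print(inbox_a == inbox_b)  # True
```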
-
Proxy servers, gateways, brokers, as well as routers,
firewalls, etc.
-
A proxy server (or gateway, or broker) is just a special-purpose
or limited-purpose server that can filter or aggregate data transactions
or data flow between clients and application/storage servers
-
Proxy servers are commonly used to isolate and hide a DBMS
server behind a firewall to protect against attacks.
-
Coordinating activities, application services or data across
N-tier system architectures is a major technical problem
-
Coordination demands tend to consume processing resources
and available bandwidth
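A proxy that filters data transactions between clients and a DBMS server can be sketched as below. This is a minimal illustration, assuming a made-up request shape (`operation`, `key`) and hypothetical class names; a production proxy would sit on the network and enforce far richer policies.

```python
class DatabaseServer:
    """Stand-in for a DBMS server hidden behind the proxy."""

    def __init__(self):
        self.accounts = {"alice": 120, "bob": 75}

    def handle(self, operation, key):
        if operation == "read":
            return self.accounts.get(key)
        if operation == "delete":
            return self.accounts.pop(key, None)
        raise ValueError(f"unknown operation {operation!r}")


class Proxy:
    """Forwards only allowed operations; clients never touch the server."""

    def __init__(self, server, allowed_operations):
        self._server = server
        self._allowed = set(allowed_operations)

    def request(self, operation, key):
        if operation not in self._allowed:
            return {"status": "rejected", "result": None}
        return {"status": "ok", "result": self._server.handle(operation, key)}


proxy = Proxy(DatabaseServer(), allowed_operations={"read"})
print(proxy.request("read", "alice"))    # {'status': 'ok', 'result': 120}
print(proxy.request("delete", "alice"))  # {'status': 'rejected', 'result': None}
print(proxy.request("read", "alice"))    # {'status': 'ok', 'result': 120}
```

The third request shows that the rejected delete never reached the server, which is the isolation property the notes attribute to proxies.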
Other Server Platforms
-
Query servers: virtual server that routes queries
and manages connections and data from multiple DBMSs.
-
Connection-less servers
-
The World-Wide Web is a repository architecture organized
as an uncoordinated, multiple-server information sharing system
-
The HyperText Transfer Protocol (HTTP) implements
a connection-less data communication protocol
-
A communication protocol is a computational framework
that implements a particular scheme for controlling the exchange of data
between communicating processors.
-
Telnet and SMTP (simple mail transfer protocol) are examples
of connection-oriented data communication protocols.
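The "connection-less" character of HTTP, in the sense used here (each request stands alone and the server keeps no session between requests), can be observed with Python's standard library. HTTP itself still travels over TCP; the point of this sketch is only that every request opens, uses, and closes its own connection. The handler name and the paths "/orders" and "/customers" are assumptions for the demo.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = f"you asked for {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(port, path):
    # Each call opens a new connection, sends one request, and closes it.
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), EchoPathHandler)  # port 0: OS picks a free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    first = fetch(port, "/orders")
    second = fetch(port, "/customers")
finally:
    server.shutdown()
    server.server_close()

print(first)   # (200, 'you asked for /orders')
print(second)  # (200, 'you asked for /customers')
```

By contrast, a Telnet or SMTP session keeps one connection open across a whole dialogue of commands and replies.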
-
Microsoft .NET -- Software/DBM applications as Web-based
application services
-
HTTP -- Web-based object transfer protocol for transmitting
views of remote objects (Web pages, data entry forms, etc.)
-
XML -- eXtensible Markup Language for publishing or sharing
database schemas (data models) or data across repositories
-
SOAP -- Simple Object Access Protocol allows clients to access
remote/networked applications as services
-
UDDI -- Universal Description, Discovery, and Integration
is a distributed Web directory service (a registry or shared repository)
that businesses and services use to discover one another in order to interact and share information
-
WSDL -- Web Services Description Language, a relatively new
("untried") approach to specifying application/DB services, where data
are specified using XML, data are transported via HTTP/SOAP, and UDDI provides
a registry which indicates the "address" (e.g., URL) for the data, applications,
users, repositories, etc.
-
At present, an "unproven" technology
-
To make .NET work requires the following software systems
-
.NET enterprise servers (SQL Server, Exchange Server, etc.)
-
.NET Web-based service framework (runtime environment, class
libraries, advanced ASP (HTTP+XML))
-
.NET application building block services (middleware for
identity (UDDI), notification (SOAP), schematized storage (ODBC))
-
MS Passport (password, license server, user profile, personal
calendar, contact list, your current location, etc.)
-
.NET does not provide for application data/service
routing, which is necessary for Web-based workflow or EBusiness.
-
Does this look like a vendor "lock-in" strategy?
-
The mono project is developing an open source version
of .NET
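To make the roles of XML, SOAP, and a UDDI-style registry in Web-based application services more concrete, here is a small sketch that looks up a service address in a registry and builds and parses a SOAP 1.1 request envelope. Only the SOAP envelope namespace is standard; the registry dictionary, the endpoint URL, the "urn:example:inventory" namespace, and the `GetStockLevel` operation are all hypothetical, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:inventory"  # hypothetical application namespace

# Hypothetical stand-in for a UDDI registry: service name -> endpoint address.
registry = {"inventory": "http://inventory.example/soap"}


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]
    operation = call.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


endpoint = registry["inventory"]
message = build_request("GetStockLevel", {"sku": "A1", "warehouse": 7})
print(endpoint)                # http://inventory.example/soap
print(parse_request(message))  # ('GetStockLevel', {'sku': 'A1', 'warehouse': '7'})
```

Parameter values come back as strings because XML text carries no type information; describing the expected types is the job the notes assign to WSDL.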
-
(Optional) Advanced Hybrid Peer-Server -- (Bleeding
Edge!!!)
-
Peer-to-Peer and Peer-Server and Client-Server, together.
-
Peers act as "servents" (SERVers and cliENTS) which may only
differ by access PORT attribute on URL
-
Example: http://www.gsm.uci.edu:80 (where "80" is
a port id)
-
Clients coordinate through servers to determine which
peers to interact with directly.
-
Requires concurrent use of multiple data communication protocols
(e.g., UDP, HTTP, Telnet, SMTP).
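The hybrid peer-server pattern (coordinate through a server, then interact with the chosen peer directly) can be sketched with in-memory objects. Everything here is an assumption for illustration: the `DirectoryServer` and `Servent` classes, the "catalog" resource, and the localhost URLs, which differ only by port as the notes describe. No real network traffic takes place.

```python
from urllib.parse import urlsplit


class DirectoryServer:
    """Central server that only tracks which peer offers which resource."""

    def __init__(self):
        self._offers = {}

    def register(self, peer_url, resource):
        self._offers.setdefault(resource, []).append(peer_url)

    def lookup(self, resource):
        return list(self._offers.get(resource, []))


class Servent:
    """A peer that both serves its own data and requests data from others."""

    def __init__(self, url, data):
        self.url = url
        self.data = dict(data)

    def serve(self, resource):
        return self.data.get(resource)

    def fetch(self, resource, directory, peers_by_url):
        # Ask the server who has the resource, then contact that peer directly.
        for peer_url in directory.lookup(resource):
            if peer_url != self.url:
                return peer_url, peers_by_url[peer_url].serve(resource)
        return None, None


directory = DirectoryServer()
a = Servent("http://localhost:8001", {"catalog": ["A1", "B2"]})
b = Servent("http://localhost:8002", {})
peers = {p.url: p for p in (a, b)}
directory.register(a.url, "catalog")

print(b.fetch("catalog", directory, peers))  # ('http://localhost:8001', ['A1', 'B2'])
print(urlsplit(a.url).port, urlsplit(b.url).port)  # 8001 8002
```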
DB Application Integration Mechanisms
Enterprise Application Integration
Connectors
-
Goal: Maintain autonomy or isolate underlying database
to hide data heterogeneity (application data model, DBMS data model, format,
layout, etc.) when integrating to other databases or repositories, in order
to provide access transparency and scalability.
-
DBMS as the ultimate "fat" multi-application architecture
connector, via SQL-based API
-
Middleware: ORBs and Connectors that isolate or "pave
over" differences when accessing multiple remote DBMSs that embody vendor-proprietary
differences (e.g., MS SQL 7--RDBMS vs. Oracle 9i--ORDBMS)
-
CORBA Object Request Brokers for handling remote database
procedure (transaction) invocation:
-
API-style connectors
-
Open Database Connectivity (ODBC): an API-style connector
-
Java Database Connectivity (JDBC): a Java-based API-style connector
-
Compare API-style middleware connectors to protocols like
HTTP and connection-less servers.
-
Integration process overview:
-
Create database tables or schemas
-
Develop and compile "servlet" code
-
Register application and associated transaction type, query
type or application data type in integration directory server
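The three integration steps above can be walked through in miniature. This sketch uses Python's built-in `sqlite3` module as a stand-in for an API-style connector such as ODBC or JDBC, a plain function in place of "servlet" code, and a dictionary as a hypothetical integration directory; the table, the query-type name "customer.find", and the sample rows are made up for the example.

```python
import sqlite3

# Step 1: create database tables (an in-memory SQLite database for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO customer (id, name) VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])
conn.commit()


# Step 2: develop the handler code that applications will invoke.
def find_customer(connection, customer_id):
    row = connection.execute(
        "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return None if row is None else {"id": row[0], "name": row[1]}


# Step 3: register the query type in a (hypothetical) integration directory.
integration_directory = {}
integration_directory["customer.find"] = find_customer


def invoke(query_type, *args):
    """Look up a registered query type and run it against the database."""
    handler = integration_directory[query_type]
    return handler(conn, *args)


print(invoke("customer.find", 2))   # {'id': 2, 'name': 'Globex'}
print(invoke("customer.find", 99))  # None
```

Callers of `invoke` see only the registered query type, not the table layout or the DBMS behind it, which is the access transparency the connector goal calls for.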
-
Multimedia Databases:
-
Add their own special integration constraints, due to their
support and use of:
-
Multi-media data object types (e.g., audio (MP3, .wav, .au)
and video (.mov, .avi, .mpeg))
-
Special-purpose multi-media processors
-
Streaming media servers and clients
-
Real-time data streaming protocols (e.g., RSVP, RTSP)