Query Processing and Transaction
Design
- Primitive transaction types
- Primitive action types (open, close, copy, rename,
move)
- Create
- application datatype schema
- Some attributes may need constraints on Null
(i.e., missing) attribute values
- relation
- instance
- Update, Insert, Delete
- specified attribute values in designated table(s)
- Query: in SQL, queries are specified
using "Select" <attribute> "From" <table> "Where
" <one or more logical predicate expressions applied to instance values
of table attribute datatypes>
- Select <attribute>:
- From <table>
- Where <one or more logical predicate expressions
applied to instance values of table attribute datatypes>
-
Example: querying a three table database
- Transactions on databases are processed using Queries
- Querying multiple database tables
- Subquery (or Nested queries):
- "Select"..."From"..."Where"..."In" ("Select"..."From"..."Where"...<predicate
expressions applied to values of table attribute datatypes>)
- ("Select"..."From"..."Where"...<predicate
expressions applied to values of table attribute datatypes>) --
query the "bigger" table first, then
- "Select"..."From"..."Where"... -- use these results
to index the "smaller" sub-table
- Subqueries may be used to query multiple tables as
well.
-
Example: subquerying a three table database
- Join: using a query to construct a
view of relational data that links ("relates") data across two or more tables.
- Similar to subqueries in SQL
- "Select"..."From"..."Where" <first predicate expression>
"AND" <second predicate expression>...(followed by zero or
more "AND" <additional predicate expression(s)>
-
Example: querying a three table database with a join
- The result of a Join can either be displayed as a
user view, or can be used to derive new tables or databases
- Joins are one mechanism for constructing and
populating the database schemas used in a Data Warehouse.
- Report/Display
- may require output device (logical/physical) specific
command syntax
- Long-running Transactions -- transactions that take
hours, days, weeks, etc. to complete and commit
- Process or Workflow activities
- Maintaining persistent "state" information that supports
the logical routing of information work entities
- Order Fulfillment (cf. Amazon.com)
- Vendor Certification
- Not handled well by middleware or message passing
mechanisms that lack persistence.
- Nested transactions may be used to denote incremental
progress or "status" of larger long-running transaction
- Alternatively, progress-tracking table/attributes may
be used for long-running transactions.
- SQL Queries and Updates
- Procedures and Procedural Language extensions
- Data (attribute value) entry checking
- Triggers -- for implementing business rules
- Alerters or Alarms -- for sending notifications in
response to updates to designated table attribute values
- Stored Procedures
- Small special-purpose programs that perform complex
transactions, queries, or joins.
- Can play a central role in customizing Web pages/sites
for individual end users
- SQL Code Generators
- Query through User Interface Forms
- Query By Example (QBE): Table or Form-based
queries (similar to data entry forms on Web)
- Wild Cards
- Default values
- Query Optimization
- Query compilers
- Integrity constraints
- Query transaction parallelization
- Parallel queries and RAID storage
- Partitioning very large databases
- Performance considerations
- Poorly designed queries and database schemas can give
rise to significant performance delays in processing queries
- How much "navigation" (e.g., queries, joins) across
tables in required versus how much "action" (e.g., create, update, insert,
etc.) occurs and when.
- Complex navigation across multiple dispersed tables
constrained by complex predicate expressions often is "slower" than localized
actions
- Update transactions on views
which propagate updates back across multiple tables and indices (reverse
navigation) can be "very slow"
- Total system monitoring, transaction performance
(execution time) measurement and tuning may be necessary to determine where
performance improvements can be achieved
- Renormalization of database
- Redesign of database structure or transactions
- Repartitioning physical database across different
configuration of servers or storage devices
- Faster/parallel disk access to storage area network
server or network attached storage (JBOD, RAID, etc.)