Current location - Education and Training Encyclopedia - Graduation thesis - Database system implementation directory
Database system implementation directory
The publisher's words

Translator's order

Brief introduction of translator

Preface to publication

Chapter 1 Overview of dbms system

1. 1 development of database system

1. 1. 1 early database management system

1. 1.2 relational database system

1. 1.3 Smaller and smaller systems

1. 1.4 Larger and larger systems

1. 1.5 information integration

1.2 Overview of Database Management System

1.2. 1 data definition language command

1.2.2 Overview of query processing

1.2.3 main memory and buffer manager

1.2.4 transaction processing

1.2.5 query processor

1.3 book overview

1.4 database model and language review

1.4. 1 Review of relational model

1.4.2sql review

1.5 citations

The first part is the realization of database system.

Chapter II Auxiliary Warehouse Management

2. 1 memory hierarchy

2. 1. 1 memory hierarchy

2. 1.2 transferring data between memory levels

2. 1.3 volatile and nonvolatile memory

2. 1.4 virtual memory

2. 1.5 movement

2.2 disk

2.2. 1 disk structure

Disk controller

2.2.3 disk access characteristics

practise

2.3 Accelerate access to auxiliary memory

2.3. 1 i/o model for calculation

2.3.2 Organize data by cylinder.

Use multiple disks

Disk mirroring

2.3.5 disk scheduling and elevator algorithm

2.3.6 prefetching and large-scale buffering

practise

2.4 Disk failure

2.4. 1 intermittent fault

Checksum

Stable storage

2.4.4 Error handling ability of stable storage

2.4.5 Recovery from Disk Crash

2.4.6 as a mirror image of redundant technology

Parity check block

2.4.8 An improvement: raid 5

2.4.9 Handling when Multiple Disks Crash

2.4. 10 exercise

2.5 organize the data on the disk

2.5. 1 fixed length record

2.5.2 Placement of Fixed Length Records in Blocks

practise

2.6 Representation of Block and Record Address

2.6. 1 address in client-server system

Logical address and structure address

Pointer mixed writing

2.6.4 Block Return to Disk

2.6.5 pinned records and blocks

practise

2.7 Variable length data and records

2.7. 1 records with variable length fields

2.7.2 Records with duplicate fields

Variable format record

2.7.4 Records in the block cannot be loaded.

2.7.5blob

Chromatographic column storage

practise

2.8 Modification of records

2.8. 1 insert

delete

modify

practise

2.9 Summary

2. 10 references

Chapter III Exponential Structure

3. 1 indicator structure basis

3. 1. 1 sequential file

3. 1.2 Density index

3. 1.3 sparsity index

3. 1.4 multilevel index

3. 1.5 auxiliary index

3. 1.6 Application of auxiliary indicators

Indirect in 3. 1.7 auxiliary index

3. 1.8 document retrieval and inverted index

3. 1.9 movement

3.2b tree

3.2. 1b tree structure

3.2.2 Application of B-B Tree

3.2.3b- tree search

3.2.4 Range query

3.2.5b- tree insertion

3.2.6b- tree deletion

3.2.7b- Christmas Tree Efficiency

take exercise

3.3 Hash table

3.3. 1 secondary hash table

3.3.2 Insert hash table

3.3.3 Delete Hashtable

3.3.4 Efficiency of Hash Table Index

3.3.5 Extensible Hash Table

3.3.6 Insertion of Extensible Hash Table

3.3.7 linear hash table

3.3.8 Insert a linear hash table

take exercise

3.4 Multidimensional Index

3.4. 1 multidimensional indexing application

3.4.2 Performing Range Query with Traditional Index

3.4.3 Using traditional index to execute nearest neighbor query

3.4.4 Overview of Multidimensional Index Structure

3.5 Hash structure of multidimensional data

3.5. 1 grid file

Search grid file

Insertion of grid file

Performance of grid files

3.5.5 Piecewise Hash Function

3.5.6 Comparison between Grid File and Segment Hash

take exercise

3.6 Tree structure of multidimensional data

3.6. 1 multi-key index

3.6.2 Multi-key index performance

3.6.3kd tree

3.6.4kd tree operation

3.6.5 Make kd- tree suitable for auxiliary storage.

quadtree

R- tree

3.6.8r- Operation of Christmas Tree

take exercise

3.7 Bitmap Index

3.7. 1 bitmap index motivation

3.7.2 compressed bitmap

3.7.3 Operation of Segment Length Encoding Bit Vector

3.7.4 Bitmap Index Management

take exercise

3.8 Summary

3.9 References

Chapter 4 Query Execution

4. 1 Introduction of physical query plan operator

4. 1. 1 scan table

4. 1.2 Sorting when scanning tables

4. 1.3 physical operator calculation model

4. 1.4 parameters to measure the cost

4. I/o cost of1.5 scan operator

4. 1.6 Implementing iterator of physical operators

4.2 Single stroke algorithm

4.2. 1 One-pass algorithm for once cell group operation

4.2.2 One-step algorithm for unary operation of global relations

4.2.3 One-stroke algorithm for binary operation

take exercise

4.3 Nested loop connection

4.3. 1 nested cyclic join based on tuples

4.3.2 Nested Loop Connection Iterator Based on Tuples

4.3.3 Block-based Nested Cyclic Join Algorithm

4.3.4 Nested loop connection analysis

4.3.5 Summary of algorithms so far

take exercise

4.4 Two-pass algorithm based on sorting

4.4. 1 two-level multiple merge sorting

4.4.2 Use sorting to eliminate duplication

4.4.3 Grouping and aggregation by sorting

4.4.4 Joint algorithm based on sorting

4.4.5 intersection algorithm based on sorting

4.4.6 Simple connection algorithm based on sorting

4.4.7 Simple Sorting Connection Analysis

4.4.8 More effective connection based on sorting

4.4.9 Overview of algorithms based on sorting

4.4. 10 exercise

4.5 Hash-based two-pass algorithm

4.5. 1 Dividing relationships by hashing

4.5.2 Hash-based deduplication algorithm

4.5.3 Hash-based grouping and aggregation algorithm

4.5.4 Algorithm of Union, Intersection and Difference Based on Hash

Hash join algorithm

4.5.6 Save some disk I/O.

4.5.7 Overview of Hash-based Algorithms

take exercise

4.6 Index-based algorithm

4.6. 1 clustered index and nonclustered index

4.6.2 Index-based Selection

Use indexed connections.

4.6.4 Connections using ordered indexes

take exercise

4.7 Buffer management

4.7. 1 buffer management structure

Buffer management strategy

4.7.3 Relationship between physical operator selection and buffer management

take exercise

4.8 Use the algorithm more than twice.

4.8. 1 Multi-pass algorithm based on sorting

4.8.2 Performance of multi-pass algorithm based on sorting

4.8.3 Multi-pass algorithm based on hash

4.8.4 Performance of multi-pass algorithm based on hash

take exercise

4.9 Summary

4. 10 references

Chapter 5 Query Compiler

5. 1 syntax analysis and preprocessing

5. 1. 1 Parsing and Parsing Tree

5. Syntax of simple subset of1.2 SQL

5. 1.3 preprocessor

5. 1.4 preprocessing queries involving views

5. 1.5 movement

5.2 Improve the Algebraic Rule of Query Planning

5.2. 1 commutative law and associative law

5.2.2 Law on Choice

Push down selection

5.2.4 Laws on Projection

5.2.5 Laws on Connections and Products

5.2.6 Laws on eliminating duplication

5.2.7 Law on grouping and aggregation

take exercise

5.3 From Parsing Tree to Logical Query Plan

5.3. 1 into relational algebra

5.3.2 Delete the subquery from the condition.

5.3.3 Improvement of Logical Query Plan

5.3.4 Grouping of Combinable/Assignable Operators

take exercise

5.4 Operating cost estimation

5.4. 1 estimation of intermediate relation size

5.4.2 Estimated operation scale

5.4.3 Selection of operation scale estimation

5.4.4 Estimation of connection operation scale

5.4.5 Natural connection of multi-connection attributes

5.4.6 Connection of Multiple Relationships

5.4.7 Estimation of other operation scale

take exercise

5.5 Introduction of cost-based planning options

5.5. 1 Obtain the estimated value of the size parameter.

5.5.2 Calculation of statistics

5.5.3 Heuristic estimation to reduce the cost of logical query planning

5.5.4 Methods of enumerating physical plans

take exercise

5.6 Selection of connection sequence

5.6. The meaning of1left and right parameter connection

5.6.2 Connection Tree

5.6.3 Left deep junction tree

5.6.4 Select the connection sequence and grouping through dynamic planning.

5.6.5 Dynamic Programming with More Specific Cost Function

5.6.6 Greedy algorithm for selecting connection order

take exercise

5.7 Complete the physical query scheme selection.

5.7. 1 Select the selection method.

Select connection method

5.7.3 Pipelining and concretization

5.7.4 unary pipeline operation

5.7.5 Pipeline operation of binary operation

5.7.6 Symbol of physical query plan

Classification of physical operations

take exercise

5.8 Summary

5.9 References

Chapter VI System Fault Countermeasures

6. 1 problems and models that can be resumed.

6. 1. 1 failure mode

6. Further discussion on1.2 transaction

6. 1.3 Correct execution of the transaction

6. Original operation of1.4 transaction

6. 1.5 movement

6.2 Undo log

6.2. 1 logging

6. 2. 2 Undo the log rule

6.2.3 Recovering with Undo Log

checking point

Nonstationary checkpoint

practise

6.3 Redo Log

6. 3. 1 Redo Log Rules

6.3.2 Use redo logs for recovery.

6. 3. 3 Redo checkpoint of redo log

6.3.4 Use the redo log with checkpoint for recovery.

practise

6.4 Undo/Redo Log

Undo/Redo Rule

6.4.2 Recovery by using Undo/Redo Log

6.4.3 checkpoint of undo/redo log

take exercise

6.5 Media fault protection

6.5. 1 backup

Nonstationary dump

6.5.3 Use backup and log for recovery.

practise

6.6 summary

6.7 references

Chapter 7 Concurrency Control

7. 1 serial scheduling and serializable scheduling

7. 1. 1 scheduling

7. 1.2 serial scheduling

7. 1.3 serializable scheduling

7. 1.4 Influence of transaction semantics

7. 1.5 A symbol for transaction and scheduling

7. 1.6 movement

7.2 conflict serializability

6.5438+0 conflict

7.2.2 priority map and conflict serializability judgment

7.2.3 Reasons why priority graph test plays a role

take exercise

7.3 Serializable implementation using locks

7.3. 1 lock

7.3.2 Blocking scheduler

7.3.3 Two-stage blockade

7.3.4 Reasons for the effectiveness of the two-stage blockade

practise

7.4 Locking system with multiple locking modes

7.4.1* * Enjoy locks and exclusive locks.

Compatibility matrix

Lock upgrade

Update lock

Incremental lock

practise

7.5 architecture of blocking scheduler

7.5. 1 Dispatcher performs insertion locking action.

Lock watch

take exercise

7.6 Hierarchy of database elements

7.6. 1 Multi-granularity lock

Warning lock

7.6.3 Correct treatment of illusion and insertion

practise

7.7 Tree Protocol

7.7. 1 tree blockade motivation

7.7.2 Rules for accessing tree structure data

7.7.3 The reason why the tree protocol works.

practise

7.8 concurrency control using timestamps

7.8. 1 timestamp

7.8.2 Behavior that cannot be realized in fact.

7.8.3 Dirty data problem

7.8.4 Scheduling Rules Based on Timestamp

Multi-version timestamp

Timestamp and blocking

practise

7.9 Concurrency Control Using Validity Confirmation

Structure of scheduler based on validity confirmation

Validity confirmation rule

7.9.3 Comparison of Three Concurrency Control Mechanisms

practise

7. 10 summary

7. 1 1 reference

Chapter VIII Re-discussion on Affairs Management

8. 1 continuity and recoverability

8. 1. 1 Dirty data problem

8. 1.2 cascade rollback

8. 1.3 can resume scheduling.

8. 1.4 scheduling to avoid cascading rollback

8. 1.5 Rollback Management Based on Lock

8. 1.6 Group Submission

8. 1.7 logical log

8. 1.8 Recovering from the logical log

8. 1.9 movement

8.2 deadlock

8.2. 1 timeout deadlock detection

8.2.2 Waiting for the chart

8.2.3 Preventing Deadlock by Ordering Elements

8.2.4 Deadlock Detection by Timestamp

8.2.5 Comparison of Deadlock Management Methods

practise

8.3 Long-term business

8.3. 1 Long Transaction Problem

8.3.2saga (continuous recording)

8.3.3 Compensation affairs

8.3.4 Reasons why the salary problem plays a role

practise

8.4 Summary

8.5 references

Chapter 9 Parallel and Distributed Databases

9. Parallel algorithm of1

9. 1. 1 parallel mode

9. 1.2 Parallel operation, one tuple at a time.

9. 1.3 Parallel Algorithm for Total Relation Operation

Performance of 9. 1.4 parallel algorithm

9. 1.5 movement

9.2 map? Reduce parallel architecture

9.2. 1 storage mode

Mapping function

9.2.3 Reduction function

take exercise

9.3 Distributed Database

data distribution

Distributed transaction

9.3.3 Data Replication

take exercise

9.4 Distributed Query Processing

9.4. 1 distributed connection operation problem

Semi-connected simplification

9.4.3 Connection of Multiple Relationships

9.4.4 acyclic hypergraph

9.4.5 Complete Simplification of Acyclic Hypergraphs

9.4.6 Why is the completely simplified algorithm effective?

practise

9.5 distributed submission

9.5. 1 supports distributed atomicity.

9.5.2 Two-stage submission

9.5.3 Recovery of Distributed Transactions

take exercise

9.6 Distributed Blockade

9.6. 1 centralized block system

9.6.2 Cost Model of Distributed Blocking Algorithm

9.6.3 Blocking elements with multiple copies

9.6.4 Block the master copy

9.6.5 Global lock composed of local locks

practise

9.7 Point-to-Point Distributed Search

9.7. 1 Peer-to-Peer Network

9.7.2 Distributed Hash Problem

9.7.3 Centralized solution of distributed hash

9.7.4 Circle with rope

9.7.5 Link on chord circle

9.7.6 Look it up with a finger table.

9.7.7 Add a new node

9.7.8 When the terminal leaves the network,

9.7.9 When one end collapses,

9.7. 10 exercise

9.8 Summary

9.9 references

The second part is the theme of modern database system.

Chapter 10 Information Integration

10. 1 Introduction to Information Integration

10. 1. 1 Why do you want to integrate information?

10. 1.2 Heterogeneity

10.2 information integration mode

10.2. 1 federated database system

10.2.2 data warehouse

10. 2. 3 intermediary

10.2.4 exercise

10.3 Wrapper in Mediation-based System

10.3. 1 query pattern template

10.3.2 packaging generator

10.3.3 filter

10.3.4 Other operations on the package

10.3.5 exercise

10.4 capacity-based optimization

10.4. 1 limited data source capacity

10.4.2 symbol describing the function of data source.

10.4.3 Query Plan Selection Based on Ability

10.4.4 Add cost-based optimization.

10.4.5 exercise

10.5 optimize the intermediary query

10.5. 1 Simplified modifier

10.5.2 get the answer of the sub-goal.

10. 5. 3 chain algorithm

10.5.4 combine and view mediations.

10.5.5 exercise

10.6 uses the local as the intermediary of the view.

10.6. 1lav mediator's motivation

10. 6. 2 legal mediator terminology

10.6.3 extended solution

10.6.4 contains a federated query.

10.6.5 Why is the inclusion mapping test valid?

10.6.6 Discover the solution of mediator query.

10.6.7 Why does the lmss theorem hold?

10.6.8 exercise

10.7 entity analysis

10.7. 1 Determines whether the record represents an * * * entity.

10.7.2 merge similar records.

Similarity and Useful Properties of 10.7.3 Merging Function

10.7.4icar recorded by icar? Swoosh algorithm

10.7.5 Why R? Swoosh algorithm will be effective.

10.7.6 Other methods of entity resolution

10.7.7 exercise

10.8 summary

10.9 references

Chapter 1 1 Data Mining

1 1. 1 frequent itemsets mining

11.1.1market-shopping basket model

Basic definition of 1 1. 1.2

1 1. 1.3 association rules

1 1. 1.4 calculation model of frequent itemsets

1 1. 1.5 motion

1 1.2 Algorithm for Finding Frequent Itemsets

1 1.2. 1 the distribution of frequent itemsets

1 1.2.2 Naive algorithm for finding frequent itemsets

1 1.2.3a? apriori algorithm

1 1.2.4a? Realization of prior algorithm

1 1.2.5 Make better use of main memory

1 1.2.6 When to use pcy algorithm?

1 1.2.7 multilevel algorithm

1 1.2.8 motion

1 1.3 Find similar products.

Jaccard measure of 1 1.3. 1 similarity

11.3.2 Application of Jaccard Similarity

1 1.3.3 Minimum hash

1 1.3.4 Similarity between minimum hash and jaccard

1 1.3.5 Why can the minimum hash be used to estimate the similarity?

1 1.3.6 Implementation of Minimum Hash

1 1.3.7 movement

1 1.4 local sensitive hash

1 1.4. 1lsh Example: Entity Resolution

Local sensitive hash of 1 1.4.2 tag

1 1.4.3 combination of minimum hash method and local sensitive hash.

1 1.4.4 motion

1 1.5 large-scale data clustering

The application of 1 1.5. 1 clustering

1 1.5.2 Definition of distance

1 1.5.3 Cohesive clustering

1 1.5.4k? Mean value algorithm

Massive data 1 1.5.5 K? Means and methods

1 1.5.6 Processing after memory is full.

1 1.5.7 motion

1 1.6 summary

1 1.7 references

12 Chapter Database System and Internet

12. 1 search engine architecture

12. 1. 1 search engine composition

12. 1.2 web crawler

12. 1.3 query processing in search engines

12. 1.4 ranking pages.

Pagerank used to identify important web pages.

12.2. 1pagerank's intuitive thinking

The Recursive Formula of 12.2.2pagerank —— A Preliminary Attempt

12.2.3 crawler traps and dead ends

12.2.4 pagerank considering reptile traps and dead ends

12.2.5 exercise

12.3 pagerank for specific topics

12.3. 1 "Long-distance Mobile" Collection

12.3.2 Calculate the pagerank related to the topic.

12.3.3 link cheating

12.3.4 pagerank and link cheating related to the topic

12.3.5 exercise

12.4 data stream

12.4. 1 data flow management system

12.4.2 data stream application

12.4.3 data flow data model

12.4.4 data stream conversion to relationship

12.4.5 relationship into data stream.

12.4.6 exercise

12.5 data stream mining

12.5. 1 motivation

12.5.2 statistics of binary numbers

12.5.3 Calculate the number of different elements.

12.5.4 exercise

12.6 summary

12.7 references