ch07
dz / designing_data_intensive_applications / ch07Summary
Chapter 7: Transactions
Node Tree
- single_node
-
transactions
-
ACID
- BASE
- atomicity
- best_effort
- consistency
- durability
- implementations_not_equal
-
isolation
- concurrency_problems_multiusers
- concurrent_tasks_isolated
- isolation_formalized_serializability
-
lost_update_problem
- atomic_updates
- auto_detecting_lost_updates
- compare_and_set
- explicit_locking
- generalization_lost_update
- later_write_clobbers_earler
- read_modify_write
- replicated_lost_update
- timing_dirty_write_LU_anomaly
- write_skew
- snapshot_isolation_oracle
- terminology_fault_tolerant_mechanisms
- abort_reduces_many_errors
- group_rw_logical_unit
- multi_object_transactions
- no_partial_failure
- safety_guarantees
- simplify_programming_model
-
transaction_isolation
- concurrency_bugs_difficult
-
read_committed
-
no_dirty_reads
-
no_dirty_writes
- delay_2nd_write_until_1st_done
- timing_dirty_write_LU_anomaly
- read_committed_impl
-
no_dirty_writes
-
no_dirty_writes
- delay_2nd_write_until_1st_done
- timing_dirty_write_LU_anomaly
- read_committed_impl
- read_skew
-
no_dirty_reads
- serializable_isolation
-
ACID
Nodes
| transactions | |
| content | Transactions |
| children | ACID, abort_reduces_many_errors, group_rw_logical_unit (description), multi_object_transactions, no_partial_failure, safety_guarantees, simplify_programming_model (why transactions were created), transaction_isolation |
| group_rw_logical_unit | |
| content | Groups reads and writes into logical unit |
| parents | transactions |
| simplify_programming_model | |
| content | Created to simplify programming model for apps accessing database |
| parents | transactions |
| no_partial_failure | |
| content | No partial failure. Safe to retry. |
| parents | transactions |
| safety_guarantees | |
| content | Safety Guarantees: databse handles certain potential error scenarios and concurrency issues |
| parents | transactions |
| ACID | |
| content | ACID: Atomicity, Consistency, Isolation, Durability |
| children | BASE (BASE is a less strict version of ACID), atomicity, best_effort, consistency, durability, implementations_not_equal, isolation, terminology_fault_tolerant_mechanisms |
| parents | transactions |
| terminology_fault_tolerant_mechanisms | |
| content | An effort to establish precise terminology for fault-tolerant mechanisms in databases |
| parents | ACID |
| implementations_not_equal | |
| content | Implementations not equal |
| children | isolation_ambiguity |
| parents | ACID |
| isolation_ambiguity | |
| content | "Isolation" ambiguity |
| children | SQL_isolation_ambiguous |
| parents | implementations_not_equal |
| BASE | |
| content | BASE: Basically available Soft-state Eventual consistency |
| parents | ACID |
| atomicity | |
| content | Atomicity |
| children | atomic, atomic_transaction, not_about_concurrency |
| parents | ACID |
| atomic | |
| content | Atomic: something that cannot be broken down into smaller parts |
| parents | atomicity |
| not_about_concurrency | |
| content | Not about concurrency in a DB context |
| parents | atomicity |
| atomic_transaction | |
| content | Atomic Transaction: if writes in a transaction can't be completed due to fault, abort and discard writes |
| children | abortability |
| parents | atomicity |
| abortability | |
| content | "abortability" |
| parents | atomic_transaction |
| consistency | |
| content | consistency |
| children | depends_app_invariants, invariants |
| parents | ACID |
| invariants | |
| content | Invariants: statements about data that must always be true |
| children | depends_app_invariants |
| parents | consistency |
| depends_app_invariants | |
| content | Depends on application's notion of invaraints |
| children | property_of_app_not_db |
| parents | invariants, consistency |
| property_of_app_not_db | |
| content | Property of application, not database |
| parents | depends_app_invariants |
| isolation | |
| content | Isolation |
| children | concurrency_problems_multiusers, concurrent_tasks_isolated, isolation_formalized_serializability, lost_update_problem, snapshot_isolation_oracle |
| parents | ACID |
| concurrent_tasks_isolated | |
| content | Concurrently executing tasks are isolated from one another |
| parents | isolation |
| concurrency_problems_multiusers | |
| content | Concurrency problems with multiple users |
| parents | isolation |
| isolation_formalized_serializability | |
| content | Classical database textbooks formalize isolation as serializability |
| children | perf_penalty, serializability |
| parents | isolation |
| serializability | |
| content | Serializability: each transaction can pretend it is the only one running on the entire database |
| children | two_phase_locking |
| parents | isolation_formalized_serializability |
| perf_penalty | |
| content | Performance penalty: rarely used. |
| parents | isolation_formalized_serializability |
| snapshot_isolation_oracle | |
| content | Snapshot Isolation (Oracle) |
| parents | isolation, snapshot_isolation |
| durability | |
| content | Durability |
| children | WAL, data_written_not_forgotten, nv_storage, repl_wait_copy_before_success |
| parents | ACID |
| nv_storage | |
| content | Non-volatile storage |
| parents | single_node, durability |
| WAL | |
| content | WAL (write-ahead-log) |
| parents | durability, single_node, even_crash_hardware_fault |
| repl_wait_copy_before_success | |
| content | Replicated database: wait for data to copy to nodes before reporting success |
| parents | durability |
| data_written_not_forgotten | |
| content | Promise: data written in successful transaction will not be forgotten |
| children | even_crash_hardware_fault |
| parents | durability |
| even_crash_hardware_fault | |
| content | Even with crashes or hardware faults |
| children | WAL |
| parents | data_written_not_forgotten |
| single_node | |
| content | Single-node |
| children | WAL, nv_storage |
| multi_object_transactions | |
| content | multi-object transactions |
| children | determine_which_ops_belong_to_transaction, ensures_foreign_keys_remain_valid, keeps_denorm_data_synced, single_object_writes |
| parents | transactions |
| single_object_writes | |
| content | Single-object writes |
| children | provides_atomicity_single_objects |
| parents | multi_object_transactions |
| determine_which_ops_belong_to_transaction | |
| content | requires some way of determining which read/write operations belong to same transaction |
| parents | multi_object_transactions |
| provides_atomicity_single_objects | |
| content | Usually provides atomicity for single objects |
| parents | single_object_writes |
| ensures_foreign_keys_remain_valid | |
| content | In relational model, multi-object transactions ensure that foreign keys remain valid |
| parents | multi_object_transactions |
| keeps_denorm_data_synced | |
| content | Helps to keep denormalized data in sync |
| parents | multi_object_transactions |
| best_effort | |
| content | Best Effort: databse does as much as it can, won't undo changes it has already done if it runs into error |
| children | leaderless_repl (useful in leaderless replication schemes (I think?)) |
| parents | ACID |
| leaderless_repl | |
| content | Leaderless replication |
| parents | best_effort |
| transaction_isolation | |
| content | Transaction Isolation |
| children | concurrency_bugs_difficult (reason they were created), read_committed, serializable_isolation |
| parents | transactions |
| concurrency_bugs_difficult | |
| content | Created because concurrency bugs are hard to find by testing |
| parents | transaction_isolation |
| serializable_isolation | |
| content | serializable isolation: database guarantees that transactions have same effect as if they ran serially |
| children | SSI, actual_serial_execution, performance_cost, si_mantra, strongest_isolation_level |
| parents | transaction_isolation |
| read_committed | |
| content | read committed |
| children | no_dirty_reads, no_dirty_writes, read_committed_impl, read_skew |
| parents | transaction_isolation |
| no_dirty_reads | |
| content | Reads: only see committed (AKA no dirty reads) |
| children | no_dirty_writes (related), read_committed_impl |
| parents | read_committed |
| no_dirty_writes | |
| content | Writes: only overwrite data that has been comitted (AKA no dirty writes) |
| children | delay_2nd_write_until_1st_done, timing_dirty_write_LU_anomaly |
| parents | read_committed, no_dirty_reads |
| performance_cost | |
| content | performance cost |
| children | weaker_isolation_levels (alternative solution) |
| parents | serializable_isolation |
| weaker_isolation_levels | |
| content | Weaker levels of isolation that protect aginst only some concurrency bugs |
| children | subtle_bugs, throughput_and_response_worse_than_weak_isolation |
| parents | performance_cost |
| subtle_bugs | |
| content | Can lead to subtle bugs |
| parents | weaker_isolation_levels |
| delay_2nd_write_until_1st_done | |
| content | Usually delays second write until first write is either committed or aborted |
| children | row_level_locks |
| parents | no_dirty_writes |
| row_level_locks | |
| content | Row-level locks |
| children | two_phase_locking (similar) |
| parents | delay_2nd_write_until_1st_done |
| read_skew | |
| content | Read Skew / Non-repeatable Read |
| children | reads_inconsistent_state, snapshot_isolation |
| parents | read_committed |
| reads_inconsistent_state | |
| content | Anomaly where databse is read in an inconsistent state |
| parents | read_skew |
| read_committed_impl | |
| content | For every object written, DB remembers old comitted and new value (set by transaction holding write lock). Old value is returned while transaction ongoing, new value returned when transaction committed |
| children | impl_generalization_of_dirty_read (description of dirty read prevention mechanism) |
| parents | read_committed, no_dirty_reads |
| snapshot_isolation | |
| content | Snapshot Isolation: each transaction reads from a consistent snapshot of the database |
| children | SSI, repeatable_read, serializable_oracle (AKA), snapshot_isolation_oracle, visibility_rules |
| parents | read_skew |
| visibility_rules | |
| content | Visibility Rules |
| parents | snapshot_isolation |
| impl_generalization_of_dirty_read | |
| content | Implemenation is generalization of dirty read prevention mechanism |
| children | MVCC |
| parents | read_committed_impl |
| MVCC | |
| content | MVCC: multi-version concurrency control |
| children | indices_multiversion_db, postgres_GC, stale_MVCC |
| parents | impl_generalization_of_dirty_read |
| indices_multiversion_db | |
| content | Indices in multiversion DB |
| children | append_only_b_trees, postgres_optimizations_avoid_index_updates |
| parents | MVCC |
| postgres_GC | |
| content | PostgreSQL: use garbage collection on rows marked for deletion |
| parents | MVCC |
| postgres_optimizations_avoid_index_updates | |
| content | PostgreSQL: optimizations for avoiding index updates if idfferent versions of same object can fit on same page |
| parents | indices_multiversion_db |
| append_only_b_trees | |
| content | append-only / copy-on-write B-Trees (CouchDB, Datomic, LMDB) |
| parents | indices_multiversion_db |
| serializable_oracle | |
| content | Serializable (Oracle) |
| parents | snapshot_isolation |
| repeatable_read | |
| content | Repeatable Read (PostreSQL, MySQL) |
| children | SQL_standard_repeatable_read |
| parents | snapshot_isolation |
| SQL_standard_repeatable_read | |
| content | SQL Standard doesn't have concept of Snapshot Isolation (wasn't invented yet), Repeatable Read meets requirements of standard. |
| children | SQL_isolation_ambiguous |
| parents | repeatable_read |
| SQL_isolation_ambiguous | |
| content | Standard's definition of isolation ambiguous, and imprecise, different database implementations have different guarantees |
| parents | SQL_standard_repeatable_read, isolation_ambiguity |
| lost_update_problem | |
| content | Lost Update Problem: if two concurrent transactions perform read-modify-write-cycle, one of the writes will be lost. |
| children | atomic_updates (solution), auto_detecting_lost_updates, compare_and_set, explicit_locking (solution), generalization_lost_update, later_write_clobbers_earler, read_modify_write, replicated_lost_update, timing_dirty_write_LU_anomaly, write_skew |
| parents | isolation |
| read_modify_write | |
| content | read-modify-write: read a value, modify the value, then write the new value back to the DB. |
| children | allows_parallel_RMW |
| parents | lost_update_problem |
| later_write_clobbers_earler | |
| content | Later write clobbers the earlier write |
| parents | lost_update_problem |
| atomic_updates | |
| content | Atomic Updates |
| children | cursor_stability |
| parents | lost_update_problem |
| cursor_stability | |
| content | Cursor Stability: Take exclusive lock on object when it is being read so that no transaction can read it until the update has been applied. |
| parents | atomic_updates |
| explicit_locking | |
| content | Explicit Locking |
| children | app_explicitely_locks_objects |
| parents | lost_update_problem |
| app_explicitely_locks_objects | |
| content | Application explicitely locks objects to updated |
| parents | explicit_locking |
| compare_and_set | |
| content | Compare-and-set |
| children | in_DBs_without_transactions, update_only_if_value_unchanged |
| parents | lost_update_problem |
| auto_detecting_lost_updates | |
| content | auto-detecting lost updates: on detection, abort transaction and retry. |
| children | allows_parallel_RMW |
| parents | lost_update_problem |
| in_DBs_without_transactions | |
| content | in Databases that don't provide transactions |
| parents | compare_and_set |
| allows_parallel_RMW | |
| content | Allows read-modify-write cycles to run in parallel |
| parents | auto_detecting_lost_updates, read_modify_write |
| update_only_if_value_unchanged | |
| content | Update value only if value has not changed since it was last read. |
| parents | compare_and_set |
| replicated_lost_update | |
| content | Replicated: allow concurent writes, conflicting versions (siblings), use application code or special data structures to resolve later. |
| parents | lost_update_problem |
| write_skew | |
| content | Write Skew |
| children | generalization_lost_update, phantom, premise, protects_against_write_skew_phantom, two_reads_same_object_some_updates |
| parents | lost_update_problem |
| generalization_lost_update | |
| content | Generalization of lost update problem |
| parents | lost_update_problem, write_skew |
| two_reads_same_object_some_updates | |
| content | Can happen if two transactions read same objects, then update some of those objects |
| children | timing_dirty_write_LU_anomaly |
| parents | write_skew |
| timing_dirty_write_LU_anomaly | |
| content | Depending on timing: it can either be a dirty write or a lost update anomaly |
| parents | no_dirty_writes, two_reads_same_object_some_updates, lost_update_problem |
| phantom | |
| content | phantom |
| children | materializing_conflicts, predicate_lock, protects_against_write_skew_phantom, write_changes_result_in_other_trans (description) |
| parents | write_skew |
| write_changes_result_in_other_trans | |
| content | Write in one transactions changes the result in another transaction. |
| parents | phantom |
| materializing_conflicts | |
| content | Materializing Conflicts |
| children | phantom_to_lock_conflict (description) |
| parents | phantom |
| phantom_to_lock_conflict | |
| content | Takes phantom and turns it into a lock conflict on concrete set of rows |
| parents | materializing_conflicts |
| strongest_isolation_level | |
| content | Strongest Isolation Level |
| parents | serializable_isolation |
| actual_serial_execution | |
| content | Actual Serial Execution |
| children | every_trans_small_fast, interactive_multistatements_forbidden, limited_to_in_memory, low_write_throughput, stored_procedure |
| parents | serializable_isolation |
| interactive_multistatements_forbidden | |
| content | Interactive multi-statement transactions not allowed in single-trhead serial transaction processing (for performance reasons) |
| parents | actual_serial_execution |
| stored_procedure | |
| content | stored procedure: entire transaction code submitted to databse ahead of time |
| parents | actual_serial_execution |
| every_trans_small_fast | |
| content | Every transaction must be small and fast |
| parents | actual_serial_execution |
| limited_to_in_memory | |
| content | Limited to usecases where dataset fits in memory |
| parents | actual_serial_execution |
| low_write_throughput | |
| content | Write throughput must be low enough for single CPU core, otherwise use partitioning |
| parents | actual_serial_execution |
| two_phase_locking | |
| content | Two-Phase Locking (2PL) |
| children | deadlock, lock_impl, many_reads_until_write_then_exclusive (description), not_two_phase_commit (not to be confused with), pessimistic_concurrency, phases (origin of name), predicate_lock, writers_block_readers_vice_versa |
| parents | serializability, row_level_locks |
| not_two_phase_commit | |
| content | not to be confused with two-phase commit (2PC) |
| parents | two_phase_locking |
| many_reads_until_write_then_exclusive | |
| content | Severual concurrent reads allowed until write occurs, then, exclusive access |
| parents | two_phase_locking |
| writers_block_readers_vice_versa | |
| content | Writers block other readers and vice versa |
| parents | two_phase_locking, si_mantra |
| si_mantra | |
| content | Serializable Isolation Mantra: readers nver block writers |
| children | writers_block_readers_vice_versa (contrasts) |
| parents | serializable_isolation |
| lock_impl | |
| content | Lock Implementation |
| children | exclusive_mode, index_range_locking, lock_held_until_trans_end, phases, predicate_lock (similar to shared/exclusive lock), shared_mode |
| parents | two_phase_locking |
| shared_mode | |
| content | Shared Mode |
| children | transaction_reads_objects, upgrade_shared_to_exclusive |
| parents | lock_impl |
| exclusive_mode | |
| content | Exclusive Mode |
| children | transaction_writes_objects, upgrade_shared_to_exclusive |
| parents | lock_impl |
| transaction_reads_objects | |
| content | Transaction reads objects |
| parents | shared_mode, transaction_writes_objects |
| transaction_writes_objects | |
| content | Transaction writes objects |
| children | transaction_reads_objects |
| parents | exclusive_mode |
| upgrade_shared_to_exclusive | |
| content | When a transaction reads an object, then writes to it, it gets upgraded from shared mode to exclusive mode |
| parents | shared_mode, exclusive_mode |
| lock_held_until_trans_end | |
| content | Acquired lock must be held until the end of the transaction |
| parents | lock_impl |
| phases | |
| content | Phase 1: acquire lock. Phase 2: release lock |
| parents | lock_impl, two_phase_locking |
| deadlock | |
| content | Deadlock: Transaction A stuck waiting for Transaction B to finish and release lock, and vice versa |
| children | auto_deadlock_detection |
| parents | two_phase_locking |
| auto_deadlock_detection | |
| content | Automatic Deadlock Detection |
| parents | deadlock |
| throughput_and_response_worse_than_weak_isolation | |
| content | throughput and response times signficantly worse compared to weak isolation |
| children | reduced_concurrency |
| parents | weaker_isolation_levels |
| reduced_concurrency | |
| content | Reduced Concurrency |
| parents | throughput_and_response_worse_than_weak_isolation |
| predicate_lock | |
| content | Predicate Lock: lock that bleongs to all objects that match search condition |
| children | better_than_predicate |
| parents | phantom, lock_impl, two_phase_locking |
| index_range_locking | |
| content | Index-range locking |
| children | better_than_predicate, lock_table_as_safe_fallback, next_key_locking (AKA), simplify_predicate (description) |
| parents | lock_impl |
| better_than_predicate | |
| content | Better than predicate locking |
| children | simplify_predicate |
| parents | index_range_locking, predicate_lock |
| protects_against_write_skew_phantom | |
| content | Protects aginst write skew and phantoms |
| parents | write_skew, phantom |
| next_key_locking | |
| content | next-key locking |
| parents | index_range_locking |
| simplify_predicate | |
| content | Simplify predicate by making it match greater set of objects |
| parents | better_than_predicate, index_range_locking |
| lock_table_as_safe_fallback | |
| content | Safe (but expensive) Fallback: shared lock on entire table if not suitable range found |
| parents | index_range_locking |
| SSI | |
| content | Serializable Snapshot Isolation (SSI) |
| children | SSI_overview, abort_rate_affects_perf, full_serializability_small_penalty, granular_tracking_more_overhead, optimistic_concurrency |
| parents | serializable_isolation, snapshot_isolation |
| full_serializability_small_penalty | |
| content | Provides full serializability and only a small performance penalty compared to snapshop isolation |
| parents | SSI |
| pessimistic_concurrency | |
| content | Pessimistic Concurrency Control: if something goes wrong, better to wait until it's safe to do something |
| children | mutex (similar), optimistic_concurrency (vs) |
| parents | two_phase_locking |
| mutex | |
| content | Mutal Exclusion |
| parents | pessimistic_concurrency |
| premise | |
| content | Premise: fact that was true at beginning of transaction |
| children | db_detects_outdated_premise_abort |
| parents | write_skew |
| optimistic_concurrency | |
| content | Optimistic Concurrency Control: keep going if things go wrong in the hope that everything willl turn out all right |
| children | poor_perf_high_contention, usually_better_than_pessimistic |
| parents | pessimistic_concurrency, SSI |
| poor_perf_high_contention | |
| content | Performs badly if there is high contention (many transactions trying to access same object) |
| parents | optimistic_concurrency |
| usually_better_than_pessimistic | |
| content | Usually better than pessimistic concurrency control |
| parents | optimistic_concurrency |
| SSI_overview | |
| content | Builds on top of snapshot isolation, detects serialization conflicts, determines which transactions to abort |
| children | db_detects_outdated_premise_abort |
| parents | SSI |
| db_detects_outdated_premise_abort | |
| content | Database must detect outdated premises and abort |
| children | stale_MVCC |
| parents | SSI_overview, premise |
| stale_MVCC | |
| content | Stale MVCC |
| parents | MVCC, db_detects_outdated_premise_abort |
| abort_rate_affects_perf | |
| content | Abort rate significantly affects performance |
| parents | SSI |
| granular_tracking_more_overhead | |
| content | More granular in tracking transactions yields more bookkeeping overhead |
| parents | SSI |
| abort_reduces_many_errors | |
| content | Reduces large class of errors to do transaction abort |
| parents | transactions |