designing_data_intensive_applications/ch08

ch08

dz / designing_data_intensive_applications / ch08

Summary

Chapter 8: The Trouble With Distributed Systems

Node Tree

Nodes

faults_partial_failures
content Faults and Partial Failures
children partial_failure, unreliable_networks, detecting_faults, deterministic

deterministic
content Deterministic: same opereration produces same result
children nondeterministic (vs)
parents faults_partial_failures

partial_failure
content Partial Failure: parts of distributed system broken in unpredictable way, other parts working fine
children makes_dist_systems_difficult, nondeterministic
parents faults_partial_failures

nondeterministic
content nondeterministic
parents deterministic, partial_failure

makes_dist_systems_difficult
content What makes distributed systems hard to work with
parents partial_failure

large_scale_computing_philosophies
content Large-scale computing philosophies
children HPC, cloud_and_super_computing

cloud_and_super_computing
content Cloud computing and supercomputing
children thousands_of_nodes_something_wrong, HPC (vs), cloud_computing
parents large_scale_computing_philosophies

HPC
content High Performance Computing (HPC)
children checkpoint_and_crash
parents cloud_and_super_computing, large_scale_computing_philosophies

checkpoint_and_crash
content Use checkpoints, let everything crash
parents HPC

cloud_computing
content Cloud Computing
children reliable_system_unreliable_components
parents cloud_and_super_computing

thousands_of_nodes_something_wrong
content When a system has thousands of nodes, it is safe to assume something is wrong.
parents cloud_and_super_computing

reliable_system_unreliable_components
content Reliable system form unreliable components
parents cloud_computing

unreliable_networks
content Unreliable Networks
children impossible_determine_why_request_not_received, network_faults_practice, APN
parents faults_partial_failures

APN
content Asynchronous Packet Networks (APN)
children impossible_determine_why_request_not_received, sync_async_networks, unbounded_delay
parents unreliable_networks

impossible_determine_why_request_not_received
content Impossible to determine why you don't receive request from APN
children timeout
parents unreliable_networks, APN

timeout
content Timeout: after some time, give up waiting and assume response won't arrive
children flow_control, how_long_to_declare_dead, ideal_timeout, jitter, network_congestion, no_correct_timeout, noisy_neighbor, unbounded_delay
parents impossible_determine_why_request_not_received

network_faults_practice
content Network Faults in Practice
children how_software_reacts_to_network_problems, network_fault, network_partition
parents unreliable_networks

network_partition
content Network Partition / Netsplit: One part of network cut off from the rest of network due to a network fault
children network_fault (AKA)
parents network_faults_practice

network_fault
content Network Fault
parents network_partition, network_faults_practice

how_software_reacts_to_network_problems
content Need to know how your software reacts to network problems and that the system can recover from them
parents network_faults_practice

detecting_faults
content Detecting Faults
parents faults_partial_failures

rapid_feedback
content Rapid Feedback about a node being down is useful, but can't count on it
children certain_response_success (the only way to know for sure)

certain_response_success
content Certainty that response that response was successful: positive response from application itself
parents rapid_feedback

how_long_to_declare_dead
content How long to wait before declaring a node is dead?
parents timeout

unbounded_delay
content Unbounded Delay: deliver packets as quickly as possible, no upper limit on packet arrival time.
children no_correct_timeout, variable_delays_dynamic_partitioning
parents APN, timeout

ideal_timeout
content Ideal Timeout: 2d + r, where d is time to deliver, and r is the time it takes to handle request on non-failed node.
parents timeout

network_congestion
content Network Congestion: packets have to wait in queue for a slot
parents timeout

flow_control
content flow control / congestion avboidance / backpressure: (TCP) node limits rate of sending in order to avoid overloading newtwork link or receiving node
parents timeout

noisy_neighbor
content Noisy Neighbor: delays caused by other customer using shared resources.
parents timeout

jitter
content Jitter: systems can continuously monitor response times and their variability
parents timeout

sync_async_networks
content Synchronous vs Asynchronous Networks
children synchronous, ISDN_network, bursty_traffic, circuit, emulating_circuit_switching
parents APN

circuit
content Circuit: (in telephone systems) fixed, guaranteed amount of bandwidth allocated for a call
children synchronous, ISDN_network
parents sync_async_networks

synchronous
content Synchronous: no queuing, 16 bits of space are already reserved.
parents timing_assumptions, sync_async_networks, circuit

ISDN_network
content ISDN Network
parents sync_async_networks, circuit

bursty_traffic
content Internet and datacenter networks optimized for bursty traffic instead of constant bitrate
parents sync_async_networks

quality_of_service
content Quality of Service: priority scheduling of packets
parents emulating_circuit_switching

emulating_circuit_switching
content Emulating circuit switching on asynchronous network or statistically bounded delay
children quality_of_service, admission_control
parents sync_async_networks

admission_control
content Admission Control: rate limit senders
parents emulating_circuit_switching

variable_delays_dynamic_partitioning
content More generally, you can think of variable delays as a consequence of dynamic partitioning
children static_partitioning (dynamic vs static partitioning), dynamic_partitioning
parents unbounded_delay

static_partitioning
content Static Partitioning: latency guarantees, reducied utilization (less expensive)
children dynamic_partitioning (vs)
parents variable_delays_dynamic_partitioning

no_correct_timeout
content No correct timeout, determined experimentally
parents unbounded_delay, timeout

dynamic_partitioning
content Dynamic Partitioning: multi-tenancy, better utilization (cheaper), but variable delays
parents static_partitioning, variable_delays_dynamic_partitioning

clocks
content clocks
children measuring_duration, monotonic_clock, node_clock_not_accurate, smearing, time_of_day_clock, clock_issues, clock_reading_confidence_interval, describing_points_in_time, drift

measuring_duration
content Measuring Duration
children describing_points_in_time (related)
parents clocks

describing_points_in_time
content Describing points in time
parents measuring_duration, clocks

node_clock_not_accurate
content Each node has its own hardware clock (quartz, usually), not perfectly accurate
parents clocks

time_of_day_clock
content Time-of-Day Clock: returns current calendar date
children monotonic_clock (vs), unsuitable_for_measuing_duration, clock_sync_global_snapshots
parents NTP, clocks

monotonic_clock
content Monotonic Clock: Clock that is guaranteed to move forward in time. Suitable for measuring durations.
children slew, timediff_yields_duration_absolute_meaningless, unsuitable_for_measuing_duration (used for measuring elapsed time), different_timers_per_CPU, drift
parents time_of_day_clock, clocks

unsuitable_for_measuing_duration
content Time jumps (backwards) and (often) ignoring leap seconds make time-of-day clock unsuitable for measuring elapsed time
parents time_of_day_clock, monotonic_clock

NTP
content NTP: network time protocol, syncs clocks according to time reported by group of servers
children slew (NTP uses slew), smearing (how NTP handles leap seconds), time_of_day_clock (synced with), NTP_accuracy_limited_by_network_roundtrip, NTP_slew_amount

timediff_yields_duration_absolute_meaningless
content Difference between times gives duration elapsed, absolute value is meaningless
parents monotonic_clock

different_timers_per_CPU
content Different timers per CPU: unsynchronized. OS compensates, but take monoticity guarantee with grain of salt
parents clock_issues, monotonic_clock

slew
content Slew: to adjust the rate a monotonic clock moves forward.
children NTP_slew_amount
parents NTP, monotonic_clock

NTP_slew_amount
content NTP slew rate up to +/- 0.05%, but no forward/backward jumps
parents NTP, slew

drift
content Drift: clock runs slower/faster than it should
children smearing, timestamps_for_ordering_can_fail, 200ppm_drift
parents clock_issues, clocks, monotonic_clock

200ppm_drift
content Google assumes 200ppm drift for their servers. 6ms drift for a node resynchronized every 30s, 17s for a node resynchronized once a day
parents drift

smearing
content Smearing: performing leap second gradually over course of day.
parents drift, NTP, clocks

timestamps_for_ordering_can_fail
content Timestamps for ordering events across multiple nodes can fail due to skew.
parents drift

NTP_accuracy_limited_by_network_roundtrip
content NTP accuracy limited by network roundtrip time
parents NTP

clock_reading_confidence_interval
content Clock Reading: less a point in time, more of a range with confidence interval
children truetime_google, uncertainy_bound
parents clocks

uncertainy_bound
content Uncertainty bound computed based on time source
parents clock_reading_confidence_interval

truetime_google
content TrueTime (Google Spanner): explicitely reports confidence interval on local clock
children clock_sync_global_snapshots
parents clock_reading_confidence_interval

clock_sync_global_snapshots
content Clock sync for global snapshots (synched time-of-day-clocks used as transaction IDs)
parents truetime_google, time_of_day_clock

clock_issues
content Clock Issues
children process_pauses, VM_suspended, different_timers_per_CPU, drift
parents clocks

process_pauses
content process pauses
children hard_realtime_systems, preempt_thread, steal_time, thrashing, GC_brief_planned_outages
parents clock_issues

VM_suspended
content VM Suspended
parents clock_issues

preempt_thread
content Preempt running thread at any moment and resume it without any noticing
parents process_pauses

thrashing
content thrashing: OS spends most time swapping pages in/out of memory
parents process_pauses

steal_time
content steal_time: CPU time spent in another thread
parents process_pauses

hard_realtime_systems
content Hard Realtime Systems: specified deadline by which software must respond
children RTOS
parents process_pauses

RTOS
content Real-Time Operating System (RTOS): allows processes to be scheduled with a guaranteed allocation of CPU time in specified intervals
children GC_brief_planned_outages (GC)
parents hard_realtime_systems

GC_brief_planned_outages
content Garbage collection: emerging idewa to treat GC pauses like brief planned outages, let other nodes handle requests
parents process_pauses, RTOS

knowledge_truth_and_lies
content Knowledge, Truth, and Lies
children fencing, system_model, truth_defined_by_majority, byzantine_fault

system_model
content System Model: assumptions about behavior in distrubuted systems.
children node_failures, correctness
parents knowledge_truth_and_lies

truth_defined_by_majority
content Truth defined by majority
children node_cant_trust_judgement_of_situation, quorum, consensus_algos
parents knowledge_truth_and_lies

node_cant_trust_judgement_of_situation
content A node can't trust its own judgement of a situation
parents truth_defined_by_majority

quorum
content Quorum: voting amongst the nodes
parents truth_defined_by_majority

consensus_algos
content Consensus Algorithms
parents truth_defined_by_majority

fencing
content Fencing: in locking/leasing to protect access to a resource, a way to ensure nodes under false belief of ownership can't disrupt rest of system.
children fencing_token
parents knowledge_truth_and_lies

fencing_token
content Fencing Token: returned every time lock or lease granted, It is a number that increses
children fencing_tokens_lock
parents fencing

byzantine_fault
content Byzantine Fault: node claims to have received a particular message when in fact it didn't
children relevant_p2p, byz_fault_tolerant, byz_ft_protocols_complicated, consensus_in_untrusted_environment
parents knowledge_truth_and_lies

consensus_in_untrusted_environment
content Reaching consensus in untrusted environment
children byzantine_generals_problem (terminology)
parents byzantine_fault

byzantine_generals_problem
content Byzantine Generals Problem
parents consensus_in_untrusted_environment

byz_fault_tolerant
content Byzantine Fault Tolerant: oeprates correctly even if some of the nodes are malfunctioning or if there are malicious attackers
children relevant_p2p, byz_ft_protocols_complicated
parents byzantine_fault

byz_ft_protocols_complicated
content Protocoals for Byzantine Fault Tolerant systems are usually complicated and impractical in most use server-side data systems.
parents byz_fault_tolerant, byzantine_fault

relevant_p2p
content Relevant in P2P systems
parents byz_fault_tolerant, byzantine_fault

timing_assumptions
content Timing Assumptions
children partially_synchronous, synchronous, asynchronous

asynchronous
content Asynchronous
parents timing_assumptions

partially_synchronous
content Partially Synchronous
parents timing_assumptions

node_failures
content Node Failures
children byzantine_arbitrary_faults, crash_recoevery_faults, crash_stop_faults
parents system_model

crash_stop_faults
content Crash-stop faults
parents node_failures

crash_recoevery_faults
content Crash Recovery Faults
parents node_failures

byzantine_arbitrary_faults
content Byzantine (arbitrary) faults
parents node_failures

correctness
content Correctness
children proving_correct_doesnt_mean_implementation_correct, correctness_properties
parents system_model

correctness_properties
content Corretness Properties
children fencing_tokens_lock, liveness, safety
parents correctness

safety
content Safety
children monotonic_sequence, nothing_bad_happens (informal description), on_violation_point_to_time, uniqueness
parents correctness_properties

liveness
content liveness
children satisfied_in_future, something_good_eventually_happens (informal description), availability
parents correctness_properties

fencing_tokens_lock
content Fencing tokens for lock
children uniqueness
parents correctness_properties, fencing_token

uniqueness
content Uniqueness
parents fencing_tokens_lock, safety

monotonic_sequence
content Monotonic Sequence
parents safety

availability
content Availability
parents liveness

nothing_bad_happens
content Nothing bad happens
parents safety

something_good_eventually_happens
content something good eventually happens
parents liveness

on_violation_point_to_time
content On violation, able to point to specific point in time. Violation can't be undone.
parents safety

satisfied_in_future
content May not hold at some point in time, but may be satisified in the future
parents liveness

proving_correct_doesnt_mean_implementation_correct
content Proving correct doesn't mean implementation will be, but it's a good first step
parents correctness