designing_data_intensive_applications/ch08

ch08

dz / designing_data_intensive_applications / ch08

Summary

Chapter 8: The Trouble With Distributed Systems

Node Tree

rapid_feedback
- certain_response_success
timing_assumptions
NTP
clocks
faults_partial_failures
knowledge_truth_and_lies
large_scale_computing_philosophies
- HPC
  - checkpoint_and_crash
- cloud_and_super_computing

Nodes

faults_partial_failures
content	Faults and Partial Failures
children	partial_failure, unreliable_networks, detecting_faults, deterministic

deterministic
content	Deterministic: same opereration produces same result
children	nondeterministic (vs)
parents	faults_partial_failures

partial_failure
content	Partial Failure: parts of distributed system broken in unpredictable way, other parts working fine
children	nondeterministic, makes_dist_systems_difficult
parents	faults_partial_failures

nondeterministic
content	nondeterministic
parents	partial_failure, deterministic

makes_dist_systems_difficult
content	What makes distributed systems hard to work with
parents	partial_failure

large_scale_computing_philosophies
content	Large-scale computing philosophies
children	HPC, cloud_and_super_computing

cloud_and_super_computing
content	Cloud computing and supercomputing
children	thousands_of_nodes_something_wrong, HPC (vs), cloud_computing
parents	large_scale_computing_philosophies

HPC
content	High Performance Computing (HPC)
children	checkpoint_and_crash
parents	large_scale_computing_philosophies, cloud_and_super_computing

checkpoint_and_crash
content	Use checkpoints, let everything crash
parents	HPC

cloud_computing
content	Cloud Computing
children	reliable_system_unreliable_components
parents	cloud_and_super_computing

thousands_of_nodes_something_wrong
content	When a system has thousands of nodes, it is safe to assume something is wrong.
parents	cloud_and_super_computing

reliable_system_unreliable_components
content	Reliable system form unreliable components
parents	cloud_computing

unreliable_networks
content	Unreliable Networks
children	APN, impossible_determine_why_request_not_received, network_faults_practice
parents	faults_partial_failures

APN
content	Asynchronous Packet Networks (APN)
children	sync_async_networks, unbounded_delay, impossible_determine_why_request_not_received
parents	unreliable_networks

impossible_determine_why_request_not_received
content	Impossible to determine why you don't receive request from APN
children	timeout
parents	unreliable_networks, APN

timeout
content	Timeout: after some time, give up waiting and assume response won't arrive
children	noisy_neighbor, unbounded_delay, flow_control, how_long_to_declare_dead, ideal_timeout, jitter, network_congestion, no_correct_timeout
parents	impossible_determine_why_request_not_received

network_faults_practice
content	Network Faults in Practice
children	how_software_reacts_to_network_problems, network_fault, network_partition
parents	unreliable_networks

network_partition
content	Network Partition / Netsplit: One part of network cut off from the rest of network due to a network fault
children	network_fault (AKA)
parents	network_faults_practice

network_fault
content	Network Fault
parents	network_faults_practice, network_partition

how_software_reacts_to_network_problems
content	Need to know how your software reacts to network problems and that the system can recover from them
parents	network_faults_practice

detecting_faults
content	Detecting Faults
parents	faults_partial_failures

rapid_feedback
content	Rapid Feedback about a node being down is useful, but can't count on it
children	certain_response_success (the only way to know for sure)

certain_response_success
content	Certainty that response that response was successful: positive response from application itself
parents	rapid_feedback

how_long_to_declare_dead
content	How long to wait before declaring a node is dead?
parents	timeout

unbounded_delay
content	Unbounded Delay: deliver packets as quickly as possible, no upper limit on packet arrival time.
children	variable_delays_dynamic_partitioning, no_correct_timeout
parents	timeout, APN

ideal_timeout
content	Ideal Timeout: 2d + r, where d is time to deliver, and r is the time it takes to handle request on non-failed node.
parents	timeout

network_congestion
content	Network Congestion: packets have to wait in queue for a slot
parents	timeout

flow_control
content	flow control / congestion avboidance / backpressure: (TCP) node limits rate of sending in order to avoid overloading newtwork link or receiving node
parents	timeout

noisy_neighbor
content	Noisy Neighbor: delays caused by other customer using shared resources.
parents	timeout

jitter
content	Jitter: systems can continuously monitor response times and their variability
parents	timeout

sync_async_networks
content	Synchronous vs Asynchronous Networks
children	synchronous, ISDN_network, bursty_traffic, circuit, emulating_circuit_switching
parents	APN

circuit
content	Circuit: (in telephone systems) fixed, guaranteed amount of bandwidth allocated for a call
children	synchronous, ISDN_network
parents	sync_async_networks

synchronous
content	Synchronous: no queuing, 16 bits of space are already reserved.
parents	circuit, timing_assumptions, sync_async_networks

ISDN_network
content	ISDN Network
parents	circuit, sync_async_networks

bursty_traffic
content	Internet and datacenter networks optimized for bursty traffic instead of constant bitrate
parents	sync_async_networks

quality_of_service
content	Quality of Service: priority scheduling of packets
parents	emulating_circuit_switching

emulating_circuit_switching
content	Emulating circuit switching on asynchronous network or statistically bounded delay
children	quality_of_service, admission_control
parents	sync_async_networks

admission_control
content	Admission Control: rate limit senders
parents	emulating_circuit_switching

variable_delays_dynamic_partitioning
content	More generally, you can think of variable delays as a consequence of dynamic partitioning
children	static_partitioning (dynamic vs static partitioning), dynamic_partitioning
parents	unbounded_delay

static_partitioning
content	Static Partitioning: latency guarantees, reducied utilization (less expensive)
children	dynamic_partitioning (vs)
parents	variable_delays_dynamic_partitioning

no_correct_timeout
content	No correct timeout, determined experimentally
parents	timeout, unbounded_delay

dynamic_partitioning
content	Dynamic Partitioning: multi-tenancy, better utilization (cheaper), but variable delays
parents	variable_delays_dynamic_partitioning, static_partitioning

clocks
content	clocks
children	node_clock_not_accurate, smearing, time_of_day_clock, clock_issues, clock_reading_confidence_interval, describing_points_in_time, drift, measuring_duration, monotonic_clock

measuring_duration
content	Measuring Duration
children	describing_points_in_time (related)
parents	clocks

describing_points_in_time
content	Describing points in time
parents	clocks, measuring_duration

node_clock_not_accurate
content	Each node has its own hardware clock (quartz, usually), not perfectly accurate
parents	clocks

time_of_day_clock
content	Time-of-Day Clock: returns current calendar date
children	unsuitable_for_measuing_duration, clock_sync_global_snapshots, monotonic_clock (vs)
parents	clocks, NTP

monotonic_clock
content	Monotonic Clock: Clock that is guaranteed to move forward in time. Suitable for measuring durations.
children	slew, timediff_yields_duration_absolute_meaningless, unsuitable_for_measuing_duration (used for measuring elapsed time), different_timers_per_CPU, drift
parents	time_of_day_clock, clocks

unsuitable_for_measuing_duration
content	Time jumps (backwards) and (often) ignoring leap seconds make time-of-day clock unsuitable for measuring elapsed time
parents	time_of_day_clock, monotonic_clock

NTP
content	NTP: network time protocol, syncs clocks according to time reported by group of servers
children	slew (NTP uses slew), smearing (how NTP handles leap seconds), time_of_day_clock (synced with), NTP_accuracy_limited_by_network_roundtrip, NTP_slew_amount

timediff_yields_duration_absolute_meaningless
content	Difference between times gives duration elapsed, absolute value is meaningless
parents	monotonic_clock

different_timers_per_CPU
content	Different timers per CPU: unsynchronized. OS compensates, but take monoticity guarantee with grain of salt
parents	monotonic_clock, clock_issues

slew
content	Slew: to adjust the rate a monotonic clock moves forward.
children	NTP_slew_amount
parents	monotonic_clock, NTP

NTP_slew_amount
content	NTP slew rate up to +/- 0.05%, but no forward/backward jumps
parents	slew, NTP

drift
content	Drift: clock runs slower/faster than it should
children	smearing, timestamps_for_ordering_can_fail, 200ppm_drift
parents	clocks, clock_issues, monotonic_clock

200ppm_drift
content	Google assumes 200ppm drift for their servers. 6ms drift for a node resynchronized every 30s, 17s for a node resynchronized once a day
parents	drift

smearing
content	Smearing: performing leap second gradually over course of day.
parents	clocks, drift, NTP

timestamps_for_ordering_can_fail
content	Timestamps for ordering events across multiple nodes can fail due to skew.
parents	drift

NTP_accuracy_limited_by_network_roundtrip
content	NTP accuracy limited by network roundtrip time
parents	NTP

clock_reading_confidence_interval
content	Clock Reading: less a point in time, more of a range with confidence interval
children	truetime_google, uncertainy_bound
parents	clocks

uncertainy_bound
content	Uncertainty bound computed based on time source
parents	clock_reading_confidence_interval

truetime_google
content	TrueTime (Google Spanner): explicitely reports confidence interval on local clock
children	clock_sync_global_snapshots
parents	clock_reading_confidence_interval

clock_sync_global_snapshots
content	Clock sync for global snapshots (synched time-of-day-clocks used as transaction IDs)
parents	time_of_day_clock, truetime_google

clock_issues
content	Clock Issues
children	process_pauses, VM_suspended, different_timers_per_CPU, drift
parents	clocks

process_pauses
content	process pauses
children	preempt_thread, steal_time, thrashing, GC_brief_planned_outages, hard_realtime_systems
parents	clock_issues

VM_suspended
content	VM Suspended
parents	clock_issues

preempt_thread
content	Preempt running thread at any moment and resume it without any noticing
parents	process_pauses

thrashing
content	thrashing: OS spends most time swapping pages in/out of memory
parents	process_pauses

steal_time
content	steal_time: CPU time spent in another thread
parents	process_pauses

hard_realtime_systems
content	Hard Realtime Systems: specified deadline by which software must respond
children	RTOS
parents	process_pauses

RTOS
content	Real-Time Operating System (RTOS): allows processes to be scheduled with a guaranteed allocation of CPU time in specified intervals
children	GC_brief_planned_outages (GC)
parents	hard_realtime_systems

GC_brief_planned_outages
content	Garbage collection: emerging idewa to treat GC pauses like brief planned outages, let other nodes handle requests
parents	RTOS, process_pauses

knowledge_truth_and_lies
content	Knowledge, Truth, and Lies
children	system_model, truth_defined_by_majority, byzantine_fault, fencing

system_model
content	System Model: assumptions about behavior in distrubuted systems.
children	node_failures, correctness
parents	knowledge_truth_and_lies

truth_defined_by_majority
content	Truth defined by majority
children	quorum, consensus_algos, node_cant_trust_judgement_of_situation
parents	knowledge_truth_and_lies

node_cant_trust_judgement_of_situation
content	A node can't trust its own judgement of a situation
parents	truth_defined_by_majority

quorum
content	Quorum: voting amongst the nodes
parents	truth_defined_by_majority

consensus_algos
content	Consensus Algorithms
parents	truth_defined_by_majority

fencing
content	Fencing: in locking/leasing to protect access to a resource, a way to ensure nodes under false belief of ownership can't disrupt rest of system.
children	fencing_token
parents	knowledge_truth_and_lies

fencing_token
content	Fencing Token: returned every time lock or lease granted, It is a number that increses
children	fencing_tokens_lock
parents	fencing

byzantine_fault
content	Byzantine Fault: node claims to have received a particular message when in fact it didn't
children	relevant_p2p, byz_fault_tolerant, byz_ft_protocols_complicated, consensus_in_untrusted_environment
parents	knowledge_truth_and_lies

consensus_in_untrusted_environment
content	Reaching consensus in untrusted environment
children	byzantine_generals_problem (terminology)
parents	byzantine_fault

byzantine_generals_problem
content	Byzantine Generals Problem
parents	consensus_in_untrusted_environment

byz_fault_tolerant
content	Byzantine Fault Tolerant: oeprates correctly even if some of the nodes are malfunctioning or if there are malicious attackers
children	relevant_p2p, byz_ft_protocols_complicated
parents	byzantine_fault

byz_ft_protocols_complicated
content	Protocoals for Byzantine Fault Tolerant systems are usually complicated and impractical in most use server-side data systems.
parents	byzantine_fault, byz_fault_tolerant

relevant_p2p
content	Relevant in P2P systems
parents	byzantine_fault, byz_fault_tolerant

timing_assumptions
content	Timing Assumptions
children	partially_synchronous, synchronous, asynchronous

asynchronous
content	Asynchronous
parents	timing_assumptions

partially_synchronous
content	Partially Synchronous
parents	timing_assumptions

node_failures
content	Node Failures
children	byzantine_arbitrary_faults, crash_recoevery_faults, crash_stop_faults
parents	system_model

crash_stop_faults
content	Crash-stop faults
parents	node_failures

crash_recoevery_faults
content	Crash Recovery Faults
parents	node_failures

byzantine_arbitrary_faults
content	Byzantine (arbitrary) faults
parents	node_failures

correctness
content	Correctness
children	proving_correct_doesnt_mean_implementation_correct, correctness_properties
parents	system_model

correctness_properties
content	Corretness Properties
children	safety, fencing_tokens_lock, liveness
parents	correctness

safety
content	Safety
children	nothing_bad_happens (informal description), on_violation_point_to_time, uniqueness, monotonic_sequence
parents	correctness_properties

liveness
content	liveness
children	satisfied_in_future, something_good_eventually_happens (informal description), availability
parents	correctness_properties

fencing_tokens_lock
content	Fencing tokens for lock
children	uniqueness
parents	correctness_properties, fencing_token

uniqueness
content	Uniqueness
parents	safety, fencing_tokens_lock

monotonic_sequence
content	Monotonic Sequence
parents	safety

availability
content	Availability
parents	liveness

nothing_bad_happens
content	Nothing bad happens
parents	safety

something_good_eventually_happens
content	something good eventually happens
parents	liveness

on_violation_point_to_time
content	On violation, able to point to specific point in time. Violation can't be undone.
parents	safety

satisfied_in_future
content	May not hold at some point in time, but may be satisified in the future
parents	liveness

proving_correct_doesnt_mean_implementation_correct
content	Proving correct doesn't mean implementation will be, but it's a good first step
parents	correctness