ch08
dz / designing_data_intensive_applications / ch08Summary
Chapter 8: The Trouble With Distributed Systems
Node Tree
- rapid_feedback
- timing_assumptions
- NTP
- clocks
- faults_partial_failures
- knowledge_truth_and_lies
- large_scale_computing_philosophies
Nodes
| faults_partial_failures | |
| content | Faults and Partial Failures |
| children | partial_failure, unreliable_networks, detecting_faults, deterministic |
| deterministic | |
| content | Deterministic: same opereration produces same result |
| children | nondeterministic (vs) |
| parents | faults_partial_failures |
| partial_failure | |
| content | Partial Failure: parts of distributed system broken in unpredictable way, other parts working fine |
| children | nondeterministic, makes_dist_systems_difficult |
| parents | faults_partial_failures |
| nondeterministic | |
| content | nondeterministic |
| parents | partial_failure, deterministic |
| makes_dist_systems_difficult | |
| content | What makes distributed systems hard to work with |
| parents | partial_failure |
| large_scale_computing_philosophies | |
| content | Large-scale computing philosophies |
| children | HPC, cloud_and_super_computing |
| cloud_and_super_computing | |
| content | Cloud computing and supercomputing |
| children | thousands_of_nodes_something_wrong, HPC (vs), cloud_computing |
| parents | large_scale_computing_philosophies |
| HPC | |
| content | High Performance Computing (HPC) |
| children | checkpoint_and_crash |
| parents | large_scale_computing_philosophies, cloud_and_super_computing |
| checkpoint_and_crash | |
| content | Use checkpoints, let everything crash |
| parents | HPC |
| cloud_computing | |
| content | Cloud Computing |
| children | reliable_system_unreliable_components |
| parents | cloud_and_super_computing |
| thousands_of_nodes_something_wrong | |
| content | When a system has thousands of nodes, it is safe to assume something is wrong. |
| parents | cloud_and_super_computing |
| reliable_system_unreliable_components | |
| content | Reliable system form unreliable components |
| parents | cloud_computing |
| unreliable_networks | |
| content | Unreliable Networks |
| children | APN, impossible_determine_why_request_not_received, network_faults_practice |
| parents | faults_partial_failures |
| APN | |
| content | Asynchronous Packet Networks (APN) |
| children | sync_async_networks, unbounded_delay, impossible_determine_why_request_not_received |
| parents | unreliable_networks |
| impossible_determine_why_request_not_received | |
| content | Impossible to determine why you don't receive request from APN |
| children | timeout |
| parents | unreliable_networks, APN |
| timeout | |
| content | Timeout: after some time, give up waiting and assume response won't arrive |
| children | noisy_neighbor, unbounded_delay, flow_control, how_long_to_declare_dead, ideal_timeout, jitter, network_congestion, no_correct_timeout |
| parents | impossible_determine_why_request_not_received |
| network_faults_practice | |
| content | Network Faults in Practice |
| children | how_software_reacts_to_network_problems, network_fault, network_partition |
| parents | unreliable_networks |
| network_partition | |
| content | Network Partition / Netsplit: One part of network cut off from the rest of network due to a network fault |
| children | network_fault (AKA) |
| parents | network_faults_practice |
| network_fault | |
| content | Network Fault |
| parents | network_faults_practice, network_partition |
| how_software_reacts_to_network_problems | |
| content | Need to know how your software reacts to network problems and that the system can recover from them |
| parents | network_faults_practice |
| detecting_faults | |
| content | Detecting Faults |
| parents | faults_partial_failures |
| rapid_feedback | |
| content | Rapid Feedback about a node being down is useful, but can't count on it |
| children | certain_response_success (the only way to know for sure) |
| certain_response_success | |
| content | Certainty that response that response was successful: positive response from application itself |
| parents | rapid_feedback |
| how_long_to_declare_dead | |
| content | How long to wait before declaring a node is dead? |
| parents | timeout |
| unbounded_delay | |
| content | Unbounded Delay: deliver packets as quickly as possible, no upper limit on packet arrival time. |
| children | variable_delays_dynamic_partitioning, no_correct_timeout |
| parents | timeout, APN |
| ideal_timeout | |
| content | Ideal Timeout: 2d + r, where d is time to deliver, and r is the time it takes to handle request on non-failed node. |
| parents | timeout |
| network_congestion | |
| content | Network Congestion: packets have to wait in queue for a slot |
| parents | timeout |
| flow_control | |
| content | flow control / congestion avboidance / backpressure: (TCP) node limits rate of sending in order to avoid overloading newtwork link or receiving node |
| parents | timeout |
| noisy_neighbor | |
| content | Noisy Neighbor: delays caused by other customer using shared resources. |
| parents | timeout |
| jitter | |
| content | Jitter: systems can continuously monitor response times and their variability |
| parents | timeout |
| sync_async_networks | |
| content | Synchronous vs Asynchronous Networks |
| children | synchronous, ISDN_network, bursty_traffic, circuit, emulating_circuit_switching |
| parents | APN |
| circuit | |
| content | Circuit: (in telephone systems) fixed, guaranteed amount of bandwidth allocated for a call |
| children | synchronous, ISDN_network |
| parents | sync_async_networks |
| synchronous | |
| content | Synchronous: no queuing, 16 bits of space are already reserved. |
| parents | circuit, timing_assumptions, sync_async_networks |
| ISDN_network | |
| content | ISDN Network |
| parents | circuit, sync_async_networks |
| bursty_traffic | |
| content | Internet and datacenter networks optimized for bursty traffic instead of constant bitrate |
| parents | sync_async_networks |
| quality_of_service | |
| content | Quality of Service: priority scheduling of packets |
| parents | emulating_circuit_switching |
| emulating_circuit_switching | |
| content | Emulating circuit switching on asynchronous network or statistically bounded delay |
| children | quality_of_service, admission_control |
| parents | sync_async_networks |
| admission_control | |
| content | Admission Control: rate limit senders |
| parents | emulating_circuit_switching |
| variable_delays_dynamic_partitioning | |
| content | More generally, you can think of variable delays as a consequence of dynamic partitioning |
| children | static_partitioning (dynamic vs static partitioning), dynamic_partitioning |
| parents | unbounded_delay |
| static_partitioning | |
| content | Static Partitioning: latency guarantees, reducied utilization (less expensive) |
| children | dynamic_partitioning (vs) |
| parents | variable_delays_dynamic_partitioning |
| no_correct_timeout | |
| content | No correct timeout, determined experimentally |
| parents | timeout, unbounded_delay |
| dynamic_partitioning | |
| content | Dynamic Partitioning: multi-tenancy, better utilization (cheaper), but variable delays |
| parents | variable_delays_dynamic_partitioning, static_partitioning |
| clocks | |
| content | clocks |
| children | node_clock_not_accurate, smearing, time_of_day_clock, clock_issues, clock_reading_confidence_interval, describing_points_in_time, drift, measuring_duration, monotonic_clock |
| measuring_duration | |
| content | Measuring Duration |
| children | describing_points_in_time (related) |
| parents | clocks |
| describing_points_in_time | |
| content | Describing points in time |
| parents | clocks, measuring_duration |
| node_clock_not_accurate | |
| content | Each node has its own hardware clock (quartz, usually), not perfectly accurate |
| parents | clocks |
| time_of_day_clock | |
| content | Time-of-Day Clock: returns current calendar date |
| children | unsuitable_for_measuing_duration, clock_sync_global_snapshots, monotonic_clock (vs) |
| parents | clocks, NTP |
| monotonic_clock | |
| content | Monotonic Clock: Clock that is guaranteed to move forward in time. Suitable for measuring durations. |
| children | slew, timediff_yields_duration_absolute_meaningless, unsuitable_for_measuing_duration (used for measuring elapsed time), different_timers_per_CPU, drift |
| parents | time_of_day_clock, clocks |
| unsuitable_for_measuing_duration | |
| content | Time jumps (backwards) and (often) ignoring leap seconds make time-of-day clock unsuitable for measuring elapsed time |
| parents | time_of_day_clock, monotonic_clock |
| NTP | |
| content | NTP: network time protocol, syncs clocks according to time reported by group of servers |
| children | slew (NTP uses slew), smearing (how NTP handles leap seconds), time_of_day_clock (synced with), NTP_accuracy_limited_by_network_roundtrip, NTP_slew_amount |
| timediff_yields_duration_absolute_meaningless | |
| content | Difference between times gives duration elapsed, absolute value is meaningless |
| parents | monotonic_clock |
| different_timers_per_CPU | |
| content | Different timers per CPU: unsynchronized. OS compensates, but take monoticity guarantee with grain of salt |
| parents | monotonic_clock, clock_issues |
| slew | |
| content | Slew: to adjust the rate a monotonic clock moves forward. |
| children | NTP_slew_amount |
| parents | monotonic_clock, NTP |
| NTP_slew_amount | |
| content | NTP slew rate up to +/- 0.05%, but no forward/backward jumps |
| parents | slew, NTP |
| drift | |
| content | Drift: clock runs slower/faster than it should |
| children | smearing, timestamps_for_ordering_can_fail, 200ppm_drift |
| parents | clocks, clock_issues, monotonic_clock |
| 200ppm_drift | |
| content | Google assumes 200ppm drift for their servers. 6ms drift for a node resynchronized every 30s, 17s for a node resynchronized once a day |
| parents | drift |
| smearing | |
| content | Smearing: performing leap second gradually over course of day. |
| parents | clocks, drift, NTP |
| timestamps_for_ordering_can_fail | |
| content | Timestamps for ordering events across multiple nodes can fail due to skew. |
| parents | drift |
| NTP_accuracy_limited_by_network_roundtrip | |
| content | NTP accuracy limited by network roundtrip time |
| parents | NTP |
| clock_reading_confidence_interval | |
| content | Clock Reading: less a point in time, more of a range with confidence interval |
| children | truetime_google, uncertainy_bound |
| parents | clocks |
| uncertainy_bound | |
| content | Uncertainty bound computed based on time source |
| parents | clock_reading_confidence_interval |
| truetime_google | |
| content | TrueTime (Google Spanner): explicitely reports confidence interval on local clock |
| children | clock_sync_global_snapshots |
| parents | clock_reading_confidence_interval |
| clock_sync_global_snapshots | |
| content | Clock sync for global snapshots (synched time-of-day-clocks used as transaction IDs) |
| parents | time_of_day_clock, truetime_google |
| clock_issues | |
| content | Clock Issues |
| children | process_pauses, VM_suspended, different_timers_per_CPU, drift |
| parents | clocks |
| process_pauses | |
| content | process pauses |
| children | preempt_thread, steal_time, thrashing, GC_brief_planned_outages, hard_realtime_systems |
| parents | clock_issues |
| VM_suspended | |
| content | VM Suspended |
| parents | clock_issues |
| preempt_thread | |
| content | Preempt running thread at any moment and resume it without any noticing |
| parents | process_pauses |
| thrashing | |
| content | thrashing: OS spends most time swapping pages in/out of memory |
| parents | process_pauses |
| steal_time | |
| content | steal_time: CPU time spent in another thread |
| parents | process_pauses |
| hard_realtime_systems | |
| content | Hard Realtime Systems: specified deadline by which software must respond |
| children | RTOS |
| parents | process_pauses |
| RTOS | |
| content | Real-Time Operating System (RTOS): allows processes to be scheduled with a guaranteed allocation of CPU time in specified intervals |
| children | GC_brief_planned_outages (GC) |
| parents | hard_realtime_systems |
| GC_brief_planned_outages | |
| content | Garbage collection: emerging idewa to treat GC pauses like brief planned outages, let other nodes handle requests |
| parents | RTOS, process_pauses |
| knowledge_truth_and_lies | |
| content | Knowledge, Truth, and Lies |
| children | system_model, truth_defined_by_majority, byzantine_fault, fencing |
| system_model | |
| content | System Model: assumptions about behavior in distrubuted systems. |
| children | node_failures, correctness |
| parents | knowledge_truth_and_lies |
| truth_defined_by_majority | |
| content | Truth defined by majority |
| children | quorum, consensus_algos, node_cant_trust_judgement_of_situation |
| parents | knowledge_truth_and_lies |
| node_cant_trust_judgement_of_situation | |
| content | A node can't trust its own judgement of a situation |
| parents | truth_defined_by_majority |
| quorum | |
| content | Quorum: voting amongst the nodes |
| parents | truth_defined_by_majority |
| consensus_algos | |
| content | Consensus Algorithms |
| parents | truth_defined_by_majority |
| fencing | |
| content | Fencing: in locking/leasing to protect access to a resource, a way to ensure nodes under false belief of ownership can't disrupt rest of system. |
| children | fencing_token |
| parents | knowledge_truth_and_lies |
| fencing_token | |
| content | Fencing Token: returned every time lock or lease granted, It is a number that increses |
| children | fencing_tokens_lock |
| parents | fencing |
| byzantine_fault | |
| content | Byzantine Fault: node claims to have received a particular message when in fact it didn't |
| children | relevant_p2p, byz_fault_tolerant, byz_ft_protocols_complicated, consensus_in_untrusted_environment |
| parents | knowledge_truth_and_lies |
| consensus_in_untrusted_environment | |
| content | Reaching consensus in untrusted environment |
| children | byzantine_generals_problem (terminology) |
| parents | byzantine_fault |
| byzantine_generals_problem | |
| content | Byzantine Generals Problem |
| parents | consensus_in_untrusted_environment |
| byz_fault_tolerant | |
| content | Byzantine Fault Tolerant: oeprates correctly even if some of the nodes are malfunctioning or if there are malicious attackers |
| children | relevant_p2p, byz_ft_protocols_complicated |
| parents | byzantine_fault |
| byz_ft_protocols_complicated | |
| content | Protocoals for Byzantine Fault Tolerant systems are usually complicated and impractical in most use server-side data systems. |
| parents | byzantine_fault, byz_fault_tolerant |
| relevant_p2p | |
| content | Relevant in P2P systems |
| parents | byzantine_fault, byz_fault_tolerant |
| timing_assumptions | |
| content | Timing Assumptions |
| children | partially_synchronous, synchronous, asynchronous |
| asynchronous | |
| content | Asynchronous |
| parents | timing_assumptions |
| partially_synchronous | |
| content | Partially Synchronous |
| parents | timing_assumptions |
| node_failures | |
| content | Node Failures |
| children | byzantine_arbitrary_faults, crash_recoevery_faults, crash_stop_faults |
| parents | system_model |
| crash_stop_faults | |
| content | Crash-stop faults |
| parents | node_failures |
| crash_recoevery_faults | |
| content | Crash Recovery Faults |
| parents | node_failures |
| byzantine_arbitrary_faults | |
| content | Byzantine (arbitrary) faults |
| parents | node_failures |
| correctness | |
| content | Correctness |
| children | proving_correct_doesnt_mean_implementation_correct, correctness_properties |
| parents | system_model |
| correctness_properties | |
| content | Corretness Properties |
| children | safety, fencing_tokens_lock, liveness |
| parents | correctness |
| safety | |
| content | Safety |
| children | nothing_bad_happens (informal description), on_violation_point_to_time, uniqueness, monotonic_sequence |
| parents | correctness_properties |
| liveness | |
| content | liveness |
| children | satisfied_in_future, something_good_eventually_happens (informal description), availability |
| parents | correctness_properties |
| fencing_tokens_lock | |
| content | Fencing tokens for lock |
| children | uniqueness |
| parents | correctness_properties, fencing_token |
| uniqueness | |
| content | Uniqueness |
| parents | safety, fencing_tokens_lock |
| monotonic_sequence | |
| content | Monotonic Sequence |
| parents | safety |
| availability | |
| content | Availability |
| parents | liveness |
| nothing_bad_happens | |
| content | Nothing bad happens |
| parents | safety |
| something_good_eventually_happens | |
| content | something good eventually happens |
| parents | liveness |
| on_violation_point_to_time | |
| content | On violation, able to point to specific point in time. Violation can't be undone. |
| parents | safety |
| satisfied_in_future | |
| content | May not hold at some point in time, but may be satisified in the future |
| parents | liveness |
| proving_correct_doesnt_mean_implementation_correct | |
| content | Proving correct doesn't mean implementation will be, but it's a good first step |
| parents | correctness |