ch01
Summary
Chapter 1 of Designing Data-Intensive Applications: Reliable, Scalable, and Maintainable Applications
Node Tree
- describing_performance
- legacy_system
- load_cope_approaches
- maintainability
- managing_complexity
- more_efficient_than_rolling_window
- multiple_calls_high_latency_probability
- reliability
- scalability
- small_request_holdup_rest
Nodes
| reliability | |
| content | Reliability |
| children | faults, continue_work_correctly (definition) |
| continue_work_correctly | |
| content | Continuing to work correctly, even when things go wrong |
| parents | faults, reliability |
| faults | |
| content | faults |
| children | deviation_spec, hardware, human, not_equal_to_failure (Faults are not equivalent to failure), software, things_go_wrong (definition), anticipate_cope, continue_work_correctly |
| parents | reliability |
| things_go_wrong | |
| content | things that go wrong |
| parents | faults |
| anticipate_cope | |
| content | anticipate/cope |
| children | fault_tolerant, resilient |
| parents | faults |
| fault_tolerant | |
| content | Fault Tolerant |
| parents | anticipate_cope |
| resilient | |
| content | Resilient |
| parents | anticipate_cope |
| not_equal_to_failure | |
| content | Not equivalent to failure |
| children | systems_stop_providing_service (this is what a failure is) |
| parents | faults |
| deviation_spec | |
| content | Deviation from Spec |
| parents | faults |
| systems_stop_providing_service | |
| content | Systems as a whole stop providing service to user |
| parents | not_equal_to_failure |
| hardware | |
| content | Hardware |
| children | hardware_failure_examples (examples), mean_time_to_failure_disks, tolerate_loss_entire_machine, MTTF |
| parents | faults |
| hardware_failure_examples | |
| content | Examples of hardware failure: disk crash, faulty RAM, power outage |
| parents | hardware |
| mean_time_to_failure_disks | |
| content | Mean Time To Failure (MTTF) for disks: about 10 to 50 years |
| parents | hardware, MTTF |
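As a rough worked example of what a disk MTTF of a decade or more means at cluster scale, the sketch below estimates the expected number of disk failures per day. The cluster size of 10,000 disks, the sample MTTF values, and the function name are illustrative assumptions, and the model assumes independent failures at a constant rate.

```python
def expected_failures_per_day(num_disks: int, mttf_years: float) -> float:
    """Expected disk failures per day, assuming each disk fails
    independently at a constant rate of 1 / MTTF."""
    mttf_days = mttf_years * 365
    return num_disks / mttf_days


# Illustrative values only: a 10,000-disk cluster at two sample MTTFs.
for mttf_years in (10, 30):
    rate = expected_failures_per_day(10_000, mttf_years)
    print(f"MTTF {mttf_years} years: about {rate:.2f} failures per day")
```

Even with a long per-disk MTTF, a large enough cluster sees disk failures on the order of one per day, which is why tolerating the loss of individual components matters.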
| MTTF | |
| content | Mean Time To Failure (MTTF) |
| children | mean_time_to_failure_disks |
| parents | hardware |
| tolerate_loss_entire_machine | |
| content | Tolerate loss of entire machine |
| children | rolling_updates |
| parents | hardware |
| rolling_updates | |
| content | Rolling updates: patch applied one node at a time |
| parents | tolerate_loss_entire_machine |
| software | |
| content | Software |
| children | runaway_process (Kind of software fault), bugs (Kind of software fault), corrupted_service (Kind of software fault) |
| parents | faults |
| bugs | |
| content | Bugs |
| parents | software |
| runaway_process | |
| content | Runaway Process |
| parents | software |
| corrupted_service | |
| content | Dependency service that slows down, becomes unresponsive, or returns corrupted responses |
| parents | software |
| human | |
| content | Human |
| children | decouple_mistakes, detailed_clear_monitoring, good_management, minimize_opportunities, quick_easy_recovery |
| parents | faults |
| decouple_mistakes | |
| content | Decouple mistakes from failures |
| children | sandbox |
| parents | human |
| minimize_opportunities | |
| content | Minimize Opportunities |
| parents | human |
| quick_easy_recovery | |
| content | Quick, Easy Recovery |
| parents | human |
| detailed_clear_monitoring | |
| content | Detailed and Clear Monitoring |
| children | telemetry |
| parents | human |
| telemetry | |
| content | Telemetry |
| parents | detailed_clear_monitoring |
| sandbox | |
| content | sandbox |
| parents | decouple_mistakes |
| good_management | |
| content | Good Management Practices |
| parents | human |
| scalability | |
| content | Scalability |
| children | describing_load, cope_increased_load |
| cope_increased_load | |
| content | System's ability to cope with increased load |
| parents | scalability |
| describing_load | |
| content | Describing Load |
| children | ex_twitter, load_params |
| parents | scalability |
| ex_twitter | |
| content | Example: Twitter |
| children | post_tweet |
| parents | describing_load |
| load_params | |
| content | Load Parameters |
| children | follows_per_user, load_param_increased |
| parents | describing_load |
| post_tweet | |
| content | Post a Tweet |
| children | approach_SQL, approach_cache |
| parents | ex_twitter |
| approach_SQL | |
| content | Approach A: SQL Join |
| children | could_keep_up, post_tweet_more_work (Compared to) |
| parents | post_tweet |
| approach_cache | |
| content | Approach B: cache each user's home timeline |
| children | could_keep_up (solution), fan_out, faster_reads, hybrid_approach, post_tweet_more_work |
| parents | post_tweet |
| could_keep_up | |
| content | Initial approach couldn't keep up with the load of home timeline queries |
| parents | approach_SQL, approach_cache |
| faster_reads | |
| content | Faster Reads |
| parents | approach_cache |
| post_tweet_more_work | |
| content | Posting a tweet takes more work |
| parents | approach_SQL, approach_cache |
| follows_per_user | |
| content | Followers per User: key load parameter for scalability |
| children | fan_out |
| parents | load_params |
| fan_out | |
| content | Fan Out |
| parents | approach_cache, follows_per_user |
| hybrid_approach | |
| content | Hybrid Approach: tweets from users with a very large number of followers (celebrities) handled separately |
| parents | approach_cache |
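The trade-off between the two approaches can be sketched with in-memory stand-ins. This is a minimal illustration, not Twitter's implementation: the data structures and function names (`followers`, `tweets`, `timeline_cache`, `post_tweet_a`, and so on) are hypothetical, and a list comprehension stands in for the SQL join.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the real data stores.
followers = defaultdict(set)         # author -> set of users following them
tweets = []                          # global list of (author, text), oldest first
timeline_cache = defaultdict(list)   # user -> precomputed home timeline


def follow(follower, followee):
    followers[followee].add(follower)


# Approach A: cheap write, expensive read (timeline assembled at read time).
def post_tweet_a(author, text):
    tweets.append((author, text))


def home_timeline_a(user):
    # Stands in for the join across tweets and follows.
    return [(a, t) for (a, t) in tweets if user in followers[a]]


# Approach B: expensive write (fan out to every follower), cheap read.
def post_tweet_b(author, text):
    for follower in followers[author]:
        timeline_cache[follower].append((author, text))


def home_timeline_b(user):
    return list(timeline_cache[user])


follow("bob", "alice")
post_tweet_a("alice", "hello")
post_tweet_b("alice", "hello")
print(home_timeline_a("bob"), home_timeline_b("bob"))
```

In approach B the cost of `post_tweet_b` grows with the author's follower count, which is why followers per user is the load parameter that determines fan-out, and why accounts with very many followers are a natural candidate for separate handling.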
| describing_performance | |
| content | Describing Performance |
| children | load_param_increased, response_time, throughput |
| load_param_increased | |
| content | When you increase a load parameter |
| children | keep_resources_unchanged, maintain_performance |
| parents | load_params, describing_performance |
| keep_resources_unchanged | |
| content | ...and keep resources unchanged, how is system performance affected? |
| parents | load_param_increased |
| maintain_performance | |
| content | how much of an increase in resources is needed to maintain current performance? |
| parents | load_param_increased |
| throughput | |
| content | throughput |
| children | num_records_processed_per_second, batch_process_system |
| parents | describing_performance |
| num_records_processed_per_second | |
| content | Number of records processed per second |
| parents | throughput |
| batch_process_system | |
| content | Batch process system |
| parents | throughput |
| response_time | |
| content | Response Time |
| children | distribution_of_values, latency (Latency and response time are often used interchangeably, but they measure different things.), online_systems (response time is a metric used in the context of systems and services that are online), time_btwn_request_response (definition) |
| parents | describing_performance |
| time_btwn_request_response | |
| content | Time between request and response |
| parents | response_time |
| online_systems | |
| content | Online Systems |
| parents | response_time |
| latency | |
| content | Latency: duration that a request waits to be handled |
| parents | response_time |
| distribution_of_values | |
| content | Distribution of Values |
| children | median, outliers, percentiles, avg_mean |
| parents | response_time |
| avg_mean | |
| content | Average/mean |
| parents | distribution_of_values |
| median | |
| content | Median |
| parents | percentiles, distribution_of_values |
| outliers | |
| content | Outliers |
| children | how_bad (Quantifying how bad the outliers are) |
| parents | distribution_of_values |
| how_bad | |
| content | How bad are the outliers? p95, p99, p999. |
| children | tail_latencies, 50th_percentile |
| parents | outliers, percentiles |
| percentiles | |
| content | Percentiles |
| children | how_bad, median, percentiles_in_practice, service_level_agreements, service_level_objectives, 50th_percentile |
| parents | distribution_of_values |
| 50th_percentile | |
| content | 50th Percentile: p50 |
| parents | how_bad, percentiles |
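To make the difference between the mean, the median, and the higher percentiles concrete, here is a small sketch using the nearest-rank method, which is one of several conventions for computing percentiles. The sample response times and the `percentile` helper are made up for illustration.

```python
import math


def percentile(sorted_times, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not sorted_times:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p / 100 * len(sorted_times)))
    return sorted_times[rank - 1]


# Made-up response times in milliseconds, including two slow outliers.
response_times_ms = sorted([12, 15, 11, 230, 14, 13, 16, 18, 1200, 17])

mean = sum(response_times_ms) / len(response_times_ms)
print(f"mean = {mean:.1f} ms")
for p in (50, 95, 99):
    print(f"p{p} = {percentile(response_times_ms, p)} ms")
```

With these samples the mean is pulled far above what a typical request experiences, while the median (p50) stays near the typical value and the high percentiles expose the outliers.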
| tail_latencies | |
| content | Tail Latencies |
| children | head_of_line_blocking, tail_latency_amplification |
| parents | how_bad |
| service_level_objectives | |
| content | Service Level Objectives (SLO) |
| parents | percentiles |
| service_level_agreements | |
| content | Service Level Agreements (SLA) |
| parents | percentiles |
| head_of_line_blocking | |
| content | head of line blocking |
| parents | tail_latencies |
| small_request_holdup_rest | |
| content | small number of requests holding up subsequent requests |
| percentiles_in_practice | |
| content | Percentiles in Practice |
| children | rolling_window, tail_latency_amplification |
| parents | percentiles |
| tail_latency_amplification | |
| content | Tail Latency Amplification |
| parents | percentiles_in_practice, tail_latencies |
| rolling_window | |
| content | Rolling window of response times |
| parents | percentiles_in_practice |
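A rolling window of response times can be sketched naively as a queue of timestamped samples that is sorted on every query. The class name, the window length, and the nearest-rank percentile convention are assumptions made for this illustration, and the sort on each query is the simple approach rather than an efficient one.

```python
import math
from collections import deque


class RollingWindow:
    """Keeps response times recorded within the last `window_seconds`."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp, response_time), oldest first

    def record(self, timestamp: float, response_time: float) -> None:
        self.samples.append((timestamp, response_time))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop samples that have fallen out of the window.
        while self.samples and self.samples[0][0] <= now - self.window_seconds:
            self.samples.popleft()

    def percentile(self, p: float, now: float):
        """Nearest-rank percentile over the current window, or None if empty."""
        self._evict(now)
        values = sorted(rt for _, rt in self.samples)
        if not values:
            return None
        rank = max(1, math.ceil(p / 100 * len(values)))
        return values[rank - 1]


window = RollingWindow(window_seconds=600)  # hypothetical 10-minute window
window.record(0, 100)
window.record(300, 20)
window.record(700, 30)
print(window.percentile(50, now=700))
```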
| multiple_calls_high_latency_probability | |
| content | Chance of high latency increases when end-user requires multiple backend calls |
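The effect can be quantified with a short calculation: if each backend call is slow with probability p, and the calls are assumed independent, the chance that at least one of n calls is slow is 1 - (1 - p)^n. The function name and the example numbers below are illustrative assumptions.

```python
def prob_any_slow(n_calls: int, p_slow: float) -> float:
    """Probability that at least one of n independent backend calls is slow."""
    return 1 - (1 - p_slow) ** n_calls


# Illustrative: each call is slower than its p99 threshold 1% of the time.
for n in (1, 10, 100):
    print(f"{n:>3} backend calls: {prob_any_slow(n, 0.01):.1%} chance of a slow request")
```

Under these assumptions, a request that needs 100 backend calls is slow more often than not, even though each individual call is slow only 1% of the time.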
| more_efficient_than_rolling_window | |
| content | More efficient alternatives to rolling window: forward decay, t-digest, HdrHistogram |
| load_cope_approaches | |
| content | Approaches for coping with load |
| children | scaling |
| scaling_up | |
| content | Scaling Up |
| children | vertical |
| parents | scaling |
| scaling_out | |
| content | Scaling Out |
| children | elastic, horizontal, magic_scaling_sauce, shared_nothing_arch |
| parents | scaling |
| scaling | |
| content | Scaling |
| children | scaling_out, scaling_up |
| parents | load_cope_approaches |
| vertical | |
| content | Vertical |
| parents | scaling_up |
| horizontal | |
| content | Horizontal |
| parents | scaling_out |
| elastic | |
| content | Elastic |
| children | good_for_unprepared_load, manual_simpler, auto_add_resources |
| parents | scaling_out |
| shared_nothing_arch | |
| content | Shared nothing architecture |
| parents | scaling_out |
| auto_add_resources | |
| content | auto-add resources on load increase |
| parents | elastic |
| good_for_unprepared_load | |
| content | Good for unprepared load |
| parents | elastic |
| manual_simpler | |
| content | Manual is simpler |
| parents | elastic |
| magic_scaling_sauce | |
| content | "Magic Scaling Sauce" (informal) |
| parents | scaling_out |
| maintainability | |
| content | Maintainability |
| children | design_principles |
| legacy_system | |
| content | Legacy System |
| design_principles | |
| content | Design Principles |
| children | evolvability, operability, simplicity |
| parents | maintainability |
| operability | |
| content | Operability |
| children | good_operability, operations_team |
| parents | design_principles |
| simplicity | |
| content | Simplicity |
| parents | design_principles |
| evolvability | |
| content | Evolvability |
| children | evolve_aka, making_change_easy, agile |
| parents | design_principles |
| evolve_aka | |
| content | AKA |
| children | extensibility, modifiability, plasticicity |
| parents | evolvability |
| extensibility | |
| content | Extensibility |
| parents | evolve_aka |
| modifiability | |
| content | Modifiability |
| parents | evolve_aka |
| plasticicity | |
| content | Plasticity |
| parents | evolve_aka |
| operations_team | |
| content | Operations Team |
| parents | operability |
| good_operability | |
| content | Good operability |
| children | routine_tasks_easy (characteristic of) |
| parents | operability |
| routine_tasks_easy | |
| content | Making routine tasks easy |
| parents | good_operability |
| managing_complexity | |
| content | Managing Complexity |
| children | remove_accidental_complexity, big_ball_of_mud |
| big_ball_of_mud | |
| content | Big Ball Of Mud |
| children | mired_in_complexity (definition) |
| parents | managing_complexity |
| mired_in_complexity | |
| content | Software Mired in Complexity |
| parents | big_ball_of_mud |
| remove_accidental_complexity | |
| content | Removing accidental complexity |
| children | not_inherent_to_problem (What defines "accidental complexity"?), abstraction (tool for) |
| parents | managing_complexity |
| not_inherent_to_problem | |
| content | Not inherent to problem it solves |
| parents | remove_accidental_complexity |
| making_change_easy | |
| content | Making Change Easy |
| parents | evolvability |
| agile | |
| content | Agile |
| children | framework_adapting_change, test_driven_development, working_pattern, agility_data_system_level |
| parents | evolvability |
| abstraction | |
| content | Abstraction |
| parents | remove_accidental_complexity |
| test_driven_development | |
| content | Test-Driven Development (TDD) |
| parents | agile |
| agility_data_system_level | |
| content | Agility on the data system level |
| parents | agile |
| working_pattern | |
| content | working pattern |
| parents | agile |
| framework_adapting_change | |
| content | Framework for adapting to change |
| parents | agile |