designing_data_intensive_applications/ch01

Summary

Chapter 1: Reliable, Scalable, and Maintainable Applications

Node Tree

Nodes

reliability
content Reliability
children continue_work_correctly (definition), faults

continue_work_correctly
content Continuing to work correctly, even when things go wrong
parents faults, reliability

faults
content faults
children anticipate_cope, continue_work_correctly, deviation_spec, hardware, human, not_equal_to_failure (Faults are not equivalent to failure), software, things_go_wrong (definition)
parents reliability

things_go_wrong
content things that go wrong
parents faults

anticipate_cope
content anticipate/cope
children fault_tolerant, resilient
parents faults

fault_tolerant
content Fault Tolerant
parents anticipate_cope

resilient
content Resilient
parents anticipate_cope

not_equal_to_failure
content Not equivalent to failure
children systems_stop_providing_service (this is what a failure is)
parents faults

deviation_spec
content Deviation from Spec
parents faults

systems_stop_providing_service
content Systems as a whole stop providing service to user
parents not_equal_to_failure

hardware
content Hardware
children MTTF, hardware_failure_examples (examples), mean_time_to_failure_disks, tolerate_loss_entire_machine
parents faults

hardware_failure_examples
content Examples of hardware failure: disk crash, faulty RAM, power outage
parents hardware

mean_time_to_failure_disks
content Mean Time To Failure (MTTF) for disks: about 10 to 50 years
parents hardware, MTTF
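
A rough sketch of why a per-disk MTTF measured in decades still means frequent failures in a large cluster. The cluster size (10,000 disks) and the 30-year MTTF are illustrative assumptions, and the function name is hypothetical. The arithmetic also assumes disks fail independently at a constant rate, which real hardware only approximates.

```python
def expected_failures_per_day(num_disks: int, mttf_years: float) -> float:
    """Expected disk failures per day, assuming independent failures
    at a constant rate (a simplification)."""
    mttf_days = mttf_years * 365
    return num_disks / mttf_days


# Illustrative values only: 10,000 disks, 30-year MTTF per disk.
per_day = expected_failures_per_day(10_000, 30)
print(f"{per_day:.2f} expected failures per day")
```

Under these assumptions the cluster loses a disk roughly once a day, so the system has to tolerate disk loss as routine.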

MTTF
content Mean Time To Failure (MTTF)
children mean_time_to_failure_disks
parents hardware

tolerate_loss_entire_machine
content Tolerate loss of entire machine
children rolling_updates
parents hardware

rolling_updates
content Rolling updates: patching one node at a time
parents tolerate_loss_entire_machine

software
content Software
children bugs (Kind of software fault), corrupted_service (Kind of software fault), runaway_process (Kind of software fault)
parents faults

bugs
content Bugs
parents software

runaway_process
content Runaway Process
parents software

corrupted_service
content Dependent service that slows down, becomes unresponsive, or returns corrupted responses
parents software

human
content Human
children decouple_mistakes, detailed_clear_monitoring, good_management, minimize_opportunities, quick_easy_recovery
parents faults

decouple_mistakes
content Decouple mistakes from failures
children sandbox
parents human

minimize_opportunities
content Minimize Opportunities
parents human

quick_easy_recovery
content Quick, Easy Recovery
parents human

detailed_clear_monitoring
content Detailed and Clear Monitoring
children telemetry
parents human

telemetry
content Telemetry
parents detailed_clear_monitoring

sandbox
content sandbox
parents decouple_mistakes

good_management
content Good Management Practices
parents human

scalability
content Scalability
children cope_increased_load, describing_load

cope_increased_load
content System's ability to cope with increased load
parents scalability

describing_load
content Describing Load
children ex_twitter, load_params
parents scalability

ex_twitter
content Example: Twitter
children post_tweet
parents describing_load

load_params
content Load Parameters
children follows_per_user, load_param_increased
parents describing_load

post_tweet
content Post a Tweet
children approach_SQL, approach_cache
parents ex_twitter

approach_SQL
content Approach A: SQL Join
children could_keep_up, post_tweet_more_work (Compared to)
parents post_tweet

approach_cache
content Approach B: cache each user's home timeline
children could_keep_up (solution), fan_out, faster_reads, hybrid_approach, post_tweet_more_work
parents post_tweet

could_keep_up
content Initial approach (A) couldn't keep up with the load of home timeline queries
parents approach_SQL, approach_cache

faster_reads
content Faster Reads
parents approach_cache

post_tweet_more_work
content Posting a tweet takes more work
parents approach_SQL, approach_cache

follows_per_user
content Followers Per User: key load parameter for scalability
children fan_out
parents load_params

fan_out
content Fan Out
parents follows_per_user, approach_cache

hybrid_approach
content Hybrid Approach: tweets from users with a huge number of followers (celebrities) handled separately
parents approach_cache
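
A minimal in-memory sketch contrasting the two approaches for the home timeline. It is not Twitter's implementation: the class `TimelineStore` and all method names are hypothetical, a plain list stands in for the SQL tables, and a dict stands in for the per-user cache. Approach A does the work at read time (like a join); Approach B does it at write time (fan-out), so reads are cheap but posting costs one write per follower.

```python
from collections import defaultdict


class TimelineStore:
    def __init__(self):
        self.followers = defaultdict(set)    # author -> users who follow them
        self.following = defaultdict(set)    # user -> authors they follow
        self.tweets = []                     # all tweets: (seq, author, text)
        self.home_cache = defaultdict(list)  # user -> cached home timeline
        self.seq = 0

    def follow(self, follower, followee):
        self.followers[followee].add(follower)
        self.following[follower].add(followee)

    def post_tweet(self, author, text):
        self.seq += 1
        tweet = (self.seq, author, text)
        self.tweets.append(tweet)
        # Approach B: fan out on write, one cache insert per follower.
        for follower in self.followers[author]:
            self.home_cache[follower].append(tweet)

    def home_timeline_on_read(self, user):
        # Approach A: scan tweets and filter by followed authors at read time.
        followed = self.following[user]
        return sorted((t for t in self.tweets if t[1] in followed), reverse=True)

    def home_timeline_cached(self, user):
        # Approach B: the read just returns the precomputed timeline.
        return sorted(self.home_cache[user], reverse=True)


store = TimelineStore()
store.follow("bob", "alice")
store.follow("carol", "alice")
store.post_tweet("alice", "hello")
store.post_tweet("dave", "nobody follows me")
print(store.home_timeline_on_read("bob"))
print(store.home_timeline_cached("bob"))
```

The loop in `post_tweet` is where the followers-per-user load parameter shows up: an author with millions of followers triggers millions of cache writes per tweet. Note that this sketch only caches tweets posted after a follow; backfilling is left out.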

describing_performance
content Describing Performance
children load_param_increased, response_time, throughput

load_param_increased
content When you increase a load parameter
children keep_resources_unchanged, maintain_performance
parents load_params, describing_performance

keep_resources_unchanged
content ...and keep resources unchanged, how is system performance affected?
parents load_param_increased

maintain_performance
content How much increase in resources is needed to maintain current performance?
parents load_param_increased

throughput
content throughput
children batch_process_system, num_records_processed_per_second
parents describing_performance

num_records_processed_per_second
content Number of records processed per second
parents throughput

batch_process_system
content Batch process system
parents throughput

response_time
content Response Time
children distribution_of_values, latency (Latency and response time are often used interchangeably, but they measure different things.), online_systems (response time is a metric used in the context of systems and services that are online), time_btwn_request_response (definition)
parents describing_performance

time_btwn_request_response
content Time between request and response
parents response_time

online_systems
content Online Systems
parents response_time

latency
content Latency: duration that a request waits to be handled
parents response_time

distribution_of_values
content Distribution of Values
children avg_mean, median, outliers, percentiles
parents response_time

avg_mean
content Average/mean
parents distribution_of_values

median
content Median
parents percentiles, distribution_of_values

outliers
content Outliers
children how_bad (Quantifying how bad the outliers are)
parents distribution_of_values

how_bad
content How bad are the outliers? p95, p99, p999.
children 50th_percentile, tail_latencies
parents outliers, percentiles

percentiles
content Percentiles
children 50th_percentile, how_bad, median, percentiles_in_practice, service_level_agreements, service_level_objectives
parents distribution_of_values

50th_percentile
content 50th Percentile: p50
parents how_bad, percentiles
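
A small sketch of how percentiles describe a response-time distribution better than the mean. It uses the nearest-rank definition, which is one of several in common use; the function name and the sample response times (in milliseconds) are made up for illustration.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


# Hypothetical response times in ms, with one slow outlier.
times_ms = [12, 15, 11, 14, 13, 500, 16, 10, 18, 17]

print("mean:", sum(times_ms) / len(times_ms))
print("p50: ", percentile(times_ms, 50))
print("p95: ", percentile(times_ms, 95))
print("p99: ", percentile(times_ms, 99))
```

With these samples the single outlier drags the mean far above what a typical user sees, while p50 stays at the typical value and the high percentiles expose the outlier.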

tail_latencies
content Tail Latencies
children head_of_line_blocking, tail_latency_amplification
parents how_bad

service_level_objectives
content Service Level Objectives (SLO)
parents percentiles

service_level_agreements
content Service Level Agreements
parents percentiles

head_of_line_blocking
content head of line blocking
parents tail_latencies

small_request_holdup_rest
content small number of requests holding up subsequent requests

percentiles_in_practice
content Percentiles in Practice
children rolling_window, tail_latency_amplification
parents percentiles

tail_latency_amplification
content Tail Latency Amplification
parents percentiles_in_practice, tail_latencies

rolling_window
content Rolling window of response times
parents percentiles_in_practice
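
A naive sketch of a rolling window of response times: keep every sample from the last N seconds and sort on demand. The class name, method names, and timestamps are hypothetical, and this is the simple but costly version, not one of the approximation algorithms used for large volumes.

```python
import math
from collections import deque


class RollingWindow:
    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp, response_time), oldest first

    def record(self, timestamp, response_time):
        # Assumes timestamps arrive in non-decreasing order.
        self.samples.append((timestamp, response_time))
        self._evict(timestamp)

    def _evict(self, now):
        # Drop samples that have fallen out of the window.
        while self.samples and self.samples[0][0] <= now - self.window_seconds:
            self.samples.popleft()

    def percentile(self, p, now):
        """Nearest-rank percentile over samples still in the window."""
        self._evict(now)
        ordered = sorted(rt for _, rt in self.samples)
        if not ordered:
            return None
        rank = math.ceil(p / 100 * len(ordered))
        return ordered[rank - 1]


window = RollingWindow(window_seconds=600)  # 10-minute window
window.record(0, 100)
window.record(300, 20)
window.record(500, 30)
print(window.percentile(50, now=500))
print(window.percentile(50, now=700))
```

Sorting the whole window on every query is O(n log n) per call, which is why cheaper approximations exist for high request rates.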

multiple_calls_high_latency_probability
content Chance of high latency increases when an end-user request requires multiple backend calls
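
A back-of-the-envelope sketch of tail latency amplification. It assumes the end-user request waits for every backend call and that the calls are slow independently of each other; both are simplifications, and the function name and the 1% figure are illustrative.

```python
def prob_slow_request(p_slow_call: float, num_calls: int) -> float:
    """Probability that at least one of num_calls backend calls is slow,
    assuming independent calls that must all finish."""
    return 1 - (1 - p_slow_call) ** num_calls


# Illustrative: each backend call has a 1% chance of being slow.
for n in (1, 10, 100):
    print(n, round(prob_slow_request(0.01, n), 3))
```

Even if only 1% of individual backend calls are slow, a request that needs 100 of them is slow more often than not under these assumptions.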

more_efficient_than_rolling_window
content More efficient alternatives to rolling window: forward decay, t-digest, HdrHistogram

load_cope_approaches
content Approaches for coping with load
children scaling

scaling_up
content Scaling Up
children vertical
parents scaling

scaling_out
content Scaling Out
children elastic, horizontal, magic_scaling_sauce, shared_nothing_arch
parents scaling

scaling
content Scaling
children scaling_out, scaling_up
parents load_cope_approaches

vertical
content Vertical
parents scaling_up

horizontal
content Horizontal
parents scaling_out

elastic
content Elastic
children auto_add_resources, good_for_unprepared_load, manual_simpler
parents scaling_out

shared_nothing_arch
content Shared nothing architecture
parents scaling_out

auto_add_resources
content auto-add resources on load increase
parents elastic

good_for_unprepared_load
content Good for unprepared load
parents elastic

manual_simpler
content Manual is simpler
parents elastic

magic_scaling_sauce
content "Magic Scaling Sauce" (informal)
parents scaling_out

maintainability
content Maintainability
children design_principles

legacy_system
content Legacy System

design_principles
content Design Principles
children evolvability, operability, simplicity
parents maintainability

operability
content Operability
children good_operability, operations_team
parents design_principles

simplicity
content Simplicity
parents design_principles

evolvability
content Evolvability
children agile, evolve_aka, making_change_easy
parents design_principles

evolve_aka
content AKA
children extensibility, modifiability, plasticicity
parents evolvability

extensibility
content Extensibility
parents evolve_aka

modifiability
content Modifiability
parents evolve_aka

plasticicity
content Plasticity
parents evolve_aka

operations_team
content Operations Team
parents operability

good_operability
content Good operability
children routine_tasks_easy (characteristic of)
parents operability

routine_tasks_easy
content Making routine tasks easy
parents good_operability

managing_complexity
content Managing Complexity
children big_ball_of_mud, remove_accidental_complexity

big_ball_of_mud
content Big Ball Of Mud
children mired_in_complexity (definition)
parents managing_complexity

mired_in_complexity
content Software Mired in Complexity
parents big_ball_of_mud

remove_accidental_complexity
content Removing accidental complexity
children abstraction (tool for), not_inherent_to_problem (What defines "accidental complexity"?)
parents managing_complexity

not_inherent_to_problem
content Not inherent to problem it solves
parents remove_accidental_complexity

making_change_easy
content Making Change Easy
parents evolvability

agile
content Agile
children agility_data_system_level, framework_adapting_change, test_driven_development, working_pattern
parents evolvability

abstraction
content Abstraction
parents remove_accidental_complexity

test_driven_development
content Test-Driven Development (TDD)
parents agile

agility_data_system_level
content Agility on the data system level
parents agile

working_pattern
content working pattern
parents agile

framework_adapting_change
content Framework for adapting to change
parents agile