lec06
dz / distributed_systems_MIT / lec06
Summary
RAFT part 1
Node Tree
- raft
  - problems_solved
  - time_diagram
  - automated_failover
  - early_systems
  - library_implementation
  - logs_diverge
  - majority_vote
  - network_failures
  - operation_log
- failure
Nodes
raft | |
content | RAFT |
children | problems_solved (Problems that RAFT solves), time_diagram, automated_failover, early_systems, library_implementation (how it's typically used in practice), logs_diverge (raft ensures identical long term), majority_vote (Core principle), network_failures, operation_log |
problems_solved | |
content | Problems Solved |
children | prevent_split_brain, replication_patterns (example) |
parents | raft |
replication_patterns | |
content | Common Patterns Found in Replication Systems |
children | single_entity (The main pattern being observed), vmware_ft (example), GFS (example), MR (example) |
parents | problems_solved |
GFS | |
content | GFS Replication: data is replicated, but relies on a single master |
parents | replication_patterns |
MR | |
content | MapReduce: replicates computation, but is controlled by a single master |
parents | replication_patterns |
vmware_ft | |
content | VMware FT: relies on a single test-and-set server |
parents | replication_patterns |
single_entity | |
content | A single entity makes the critical decisions |
children | single_point_of_failure (single entity -> single point of failure) |
parents | replication_patterns |
single_point_of_failure | |
content | This is a single point of failure |
parents | single_entity |
prevent_split_brain | |
content | prevent split brain |
children | partition |
parents | problems_solved |
partition | |
content | Partition: sides that can't communicate |
parents | prevent_split_brain, automated_failover |
flashcard (front) | Partition (RAFT) |
flashcard (back) | Refers to sides that can't communicate |
automated_failover | |
content | Automated failover systems that can be partitioned |
children | partition |
parents | raft |
majority_vote | |
content | RAFT centers around the principle of a majority vote |
children | quorem_systems, total_number, assemble_majority, leader_election, odd_number (required to prevent ties), overlap (This is an important property of the majority vote.) |
parents | raft |
total_number | |
content | A majority is counted out of the total number of servers, not just the active servers |
parents | majority_vote |
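A minimal sketch of the rule above. The helper names `majority` and `has_majority` are hypothetical and not from the lecture. The point it illustrates is that the threshold comes from the configured cluster size, so crashed or unreachable servers still count in the denominator.

```python
def majority(total_servers: int) -> int:
    """Smallest number of servers that forms a majority of the full cluster."""
    return total_servers // 2 + 1


def has_majority(acks: int, total_servers: int) -> bool:
    # total_servers is the configured cluster size, NOT the number of
    # servers that currently happen to be reachable.
    return acks >= majority(total_servers)


# 5-server cluster with 3 servers down: the 2 live servers cannot proceed,
# even though they are all of the active servers.
print(has_majority(2, 5))  # False
print(has_majority(3, 5))  # True
```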
odd_number | |
content | Odd number of servers required to prevent ties |
parents | majority_vote |
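A small arithmetic sketch of why odd cluster sizes are the usual choice. The function name `tolerated_failures` is an illustrative assumption. It computes how many servers can fail while a majority of the full cluster is still available.

```python
def tolerated_failures(total_servers: int) -> int:
    """Servers that can fail while a majority of the full cluster remains."""
    majority = total_servers // 2 + 1
    return total_servers - majority


for n in range(3, 8):
    print(n, tolerated_failures(n))
```

Going from 3 to 4 servers (or from 5 to 6) does not raise the number of tolerated failures, and an even cluster can split into two equal halves where neither side holds a majority.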
assemble_majority | |
content | Assemble majority before doing anything |
parents | majority_vote |
quorem_systems | |
content | A majority-vote system is also known as a "quorum" system |
parents | majority_vote |
overlap | |
content | Any two majorities overlap in at least one server |
parents | majority_vote |
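The overlap property can be checked by brute force for a small cluster. This is only a sketch: the server names and the helper `all_majorities` are made up for illustration.

```python
from itertools import combinations


def all_majorities(servers):
    """Yield every subset of servers that is a majority of the full list."""
    need = len(servers) // 2 + 1
    for size in range(need, len(servers) + 1):
        for group in combinations(servers, size):
            yield set(group)


servers = ["S1", "S2", "S3", "S4", "S5"]
groups = list(all_majorities(servers))

# A non-empty intersection is truthy, so this checks every pair of majorities.
overlaps = all(a & b for a in groups for b in groups)
print(len(groups), overlaps)  # 16 True
```

Because of this overlap, at least one server in any new majority has seen what the previous majority decided.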
early_systems | |
content | Early Systems that implemented something like RAFT |
children | vsr, paxos |
parents | raft |
paxos | |
content | Paxos |
parents | early_systems |
vsr | |
content | Viewstamped Replication (VSR) |
parents | early_systems |
library_implementation | |
content | RAFT is typically implemented/used as a library |
children | raft_layer |
parents | raft |
operation_log | |
content | Operation Log |
children | log_focused |
parents | raft |
raft_layer | |
content | Application Architecture has a RAFT Layer |
parents | library_implementation |
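A rough sketch of the layering, under heavy assumptions: `RaftLayerStub`, `KVServer`, `start`, and `apply` are hypothetical names, and the stub commits every entry immediately instead of replicating it to a majority as a real Raft library would. It only shows the direction of the calls between the application and the layer beneath it.

```python
class RaftLayerStub:
    """Stand-in for a Raft library. A real one would replicate each entry
    to a majority before reporting it committed; this stub commits at once."""

    def __init__(self, apply_fn):
        self.log = []
        self.apply_fn = apply_fn  # callback into the application

    def start(self, command):
        self.log.append(command)
        index = len(self.log)          # 1-based log index
        self.apply_fn(index, command)  # pretend the entry is committed
        return index


class KVServer:
    """Application layer: a key/value table driven only by committed entries."""

    def __init__(self):
        self.table = {}
        self.raft = RaftLayerStub(self.apply)

    def put(self, key, value):
        # The application does not change its own state directly; it hands
        # the operation to the Raft layer and waits for it to come back.
        return self.raft.start(("put", key, value))

    def apply(self, index, command):
        op, key, value = command
        if op == "put":
            self.table[key] = value


kv = KVServer()
kv.put("x", 1)
kv.put("x", 2)
print(kv.table, kv.raft.log)
```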
time_diagram | |
content | Time Diagram: used to visualize communication between the leader, the followers, and client requests |
parents | raft |
log_focused | |
content | Log Focused. Why? |
children | rejoin, resend_replicas, tentative_ops, mechanism_ordering |
parents | operation_log |
mechanism_ordering | |
content | Mechanism for Ordering Operations |
parents | log_focused |
tentative_ops | |
content | Place to set aside tentative operations (for the follower) |
parents | log_focused |
resend_replicas | |
content | Way to resend events to replicas (leader) |
parents | log_focused |
rejoin | |
content | Means to rejoin |
parents | log_focused |
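The four roles of the log listed above can be sketched with one small data structure. The class and field names (`Entry`, `Log`, `commit_index`, `entries_from`) are illustrative assumptions, not the lecture's code.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: int
    command: str


@dataclass
class Log:
    entries: list = field(default_factory=list)
    commit_index: int = 0  # number of entries known to be committed

    def append(self, entry):
        # Ordering: the position in the list fixes the order of operations.
        self.entries.append(entry)

    def tentative(self):
        # Tentative operations: entries held aside until they are committed.
        return self.entries[self.commit_index:]

    def entries_from(self, index):
        # Resend / rejoin: what a leader would send to a replica that
        # currently holds only the first `index` entries.
        return self.entries[index:]


leader = Log()
for cmd in ["put x 1", "put y 2", "put x 3"]:
    leader.append(Entry(term=1, command=cmd))
leader.commit_index = 2

print([e.command for e in leader.tentative()])      # ['put x 3']
print([e.command for e in leader.entries_from(1)])  # ['put y 2', 'put x 3']
```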
failure | |
content | If there's a failure, what did the logs see? |
remarks | Didn't connect to anything |
logs_diverge | |
content | logs sometimes diverge |
parents | raft |
leader_election | |
content | Leader Election |
children | term, zero_leaders, leader_partition_minority, majority_rule, possible_no_leader |
parents | majority_vote |
term | |
content | Term for a leader |
children | at_most_one (at most one leader), election_timer, no_one_can_append |
parents | leader_election |
possible_no_leader | |
content | It's possible when designing these systems to not have a leader (leaderless), but using a leader yields better performance |
parents | leader_election |
at_most_one | |
content | At most one leader per term |
parents | term |
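One way to see why a term has at most one leader: in Raft each server grants at most one vote per term, so two candidates cannot both collect a majority. The sketch below assumes a hypothetical `Voter` class and leaves out the other checks a real implementation performs before granting a vote.

```python
class Voter:
    def __init__(self):
        self.current_term = 0
        self.voted_for = None

    def request_vote(self, term, candidate):
        if term < self.current_term:
            return False                 # request from an old term
        if term > self.current_term:
            self.current_term = term     # new term: the vote is available again
            self.voted_for = None
        if self.voted_for in (None, candidate):
            self.voted_for = candidate
            return True
        return False                     # already voted for someone else this term


voters = [Voter() for _ in range(5)]
votes_a = sum(v.request_vote(1, "A") for v in voters[:3])  # A reaches three servers first
votes_b = sum(v.request_vote(1, "B") for v in voters)      # B then asks everyone
print(votes_a, votes_b)  # 3 2
```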
majority_rule | |
content | Majority Rule: allows election to happen if something crashes |
children | client_no_response (Appends from old leader can never happen due to the majority rule), no_one_can_append |
parents | leader_election |
election_timer | |
content | Election Timer |
children | random_split_votes, atleast_heartbeat, if_expires (start election if expires), max_election_timer |
parents | term |
if_expires | |
content | If election timer expires, start election |
children | term++ |
parents | election_timer |
term++ | |
content | Increment term number on new election |
children | request_votes |
parents | if_expires |
request_votes | |
content | Request Votes |
parents | term++ |
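The chain above (timer expires, increment the term, request votes) can be sketched as follows. `Server`, `on_election_timeout`, and the toy `ask_for_vote` network function are all assumptions made for illustration; real vote requests are RPCs and can be lost or delayed.

```python
class Server:
    def __init__(self, name, total_servers):
        self.name = name
        self.total_servers = total_servers
        self.current_term = 0
        self.state = "follower"

    def on_election_timeout(self, peers, ask_for_vote):
        """Called when the election timer expires without hearing from a leader."""
        self.current_term += 1          # a new election means a new term
        self.state = "candidate"
        votes = 1                       # a candidate votes for itself
        for peer in peers:
            if ask_for_vote(peer, self.current_term, self.name):
                votes += 1
        if votes >= self.total_servers // 2 + 1:
            self.state = "leader"
        # Otherwise the election failed; the server stays a candidate and
        # tries again with a higher term when its timer expires again.
        return votes


reachable = {"S2", "S3"}


def ask_for_vote(peer, term, candidate):
    # Toy network: reachable peers always grant the vote, others never answer.
    return peer in reachable


s1 = Server("S1", total_servers=5)
s1.on_election_timeout(["S2", "S3", "S4", "S5"], ask_for_vote)
print(s1.current_term, s1.state)  # 1 leader
```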
no_one_can_append | |
content | No one can append entries unless they are leader for that term. |
parents | majority_rule, term |
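A minimal sketch of how a follower can enforce this rule by comparing terms. The `Follower` class is hypothetical, and the log-consistency checks that a real append performs are omitted.

```python
class Follower:
    def __init__(self, current_term):
        self.current_term = current_term
        self.log = []

    def append_entries(self, leader_term, entries):
        if leader_term < self.current_term:
            return False  # sender was leader in an older term; refuse
        self.current_term = leader_term
        self.log.extend(entries)
        return True


f = Follower(current_term=4)
print(f.append_entries(3, ["put x 1"]))  # False
print(f.append_entries(4, ["put x 2"]))  # True
print(f.log)                             # ['put x 2']
```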
random_split_votes | |
content | The chance of split votes is reduced by randomizing the election timer duration |
children | choose_new_random |
parents | election_timer |
zero_leaders | |
content | What if zero leaders? |
children | failure_rest |
parents | leader_election |
failure_rest | |
content | Failed election. Reset. |
parents | zero_leaders |
atleast_heartbeat | |
content | The election timer should be at least as long as the heartbeat interval |
parents | election_timer |
max_election_timer | |
content | What should the max election timer time be? |
children | longer_delay, longer_than_roundtrip |
parents | election_timer |
longer_delay | |
content | Longer delay means longer recovery time (slower client requests) |
parents | max_election_timer |
longer_than_roundtrip | |
content | Should be longer than the round-trip latency needed to complete an election |
parents | max_election_timer |
choose_new_random | |
content | Choose new random number on each reset |
parents | random_split_votes |
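A sketch of a randomized election timer that draws a fresh timeout on every reset. The constants are assumed values chosen only so that the minimum exceeds the heartbeat interval; they are not numbers from the lecture, and `ElectionTimer` is a hypothetical class.

```python
import random

HEARTBEAT_INTERVAL = 0.1    # seconds (assumed value)
ELECTION_TIMEOUT_MIN = 0.3  # kept above the heartbeat interval (assumed value)
ELECTION_TIMEOUT_MAX = 0.6  # larger means slower recovery (assumed value)


def new_election_timeout(rng):
    """Pick a fresh random timeout so servers rarely time out together."""
    return rng.uniform(ELECTION_TIMEOUT_MIN, ELECTION_TIMEOUT_MAX)


class ElectionTimer:
    def __init__(self, rng, now=0.0):
        self.rng = rng
        self.reset(now)

    def reset(self, now):
        # Called on every heartbeat from the leader and after each election:
        # a new random number is chosen each time.
        self.deadline = now + new_election_timeout(self.rng)

    def expired(self, now):
        return now >= self.deadline


timer = ElectionTimer(random.Random(1))
print(timer.expired(0.0))  # False: the deadline is at least 0.3 s away
print(timer.expired(1.0))  # True: no heartbeat arrived, so start an election
```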
leader_partition_minority | |
content | What if the leader ends up in a partition with a minority, due to a network failure? |
children | client_no_response |
parents | leader_election |
client_no_response | |
content | The client will never hear a response, because this leader can only reach a minority and so can never commit an append. |
parents | majority_rule, leader_partition_minority |
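A toy model of the partition scenario, assuming a 5-server cluster split 2/3. The function `can_commit` and the server names are illustrative only.

```python
def can_commit(leader, partition, all_servers):
    """True if the leader can reach a majority of the FULL cluster
    from inside its own partition (the leader counts itself)."""
    assert leader in partition
    return len(partition) >= len(all_servers) // 2 + 1


all_servers = {"S1", "S2", "S3", "S4", "S5"}
minority_side = {"S1", "S2"}            # old leader S1 is stuck here
majority_side = all_servers - minority_side

print(can_commit("S1", minority_side, all_servers))  # False: client never gets a reply
print(can_commit("S3", majority_side, all_servers))  # True: a new leader here can make progress
```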
append_entries_to_subset | |
content | What happens if a leader's append entries reach only a subset of the servers? |
children | divergent_logs |
parents | network_failures |
network_failures | |
content | Problems due to network failures |
children | append_entries_to_subset |
parents | raft |
divergent_logs | |
content | How does new leader sort out divergent logs? |
children | log_combos |
parents | append_entries_to_subset |
log_combos | |
content | visualize hypothetical log combinations from different servers |
children | could_it_happen |
parents | divergent_logs |
could_it_happen | |
content | important to ask: could it happen? Could it actually occur? |
parents | log_combos |