lec03
Node Tree
- log_better
- weak_consistency
- GFS
Nodes
GFS | |
content | GFS (aka Google File System) |
children | files_autosplit, high_speeds_parallel, internal_use, record_append, single_data_center, single_master, why_hard, GFS_goals, auto_failure_recovery, big_sequence, big_storage |
big_storage | |
content | Big Storage |
parents | GFS |
why_hard | |
content | Why is Big Storage hard? |
children | faults, performance |
parents | GFS |
performance | |
content | Performance |
children | sharding (splitting data across many servers lets clients access it in parallel)
parents | why_hard |
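A sketch of why sharding is a performance technique: a file split across N chunk servers can be read/written in parallel, so aggregate throughput scales with N. Sizes below match GFS's real 64 MB chunk size; the round-robin placement is just an illustration.

```python
# Illustrative sketch: splitting a file across chunk servers lets
# clients read chunks in parallel, so throughput scales with servers.

def assign_chunks(file_size: int, chunk_size: int, num_servers: int) -> dict:
    """Round-robin the chunks of one file onto chunk servers."""
    num_chunks = -(-file_size // chunk_size)  # ceiling division
    placement = {}
    for chunk_index in range(num_chunks):
        placement[chunk_index] = chunk_index % num_servers
    return placement

# A 1 GB file in 64 MB chunks (GFS's actual chunk size) over 4 servers:
placement = assign_chunks(file_size=1 << 30, chunk_size=64 << 20, num_servers=4)
# 16 chunks, 4 per server -> 4 reads can proceed at once.
```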
faults | |
content | Faults |
children | tolerance |
parents | why_hard |
sharding | |
content | Sharding |
children | shard |
parents | performance |
tolerance | |
content | Fault Tolerance |
children | replication |
parents | faults |
replication | |
content | Replication as a means to add fault tolerance |
children | almost_identical |
parents | tolerance |
almost_identical | |
content | "Almost Identical" inconsistency risk |
children | consistency |
parents | replication |
consistency | |
content | Consistency in replications |
children | strong_consistency, bad_replication |
parents | almost_identical |
strong_consistency | |
content | A strongly consistent system keeps all replicas identical when duplicating data
children | low_performance_tradeoff, not_strongly_consistent (NOT strongly consistent), system_behaves |
parents | consistency |
low_performance_tradeoff | |
content | A strongly consistent system pays for its guarantees with lower performance; that is the tradeoff.
parents | strong_consistency |
system_behaves | |
content | A strongly consistent system behaves as if it were a single server
parents | strong_consistency |
bad_replication | |
content | Bad Replication Design |
children | events_order |
parents | consistency |
events_order | |
content | No way to ensure events (writes/reads) are processed in the correct order
parents | bad_replication |
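A toy sketch of the bad-replication failure mode above: if two replicas apply the same writes in different orders, their states diverge. The keys and values are made up for illustration.

```python
# Sketch: replicas that apply writes in different orders diverge.

def apply(state: dict, writes: list) -> dict:
    for key, value in writes:
        state[key] = value
    return state

w1 = ("x", 1)
w2 = ("x", 2)

replica_a = apply({}, [w1, w2])  # sees w1 then w2
replica_b = apply({}, [w2, w1])  # network reorders: w2 then w1

# replica_a ends with x == 2, replica_b with x == 1: the replicas disagree.
```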
GFS_goals | |
content | GFS Goals: Big, Fast, Global |
parents | GFS |
high_speeds_parallel | |
content | High Speeds, Parallel Access |
parents | GFS |
files_autosplit | |
content | Files Automatically Split |
children | shard (One of the splits of a file is called a "shard") |
parents | GFS |
shard | |
content | Shard |
children | chunk_server (Shards and chunks may be analogous) |
parents | files_autosplit, sharding |
auto_failure_recovery | |
content | Automatic Failure Recovery |
parents | GFS |
single_data_center | |
content | Single Data Center |
parents | GFS |
big_sequence | |
content | Designed for big sequential reads/writes |
parents | GFS |
remarks | As opposed to random reads/writes
internal_use | |
content | Used internally by Google |
parents | GFS |
weak_consistency | |
content | Designed with weak consistency |
children | nature_of_gfs (I think this is what is meant by weak consistency here?), not_strongly_consistent |
remarks | Heretical to use weak consistency for academics |
single_master | |
content | Single Master |
children | chunk_server (Master knows which chunks are stored on which chunk servers), master_data
parents | GFS |
chunk_server | |
content | Chunk Server stores the actual chunks
parents | single_master, shard |
remarks | Are "chunks" the same thing as shards? |
master_data | |
content | Master Data |
children | filename, handle, log_checkpoint |
parents | single_master |
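A hypothetical sketch of the master's two main tables. The filename-to-handles map must survive a crash (it is persisted via the log/checkpoint), while the handle-to-servers map can live only in memory, since it can be rebuilt by asking chunk servers what they hold. All names and handles below are made up.

```python
# Sketch of the master's metadata (illustrative names throughout).

files = {
    # filename -> ordered list of chunk handles (persisted to disk)
    "/logs/web-00": ["handle-17", "handle-18"],
}

chunks = {
    # handle -> version number and the chunk servers holding a replica
    # (volatile: rebuilt from chunk-server reports after a restart)
    "handle-17": {"version": 3, "servers": ["cs1", "cs2", "cs3"]},
    "handle-18": {"version": 1, "servers": ["cs2", "cs4", "cs5"]},
}

def locate(filename: str, chunk_index: int):
    """Resolve a (filename, chunk index) to the servers holding it."""
    handle = files[filename][chunk_index]
    return handle, chunks[handle]["servers"]
```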
filename | |
content | Filename |
children | nv |
parents | master_data |
nv | |
content | Non-volatile storage |
parents | filename |
handle | |
content | Handle |
parents | master_data |
log_checkpoint | |
content | log, checkpoint |
children | disk_storage |
parents | master_data |
disk_storage | |
content | Stored to Disk |
parents | log_checkpoint |
log_better | |
content | A log is better than something like a database or b-tree because recording a mutation is a single sequential append, whereas a b-tree pays scattered random writes
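The log/checkpoint idea can be sketched as: every mutation is one append at the tail, and recovery is "load the last checkpoint, replay the log tail". A minimal in-memory sketch, with the disk persistence elided:

```python
# Sketch of log + checkpoint recovery (persistence to disk elided).

log = []            # append-only record of mutations
checkpoint = {}     # snapshot of state at some log position
checkpoint_pos = 0  # how much of the log the checkpoint already covers

def record(op, key, value=None):
    log.append((op, key, value))   # one sequential append per mutation

def recover():
    """Rebuild state from the checkpoint plus the log tail."""
    state = dict(checkpoint)
    for op, key, value in log[checkpoint_pos:]:
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return state

record("set", "/a", 1)
record("set", "/b", 2)
record("delete", "/a")
# recover() rebuilds {"/b": 2} without ever rewriting old log entries.
```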
record_append | |
content | How a record is appended in GFS
children | client_data_ps, last_chunk |
parents | GFS |
last_chunk | |
content | Where is the last chunk? |
children | ask_master |
parents | record_append |
ask_master | |
content | Ask the Master server |
children | no_primary, primary_dead |
parents | last_chunk |
no_primary | |
content | No Primary? |
children | find_replicate |
parents | ask_master |
find_replicate | |
content | Find an up-to-date replica
children | pick_primary |
parents | no_primary |
pick_primary | |
content | Picks Primary |
children | version_bumped |
parents | find_replicate |
version_bumped | |
content | Version Bumped |
children | tells_primary_secondary |
parents | pick_primary |
tells_primary_secondary | |
content | Master tells the primary and secondary replicas of their roles and the new version
children | lease |
parents | version_bumped |
lease | |
content | Lease granted to Primary: "you are primary for 60s"
children | primary_dead (this is what the lease helps with), split_brain_solution |
parents | tells_primary_secondary |
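The lease mechanism can be sketched as: the old primary refuses to act once its lease expires, and the master waits out the full lease term before naming a replacement, so two primaries are never active at once. Times below are illustrative.

```python
# Sketch of how a 60s lease prevents split brain (times illustrative).

LEASE_SECONDS = 60

class Primary:
    def __init__(self, granted_at):
        self.granted_at = granted_at

    def can_serve(self, now):
        # A primary only acts while its lease is still live.
        return now < self.granted_at + LEASE_SECONDS

old_primary = Primary(granted_at=0)

assert old_primary.can_serve(now=30)       # lease live: may serve writes
assert not old_primary.can_serve(now=61)   # lease expired: must refuse

# The master, unable to reach old_primary, waits until now >= 60 before
# granting a lease to a replacement -- never sooner.
```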
client_data_ps | |
content | Client sends copy of data to Primary and Secondary |
children | primary_offset |
parents | record_append |
primary_offset | |
content | Primary Picks Offset |
children | replicas_write_to_off |
parents | client_data_ps |
replicas_write_to_off | |
content | All replicas told to write the data to that offset |
children | all_replicas_ok |
parents | primary_offset |
all_replicas_ok | |
content | If all replicas reply back "yes", all okay |
children | what_if_some_append |
parents | replicas_write_to_off |
what_if_some_append | |
content | What if only some append? |
children | nature_of_gfs, records_different_order |
parents | all_replicas_ok |
nature_of_gfs | |
content | Records sometimes failing to append on every replica is just the nature of GFS
parents | what_if_some_append, weak_consistency |
records_different_order | |
content | Records in replicas can be in different orders |
parents | what_if_some_append |
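A sketch of what these failure modes mean for readers: a failed append is retried at a new offset, so a record can be duplicated on one replica and leave a gap on another, and applications are expected to cope on the reader side (e.g. with record IDs and duplicate detection). The replica contents below are made up.

```python
# Sketch: reader-side handling of duplicates and gaps (made-up data).

replica_a = ["rec1", "rec2", "rec2"]      # retry duplicated rec2 here
replica_b = ["rec1", None, "rec2"]        # first attempt left a gap here

def read_records(replica):
    """Reader-side cleanup: skip gaps, drop duplicate record IDs."""
    seen = set()
    out = []
    for rec in replica:
        if rec is None or rec in seen:
            continue
        seen.add(rec)
        out.append(rec)
    return out

# After cleanup, both replicas yield the same logical stream:
# ["rec1", "rec2"] from either replica_a or replica_b.
```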
primary_dead | |
content | What if Master server thinks the Primary is dead? |
children | master_doesnt_pick, two_primaries |
parents | lease, ask_master |
two_primaries | |
content | Two primaries in a system is known as "split brain" |
children | master_doesnt_pick (Otherwise, you end up causing "Split Brain"), network_partition, split_brain_solution |
parents | primary_dead |
network_partition | |
content | split brain can be caused by a network partition where parts of the network can transmit but maybe not receive |
parents | two_primaries |
split_brain_solution | |
content | The solution to Split Brain (two primaries) is to use a lease on a primary. After the lease is up, commands are no longer sent to that primary. |
parents | lease, two_primaries |
master_doesnt_pick | |
content | The Master should NOT immediately designate a new primary while the old one may still be alive
parents | two_primaries, primary_dead |
two_phase_commit | |
content | Two-Phase Commit. A mechanism for strong consistency |
parents | not_strongly_consistent, extra_bits |
not_strongly_consistent | |
content | GFS is not strongly consistent |
children | extra_bits, two_phase_commit |
parents | strong_consistency, weak_consistency |
extra_bits | |
content | GFS would need "extra bits" for strong consistency |
children | two_phase_commit (One of the things you'd add to GFS to make it strongly consistent)
parents | not_strongly_consistent |
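A minimal sketch of two-phase commit, the kind of "extra bit" mentioned above: a coordinator asks every participant to prepare, and only if all vote yes does it tell them to commit; any single "no" aborts everyone. The `Participant` class is a stand-in, not real GFS machinery.

```python
# Minimal two-phase commit sketch (Participant is a made-up stand-in).

def two_phase_commit(participants):
    # Phase 1: every participant votes on whether it can commit.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2: the unanimous decision is pushed to everyone.
    for p in participants:
        p.finish(decision)
    return decision

class Participant:
    def __init__(self, can_commit):
        self.can_commit = can_commit
        self.state = "pending"

    def prepare(self):
        return self.can_commit

    def finish(self, decision):
        self.state = decision

group = [Participant(True), Participant(True), Participant(False)]
# One "no" vote forces every participant to abort.
```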