distributed_systems_MIT/lec04

Node Tree

fault_tolerance
- replication
- vmware_ft
logging_channel
depends

Nodes

fault_tolerance
content	Fault Tolerance
children	replication (Tool Used For Fault Tolerance), vmware_ft

replication
content	Replication
children	limits_to, replication_schemes, worth_it, expected_failures
parents	fault_tolerance

expected_failures
content	Expected Failures To Address
children	fail_stop_faults
parents	replication

fail_stop_faults
content	Fail-Stop Faults: the machine stops computing entirely (rather than computing incorrectly)
children	hardware_errors
parents	expected_failures

limits_to
content	Limits To Replication (failures that replication does not cover)
children	software_bugs, correlated_failures
parents	replication

software_bugs
content	Bugs in Software
parents	limits_to

vmware_ft
content	VMware FT. This lecture studies this particular replication design.
children	full_state_detailed (This is the approach that VMware FT uses, which makes it unique.), output, primary_fails, timer_exact, unicore_processor, vmm
parents	fault_tolerance

hardware_errors
content	Hardware errors can sometimes be turned into fail-stop faults. The advantage is that these errors then become detectable.
parents	fail_stop_faults

correlated_failures
content	Correlated failures include hardware defects (such as a defective batch of servers from a single company) and natural disasters like earthquakes.
children	physical_separation
parents	limits_to

worth_it
content	Is replication worth it?
parents	replication

depends
content	Depends on value of a reliable service

physical_separation
content	Physical separation (different countries)
parents	correlated_failures

state_transfer
content	State Transfer
children	smaller_operations (more favorable than state transfer), whole_state
parents	replication_schemes

replication_schemes
content	Replication Schemes
children	replicated_state_machine, state_transfer
parents	replication

replicated_state_machine
content	Replicated State Machine
children	internal_deterministic, smaller_operations (This is a "pro" for using RSMs over state transfer), designing_rsm
parents	replication_schemes

whole_state
content	Sends whole state of primary
children	just_send_external (Sending external events typically means sending less)
parents	state_transfer

internal_deterministic
content	Works on the assumption that most internal operations of a CPU are deterministic
children	unicore_processor (single-core instructions are deterministic)
parents	replicated_state_machine

just_send_external
content	Just send external events (input events, packets, etc)
children	nondeterministic_events (External events are the non-deterministic events)
parents	whole_state

smaller_operations
content	RSMs tend to send smaller operations than state transfer does, which tends to be more favorable
children	ops_more_complex (Potential downside of RSMs)
parents	state_transfer, replicated_state_machine

ops_more_complex
content	Operations in RSMs tend to be more complex
parents	smaller_operations

unicore_processor
content	VMware FT replication works on unicore processors
children	multicore_nondeterministic (multicore unable to be used with this replication scheme)
parents	internal_deterministic, vmware_ft

multicore_nondeterministic
content	multicore processors can't be used because the way instructions are interleaved makes them non-deterministic
children	multicore_parallelism
parents	unicore_processor
flashcard (front)	Why can't multicore processors be used in the VMware FT replication scheme?
flashcard (back)	The way multicore processors interleave instructions makes them non-deterministic and therefore unsuitable for the VMware FT replication scheme.

level_of_replication
content	What level of replication should be used?
children	full_state_detailed
parents	designing_rsm

designing_rsm
content	Designing a Replicated State Machine (RSM)
children	how_close_is_sync, level_of_replication, new_replica_expensive
parents	replicated_state_machine
flashcard (front)	What does RSM stand for?
flashcard (back)	Replicated State Machine.

how_close_is_sync
content	How closely are the primary and backup synchronized?
children	sync_ideal
parents	designing_rsm

sync_ideal
content	Ideal Synchronization: if primary fails, switch over to backup with no anomalies.
parents	how_close_is_sync
remarks	This never actually happens in practice; anomalies do occur.

new_replica_expensive
content	Creation of a new replica is expensive
children	full_state_detailed
parents	designing_rsm

full_state_detailed
content	Copying the full state of the machine (registers, memory) is very detailed
children	application_level (more efficient than machine-level replication)
parents	level_of_replication, new_replica_expensive, vmware_ft

application_level
content	Most replication schemes are application-level
children	replication_application
parents	full_state_detailed

replication_application
content	Replication needs to be a part of the application in order to work.
children	existing_software (Existing software runs on top of the machine and can work without modification or any knowledge of replication.)
parents	application_level

existing_software
content	Existing software will work as-is using machine-level replication.
parents	replication_application

multicore_parallelism
content	Multicore Parallelism is not covered
parents	multicore_nondeterministic, nondeterministic_events

nondeterministic_events
content	Examples of non-deterministic events
children	inputs, multicore_parallelism
parents	just_send_external

inputs
content	Inputs are the most common non-deterministic event
children	network_packets
parents	nondeterministic_events

network_packets
content	Inputs in this scope are just network packets
children	data_interrupt
parents	inputs

data_interrupt
content	When a packet arrives, the data in the packet and the interrupt type are stored.
children	timing_interrupt
parents	network_packets

timing_interrupt
content	The timing of the interrupt (where it falls in the instruction stream) must be identical on the primary and the backup.
parents	data_interrupt

vmm
content	Virtual Machine Monitor
children	packet_sends_vm_backup
parents	vmware_ft

packet_sends_vm_backup
content	When a network packet arrives, the VMM sends it to the VM, then sends a copy of the packet to the backup
children	primary_outputs_only
parents	vmm

primary_outputs_only
content	Both primary and backup see inputs; the primary is the only one that produces output.
parents	packet_sends_vm_backup
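
A minimal sketch of the rule above, assuming a toy deterministic guest whose state is a running sum of its inputs. The `Replica` class, its fields, and the integer "packets" are hypothetical stand-ins for illustration, not anything taken from VMware FT: both replicas apply every input, but only the primary's output reaches the outside world.

```python
class Replica:
    """Toy deterministic guest: state is the running sum of all inputs (assumption)."""

    def __init__(self, is_primary):
        self.is_primary = is_primary
        self.state = 0
        self.emitted = []  # outputs that actually reach the outside world

    def handle_input(self, packet):
        # Both replicas apply the same input, so their states stay identical.
        self.state += packet
        reply = self.state
        # Only the primary emits; the backup computes the reply and discards it.
        if self.is_primary:
            self.emitted.append(reply)
        return reply


primary = Replica(is_primary=True)
backup = Replica(is_primary=False)
for packet in [3, 4, 5]:
    primary.handle_input(packet)
    backup.handle_input(packet)  # same inputs, forwarded by the VMM
```

Because the guest is deterministic and sees identical inputs, the backup's state tracks the primary's even though none of its replies are ever sent.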

logging_channel
content	Logging Channel: stream of events.
children	log_entry_format, only_weird_instructions, arriving_packets
remarks	Context: sending "Log events on the log channel"

primary_fails
content	What if the primary fails?
children	backup_stops_logs
parents	vmware_ft

backup_stops_logs
content	The indicator that the primary has failed is that the backup stops receiving log entries from the primary.
children	backup_goes_live
parents	primary_fails
remarks	Apparently logs get sent quite frequently to the backup (many times a second). Some kind of "heartbeat" or timing interrupt? I forget the exact terminology
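
A rough sketch of how the backup might notice a silent log channel, assuming a simple timeout. The `LogChannelMonitor` class and the one-second timeout are hypothetical; the notes do not give the actual detection mechanism or interval. Time is passed in explicitly so the example is deterministic.

```python
class LogChannelMonitor:
    """Tracks when the last log entry arrived from the primary (illustrative only)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s  # assumed threshold, not from the paper
        self.last_entry_at = None

    def on_log_entry(self, now):
        # Called whenever any entry arrives on the logging channel.
        self.last_entry_at = now

    def primary_presumed_failed(self, now):
        # Before the first entry arrives there is nothing to compare against.
        if self.last_entry_at is None:
            return False
        return (now - self.last_entry_at) > self.timeout_s


monitor = LogChannelMonitor(timeout_s=1.0)
monitor.on_log_entry(now=10.0)
still_alive = monitor.primary_presumed_failed(now=10.5)
presumed_dead = monitor.primary_presumed_failed(now=11.5)
```

A silent channel can also mean a network problem rather than a dead primary, which is why the notes pair this with the Test And Set server.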

backup_goes_live
content	The Backup Goes "Live"
children	vm_allows_backup_to_run
parents	backup_stops_logs

vm_allows_backup_to_run
content	The VMM allows the backup to run on its own. The backup then stops discarding its output.
parents	backup_goes_live

only_weird_instructions
content	Only "weird" instructions get sent to the log channel
parents	logging_channel

log_entry_format
content	Format of a log entry
children	interrupt_type, log_entry_data
parents	logging_channel
remarks	They don't explicitly say what the format of a log entry is in the paper.

interrupt_type
content	Interrupt Type
parents	log_entry_format
remarks	I just wrote "type", but I'm assuming it's interrupt type

log_entry_data
content	Data (from network packet)
parents	log_entry_format
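
One possible shape for a log entry, sketched under assumptions. The notes list a type and the packet data; the `instruction_number` field is an added assumption, motivated by the timing_interrupt note that the interrupt must land at the same point in the instruction stream on both machines. The field names and the `"network_interrupt"` label are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    instruction_number: int  # assumed: where in the instruction stream to deliver the event
    entry_type: str          # "type" in the notes, assumed to mean interrupt type
    data: bytes              # payload, e.g. the bytes of a network packet


entry = LogEntry(
    instruction_number=1042,
    entry_type="network_interrupt",
    data=b"hello",
)
```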

timer_exact
content	Assumes the VM's timer interrupt occurs at exactly the same place in execution for both the primary and the backup
children	physical_timer_to_guest, backup_gets_ahead
parents	vmware_ft

physical_timer_to_guest
content	Physical timer interrupts are sent to guest
parents	timer_exact

arriving_packets
content	Arriving Packets
children	NICS_DMA
parents	logging_channel

NICS_DMA
content	Some NICs use DMA (direct memory access) in their implementation.
children	primary_no_DMA
parents	arriving_packets

primary_no_DMA
content	The primary cannot access the NIC or its DMA directly
children	private_mem
parents	NICS_DMA

private_mem
content	Packets from the NIC are DMA'd into private memory in the VMM, then copied over to the primary
children	bounce_buffer ("Bounce Buffer" is the term for what this does)
parents	primary_no_DMA

bounce_buffer
content	Bounce Buffer
parents	private_mem
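
A simplified sketch of the bounce-buffer idea, assuming the monitor owns a private buffer the guest cannot see. The `Monitor` class, its method names, and the byte-array "memory" are hypothetical: the NIC writes into private memory first, and the guest only sees the packet when the monitor copies it over and logs it.

```python
class Monitor:
    """Toy virtual machine monitor with a private bounce buffer (illustrative only)."""

    def __init__(self, guest_memory_size):
        self.bounce_buffer = b""                      # private: invisible to the guest
        self.guest_memory = bytearray(guest_memory_size)
        self.log = []                                 # what would go to the logging channel

    def nic_dma(self, packet):
        # The NIC DMAs into the monitor's private memory, not into the guest.
        self.bounce_buffer = bytes(packet)

    def deliver_to_guest(self, offset):
        # Copy at a point the monitor controls, so the backup can repeat it exactly.
        data = self.bounce_buffer
        self.guest_memory[offset:offset + len(data)] = data
        self.log.append(data)
        self.bounce_buffer = b""
        return len(data)


vmm = Monitor(guest_memory_size=16)
vmm.nic_dma(b"abc")
before = bytes(vmm.guest_memory[:3])   # guest has not seen the packet yet
copied = vmm.deliver_to_guest(offset=0)
after = bytes(vmm.guest_memory[:3])
```

The point of the extra copy is that the guest never observes memory changing underneath it at an uncontrolled moment.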

backup_gets_ahead
content	What if the backup gets ahead of the primary's execution? This can never be allowed to happen.
children	event_buffer_nonempty (Event buffer is used to prevent backup from getting ahead)
parents	timer_exact

event_buffer_nonempty
content	Event buffer: the backup VM only executes instructions while the buffer is non-empty
parents	backup_gets_ahead
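
A minimal sketch of the event-buffer rule, assuming each buffered event carries the instruction number at which it should be delivered (that field is an assumption; the notes only say the backup executes while the buffer is non-empty). The `BackupVMM` class and its methods are hypothetical.

```python
from collections import deque


class BackupVMM:
    """Toy backup that can only advance as far as buffered events allow."""

    def __init__(self):
        self.event_buffer = deque()  # (instruction_number, event) pairs from the primary
        self.instruction_count = 0
        self.delivered = []

    def receive(self, instruction_number, event):
        self.event_buffer.append((instruction_number, event))

    def run(self):
        # Execute up to each buffered event, deliver it, then stall when empty.
        while self.event_buffer:
            target, event = self.event_buffer.popleft()
            self.instruction_count = max(self.instruction_count, target)
            self.delivered.append(event)
        return self.instruction_count


backup = BackupVMM()
stalled_at = backup.run()          # nothing buffered, so no progress
backup.receive(100, "packet A")
backup.receive(250, "timer")
resumed_at = backup.run()          # advances only up to the last known event
```

Since the backup never runs past the last event it has heard about, it cannot get ahead of the primary.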

output
content	Handling output events
children	network_packets_only, awkward_failures
parents	vmware_ft

network_packets_only
content	In this context, the only outputs are network packets
parents	output

awkward_failures
content	What are the kinds of awkward failures that could happen?
children	network_split_brain (example of failure), output_rules, test_and_set (Preventative Solution)
parents	output

output_rules
content	Output Rules: preventative measures against certain kinds of failures
children	output_waits_for_backup (This prevents issues related to the backup not receiving network packets over the log channel)
parents	awkward_failures

output_waits_for_backup
content	The primary can't produce any output until the backup has received all previous log events up to that point in time.
parents	output_rules
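
The output rule can be sketched as a primary that holds outgoing packets until the backup confirms receipt of every earlier log event. The `Primary` class, the sequence numbers, and the explicit `on_ack` call are assumptions made for illustration; the notes only state that output waits until the backup has received the preceding events.

```python
class Primary:
    """Toy primary that delays output until the backup has caught up."""

    def __init__(self):
        self.next_seq = 0   # sequence number for the next log event
        self.acked = -1     # highest log event the backup has confirmed
        self.pending = []   # (required_seq, packet) waiting for confirmation
        self.sent = []      # packets actually released to the outside world

    def log_event(self, event):
        # In a real system this would also go over the logging channel.
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def output(self, packet):
        # The packet depends on every event logged so far.
        self.pending.append((self.next_seq - 1, packet))
        self._flush()

    def on_ack(self, seq):
        self.acked = max(self.acked, seq)
        self._flush()

    def _flush(self):
        # Release packets in order once their required events are confirmed.
        while self.pending and self.pending[0][0] <= self.acked:
            self.sent.append(self.pending.pop(0)[1])


p = Primary()
p.log_event("client request")
p.output("reply")
held = list(p.sent)      # nothing released yet: backup has not confirmed event 0
p.on_ack(0)
released = list(p.sent)
```

If the primary crashes while the reply is still held, the client never saw it, so the backup can take over without contradicting anything already sent.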

test_and_set
content	Test And Set: an outside authority that decides which machine (primary/backup) can be "live"
children	acts_like_lock, network_split_brain ("Test and Set" server used to solve this)
parents	awkward_failures

network_split_brain
content	Network Issues can cause split brain
parents	awkward_failures, test_and_set

acts_like_lock
content	The Test/Set server acts like a lock. The primary and backup send requests to this server to get permission to go live, and a granted request sets a flag on the Test/Set server.
parents	test_and_set
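
A minimal sketch of the test-and-set idea, assuming a single in-process server object guarded by a lock. The `TestAndSetServer` class and the `try_go_live` helper are hypothetical names; a real deployment would be a separate networked service, which this does not model.

```python
import threading


class TestAndSetServer:
    """Holds one flag; the first caller to set it wins (illustrative only)."""

    def __init__(self):
        self._flag = False
        self._lock = threading.Lock()

    def test_and_set(self):
        # Atomically set the flag and return its previous value.
        with self._lock:
            old = self._flag
            self._flag = True
            return old


def try_go_live(server):
    # A machine may go live only if the flag was not already set.
    return server.test_and_set() is False


server = TestAndSetServer()
primary_may_go_live = try_go_live(server)
backup_may_go_live = try_go_live(server)
```

Whichever machine asks first wins and the other is refused, so at most one of them is live even when they cannot talk to each other, which is what rules out split brain.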