designing_data_intensive_applications/ch02

Summary

Chapter 2: Data Models and Query Languages

Nodes

data_models
content Data Models
children general_purpose, how_it_will_be_used, layer_one_atop_another, represented_next_lower

represented_next_lower
content How is it represented in terms of the next-lower layer?
parents layer_one_atop_another, data_models

layer_one_atop_another
content Layering one atop another
children represented_next_lower
parents data_models

how_it_will_be_used
content Embodies assumptions about how it will be used
parents data_models

general_purpose
content General Purpose
children document_model, graph_model, relational_model
parents data_models

relations
content Relations
children sql_table, tuples
parents relational_model

relational_model
content Relational Model
children business_data_processing, comp_to_document_db, data_laid_out_in_open, declarative, foreign_key, hybrid_rel_Doc, locality_relational, nosql (not actually relational), or_mismatch, polyglot_persistance, query_optimizer, relational_vs_doc_today, relations, row_id_vs_text_strings, schema_on_write, shredding, unique_id
parents general_purpose

document_model
content Document Model
children document_reference, doesnt_fit_in_doc_model, hybrid_rel_Doc, joins_are_weak, relational_vs_doc_today, storage_locality
parents general_purpose

graph_model
content Graph-based model
children complex_many_to_many, datalog, graph_queries_sql, objects, property_graph_model, traverse, triple_store_model
parents general_purpose

sql_table
content SQL table
parents relations

tuples
content tuples
children sql_rows (AKA)
parents relations

sql_rows
content SQL rows
parents tuples

nosql
content NoSQL
children object, polyglot_persistance
parents relational_model

polyglot_persistance
content Polyglot persistence (relational and nonrelational datastores used side by side)
parents relational_model, nosql
remarks What did I mean by this?

business_data_processing
content Business data processing
children transaction_processing, batch_processing
parents relational_model

transaction_processing
content Transaction Processing
parents business_data_processing

batch_processing
content Batch Processing
parents business_data_processing

object
content Object
children json, or_mapping_frameworks, or_mismatch
parents nosql

or_mismatch
content object-relational mismatch
children impedance_mismatch, or_mapping_frameworks (Tries to smooth the gap.)
parents object, relational_model
remarks How we want to represent things in code (objects) does not match how they are represented in a relational database (tables, rows, and columns)
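
A minimal sketch of the mismatch, assuming a hypothetical `Profile` object with nested `Position` entries (the class, table, and column names are illustrative, not from the book). Storing the object relationally means shredding it into two tables and reassembling it on read, while the JSON form keeps the one-to-many tree in a single document.

```python
import json
import sqlite3
from dataclasses import dataclass, field, asdict


@dataclass
class Position:
    title: str
    organization: str


@dataclass
class Profile:
    user_id: int
    name: str
    positions: list[Position] = field(default_factory=list)


def save_relational(conn, profile):
    """Shred the object into two tables linked by a foreign key."""
    conn.execute(
        "INSERT INTO users (user_id, name) VALUES (?, ?)",
        (profile.user_id, profile.name),
    )
    conn.executemany(
        "INSERT INTO positions (user_id, title, organization) VALUES (?, ?, ?)",
        [(profile.user_id, p.title, p.organization) for p in profile.positions],
    )


def load_relational(conn, user_id):
    """Reassemble the object from both tables (the translation layer)."""
    name = conn.execute(
        "SELECT name FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT title, organization FROM positions WHERE user_id = ? ORDER BY rowid",
        (user_id,),
    ).fetchall()
    return Profile(user_id, name, [Position(t, o) for t, o in rows])


def to_document(profile):
    """The document form: one self-contained JSON string."""
    return json.dumps(asdict(profile))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE positions ("
    "user_id INTEGER REFERENCES users(user_id), title TEXT, organization TEXT)"
)

profile = Profile(1, "Ada", [Position("Engineer", "Acme"), Position("Analyst", "Initech")])
save_relational(conn, profile)
restored = load_relational(conn, 1)
document = to_document(profile)
```

The relational path needs two statements to write and two queries to read; the document path is a single serialization step, which is the locality and reduced-mismatch argument noted under `json`.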

or_mapping_frameworks
content Object-relational mapping frameworks
parents object, or_mismatch

json
content JSON
children lack_of_schema, one_to_many_tree_structure, reduce_IM_storage_application, better_locality
parents impedance_mismatch, object

impedance_mismatch
content Impedance mismatch
children json, reduce_IM_storage_application
parents or_mismatch

reduce_IM_storage_application
content Reduce impedance mismatch between storage and application
parents impedance_mismatch, json

better_locality
content Better locality, compared to multi-table schema
parents json

lack_of_schema
content Lack of schema
parents json
remarks sometimes an advantage

one_to_many_tree_structure
content One-to-many relationships form an explicit tree structure
children one_to_many
parents json

one_to_many
content One to many
parents one_to_many_tree_structure

many_to
content Many to...
children many_to_many, many_to_one

many_to_one
content One
children normalization, similar_mto1_mtom
parents doesnt_fit_in_doc_model, many_to

many_to_many
content Many
children network_model, record_multiple_parents, similar_mto1_mtom, IMS (Difficult)
parents many_to

doesnt_fit_in_doc_model
content Doesn't fit nicely in document model
children joins_are_weak, many_to_one, normalization
parents document_model

joins_are_weak
content Joins are weak
parents document_model, doesnt_fit_in_doc_model

IMS
content Information Management System (IMS)
children hierarchical_model
parents many_to_many

hierarchical_model
content Hierarchical Model
children network_model (Generalization)
parents IMS

normalization
content Normalization
children removing_duplication_db
parents doesnt_fit_in_doc_model, many_to_one

removing_duplication_db
content Removing Duplication in database
parents normalization, id

id
content Row ID
children removing_duplication_db
parents row_id_vs_text_strings

row_id_vs_text_strings
content Row ID vs Text Strings
children id, text_strings
parents relational_model

text_strings
content Text Strings
children human_readable
parents row_id_vs_text_strings

human_readable
content Human Readable
parents text_strings

network_model
content Network Model
children record_multiple_parents, access_path
parents many_to_many, hierarchical_model

record_multiple_parents
content Records could have multiple parents
children app_tracks_multi_paths
parents network_model, many_to_many

access_path
content Access path
children pointers_rootrecs_links
parents network_model

pointers_rootrecs_links
content Pointers, root records, chain of links
children app_tracks_multi_paths
parents access_path

app_tracks_multi_paths
content App had to keep track of multiple paths
children querying_updating_complicated
parents pointers_rootrecs_links, record_multiple_parents

query_optimizer
content Query Optimizer
children complicated_but_only_built_once
parents relational_model

querying_updating_complicated
content Querying and updating were complicated and inflexible
children data_laid_out_in_open (Contrast)
parents app_tracks_multi_paths

data_laid_out_in_open
content Data laid out in the open
children no_complicated_paths
parents querying_updating_complicated, relational_model

no_complicated_paths
content No complicated paths
parents data_laid_out_in_open

complicated_but_only_built_once
content Complicated, but only needed to be built once, and many applications benefit from it.
parents query_optimizer

comp_to_document_db
content Comparison to Document Database
children similar_mto1_mtom, similar_to_hierarchical
parents relational_model

similar_to_hierarchical
content Similar to hierarchical
parents comp_to_document_db

similar_mto1_mtom
content Similar way of representing many-to-one and many-to-many
children related_item
parents comp_to_document_db, many_to_many, many_to_one

related_item
content Related Item
children document_reference, unique_id
parents similar_mto1_mtom

unique_id
content Unique ID
children foreign_key
parents relational_model, related_item, pg_edge, pg_vertex

foreign_key
content Foreign Key
parents relational_model, unique_id

document_reference
content Document Reference
parents document_model, related_item

relational_vs_doc_today
content Relational vs Document Model Today
children doc_pros, rel_pros
parents document_model, relational_model

doc_pros
content Document Model Pros
children better_performance_locality, schema_flex
parents relational_vs_doc_today

schema_flex
content Schema Flexibility
children schemaless
parents doc_pros

better_performance_locality
content Better performance (locality)
parents doc_pros

rel_pros
content Relational Model Pros
children better_joins, better_mm_m1
parents relational_vs_doc_today

better_joins
content Better support for joins
parents rel_pros

better_mm_m1
content Better support for many-to-many and many-to-one relationships
parents rel_pros

shredding
content Shredding
children shredding_cumbersome (shredding is cumbersome), split_document_into_tables (definition)
parents relational_model

split_document_into_tables
content Splitting a document into multiple tables
parents shredding

shredding_cumbersome
content Shredding is cumbersome, complicates application code
parents shredding

schemaless
content Schemaless
children schema_on_read
parents schema_flex

schema_on_read
content Schema On Read
children different_structures, implicit_structure_interp (definition), schema_on_write (contrasts)
parents schemaless

schema_on_write
content Schema On Write
parents schema_on_read, relational_model

implicit_structure_interp
content Implicit structure interpreted only when read
parents schema_on_read

different_structures
content Advantageous if items in the collection don't all have the same structure
parents schema_on_read
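
A small sketch of schema-on-read, assuming a hypothetical collection in which older documents store a single `name` field and newer ones store `first_name` (field names are illustrative). Nothing is migrated; the reading code interprets whichever structure it finds.

```python
import json


def first_name(user: dict) -> str:
    """Interpret the document's implicit structure at read time."""
    if "first_name" in user:
        # Newer documents already carry the split field.
        return user["first_name"]
    # Older documents only have the full name; derive the value on read.
    return user["name"].split(" ")[0]


raw_documents = [
    '{"name": "Grace Hopper"}',
    '{"first_name": "Alan", "last_name": "Turing"}',
]
names = [first_name(json.loads(doc)) for doc in raw_documents]
```

With schema-on-write, the same change would typically be an `ALTER TABLE` plus a backfill so that every row conforms before it is read.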

storage_locality
content Storage Locality
children keep_docs_small, locality_relational, only_for_large_doc_access
parents document_model

only_for_large_doc_access
content Only applies if you need a large part of a document at one time
parents storage_locality

keep_docs_small
content Recommended to keep documents small
children writes_inplace_full (reason for keeping documents small)
parents storage_locality

writes_inplace_full
content Updates usually require rewriting the entire document; only modifications that don't change the encoded size can easily be performed in place
parents keep_docs_small

locality_relational
content Storage Locality in Relational Models
children big_table, oracle, spanner
parents storage_locality, relational_model

spanner
content Google Spanner: rows interleaved within a parent table
children oracle (Similar)
parents locality_relational

oracle
content Oracle: multi-table index cluster tables
parents spanner, locality_relational

big_table
content Bigtable: column families
parents locality_relational

hybrid_rel_Doc
content Hybrid of relational and document models
parents document_model, relational_model
remarks not sure what this means

query_languages_data
content Query Languages for Data
children declarative, imperative, mapreduce_querying

declarative
content Declarative (SQL)
children parallelization, relational_algebra, allows_bts_optimizing
parents relational_model, query_languages_data

imperative
content Imperative (CODASYL)
parents query_languages_data

relational_algebra
content Relational Algebra
parents declarative

mapreduce_querying
content MapReduce Querying
children couchdb, mapreduce, mongodb
parents query_languages_data

allows_bts_optimizing
content Allows for behind-the-scenes optimizing
children parallelization
parents declarative

parallelization
content Parallelization
parents allows_bts_optimizing, declarative

mapreduce_desc
content programming model for processing large amounts of data in bulk across many computers
parents mapreduce

mapreduce
content MapReduce
children map, mapreduce_desc (description), reduce
parents mapreduce_querying

map
content Map
children pure_functions
parents mapreduce

reduce
content Reduce
children pure_functions
parents mapreduce

pure_functions
content Pure Functions
parents reduce, map
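
A rough single-machine sketch of the map/reduce shape, assuming hypothetical observation records with `family`, `date`, and `num_animals` fields. Both functions are pure: they depend only on their arguments and touch no outside state, which is what lets a framework run them anywhere, in any order, and rerun them on failure.

```python
from collections import defaultdict


def map_observation(obs):
    """Pure mapper: emit (key, value) pairs derived only from the input record."""
    if obs["family"] == "Sharks":
        yield obs["date"][:7], obs["num_animals"]  # key is "YYYY-MM"


def reduce_counts(key, values):
    """Pure reducer: combine all values emitted for one key."""
    return sum(values)


def map_reduce(records, mapper, reducer):
    """Tiny in-process driver; a real system would distribute these phases."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


observations = [
    {"family": "Sharks", "date": "1995-12-25", "num_animals": 3},
    {"family": "Sharks", "date": "1995-12-12", "num_animals": 4},
    {"family": "Whales", "date": "1995-12-01", "num_animals": 2},
    {"family": "Sharks", "date": "1996-01-05", "num_animals": 1},
]
sharks_per_month = map_reduce(observations, map_observation, reduce_counts)
```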

mongodb
content MongoDB
children aggregation_pipeline
parents mapreduce_querying

couchdb
content CouchDB
parents mapreduce_querying

aggregation_pipeline
content Aggregation Pipeline
children sql_expressiveness_with_json (description)
parents mongodb

sql_expressiveness_with_json
content SQL-like expressiveness with JSON-based syntax
parents aggregation_pipeline

objects
content Objects
children edges, vertices
parents graph_model

complex_many_to_many
content Handles complex many-to-many relationships
parents graph_model

vertices
content Vertices/nodes/entities
parents objects

edges
content Edges/relationships/arcs
parents objects

property_graph_model
content Property Graph Model
children cypher_query_language, pg_edge, pg_vertex
parents graph_model

triple_store_model
content Triple-store model
children semantic_web, subject_predicate_object, turtle_triples_notation
parents graph_model

pg_vertex
content Property Graph Vertex
children collection_of_properties, sets_of_edges, unique_id
parents property_graph_model

subject_predicate_object
content Subject, Predicate, Object
parents triple_store_model
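
A toy illustration of the triple shape, assuming made-up identifiers such as `lucy` and `idaho`. Every fact is a (subject, predicate, object) tuple, and a query is a pattern with some positions left open. The `match` helper is hypothetical, not part of any triple-store API.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every position that is not None."""
    return [
        t
        for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


triples = [
    ("lucy", "name", "Lucy"),       # object is a primitive value (a property)
    ("lucy", "bornIn", "idaho"),    # object is another vertex (an edge)
    ("idaho", "type", "state"),
    ("idaho", "within", "usa"),
]

born_in = match(triples, subject="lucy", predicate="bornIn")
about_idaho = match(triples, subject="idaho")
```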

sets_of_edges
content Sets of edges
children incoming, outgoing
parents pg_vertex

outgoing
content outgoing
parents sets_of_edges

incoming
content incoming
parents sets_of_edges

collection_of_properties
content Collection of properties (KV pairs)
parents pg_vertex

pg_edge
content Property Graph Edge
children vertex, unique_id, vert_label
parents property_graph_model

vert_label
content Label describing the relationship between the two vertices
parents pg_edge

vertex
content Vertex
children head, tail
parents pg_edge

head
content Head
parents vertex

tail
content Tail
parents vertex

traverse
content Traverse
parents graph_model

cypher_query_language
content Cypher Query Language
parents property_graph_model

graph_queries_sql
content Graph Queries in SQL
children recursive_common_table_expressions
parents graph_model

recursive_common_table_expressions
content Recursive Common Table Expressions
children variable_length_traversal
parents graph_queries_sql

variable_length_traversal
content Variable-length traversal paths in a query
parents recursive_common_table_expressions
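
A sketch of a variable-length traversal using a recursive common table expression, run through Python's built-in `sqlite3` module. The `vertices`/`edges` schema and the sample places are assumptions made for illustration. The query follows `within` edges to any depth, which a fixed number of joins cannot express.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE vertices (vertex_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE edges (tail_vertex INTEGER, head_vertex INTEGER, label TEXT);
    INSERT INTO vertices VALUES
        (1, 'United States'), (2, 'Idaho'), (3, 'Boise'), (4, 'France');
    -- tail is "within" head: Idaho within United States, Boise within Idaho
    INSERT INTO edges VALUES (2, 1, 'within'), (3, 2, 'within');
    """
)

rows = conn.execute(
    """
    WITH RECURSIVE in_place(vertex_id) AS (
        -- base case: the starting vertex
        SELECT vertex_id FROM vertices WHERE name = ?
      UNION
        -- recursive step: anything "within" a vertex already found
        SELECT edges.tail_vertex
        FROM edges
        JOIN in_place ON edges.head_vertex = in_place.vertex_id
        WHERE edges.label = 'within'
    )
    SELECT name FROM vertices JOIN in_place USING (vertex_id) ORDER BY name
    """,
    ("United States",),
).fetchall()
places_in_us = [row[0] for row in rows]
```

Boise is reached in two hops and Idaho in one, yet the query text does not mention the depth.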

turtle_triples_notation
content Turtle Triples notation
children RDF_data_model (turtle is a human readable notation for RDF)
parents triple_store_model

semantic_web
content Semantic web
parents triple_store_model

RDF_data_model
content RDF Data Model
children SPARQL, resource_description_framework, SPO, SPO_URIs, XML
parents turtle_triples_notation

resource_description_framework
content Resource Description Framework
parents RDF_data_model

XML
content XML
parents RDF_data_model

SPO
content SPO
children SPO_URIs
parents RDF_data_model

SPO_URIs
content SPO URIs
parents SPO, RDF_data_model

SPARQL
content SPARQL
parents RDF_data_model

datalog
content Datalog
children cypher, qlist_datomic, sim_to_tsm
parents graph_model
remarks much older than SPARQL and Cypher

cypher
content cypher
parents datalog

qlist_datomic
content QList, Datomic
parents datalog

sim_to_tsm
content Similar to the triple-store model (TSM)
parents datalog
remarks "TSM" here most likely abbreviates the triple-store model