The Art Of Computer Programming
Notes for The Art of Computer Programming series by Donald E. Knuth.
Table Of Contents
2.2.6 Arrays And Orthogonal Lists
2.3.2 Binary Tree Representation of Trees
2.3.3 Other Representations of Trees
2.3.4 Basic Mathematical Properties of Trees
Chapter 1: Basic Concepts
1.1 Algorithms
1.2 Mathematical Preliminaries
1.3 MIX
1.4 Some fundamental programming routines
Chapter 2: Information Structures
2.2 Linear Lists
2.2.2 Sequential Allocation
Data usually has much more structural data than we actually want to represent directly in a computer. [gpuolfjer]
Consider not only the data structure, but the class of operations to be done on the data. [gfawddqwk]
Linear List: sequence of nodes whose essential structure properties involve only the relative positions between items as they appear in a line. [gpddodpju]
Linear List Operations: Access, Insert, Delete, Combine, Split, Copy (List), Get Number Of Nodes, Sort, Search. [guwwkfilp]
Linear List Types: Stack: insert/delete at end of list, Queue: Insertions made at one end, deletions made at the other end, Deque: insert/delete at both ends. [gkwwskdla]
A deque is more general than a stack or queue. [gdhlrslwo]
Stacks are particularly useful for the processing of languages with a nested structure. [gadqiousq]
What is a Schachtelsatze? (!TAOCP vol1, pg 241). [gurrlfdfu]
Stacks generally occur most frequently in connection with explicitly or implicitly recursive algorithms. [gukeuflup]
Kind of like how the book references chapters that haven't been published yet. Long term thinking indeed. [gpadakoqp]
Special terms for stacks: top of stack, bottom of stack. [gqesljhpf]
Knuth does not like push/pop terminology. "The brevity of words 'push' and 'pop' has its advantages, but these terms falsely imply a motion of the whole list within computer memory. Nothing is physically pushed down; items are added onto the top, as in hnaystacks or stacks of boxes". [gfjdpalhd]
Terminology for queues: front and rear. [gidppesse]
Terminology for deques: left and right. [gdlfuuhsl]
Descriptive words in English for stack/deque/queue algorithms: "up-down" for stacks, "waiting in line" for queues, and "left right" for deques. [giihdlrko]
Simplest and most natural way to keep a linear list inside a computer is to put the list items in consecutive locations, one node after the other. [gfshpdwdq]
LOC(x[j + 1]) = LOC(x[j]) + c. C is the number of words per node. [gqkiqahdl]
In general: LOC(X[j]) = L0 + Cj. [glaahlsrd]
Sequential allocation is quite convenient for dealing with a stack. [gpaujjaid]
T is stack pointer, place Y on top of stack: T <- T+1; X[T] = Y; [gfpihkaii]
Queues and Deques get a little bit tricker. Uses two pointers called F and R. [ghorpjsaj]
Rear Insert: R <- R+1; X[R] <- Y; [gasfrlkad]
Front Remove: F <- F + 1; Y <- X[F]; if F = R, then set F <- R <- 0. [gpdkedeoh]
If R stays ahead of F, this approach takes up a lot of space. [gufwkfufj]
Better approach: set M nodes aside, arranged implicitely in a circle: if R = M then R <- 1, else R <- R + 1; X[R] <- Y. And: If F = M, then F <- 1, else F <- F + 1; Y <- X[F]. [gaoahuoai]
Actions above assume nothing goes wrong, which is quite unrealistic! To fix this, add underflow and overflow. [gjjfieeai]
Work on exersize 1. Apparently discusses nontrival aspect of this simple queuing mechanism. [godkaides]
What to do when underflow/overflow occurs? [gqqeokksw]
Underflow might be meaningful: remove items until underflow occurs. [gjafhqalh]
One would hate to give up when only one overflow occurs with room remaining in other lists. [grupqrljh]
If a program uses two lists, they can grown towards eachother. One expands to the left, the other expands to the right. [gdawwwhrk]
There is no way to store three or more lists with variable size such that: A. overflow happens only when the size of all lists exceeds total space. B. Each list has fixed location for bottom element. In order to satisfy A, give up B. [gfauerqeq]
Allow bottom elements of list to change positions. [gqhepfrfh]
No absolute memory address for L0. Not a constant. [guhjuorhk]
Study and figure out and explain MIX code in figure 8, pg 246. [gpelqkeaa]
Important Special Case: Each variable sized list is a stack. [gfllokifw]
Insert: TOP[i] <- TOP[i] + 1; if TOP[i] > BASE[i + 1], then OVERFLOW, else set CONTENTS(TOP[i]) <- Y. [gqeifqkaj]
Delete: if TOP[i] = BASE[i], then UNDERFLOW, else set Y <- CONTENTS(TOP[i]), TOP[i] <- TOP[i] - 1. [gpqfeelwq]
Note that BASE[i + 1] is (i + 1)st stack. [grrufquio]
When OVERFLOW occurs, repack memory. Some ways to do this outlined, intended for linear lists allocated sequentially. [gwrjfqfrr]
1 of 3 possibilities can happen when handling OVERFLOW: [gjauskpsl]
A. find smallest k i <= N and TOP[k] < BASE[k + 1], if it exists. Then move up a notch: set CONTENTS(L + 1) <- CONTENTS(L) for TOP[k] >= L > BASE[i + 1]. [giqiilwjd]
I need to better grok how these shifts work here. [gfswskahl]
B. No k can be found like in A, but a k is found: largest k for 1 <= k < i and TOP[k] < BASE[k + 1]. Move down: set CONTENTS(L - 1) <- CONTENTS(L) for BASE[k + 1] < L < TOP[i]. [grkdkeqhk]
C. TOP[k] = BASE[k + 1] for all k != i. There is no room. Give up. [gwuwluiqq]
Choosing better initial conditions helps avoid overflows in certain situations. However, it can save at most a fixed number of overflows. [grrpfuofo]
Improved Method: make room for more than one entry each time memory is repacked. [gapfuweko]
J. Garwick: Suggests complete repackinmg of memory when overflow occurs. [guhsuswds]
Algorithm G: Reallocate Sequential Tables. (pg 248). [ghhwiiiku]
Algorithm G: reallocate sequential tables (pg 248). If total memory not exceeded, will be re-arranged to make operation CONTENTS(TOP[i]) <- Y possible. [groueskur]
Most interesting part of algorithm G is repacking process. It is nontrivial due to all the shifting up and down. [ghrrkhfqr]
Algorithm R: relocate sequential tables. (pg 249). Based on easily verified fact that the data to be moved downward can't overlap with any data that is to be moved upwards or that will stay put. [goadujkwr]
Mathematical analysis of dynamic storage allocation like what is being done above (algo G + R) is difficult. [glpddswfi]
Some theory that can be derived for tables that only grow by insertion. Deletion and subsequent insertion to cancel effect ignored. [gkuapwdiu]
Moral of the story: a very large number of moves to be made if reasonably large number of items put into table. This is the price of packing many sequential tables tightly. [gsqlhafeo]
When memory is only half loaded, it tiends to require minimal arrangement with algorithm G. [ghqhjrowj]
Almost full memory: Algorithm R takes a while. OVERFLOW much more likely. [gjhqehiwf]
If many variable sized tables are used in a program, don't expect to use up 100% of memory before storage is exceeded. [geafqlpde]
2.2.3 Linked Allocation
Instead of sequential list, use a more flexible scheme consisting of a node containing a link to next node in list. [gfqhouwee]
Linked memory takes up more space, due to the links. [gjwlauhar]
But there is also implicit gain: tables can overlap, sharing common parts. [gdqdhphqk]
In many cases, sequential memory is less efficient, will often leave lots of unused space. [gaiodwapa]
Easy to delete an item from linked list. Sequential allocation generally implies moving a large part of the list into different locations. [gsulpooae]
Easy to insert item into the midst of a list with linking scheme. non-trivial with sequential table. [gpsokfrkw]
References are much faster in sequential case. usefulness of linked list predicated on the fact that a large majority of applications want to walk through lists linearly, not randomly. [gujkeqqii]
Linked lists can be split/joined easily. [ghiejwkis]
Linked lists lend themselves better to more intricate structures than simple linear lists. Variable number of variable sized lists, any node can point to start of new list, nodes can be linked together in several orders corresponding to different lists, etc. [gowpjqroe]
Simple operations, such as proceding sequentially through list, are typically slightly faster with sequential lists. [gdpqohdis]
Linking technique frees one from the consecutive nature of comptuer memory. [gkjdkpise]
AVAIL list/stack: list of available space, used with some mechanism for finding empty spacve for new node. [gwqdpqafa]
Set address of x to new node: X<-AVAIL, AVAIL<-LINK(AVAIL). Shorthand notation for this is X<=AVAIL. [glapfifup]
Node deleted, no longer needed: LINK(x)<-AVAIL, AVAIL<-X. Shorthand is AVAIL<=X. [gapjfqapd]
Storage pool: the set of all nodes that can be allocated. [gujrdjrld]
"We terminate the progrma with regrets" [gahppwilf]
Things omitted from initial discussion of AVAIL: how to setup, test for overflow, managing sizxe of storage pool. [geisoirew]
Current implemetnation/discussion of pools assume nodes of equal sizxe, different siees discusess 2.6. [gelfowepo]
Simplest kind of linked list is a stack. [ghduofpah]
Insert Y to top of stack with pointer P: P <- AVAIL, INFO(P) <- Y, LINKE(P) <- T, T <- P [gijqpjkpr]
Pop Y off stack: If TNUL, then UNDERFLOW, otherwise set P<-T, T<-LINK(P), Y<-INFO(P), AVAIL<
P. Note: "<=" operator here puts P back into pool of free nodes. [gffeijhkw]
Study MIX program 10 (pg 258) [gsdlrrrpw]
Insertion/deletion operations involve cyclic permutation of 3 links. [gauqfsjqs]
P is AVAIL value before insertion (and P not NULL): AVAIL has previous value LINK(P). LINK(P) has previous value T. T has previous value AVAIL. AVAIL->T->LINK(P)->AVAIL [gppkddeel]
Deletion P is T before, P not NULL, Y<-INFO(P): AVAIL->LINK(P)->T->AVAIL. [gqdrudhqj]
Cyclic aspect not really relevant, important point is that precisely 3 links are permuted in these operations. [gwihfphwl]
Linked allocation applies in a particularly convenient way to queues. [gpajaeqou]
Make sure to handle empty lists properly. This is a common programmer error. "specify all conditions carefully". [giuhdwhep]
Another error: forgetting to change some of the links when structure s being maniupoulated. [gpuiuuaou]
Policty for empty queue: F eq NULL, R eq LOC(F). [gjqhdpwki]
Queue deletion: If F eq NULL, UNDERFLOW else set P<-F, Y<-INFO(P), AVAIL<=P and if F eq NULL, then set R<-LOC(F) [gqdiaijuu]
R is changed when queue becomes empty, this is the boundary condition. [grpkhsiid]
Not the only way to represent queues in a linearly linked fashion. Other ways are discussed later (see ex. 30) [gqokolwei]
So far, discussions on performing operations on tables have been abstract, no programs in which techinques are used, no onto practical examples! [goweijarh]
Topological sorting: important process in network problems, PERT charts, even linguistics. [ghrwfsjpj]
Topological sort is potentially useful with problems involving partial ordering
. [goedljiro]
Partial ordering of set S is relation between objects of S, denoted by symbol (Paul: curvy less than equal to? looks like <=) [gsssjhifq]
Satisfies following properties for S (will use <= to represent notation used in book): Transitivity: x<y and y<
z, x<z. Antisymmetry: x<
y and y<x, x
y. Reflexivity: x<=x. [giqwshpsr]
x<=y reads as "x preceds or equals y". [gfjiqpeld]
y!<=z reads as "y does not precede z". [glophdeqa]
PERT networks "x must be done before y". [gksqduolw]
Antisymmetry propety means there are no closed loops or paths that close on themselves. Problem of topological sort: emed the partial order in a linear orer. [ghdrdsiqe]
linear sequence a1, a2, ... an, aj<=ak, j Algorithm proves tha this operation is possible for every partial ordering. [gljeoapdo]
Topological sort example: glossary of technical terms. Find a way to arrange the words in the glossary so that no term is used before it is defined. [grqsrjpaf]
Topological sort process: take object that isn't proceded by any other object in the ordering. Place first in output and remove from set S. repeat until whole set has been sorted. [giqflkrij]
algo T: topological sort (pg 265): inputs a sequence of relations j<=k, indicating that object j precedes k in a certin partial orering. output is set of objects embedded in a linear order. [glijuswfo]
Try algo T by hand on input 18 (264-265) [geaqldlok]
Combo of linked list and sequential allocation. [gqeopjkri]
Sequential memory used for main table, because it makes references to random parts of tables in T3. [goioquosk]
linked memory: tables of "immediate successors" no particular order in input. [godlerdwl]
Analysis of algo T: simple using Kirchoff's law. [gurjjlrja]
Exeuction time has approximate form $c2 n$. $m$ is number of relations, $n$ is number of objects. and $c2$ are constants. [geikppuaf]
Topological sorting technique similar to T (but without the important features of queue links): published by A.B Kahn. [gakupokkq]
Even better topological sort algorithm in 7.4.1 (not printed yet?) [gkruuhfhh]
2.2.4 Circular Lists
Circularly linked list: links back to first item instead of NULL. [gsddqdjwf]
Makes it possible to access all of the list starting form any point. [gjfifiaqo]
Important primitive operations: A. insert Y at left, B. insert Y at right, C. set y to left node and delete. [ghhjowlsl]
A. Insert Y at left: P<=AVAIL, INFO(P)<-Y, LINK(P)<-PTR, LINK<-(PTR)<-P. PTR: link variable that points to rightmost node in list. LINK(PTR): points to leftmost node in list. [gsdoisewq]
B. Insert Y at right: insert Y at left, then PTR<-P. [gfjlpuepa]
C. Set Y to left node and elete: P<-LINK(PTR), Y<-INFO(P), LINK(PTR)<-LINK(P), AVAIL<=P. [gojujlewf]
Operations above do not consider empty list can be handled if empty list handled by making PTR NULL. [gwdkeejde]
Operations give actions of output-restricted deque. Therefore circular list can be used as stack of queue. [gliqqhkri]
Operations A+C stack. Operations B + C: queue. [guwppqeef]
Erase a list (put all items on AVAIL): IF PTR ne NULL, then AVAIL <-> LINK(PTR). "<->" denotes P<-AVAIL, AVAIL <- LINK(PTR), LINK(PTR)<-P. [gwrdqjoos]
Insert entire list L2 at the right of L1: if PTR2 ne NULL, then: (if PTR1 ne NULL, then LINK(PTR1)<->LINK(PTR2)) SET PTR1<-PTR2, PTR2<-NULL. [gffhewhid]
Assumes PTR1 and PTR2 point to disjoint circular lists L1 and L2. [gpffqaphw]
Splitting circular list in 2: with above, can correspond to concatentation/deconcation of strings. [gesdkowpd]
Circular list with one pointer to rear node is equivalent to linear list with two pointers to front/rear. [girraiwlu]
How to find end of list when there is circular symmetry? Iterate throgh list, stop when start found again. Or, add special recognizable node as stopping place: list head. [guuoqwjpd]
List head makes it possible (though not efficient) to get to any point in the list from any other point. [gqpqreawa]
Example use of circular lists: arithmetic on polynomials in variables x,y,z with integer coefficeints. [gqalakajw]
Linked alocation works well: polynomials can grow to unpredictable size. May want to represent many polynomials in memory at once. [grkflhppw]
Algorithm A: addition of polynomials (pg 276). [gdowdafld]
Noteworthy feature of algorithm A: manner in which variable Q1 folows Q around the list. [gosjqqprr]
Algorithm M: multiplication of polynomials (pg 277). [gwjssukqf]
Algorithm M analogous to Algorithm A. [giklwfafk]
2.2.5 Doubly Linked Lists
Adds two links to each node: points to either side of node. [gupieoquw]
Manipulations of linked lists almost always become easier with list head. creates complete symmetry between left and right (loops back). [ghdaujkps]
List head satisfies: RLINK(LLINK(x)) \= LLINK(RLINK(x)) \= x [gwdkudroh]
Doubly linke list takes up more memory than singly. But tradeoff is worth it for the efficient operations with 2-way links. [gfeieoulh]
Deletion: RLINK(LLINK(X))<-RLINK(X), LLINK(RLINK(X))<-LLINK(X),AVAIL <= X. [gupwofrdo]
Nodes in one-way linked lists cannot be deleted without knowing the previous node. [gpjwilsor]
Several algorithms require removing random nodes from middle of list. Doubly linked lists suitbale for this. [gsjhsqfaw]
Easy to insert an adjacent node to NODE(X): P<=AVAIL, LLINK(P) <-x, RLINK(P)<-RLINK(X),LLINK(RLINKE(X)<-P, RLINK(X)<-P. [glkasdjdf]
Discrete simulation: simluation of system in which all changes are in the stae of system are assumed to happen at discrete instants of time. [goifokuor]
Program outlined simulates elevator system in mathematics building at caltech. [gruprdpij]
Five floors, number 0-4. [gsliislef]
Each floor has 2 call buttons: UP and DOWN, corresponding variables CALLUP[j] and CALLDOWN[j]. 0<\j<\
4. [gupfeusde]
CALLCAR[j] varaiables. buttons within elevator car, which direct it to destination floor. 1 means pressed, cleared to 0 when fulfilled. [grhdlioew]
Elevator has one of three states: GOINGUP, GOINGDOWN, and NEUTRAL. [godelhhkq]
Eleavtor system is simulated using 2 coroutines: one for passegners, other for elevator. [gehhfuuuq]
TIME: variable for curret value of simulated time lcock in units o tenths of seconds. [gilpkordj]
Other variables: FLOOR: eleavtor position. D1: 0, except when people getting in/out. D2: 0, if elevator idle on floor for 30+ seconds. D3: 0, except when doors open and nobody is getting in/out. [glqfaoeaq]
Coroutine U (users) [pg 283]: everyone who enters the system begins to perform the actions specified below, starting at step U1. [gajohroow]
Coroutine E (eelvator) [pg 284]: this coroutine reprsents actions fo elevator. Step E4 also handles when people get in and out. [goqfhfujd]
Subroutine D (decision subroutine) [pg 287]: This subroutine is performed at certain critical times, as specified in coroutines above, when a decision about the elevators next direction is to be made. [gedppfadk]
Elevator system described above is complex compared to other algorithms discussed in this book. Choice of real-life system better than any cooked-up "textbook example" would ever be. [glklhlrlq]
Passing of simulated time and handling of simultaneity can be programmed by having each entity represented by node with NEXTTIME and NEXTINST fields. [grqflqhas]
NEXTTIME: time for when next action for this enttity will take place. [gkkaoipjw]
NEXTINST: memory address where this entity is so start executing instructions (analogous to ordinary coroutine linkage). [gosiifwpd]
WAIT: doubly linked list. entities placed here to wait for time to pass. [ghqfffaqr]
WAIT: "agenda" sorted on nexttime field. [gdpkddjir]
Doubly linked lists also used for elevator and QUEUE. [glwifhirl]
Program itself is quite long, can be broken up into 6 parts. [ghriiwofa]
First part: define initial contents of tables: WAIT, QUEUE, and ELEVATOR. [gpjfdadhh]
Next part, basic subroutines and main control routines for the simulation process. [gjdqdikwu]
Routine CYCLE: heart of simulation control. which activity is to act next? [gqhqiahih]
Program for coroutine U comes next, followed by coroutine E. [gaaelhdes]
Decision subroutine not considered here (see excercises). [goshroaqw]
Quasiparallel processing: technique of using WAIT list or agenda to control sequencing of coroutines. [gsdjpahij]
Use special trace program ("profiler") to get good indication of overall efficiency. [gqilshdws]
"It is hoped that some read will learn as much about simulation from the example above as the author learned about elevators while the example was being prepared" [geppeqppu]
2.2.6 Arrays and Orthogonal Lists
Simplest generalization of a linear list is a 2d or higher dimenionsal array of information. [geidieqfu]
2d MxN aray: each node A[j,k] belongs to two linear lists "row j" list and "column k" list. [gqkirledd]
When (2d) array is stored in sequential memory locations, storage allocated so that: LOC(A[j,k])\=(a0) + (a1)j + (a2)k. [gwpfhsusp]
4d array with one words elements Q[I,J,K,L]. Allocate storage to be LOC[Q[I,J,K,L] \= a0 + (a1)I + (a2)J + (a3)K + (a4)L. [gfdlawwpr]
Commonly allocated by arranging elements in lexicographic order, AKA "row major order". [goueppjso]
General definition for memeory storage of k-demmensional array with C-word elements A[I1, I2,..., IK]. (Outlined in book). [gpfdfqqik]
Can be used to correspond to number in mixed radix number system for representing date and time TIME[W,D,H,M,S]. Array would have 2,419,200 nodes. A "pretty fancy application" would use this. Not very practical. [ghaejhdui]
The normal moethd for storing arrays are suitable when all elements are present: a complete rectangular structure. [goodiqhia]
Triangular matrix: stores entries A[j,k], 0\<\k\<\
j\<\=n. [gopkuuqlp]
Previous allocation method does not work for storing triangular matrix sequentially. New form required. [gaehfkfrs]
Lexicographic order does work example triangular matrix. Simple equation falls from this. [gwriwoifh]
Better approach involves packing two triangular matrices of the same size together. [goqioddlf]
Tetrahedral array: generalization of triangular matrices to higher dimensions. [gwduadslp]
Linked memeory allocation can also be applied to higher dimensional arrays in a natural way. Works well for non-rectangular structure. [gleewwuks]
Example: list in which each node has 4 link entries: SEX, AGE, EYES, HAIR. Represents Person. [gopfjuias]
Insertion would be efficient, deletion would be slow, unless double linking used. [gqudjquoh]
Sparse matrices: example that uses linked allocation for orthogonal lists. They are matrices of a large order whose elements are mostly zero. [gpwdoqpqr]
Use circularly linked lists for row and column: ROW/COL indices, VAL value, UP/LEFT links to next nonzero entry upwards/leftwards. [goeallefs]
Pivot step operation: example of nontrivial algorithm using sparse matrice in this form. Operation important part of algorithms for solving linear equations, inverting matrices, solving linear problems via simplex method. [gruhlfsek]
Algorithm S: Given a matrix as represented in figure 14, perform pivot operation. [pg 304]. [ghohdeerl]
2.3 Trees
Trees: most important nonlinear structures that arise in computer algorithms. [gosidiqju]
"branching" relationship between nodes like what is found in the trees of nature. [gfdqewqpr]
Finite set T with one or more nodes. [gqqpkjfjr]
One node is specially designated, called the root of tree root(T). [gpuafadfj]
remaining nodes partitioned into m (greater than 0) disjoint sets T1,...,Tm. Also are trees. Subtrees of root. [gkeekdsah]
Recursion is innate characteristic of trees, hence the recursive definition. [gwiaidkqu]
Degree: number of subtress of a node. [gareeppww]
Terminal node: node with degree of 0. Sometimes called leaf. [gaqrwepjj]
Branch node: a non-terminal node. [gdjhfsifr]
Level: root(T) is 0, any other is one higher than nodes level with respect to subtree of root(T) containing it. [guafqwpsi]
Ordered tree: tree where relative order of the subtrees is important. Sometimes called "plane trees". [gpjehpoew]
Oriented tree: disregard ordering of subtrees of nodes, treat them as equivalent. Only relative oreintation considered, not order. [gjqurjlqj]
Forest: (usually ordered) set of zero or more disjoint trees. [gkkfueedk]
Tree vs forest: little distinction. Delete root of tree, and you have a forest. Add node to a forest and regard trees as subtrees of new node, yields a tree. [greueplpp]
Binary tree: finite set of nodes that is either empty or consists of a root and the elements of two disjoint binary trees called the left/right subtrees of the root. [geqluqlwk]
Ways to show tree structre: nested sets, nested parentheses, indentation. [gufaejqoh]
A forest can be regarded as special case of a list structure. [guiakqhol]
List: a finite sequence of zero or more atoms or lists. [gijdfruju]
Atom: undefined concept here, can refer to elements from any universe of objects that can be desired as long as it is distinguishable from list. [grpfkewar]
Lists vs trees: lists can overlap, or even be recursive. [gehaqeiod]
2.3.1 Traversing Binary Trees
Binary trees are good to understand before other trees, as general trees are usually represented in terms of some equivalent binary tree (2.3.2). [glkjqirej]
Traversing, or "walking through" tree. Visit nodes exactly once. [gfqflpjif]
Complete traversal is linear arrangement of nodes. [gaeuhawes]
3 principal ways to traverse binary tree: preorder, inorder, postorder. [ghkerakkq]
preorder: root, traverse left, traverse right. inorder: traverse left, root, traverse right. postorder: traverse left, traverse right, root. [gdjqkqrqj]
names come from relative position of root with resepct to its subtrees. [gqsrljulf]
in many applications of binary tree, symmetry between left/right subtrees. [gfesqjrri]
symmetric order: AKA inorder. [gfipwahhq]
Algorithm T: traverse binary tree inorder [pg 320]. [gqlashslp]
"visit" here means do whatever activity intended as tree is being traversed. [gfrarsowp]
Proof that algo T works see pg 320. [ghjrwojfu]
preorder algo almost identical inorder. postorder slightly more difficult. [gikrieejh]
Notation for successors/predecessors of nodes. let p
be pointer to binary node. [grqraihhk]
p*
is address of successor of NODE(P) in preorder. p$
is successor, inorder. p#
is successor, postorder. *p
is predecessor, preorder. $p
is predecessor, inorder. #p
is predecessor, postorder. [gphelahpf]
ex. 16 can help clear up confusion about notation. [gewuiweal]
Threaded tree representation: terminal links (otherwise wasted and unused space) are replaced by "threads" as an aid to traversal. [gjwoqhdwl]
Threads always go to a higher node of tree. [gqjfialka]
Every node now has exactly 2 links. [gwipqopwl]
Use some sort of tag to distinguish between thread and regular link. [grufhpsjj]
Traversal becomes simpler with threaded tree. [gallwujke]
Algorithm S: symmetric (inorder) successor in threaded binary tree [pg 323]. [gjadhfeli]
Compared algo S to algo T: no stack required. [goowpsdia]
Program T: Implementation of Algo T. [gskqjqjui]
Program S: Implementation of Algo S. [guprhsusi]
Threaded trees take a bit more time to insert/delete nodes. [gkeqfqopj]
Threaded trees grow almost as easily as ordinary ones do. [glefrikio]
Algorithm I: insertion into threaded binary tree [pg 327]. [gopuwflew]
Right threaded binary tree: threaded RLINKS, LLINKS that are empty set to NULL. Middle ground between threaded and unthreaded representation. [gfkphqudl]
Two binary trees T and T' are said to be similar if they have the same structure. Informal: they have the same shape. Formal: both empty, or both non-empty and L/R subtrees are respectively similar. [gpwqaeflp]
T and T' are said to be equivalent if they are similar and corresponding nodes contain the same informaiton. [gelwouehd]
Figure of 4 binary trees helpful [pg 328]. illustrates dissimilar, similar, and equivalent. [giiqpeolw]
Theorem A: helpful for determining if trees are similar or equivalent. [pg 328] [gsjlelarr]
Algorithm C: copy a binary tree [pg 329]. [grlekkdjf]
2.3.2 Binary Tree Representation of Trees [pg 334]
We now turn from binary trees to just plain trees. [guiuwqqse]
How to represent any forest as binary tree [see pg 334-335 for figures and visual explanation]. [gipqaejus]
Natural correspondence between forest and binary tres. AKA the transformation from forest to binary tree. [gqipseprj]
Rigorous definition for how to binary tree is corresponding to the forest [pg 335]. [gpujahole]
Threaded binary tree representation: right thread links go to rightmost child of a family to the parent. [geeuqokqd]
Left thread links do not have such a natural interpretation. Lack of symetry between left and right. [gwashhpkd]
Preorder and postorder traversal can be applied to forest (therefore, trees). Inorder doesn't have a simple analog from binary trees. [geadhhowa]
Let A: visit root of first tree. B: traverse subtrees of first tree. C traverse remaining trees. Preorder: ABC. Postorder: BAC. [gwjeeposo]
Preorder is "time-honored concept", sometimes can be meaningfully called "dynastic order". [gurfrshju]
Order of succesion to throne: lineal chart of all the aristocracy, write out nodes in preorder, consider only the living. [gileespaq]
Traversing forest in preorder exactly the same as taversing corresponding binary gree in preorder. [ghwkkpeoq]
Traversing forest in postorder is exactly the same as traversing the correspondin binary tree inorder. [gifepeadj]
Use trees to represent algebraic formulas. [gqfjswhrl]
Polish notation: Preorder traversal: prefix notation of formula. Postorder traversal: postfix notation. [gkowisrso]
Algorithm D: differentiation. [pg 340] [glejhsokk]
Some notes on differentiation [pg 338-339]. [gslweqlpe]
Algo D is like the control routine for an interpretive system or machine simulator. To complete it, define routines that peform th actual differentiation. [gfswkasji]
Build tree construction function that makes new trees by joining smaller ones together. [gwfqlqrol]
TREE function: x is node, either constant, variable, or operator. U and V are pointers to trees. [gewusoheo]
TREE(x, U, V): makes new tree with x in its root node and U and V subtrees of root. [gsfskfjoq]
TREE(x, U): tree with only one subtree. [gulwuaiio]
TREE(x): new tree with x as terminal root node. [gqhrupsoa]
COPY(y): copies tree provided by U. [ghkpwqela]
Nullary operators (constants and variables), Unary operators (logarithm and negation), binary operators (addition, subtraction, multiplication, division, exponentiation). [gqdkhajfo]
Program D: differentiation. [pg 343]. [grploehpq]
2.3.3 Other Representations of Trees
Sequentional memory techniques: represent trees in sequential memory. [ghdodjwrs]
Preorder sequential representatiln: all subtrees of a node appear immediately after that node, RLINK arrows never cross eachother, LTAG field is made redudnant. [glfpiapdu]
RLINK field almost redudndant, RTAG and LTAG are what is needed to represent structure. [gukdkpalo]
RLINK/LTAG redudancy is little to no help unless scanning forest sequentially. Otherwise, extra computation required to deduce missing information. [glrojqadi]
RLINK is null for more than half of sample forest. Ways to make use of wasted space: [gpaquhjra]
1. Fill RLINK with subtree below that node. RLINK now called SCOPE. [giadudwuf]
2. Decrease node size by removing RLINK field, add special "link" nodes just before nodes that formerly had non-null RLINK. [gakdkelfl]
3. OMIT RLINKS instead of LLINKS. List nodes in a new order called family order. [gpaslifau]
Family order for forest: visit root of first tree. Traverse remaining trees (family order). Traverse subtrees of root of first tree (family order). [gqflkeuij]
Level order: list nodes left to right, one level at a time. Similar to family order, but families are chosen FIFO instead of LIFO. [grqaidsur]
Postorder with degrees: list nodes in postorder and give the degree of each node instead of links. [gswkukssr]
Postorder with degrees useful for "bottom-up" evaluation of functions defined on nodes of a tree. [giejhwaoo]
Algorithm F: evaluate a localy defined funciton in a tree. [pg 352]. [gqhkjpjih]
ex 2.3.2-10: proof that shows postorder with degrees has sufficient information for tree structure. [gekfrdrfd]
In binary tree representation LLINK can be more accurately called LCHILD. [gwkqqwiea]
Leftmost child usually the "youngest" of children in tree. [grqkqeeuw]
Triply linked tree: each node has LCHILD, RLINK, and PARENT links. [gsderfefo]
Upward links useful in an algorithm for dealing with equivalence relations. [girdrkfeq]
Equivalence relation: relation between the elements of a set of objects satisfying transitivity, symmetry, and reflexivity. [gudwisokr]
Quite different from partial ordering, even though 2 of 3 defining properties are the same. [gsaiephqe]
Equivalence problem: read in pairs of equivalent elements, determine later wheter 2 particular elements can be proved equivalent or not on the basis of given pairs. [giudffhhd]
Equivalence classes: disjoint classes such that 2 elements are equivalent if and only if they belong to the same class. [glhwpiwew]
Algorithm E: process equivalence relations. [pg 354]. [gjhaswlao]
Try algorithm E on input 11 [pg 355] [gsklaupfu]
See exercise A. "remaining relations are more interesting". [gkiapekiu]
8 binary tree methods possible by using straight, circular, and doubly linked lists in the LLINK/RLINK directions. [geuqrsreu]
Ring structure: what happens when circular linking is used in both directions. [gqoksaopf]
Algorithm A: addition of polynomials [pg 357]. [gujaaakwf]
2.3.4 Basic Mathematical Properties of Trees
Nonmathematicl readers are advised to skip to 2.3.4.5. [gpdlkwuuu]
Graph: set of points (vertices) together with a set of lines (edges) joining certain pairs of distinct vertices. [gfirjpded]
Vertices are adjacent if there is an edge joining them. [goisahrih]
Walk: sequence of adjacent vertices, with a start/end vertice. [gpfofqewl]
Walk is a path if vertices are distinct. [gfjpleilu]
Cycle: path with at least 3 vertices, first and last vertice the same. [gdqfladkq]
Graph is connected: path between any two vertices in graph. [glffieslj]
Free tree/unrooted tree: connected graph with no cycles. [ghsqhoehf]
Theorem A: if G is a graph, the following statements are true: [glssdrlda]
A. G is a free tree. [giwwwjspk]
b. G is connected, but if any edge is deleted, the resulting graph is no longer connected. [gdwulfsrw]
c. If V and V' are distinct vertices of G, there is exactly one simple path from V to V'. [gaadjwfwl]
If G is finite, contains a non-zero number of vertices, following is equivalen to a, b, and c: [gllaaplhw]
d. G contains no cycles and has n-1 edges. [giqqerhaq]
e. G is conneted and has n-1 edges. [gidopuras]
Proof is on pg. 364. [grufwljwl]
Applying Kirchhoff's law in the analysis of a free tree, using as an example an abstracted flow chart for program 1.3.3A. [gpipopwkk]
Goal: find all relations between the quantities that are implied by Kirchhoff's law, hopefully gain insight into problem. [geswesisr]
Convert flowchart to graph. Boxes become vertices. Arrows become edges. [guqephalf]
Thick line denotes a free subtree. [gforofwek]
G': free subtree of G. consequence of theorem A. Any edge V-V' not in G', plus G', contains exactly one cycle. [gwkleoeff]
Add single edge to flowchart free subtree G', obtain fundamental cycles. [grkqlehdp]
Each fundamental cycle represents a solution to Kirchhoff's equations. [gkrioiuoo]
Proof that all solutions of Kirchhoff's equations may be obtained as sums of multiples of the fundamental cycles. [pg 367]. [guukpwwrs]
Any execution of a computer program that goes from start to stop gives a set of values E1, E2... E27 for the number of times each edge is traversed, and these values obey Kirchhoff's law, but are there solutions to Kirchhoff's equation that do not correspond to any computer program execution? [gporadiod]
Theorem K summarizes preceding discussion [pg 369]. [gipwdfqwq]
2.3.4.2 Oriented Trees
Direct Graph (or digraph); formally defined as a set of vertices and a set of arcs, with each arc leading form vertex v to v'. [geawfsawu]
out degree of vertex: number of arcs leading out from it. number of arcs such that init(e) is equal to v. [gkuwqarup]
In degree of vertex: number of arcs with fin(e) equal to v. [gaekifrfw]
init(e) is initial vertex of e, fin(e) is final vertex of e. [gijldrwff]
oriented walk: e1 ... en arcs from v to v', v is init(e), v' is fin(en), fin(ek) is init(ek + 1) for 1 lte k lt n. [goqwskffs]
Simple oriented walk: init(e1) ... init(en) are distinct and fin(e1) ... fin(en) are distinct. [gljdswdjf]
orinted cycle: fin(en) is equal to init(e1) and simple. otherwise, is path. [ghroehakl]
strongly connected graph: a directed graph, where ether e is an oriented path from v to v' for any 2 vertices v not equal to v'. [gferfepdj]
rooted: at least one root, vertex R such that either is an oriented path from V to R for all V not equal to R. [gudsepduh]
Oriented tree (sometimes called rooted tree) is a directed graph with a specified vertex R such that each vertex (that isn't R) is the initial vertex of any arc, and R is the root. [gahehajak]
Each vertex has unique path to R: no oriented cylces. [gjoapjrre]
An oriented tree is a free tree when direction of arcs is neglected. [gqhhofhup]
ordered, oriented, free: differences in tree structre is in the amount of information taken to be relevant. [ghsrorepl]
Eulerian trail: oriented walk in a directed grpah where every arc in the graph occurs exactly once, and fin(em) is equal to init(e1). [gpeloeisp]
Directed graph is balanced if every vertex has as many in degrees as out degrees, AKA equal number of edges with V as final vetex and initial vertex. [gdpwilsrf]
Isolated vertices: in/out degree equal to zero. [gkjkulkid]
Theorem G: a finite, directed graph with no isolated vertices possesses an Eulerian trail if and only if it is connected and well balanced [Proof on pg 375]. [gekhskkal]
Lemma E [pg 375] [ghflkireh]
Theorem D: essence is that it shows a simple way to construct an Eulerian trail in a balanced directed graph, given any oriented subtree of the graph. [garqsioll]
2.3.4.5 Path Length
Path length of trees often directly related to execution time. [giwjqkojp]
Extened binary tree: binary tree with nodes, representing emtpy subtree. [gwpwwpjfr]
External path length: sum of path lengths form each external node to the root. [gdhiahkff]
External node: terminal or leaf nodes in extended binary tree. [gsqlsoadu]
Internal path length: like external path length, with internal nodes. [gararfjps]
Relation between internal and external nodes: E equals I + 2n. [gflpqeppu]
E: external path length, I: internal path length, n: number of internal nodes. [gkheuhwas]
Formula above can be proven by deleting an internal node some distance from the root [elaborated on pg 402]. [gpqaepfjl]
Internal and external path length greatest when there is a degenerate tree with linear structure. [ghkueuarf]
Average path length over all binary trees is $n\sqrt{n}$ (see ex. 5). [gdolapjlr]
How to construct binary trees with minimum path length? [guoppeqld]
Minimum path length binary trees important for many algorithms because it minimizes computation time. [gqroiqewd]
One node 0 from root (root), at most 2 1 distance away from root, 4 2 away, etc. [gwksudkaa]
Internal path length at least as big as the sum of the first n terms of th eseries: 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, ... [geqawsdoe]
Optimum value is $n \log(n) + O(n)$. [guukaphwj]
Complete binary tree: a binary tree with n nodes that is "filled" completely left to right (as much as possible). [gjrhpfwph]
A complete binary tree can be represented in sequential memory. Locations of nodes are implicit. [gadahkfhf]
t-ary tree: set of nodes that is either empty or conists of t ordered disjoint t-ary trees. [guoaldrkh]
Weighted path: m real numbers w1, w2, ... wm. Construct binary tree with m external nodes, try to minimize sum $\sigma wj$, where l is the length of path to root. Sum is taken over all external nodes. [gjsouwqpa]
Optimal search procedure: use a tree that minimizes the weighted path length. [gfeoupqik]
Huffman's Method: elegant algorithm for finding a tree with minimum weighted path length. [gsqhkdlpe]
Example of Huffman's Method [pg 402]. [grirwpwid]
Proof that HM minimizwes the weighted path length. [goeoiousp]
If weights have ibeen sorted in non-decreasing order: maintain 2 queues: original weights + combined weights. smallest unused sed weight will appear at the front of the queues. [gjdfiquar]
Huffman's Method: every time two weights combined, they are at least as big as the weights previously combined. A neat way to find Huffman's tree. [gfppirqhi]
Huffman's method can be generalized to t-ary trees as well as binary trees. [gaflluasi]
2.3.5 Lists and Garbage Collection
(Note: I made a typo in my handwritten notes and wrote it out as 2.5.3. Oopsies).
Any forest can be regarded as a list, but not vice versa. [gsqqkwsfj]
List head: additional element added to the top of every list. Utilized to make list operations (such as removing an element in a list) easier. [gdhosodwu]
Thought the concept of an "atom" may seem artificial, it is primarily introduced to aid in the effective use of computer memory. [gqsipkwrr]
Lists: linear list whose elements may contain pointers to other lists. [gsojreqok]
Common list operations: creation, destruction, insertion, deletion, splitting, concatentation. [gfakefheu]
Operations usually for tree structures: copying, traversal, input/ouput of nested information. [gsakresdl]
Some important points about the List structures being considered here in this section: [geqjowifs]
1. A TYPE field indicates what kind of information the node represents. [gueasdrrq]
2. Nodes can be formatted in a 1 or 2 word format in MIX. [gquojkhsp]
3. While List structures are not usually the most efficient format for any particular problem, they are quite easy to work with, can be used in a wide range of problems, and can save on debugging time. [grlwrorqp]
5. The problem of maintaining the list of available space is consderably more difficult than discussed previously with simpler cases. [ghlqkrdeh]
Two available methods for maintaining the available space list: reference counters and garbage collection. [gjswqredw]
Reference counter: each node contains a count field, storing how many arrows point to the node. When it drops to 0, it becomes available. [geijjsiqo]
Garabge collection: each node has a 1-bit field known as a "mark bit", used to indicate that the node is not acessible. [gsjpadiio]
Algorithms using GC work until all the available space is used up. A recycler algorihtm uses the mark bits to identify nodes that aren't accessible, and returns them back to available storage. [gijewlqwu]
Drawback of reference counting: does not always free nodes that are avaialbe. [goqeusdhq]
Overlapped Lists fine. Recursive Lists will never be returned to storage by reference counting technique. [gaakrsaqj]
[Meta: Writing style here uses a structure I like that I wasn't sure was "valid". 2 paragraphs talk about 2 things. First paragraph says "there are 2 things", procedes to talk about first thing, second paragraph second thing.] [gqwrepaqw]
Drawback of Garbage Collection: runs very slowly when nearly all memory space is in use. [ghupdsdfi]
Sometimes it is difficult to know when lists should be marked as garbage, especially if the programmer uses nonstandard techniques. [gqjiwlsjl]
GC requires programmers to keep valid info in all pointers at all times. [gpaufadds]
Instead of using a mark bit for each node, keep track of bits in a table. [goufpwrdh]
J. Weizenbaum reference counter technique: use doubly linked List structures and only put a reference counter in the header of each list. [galjjwkwa]
When Reference Counter of list reaches zero, append to end of Available list. Efficient, but "it tends to make incorrect programs run correctly for a while". [golslqdsa]
Tends to be difficult to debug, must be handled carefully. [goqdakuwj]
GC algorithms are useful in other situations that require marking nodes that are directly or indirectly referred to by a given node. [gdhopwhew]
GC typically has 2 phases: mark all nongarbage nodes, then make sequential pass over entire area, putting all unmarked nodes into a free space. [gqpwrhiqh]
Algorithm A (marking) [pg 415] [gsrleskrl]
Difficult to precisely analyze algorithm A. NM time worst case (N is number of cells marked, M is number of nodes in entire memory). [gjdsjwuff]
Algorithm A is too slow to make garbage collection a usable technique. [gsuukhiwe]
Algorithm B: marking, but use a stack to keep track of branch points [pg 415]. [grjkopokj]
Algorithm B performs well, but not usable for GC because there is no place to keep the stack. [gpopuaoud]
Alternative: use a fixed stack size by combining algorithms A + B. [groqhuwki]
Algorithm C: marking which achieves the same effect as Algorithms A + B, using a fixed size aux table as a stack [pg 415]. [gqwdhpkqk]
Algorithm D: marking, but use linked memory to maintain stack, scattering it through the same memory space being garbage collected. [gweiflkew]
Algorithm D requires that all list structures be well formed. [gpauwqifs]
List manipulation algorithms momentarily leave lists malformed. Do not use Algorithm D during these moments. [glfirkips]
Algorithm E: another marking algorithm, following the considerations from Algorithm D. Independently discovered by Peter Deutsch and Herbert Schorr. [pg 418] [gdplswuks]
Idea used in Algorithm E can be applied to other problems, such as tree traversal (see 2.3.1-2 ex). [gspaowiww]
Fastest garbage colleciton method known combines Algorithm B and E (see ex 5). [gioolhplh]
Best GC algorithms have performance time of $c2M$. c1 and c2 are constants, N is number of nodes marked, M is number of nodes in memory. [gpasujjul]
M - N: number of free nodes found. [gpqieekse]
Amount of time required to return nodes to free storage: $(c2m) / (M - N)$. [gqdllerla]
Can be seen that memory availability affects performance: performs well when it is generally available, worse when it becomes more full. [ghresdrer]
Many programs have the property that the ratio $P=N/M$ of good nodes in memory is quite small. [gpkdkukra]
Move active list data to another memory pool of equal size when it fills up, then move back to first when that fills up. [gdlqakiqs]
Useful because more data can be kept in high speed memory at once. [gpjufrlhq]
[Cache locality? Seems like a surprisingly modern consdieraiton for TAOCP] [gldeilkwr]
possible becaus elink fileds tend to link to nearby nodes. [gwdarojow]
Hybrid GC system: combine GC with other methods of returning free cells, use GC as "last resort" when other methods have failed. [gjsisarap]
Sequential representation of Lists can save many link fields at the expense of complexity. [gkuupdhlk]
Reference Counting can work with lists, even when lists point to themselves. [gharewqll]
Ex 12: discusses approaches involving GC in realtime applicaitons. [gjlfokkdl]
2.4 Multilinked Structures
In higher level applications several types of structures are usually represented simultaneously. [geskouuih]
Multilinked structure: nodes with several linked fields in each node. [grefrpjaj]
How much structural information ought to be explicitely recorded in memory? [goilkkfsi]
[COBOL is mentioned. Was it commonlyu used when this section was written? First concrete mention of a programming language up to this point (other than MIXAL of course)]. [gedrihhjf]
Investigate algorithms suitable for use in COBOL compiler: [grjwsidfs]
Process COBOL-like syntax of names and level numbers, to be usable in algorithms. [gidlroeqo]
Determine if given qualified reference (A0 of A1 of ... of An) is valid. [gpjsaohoe]
Find all pairs of items given a CORRESPONDING statement. [gkorrwehp]
Assume compiler already has symbol table routine that will convert a name to a link that points to entry for name. [glareuowe]
Data Table: table in COBOL compiler, in addition to symbol table, that has an entry for each item of data in the program. [gqpfihuws]
Five Link fields in each Data table entry: PREV: previous entry with same name, if any. PARENT: smallest group, if any, containing this item. NAME: symbol table entry. CHILD: first subitem of group. SIB: next subitem in group containing this data. [guoejhaqr]
Link structure will be utilized in algorithm design, even if not all five links turn out to be necessary or sufficient. [gwoupkdup]
[worth examining figures on pg 2427 and pg 428] [gjkfqfoou]
Algorithm A: Build Data table. Input is sequence of pairs (L, P) where L is positive integer Level number and P points to a symbol table entry [pg 428]. [giwepddpo]
After building data table, next problem is how to look up entries given the format A0 of A1 of ... of An. [gfuauaeaj]
A good compiler will also check that the reference is unambiguous. [gsfiuqwuk]
Algorithm B: check a qualified reference. [pg 429]. [gkeddpahr]
Third and final algorithm concerns MOVE CORRESPONDING. [gosjdksak]
MOVE CORRESPONDING a to b. Where a and b are references to data items and MOVE CORRESPONDING is an abbreviation for the set of all statements MOVE a' to b'. [guplkpkar]
recognize all corresponding pairs a' b': move through tree whose root is a, in preorder, simultaneously looking in b tree for matching names, skipping over subtrees where no corresponding elements can occur. [guuwdqfkd]
Algorithm C: find CORRESPONDING pairs. [pg 431]. [grpiesffe]
Each of the 5 link fields in the data table can be viewed as "clues" to the program, each planted to make the program run faster. [gfqqkqifw]
PREV: not used in algorithm C, but very important in algorithm B. Without it, lots of lengthy searches. [gjhjoieuo]
PARENT: used in algorithms B + C, C could have used aux stack instead. [gkfjdiskk]
NAME: used only in steps B6 and C3. Other methods can do the equivalent, but are slower. [gufdldrqj]
If SIB was augmented to be threaded, it would be possible to lcoate PARENT. It is slower. [gakkrjhuq]
Algorithm A: Data table entries usually take consecutive memory locations in the order they appear in COBOL source program. Sequential nature can lead to simplifications. [gfprskpes]
2.5 Dynamic Storage Allocation
Linked memory philosophy: "If there isn't room for the informaton here, let's put it somewhere else and plant a link to it." [geaoihdso]
Dynamic storage allocation algorithms deal with nodes of varying sizes sharing a common memory area. [grokplphk]
Operations: reserving and freeing variable sized blocks of memory from a larger storage area (whose memory locations are consecutive). [giojhjqoe]
Mostly smaller-sized nodes vs mostly larger-sized nodes lead to slightly different approaches to dynamic storage allocation. [gjrdpiloe]
"block" and "area" used to denote a set of contiguous memory locations. [gkjppujki]
Circa 1975, people started calling pool of available memory a "heap". Will not be used t here. only for priority queues. [giuhaeeed]
Problems to address wieh working with dynamic allocation: [grapaajah]
a. How to represent parittioned free space inside computer? [glprekero]
b. Using representaiton. What is a good algorihtm for finding N consecutive free spaces and reserving them? [glkhhwlai]
Representaiton: Keep track of availble space with list. Also maintain list inside space (usually). [gidkfrdkl]
Available segments can be linked together. [gjksokfdi]
Order: size, memory address, or essentially random. [gokuplaod]
Find N consectutive words: location some block of $m >= n$ words, reduce size to $m - n$. [gjeasorei]
Many suitable block sizes exist potentially. How to choose? [giakafphp]
Best-fit method: choose smallest block that is size N or more. [giholrauu]
First fit method: choose first block found that is size N or more. [gliuruluh]
Best-Fit was used for many years, saves the larger areas. But, there are drawbacks. [glosqwpld]
Best fit can be slow, since it involves a fairly long search. This yields diminishing returns. [geeakkdwl]
Best-fit tends to increase the number of very small blocks. Proliferation of small blocks is undesirable. [gljjuoqsp]
Algorithm A: first fit method. [pg 437]. [gjhpeferr]
There is a way to improve Algorithm A. "The reader will find it a pleasure to discover it without being told the secret". See Ex. 6. [gpkwqefde]
Sometimes it is better to reserve the whole block rather than a portion of it when the extra is quite small. (ex: a block of size 1 is rather useless). [gilwrfhqf]
Control word: first word for each variable sized block tha tells how many words hav ebeen reserved, so later the block becomes available, the entire $N + k$ set is freed. [gdoskaiqp]
Liberation: How to return blocks to the avaiable space when they are no longer needed? [gklsiarlq]
[Liberation! what great terminology] [gjqswirow]
GC is tempting, but not recommended. GC often isn't "disciplined" enough with pointers to make the free areas easy to locate. Also, it's slow. [gpdaoqaao]
In GC, adopt a philosphy of returning blocks to AVAIL list as soon as they are free, and collapse adjacent available areas together. [gwisqpfko]
Use GC with process of compacting memory, or moving all reserved blocks into consecutive locations, so that available blocks come together when GC is done. [gkhlrfppf]
Collapasing problem: 2 adjacent free areas should be collapsed into one. When an area bounded by two available blocks is freed, all 3 areas should be merged into one. [gidajkaqo]
This achieves good balanced in memory even though regions are constantly being reserved/freed over long periods of time. [gqkfrsjpd]
Problem: determine whether the areas of either side of the returned block are currently available, and if they are, update AVAIL list properly. [gdffkkiso]
First thing: maintain AVAIL list in order of increasing memory locations. [gqaaopraf]
Algorithm B: liberation with sorted list [pg 440]. [geuwrqjsd]
If AVAIL is unsorted, brute force approach to collapsing problem would require complete search through list. Sorted takes half th etime on average. Algorithm B can be modified to take 1/3 of time. [gafaissop]
Method that eliminates all searching when storage is returned, can be modified to avaoid almost all searching when memory is reserved. [guukfpkdw]
Uses TAG field at both ends of each block, SIZE field in the first word of each block. [gjhpwoero]
Maintain AVAIL as doubly linked list. TAG field at each end of block used to tell if adjacent blocks are available and can be collapsed. [geskjoiai]
Algorithm C: Liberation with boundary tags. [pg 441]. [gqwsreepi]
"Buddy System" dynamic memory: keep separate lists of available blocks of size $2^k$ $0 <= k <= m$ for $2^m$ words. If a request for block size $2^k$ is not avaialbe, a larger block will split apart into 2 blocks, called "buddies". [gklswephi]
The address of the buddy can be obtained from the address and size of block. [ghlueuosh]
The address of a block of size $2^k$ is a multiple of $2^k$, in binary notation, the address has at least k zeros to the right. [ghollsiuo]
Let buddy(x) be the address of the buddy of the block whose address is x and size is $2^k$. [glokqorfp]
$$buddyk(x) = x-2^k$$ if $x * mod(2) = 2^k$. [gdhqqfrws]
Can be computed with XOR function. [gsjsaparf]
1-bit TAG field for each block to indicate reserved/available. [gajahrpdh]
available blocks have forward/backward fields, links for doubly linked list. [goeophkoa]
Algorithm R: buddy system reservation. [pg 443]. [gprfdpsjd]
Algoirhtm s: buddy system liberation [pg 443]. [gepkufaoh]
fifty-percent rule: average number of available blocks tends to be approximately (1/2)pN, where N is number of reserved blocks in system, each equally likely to be freed, and p is some probability value. [ghdeousfr]
When the probability value p is near 1, due to small c value and if block sizes are infrequently equal to one another, there are half as many available blocks as unavailable one. [gkfaoafep]
Besides 50% rule, most knowledge of dynamic storage allocation performance comes from monte carlo experiments. [gleppooeo]
[Knuth describes experiment for simulating dynamic memory allocation, then explains results] [giwfliiqd]
In experiments, first fit performed better than best fit, took longer to memory overflow. [gkfkdqdhe]
Buddy system: found up sizes to nearest power-of-2. Performed much better than expected. [gleojjhoq]
Buddy system could allow for two adjacent free areas of the same size to exist without merging, but is unlikely. [ghdrpqprj]
When overflow occured, memory was at 45% reserved, which is good! [gfjfhwhpr]
Algorithm A, after modification, only required 2.8 inspections of block sizes on average, even though about 250 blocks were present. [gukjredje]
GC: if accessible items never occupy more than 1/3 total space, and nodes are relatively small, best strategy is to divide pool in half and do allocation in one half. [gudiaqaji]
When space fills up, move to other half and remove holes between blocks. [gqauifkeh]
Distributed fit: specialized technique that could be used when distribution fo block sizes is known in advance, and each block present has equal likelihood of being next freed. [gsjfiilsr]
It is generally not worthwhile to write a compacting program to handle overflow, unless it is with GC. [graakrkud]
To handle overflow, remove items from memory, and store them externally, with some way of accessing when deeded. [gferdrfop]
Implication: programs must be severely restricted to what allowable memory blocks they can access. [ghfrkfosi]
Special computer hardware and paging for efficient operations. [gelquuwue]
Ways to determine which blocks are the most likely candidates for removal. [gawspdrkl]
Doubly linked list: move back to front of the list each time it is accessed. [gqhqpudsf]
Circular list: use a "recently used" bit. When it is time to remove a block, move through list, resetting bits until finding a block that has not been used since the last time it reacehd this part of the circle. [gqlslakwj]
Linking automata -> pointer machines. [gawiquwqo]