Reverse : think stack duplicates or string signitures : hashmaps ---------------------------------------------------------------------------------------------------------------------------------------- 2 Object Orienttabed Design ---------------------------------------------------------------------------------------------------------- 2.1 Goals Patterns and Principles - Objects come from classes, which are a specification of teh data members that the object contains, as well as the member functinos. - Each class presents to the outside world a consise and consistent view of the objects that are instance of this class, without going into too much detail, or giving others access to the inner workigs of the object. 2.1.1 Object Orinetated Design Goals - We want to acheive robustness, adaptibility, and reusablilllty. * Robust not only shuold a program give the correct output for all good input, it should also be able to handle bad input, or unexpected input which is not explicitly defined by the application. *Adaptibility Software designed in the past has to meet new changed conditions, should be able to evolve. *Reusability Code should be usable for different applications in different systems. 2.1.2 Object-Orientated Design Principles - 3 main areas; Abstraction, Modularity, Encapsulation - Abstraction 1) To distil a system into its most fundamental part, and describe these parts in a simple and recise language, typically naming and explaining functinolaity of a system 2) Applying the abstraction paradigm, leads to abstract data types 3) An abstract data type specifies what an object does, but not how it does it 4) In c++, the funcitonality of a data structure is expressed through a public interface (sort of the class's public member functions) 5) An ADT is realised by a concrete data structure, which is called a class in c++ 6) A class defines the data stored, as well as the operations supported by it 7) Classes also specify how operations are perfored in the body of each function. 8) A C++ class is said to implement an interface, if its functions include all the functions declared in the interface. - Encapsulation 1) Different components of a system should not reveal the internal details of their respective elements 2) Gives the programmer the freedom in implementing the details of a system 3) The only constraint on the programmer is to maintain the abstraction interface that the outsider sees. - Modularity 1) Modern systems typically consist of several different components that must interact correctly in order for the entire system to work properly 2) Keeping these interactions working requres that these different comonents be well organised 3) This code structuring approach centeres around the concept of modulaity 4) Modularity refers to an organizing pricinpe for code in which different components of a software system are divided into separate functional units. 5) The hieracal structre imposed by modularity helps to enable software reusability, this allows for software to be reused to solve different problems 6) Ane xample given, think how an architect might reuse his design of a wall across houses 2.1.3 Object-Orientated Design Patterns - OOP facilities reusable, robust, and adaptable software -- Good code requires effective use of object-orientted design techniques - A design pattern is a solution to a typical design problem - This pattern gives a template for a solution in an abstract way, to tackle concrete problems - 2 main patters: 1) A patter for solving algorithm design problems 2) A pattern for solving software design problems 2.2 Inheritance and Polymorphism 2.2.1 Inheritance - A technique for reusing code and organising code in a hieracal way for modularity is called inheritance - This technique allows the design of generic classes, that can be specialised into more particular classes - A generic class is also known as a super class, a base class, or a parent class - Any class that specifies a extends a base class, doesn't need to give new implementation for a general functions, since it inheritcs them - Member Functions - :: <- this specifies what class's function we are calling - Protected Members - Child classes don't have access to private memebers of its parents class - Special assess privileges for derived classes can be provided by delcaring memebrs to be protected = A protected member is apublic to all classes derived from thie one, but private to all others - Protcted funcitons are commonnly used for utility functions, which derived classes might find useful - Constructors and Destructors - Constructor called when a class object is created - Classes are craeted base class first, and destroyed in the opposite order - Static Binding - When a child class is created from a base class, the child class becomes a subtype for the parent class, which means that the child class can stand in wherever the base class is acceptable. - Due to static binding, the parents functions will be called when the child member functions are called, this behavior is due to static biding - This is because the determining of whihc members funcitons to call is based on the object's declared type, not its actual type. - So if you have liek an array of pointers, and the pointers are pointing to a Parent class, then you set a parent pointer to a child, and call child's member ufnction, it will call the parent one.v - Use the scope operator :: to pick which classes function to use - Protected Members - A protected member is public to all child classes from it, but are private to all other classes. - Dynamic Binding - Static binding is done by default to dertmine whihc member function to call for a derived class - The alternative is dynamic binding, oa object's content determine whihc member function is called - To specify that a member function should use dynamic binding, the keyword "virtual" is added to the function's declaration - The decision as to what funciton is called is done at runtime, and this is where the keyword "dynamic" comes in - Virtual Destructors - no such thing as virtual constructor, but virtual decontructors are very important - The appropriate destructor needs to be called, depending on the hierachy of the object - if the destructor is noncirtual, then the parent will get destroyed - If a derived class has allocated memory dynamically, calling the wrong destructor would cause a memory leak - Dynamic binding is important since it allows us to create an object, whose behavior depends on its contents, which is fundamental to polymorphism. 2.2.2 Polymorphism - Means of many forms - In OOP, means the ability of a function to take different types - Typically applied in c++ pointer variables - In paticular a pointer declared to any class A, implies that said pointer, can also point to any derived class B, of A - If both these classes have define a virtual function s - Consider what happens when we invoke p->a() - Since dynamic binding is used, if p pointer to B, then it invokes B::a - B is said to override A - Alternatiely, if p points to A, well then A::s is called - A pointer variable which points ti a class object that has at least one virtual function is said to be polymorphic - that is, p can take on many forms, depending on the specific class it is refferring to - Inheritance, poloymorphis, and funciton overloading suppoer reusuable software - Specialisation - Imagine we have a class Dog, which has the functions drink n sniff. - If you specialise this class to Bloodhound, you woudnl't override drink, but you would override the sniff function - Bloodhound specifies the base class function - Extension - Writing new functions which arent in the base class, but are needed - Eg: Imagine extending dog for border collie, this might mean adding a "herding" function, as border collies have this and orther dogs dont 2.2.3 Examples of polymorphism 2.2.4 Muliple Inheritance and Class Casting - Muliple and Restricted inheritance - Althought MI can be useful, especially in defining interfaces, it inroduces a number of complexities ( like variables having the same name in base classes, guess it sort of breaks encapsulation) - There are three types of inheritace, public, protected and private - Protected Inheritance: Fields in the parent class which are public, become protected in the child class - In Pricate Inheritace: Fields in the parent class which are public, become private in the child case - These aren't used so often - Casting in an Inheritace Hierarchy - An object can be viewed as being of various types, but it can be decalred as only one type - Therefore a variable's delccared type determines how it is used, and even determines how certain functions will act on it - Enforcing that all variables be typed and the operations delcare the types they expect is called STRONG TYPING, which reduces bugs tbf - Nonetheless, sometimes we have to change the variables explicit type, and this is done by casting - the syntax for this is "dynamic_cast(expression) - Dynamic casting can only be applied to polymorphic objects, that is, objects tht come from a class which have at least one virtual function - Dynamic casting is most often applied for casting pointers within the class hierachy - If an illegal pointer cast is attempted for casting pointers within the class, the result is a null pointer, so its good to have null checks 2.2.5 Interfaces and Abstract Classes - For two obejcts to interact, they must "know" about each toher's member functions - To enfore this, the OOPDP asks that classes specify the API or interface that their objects persent to other objects. - In the Abstract Data Type approach, an interface defining an ADT is specified as a type definition and a collection fo member functinos for this type, with arguments for each function being of a specified type (think like a Java interface) - An interface is a colletion of function declarations with no data and no bodies - When a class implements an interface, it must implement all of the given member functions - C++ doesn't have a formal way of implementing interfaces, but we have an informal way of implementing them -eg: A stack container, we can provice a minimal interface in the class definiction, with no filled out methods - this isn't a valid class in c++ tho, its just a documentation of one - Abstract Classes - An abstract class in C++ is a class that is used only as a base class for inheritance, and can;t be actualyl instantiated - To handle not filling in the function, we need to tell the compiler that the class is abstrat - This is done by specifying that one or more memebrs of its functions are abstract, or pure virtual. - A function is defined as being pure virtual by giving "=0" in place of its body - C++ then won't allow creation of a class which has one of more function bodies of type "=0", and these functions are called pure virtual - Interfaces and Abstract Base Classes - Use an abstract base class as an interface for an AbstractDataType, which can then be extended into a concrete defintion / class - 2.3 Templates 2.3.1 Function Templates - Imagine you have a funciton which takes two ints, and returns the smallest one - This is a bloody handy function! Wouldn't it be good if it worked for more types, liek longs and shorts?? - Each one of these functinos would require a fifferent declarations and definition, however, making many copies of the same function is error prone - C++ provices an automatic mechanism, called the function template, to produce a generic function for an arbitraty type T - A function template provices a well-defined pattern from which a concrete function mayb later be formally defined / instantiated - You do this by placing "template " above the function declatation - The generics given have to be of the same type, if only one generic template ting is made. - The compiler will just look at the object type give, and determine what function to finstatntinoalt 2.3.2 Class Templates - Classes can be templated, which is powerful, as it allows us to provice one datastructure for different types - The Standard Template Library uses class templates extensively - To instantiate a template class, we provice the type to the constructor in "" format :) 2.4 Exceptions 2.4.1 Exception Objects - In C++, exceptions are thrown by code that encounters some unexpected conditions - This can happen at ocmpile and runtime - A thrown expection is caugh by other code which handles the exception "somehow" or the program is terminated - They are a relatively recent addition to C++, prior, the program usually just aborted at the source of the error, by having the function return some sort of special value - Exceptions are much cleaner, for historical reason, the C++ Standard Library won't use exceptions, and use statuses or flags instead. - Exception types oftern form hierachies; eg: MathExceptionrepresents all mathematical errrors, then derive morespecific expections for particular conditions 2.4.2 Throwing and Catching Exceptions - Exceptions are processed in the context of "try" ad "catch" blocks - A try block is a block of statements proceeded by the keyword try, there can be multiple catchblocks (mental i know, is it ? ) - a throw statement is used in the try statement to go into a given catch block 2.4.3 Exception Specification - When we declare a function, we shodl also specify the exceptions it mught throw - Lets user and coompiler what to expect - by doing this, we don't have to have a try-catch block around them - also a function can have liek as many exceptions as it wants, its a good this to derive them/ link them from a base class - You could aslo have a generic base class ---------------------------------------------------------------------------------------------------------------------------------------------------- 3 Arrays, Linked Lists, and Recursion ---------------------------------------------------------------------------------------------------------------------------------------------------- 3.1 Using Arrays 3.1.1 Storing Game Entries in an Array - Storing objects in arrays is a common use for arrays - Insertion - The approach is to shift all the entries ot the arrays smaller than the index given, to the right - Similar with removal, you need to thrown an error if the index given is bigger than the size - Sorting an Array standard tbh 3.1.3 Two-Dimensional Arrays 3.2 Singly Linked Lists - Arrays are nice and simpmle for sotring things in a certain order - They are not very adaptable (insertions, deletions, and resizing are hard) - A linked list is a collection of nodes that together form a linear ordering (follow the leader game) - if you put a const keyword after the function declaration in c++ it makes it illegal for that function to change a members value of that class - a friend class is one that can acceess private and protected methods of the class it was declared in 3.2.2 Insertion to the Front of a SLL ---------------------------------------------------------------------------------------------------------------------------------------- 5 Stacks, Queues, Double Ended Queues ---------------------------------------------------------------------------------------------------------------------------------------- 5.1 Stacks - Stack container: Last in first Out - Objects can be inserted at anytime, but only the last object can be removed - Internet Browsers store last visited address in browsers, Text editors undo moves are stored in a stack - Push, Pop, Top, size(), empty - Can be made in an array (vector), or in a generic linkedlist format (trade off between dynamic memory) - easy to revese things (liek a vecotr) with a stack 5.2 Queues - First in First Out - enqueue = insert element at the rear of the queue - dequeue = remove elemnt at the front of the queue - front = returns, but don't remove fisrt element - Can be made with a circular array (vector), or in a generic/circular linkedlist format (trade off between dynamic memory) - Used to make a breadth first search of a binary tree 5.3 Double Ended Queues - Supports insertion and deletion at both the front and rear end of the queues - This is sometimes called a deck - Implemented with a doubley linkedlist (8.1) 5.4 Priority Queue - An abstract data type for storing a collection of prioritised elements that supports arbitrary insertion, but supposrts removal of elements in order of priority, that is, the element with first priority can be removed at any time. - This is fundamentally different from all position based structures previously mentioned - Uses keys, which must fit the equality property (reflexive, antisymmetric, and transistive) - A priority queue is a container of elements, each associated with a eky - The key determines the priority - insert(e) = insert the element e (with implicit associated key value) - min() = return an element in the PQ with the smalled associated key valye - removeMin = remove from the PQ the element min() - More than one element can have the same key - An examlpe of usuage coupld be on a certain flight being fully booked, there are some cancellations, the airline maintains a priority queue of stanby passengers hopeing to get a seat. Priority deteremined by the fare paid, the frequent flyer status, and the time when the passenger is inserted into the priority queue - A priority queue can be implemented by using a linkedlist, but is more regulary implemented with a heap, which is likek a binary serahc tree - Can be used to do a heap sort ---------------------------------------------------------------------------------------------------------------------------------------- 7.1 Trees ---------------------------------------------------------------------------------------------------------------------------------------- - Nonlinear data structure - Much faster sequential data structure than vectors or arrays for some algorithms - Organise data in a hieracy 7.1.1 Tree Definitions and Properties - Top element called the root - A set of nodes storing elements in a parent child relationship - It can be emptry, where there are no nodes - Nodes are siblings if they have same parent - A node is external if it has no children, else it is internal - An edge, is a pair of nodes - A path is a sequence of nodes such that any consecutive nodes form an edge - Orederd if there is a llinear relationship between nodes such that we can search for them 7.1.2 Tree Functions - Size, left, right, parent, content ---------------------------------------------------------------------------------------------------------------------------------------- 7.2 Tree Traversal Alogrithms ---------------------------------------------------------------------------------------------------------------------------------------- 7.2.1 Depth and height - Depth is the number of ancestors of p, exclusing p itself - Height is the maximum length of a path in the tree 7.2.2 Preorder Traversal - Root of node is visited first, then its children are traversed recursively - If the tree is ordered, then the subtrees are traveresed according to the ordedr of the children 7.2.3 Postorcer Traversal - Opposite of preorder - recursively traverses the subtrees rooted at the children of the root first, before visiting the node 7.2.4 Bredth First Search - Searches all nodes on the same level at the same time, before progressing to the children - This is done iteratively by using a queue ---------------------------------------------------------------------------------------------------------------------------------------- 7.3 Binary Trees ---------------------------------------------------------------------------------------------------------------------------------------- - A binary tree is an ordered tree in which every node has at most two children - Each child is labeled as either left or right - a left child preceedes a right child in the ordereing of childres of a node - This can be used to work out arithmetic expressions, when the value of the expression are placed as the leafs on a node - Binary trees can be mde with a linkedlist, or indeed a vector 7.3.5 Vector - Done in a vector, by simpley numbering the nodes. - The numbering is done by a way called level numbering root = f(v) = 1, left leaf = 2f(u), right leaf = 2f(u) + 1 - Has a bad usuage of space, as some array indexes won't get used. but it is very efficient Binary Search Tree - elements on the left n right of are ordered - Euler Tour - Template function pattern is the name of the pattern used for the traversal methods mentioned You can represent general trees with binary trees ---------------------------------------------------------------------------------------------------------------------------------------- 8.3.1 Heaps ---------------------------------------------------------------------------------------------------------------------------------------- - A heap data strutue is a binary tree that stores elements with their associated keys at its nodes and satisfies two additional properties 1 a relational property defined in the terms of the way keys are stored in t 2 a structural property defined in terms of the nodes itself 1- uses a heap order priority, in a heapa T, for every node v other than the root, the key associated with v is greater than or equal to the key associated with v's parent. The property used to sort each pair saved inside a node in the heap is called a comparator. You IMPLEMENT A PRIORITY QUEUE WITH A HEAP ---------------------------------------------------------------------------------------------------------------------------------------- 9.1 Maps ---------------------------------------------------------------------------------------------------------------------------------------- - An Absract data type which allows us to store elements so that they can be located quickily. - Typically additional information besides its serach key - Typically a map stores key-value pairs called entries (Composition Pattern Pair) - In addition, the map requires the keys to be unique (not in multimap - Defined as associative stores - Can find entries using both a key, and an iterator ---------------------------------------------------------------------------------------------------------------------------------------- 9.2 HashTables ---------------------------------------------------------------------------------------------------------------------------------------- - Keys associated with values are thought of as addresses - Two major Components, bucket arrays, and a hashfunction 9.2.1 Bucket Arary - A bucket array for a hash table is an array A of size N, where each cell A is thought of as a bucket (a collection of key-value pairs). N is the capacity of the array - If the buckets have single sole unique entires, then retrieval is o(1) - However, space used is proportional to N, thus is N is too big, we have a waste of space, the second drawback, is that it requires keys to be within the range 1 - N integers. this is not always the case 9.2.2 Hash Functions - A hashfunction maps each key in out map, to an integer in the range [0, bucket array capacity minus 1] - Equipped with this, we can apply the bucket array method to arbitray keys - Howver, if there are two or more keys with the same value, this will cause a COLLISION - A good hashfunction minimises collisions, by distributing keys evenly thoughtout an array - a hashfunction should give the probability of two different keys getting hashed to the same bucket = 1/array of buckets capacity - For practical reasons, a hashfunction should be fast n easy to compute - A hashfunction has two actions; mapping the key to an integer (the hashcode), and then mapping the hashcode to a range of indicies [0, bucketarray capacity]. This second part is called the compression function 9.2.3 Hash Codes - All datatypes must be converted into an integer - char, short, int, can all be cast into an int easily in c++ - long has twice the representation of an int - don't cast long down to int, loses half the information given in original value - This will cause more collisions .:. take all info into consideration - Much better to break the component down and take the summation of it - Better for floating points too, so we don't lose info - Polynomial Hashes - Summation described above is not good for charater strings - Imagine converting chars to ASCII and then summing, suddenly, the word spot is now equal to stop, because have the same letters (unwanted collision) - here, use a polynomial hash. This is a hashfunction, which takes the components of an object x, in an order, given as coeficcients to the summation function, this becomes a polynomial hash code. - This usually gives a nice spread for English words - Cyclic Shift Hash Codes - Replaces polynomial hash funcitons, by giving a cyclic shift, of a partial sum by a certain number of bits - Hashing floating points - int and float are 32-bit quantities on most machines - don't hash the same way, as would lose fractional part - C++ offers an option of reinterprit cast to get round unrelated types - Treats quantities as a sequaence of bits, and makes no attempt to intelligently convert the meaning of one quantity to another 9.2.4 Compression Functions - hash code is not suitable for immediate use, as the hashcode > bucket arry N = Array Capacity - Division Method h(k) = k mod N - If we take N to be prime, this spreads out the distributed hashed values (eg: imagine dividing by 100, and 101) - Multuply, Add, Divide Method (MAD) - More spohisticated method of spreading hashcodes evently through hashbuckets h(k) = ak + b mod N (where N is a prime number, a and b are nonnegative integers) 9.2.5 Collision - Handling Schemes - HT: to take a bucket array, and a hash function, and use them to implement a map by storing each entry (key, value) pair in the bucket - Collision is two distinct keys mapping to the same bucket - Separate Chaining - For each bucket, store a linkedlist inside the given bucket, this is fine if the collisions aren't too often - This can take up a lot more space (think space time trade off between open addressing) - Average of O(1), but if just one list, then this becomes o(n) <- this depends on the load factor (n / N) - Open Addressing - Separate chainignn good, but requires an auxially data structure - However, open addressing will reassign values into other unused buckets in the array of bucket cells itself Linear Probing - If we insert an entry into an already occupied bucket, then we just try the next bucket in the list instead. - This means, we have to alter our search strategy, to double check items are in the correct bucket - Although this saves space, and is quite quick, it does lead to clustering fairly quickly as well. Quadratic Probing - Iteratively checking for free next buckets, but by squaring the distance between bucket places - takes one step longer, but reducces clustering even more Double Hashing - If there is a collision, use another hashing function, to get another bucket, far away to reduce clustering - This is a bit slower than the other two, but reduces clustering even more so 9.2.6 Load Factors - To keep hashtables effective, the load factor needs to be kept minimal - Load Factor = number of pairs to store / Number of buckets in hashtable array (its a ratio) - Has to be super low for open addressing, but can be 0.9 for Separate Chaining ---------------------------------------------------------------------------------------------------------------------------------------- 9.3 Ordered Maps ---------------------------------------------------------------------------------------------------------------------------------------- - For not onlny knowing the key value pair, but also the order entires in a map - Gives more funcitonality, but at a trade off 9.3.1 Ordered Search Tables using Binary Search - Need a sequential data strucutre - Can't really store in a vector or so, as o(n) for worst case for searching for values - However, we can use a Binary Search Tree: o(log n) for searching for pairs 9.3.2 Applications of Ordered Maps - Maximal Pairs (if pushed, talk about flight entries in a database?) ---------------------------------------------------------------------------------------------------------------------------------------- 9.4 Skip Lists ---------------------------------------------------------------------------------------------------------------------------------------- - Makes random choices in arranging the entries in such a way, that search and update times are o(log n) on average - Time compmlexity does not depend on the distribution of keys, instead it depends on the random number generator in the implementation of insertions - A skip list for a map consists of a series of lists, where each list stores a subset of the entires of M sorted by inscreasing keys plus entries w ---------------------------------------------------------------------------------------------------------------------------------------- 9.5 Dictionaries ---------------------------------------------------------------------------------------------------------------------------------------- - Like a hashmap, however, dictionaries allow multiple entries to have the same key - (like english dictionalty, multiple defition of the same word) - When you find an element, you find references (for an iterator), which reference all mathcing pairs --------------------------------------------------------------------------------------------------------------------------------------------- 6.1 Vectors --------------------------------------------------------------------------------------------------------------------------------------------- - The index is the number of elements before the given values - A collection of element sin a linear order is a list - A sequence/list that supports access to its elements via an index is a vector - Example implementation of the Abstract Data Type Vector - An array that automatically resizes - The array we choose is sufficiently large such that we don't run out of memeory, but not too large that the array cannot be stored in contiguous memory - Complexity - To add elements to this array implemnetation, we need to - If a class handles memeory, they need a constructor, destructor, and a copy constructor, and an assigment operator, to have robust memeory safety.