Coding Horror on the BigO
[ 000957.html ]
Assigned work -- send a question and bring a deck of cards.
Chapter 19 pages 923-924 -- Searching and sorting
Can we make algorithms do anything else other than search?
Yes .... examples??
Chapter 19 pages 923 -- Algorithm timing
How big does a program have to be (or any other factor) to notice a difference between one algorithm and an other algorithm?
Twelve (12) lines can do it. Indeed faster algorithms are often bigger than slower ones. The extra choices and loops can reduce the amount of work to be done.
Exception -- a badly coded slow algorithm can be a lot longer than an elegant fast algorithm.
Chapter 19 pages 923 -- big O notation
How would one find out what the big O notation would be for an algorithm?
Take CSCI 431 -- Algorithm Analysis. To prepare for these -- CSCI330 and MATH 372.
Chapter 20 pages 976 -- Algorithms
Is there an relatively easy way of determining an algorithm's worst case scenario short of trying every possible outcome?
Hire a computer scientist to do it for you!
Or look in one of the reference books -- Knuth -- He has analysed evrything -- and is in the library.
In most cases you just look at the loops and think about the maximum times they can possibly repeat. In particular look for deeply nested loops.
Example -- if you see
for(int row=0; row<SIZE; row++)
for(int col=0; col<SIZE; row++)
for(int pivot=0; pivot<SIZE; row++)
do_something_simple;it doesn't take much to see at least an O(SIZE^3) time.
Chapter Handout pages 10-11 -- Big O Notation
Could you explain a little more in depth what this notation does and when you would to use it?
In CS202 you use it like the EPA sticker on an automobile. The first guideline to choosing a good solution to a problem.
Chapter 19 Pages 923-924 Linear Search Big O
Can you explain Big O notation?
See below
Chapter 19 pages 923 -- Big O Notation
Can you explain the Big O notation and why it is used to estimate the worst-case run time for an algorithm.
I can try.... see below.
Another student said that the Wikipedia link is helpful.
Ch. 19; Pg: 923; Subject: Big O
How does Big O notation work?
It works because we do. It takes some math to calculate the big-O notation -- but less than figuring out the exact formula. We can then tell how the program will perform in the worst case. To understand this takes some some math.
Draw up a table that shows
Table
| 3*n | n^3 | 3^n |
|---|---|---|
| 0 | 1 | 3 |
| 3 | 8 | 9 |
| 6 | 27 | 27 |
| ... |
In the first column add 3 each time. In the second column calculate the cube. In the last column multiply by 3 each time.
Chapter 19 pages 931 -- log and algorithms
can you please elaborate on the relationship between log and algorithms?
They both share the letters "l", "o", and "g":-)
But it is striking that the mathematical "log" function turns up when we calculate the speed of an algorithms. So does the inverse of the logarithm function the exponential function-- "exp".
Log pops up whenever we have a good "divide and conquer" algorithm.
The exponential (exp(n)) seems to turn up whenever we have an interesting or difficult problem. Especially -- trying to find the "best" sequence or subset for a set of items according to complex rules.
Here is a table of exponentials:
Table
| n | 2^n |
|---|---|
| 0 | 1 |
| 1 | 2 |
| 2 | 4 |
| 3 | 8 |
| 4 | 16 |
| 5 | 32 |
| 6 | 64 |
| 7 | 128 |
| 8 | 256 |
| 9 | 512 |
| 10 | 1024 |
You add exponents to multiply the exponentials.
Here is a table of logarithms with base 2:
Table
| lg(n) | n |
|---|---|
| 0 | 1 |
| 1 | 2 |
| 2 | 4 |
| 3 | 8 |
| 4 | 16 |
| 5 | 32 |
| 6 | 64 |
| 7 | 128 |
| 8 | 256 |
| 9 | 512 |
| 10 | 1024 |
Chapter 19 pages ppp-ppp -- searching and sorting
what key aspect of both the binary search and the merge sort accounts for the logarithmic portion of their respective Big Os?
Recursion! Both use the "Divide and Conquer" strategy. They both divide the problem into two equal
halves and then reapply the same algorithm to each half. This accounts for the "log(n)" term.
Chapter 19 pages 932 -- Merge Sort the best?
Are there other sort algorithms that are more efficient then Merge that the book does not mention?
The best algorithm is not defined until you define the data that is being sorted. If, for example, you have a lot of data that you know is pretty random then Quick sort will be (on average) be twice as fast as mergesort. But, on some data, in the worst case, quicksort is much slower than merge -- and your users complain.
The best sorts are those with O(n log(n)) performance. We know we can not beat that.
They are much better (on large data) than the n^2 sorts.
Trouble is, there exist data where the best sort is bubblesort -- a sort that is horrible on large random data. If two or three items are out of order then bubble sort runs very fast.
So it is a question with a complex answer.
In practice, you should use the functions you have in the library. You won't be able to beat them:-(
Can you give an example of sorting a vector using the recursive merge sort algorithm?
Using one of the C++ <algorithm> functions: [ timeMergeSort.cpp ]
[ mergeMainC.cpp ] [ mergeSortC.h ] [ mergeSortC.cpp ]
Chapter 19 pages 932 -- Merge Sort
Can you give a clearer explanation of how Merge Sort works?
I can try.... I'll do a demo if we have time.
Why are constants thrown out when trying to figure out the order of the big_O?
Partially convenience. The constants tend to reflect the speed of the computer. You can always get a better CPU and solve the problem faster. Worse -- we can show that clever programming and using more storage can speed up an algorithm by a fixed constant multiplier.
The nonconstant part tells you the important properties of the algorithm. For example
calculate the table
Table
| n | 100*n | n^2 |
|---|---|---|
| 10 | 1000 | 10 |
| 20 | 2000 | 400 |
| ... |
Ultimately an a*n is overtaken by b*n*n, when n is biggest. The a and b just tell you when it happens.
So most computer scientists don't care whether an algorithm takes 10*n^2 or 5*n^2.
(Note I have colleagues who like to argue that we should worry about the constants... He always reminds me that shifting a C++ program from using vectors to using arrays always makes it run faster. I reply that it also adds bugs...).
It means that the problems can be solved -- if we have time to solve them. Small amounts of
data a done quickly. Large amounts of data take enormous amounts of time.
The time increases drastically
with the size of the problem. With an "exponential algorithm", each time you add one item to
the data the time is multiplied by a number greater than one. Like this
Table
| Size of data | Time doubles |
|---|---|
| 1 | 2 |
| 2 | 4 |
| 3 | 8 |
| 4 | 16 |
| ... | |
| 10 | 1024 |
| ... | |
| 20 | ???? |
Exercise -- how long is a million seconds? a billion seconds? a trillion seconds?
What does it mean when a algorithm is tractable or intractable?
A tractable problem has an algorithm that solves the problem in polynomial time. This means that for size n the time is O(n^p) for some p.
The intractable problem haven't got a known polynomial algorithm to solve them. Worse they can all be used to solce each other -- they are equivalent. Think of it. several hundred different problems that we can't find a fast solution... AND if any one of them did have a polynomial solution .... then they ALL would.
Worse we can not prove that there is NO polynomial solution.
This gets rather technical. But if you can find a polynomial solution to one of the
intractable problems:
Most computer scientists think that this won'e happen. Instead someone will become rich and famous by showing that it can not be done.
Chapter 19 pages 939 -- libraries
Could you give a brief history of the <algorithm> library?
A big debate on the internet/usenet/Google groups.... much experimental programming. The specs put into the standard. SGI releases the source code and every body (but microsoft) uses it.
Has it been included in C?
It does not fit -- uses templates -- C has no templates.
Are there other algorithm libraries in C++?
Probably. BOOST probably has some, and the next revision of the standard will add more.
See [ 19.html ] later.
Chapter 19 pages 938-939 -- searching and sorting
Which type of search and sort seem to be the most popular
The ones in your library!
In practice you need to go beyond the theoretical Big-Os and think about the particular problem in hand.
(1) What do you know about the data? If there is any thing special about the data you can probably come up with a variation of an algorithm that takes advantage of the quirk in the data.
(2) How fast must it respond to keep the users happy? No point in wasting programmer time (expensive) to save CPU time(cheap). But if you have to respond quickly (Think games, heart pace makers, missiles, autos, proton therapy, ....) then it is worth tuning the algorithms.
Chapter 19 pages 923 -- algorithm
Is a binary search quicker then a linear search an whats the difference
See next.
Chapter 19 pages 932 Algorithms
Merge sort and Binary search are the most efficient algorithms,
are they also the ones that will get used the most?
No! The sort you will find on most computer systems is "quicksort with randomization". For small amounts of data most programmers use "Insertion sort" which is faster on upto about 10 items. Mergesort is not used because it may have a good worst case but its average performance is worse than quicksort. We make a tricky change to quicksort so that the worst case never happens twice on the same data! This keeps the users happy.
I expect the card searching demo to clarify the choice between linear and binary search.
I will post my conclusion later.
Chapter 19 pages 925-930 -- Binary search
Can you give another example of a binary search algorithm and a further explanation of why binary search is more efficient than linear search.
Try looking up something in the index of the book. Demo.
I just wrote [ binary.cpp ] which uses a binary search to find roots of certain equations.
Chapter 19 pages ppp-ppp -- Linear/Binary
Can you give board examples of both please?
See next.
Which search method is better, linear or binary search
Good question -- let us find out.
Time for a demo with cards....
Special Preparation: Bring a deck of playing cards
Demonstrations of sorting and searching using playing cards.
. . . . . . . . . ( end of section CSci202 Computer Science II, Session 15 Algorithms) <<Contents | End>>
Laboratory 7 Comparing Algorithms for Sorting ??
[ lab07.html ]
Next -- Linked Data Structures
[ 16.html ]
Project 4 is due in next class!