The
contents of this website, the links contained therein directly and
indirectly, and the contents of the said links, are copyrighted.
They are provided exclusively for
non-profit educational use by the students currently enrolled in this
course and for the duration of this semester. No other use or any use
by others is allowed
without authorization of the professor in this course and copyright
holder or holders.
Lecture Notes
These
Lecture Notes
are subject
to
change as I will cover the
course material. You are
welcome to read them in advance, but by doing so you accept the risk
that the notes you have read might have been changed or deleted.
All
students are
responsible for visiting and reading all the links (whether used in
class or not)
unless noted otherwise.
Here
is
a link to free
Mathematica player: http://www.wolfram.com/cdf-player/. You can open
and view .nb files with it, but you cannot execute them. Use it on you own risk - this is not an endorsement.
Here
is
a
link to video (29
min.) Part 3 (Mathematica + audio) - hints how to type in
Mathematica are at the begining of the video. http://csc.csudh.edu/suchenek/CSC401/Videos/Summations.m4v All students are
responsible for all material, not just for the parts covered by the
slides and videos.
Internal Path Length (ipl) and External Path Length (epl) in binary trees
Here are slides with a
brief review of trees from CSC 311 Data Structures class; they
are
all copyrighted
materials, so the students in this class, and only the students
in this class, are allowed to read them but not to copy or distribute
them: http://csc.csudh.edu/suchenek/CSC401/Slides/Trees.pdf
All students are
responsible for the above material. In the current context, definition
of a binary tree as a set of positive integers that is closed under
positive integer division by 2 is particularly relevant.
Here are definitions of the
ipl and epl in a binary tree (taken from CSC 311 Data
Structures website):
Fact. A binary tree with n internal nodes has n+1 external nodes.
Exercise: Prove it!
Example:
Here is a binary tree with 7 internal nodes and 8 external nodes.
The internal nodes are enumerated with blue numbers, and the external
nodes are enumerated with neon green numbers.
Fact: The minimal
ipl in any binary
tree on n nodes is:
where
Here is a link to
experimental computation of of the first closed-form formula for the
above sum of floors of binary logarithms of consecutive integers:
Link to slides The
following slides are copyrighted
material, so the students in this class, and only the students
in this class, are allowed to view these slides but not to copy or distribute
them. http://csc.csudh.edu/suchenek/CSC401/Slides/Sorting.pdf
private static int
Split(int[] L, int lo, int hi)
//Uses
the
occupant x at index lo in array L as pivot,
//sends
all
elements < x to the begining of the range (lo, hi), and
//all
elements
>=x to the end of that range {
int splitPoint = lo;
int x = L[splitPoint];
for (int i = lo+1; i <= hi; i++) {
if
(Comp(L[i],
x)) //need to move L[i] before split point
{
L[splitPoint]
=
L[i];
cnt2.incr();
splitPoint++;
L[i]
=
L[splitPoint];
cnt2.incr();
} }
L[splitPoint] = x;
cnt2.incr();
return splitPoint; }
Here is a link to an optional
short article about construction of an adversary to C-Heapsort that
forces it to perform big Theta (n^2) comparisons, no matter what "easy"
improvements to pivot selection were made:
The
following video is copyrighted
material, so the students in this class, and only the students
in this class, are allowed to view it but not to copy or distribute
it.
Complete binary tree with nodes enumerated level-by-level.
A heap and 3 non-heaps with nodes showing the values they represent.
Link to slides - demo The
following slides are copyrighted
material, so the students in this class, and only the students
in this class, are allowed to view these slides but not to copy or distribute
them. http://csc.csudh.edu/suchenek/CSC401/Slides/Heapsort_demo.pdf
The best
overall student in CSC 401, CSC 501class and the student who gets the
highest score
on the final exam in CSC 401, CSC 501 class will receive a signed copy
of the
above paper, each.
The exact worst-case
characterization of the entire Heapsort:
Optional
reading (Sections 4 through 7, and 9) for all students regarding
redistribution of scientific discovery and intellectual property.
The following is a copyrighted
material, so the students in this class, and only the students
in this class, are allowed to read it but not to copy or distribute
it.
Link to slides The
following slides are copyrighted
material, so the students in this class, and only the students
in this class, are allowed to view these slides but not to copy or distribute
them.
Same
as
the above in MS Windows journal format. It may be used to transcribe
the handwritten text with handwriting
recognition feature
of
journal. Works in the lab in SAC 2102. http://csc.csudh.edu/suchenek/CSC401/Slides/Graph.jnt
Proof (further explains argument used in the proof of correctness of
Dijkstra-Prim algorithm presented in the file
Spanning_trees_Shortest_path.pdf linked above.)
Note: The running time of any of the
"fast" versions is notO(n)
because n is not a measure of the size
of n. The lg n is! (To be exact, size(n) = floor(log_2 n) + 1, or
anything roughly proportional to it.)
Hence, the said
running time is O(a^size(n)),
where
a > 1, that is,
it
is exponential.
But Fibonacci function's recursive definition has this analytic solution
that can be computed in O((log
n)^k), that is,O(size(n)^k),
for some constant
k. In particular, it can be computed in polynomial time in the
worst case.
Can we
accomplish that for all other problems that run in exponential time in
the worst case?
For NP-complete problems (definition will be given, later) we don't
know.
(Note. The Primality Test was once
suspected NP-complete but now we know a
program, invented by AKS, that
solves it in O((log
n)^k) (polynomial,
that is) time.)
Note. If, rather than considering worst-case running time,
we focus only on the average-case running time then some NP-complete
problems become deterministically solvable in linear time (on average).
Below is a link to a paper I wrote around 1988 which demonstrated that
the CNF satisfiability problem
is solvable in linear (with constant 2) time on
average, under any reasonable comprehension of the
adjective average.
It is a result of the fact that most of Boolean functions on n
variables must necessarily have exponentially-long (in n) propositional
representations. (This observation has been attributed to Claude
Shannon and is often referred to as Shannon's
counting argument.)
In other words, although deciding whether given CNF formula on n
variables may require exponential (in n) running time, the lengths of
vast majority of the shortest CNF formulas of n variables are also
exponential (in n), thus making the average running time for such
decision program linear in the size of its input.
Here is a link to a short and easy-to-read article in Communications of
the ACM that lists some major positive developments in solving
instances of CNF satisfiablity and practical implications thereof: