Recent Entries 10
- pattern major 112d agoMeasuring time complexity in the length of the input v/s in the magnitude of the inputI know that formally the time compliexity of an algorithm is measured in the length of the input, which in binary would be the number of bits required to encode the input. The problem that I have with this definition is that wouldn't this definition make a lot of seemingly obvious poly-time algorithms involving numbers exponential time? For example, consider the algorithm of listing all integers between 1 and another integer n. The algorithm would be a simple loop, listing a number in each iteration from 1 to n. This loop would have $n$ iterations and would thus take $O(n)$ time. Now if $n$ is encoded in $b$ bits then $b = \lg n \implies 2^b = n \implies$ in the length of the input the algorithm is $O(2^b)$ which is exponential, which is very unintuitive because the loop, in each iteration, does constant amount of work and there are $n$ iterations in total. By the same logic, consider a loop that prints Hello World $n$ times, where $n$ is an integer, then by the same procedure as above, this algorithm would also be exponential time in the length of the input $n$. What am I missing?
- pattern minor 112d agoIs Big-Theta a more accurate description of worst case run time than Big-O?Question I was asked: Does it make a difference if I say "The worst case run time is $O(n^2)$ vs the worst case run time is $\Theta(n^2)$?" To me, the only difference is that when we say $O(n^2)$, the function may also be $O(n)$, we do not know. But when we say $\Theta(n^2)$, we know for a fact the function is $O(n^2)$ and $\Omega(n^2)$, because it is bounded by $c_1n^2\leq f(n)\leq c_2n^2$ (correct me if I am wrong). Therefore, can we not say that $\Theta(n^2)$ gives us a more accurate (or at least equal) sense of worst-case run time than $O(n^2)$?
- debug minor 112d agoFind an upper bound for T(n) = T(n/2) + T(n/2 + 1) using the Substitution Method base case failsGiven the algorithm ``` MYSTERY-ALG(n >= 0) 1 if n < 3 then 2 return 1 3 else 4 return MYSTERY-ALG(n/2) + MYSTERY-ALG((n/2) + 1) ``` I defined a recurrence $ T(n) = \begin{cases} 1 & 0 \leq n 0$, must show $T(n) \leq cn$, $\forall n \geq n_0$. $T(n) = T(\frac{n}{2}) + T(\frac{n}{2} + 1)$, since $\frac{n}{2} 2 \implies n_0 > 2 \implies T(\frac{n}{2}) \leq \frac{cn}{2}$ and $T(\frac{n}{2} + 1) \leq \frac{cn}{2} + c \implies$ $T(n) \leq \frac{cn}{2} + \frac{cn}{2} + c = cn + c$ $\overset{?}{\leq} cn$ is not possible since $c > 0$. Try 2 Assume $0 \leq ck - b$ for $k 0$, $b > 0$ must show $T(n) \leq cn - b$, $\forall n \geq n_0$. $T(n) = T(\frac{n}{2}) + T(\frac{n}{2} + 1)$ $\leq \frac{cn}{2} - b + \frac{cn}{2} + c - b = cn - b - b + c = cn - b - (b - c)$ $\leq cn - b$, as long as $b - c \geq 0$ $\implies b \geq c$. From here, it's not clear how to find $b$ and $c$. Strangely, this expression is also independent of $n$. If we take $c = b = 1$, $n_0 = 3$ then the $T(3) = T(1.5) + T(2.5) = 2 \leq 3 - 1 = 2$ bound holds, but the base case, for example, $T(1) = 1 \leq 1 - 1$ fails. Is there any way of making this work?
- pattern minor 112d agoAnalytic combinatorics and less-precise running timesAnalytic combinatorics and concrete mathematics are the mathematics of asymptotic counting, and they draw from combinatorics, analysis, and probability. These techniques have been applied to the analysis of algorithms, e.g. in books by Knuth and (his student) Sedgewick. However, the results tend to be very precise -- they might give a running time as $c n \log(n) + O(n \log(n)^2)$ for example. Running times in Introduction to Algorithms on the other hand generally require simpler math are more likely to be stated as $\Theta(n \log(n))$. My question is whether there are examples where the advanced mathematics of analytic combinatorics is necessary even to get a rough Big-Theta running time. After all, a commonly-applied version of singularity analysis is a statement about Big-Oh bounds: This is from chapter 13 of Selected Papers on Analysis of Algorithms: In chapter 26 Knuth uses complex analysis to find that "the average carry propagation time to add n-place numbers" is Whereas in Introduction to Algorithms we find this:
- pattern minor 112d agoWhy is my implementation of Dijkstra's Algorithm using min heap faster than using an unsorted array for a complete graph?Based on theory, the implementation using adjacency matrix has a time complexity of E+V^2 and the implementation using min heap has a time complexity of (E+V)logV where E is the number of edges and V is the number of vertices. When E>>V, such as for a complete graph the time complexity would be V^2 and (V^2)logV. This would mean that the implementation using min heap should be slower. However I tested both implementations and found that the runtime for min heap is faster. Why is this so? Here is my implementation: - adjacency matrix and unsorted list `def dijkstraUnsortedArr(graph, start): distances = [math.inf for v in range(len(graph))] visited = [False for v in range(len(graph))] predecessors = [v for v in range(len(graph))] distances[start] = 0 while True: shortest_distance = math.inf shortest_vertex = -1 for v in range(len(graph)): if distances[v] - adjacency list and min heap `def dijkstraMinHeap(graph, start): distances = [math.inf for v in range(len(graph))] visited = [False for v in range(len(graph))] predecessors = [v for v in range(len(graph))] heap = Heap() for v in range(len(graph)): heap.array.append([v, distances[v]]) heap.pos.append(v) distances[start] = 0 heap.decreaseKey(start, distances[start]) heap.size = len(graph) while heap.isEmpty() == False: min_node = heap.extractMin() min_vertex = min_node[0] for v, d in graph[min_vertex]: if not visited[v]: if (distances[min_vertex] + d) 0 and self.array[i][1] Here's the graph of the runtime against the number of vertices v:
- pattern minor 112d agoWhat's the runtime complexity of this algorithm for breaking up string into words?I am given a input string $s$ ("bedbathandbeyond") and a set of words {"bed", "bath", "beyond", "bat", "hand", "and"}. I need to divide the input string $s$ into a series of words in the dictionary. In this case, the two allowed outputs would be ["bed", "bath", "and", "beyond"] and ["bed", "bat", "hand", "beyond"]. Both outputs are allowed and none is better than the other. Here is my recursive solution. On input $s$, it either outputs a decomposition of $s$ into a sequence of words in the dictionary, or Fail. - If the input $s$ is empty, output the empty decomposition. - Otherwise, go over all prefixes of $s$. For each prefix that belongs to the dictionary, recursively try to decompose the corresponding suffix, and if successful, output the corresponding decomposition. - If no decomposition was found, output Fail. In order to speed up the procedure, I envision using memoization. That is, I maintain a hashtable with answers to previous queries, and on input $s$, I first check whether I already know the answer for $s$. If not, after computing the answer, I put it in the hashtable. If no memoization is used, the runtime is clearly exponential: we have $O(n)$ branching factor, and $O(n)$ as the depth: $O(n^n)$. With memoization; however, the algorithm seems to be polynomial: the procedure can be called with at most $n$ different inputs, and it then performs a polynomial amount of work on it. I've been told by a very knowledgeable person that it's still exponential. Could someone explain why? I've been thinking very hard about it and I'm not sure why that's the case. If it matters, I'm implementing this algorithm in python.
- pattern minor 112d agoDoes Master Theorem apply to $T(n) = 4T(n/2) + n^2 \log n$Based on CLRS Theorem 4.1, master theorem doesn't apply to $T(n) = 4T(n/2) + n^2 \log n$. However, I saw the 4th condition of master theorem on slides of Bourke. If $f(n)=\Theta(n^{\log_ba}\log^kn)$, then $T(n)=\Theta(n^{\log_ba}\log^{k+1}n)$. So $T(n)=3T(n/3)+n\log n$ can apply to case #2, see for example this question. With the same logic, $T(n) = 4T(n/2) + n^2 \log n$ should be $T(n) = \Theta(n^2\log^2n)$. But it's actually $T(n) = \Theta(n^2\log n)$. Is there anything wrong of my thinking?
- pattern minor 112d agoUpper bounds for a binomial coefficientI have an algorithm with worst-case time complexity in $\mathcal O (\binom{k}{p-1})$, where $k$ is a parameter and $p$ is the input size of that algorithm. I further have determined that $p-1 \leq k $. I need an argument which shows that the runtime complexity is solely a function of its parameter $k$, i.e. that the input size $p$ does not matter for runtime complexity. My own argument goes as follows: $$ \binom{k}{p-1} = \frac{k!}{(p-1)!(k-(p-1))!} \leq k!$$ Where the first equality follows from the definition of the binomial coefficient and the second inequality follows from the observation that both $(p-1)!$ and $(k-(p-1))!$ are greater than or equal to $1$. Thus the algorithm runs in $\mathcal O (k!)$, which is a not a function of $p$ as required. Is this argument correct or am I missing something? Maybe I am overcomplicating things and there is a faster / better argument?
- pattern minor 112d agoWhat is considered an asymptotic improvement for graph algorithms?Lets say we are trying to solve some algorithmic problem $A$ that is dependent on input of size $n$. We say algorithm $B$ that runs in time $T(n)$, is asymptotically better than algorithm $C$ which runs in time $G(n)$ if we have: $G(n) = O(T(n))$, but $T(n)$ is not $O(G(n))$. My question is related to the asymptotic running time of graph algorithms, which is usually dependent on $|V|$ and $|E|$. Specifically I want to focus on Prim's algorithm. If we implement the priority queue with a binary heap the run-time would be $O(E\log V)$. With Fibonacci heap we could get a run-time of $O(V\log V + E)$. My question is do we say that $O(V\log V + E)$ is asymptotically better than $O(E\log V)$? Let me clarify: I know that if the graph is dense the answer is yes. But if $E=O(V)$ both of the solutions are the same. I am more interested in what is usually defined as an asymptotic improvement in the case we have more than one variable, and even worse - the variables are not independent ($V-1\le E<V^2$, since we assume the graph is connected for Prim's algorithm). Thanks!
- pattern major 112d agoChecking equality of integers: O(1) in C but O(log n) in Python 3?Consider these equivalent functions in C and Python 3. Most devs would immediately claim both are $O(1)$. `def is_equal(a: int, b: int) -> bool: return a == b ` `int is_equal(int a, int b) { return a == b; } ` But consider what is happening under the surface. Integers are just binary strings and, to determine equality, both languages will compare the strings bit-by-bit. In either case this scan is $O(b)$ where $b$ is the number of bits. Since integers have a constant size in bits in C, this is simply $O(1)$. EDIT: C doesn't compare bit-by-bit see this answer In Python 3 however, integers do not have fixed size and the scan remains $O(b)$ for the number of bits in the input, or $O(\log a)$ where $a$ is the value of the input in base 10. So if you're analyzing code in Python, any time you compare two integers, you are embarking on a surprisingly complex journey of $O(\log n)$ with respect to the base 10 value of either number. For me this raises several questions: - Is this correct? I haven't seen anyone else claim that Python compares ints in log time. - In the context of conducting an interview, should you notice or care if a candidate calls this $O(1)$? - Should you notice or care about this distinction in the real world? EDIT: It is easily verified (and intuitive) that Python cannot compare arbitrarily large ints in constant time. So a better way to ask question 1 above might be "What (if any) is the justification for calling this operation $O(1)$? Because it's pragmatic? Conventional? Implied by the RAM model?