Recent Entries 10
- pattern minor 112d agoFactor a number in the longest possible product of distinct numbersI got stuck with quite a simple problem: Given a positive number $X$ find the largest number $k$, for which exists the positive distinct integers $Y_1,…,Y_k$ such that $(Y_1+1)(Y_2+1)⋯(Y_k+1)=X$ Any of my approaches based on integer factorization or working with the divisors of the $X$ have failed. For example all my tries have failed to solve the problem for $X=684913065984000$. A correct solution in this case is $k=15$, with $Y=[1,2,3,4,5,6,7,8,9,11,15,19,23,31,63]$. Can anyone help me with a solution? My guess is that we may need some DP but I can't come with any kind of practical approach.You may find this problem on the Kattis online judge here
- pattern minor 112d agoPrecise algorithm for finding higher order derivativesI'm trying to make an algorithm that finds the first 10 or so terms of a function's Taylor series, which requires finding the nth derivative of the function for the nth term. It's easy to implement derivatives by following the definition of the derivative: $$f'(x) = \lim_{h\to0}\dfrac{f(x+h)-f(x)}{h}$$ implemented here in Python: ``` dx = 0.001 def derivative(f, x): return (f(x + dx) - f(x)) / dx ``` The value seems to be even closer to the actual value of the derivative if we define it like this: ``` dx = 0.001 def derivative(f, x): return (f(x + dx) - f(x - dx)) / (2 * dx) ``` which just returns the average of `(f(x + dx) - f(x)) / dx` and `(f(x) - f(x - dx)) / dx`. For higher order derivatives, I implemented a simple recursive function: ``` dx = 0.001 def nthDerivative(f, n, x): if n == 0: return f(x) return (derivative(f, n - 1, x + dx) - derivative(f, n - 1, x - dx)) / (2 * dx) ``` I tested the higher order derivatives of $f$ at $1$, where $f(x)=x^9$, and as can be proved by induction, $$\dfrac{d^n}{dx^n}(x^k)=\dfrac{k!}{(k - n)!}x^{k-n}$$ Therefore, the nth derivative of $f$ at $1$ is $\dfrac{9!}{(9 - n)!}$. Here are the values returned by the function for n ranging from 0 to 9: ``` n Value Intended value ----------------------------------- 0 1.000 1 1 9.000 9 2 72.001 72 3 504.008 504 4 3024.040 3024 5 15120.252 15120 6 60437.602 60480 7 82298.612 181440 8 32278187.177 362880 9 95496943657.736 362880 ``` As you can see, the values are waaaay off for $n$ greater than $5$. What can I do to get closer to the actual values? And is there an algorithm for this that doesn't have $O(2^n)$ performance like mine?
- snippet minor 112d agoHow can I compute logarithm when comparison is undecidable?In Haskell, I have the following datatypes that encodes arbitrary real numbers and arbitrary complex numbers, respectively: ``` newtype ArbReal = ArbReal {approximate :: Word -> Integer} data ArbComplex = ArbReal :+ ArbReal ``` For the `ArbReal` type, the `ArbReal` constructor labels a function that, when fed an integer $n$, computes the encoded real number to $n$ decimal digits below the decimal point, rounded. For example, when `ArbReal f = pi`, `f 0` = 3, `f 1` = 31, `f 2` = 314, and so on. Note that there is no guarantee to the direction of rounding. Given `ArbReal g = 0.5`, `g 0` can be either 0 or 1. This is inevitable, for if there were, an interval would be decidable. `ArbComplex` encodes a complex number by specifying its real part and imaginary part. I've successfully implemented addition, subtraction, multiplication, and division on both types. Division by 0 falls in an infinite loop, though. I also implemented nth root function of real numbers, square root function of complex numbers (where branch cut doesn't exist, hence multivalued), and $\pi$. Now it's time to implement natural logarithm (on complex numbers, without a branch cut). And that's where a problem emerged. I was implementing the algorithm (namely, AGM iteration) in this paper, but: Finally, if $0< x <1$, we may use $\log(x) =−\log(1/x)$, where $\log(1/x)$ is computed as above. This paragraph forces a comparison, which is undecidable. So it's impossible to implement the algorithm directly. Indeed, in my current version of implementation, $\log 1$ falls in an infinite loop. Is there a tweak on the algorithm that makes the algorithm computable? Or must I implement a completely different algorithm?
- pattern minor 112d agofast and stable x * tanh(log1pexp(x)) computation$$f(x) = x \tanh(\log(1 + e^x))$$ The function (mish activation) can be easily implemented using a stable log1pexp without any significant loss of precision. Unfortunately, this is computationally heavy. Is it possible to write a more direct numerically stable implementation which is faster? Accuracy as good as `x * std::tanh(std::log1p(std::exp(x)))` would be nice. There is no strict constraints but it should be reasonably accurate for use in neural networks. The distribution of inputs is from $[-\infty, \infty]$. It should work everywhere.
- pattern minor 112d agoRobust two lines/segments intersection point in 2DGiven two line segments the problem is to find an intersection point of corresponding lines (assuming that they are not parallel or coincide). There is a Wikipedia article which gives us exact formulas, but there are two of them: one that uses `t` ratio and approaches intersection point from first line segment and the other -- uses `u` and the second line segment. How can I select which one to use in my scenario? For example: my initial implementation which always used `t` failed on ``` first_segment = Segment(start=Point(x=-5, y=0), end=Point(x=72057594037954921, y=0)) second_segment = Segment(start=Point(x=0, y=0), end=Point(x=0, y=3)) ``` gives ``` Point(x=5.921189464665284e-16, y=0.0) ``` which is incorrect, but when I switch to use `u` or change order of arguments it gives me correct ``` Point(x=0.0, y=0.0) ``` So my question is: is there robust way to calculate intersection point?
- debug minor 112d agoIs order of matrix multiplication affecting numerical accuracy of the result?I have to multiply three matrices of floats: `A` (100x8000), `B` (8000x27) and `C` (27x1). Is there any difference in accuracy between `A(BC)` and `(AB)C`? If yes - how may I determine the more accurate multiplication order? Speed is not a factor here. Matrices `A` and `B` contain 8000 samples of (respectively) 100 and 27 features.
- pattern minor 112d agonumerically stable log1pexp calculationWhat are good approximations for computing `log1pexp` for single precision and double precision floating point numbers? Note: `log1pexp(x)` is `log(1 + exp(x))` I have found few implementations of `log1pexp` for double precision but they don't provide an explanation on how they arrived at the approximations. Hence, I am not able to implement `log1exp` for single precision numbers (without converting to double precision intermediates of course). Reference implementation: https://github.com/davisking/dlib/blob/master/dlib/dnn/utilities.h#L16-L29
- debug minor 112d agoRigorous error bounds for eigenvalue solversI computed the first four eigenvalues of a quite large ($2^{24}\times 2^{24}$) but very sparse matrix. I used pythons in-build function sparse.linalg.eigsh to compute them. I need a validation that the results are correct up to an error of at most $10^{-3}$. Very specifically I want to ask if there is a rigorous error-analysis involving rounding errors up to machine precision in each step. More generally: Are there other practical algorithms for eigenvalues with such a rigorous error analysis.
- pattern minor 112d agoCould a Van Emde Boas tree be used for storing matrices?I'm aware that typical techniques to store matrices in sparse form are compressed formats or maps where the key is the pair of indices and value the value of the entry in a matrix. I was wondering if vEB trees could be used to such purpose as well. At the end of the day the dynamic set to be stored would be the pairs $(i,j)$ in lexicographic order. If $MN$ is the size of the matrix a vEB tree would allow the access to a specific entry in time $O(log(log(MN))$ which isn't bad, though the space requirement doesn't change, but maybe with some improvements something interesting could come out. Is there some research in this direction? I quickly looked up on google doesn't seem anything specific comes out. Thank you
- pattern minor 112d agoWhy is adding log probabilities considered "numerically stable"?Every once in a while, I come across the term numerical stability, which I don't really understand. In particular, I have seen a description of the practice of "adding logs rather than multiplying numbers" as "numerically stable." I would like to know why this is considered numerically stable. By "adding logs rather than multiplying numbers" I mean that if you have several numbers, for example 0.01, 0.001, 0.0001, and you want to get their product, you can instead add the logs of each term. In this case assuming log base 10, the result would be -2 -3 -4 = -9. This doesn't give you the same output as multiplying, but it's a good way to get something like the product so that you don't experience numerical underflow. My question is that I'm a bit confused because the definitions of numerical stability I've found on the Internet don't seem to apply to this case. The definitions of "numerical stability" I've found are that it occurs when a "malformed input" doesn't affect the performance of an algorithm, see for example here. In this case I don't really see how we would consider the numbers 0.01 etc "malformed," they are what they are. It would be more accurate to say that the algorithm (of multiplying them) is bad in this context since the computer can't handle it, so we choose a better algorithm. So why do people say this is "numerically stable"?