HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Which optimization algorithm would you recommend for this small multidimensional problem?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
thisproblemmultidimensionalyouoptimizationalgorithmwouldsmallforwhich

Problem

Which algorithm would be suitable for finding or estimating the vector
$$\mathbf{s}_{opt}=\begin{bmatrix}
s_1 & \cdots & s_N
\end{bmatrix}=\arg\max_{\mathbf{s}}\sum_{n=1}^{N}p_{s_n,n}$$
under the constraint
$$\sum_{n=1}^{N}l_{s_n,n}\leq L_{max}$$
given known values for the integer $N$, the vector
$\mathbf{p}_n=\begin{bmatrix}
p_{1,n}\\
\vdots \\
p_{M_n,n}
\end{bmatrix}$, the vector
$\mathbf{l}_n=\begin{bmatrix}
l_{1,n}\\
\vdots \\
l_{M_n,n}
\end{bmatrix}$, the integer $L_{max}$ and all integers $M_n$?

Here are some restrictions showing that the size of the problem isn't very big:
$$N\:\epsilon\:\mathbb{Z}\:|\:0<N<50$$
$$p\:\epsilon\:\mathbb{R}\:|\:0\leq p_{m,n}\leq 1\:|\:p_{1,n}=1\:|\:p_{M_n,n}=0$$
$$l\:\epsilon\:\mathbb{Z}\:|\:0\leq l_{m,n}\leq 30\:|\:l_{M_n,n}=0$$
$$M_n\:\epsilon\:\mathbb{Z}\:|\:2\leq M_n \leq 20$$

Another circumstance which may be of interest is that the problem will be solved repeatedly. In the first iteration $N=1$ and then for each iteration $N$ will increase as per the sequence $N=1,2,3,...$. The vectors $\mathbf{p}_n$ and $\mathbf{l}_n$, as well as the other inputs, will remain unchanged for all iterations $N=1,2,3,...$, meaning that the same problem just grows slightly step by step.

I am planning to implement this algorithm in JavaScript.

Edit: Real-world case

The mathematical optimization problem above has its origin in a real world (however, stupid) problem. Assume you need to communicate a text of $N$ words, but you have a limited total amount of characters to use, $L_{max}$. For each word $n$ you have found $M_n$ number of alternative representations (such as acronyms and synonyms) each with a probability of the reader to understand approximated to $p_{m,n}$. Now the goal is to choose which representation $s_n$ of each word should be chosen in order to maximize the probability of the receiver to understand the text. This probability can be approximated to the mean value all individual word probabilities.

Solution

This problem is effectively knapsack, where the size of the knapsack is $L_{max}$. The items are $1,\ldots,n$, and for each item $i$ we can select a quantity (to include in our knapsack) of $j\in\{2,\ldots,M_n\}$. Taking quantity $j$ of item $i$ gives reward $p_{j,i}$ and takes up $l_{j,i}$ space in the knapsack.

Let $D(n,r)$ be $max_s \sum_{i=1}^{n} p_{s_i,i}$ under the constraint that $\sum_{i=1}^{n} l_{j,i} \leq r$. $D(N,L_{max})$ (which is the value you're after) can be computed in $O(L_{max}\cdot NM)$ time (where $M$ is $max_i M_i$) using dynamic programming.

Effectively $D(n,r)$ represents the best way (mean probability) to represent the first $n$ words using at most $r$ space. The best way to represent $D(n,r)$ is to combine some choice of representing word $n$ with the best way to represent the remaining $n-1$ words in however many characters are left. This is expressed by the recurrence relation:

$$D(n,r)=max_{s_n\in\{2,\ldots,M_n\}} p_{s_n,n} + D(n-1,r-l_{s_n,n})$$

Where $D(,r)=-\infty$ if $r<0$ and $D(0,)=0$.

What is very important is that you store (and reuse) the intermediate values of $D(n,r)$ during the computation, as this will save significantly on computation time. Reusing these values will also allow you to deal with the growing input requirement, since the values encountered when computing $D(n,)$ can be reused when computing $D(n+1,)$.

Context

StackExchange Computer Science Q#42389, answer score: 4

Revisions (0)

No revisions yet.