HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Minimal number of attempts at a multiple choice exam needed in order to pass, without any prior knowledge

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
withoutnumberneededorderknowledgepassprioranyexamchoice

Problem

A test is consisted of $N$ multiple choice questions, each has $k$ possible answers.

A test solution is the sequence of answers $S\in[k]^N$.

Given is a black box which receives a solution as input and returns the number of correct answers. After failing a test, the student may take the test again, and the questions do not change.

In order to pass the test, a student must have at least $m\le N$ correct answers.

-
What is the minimal number of attempts you need to make (as a function of $N,k$ and $m$) in order to be sure to pass the test, assuming you have no knowledge about the questions in hand?

-
Assuming you are allowed to use randomization, how can we minimize the expected number of attempts needed?

A trivial algorithm is to "guess" an initial solution $S_0$, and then change one answer at a time until you have $m$ correct ones. This results in $1+(k-1)\cdot m$ attempts at the worst case.

For the probabilistic part, it seems some kind of an experts algorithm could work here: pick a few random tests and then construct additional tests such that answers appearing in high-scoring solutions will get higher probability.

Solution

Here is an algorithm to identify the answers with successive attempts.

It has a worst case complexity of $O(m(1+\log_2 k))$

This corresponds roughly to collecting in each attempt one bit of
information on the right answer, for each question in succession.

Of course we cannot do that, because we can only give one whole answer
per question. So the trick in to transpose the problem, by identifying k
correct answers and trying to find their questions.

Apologies for the rough presentation. I am running out of time.

The algorithm

We assume, to simplify the presentation, that $N\geq k$. But it does
not really matter. We also assume that $k=2^p$ for some positive
integer $p$, again to simplify the presentation.

We assume that the only result of an attempt $T\in [1,k]^N$ is a score $s(T)$
corresponding to the number of correct answers. The individual
answers for each question are noted $T[j]\in [1,k]$ for the answer to question
$j$, ($j\in [1,N]$), in attempt $T$.

We define $T_i=i^k1^{N-k}, \forall i\in[1,k]$

The $N-k$ last answers always contribute the same amount $r$ to the result
$s(T_i)$ since they do not change.

Together, the first $k$ answers of all $k$ attempts will produce $k$
good answers, distributed in some way over all $k$ attempts, since for
each question there is exactly one good answer among the $k$ being
proposed.

If one attempt has 2 good answers, or more, for one of the first $k$
questions, then another must have 0. Thus, if the scores $s(T_i)$ are
not all equal, the smallest is equal to $r$. Furthermore, if they are
all equal, then they are all equal to $r+1$. So the value of $r$ is easily determined.

We first assume $\exists r\in[0,N-k],\; \forall i\in[1,k],\; s(T_i)=r+1$

This implies that each attempt has exactly 1 good answer among the
first $k$ answers (to the first $k$ questions). But we do not know which.

We define a second set of $k-1$ attempts
$T_i'=i^{k/2}1^{N-k/2}, \forall i\in[1,k]$

Note, that $T_1'=T_1$, so that k-1 new attempts only are needed,
but the duplication makes things simpler to explain.

If the good answer $1$ in the first $k$ questions in $T_1$ was for one of
the first $k/2$ questions, then $1$ is a bad answer for questions
$Q_j,\; \forall j\in[k/2+1,k]$. Hence all the $T_i$ attempts that had
a good answer for one of these $k/2$ questions will lose 1 point in
score, thus falling to $r$. All other attempts, that had a good answer
in the first $k/2$ questions will have an unchanged score $r+1$.

If the good answer $1$ in the first $k$ questions in $T_1$ was for one of
the last $k/2$ questions, i.e. for questions $Q_j,\; \forall
j\in[k/2+1,k]$, then all attempts $T_i'$ benefit from that good
answer $1$. Those that had a good answer for the first $k/2$ questions
will keep it and see their score increase from $r+1$ to $r+2$, while
the others will lose their good answer as before, but have it replaced
by the good answer $1$ for one question, thus keeping the score $r+1$.

Thus we can conclude that, if some score decrease, then the firat
attempt $T1$ has a good answer $1$ in one of the first $k/2$ questions,
and if some increase it is in one of the next $k/2$ questions.
Forall other attempts $T_i'$, those with the higher score have a good
answer for one of the first $k/2$ questions, and those with the lower
score have a good answer in one of the next $k/2$ questions.

Recall that $k=2^p$, i.e. $p=\log_2 k$.

If we number the first $k$ questions in binary, starting from 0, we
can see that this is a way to determine the $p^{th}$ bit of the index
question that is correctly answered by some attempt $T_i$.

If for some $i>1$, the score decrease from $T_i$ to $T_i'$, then
$T_1$ has a good answer for a question $Q$ such that the $p_{th}$ bit
of its binary index is $0$, while it is $1$ for an increase. For the
other attempts $T_i$, those with the higher score for $T_i'$ have a
good answer for a questions with the index $p^{th}$ bit $0$, and the
lower scores have a good answer for a questions with the index
$p^{th}$ bit $1$

It is then quite simple to repeat the procedure identically for each of
the other bits of the binary index of the first $k$ questions.

That makes $p=\log_2k$ series of $k-1$ attempts, in addition to the
first $k$ attempts, that is a total of $k+(k-1)\log_2k$ attempts,
to get the first k answers.

However we can improve a bit on that count. We do $\log_2 k$ attempts
to identify the question associated with each answer, i.e. each index
$i\in [1,k]$. Actually we need to identify only $k-1$ such question.
The last one in necessarily the one that was not found yet.

Thus in the above algorithm, we do not need to compute $T_{k}'$, nor
any of the other attempts of index $k$, except for the first $T_k$ in
the first attempts of the algorithm. Since it is also unnecessary for
$T_1'$ and similar, we only need, at most $\log_2k$ series of $k-2$
attempts, in addition to the initial $k$ attempts, that is a total of
$k+(k-2)\log_2k$ attempts, to get the first k answ

Context

StackExchange Computer Science Q#43649, answer score: 3

Revisions (0)

No revisions yet.