HiveBrain v1.2.0
Get Started
← Back to all entries
patternrubyMinor

Password Attacker (Google apac test problem)

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
apacproblemgooglepasswordattackertest

Problem

Original Problem

Short Summary:


Given a \$M\$ possible characters all of which must be used at least
once, find the number of possible combinations of length (exactly)
\$N\$.


The first input contains the number of cases, followed by pairs of M,
N
. Output is modulo 1000000007(\$10^9+7\$).


Example Input:

4
1 1
3 4
5 5
15 15



Example Output:

Case #1: 1
Case #2: 36
Case #3: 120
Case #4: 674358851


The stupid solution is to simply generate all possible repeated_permutations and then exclude the ones which do not use at least M different characters (this of course is painfully slow).

def num_pw_slow(m,n)
m.times.to_a.repeated_permutation(n).count do |row|
m.times.map {|i| row.include? i}.all?
end
end


The real solution:

It's noticeable that the number of solutions which include all M characters is also: (The number of repeated permutations) \$ - \$ (the number of repeated permutations missing at least one character).

The number of repeated permutations is simply \$m^n\$. The second is a little more involved. Combinations are given by \$n!/ (n-k)!k!\$ where \$n=m\$ and \$k = m-1\$. Permutations are \$(m-1)^n\$, the tricky part is we need the set of permutations, since each of the \$(m-1)\$ possible combinations differs from another by a single character, its repeated_permutations will contain a duplicates for all permutations which exclude that character.

For example if m=3, n=4, there are \$3^4\$ repeated_permutations possible - with repeated permutations of ab, bc, ca (possible combinations of m, with length m-1) missing one required character when expanded to length n, the set operation over which needs to eliminate duplicates from repeated_permutations of a, b, c So the solution is then:

34 - (243 - 1**43)


This can be generalized for any m, n e.g.

`mn - (m-1)*num_combinationsn
#second term is the number of duplicates amongst repeated_permutations of (m-1)
+ (m-2)*num_combinations

Solution

This kind of problem can be solved with dynamic programming. The key points are :

-
you care about the number of combinations and not the list of them

-
assuming you know how to compute it for some values, you can go forward and compute it for even more values.

Here is how it could go (not verified whatsoever).

Given the length n (n > 0) and the number of distinct characters m (m > 0), you know that :

-
nb_comb(n, 1) = 1 which corresponds to strings with a single character.

-
nb_comb(n, m) = 0 if 1

-
nb_comb(n, m) = m nb_comb(n-1, m-1) + m nb_comb(n-1, m) if 1

-
consider string containing all m characters and add any of the already added character : you have nb_comb(n-1, m) such strings and m ways to add an already existing character at the end. This contributes to m * nb_comb(n-1, m) new combinations.

(You can check that strings have not been counted twice (either the last character is a duplicate from another or it is not a duplicate).)

Without knowing it corresponds to Stirling numbers, I managed to deduce the same equations as the one in that stackoverflow question.

Python recursive solution to check it works :

import itertools

def nb(n, m):
    #print(n, m, "?")
    assert n>0
    assert m>0
    if m == 1:
        res = 1
    elif n == 1:
        res = 0
    else:
        res = m * (nb(n-1, m-1) + nb(n-1, m))
    if False: # check results
        slow_res = sum(1 for p in itertools.product(range(m), repeat=n) if len(set(p)) == m)
        assert slow_res == res
    return res

tests = [
    (1, 1, 1),
    (2, 2, 2),
    (3, 3, 6),
    (4, 3, 36),
    (5, 5, 120),
    (15, 15, 674358851),
]
for n, m, r in tests:
    nb_ = nb(n, m) % (1000000007)
    assert nb_ == r, "(%s,%s,%s != %s)" % (n, m, r, nb_)


This still gets way too slow for bigger solutions. Having a look at the functions calls for the case 15-15, for instance, adding a call to print at the beginning of the fonction and running python my_script | sort -n | uniq -c | sort -n, you get :

...
   1001 (1, 11)
   1001 (1, 5)
   1287 (2, 10)
   1287 (2, 7)
   1716 (2, 8)
   1716 (2, 9)
   2002 (1, 10)
   2002 (1, 6)
   3003 (1, 7)
   3003 (1, 9)
   3432 (1, 8)


showing that the fonction gets called with the same arguments thousands of times.

A solution would be memoization but a better one is to go the other way, generating the different values.

A way to do so is :

def nb2(n, m):
    # nb(i, j) is in results[i-1][j-1]
    results = [[None for j in range(m)] for i in range(n)]
    for j in range(m):
        results[0][j] = 0
    for i in range(n):
        results[i][0] = 1
    for i in range(1, n):
        for j in range(1, m):
            results[i][j] = (j+1) * (results[i-1][j-1] + results[i-1][j])
    return results[n-1][m-1]

Code Snippets

import itertools

def nb(n, m):
    #print(n, m, "?")
    assert n>0
    assert m>0
    if m == 1:
        res = 1
    elif n == 1:
        res = 0
    else:
        res = m * (nb(n-1, m-1) + nb(n-1, m))
    if False: # check results
        slow_res = sum(1 for p in itertools.product(range(m), repeat=n) if len(set(p)) == m)
        assert slow_res == res
    return res

tests = [
    (1, 1, 1),
    (2, 2, 2),
    (3, 3, 6),
    (4, 3, 36),
    (5, 5, 120),
    (15, 15, 674358851),
]
for n, m, r in tests:
    nb_ = nb(n, m) % (1000000007)
    assert nb_ == r, "(%s,%s,%s != %s)" % (n, m, r, nb_)
...
   1001 (1, 11)
   1001 (1, 5)
   1287 (2, 10)
   1287 (2, 7)
   1716 (2, 8)
   1716 (2, 9)
   2002 (1, 10)
   2002 (1, 6)
   3003 (1, 7)
   3003 (1, 9)
   3432 (1, 8)
def nb2(n, m):
    # nb(i, j) is in results[i-1][j-1]
    results = [[None for j in range(m)] for i in range(n)]
    for j in range(m):
        results[0][j] = 0
    for i in range(n):
        results[i][0] = 1
    for i in range(1, n):
        for j in range(1, m):
            results[i][j] = (j+1) * (results[i-1][j-1] + results[i-1][j])
    return results[n-1][m-1]

Context

StackExchange Code Review Q#91476, answer score: 2

Revisions (0)

No revisions yet.