HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Fast counting of interactions between groups given users interactions and groups assignments

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
fastassignmentsgroupscountinginteractionsbetweenandusersgiven

Problem

Given:

-
An array c that contains the group assignment of every user (c[i]=4 indicates that user i belongs to group 4)

-
A matrix y of 0 and 1 that indicates whether user i interacted with user j (y[i][j] = 1 if i interacted with j, 0 otherwise)

What is the fastest way to count the number of interactions and lack of interactions between every pair of groups?

This is what I have:

Solution 1:

This is what I have:

import numpy as np

def test1(y, c):

    N = len(c) # number of users
    C = len(set(c)) # number of clusters 

    # Final counts matrices
    I = np.zeros((C,C)) 
    notI = np.zeros((C,C))

    # Loop every row i/ column j, look at the groups where the users belong
    # and update I accordingly
    for i in range(N):
        for j in range(N):
            if y[i][j] == 1:
                I[c[i]][c[j]] += 1
            else: 
                notI[c[i]][c[j]] += 1

    return I, notI


timeit:

nusers = 100
nclusters = 4
y = np.random.random_integers(0, 1, (nusers,nusers))
c = np.random.choice(nclusters, nusers)

%timeit test1(y,c)


10 loops, best of 3: 42 ms per loop

Solution 2:

def test2(y, c):

    N = len(c)
    C = len(set(c))
    I = np.zeros((C,C))
    notI = np.zeros((C,C))

    for i, jj in enumerate(y):
        for j, value in enumerate(jj):
            if value:
                I[c[i]][c[j]] += 1
            else: 
                notI[c[i]][c[j]] += 1

    return I, notI


timeit:
10 loops, best of 3: 30.8 ms per loop

Solution

Re-formulating the problem into a matrix multiplication one speeds up the algorithm very much.

Algebra:

Create a matrix G of n_groups x n_users such that every column is a group mask, that is, all users (columns) that belong to that group (row) are flagged are 1 and 0 otherwise. If Y is the matrix of interactions, then the interactions between every pair of groups can be obtained by the following equation:

$$I = GYG^T$$

Code:

The code is:

import numpy as np
from numpy import dot

def test3(y, c):

    N = len(c)
    C = len(set(c))
    noty = np.mod(y+1,2)

    # create C group masks of 1 (if user is belongs to group) and 0. 
    G = np.zeros((C,N))
    for group in range(C):
        G[group] = [1 if c_== group else 0 for c_ in c]

    # Groups (src) * interaction_matrix * Groups_transposed (dst)
    # Every row of Groups matrix is the mask of a group
    return dot(dot(G,y),G.T), dot(dot(G,noty),G.T)


1000 loops, best of 3: 810 µs per loop

Code Snippets

import numpy as np
from numpy import dot

def test3(y, c):

    N = len(c)
    C = len(set(c))
    noty = np.mod(y+1,2)

    # create C group masks of 1 (if user is belongs to group) and 0. 
    G = np.zeros((C,N))
    for group in range(C):
        G[group] = [1 if c_== group else 0 for c_ in c]

    # Groups (src) * interaction_matrix * Groups_transposed (dst)
    # Every row of Groups matrix is the mask of a group
    return dot(dot(G,y),G.T), dot(dot(G,noty),G.T)

Context

StackExchange Code Review Q#67232, answer score: 3

Revisions (0)

No revisions yet.