snippetModerate
How to determine likely connections in a social network?
Viewed 0 times
sociallikelydeterminehownetworkconnections
Problem
I am curious in determining an approach to tackling a "suggested friends" algorithm.
Facebook has a feature in which it will recommended individuals to you which it thinks you may be acquainted with. These users normally (excluding the edge cases in which a user specifically recommends a friend) have a highly similar network to oneself. That is, the number of friends in common are high. I assume Twitter follows a similar path for their "Who To Follow" mechanism.
Stephen Doyle (Igy), a Facebook employee suggested that the related newsfeed that uses EdgeRank formula which seems to indicate that more is to valued than friends such as appearance is similar posts. Another user suggested the Google Rank system.
Facebook states their News Feed Optimization as $\sum u_{e}w_{e}d_{e}$ where
$u_{e}$ = affinity score between viewing user and edge creator
$w_{e}$ = weight for this edge (create, comment, like, tag, etc)
$d_{e}$ = time decay factor based on how long ago the edge was created
Summing these items is supposed to give an object's rank which I assume as Igy hinted, means something in a similar format is used for suggested friends.
So I'm guessing that this is the way in which connections for all types are done in general via a rank system?
Facebook has a feature in which it will recommended individuals to you which it thinks you may be acquainted with. These users normally (excluding the edge cases in which a user specifically recommends a friend) have a highly similar network to oneself. That is, the number of friends in common are high. I assume Twitter follows a similar path for their "Who To Follow" mechanism.
Stephen Doyle (Igy), a Facebook employee suggested that the related newsfeed that uses EdgeRank formula which seems to indicate that more is to valued than friends such as appearance is similar posts. Another user suggested the Google Rank system.
Facebook states their News Feed Optimization as $\sum u_{e}w_{e}d_{e}$ where
$u_{e}$ = affinity score between viewing user and edge creator
$w_{e}$ = weight for this edge (create, comment, like, tag, etc)
$d_{e}$ = time decay factor based on how long ago the edge was created
Summing these items is supposed to give an object's rank which I assume as Igy hinted, means something in a similar format is used for suggested friends.
So I'm guessing that this is the way in which connections for all types are done in general via a rank system?
Solution
What you're looking for is a heuristic. No algorithm can say, given a graph of friends as the only input, whether two individuals not directly connected are friends or aren't; the friendship/acquaintance relation isn't guaranteed to be transitive (we can assume symmetry, but that might even be a stretch in real life). Any good heuristic will therefore need to be based on an understanding of how people interact, rather than some mathematical understanding of the nature of graphs of relations (although we will need to quantify the heuristic in these terms).
Suggesting friends of friends with equal probability is a relatively cheap but inaccurate heuristic. For instance, my father has friends, but I wouldn't say I'm friends with any of them (although I'd probably say I'm a friend of my father's for the purposes of, e.g., a social network). Having a person at a relatively close distance doesn't necessarily make them a great candidate.
Suggesting people to whom you have a great many extended connections also seems like a poor choice in general, because this will tend to lead to exponential growth of friends of people who pull ahead early on (the seven degrees of separation from Kevin Bacon game is an example of this).
I suggest a circuit-based model. Assume that each link is a resistor of resistance $R$. Then the best candidate for a new friend might be the individual with the lowest equivalent resistance. Here's a poorly-executed ASCII graphics example:
Say we want to find new friends for
According to this heuristic,
EDIT: So what's my social motivation for this? Well, this might be a rough model of how hard it is to get in touch with, and subsequently communicate possibly significant amounts of information through, intermediaries (friends). In CS terms (rather than physics terms), this might be construed as bandwidth between two nodes in a graph. Extensions of this system would be to allow different kinds of links between people with different weights (resistance, bandwidth, etc.) and proceed as above.
Suggesting friends of friends with equal probability is a relatively cheap but inaccurate heuristic. For instance, my father has friends, but I wouldn't say I'm friends with any of them (although I'd probably say I'm a friend of my father's for the purposes of, e.g., a social network). Having a person at a relatively close distance doesn't necessarily make them a great candidate.
Suggesting people to whom you have a great many extended connections also seems like a poor choice in general, because this will tend to lead to exponential growth of friends of people who pull ahead early on (the seven degrees of separation from Kevin Bacon game is an example of this).
I suggest a circuit-based model. Assume that each link is a resistor of resistance $R$. Then the best candidate for a new friend might be the individual with the lowest equivalent resistance. Here's a poorly-executed ASCII graphics example:
_____
/ \
a---c f
| | /
b d---e
| \ |
g h iSay we want to find new friends for
a. a's current friends are b, c, and f. We evaluate the net equivalent resistance between a and each of d, e, g, h, and i:pair resistance
(a,d) 6/7
(a,e) 13/7
(a,g) 7/4
(a,h) 1/1
(a,i) infAccording to this heuristic,
d is the best candidate friend, followed closely by h. g is the next best bet, followed closely by e. i can never be a candidate friend by this heuristic. Whether you find the results of this heuristic to be representative of real human social interactions is what's important. Computationally speaking, this would involve finding a subgraph containing all paths between two individuals (or, perhaps interestingly, some meaningfully selected truncation of this), then evaluating the equivalent resistance between the source and sink nodes.EDIT: So what's my social motivation for this? Well, this might be a rough model of how hard it is to get in touch with, and subsequently communicate possibly significant amounts of information through, intermediaries (friends). In CS terms (rather than physics terms), this might be construed as bandwidth between two nodes in a graph. Extensions of this system would be to allow different kinds of links between people with different weights (resistance, bandwidth, etc.) and proceed as above.
Code Snippets
_____
/ \
a---c f
| | /
b d---e
| \ |
g h ipair resistance
(a,d) 6/7
(a,e) 13/7
(a,g) 7/4
(a,h) 1/1
(a,i) infContext
StackExchange Computer Science Q#261, answer score: 11
Revisions (0)
No revisions yet.