patternMinor
Convergence of recursive formula in TextRank
Viewed 0 times
recursiveformulatextrankconvergence
Problem
I am reading a paper about the TextRank algorithm in keyword extraction and they mention the recursive formula:
$$ \displaystyle S(V_{i}) = (1 - d) + d \ast \sum_{j \in In(V_{i})} \frac{1}{|Out(V_{j})|} \; S(V_{j}) $$
where $ In(V_{i}) $ are the vertices that point to $ V_{i} $ and $ Out(V_{i}) $ are the vertices that $ V_{i} $ points to.
The paper claimed that this algorithm will converge for any arbitrarily small positive threshold and any arbitrary initial values by testing with experiments. However, I am attempting to prove it mathematically that this algorithm will always converge regardless of the choice of the threshold as well as the initial values, but haven't succeeded at this point. My approach is to prove that if $ E_{ij} $ is the error at vertex $ i $ at the $ j $ iteration, then the sequence $ E_{i1}, E_{i2}, \dots, E_{in} $ is a decreasing sequence. Intuitively, it means that the error rate approaches $ 0. $
Any hint or suggestion is appreciated.
$$ \displaystyle S(V_{i}) = (1 - d) + d \ast \sum_{j \in In(V_{i})} \frac{1}{|Out(V_{j})|} \; S(V_{j}) $$
where $ In(V_{i}) $ are the vertices that point to $ V_{i} $ and $ Out(V_{i}) $ are the vertices that $ V_{i} $ points to.
The paper claimed that this algorithm will converge for any arbitrarily small positive threshold and any arbitrary initial values by testing with experiments. However, I am attempting to prove it mathematically that this algorithm will always converge regardless of the choice of the threshold as well as the initial values, but haven't succeeded at this point. My approach is to prove that if $ E_{ij} $ is the error at vertex $ i $ at the $ j $ iteration, then the sequence $ E_{i1}, E_{i2}, \dots, E_{in} $ is a decreasing sequence. Intuitively, it means that the error rate approaches $ 0. $
Any hint or suggestion is appreciated.
Solution
Your process is a Markov chain, so I suggest you try applying the theory of Markov chains. There's a rich and powerful theory that has been built up over time, which might be applicable here.
In particular, if we let $S$ denote the column vector $S = (S(V_1),S(V_2),\dots,S(V_n))$, then there is a matrix $n\times n$ $M$ such that each iteration maps $S$ to $MS$. If the underlying graph is strongly connected, it looks like this Markov chain is irreducible, ergodic, aperiodic, and all its states are recurrent. From this you should be able to deduce that it has a stationary distribution $\pi$ and that the stationary distribution is the unique limiting distribution of the chain, hence the algorithm eventually converges. You should check all of this yourself carefully; I haven't checked any of this.
The rate of convergence will be dependent on the magnitude of the second-largest eigenvalue of the Markov chain.
Look at the analysis of PageRank for a detailed, worked example applying these concepts to a similar algorithm. This analysis can be found in many textbooks and algorithms courses.
In particular, if we let $S$ denote the column vector $S = (S(V_1),S(V_2),\dots,S(V_n))$, then there is a matrix $n\times n$ $M$ such that each iteration maps $S$ to $MS$. If the underlying graph is strongly connected, it looks like this Markov chain is irreducible, ergodic, aperiodic, and all its states are recurrent. From this you should be able to deduce that it has a stationary distribution $\pi$ and that the stationary distribution is the unique limiting distribution of the chain, hence the algorithm eventually converges. You should check all of this yourself carefully; I haven't checked any of this.
The rate of convergence will be dependent on the magnitude of the second-largest eigenvalue of the Markov chain.
Look at the analysis of PageRank for a detailed, worked example applying these concepts to a similar algorithm. This analysis can be found in many textbooks and algorithms courses.
Context
StackExchange Computer Science Q#59777, answer score: 2
Revisions (0)
No revisions yet.