Graph fibrations, graph isomorphism, and PageRank

Paolo Boldi; Violetta Lonati; Massimo Santini; Sebastiano Vigna

RAIRO - Theoretical Informatics and Applications (2006)

  • Volume: 40, Issue: 2, pages 227-253
  • ISSN: 0988-3754

Abstract

PageRank is a ranking method that assigns scores to web pages using the limit distribution of a random walk on the web graph. A fibration of graphs is a morphism that is a local isomorphism of in-neighbourhoods, much in the same way a covering projection is a local isomorphism of neighbourhoods. We show that a deep connection relates fibrations and Markov chains with restart, a particular kind of Markov chains that include the PageRank one as a special case. This fact provides constraints on the values that PageRank can assume. Using our results, we show that a recently defined class of graphs that admit a polynomial-time isomorphism algorithm based on the computation of PageRank is really a subclass of fibration-prime graphs, which possess simple, entirely discrete polynomial-time isomorphism algorithms based on classical techniques for graph isomorphism. We discuss efficiency issues in the implementation of such algorithms for the particular case of web graphs, in which O(n) space occupancy (where n is the number of nodes) may be acceptable, but O(m) is not (where m is the number of arcs).
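The link between local isomorphism of in-neighbourhoods and PageRank values can be illustrated with a small sketch. It is not the authors' algorithm: the function names (`in_equitable_classes`, `pagerank`), the example graph, and the damping value 0.85 are assumptions made for illustration. The sketch also assumes that every node has at least one outgoing arc, that the restart distribution is uniform, and that the relevant in-neighbourhood structure includes the arc weights 1/outdegree used by PageRank. Under those assumptions, a naive colour refinement groups nodes whose in-neighbourhoods look alike, and power iteration gives the nodes of each group the same score.

```python
from collections import defaultdict

# Hypothetical example graph: nodes 0 and 1 have matching in-neighbourhoods.
ARCS = [(0, 2), (1, 2), (2, 3), (3, 0), (3, 1)]


def in_equitable_classes(n, arcs):
    """Naive colour refinement on in-neighbourhoods.

    Two nodes end in the same class when their in-arcs can be matched so
    that sources have the same class and the same out-degree (that is,
    the same PageRank arc weight 1/outdegree).
    """
    outdeg = [0] * n
    preds = defaultdict(list)
    for u, v in arcs:
        outdeg[u] += 1
        preds[v].append(u)
    colour = [0] * n
    while True:
        # Key = old colour plus the multiset of (colour, out-degree) of sources.
        keys = [
            (colour[x], tuple(sorted((colour[u], outdeg[u]) for u in preds[x])))
            for x in range(n)
        ]
        ids = {key: i for i, key in enumerate(sorted(set(keys)))}
        refined = [ids[key] for key in keys]
        if len(set(refined)) == len(set(colour)):
            return refined  # no class was split: the partition is stable
        colour = refined


def pagerank(n, arcs, alpha=0.85, iterations=100):
    """Power iteration with uniform restart; assumes no dangling nodes."""
    outdeg = [0] * n
    for u, _ in arcs:
        outdeg[u] += 1
    rank = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [(1.0 - alpha) / n] * n
        for u, v in arcs:
            nxt[v] += alpha * rank[u] / outdeg[u]
        rank = nxt
    return rank


classes = in_equitable_classes(4, ARCS)
ranks = pagerank(4, ARCS)
for node in range(4):
    print(node, classes[node], round(ranks[node], 6))
```

The refinement loop recomputes every signature in each round, so it is only suitable for small graphs; it does not attempt the space or time bounds that the abstract discusses for web graphs.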

How to cite

Boldi, Paolo, et al. "Graph fibrations, graph isomorphism, and PageRank." RAIRO - Theoretical Informatics and Applications 40.2 (2006): 227-253. <http://eudml.org/doc/249717>.

@article{Boldi2006,
abstract = { PageRank is a ranking method that assigns scores to web pages using the limit distribution of a random walk on the web graph. A fibration of graphs is a morphism that is a local isomorphism of in-neighbourhoods, much in the same way a covering projection is a local isomorphism of neighbourhoods. We show that a deep connection relates fibrations and Markov chains with restart, a particular kind of Markov chains that include the PageRank one as a special case. This fact provides constraints on the values that PageRank can assume. Using our results, we show that a recently defined class of graphs that admit a polynomial-time isomorphism algorithm based on the computation of PageRank is really a subclass of fibration-prime graphs, which possess simple, entirely discrete polynomial-time isomorphism algorithms based on classical techniques for graph isomorphism. We discuss efficiency issues in the implementation of such algorithms for the particular case of web graphs, in which O(n) space occupancy (where n is the number of nodes) may be acceptable, but O(m) is not (where m is the number of arcs). },
author = {Boldi, Paolo and Lonati, Violetta and Santini, Massimo and Vigna, Sebastiano},
journal = {RAIRO - Theoretical Informatics and Applications},
keywords = {Graph fibrations; PageRank; Markov chain with restart},
language = {eng},
month = {7},
number = {2},
pages = {227-253},
publisher = {EDP Sciences},
title = {Graph fibrations, graph isomorphism, and PageRank},
url = {http://eudml.org/doc/249717},
volume = {40},
year = {2006},
}

TY - JOUR
AU - Boldi, Paolo
AU - Lonati, Violetta
AU - Santini, Massimo
AU - Vigna, Sebastiano
TI - Graph fibrations, graph isomorphism, and PageRank
JO - RAIRO - Theoretical Informatics and Applications
DA - 2006/7//
PB - EDP Sciences
VL - 40
IS - 2
SP - 227
EP - 253
AB - PageRank is a ranking method that assigns scores to web pages using the limit distribution of a random walk on the web graph. A fibration of graphs is a morphism that is a local isomorphism of in-neighbourhoods, much in the same way a covering projection is a local isomorphism of neighbourhoods. We show that a deep connection relates fibrations and Markov chains with restart, a particular kind of Markov chains that include the PageRank one as a special case. This fact provides constraints on the values that PageRank can assume. Using our results, we show that a recently defined class of graphs that admit a polynomial-time isomorphism algorithm based on the computation of PageRank is really a subclass of fibration-prime graphs, which possess simple, entirely discrete polynomial-time isomorphism algorithms based on classical techniques for graph isomorphism. We discuss efficiency issues in the implementation of such algorithms for the particular case of web graphs, in which O(n) space occupancy (where n is the number of nodes) may be acceptable, but O(m) is not (where m is the number of arcs).
LA - eng
KW - Graph fibrations; PageRank; Markov chain with restart
UR - http://eudml.org/doc/249717
ER -

References

  1. P. Boldi, M. Santini and S. Vigna, PageRank as a function of the damping factor, in Proc. of the Fourteenth International World Wide Web Conference. ACM Press. Chiba, Japan (2005) 557–566.  
  2. P. Boldi and S. Vigna, Fibrations of graphs. Discrete Math. 243 (2002) 21–66.
  3. P. Boldi and S. Vigna, The WebGraph framework I: Compression techniques, in Proc. of the Thirteenth International World Wide Web Conference. ACM Press, Manhattan, USA (2004) 595–601.  
  4. A. Cardon and M. Crochemore, Partitioning a graph in O(|A| log₂ |V|). Theoret. Comput. Sci. 19 (1982) 85–98.
  5. D.G. Corneil and C.C. Gotlieb, An efficient algorithm for graph isomorphism. J. Assoc. Comput. Mach. 17 (1970) 51–64.
  6. L. Eldén, The eigenvalues of the Google matrix. Technical Report LiTH-MAT-R-04-01, Department of Mathematics, Linköping University, 2004. Available at arXiv:math/0401177.
  7. G.H. Golub and C. Greif, Arnoldi-type algorithms for computing stationary distribution vectors, with application to PageRank, Technical Report SCCM-04-15, Stanford University Technical Report (2004).  
  8. M. Gori, M. Maggini and L. Sarti, Exact and approximate graph matching using random walks. IEEE Trans. Pattern Anal. Machine Intelligence 27 (2005) 1100–1111.
  9. J.L. Gross and T.W. Tucker, Topological Graph Theory. Series in Discrete Mathematics and Optimization, Wiley-Interscience (1987).  
  10. A. Grothendieck, Technique de descente et théorèmes d'existence en géométrie algébrique, I. Généralités. Descente par morphismes fidèlement plats. Séminaire Bourbaki 190 (1959–1960).
  11. T.H. Haveliwala, Efficient computation of PageRank. Technical Report 31, Stanford University Technical Report, October 1999. Available at http://dbpubs.stanford.edu/pub/1999-31.
  12. T.H. Haveliwala, Topic-sensitive PageRank, in Proc. of the Eleventh International World Wide Web Conference. ACM Press (2002) 517–526.
  13. T.H. Haveliwala and S.D. Kamvar, The second eigenvalue of the Google matrix. Technical Report 20, Stanford University Technical Report, March 2003. Available at http://dbpubs.stanford.edu/pub/2003-20.
  14. P. Híc, R. Nedela and S. Pavlíková, Front-divisors of trees. Acta Math. Univ. Comenian. (N.S.) 61 (1992) 69–84.
  15. J.E. Hopcroft, An n log n algorithm for minimizing states in a finite automaton, in Theory of Machines and Computations, edited by Z. Kohavi and A. Paz. Academic Press (1971).
  16. M. Iosifescu, Finite Markov Processes and Their Applications. John Wiley & Sons (1980).  
  17. S.D. Kamvar, T.H. Haveliwala, C.D. Manning and G.H. Golub, Exploiting the block structure of the web for computing PageRank. Technical Report 17, Stanford University Technical Report, March 2003. Available at http://dbpubs.stanford.edu/pub/2003-17.
  18. S.D. Kamvar, T.H. Haveliwala, C.D. Manning and G.H. Golub, Extrapolation methods for accelerating PageRank computations, in Proceedings of the twelfth international conference on World Wide Web. ACM Press (2003) 261–270.  
  19. T. Kato, Perturbation Theory for Linear Operators. Springer-Verlag, second edition (1976).  
  20. L. Lovász, Random walks on graphs: A survey, in Combinatorics, Paul Erdős is Eighty, Vol. 2, Bolyai Society Mathematical Studies, 1993, in Proceedings of the Meeting in honour of P. Erdős, Keszthely, Hungary 7 (1993) 1–46.
  21. C. Pan-Chi Lee, G.H. Golub and S.A. Zenios, A fast two-stage algorithm for computing PageRank and its extensions. Technical Report SCCM-03-15, Stanford University Technical Report (2003). Available at http://www-sccm.stanford.edu/pub/sccm/sccm03-15_2.pdf.
  22. D. Lind and B. Marcus, An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, Cambridge UK (1995).  
  23. B.D. McKay, Practical graph isomorphism. Congressus Numerantium 30 (1981) 45–87.
  24. C.D. Meyer, Matrix analysis and applied linear algebra. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA (2000).  
  25. M. Nasu, Constant-to-one and onto global maps of homomorphisms between strongly connected graphs. Ergod. Th. & Dynam. Sys. 3 (1983) 387–413.
  26. N. Norris, Universal covers of graphs: Isomorphism to depth n - 1 implies isomorphism to all depths. Discrete Appl. Math. 56 (1995) 61–74.
  27. L. Page, S. Brin, R. Motwani and T. Winograd, The PageRank citation ranking: Bringing order to the web. Technical Report 66, Stanford University, 1999. Available at http://dbpubs.stanford.edu/pub/1999-66.
  28. D.M. Cvetković, M. Doob and H. Sachs, Spectra of Graphs. Academic Press (1978).  
  29. J.P. Schweitzer, Perturbation theory and finite Markov chains. J. Appl. Probab. 5 (1968) 401–413.
  30. E. Seneta, Non-negative matrices and Markov chains. Springer–Verlag, New York (1981).  
  31. S.H. Unger, GIT – A heuristic program for testing pairs of directed line graphs for isomorphism. Comm. ACM 7 (1964) 26–34.
  32. S. Vigna, A guided tour in the topos of graphs. Technical Report 199-97, Università di Milano, Dipartimento di Scienze dell'Informazione, 1997. Available at http://vigna.dsi.unimi.it/ftp/papers/ToposGraphs.pdf.  
  33. K. Yosida, Functional Analysis. Springer-Verlag, (1980), Sixth Edition.  
