Motivation

It is not an exaggeration to claim that studying subgraphs has been an active thread of research since the early days of graph theory: Euler paths and cycles, Hamilton paths and cycles, matchings, cliques, neighborhoods of vertices typically sketched via the degree sequence are special types of subgraphs. Subgraphs play a central role in network science. I have worked on the following subgraph-related problems, focusing on the development of scalable randomized algorithms for big graph data.
  • Counting the copies of a fixed subgraph H, especially when H is 3-clique (triangle).
    Triangles play a key role in social networks, because friends of friends tend to become friends themselves.
  • Efficient approximate computation of all-pairs shortest paths.
  • Dense subgraphs and communities.
The above algorithms have been used in mining large-scale networks, extracting useful patterns both in static and dynamic graphs. These patterns are useful in anomaly detection and in evaluating random graph models which claim to mimick real-world networks.

Publications

  • Efficient Triangle Counting in Large Graphs via Degree-based Vertex Partitioning

    [Co-authors: Mihail N. Kolountzakis, Gary L. Miller, Richard Peng]
    Internet Mathematics
    Invited Paper

  • Colorful Triangle Counting and a MapReduce Implementation

    [Co-authors: Rasmus Pagh]
    To appear, Information Processing Letters

  • Spectral Counting of Triangles via Element-Wise Sparsification and Triangle-based Link Recommendation

    [Co-authors: Petros Drineas, Eirinaios Michelakis, Ioannis Koutis, Christos Faloutsos]
    In Advances in Social Networks Analysis and Mining, Springer
    Invited Book Chapter

  • Large Scale Graph Mining With MapReduce: Diameter Estimation and Eccentricity Plots of Massive Graphs with Mining Applications
    In Social Network Mining, Analysis and Research Trends: Techniques and Applications , IGI
    Invited Book Chapter

  • Radius Plots for Mining Tera-byte Scale Graphs: Algorithms, Patterns, and Observations

    [Co-authors: U Kang, Ana Paula Appel, Christos Faloutsos, Jure Leskovec]
    In SIAM International Conference on Data Mining (SDM 2010)
    Special Issue

  • DOULION: Counting Triangles in Massive Graphs With a Coin

    [Co-authors: U Kang, Gary L. Miller, Christos Faloutsos]
    In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (KDD 2009)

  • HADI: Fast Diameter Estimation and Mining in Massive Graphs with Hadoop

    [Co-authors: U Kang, Ana Paula Appel, Christos Faloutsos, Jure Leskovec]
    CMU ML Tech Report CMU-ML-08-117, 2008

  • Counting Triangles Using Projections

    In Knowledge and Information Systems (KAIS)
    Invited Journal Paper

  • Triangle Sparsifiers

    [Co-authors: Mihail N. Kolountzakis, Gary L. Miller]
    Journal of Graph Algorithms And Applications

  • Fast Counting of Triangles in Large Real Networks, without counting: Algorithms and Laws

    In IEEE International Conference on Data Mining (ICDM 2008)
    Special Issue

Code

The code is written in both C++ and JAVA. Feel free to drop me a line if you want it. Send an email to graphminingtoolbox@gmail.com.