MESH: Minnesota Engine for Scalable Evolving Hypergraph Analysis

June 2018: Source code for TensorFlow-based decentralized distributed deep learning framework released. Please check it out here.
June 2017: Source code for MESH v1.0 released. Please check it out here.

Project Vision and Overview

Hypergraph diagram

Rapid increase in the amount and richness of online interactions, through social networking sites, massive online multi-player games and virtual worlds, collaborative and group communication tools, and social media, is creating Social Networking data at unprecedented scales. This includes data about the users, their characteristics (i.e., profile information), connections between them (i.e., the social graph), and their behavior (both individual users and interactions). The analysis of this data using sophisticated computational techniques is the rapidly emerging area of Computational Social Science, which has the potential to revolutionize the study of social sciences. Graph theory has been a powerful tool to model relationships between entities and study their properties. However, while graphs can easily model dyadic interactions between a pair of actors, they are ill-suited to model a group/team of multiple actors with an identity of its own. It has been proposed that the right mathematical object for modeling group interactions is not a (simple) graph, but a hypergraph, which can easily represent multi-actor relations and groups via hyperedges. Although properties of hypergraphs have been studied extensively, there has not been commensurate attention on hypergraph algorithms.

Given the new requirements of Social Network Analysis, this creates an exciting opportunity for developing computational techniques for hypergraph analysis. Specifically, various types of network centrality measures must be re-examined, since the notions of degree of a node and shortest path between a pair of nodes can change when hyperedges are present. Furthermore, graph properties such as minimum spanning tree, all-pairs of shortest paths, can change, and algorithms for them must be re-examined. Finally, many of these networks constantly evolve over time, e.g., through changes in membership, generation and dissemination of new information, and creation of new groups and teams. As a result, new algorithms must be developed for phenomena such as network information flow and link prediction. The efficient implementation and execution of hypergraph algorithms also leads to several system-level challenges. First, a common element of many hypergraph analysis problems is the enormous size and high growth rate of data sets. Thus, these analysis algorithms need to be scalable both in terms of their memory/storage usage as well as computation by utilizing distributed resources. At the same time, the system must be able to support the efficient computation of continuously evolving data. Further, many hypergraph operations are more complex than corresponding graph operations, due to the presence of richer information and relationship structure, so that the system must provide efficient support for hypergraph representation and computation.

To address many of the challenges outline above, we are building an analysis framework called MESH: Minnesota Engine for Scalable evolving Hypergraph analysis. MESH will provide a framework of algorithms and system components to support scalable analysis of evolving hypergraph structures. The MESH algorithms will be designed to model and compute several common data-driven questions related to group dynamics in real-world networks. At the same time, the MESH system-level techniques will be designed to support the computation of these algorithms in a distributed and scalable manner. These algorithms and systems mechanisms will be designed to support the efficient analysis of time-evolving dynamic structures in the data.

Approach

Goals

We are designing our MESH framework with several goals in mind:

  • Support a rich API for ease of expressing hypergraph algorithms.
  • Provide scalability across machines and to large datasets.
  • Provide flexibility to support different datasets and algorithms.
  • Characterize and optimize for real-world hypergraphs.
  • Design new algorithms to capture real-world group phenomena.

MESH Prototype

MESH diagram

We are building a MESH prototype on the Apache Spark/GraphX open-source framework. The prototype consists of a concise but expressive hypergraph API which can be used to implement various hypergraph algorithms. MESH extends the "think like a vertex" paradigm of graph processing systems to hypergraphs by elevating hyperedges to first-class status (in addition to vertices): they can maintain their state, carry out computation, and send messages just as vertices do in traditional graph computation. We are exploring two key implementation issues: representation and partitionining. We are implementing different hypergraph partitioning algorithms and representations within the MESH framework to understand their performance and tradeoffs. We will study the scalability and performance of these techniques and other optimizations across a number of real-world datasets as well as algorithms, to understand their efficacy for real applications.

Decentralized Distributed Deep Learning in TensorFlow

We have Tensorflow-based decentralized distributed deep learning framework. You can find the details here.

Sponsor

This material is based upon work supported by the National Science Foundation under Grant No. IIS-1422802 ( NSF Award Abstract). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.