"Partitioning Social Networks for Fast Retrieval of Time-dependent Queries"

Speaker:

David J Stein

Date and Time:

January 27, 2012 - 11:40am - 12:00pm

Presentation Abstract:

Online social network (OSN) queries retrieve many small records generated by different users in the network, and the set of records to be retrieved is time dependent. Current implementations based on hash partitioning spread the accesses for a query over a large number of servers, which significantly degrades response time. Partitioning the OSN friendship graph is difficult because its power-law degree distribution leads to many cross-partition edges. Naive replication requires extra storage that is orders of magnitude larger. We point out that the objective of partitioning is to keep the two-hop neighborhood of a user in one partition, rather than the one-hop neighborhood usually considered. Two-hop neighborhoods are the basic units of retrieval in OSNs and can be much larger than one-hop neighborhoods. We propose to partition the network not only along the spatial dimension of social relations but also along the time dimension, so that users who have communicated in a given period are grouped together. We build an activity prediction graph so that newly created data that are highly likely to be accessed together are kept in one partition. We use a static partitioning method based on KMETIS and a dynamic local partitioning method that requires only a small amount of data movement across partitions. The partitioning results are tested by emulating Facebook page downloads. They show that the static algorithm achieves 5.6 times better data locality than hash-based partitioning, and that the dynamic algorithm achieves 6.4 times better locality while keeping the number of movements small. For both algorithms, almost all queries touch at most 3 partitions.
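
The locality measure behind these comparisons can be illustrated with a minimal Python sketch. It is not the method presented in the talk: it uses neither KMETIS nor the activity prediction graph, and the toy graph, the function names (`two_hop`, `partitions_touched`, `mean_partitions_touched`) and both partition assignments are assumptions made up for illustration. It only counts how many partitions hold a user's two-hop neighborhood under a hash-style assignment and under a hand-made community assignment.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def two_hop(adj, user):
    """Return the user, the user's friends, and the friends of those friends."""
    hood = {user}
    for friend in adj[user]:
        hood.add(friend)
        hood.update(adj[friend])
    return hood


def partitions_touched(adj, user, assign):
    """Number of distinct partitions holding the user's two-hop neighborhood."""
    return len({assign[u] for u in two_hop(adj, user)})


def mean_partitions_touched(adj, assign):
    """Average of partitions_touched over every user in the graph."""
    users = list(adj)
    return sum(partitions_touched(adj, u, assign) for u in users) / len(users)


# Hypothetical toy network: three 4-user cliques chained by two bridge edges.
cliques = [range(0, 4), range(4, 8), range(8, 12)]
edges = [pair for clique in cliques for pair in combinations(clique, 2)]
edges += [(3, 4), (7, 8)]
adj = build_graph(edges)

# Hash-style assignment (user id modulo 3) versus one partition per clique.
hash_assign = {u: u % 3 for u in adj}
community_assign = {u: u // 4 for u in adj}

if __name__ == "__main__":
    print("hash:     ", mean_partitions_touched(adj, hash_assign))
    print("community:", mean_partitions_touched(adj, community_assign))
```

Even in this toy case a single bridge edge pulls a second partition into the two-hop neighborhood of every user in a clique, which hints at why cross-partition edges make two-hop locality hard to achieve.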