GSOC: Modular Decomposition(Undirected Graphs)

GSOC Project

This post is a part of Google Summer of Code, 2017 project. I got an opportunity to develop a python module for implementing linear time complexity algorithm which computes modular decomposition of undirected graphs. The module is to be integrated in SageMath library and was developed under the guidance of my mentors David Coudert and Dmitrii Pasechnik. The implementation is based on the algorithm presented in this research paper and is the first open-source python implementation of the algorithm.
There was a ticket created on Sage Trac server for integrating the module in the Sage source code. The ticket associated with the project is 23487. The project has fixed a longstanding stopgap in Sage. public/gsoc17_t23487 is the public branch created for working on providing the modular decomposition code and here is the link to the diff for that branch. Commits made on this branch are provided at this link. Both the diff and commits are also available in the “Branch” section of the ticket. Other than that the major module used is available on the public repository I have created on my Github Account.

What is Modular Decomposition?

Modular Decomposition is decomposition of a graph =(V,E) into modules, where module is a subset of the set Vof vertices, and a generalisation the concept of  connected components of  graphs. Let X be a module. For any vertex v, where v ∉ X (v does not belong to X) v is either connected to or disconnected from every vertex of X. Another way to define module is to say that every vertex in the module X has same set of neighbours in the graph outside of X. In particular, V itself, and single-element  subsets of V are modules.
There are three types of modules:
  1. Parallel: X forms a parallel module if the induced on X subgraph is disconnected.
  2. Series: X forms a series module if the induced on X subgraph is connected but its complement is disconnected.
  3. Prime: X forms a prime module if both the induced on X subgraph and its complement are connected.
A module can be a subset of another module. This gives rise to the concept of modular decomposition. The graph can be represented as a hierarchy of modules represented as modular decomposition tree. Each node in this tree represents a maximal module (set of all vertices in the subtree rooted at the node) and children of the node are maximal modules which are subsets of this module. Let C be the set of vertices in the parent module and C1, C2, ..., Cn be the set of vertices in the children of the parent module. C is actually a set of vertices in the graph which satisfy the properties of module described above. Let P = {C1, C2, ..., Cn}. Then for i!=j, Ci and Cj are disjoint and Ui=1 to nCi = P. This property makes P a modular partition. Further for i = 1 to n, Ci is maximal i.e. any proper subset of the modular partition P containing Ci is not a module. E.g. let  n = 5 and {C1, C2, C4} form a module then C1 is not a maximal module as {C1, C2, ..., C5} can be replaced by {{C1, C2, C4}, C3, C5} to give a modular partition. The module at the root of the modular decomposition tree is V. Its children are maximal modules which are proper subsets of vertices. Finally the leaves are modules consisting of  single vertices.

Modular Decomposition Algorithm

The modular decomposition algorithm published in the paper was implemented. The algorithm is simpler and more practical than other linear time algorithms published to compute modular decomposition. But it is very tricky to implement this algorithm due to the linear complexity. The algorithm uses recursion to compute the modular decomposition of a graph as the problem can be divided into recursive subproblems. A parallel module is formed if the graph is disconnected. Therefore in a parallel module case the recursive subproblem is computation of modular decomposition tree for the connected components. The below procedure is followed if the graph is connected and it runs in four stages:-
  1. Recursive Subproblem- A source vertex is chosen. The vertices at different distance from the source vertex are separated. Modular decomposition is computed on the subgraph induced by vertices at a particular distance. The modular decomposition trees computed are combined into a forest.
  2. Refinement- The forest is refined based on a set of rules. Refinement is done based on a procedure mentioned in the research paper.
  3. Promotion- The nodes with no children and with a single child are promoted in this stage. Nodes with no children are removed and ones with a single child are deleted and their child is promoted. Such nodes are generated after the refinement stage.
  4. Assembly- This stage takes as input a forest which contains various components. The components are then merged based on a set of rules to form a module. The components merged are removed from the forest and replaced by the module thus giving a hierarchy of modules. Finally a single module is generated which is the modular decomposition tree.

GSOC Experience

It was a wonderful experience contributing to SageMath. I actually contributed a module which required adding a lot of test cases and certainly improved my coding style. The review process of SageMath is great and I really benefitted from it. GSOC gave me an opportunity to know and understand the development cycle for open source organizations and I am grateful to have applied for it. My mentors David Coudert and Dmitrii Pasechnik are very knowledgeable and their guidance was instrumental in bringing the project to completion.

Project Status

The project can be divided into four phases and is currently in the integration phase. There are a few changes which are required for better code readability but otherwise the module is functioning properly. Below are the four phases in which the project can be divided:-
  1. Reading and understanding modular decomposition- I had to read about the algorithm I had to implement for modular decomposition and about modular decomposition in general.
  2. Implementation of algorithm- This naturally was the longest phase. I had to implement the modular decomposition algorithm and for reference I used the java implementation of the same algorithm by Marc Tedder, one of the authors of the algorithm. The java implementation was very useful as it was difficult to understand the algorithm by just referring the research paper.
  3. Testing the algorithm- The algorithm was tested on various graphs and the errors found were rectified. There were some errors for trivial cases like empty graphs and then some errors occurred when resulting modular decomposition tree contained series module nodes with children as series modules and parallel module nodes with children as parallel modules.
  4. Integration of the algorithm- This is where integration of the module into the Sage Library needs to be done. The module was reviewed by my mentors and other Sage community members. There were some important changes made in this phase which made the code more readable and improved the structure of the code.

Acknowledgements

I would like to take this opportunity to thank my mentors David Coudert and Dmitrii Pasechnik for their invaluable support during the course of the project. They guided me in every stage of the project. Further I would like to thank them and Yann Laigle-Chapuy for taking out their time to review the code and mentioning some important changes. At last I would like to thank GSOC for providing the platform for students to be active in open source development. It is through gsoc that I found out about SageMath and got the opportunity to contribute to it.

References

  1. Tedder, M., Corneil, D., Habib, M., & Paul, C. (n.d.). Simple, Linear-time Modular Decomposition.
  2. Modular decomposition. (2017, July 24). Retrieved August 29, 2017, from https://en.wikipedia.org/wiki/Modular_decomposition
  3. Tedder, M. (n.d.). Simpler, Linear-Time Modular Decomposition via Recursive Factorizing Permutations. Retrieved August 29, 2017, from http://www.cs.toronto.edu/~mtedder
  4. Linear time implementation of Modular Decomposition for Undirected Graphs. (n.d.). Retrieved August 29, 2017, from https://trac.sagemath.org/ticket/23487
  5. Undirected-Modular-Decomposition. Retrieved August 29, 2017, from https://github.com/lokeshj1703/Undirected-Modular-Decomposition

Comments