Mining Structured Data
What is "Gaston"?

Gaston is a unified GrAph, Sequences and Tree extractiON algorithm. Given a database of graphs, Gaston searches for all frequent subgraphs of that database, that is, all subgraphs that occur in at least a given minimum number of the graphs.
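
To make the notion of a frequent subgraph concrete, here is a minimal Python sketch restricted to the smallest possible patterns, single labeled edges. It is not Gaston's code: the graph encoding (a dict with "labels" and "edges"), the function names and the toy database are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical encoding: each graph is a dict holding node labels and
# a list of (node, node, edge_label) triples for undirected edges.
def edge_patterns(graph):
    """Return the set of distinct labeled single-edge patterns in one graph."""
    labels = graph["labels"]
    patterns = set()
    for u, v, edge_label in graph["edges"]:
        # Sort the endpoint labels so both directions of an undirected edge match.
        a, b = sorted((labels[u], labels[v]))
        patterns.add((a, edge_label, b))
    return patterns

def frequent_edges(database, min_support):
    """Keep the edge patterns that occur in at least min_support graphs."""
    support = Counter()
    for graph in database:
        # A pattern counts once per graph, however often it occurs inside it.
        support.update(edge_patterns(graph))
    return {p: s for p, s in support.items() if s >= min_support}

# Toy database of three small molecule-like graphs (made up for this example).
database = [
    {"labels": {0: "C", 1: "C", 2: "O"},
     "edges": [(0, 1, "single"), (1, 2, "double")]},
    {"labels": {0: "C", 1: "O"},
     "edges": [(0, 1, "double")]},
    {"labels": {0: "C", 1: "C", 2: "N"},
     "edges": [(0, 1, "single"), (1, 2, "single")]},
]

result = frequent_edges(database, min_support=2)
```

With a minimum support of 2, the C-C single bond and the C-O double bond are kept, while the C-N single bond, which occurs in only one graph, is dropped. A full miner would go on to extend the surviving patterns into larger subgraphs.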


June 8, 2005: GASTON was presented at the 7th ICCS conference by Jeroen Kazius. More information can be found here.
May 7, 2005: A new update of GASTON is available at the download page.
How does Gaston work?

Gaston finds all frequent subgraphs using a level-wise approach: it first considers simple paths, then more complex trees, and finally the most complex cyclic graphs. In practice, most frequent graphs turn out not to be very complex structures; Gaston uses this "quickstart" observation to organize the search space efficiently. To determine the frequency of graphs, Gaston employs an occurrence-list-based approach in which all occurrences of a small set of graphs are stored in main memory. You can find more information here.
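
The three classes of structure mentioned above can be told apart with a simple edge and degree count. The following Python sketch illustrates that distinction only; it is not taken from the Gaston source, the function name classify is hypothetical, and it assumes the input is a connected, simple, undirected graph.

```python
from collections import defaultdict

def classify(num_nodes, edges):
    """Classify a connected, simple, undirected graph.

    Returns 'path', 'tree' or 'cyclic'. Connectivity is assumed, not checked.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # A connected graph with at least as many edges as nodes contains a cycle.
    if len(edges) >= num_nodes:
        return "cyclic"
    # Otherwise the graph is a tree; it is a path if no node branches.
    if all(d <= 2 for d in degree.values()):
        return "path"
    return "tree"

chain = classify(4, [(0, 1), (1, 2), (2, 3)])   # four nodes in a line
star = classify(4, [(0, 1), (0, 2), (0, 3)])    # one centre with three leaves
ring = classify(3, [(0, 1), (1, 2), (2, 0)])    # a triangle
```

Here chain is classified as a path, star as a tree, and ring as cyclic, matching the order in which the search considers these classes.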

Can I use Gaston?

Yes. You can download the code here. The code is distributed under the GPL.

E-mail: snijssen at