Marvin Meeng

I have been a scientific programmer at LIACS (Leiden Institute of Advanced Computer Science) since March 2009, under the supervision of Arno Knobbe. Though now mainly occupied with Data Mining, I obtained my MSc in Cognitive Artificial Intelligence at the Philosophy department of Utrecht University. Initially oriented towards philosophical subjects related to the human mind, my focus later shifted towards more practical areas related to machine learning. This resulted first in a thesis about a Dual Purkinje Image Eyetracker at the Experimental Psychology division, and then in my current employment. My current main research areas and projects are:

Multi-Relational Data Mining
MRDM is the art of mining useful knowledge from structured data stored in relational databases. Because such data is spread over multiple related tables, traditional single-table techniques cannot be applied directly. The field of MRDM is concerned with generalising common Data Mining concepts to a multi-table setting. The Data Mining software package Safarii is the result of Arno's thesis and is still relevant to all our projects. As a result, the software is under continuous development.
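To make the single-table versus multi-table distinction concrete, here is a minimal sketch of one simple way to bridge the two: summarising a one-to-many table with aggregates so that each target row becomes a single flat record. The tables, column names and the `flatten` helper are hypothetical and chosen only for illustration; this is not a description of how Safarii works.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical target table: one row per compound.
compounds = [
    {"compound_id": 1, "weight": 180.2},
    {"compound_id": 2, "weight": 151.2},
    {"compound_id": 3, "weight": 206.3},
]

# Hypothetical related table: zero or more rows per compound.
assays = [
    {"compound_id": 1, "potency": 6.0},
    {"compound_id": 1, "potency": 7.0},
    {"compound_id": 2, "potency": 5.0},
]


def flatten(targets, related, key, value):
    """Attach count and mean of `value` from `related` to each target row."""
    groups = defaultdict(list)
    for row in related:
        groups[row[key]].append(row[value])

    flat = []
    for row in targets:
        values = groups.get(row[key], [])
        flat.append({
            **row,
            f"{value}_count": len(values),
            # A target without related rows has no mean; mark it as missing.
            f"{value}_mean": mean(values) if values else None,
        })
    return flat


flat_table = flatten(compounds, assays, "compound_id", "potency")
for row in flat_table:
    print(row)
```

The flat table can be handed to any single-table method, but the aggregates discard part of the relational structure, which is one reason to generalise the mining concepts themselves instead.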

NCSB-NBIC Systems Bioinformatics
Through our involvement in NBIC (Netherlands Bioinformatics Centre), part of the NCSB (Netherlands Consortium for Systems Biology), we are currently participating in two projects:

CMSB
A cooperation with the CMSB (Centre for Medical Systems Biology), a joint activity of Leiden University Medical Center, Leiden University, Free University Medical Center and Free University Amsterdam, Erasmus MC Rotterdam and TNO Leiden. The aim is to combine the results of their text mining facilities with our subgroup discovery expertise in order to develop an alternative form of gene set enrichment. Traditional microarray data analyses suffer from a number of drawbacks. First, because of the high number of genes measured and the often limited number of subjects, genes may appear to be relevant purely by chance. Second, interpreting the results requires incorporating background knowledge, preferably from various sources. Gene Ontology annotations are often used for enriching the results, but these are not always up to date because of the slow curation procedure, and they are rather limited by design. Using alternative methods, we try to address both of the aforementioned problems.
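The first drawback can be illustrated with a small simulation. The sketch below assumes that no gene has any real effect, in which case each gene's p-value is uniformly distributed between 0 and 1, and counts how many genes still pass a significance threshold. The function name, the number of genes and the threshold are hypothetical values chosen for illustration only.

```python
import random


def count_chance_hits(n_genes, alpha, seed=0):
    """Count genes that look significant although none has a real effect.

    Under the null hypothesis a p-value is uniform on [0, 1), so each
    gene passes the threshold `alpha` with probability `alpha`.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_genes))


n_genes = 20000  # hypothetical number of genes on the array
alpha = 0.05     # hypothetical per-gene significance threshold

hits = count_chance_hits(n_genes, alpha)
print(f"expected about {n_genes * alpha:.0f} chance hits, simulated {hits}")
```

With these settings roughly a thousand genes would be flagged by chance alone, which is why results are usually corrected for multiple testing or interpreted at the level of gene sets.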

LACDR
A cooperation with LACDR (Leiden/Amsterdam Centre for Drug Research). Together with the group of Andreas Bender we analyse the StARlite database (now ChEMBLdb), which contains numerous tables describing various aspects of drug compounds. Because this is a multi-table database, it offers a perfect opportunity for analyses with the Safarii software package mentioned above. The results should give better insight into which aspects of compound structure or compound properties are essential for successfully targeting the various drug targets of interest to medicinal chemistry.

