Projects & PhD students

Current projects

  • DoSSIER: Domain Specific Systems for Information Extraction and Retrieval (co-applicant), 2020-2024. Funded by the European Union’s Horizon 2020 Innovative Training Networks (ITN) programme under the Marie Skłodowska-Curie actions
  • RISE_SMA: Social Media Analytics for Society and Crisis Communication (co-applicant), 2019-2022. Funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement
  • Curriculum Development in Data Science and Artificial Intelligence / DS&AI, 2019-2021. Funded by the Erasmus+ programme, Key Action 2: Cooperation for innovation and the exchange of good practices
  • Digital tools for knowledge extraction for (rare) cancers (co-applicant), Voucher project funded by the Ministry of Health. With 4 cancer patient communities, in collaboration with TNO.
  • SmartFile: from keyboard to patient, and the follow-up project ‘Learning from registration’ (co-applicant), in collaboration with Hogeschool Codarts, the startup company ‘SmartFile’, 10 sports physiotherapy practices, and the Dutch Association for Physical Therapy in Sport Healthcare (funded by RAAK-SIA)

Current projects with bachelor and master students are listed here.

PhD projects supervised

  • Juan Bascur Cifuentes: Interactive visual browsing and retrieval of scientific literature (2019-now, CWTS and Data Science Research Programme, Leiden University)
  • Anne Dirkson: Knowledge Discovery and Data Mining from patient experience repositories (2018-now, LIACS and Data Science Research Programme, Leiden University)
  • Hugo de Vos: PolicyDoc: text mining from European Union documents (2017-now, Data Science Research Programme, Leiden University)
  • Alex Brandsen: Digging in Documents. Utilizing text mining to access the hidden knowledge in archeological grey literature (2017-now, Data Science Research Programme, Leiden University)
  • Gineke Wiggers: Measuring Relevance and Relations of Dutch Legal Publications (2017-now, Data Science Research Programme, Leiden University)
  • Maya Sappelli: Knowledge Work in Context: User Centered Knowledge Worker Support (2011-2016, Institute for Computing and Information Sciences, Radboud University)
  • Eva D’hondt: Cracking the Patent: Using phrasal features to aid patent classification (2009-2014, Centre for Language Studies, Radboud University)

Past projects

  • The reach of junk news on Facebook, in collaboration with Nieuwscheckers.
  • Empowerment of patients (co-applicant), pilot project for hypothesis generation based on text mining from patient forums, in collaboration with TNO and LUMC (funded by SIDN)
  • DISCOSUMO: Discussion Thread Summarization for Mobile Devices (financed by NWO Creative Industries), with Tilburg University and Sanoma Media BV (2015-2019)
  • Wisdom of the crowds, Patient empowerment in online support communities (casus 3) (co-applicant), with RIVM and (2016)
  • PFM: Patient forum mining (co-applicant), with TNO and, financed by SIDN (2016)
  • QUINN (main applicant): Query Updates for News moNitoring (financed by a COMMIT valorization grant), with TNO and LexisNexis (2015)
  • SWELL: smart reasoning systems for well-being at work and at home (financed by COMMIT), with TNO, Philips, Noldus, University of Twente, Roessingh R&D, Innovalor, and Sense/Almende (2012-2016)
  • RemBench (main applicant): A Digital Workbench for Rembrandt Research (financed by CLARIN-NL), with Huygens ING and RKD (2013-2014)
  • Rembrandt Documents: A new digital infrastructure for accessing, analysing, and interpreting original written and printed documents related to the life and art of the world-renowned Dutch painter Rembrandt van Rijn (1606-1669), with Huygens ING and the Rembrandthuis Museum (2012-2013).
  • PoliticalMashup: Automatic classification of political texts, with University of Amsterdam (2012-2013)
  • Route 66 explorative text mining, for De Baak (2012)
  • ComPoli: Communicatie en revalidatie digiPoli, with Sint Maartenskliniek (2011-2012)
  • Extracting Factoids from Dutch texts (main applicant), financed by a Google European Digital Humanities Award (2011-2012)
  • Inventarisatie TST en Onderwijs, for Nederlandse TaalUnie (2011)
  • Feasibility study speech synthesis, for Dedicon (2011)
  • TM4IP: Text Mining for Intellectual Property, financed by MatrixWare (2009-2011)
  • In Search of the Why, PhD project (2005-2009)