Empowerment of patients: mining on-line communities for knowledge

Project synopsis

One in three people in the Netherlands will be diagnosed with cancer at some point in their lives, and cancer remains the nation’s biggest cause of death. The internet lets patients obtain information and get support from fellow patients through on-line discussion forums, with beneficial effects on well-being (van Uden-Kraan, 2009; Batenburg and Das, 2014). In an earlier SIDN Fund-supported project called Patient Forum Miner (2016), advanced text analysis software was developed to automatically analyse forum posts and develop hypotheses. The idea of using patients’ experiences as a knowledge resource to complement curated medical data has attracted a lot of interest at several oncological conferences and at the KNAW conference on citizen science. SIDN Fund is now supporting a PhD research project in which hypothesis generation on the basis of patient forum text mining will be scientifically validated.



Project team: Wessel Kraaij, Gerard van Oortmerssen, Suzan Verberne (Leiden University), Hans Gelderblom (LUMC), Stephan Raaijmakers (TNO).

Key publications:

  • G. van Oortmerssen, S. Raaijmakers, M. Sappelli, E. Boertjes, S. Verberne, N. Walasek and W. Kraaij. Analyzing cancer forum discussions with text mining. Proceedings of RICHMEDSEM 2017.
  • Mies C. van Eenbergen, Lonneke V. van de Poll-Franse, Emiel Krahmer, Suzan Verberne, Floortje Mols. A systematic review of content analysis for online cancer communities. Submitted to Journal of Medical Internet Research (JMIR)
  • Suzan Verberne, Antal van den Bosch, Sander Wubben, Emiel Krahmer (2017). Automatic summarization of domain-specific forum threads: collecting reference data. In Proceedings of the 2017 Conference on Human Information Interaction and Retrieval (CHIIR ’17). ACM, New York, NY, USA, 253-256. DOI: https://doi.org/10.1145/3020165.3022127
  • Suzan Verberne, Maya Sappelli, Djoerd Hiemstra, Wessel Kraaij (2016). Evaluation and analysis of term scoring methods for term extraction. Information Retrieval, Springer. doi:10.1007/s10791-016-9286-2
  • Peter Reichardt, Michael Leahy, Xavier Garcia del Muro, Hans Gelderblom et al., “Quality of Life and Utility in Patients with Metastatic Soft Tissue and Bone Sarcoma: The Sarcoma Treatment and Burden of Illness in North America and Europe (SABINE) Study,” Sarcoma, vol. 2012, Article ID 740279, 11 pages, 2012. doi:10.1155/2012/740279
  • R.B. Trieschnigg, D. Hiemstra, F.M.G. de Jong, and W. Kraaij. A Cross-lingual Framework for Monolingual Biomedical Information Retrieval. In Proceedings of the 19th ACM conference on Information and knowledge management (CIKM ’10), New York, pages 10, 2010. ACM
  • Wessel Kraaij, Marc Weeber, Stephan Raaijmakers, and Rob Jelier. MeSH based feedback, concept recognition and stacked classification for curation tasks. In Proceedings of TREC 2004, 2005. NIST
  • W. Kraaij, M. Spitters, and A. Hulth. Headline extraction based on a combination of uni- and multi-document summarization techniques. In Proceedings of the ACL workshop on Automatic Summarization/Document Understanding Conference (DUC 2002), June 2002. ACL.


Funded by: SIDN Fonds