About this benchmark
This is the homepage of The Insurance Company (TIC) Benchmark. This datamining benchmark dataset is ideally suited for testing your datamining algorithms or using it as a case for datamining lab sessions. The data was supplied by Sentient Machine Research. The main question is:
Can you predict who would be interested in buying a caravan insurance policy and give an explanation why?
The CoIL Challenge 2000 (the official TIC format)
The CoIL Challenge was a datamining competition organized by the Computational Intelligence and Learning Cluster, a network of excellence sponsored by the EU. It was held from March to May 2000, and 43 solutions were submitted in total. The winners were Charles Elkan for the prediction task and Nick Street and YongSeog Kim for the description task.
To learn more about the CoIL Challenge, see:
- The CoIL Challenge 2000 report containing a full overview of the challenge tasks and results, a data description, short papers and extended abstracts on 29 solutions (ps & pdf).
- A front cover article "Lessons about Self-learning", from Synergy 3, Autumn 2000, on lessons learned from the Challenge.
- The original problem and data descriptions
- The data used for the challenge; this is the 'official' TIC Benchmark format and it has been posted in the KDD archive at Irvine. It may be used only for non-commercial research and education purposes, not for commercial education or demo purposes.
Please cite this reference when referring to the TIC Benchmark / CoIL Challenge 2000 data:
P. van der Putten and M. van Someren (eds). CoIL Challenge 2000: The Insurance Company Case. Published by Sentient Machine Research, Amsterdam. Also a Leiden Institute of Advanced Computer Science Technical Report 2000-09. June 22, 2000. See http://liacs.leidenuniv.nl/~puttenpwhvander/tic.html
An in-depth discussion of the challenge appeared in a special issue of the Machine Learning journal, so this can be used as an alternative reference:
P. van der Putten and M. van Someren. A Bias-Variance Analysis of a Real World Learning Problem: The CoIL Challenge 2000. Machine Learning, October 2004, vol. 57, iss. 1-2, pp. 177-195, Kluwer Academic Publishers
An extended and updated version of this paper appeared as a chapter in my PhD thesis:
Peter van der Putten. On Data Mining in Context: Cases, Fusion and Evaluation. PhD Thesis, Leiden Institute of Advanced Computer Science (LIACS), Leiden University. January 19, 2010.
Another interesting perspective was given in this KDD paper by the winner of the prediction task, Charles Elkan:
Charles Elkan. Magical Thinking in Data Mining: Lessons From CoIL Challenge 2000. Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining (KDD'01).
The Benelearn-99 Competition
Note: this is a slightly adapted version of the CoIL Challenge, the 'official' format.
Reference: P. van der Putten and M. van Someren (eds). The Benelearn 1999 Competition. SWI, University of Amsterdam, November 2, 1999.
- "UI management sciences team wins data mining competition", Press Release University of Iowa, Henry B. Tippie College of Business
- SQL, Data Mining, & Genetic Programming, an article by Brian Conolly on applying GAs in SQL to the TIC case. Dr. Dobb's Journal, April 2004.