Congratulations! For the next 3 weeks you are employed as the all-round Data Science expert of the largest Leiden-based gaming business, "Thunderstorm Entertainment". One of their games is CookieDestroyer, a popular game played worldwide by people of all ages and genders, which primarily makes money by selling in-game "coins" that allow the gamer to progress through the game more quickly. Your job is to provide the company's management with information on the status, growth and trends in the monetization of CookieDestroyer. To do this, you will use data from the company's sales logs, which contain all coin purchases made by customers. Ultimately, your data analysis, visualization, interpretation and tools should allow company management to make decisions on future steps to expand the game.
For each part of the assignment, the number of points awarded for a perfect answer is listed between brackets; these points sum to a total of 100. Answer each question as precisely as possible; not addressing parts of a question means that fewer points are awarded. Your assignment grade (between 1 and 10, bounds included) is computed by dividing your number of points by 10 and rounding the result to the nearest half. If you get an insufficient grade for the assignment, you can retake it by meeting the assignment retake deadline. Please hand in your work on time: handing in late means that you have failed the assignment and are automatically using the retake deadline. Retake assignment grades have 2 points subtracted from the total. You are allowed to work in teams consisting of exactly two people. For each question, clearly describe how you obtained your answer, and write down any non-trivial assumptions. All practical exercises can be done on the student workstations. Be sure to hand in digitally:
Questions or remarks? Preferably ask them during one of the weekly lectures or lab sessions. In case of urgent questions outside these hours, contact one of the course assistants via e-mail, or ask the lecturer in person.
The goal of the assignment is to:
The report that you hand in for this assignment should contain a short introduction to the data, the company and the dashboard, as well as the answers to the strategic questions.
The data for this assignment comes in three files: sales, methods and countries. The main data table to be studied is sales, which contains a few hundred thousand sales records spanning a time period from 2010 to 2015. It has the following attributes:
Furthermore, there are tables methods and countries which map the methodId and ipCountry fields in the sales table to (anonymized) payment method names and country names.
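As a rough illustration of how the two lookup tables relate to sales, the sketch below joins them in Python, with an in-memory SQLite database standing in for MySQL. Only methodId and ipCountry are named in this assignment; the other column names (saleId, amount, methodName, code, countryName) and all sample rows are hypothetical placeholders, so check the real table definitions before reusing any of this.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Only methodId and ipCountry come from the assignment text;
    -- every other column name and all rows below are hypothetical.
    CREATE TABLE sales (saleId INTEGER, methodId INTEGER, ipCountry TEXT, amount REAL);
    CREATE TABLE methods (methodId INTEGER, methodName TEXT);
    CREATE TABLE countries (code TEXT, countryName TEXT);
    INSERT INTO sales VALUES (1, 10, 'NL', 4.99), (2, 11, 'DE', 9.99), (3, 10, 'NL', 1.99);
    INSERT INTO methods VALUES (10, 'method_A'), (11, 'method_B');
    INSERT INTO countries VALUES ('NL', 'Netherlands'), ('DE', 'Germany');
""")


def sales_with_names(connection):
    """Return each sale together with its payment method name and country name."""
    query = """
        SELECT s.saleId, m.methodName, c.countryName, s.amount
        FROM sales AS s
        LEFT JOIN methods AS m ON m.methodId = s.methodId
        LEFT JOIN countries AS c ON c.code = s.ipCountry
        ORDER BY s.saleId
    """
    return connection.execute(query).fetchall()


rows = sales_with_names(conn)
```

A LEFT JOIN is used here so that sales with an unknown method or country still show up (with NULL names) instead of silently disappearing from the result.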
Each table is available in .sql format and in .csv format.
The files can be found here:
sales .csv .sql
methods .csv .sql
countries .csv .sql
For some additional instructions, see below.
The files can also be found in the shared UNIX folder /vol/share/groups/liacs/scratch/DSPM/.
A simple way to load data from file.sql into MySQL is via the command line, for example as follows:
mysql -h mysql.liacs.leidenuniv.nl -u username -p username < file.sql
Here, the first username is your ULCN username; the -p option makes the client prompt for your MySQL password. The second username selects the database, whose name is equal to your username.
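If you would rather explore the .csv files locally first, something along the following lines could work. It is only a sketch in Python with SQLite, not the MySQL route described above, and it assumes that each .csv file is comma-separated with a header row, which you should verify. The function name load_csv and the sample data are made up for the example.

```python
import csv
import io
import sqlite3


def load_csv(connection, table, csv_file):
    """Create `table` from the CSV header row and insert all data rows as text."""
    reader = csv.reader(csv_file)
    header = next(reader)  # assumption: the first row holds the column names
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    connection.execute(f'CREATE TABLE "{table}" ({columns})')
    connection.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})', reader
    )
    connection.commit()
    return len(header)


# Hypothetical sample standing in for one of the real .csv files.
conn = sqlite3.connect(":memory:")
sample = io.StringIO("methodId,name\n1,method_A\n2,method_B\n")
n_columns = load_csv(conn, "methods", sample)
loaded = conn.execute('SELECT * FROM "methods" ORDER BY "methodId"').fetchall()
```

For a real file, pass an open file object instead of the StringIO sample, for example open("methods.csv", newline=""). All values are stored as text in this sketch, so cast numeric columns before aggregating.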
The web-based dashboard consists of various (at least four) widgets that should each visualize (in a different way) the following aspects of the data:
You can use any programming language, scripting language, markup language, or framework (as long as it is open-source), but here are some hints.
Not sure what to use? Go with the "easy mode" (what you learned in your first year):
- Data in the MySQL database (perhaps create some relevant indexes).
- One HTML page with the dashboard, styled using Twitter Bootstrap.
- One PHP script that serves relevant data in JSON format.
You could start with the skeleton code.
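To make the "index plus JSON" idea concrete, here is a minimal sketch of the server-side step. It is written in Python with SQLite rather than in PHP with MySQL, and it is not the skeleton code. The amount column, the index name and the sample rows are hypothetical; only ipCountry is taken from the data description.

```python
import json
import sqlite3

# In-memory SQLite stands in for the MySQL database, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'amount' and all rows are hypothetical; only ipCountry is a documented field.
    CREATE TABLE sales (ipCountry TEXT, amount REAL);
    INSERT INTO sales VALUES ('NL', 5.0), ('NL', 2.5), ('DE', 10.0);
    -- An index on the grouping column may speed up per-country aggregation.
    CREATE INDEX idx_sales_country ON sales (ipCountry);
""")


def revenue_per_country_json(connection):
    """Aggregate revenue per country and serialize it as a JSON array."""
    rows = connection.execute(
        "SELECT ipCountry, SUM(amount) FROM sales "
        "GROUP BY ipCountry ORDER BY ipCountry"
    ).fetchall()
    payload = [{"country": country, "revenue": revenue} for country, revenue in rows]
    return json.dumps(payload)


body = revenue_per_country_json(conn)
```

A dashboard widget could then fetch such a JSON array and hand it to a charting library; doing the aggregation in the database keeps the response small even with a few hundred thousand sales records.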
The following questions should be answered by querying the data or using the dashboard. For each question, always motivate your answer thoroughly based on the data, for example by giving the queries you used or instructions for using the dashboard.
Good luck with the assignment! Ask questions. A lot, if you have to. The deadline is posted on the course website.