This course deals with computer science (CS) aspects of social network analysis (SNA), and is open to all students in the master computer science programme at Leiden University.
If you want to participate and are in a different programme, then you should contact the lecturer in advance.
Lectures: Fridays from 11:00 to 12:45 (Sep. 7 - Dec. 7) in Snellius room 174
Lab sessions: Fridays from 9:00 to 10:45 in room 306/308
Prerequisites: a CS bachelor with courses on Algorithms, Data Structures and Data Mining
Literature: provided papers and book chapters (free and digitally available)
Examination: based on presentation, paper, programming, peer review and participation (no exam)
Study points: 6 ECTS
Lecturer: dr. Frank Takes - f.w.takes@liacs.leidenuniv.nl, room 157b
Assistant lecturer: Anna Latour MSc - a.l.d.latour@liacs.leidenuniv.nl, room 123
Course assistant: Antonio Barata MSc - a.p.pereira.barata@liacs.leidenuniv.nl, room 150
Student assistant: Hanjo Boekhout BSc - h.d.boekhout@umail.leidenuniv.nl
Network with 1458 nodes and 1948 edges.
| No. | Date | First slot (9:00-10:45) | Second slot (11:00-12:45) |
|---|---|---|---|
| 1. | Sep 7, 2018 | No activities | Lecture 0: Course organization; Lecture 1: Introduction and small world phenomenon |
| 2. | Sep 14, 2018 | Gephi tutorial; Work on Assignment 1 | Lecture 2: Advanced concepts & centrality (including course project information) |
| 3. | Sep 21, 2018 | NetworkX tutorial; Work on Assignment 1 | Lecture 3: Network projection & community detection; Lecture 3.1: Presentation 0 |
| 4. | Sep 28, 2018 | Work on Assignment 1 | Lecture 4: Networks on the web |
| | Oct 1, 2018 | Deadline for Assignment 1: Click here to upload your assignment | |
| 5. | Oct 5, 2018 | DS Lab, Course project planning; Work on Assignment 2 | Lecture 5: Network dynamics & Processes on networks |
| 6. | Oct 12, 2018 | Work on Assignment 2 | Lecture 6: Network motifs & Network science applications |
| 7. | Oct 19, 2018 | Work on Assignment 2 | Student presentation 1 (11:00): Network sampling; Student presentation 2 (12:00): Community detection 2 |
| 8. | Oct 26, 2018 | Course project paper writing; Work on Assignment 2 | Student presentation 3 (11:00): Influence spread and virality 1; Student presentation 4 (room 174, 12:00): Top-k closeness centrality; Student presentation 5 (room 313, 12:00): Shortest paths 1 |
| | Oct 29, 2018 | Deadline for Assignment 2: Click here to upload your assignment | |
| 9. | Nov 2, 2018 | Work on course project paper | Student presentation 6 (room 174, 11:00): Influence spread and virality 2; Student presentation 7 (room 174, 12:00): Betweenness centrality; Student presentation 8 (room 408, 11:00): Neighborhoods; Student presentation 9 (room 408, 12:00): Diameter computation |
| | Nov 9, 2018 | Deadline for the first half of the course project paper (for the peer review session) | |
| 10. | Nov 9, 2018 | Work on course project | Peer review session |
| 11. | Nov 16, 2018 | Work on course project | Student presentation 10 (room 174, 11:00): Counting triangles; Student presentation 11 (room 174, 12:00): Visualization algorithms 2; Student presentation 12 (room 408, 11:00): Community detection 1; Student presentation 13 (room 408, 12:00): Link prediction |
| | Nov 19, 2018 | Deadline (optional) for preliminary version of course project paper: Click here to upload | |
| 12. | Nov 23, 2018 | Work on course project code and experimental pipeline for parameter tuning, data processing, results, plotting, etc. | Student presentation 14 (room 174, 11:00): Shortest paths 2; Student presentation 15 (room 174, 12:00): Graph compression; Student presentation 16 (room 408, 11:00): De-anonymization of networks; Student presentation 17 (room 408, 12:00): Network data errors |
| | Nov 30, 2018 | Deadline for a substantial amount of code / experimental pipeline of the course project (for the code review session) | |
| 13. | Nov 30, 2018 | Code review session | Plenary student presentation 21, directly followed by the course evaluation from 11:45 to 12:00; Student presentation 19 (room 174, 12:00): Personalized PageRank; Student presentation 22 (room 408, 12:00): Visualization algorithms 1 |
| | Dec 7, 2018 | Work on course project paper | No lecture |
| | Dec 12, 2018 | Deadline for the final version of the course project paper: Click here to upload | |
| | Dec 19, 2018 | Retake deadline for assignments (hand in via e-mail to the lecturer) | |
| | Dec 19, 2018 | Evaluation meetings in room 157b: 9:00-9:30 Closeness centrality; 9:30-10:00 Betweenness centrality; 10:00-10:30 Community detection 1; 10:30-11:00 Closeness top-k; 11:00-11:30 Virality 2; 11:30-12:00 Network data errors; 12:00-12:30 Community detection 2; 12:30-13:00 Graph compression; 13:00-13:30 Personalized PageRank; 13:30-14:00 Sampling; 14:00-14:30 Shortest paths 2; 15:30-16:00 Triangles; 16:00-16:30 Visualization 2 | |
| | Dec 20, 2018 | Evaluation meetings in room 157b: 10:30-11:00 De-anonymization; 11:30-12:00 Diameter computation; 12:00-12:30 Link prediction; 14:00-14:30 Neighborhoods; 15:00-15:30 Shortest paths 1; 15:30-16:00 Visualization 1; 16:00-16:30 Virality 1 | |
| | Dec 21, 2018 | Course end. Grades will be submitted to the student administration | |
| | Jan 31, 2019 | Deadline for all remaining course project retakes and assignments | |
The main goal of this lab session is for pairs of teams to learn from each other how to organize code, experiments and data for the course project.
Ask a course assistant to be assigned to a team upon entering the computer room. We start at 9:00.
The main goal of this lab session is to get started with the course project paper.
Course project
During the lab session three weeks ago, you made a project plan.
One upcoming deadline is the first version of your paper for the peer review session.
Assignment 2
Take the opportunity to ask final questions about Assignment 2.
The main goal of this lab session is to get started with both the course project and with Assignment 2, and to get to know the data science lab.
Data Science Lab
The data science lab website provides necessary information and documentation.
Become familiar with the lab, how to run code on it, and how to place data within it. Remember: your home directory is for your own code, /local is for local storage on the machine you are currently on, and /data is for storing data across data science lab machines. You may want to use the lab's facilities for course assignments and your course project.
Course project
Below are some topics you can discuss and investigate together with your project team partner.
Assignment 2
Get started with the practical part of Assignment 2.
Note that the large Twitter dataset is also available in the data science lab in the folder /data/SNACS/.
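Because the dataset is large, it can help to process it one line at a time rather than loading it into memory at once. The following is a minimal sketch of that idea, assuming a plain-text edge list with one whitespace-separated "source target" pair per line and optional `#` comment lines; the actual file names and format in /data/SNACS/ are not specified here, so `degree_counts` and the sample data are purely illustrative.

```python
from collections import Counter
from typing import Iterable


def degree_counts(lines: Iterable[str]) -> Counter:
    """Count how many edge endpoints each node has, reading one line at a time."""
    degrees = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # Assumed format: the first two fields are source and target.
        source, target = line.split()[:2]
        degrees[source] += 1
        degrees[target] += 1
    return degrees


# Small in-memory sample standing in for a real edge list file.
sample = ["# source target", "a b", "a c", "b c", "c d"]
degrees = degree_counts(sample)
for node, degree in degrees.most_common(2):
    print(node, degree)
```

An open file handle can be passed instead of the list, since Python file objects also yield one line per iteration.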
We will again use the UNIX/Linux workstations in Snellius room 306/304 for this lab session.
There is again no deliverable for this lab session.
The main goal of this lab session is to become familiar with NetworkX (a Python package to analyze networks for research purposes).
All relevant information on NetworkX can be found in the NetworkX online documentation.
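As a first exercise, something like the following sketch may be a useful starting point. It builds a small toy network (invented for illustration, not course data) and computes a few of the structural properties discussed in the lectures. It assumes NetworkX is importable as `networkx`.

```python
import networkx as nx

# Toy undirected network: a triangle (a, b, c) with a tail c - d - e.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")])

n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()

# Degree per node, and degree centrality (degree divided by n - 1).
degree = dict(G.degree())
centrality = nx.degree_centrality(G)

# Average local clustering coefficient and diameter (longest shortest path).
avg_clustering = nx.average_clustering(G)
diameter = nx.diameter(G)

print(f"{n_nodes} nodes, {n_edges} edges")
print("degree:", degree)
print("average clustering:", round(avg_clustering, 3))
print("diameter:", diameter)
```

Note that `nx.diameter` requires a connected graph; for real datasets you may first need to extract the largest connected component.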
Python instructions to use NetworkX: NetworkX should be installed on all machines.
Unfortunately, past years have taught us that package availability is not always consistent across these machines. In some years, instructions to get NetworkX running could be found in /vol/share/software/datascience/2016-README.
If you run into issues, it may be productive to forget about preinstalled packages and use Anaconda and install required packages yourself.
To use Python in your own environment:
wget http://repo.continuum.io/archive/Anaconda3-4.3.0-Linux-x86_64.sh
bash Anaconda3-4.3.0-Linux-x86_64.sh
Accept questions with 'yes'.
source anaconda3/bin/activate
You should now have your own environment in which you can install packages as you wish.
Looking for a challenge? Check out Graph-Tool, a Python graph analysis toolkit that leaves the hard computation to parallel (OpenMP) C++ code. Or SNAP, which is written entirely in C++ (although there is now also a Python plugin).
We will use the UNIX/Linux Ubuntu workstations in the Snellius computer rooms for this lab session.
This ICT FAQ may provide links to more information on how to use these workstations.
If you really must, you can attempt to use Windows or the even more unstable version of Gephi for Macintosh, at the risk of bugs and problems that course personnel can probably not help you solve.
The main goal of this lab session is to become familiar with Gephi (experimental beta-software to visualize networks for research purposes) and its input format. At the end of this session you should be able to import and visualize raw network data with labeled nodes and labeled and/or weighted edges (directed or undirected), and you should understand how to map edge and node size and color to structural network properties such as the node degree. You should be able to export computed node data and be able to export a vector graphic PDF of your network. There is no deliverable for this lab session.
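To prepare raw network data for import, one option is to write a node table and an edge table as CSV files. The sketch below assumes the column names that Gephi's spreadsheet import commonly recognizes (`Id` and `Label` for nodes; `Source`, `Target`, `Type` and `Weight` for edges); check these against the Gephi version you are using. The function name `write_gephi_csv` and the sample edges are hypothetical.

```python
import csv
import io


def write_gephi_csv(edges, nodes_out, edges_out, directed=False):
    """Write (source, target, weight) triples as node and edge tables.

    nodes_out and edges_out are writable text streams. For real files,
    open them with newline="" as the csv module recommends.
    """
    edge_type = "Directed" if directed else "Undirected"
    seen = set()
    node_ids = []  # nodes in order of first appearance

    edge_writer = csv.writer(edges_out)
    edge_writer.writerow(["Source", "Target", "Type", "Weight"])
    for source, target, weight in edges:
        edge_writer.writerow([source, target, edge_type, weight])
        for node in (source, target):
            if node not in seen:
                seen.add(node)
                node_ids.append(node)

    node_writer = csv.writer(nodes_out)
    node_writer.writerow(["Id", "Label"])
    for node in node_ids:
        # Here the node identifier doubles as its label.
        node_writer.writerow([node, node])


# Usage with in-memory buffers and invented example data.
nodes_buf, edges_buf = io.StringIO(), io.StringIO()
write_gephi_csv([("alice", "bob", 2), ("bob", "carol", 1)], nodes_buf, edges_buf)
print(edges_buf.getvalue())
print(nodes_buf.getvalue())
```

Keeping nodes and edges in separate tables makes it straightforward to add further node attribute columns later.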
Teams work on a course project, which counts for 60% of the course grade, is about a certain topic related to social network analysis, and consists of:
Instructions for the course project paper are available.
The list of projects is shown below. Take a look at the paper before choosing a topic. E-mail the lecturer at f.w.takes@liacs.leidenuniv.nl with your topic preference and the names of the two team members. First come, first served. Your topic is not confirmed until you receive a confirmation e-mail from the lecturer. You have until September 28 to e-mail your preferred topic. After that, a final division of topics and formation of teams will be made on October 6 during the lecture. Prefix [Chosen] indicates that a topic is no longer available.
This list is still updated regularly as teams are formed and topics are chosen.
Note: scientific papers (ACM, Elsevier, etc.) can often only be opened from within the university domain (or from home via SSH/VPN/etc.).
IEEE Xplore papers can often be opened by looking them up via computer.org.
Alternative links and preprints of papers can often be found through Google Scholar by typing in the title of the paper.
Some students have expressed interest in additional reading material to help freshen up on skills and knowledge required for this course.
See the e-Studyguide for a more general description.
Topics include: SNA from a CS perspective (graph representation, complexity issues, examples), Graph Structure (power law, small world phenomenon, clustering coefficient, hierarchies), Paths and Distances (neighborhoods, radius, diameter), Spidering and Sampling (BFS, forest fire, random walks), Graph Compression (graph grammars, bitwise tricks, encryption, hashing), Centrality (degree centrality, closeness centrality, betweenness centrality, rating and ranking), Centrality and Webgraphs (HITS, PageRank, structure of the web), Community Detection (spectral clustering, modularity), Visualization (force-based algorithms, Gephi, NodeXL), Graph Models (random graphs, preferential attachment), Link Prediction (structure, semantics, prediction algorithms, graph mining), Contagion (diffusion of information, spreading activation, gossiping), Privacy and Anonymity ((de-)anonymizing graphs, ethical aspects, privacy issues), and various other topics that have been added over the years but are not yet in the list above.