This course deals with computer science (CS) aspects of social network analysis (SNA), and is open to all students in the master computer science programme at Leiden University.
If you want to participate in the 2022 edition and are in a different programme, then you should contact the lecturer in advance.
Lectures: Fridays from 11:00 to 12:45 in Gorlaeus room C1 (except Oct 14 in Lipsius 011)
Lab sessions: Fridays from 9:00 to 10:45 in Snellius rooms 302/304 and 306/308
Prerequisites: a CS bachelor with courses on Algorithms, Data Structures and Data Mining
Literature: provided papers and book chapters (free and digitally available)
Examination: based on presentation, paper, programming, peer review and participation (no exam)
Brightspace link: 2223-S1 Social Network Analysis for Computer Scientists
Study points: 6 ECTS
Lecturer: dr. Frank Takes (f.w.takes@liacs.leidenuniv.nl, room 157b)
Assistants:
Hanjo Boekhout MSc
(h.d.boekhout@liacs.leidenuniv.nl, room 126),
Rachel de Jong MSc
(r.g.de.jong@liacs.leidenuniv.nl, room 126),
Yasmin Kareem BSc and Marton Menyhert BSc.
Network with 1458 nodes and 1948 edges.
Week | Date | Lecture (11:00-12:45) | Lab session (9:00-10:45)
1. | Fri Sep 9, 2022 | Lecture 0: Course information; Lecture 1: Introduction | No lab session in the first week
2. | Fri Sep 16, 2022 | Lecture 2: Advanced concepts and centrality | Introduction to Gephi; work on Assignment 1
3. | Fri Sep 23, 2022 | No lecture | Introduction to NetworkX; work on Assignment 1
4. | Fri Sep 30, 2022 | Lecture 3: Network projection and community structure | Work on Assignment 1
 | Mon Oct 3, 2022 | Deadline for Assignment 1 (hand in via Brightspace) |
5. | Fri Oct 7, 2022 | Lecture 4: Structure of the web and propagation-based centrality; example presentation | Course project planning; work on Assignment 2
6. | Fri Oct 14, 2022 | Lecture 5: Network evolution and walks | Data Science Lab; work on Assignment 2
7. | Fri Oct 21, 2022 | Lecture 6: Recent advances in network science | Work on Assignment 2
 | Mon Oct 24, 2022 | Deadline for Assignment 2 (hand in via Brightspace) |
Session 1 (9:00-10:05) | Session 2 (10:20-11:25) | Session 3 (11:40-12:45) | ||
8. | Fri Oct 28, 2022 | Course project team session | Track A (405, Frank)
21. Graph compression 31. Link prediction |
Track C (405, Frank)
33. Link prediction 53. Shortest paths |
Track B (313, Hanjo)
52. Shortest paths 42. Network embeddings |
Track D (313, Hanjo)
9. Anonymity in networks 39. Network motifs |
|||
9. | Fri Nov 4, 2022 | Work on course project | Track B (402, Frank)
32. Link prediction 12. Centrality estimation |
Track D (402, Frank)
19. Community detection 29. Influence spread and virality |
Track A (313, Hanjo)
26. Influence spread and virality 41. Network embeddings |
Track C (313, Hanjo)
8. Anonymity in networks 13. Centrality estimation |
|||
Thu Nov 10, 2022 | Deadline for having a first version of the course project paper ready in PDF for peer review |
|||
10. | Fri Nov 11, 2022 | Peer review session (we start at 11:00 in Gorlaeus C1) |
||
Session 1 (9:00-10:05) | Session 2 (10:20-11:25) | Session 3 (11:40-12:45) | ||
11. | Nov 18, 2022 | Work on course project | Track C (402, Frank)
43. Network embeddings 28. Influence spread and virality |
Track A (402, Frank)
1. Anomaly detection 11. Centrality estimation |
Track D (313, Hanjo)
49. Sampling from networks 35. Link prediction |
Track B (313, Hanjo)
27. Influence spread and virality 47. Sampling from networks |
|||
12. | Nov 25, 2022 | Track D (402, Hanjo)
34. Link prediction 20. Community detection |
Track D (402, Frank)
44. Network embeddings [no-show] 54. Shortest paths |
Track B (402, Frank)
17. Community detection 2. Anomaly detection |
Work on course project | Track C (313, Hanjo)
3. Anomaly detection 58. Visualization algorithms |
Track A (313, Hanjo)
16. Community detection |
||
Nov 25, 2022 | Optional deadline for preliminary course project paper feedback from course staff (hand in via Brightspace) |
|||
Dec 1, 2022 | Deadline for having a substantial amount of code ready for peer review |
|||
13. | Dec 2, 2022 | Code review session (we start at 11:00 in Snellius 313) |
||
Session 1 (9:00-10:05) | Session 2 (10:20-11:25) | Session 3 (11:40-12:45) | ||
14. | Dec 9, 2022 | Work on course project | Track A (402, Frank)
46. Sampling from networks 51. Shortest paths |
Track C (402, Frank)
18. Community detection 48. Sampling from networks |
Track B (313, Hanjo)
7. Anonymity in networks 22. Graph compression - [no show] |
Track D (313, Hanjo)
14. Centrality estimation |
|||
Dec 18, 2022 |
Deadline for final course project paper (hand in via Brightspace) |
|||
Dec 20, 2022 | Retake deadline for assignments | |||
Dec 23, 2022 | Course end | |||
Jan 31, 2023 | Course project retake deadline |
Code review (week 13).
The main goal of this code review session is to conduct a review of another team's code and experimental setup, and to receive feedback on one's own code. The mandatory deliverable for today is a contribution to the SNACS Collaborative Code Review 2022 Best Practices document (see link below).
What to do? Follow the instructions in the code review session slides.
We use the same division into teams as for the peer review in week 10. In case team pairs are not complete, contact the course staff in Gorlaeus C1 after the introductory lecture.
Please find the team you are matched with, and ask the assistants for help if needed. Each line lists two (or, in one case, three) teams that are matched up.
1 - Anomaly detection - Track A (2) | 2 - Anomaly detection - Track B (2) |
3 - Anomaly detection - Track C (2) | 4 - Anomaly detection - Track D (2) |
6 and 7 - Anonymity in networks - Track B (2) | 58 - Visualization algorithms - Track C (2) |
8 - Anonymity in networks - Track C (2) | 9 - Anonymity in networks - Track D (2) |
11 - Centrality estimation - Track A (2) | 12 - Centrality estimation - Track B (2) |
13 - Centrality estimation - Track C (2) | 14 - Centrality estimation - Track D (1) |
16 - Community detection - Track A (2) | 17 - Community detection - Track B (2) |
18 - Community detection - Track C (2) | 19 - Community detection - Track D (2) |
21 - Graph compression - Track A (2) | 22 - Graph compression - Track B (2) |
26 - Influence spread and virality - Track A (2) | 27 - Influence spread and virality - Track B (2) |
28 - Influence spread and virality - Track C (2) | 29 - Influence spread and virality - Track D (2) |
31 - Link prediction - Track A (2) | 32 - Link prediction - Track B (2) |
33 - Link prediction - Track C (2) | 34 - Link prediction - Track D (2) |
20 - Community detection (2) - Track D (2) | 35 - Link prediction (2) - Track D (2) |
37 - Network motifs - Track B (1) | 39 - Network motifs - Track D (2) |
41 - Network embeddings - Track A (2) | 42 - Network embeddings - Track B (2) |
43 - Network embeddings - Track C (2) | 44 - Network embeddings - Track D (2) |
46 - Sampling from networks - Track A (2) | 47 - Sampling from networks - Track B (2) |
48 - Sampling from networks - Track C (2) | 49 - Sampling from networks - Track D (2) |
51 - Shortest paths - Track A (2) | 52 - Shortest paths - Track B (2) |
53 - Shortest paths - Track C (2) | 54 - Shortest paths - Track D (2) |
The main goal of this lab session is to ensure that your team is all set for making serious progress writing the course project paper in the coming weeks.
Course project paper
Earlier, you made a project plan.
One upcoming deadline is the first version of your paper for the peer review session (November 11).
As always: please ask course staff for help if you are still unsure about certain aspects of your project.
The main goal of this lab session is to get to know the data science lab.
If you are working remotely, learn how to set up remote access to the LIACS Research and Education Laboratory (REL).
Data Science Lab
The data science lab website provides the necessary information and documentation. The email address to contact in case of access issues or other technical problems is rel@liacs.leidenuniv.nl.
Become familiar with the lab and how to run code on it, and how to place data within the lab. This may be handy for Assignment 2 and/or the course project. Remember:
Consider setting up passwordless login, using ProxyJump so that you do not have to log in to the gateway manually, and perhaps using sshfs to mount your remote home directory. Some IDEs also offer all of this functionality.
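As a rough illustration of the ProxyJump setup, the sketch below (in Python, to stay in the language used elsewhere in the course) renders a stanza for the OpenSSH client configuration file `~/.ssh/config`. `Host`, `HostName`, `User`, `ProxyJump` and `IdentityFile` are standard OpenSSH options. The host names, the user name, the alias `dslab` and the key path are placeholders chosen for this example, not the real lab settings, and the helper `render_ssh_config` is hypothetical. It assumes a key pair already exists (for instance created with `ssh-keygen`) and that the public key has been installed on the servers.

```python
def render_ssh_config(alias, host, gateway, user, key_path):
    """Return an OpenSSH client config stanza that reaches `host` via `gateway`."""
    lines = [
        f"Host {alias}",
        f"    HostName {host}",
        f"    User {user}",
        # ProxyJump lets ssh hop through the gateway in a single command.
        f"    ProxyJump {user}@{gateway}",
        # IdentityFile names the private key used for passwordless login.
        f"    IdentityFile {key_path}",
    ]
    return "\n".join(lines) + "\n"


# All values below are placeholders (assumptions), not the real lab settings.
stanza = render_ssh_config(
    alias="dslab",
    host="labnode.example.org",
    gateway="gateway.example.org",
    user="s1234567",
    key_path="~/.ssh/id_ed25519",
)
print(stanza)
```

With such a stanza in place, `ssh dslab` would connect through the gateway in one step, and `sshfs dslab: ~/dslab-home` would mount the remote home directory on an existing local directory (here the hypothetical `~/dslab-home`). Check the data science lab documentation for the actual host names.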
Assignment 2
Continue with the practical part of Assignment 2.
The main goal of this lab session is to get started with both the course project and with Assignment 2.
Course project
Below are some topics you can discuss and investigate together with your project team partner.
Done? Get started with the practical part of Assignment 2.
The main goal of this lab session is to become familiar with
NetworkX (a Python package to analyze networks for research purposes).
All relevant information on NetworkX can be found in the NetworkX online documentation.
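To give a first impression of what working with NetworkX looks like, here is a minimal sketch on a hand-made toy graph. The graph itself is an assumption made for illustration and is not one of the course datasets; the functions used (`nx.Graph`, `add_edges_from`, `nx.density`, `nx.average_clustering`, `nx.diameter`) are part of the NetworkX API.

```python
import networkx as nx

# Toy undirected graph: a triangle (a, b, c) with a tail c - d - e.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")])

n = G.number_of_nodes()
m = G.number_of_edges()
density = nx.density(G)                # fraction of possible edges present
degrees = dict(G.degree())             # node -> number of neighbors
clustering = nx.average_clustering(G)  # mean local clustering coefficient
diameter = nx.diameter(G)              # longest shortest path (graph is connected)

print(f"nodes={n} edges={m} density={density:.2f}")
print(f"degrees={degrees}")
print(f"average clustering={clustering:.3f} diameter={diameter}")
```

Note that `nx.diameter` raises an error on a disconnected graph, so on real data it is usually applied to the largest connected component.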
For this lab session, you need a working Python environment. For this, there are two options:
Instructions for today: Lab session on NetworkX
Done? Proceed with Exercise 2 of Assignment 1.
Looking for a challenge? Check out these three alternatives, which you
can also use instead of NetworkX throughout the course if you prefer,
although less help is available for them:
About social network analysis tools and packages. There exist different tools and packages for social network analysis. In this course, we will introduce you to two of them, with complementary advantages:
Learning goals. The main goal of this lab session is to become familiar with Gephi (experimental beta software to visualize networks for research purposes) and its input format. At the end of this session you should be able to:
There is no deliverable for this lab session, but you are assumed to know the tool afterwards. Practice more at home if needed.
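For practice at home, it can help to generate a small input file yourself. One format that Gephi's spreadsheet import accepts is a CSV edge table with `Source` and `Target` columns. Whether the lab material uses this format or another one (such as GEXF or GraphML) is an assumption here, and the edge list and the helper name `edges_to_gephi_csv` are made up for illustration.

```python
import csv
import io

# A made-up edge list; replace it with your own data.
edges = [("alice", "bob"), ("bob", "carol"), ("carol", "alice")]


def edges_to_gephi_csv(edge_list):
    """Serialize edges as a CSV edge table with Source and Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])  # header row for the edge table
    writer.writerows(edge_list)            # one edge per row
    return buffer.getvalue()


csv_text = edges_to_gephi_csv(edges)
print(csv_text)
```

Writing `csv_text` to a file such as `edges.csv` gives something that can be loaded as an edge table when importing a spreadsheet in Gephi.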
Instructions for today: Lab session on Gephi.
The following steps and tutorials will help you get to know Gephi.
Done? Get started with the practical part of Assignment 1. You can download the smaller data files medium.tsv and large.tsv. If you want to analyze huge.tsv, you will have to get it from the shared folder in the ISSC Linux or LIACS DS lab environment, as stated in the assignment.
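A first step with such a file is usually to read it into memory and count nodes and edges. The sketch below uses only the Python standard library and assumes that each line holds a source and a target node identifier separated by a tab, that edges are directed, and that lines starting with `#` are comments. The actual layout of the course files is described in the assignment and may differ. The in-memory sample stands in for a real file.

```python
import io


def read_edge_list(handle):
    """Read tab-separated 'source<TAB>target' lines into an adjacency dict."""
    adjacency = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        source, target = line.split("\t")[:2]
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())  # register nodes without out-links
    return adjacency


# Tiny stand-in for a real file (assumed format, see the lead-in).
sample = io.StringIO("1\t2\n1\t3\n2\t3\n3\t1\n")
graph = read_edge_list(sample)

num_nodes = len(graph)
num_edges = sum(len(targets) for targets in graph.values())
print(f"{num_nodes} nodes, {num_edges} directed edges")
```

To run it on a downloaded file, pass an open file instead of the sample, for example `with open("medium.tsv") as handle: graph = read_edge_list(handle)`. Storing neighbors in sets silently merges duplicate edges, which may or may not be what the assignment asks for.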
Teams work on a course project for 60% of the course grade. The project is about a certain topic (see list below) related to social network analysis, and the project consists of:
The list of project topics is shown below. Please choose a topic (so, choose a number from 1 to 12) with your team (consisting of two students). Register your choice in Brightspace. Topics can be chosen up to 4 or 5 times; we will split the group into various parallel tracks.
A next step (somewhere around week 6) is that you choose which paper on the chosen topic you present.
If you are retaking the course because you failed the project last year, you cannot choose the same topic as last year.
Note: scientific papers (ACM, Elsevier, etc.) can often only be opened from within the university domain (or from home via university SSH/Citrix/VPN/etc.). IEEE Xplore papers can often be opened by looking them up via computer.org. Alternative links and preprints of papers can often be found through Google Scholar by searching for "Title of the paper". Contact course staff if you have tried all of these options and are still not able to access the paper (do not pay!).
Some students have expressed interest in additional reading material to help freshen up on skills and knowledge required for this course.
See the e-Studyguide for a more general description.
Topics include: SNA from a CS perspective (graph representation, complexity issues, examples), Graph Structure (power law, small world phenomenon, clustering coefficient, hierarchies), Paths and Distances (neighborhoods, radius, diameter), Spidering and Sampling (BFS, forest fire, random walks), Graph Compression (graph grammars, bitwise tricks, encryption, hashing), Centrality (degree centrality, closeness centrality, betweenness centrality, rating and ranking), Centrality and Webgraphs (HITS, PageRank, structure of the web), Community Detection (spectral clustering, modularity), Visualization (force-based algorithms, Gephi), Graph Models (random graphs, preferential attachment), Link Prediction (structure, semantics, prediction algorithms, graph mining), Contagion (diffusion of information, spreading activation, gossiping), Privacy and Anonymity ((de-)anonymizing graphs, ethical aspects, privacy issues), and various other topics that have been added over the years but are not yet in the list above.
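To make one of these topics concrete, the sketch below illustrates Paths and Distances: it computes the radius and diameter of a small graph by running a breadth-first search (BFS) from every node. It is a textbook approach shown under the assumption of a connected, undirected, unweighted graph, not necessarily the method treated in the lectures. The path graph and the function names are made up for this example.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Return the hop distance from `source` to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def radius_and_diameter(adjacency):
    """Radius and diameter of a connected, undirected, unweighted graph."""
    # The eccentricity of a node is its distance to the farthest node.
    eccentricities = {
        node: max(bfs_distances(adjacency, node).values()) for node in adjacency
    }
    return min(eccentricities.values()), max(eccentricities.values())


# Path graph 0 - 1 - 2 - 3 - 4 as an adjacency list.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
radius, diameter = radius_and_diameter(path)
print(f"radius={radius} diameter={diameter}")
```

One BFS per node costs O(n(n + m)) time for n nodes and m edges, which becomes slow on large networks; this is the kind of complexity issue that motivates faster exact and approximate methods.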
The course was also given in 2014, 2015, 2016, 2017, 2018, 2019, 2020 and 2021.