This course deals with computer science (CS) aspects of social network analysis (SNA), and is open to all students in the master computer science programme at Leiden University. If you want to participate in the 2023 edition and are in a different programme, then you should contact the lecturer in advance.
Lectures: Fridays from 11:00 to 12:45 in Gorlaeus room C1
Lab sessions: Fridays from 9:00 to 10:45 in Snellius rooms 302/304, 306/308, etc.
Prerequisites: a CS bachelor with courses on Algorithms, Data Structures and Data Mining
Literature: provided papers and book chapters (free and digitally available)
Examination: based on presentation, paper, programming, peer review and participation (no exam)
Brightspace link: 2324-S1 Social Network Analysis for Computer Scientists
Study points: 6 ECTS
Lecturer: dr. Frank Takes (f.w.takes@liacs.leidenuniv.nl, room 157b)
Assistants: Hanjo Boekhout MSc (room 126), Rachel de Jong MSc (room 152), Marton Menyhert MSc, Shaoxuan (Anthony) Zhang, Chao Zhao.
Network with 1458 nodes and 1948 edges.
Date | Lecture (11:00-12:45) | Lab session (9:00-10:45) | ||
1. | Fri Sep 8, 2023 | No lecture | No lab session | |
2. | Fri Sep 15, 2023 |
Lecture 0: Course information Lecture 1: Introduction |
Introduction to Gephi Work on Assignment 1 |
|
3. | Fri Sep 22, 2023 | Lecture 2: Advanced concepts and centrality Lecture 2.5: Course project |
Introduction to NetworkX Work on Assignment 1 |
|
4. | Fri Sep 29, 2023 | Lecture 3: Network projection and community detection | Work on Assignment 1 | |
Mon Oct 2, 2023 |
Deadline for Assignment 1 (AoE; hand in via Brightspace) |
|||
5. | Fri Oct 6, 2023 | Lecture 4: Structure of the web and propagation-based centrality Example presentation |
Course project planning session Work on Assignment 2 | |
6. | Fri Oct 13, 2023 | Lecture 5: Network evolution and model extensions | Data Science Lab introduction Work on Assignment 2 |
7. | Fri Oct 20, 2023 | No lecture | Work on Assignment 2 |
Mon Oct 23, 2023 |
Deadline for Assignment 2 (AoE; hand in via Brightspace) |
|||
Session 1 (11:00-11:50) | Session 2 (12:10-13:00) | Lab session (9:00-10:45) | ||
8. | Fri Oct 27, 2023 | Track A (Snellius 174, Frank)
1. Anomaly detection 33. Network embeddings |
Track C (Snellius 174, Frank)
7. Anonymity in networks 15. Community detection |
Course project paper writing tutorial Work on course project |
Track B (Snellius 412, Hanjo)
18. Core/periphery structure 47. Visualization algorithms |
Track D (Snellius 412, Hanjo)
32. Network motifs 44. Shortest paths |
|||
9. | Fri Nov 3, 2023 | Track A (Snellius 174, Frank)
13. Community detection 21. Influence spread and virality |
Track C (Snellius 174, Frank)
11. Centrality estimation 31. Network motifs |
Work on course project |
Track B (Snellius 412, Hanjo)
6. Anonymity in networks 14. Community detection |
Track D (Snellius 412, Hanjo)
12. Centrality estimation 24. Influence spread and virality |
|||
Fri Nov 10, 2023 |
Deadline for a first (printed) draft version of the Course Project paper (for today's peer review session) |
|||
10. | Fri Nov 10, 2023 | Peer review session (starts 11:00 in Gorlaeus C1) | Work on course project | |
Session 1 (11:00-11:50) | Session 2 (12:10-13:00) | Lab session (9:00-10:45) | ||
11. | Fri Nov 17, 2023 | Track A (Snellius 174, Frank)
_ 45. Temporal networks |
Track C (Snellius 174, Frank)
27. Link prediction 35. Network embeddings |
Work on course project |
Track B (Snellius 412, Hanjo)
2. Anomaly detection 30. Network motifs |
Track D (Snellius 412, Hanjo)
4. Anomaly detection 16. Community detection |
Extra Track (402; 10:15): 18. Core/periphery structure |
||
12. | Fri Nov 24, 2023 | Work on course project | Track C (Snellius 412, Hanjo)
19. Graph compression 23. Influence spread and virality |
Track A (Snellius 412, Hanjo)
9. Centrality estimation 17. Core/periphery structure |
Extra Track A (Snellius 401, Hanjo); 10:00
25. Link prediction 50. Graph compression |
Track D (Snellius 174, Frank)
8. Anonymity in networks 28. Link prediction |
Track B (Snellius 174, Frank)
10. Centrality estimation 42. Shortest paths |
Extra Track D (Snellius 174, Frank); 10:00
51. Network embeddings 40. Sampling from networks |
||
Sun Nov 26, 2023 |
Deadline for a preliminary version of the Course Project paper for course staff feedback (AoE; hand in via Brightspace) |
|||
Fri Dec 1, 2023 |
Deadline for completing a substantial part of the Course Project code & experimental pipeline (for today's code review session) |
|||
13. | Fri Dec 1, 2023 | Code review session; see instructions | Work on course project | |
Session 1 (11:00-11:50) | Session 2 (12:10-13:00) | Lab session (9:00-10:45) | ||
14. | Fri Dec 8, 2023 | Track C (Snellius 412, Hanjo)
3. Anomaly detection 43. Shortest paths |
Track A (Snellius 412, Hanjo)
5. Anonymity in networks 29. Network motifs |
Work on course project |
Track D (Snellius 174, Frank/Rachel)
20. Graph compression 48. Visualization algorithms |
Track B (Snellius 174, Frank/Rachel)
26. Link prediction 38. Sampling from networks |
|||
Session 1 (10:00-10:50) | Session 2 (11:10-12:30) | Lab session (9:00-10:45) | ||
15. | Fri Dec 15, 2023 | Track C (Snellius 313, Hanjo)
39. Sampling from networks 46. Temporal networks |
Track A (Snellius 412, Hanjo)
41. Shortest paths 36. Network embeddings 45. Temporal networks |
Work on course project |
Track D (Snellius 174, Frank)
52. Influence spread and virality 53. Visualization algorithms |
Track B (Snellius 174, Frank)
22. Influence spread and virality 37. Sampling from networks |
|||
Dec 17, 2023 |
Deadline for final Course Project paper and accompanying code (AoE; hand in via Brightspace) |
|||
Dec 19, 2023 | Deadline for retake assignment to replace failed assignment(s) (AoE; hand in via email) | |||
Dec 22, 2023 | Course end | |||
Jan 31, 2024 | Course project retake deadline |
Code review (week 13).
The main goal of this code review session is to conduct a review of another team's code and experimental setup, and to receive feedback on one's own code.
What to do? Follow the instructions in the code review session slides.
We use the same division into teams as for the peer review in week 10 (see below).
In case team pairs are not complete, contact the course staff in Snellius 174 after the introductory lecture.
Deliverable. The mandatory deliverable for today is a contribution to the collaborative document (see link below).
Please find the team you are matched with, and ask help from the assistants if needed. Each line lists two teams that are matched up.
1. Anomaly detection | 2. Anomaly detection |
3. Anomaly detection | 4. Anomaly detection |
5. Anonymity in networks | 6. Anonymity in networks |
7. Anonymity in networks | 8. Anonymity in networks |
9. Centrality estimation | 10. Centrality estimation |
11. Centrality estimation | 12. Centrality estimation |
13. Community detection | 14. Community detection |
15. Community detection | 16. Community detection |
17. Core/periphery structure | 18. Core/periphery structure |
19. Graph compression | 20. Graph compression |
21. Influence spread and virality | 22. Influence spread and virality |
23. Influence spread and virality | 24. Influence spread and virality |
25. Link prediction | 26. Link prediction |
27. Link prediction | 28. Link prediction |
29. Network motifs | 30. Network motifs |
31. Network motifs | 32. Network motifs |
33. Network embeddings | 34. Network embeddings 51. Network embeddings |
37. Sampling from networks | 38. Sampling from networks |
39. Sampling from networks | 40. Sampling from networks |
41. Shortest paths | 42. Shortest paths |
43. Shortest paths | 44. Shortest paths |
45. Temporal networks | 46. Temporal networks |
47. Visualization algorithms | 48. Visualization algorithms |
50. Graph compression | 52. Influence spread and virality |
35. Network embeddings | 53. Visualization algorithms |
The main goal of this lab session is to ensure that your team is all set for making serious progress writing the course project paper in the coming weeks.
Course project paper
Earlier, you made a project plan.
One upcoming deadline is the first version of your paper for the peer review session in 2 weeks.
As always: please ask course staff for help if you are still unsure about certain aspects of your project.
Need to freshen up or improve academic writing skills? See Academic writing: a practical guide and The academic phrasebank. There is also the Leiden Science Skills platform.
The main goal of this lab session is to get to know the data science lab.
If you are working remotely, learn how to set up remote access to the LIACS Research and Education Laboratory (REL).
Data Science Lab
The data science lab website provides necessary information and documentation. The email address to contact in case of access issues or other technical problems is rel@liacs.leidenuniv.nl.
Become familiar with the lab and how to run code on it, and how to place data within the lab. This may be handy for Assignment 2 and/or the course project. Remember:
Think of setting up passwordless login, consider using ProxyJump to avoid logging in to the gateway manually, and perhaps use sshfs to mount your remote home directory. Some IDEs also offer all of this functionality.
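As a rough illustration of the ProxyJump idea, the sketch below renders an OpenSSH client configuration stanza from Python. The host aliases, host names, user name and key path (`gateway`, `dslab`, `gateway.example.org`, `dslab.example.org`, `s1234567`, `~/.ssh/id_ed25519`) are placeholders and not the actual lab addresses; take the real values from the data science lab documentation.

```python
def render_ssh_config(user, gateway_host, lab_host, key_path="~/.ssh/id_ed25519"):
    """Return text for ~/.ssh/config that reaches lab_host through gateway_host.

    All arguments are hypothetical examples; nothing here is the real lab setup.
    """
    lines = [
        "Host gateway",
        f"    HostName {gateway_host}",
        f"    User {user}",
        f"    IdentityFile {key_path}",
        "",
        "Host dslab",
        f"    HostName {lab_host}",
        f"    User {user}",
        f"    IdentityFile {key_path}",
        # ProxyJump makes ssh hop through the gateway alias automatically.
        "    ProxyJump gateway",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values, for illustration only.
config_text = render_ssh_config("s1234567", "gateway.example.org", "dslab.example.org")
print(config_text)
```

With a stanza like this in `~/.ssh/config` and your public key installed on both machines (for example with `ssh-copy-id`), `ssh dslab` should connect through the gateway in one step, and `sshfs dslab: ~/dslab` should mount the remote home directory on an existing local directory `~/dslab`.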
Assignment 2
Continue with the practical part of Assignment 2.
The main goal of this lab session is to get started with both the course project and with Assignment 2.
Course project
Below are some topics you can discuss and investigate together with your project team partner.
Done? Get started with Assignment 2.
The main goal of this lab session is to become familiar with NetworkX (a Python package to analyze networks for research purposes).
All relevant information on NetworkX can be found in the NetworkX online documentation.
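To give a first impression of what working with NetworkX looks like, here is a minimal sketch on a made-up toy network. The edge list is invented for illustration and is not one of the course datasets; the sketch assumes NetworkX is installed in your Python environment.

```python
import networkx as nx

# Build a small undirected toy network (edges are made up for illustration).
G = nx.Graph()
G.add_edges_from([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")])

# Basic size statistics.
num_nodes = G.number_of_nodes()
num_edges = G.number_of_edges()
density = nx.density(G)

# Degree centrality: degree divided by (n - 1) for each node.
degree_centrality = nx.degree_centrality(G)
most_central = max(degree_centrality, key=degree_centrality.get)

# Distances: one shortest path length, and the diameter of the connected graph.
distance_a_e = nx.shortest_path_length(G, "a", "e")
diameter = nx.diameter(G)

print(num_nodes, num_edges, density, most_central, distance_a_e, diameter)
```

The same pattern (build or load a graph, then call a function from the `nx` namespace on it) applies to most of the measures discussed in the lectures.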
For this lab session, you need a working Python environment. For this, there are two options:
Instructions for today: Lab session on NetworkX
Done? Proceed with Exercise 2 of Assignment 1.
Looking for a challenge? Check out these three alternatives, which you can also use instead of NetworkX throughout the course if you prefer, although less help is available for them:
Note: as a result of lecture 1 not going through, you might not be familiar with all terms and concepts just yet. Ask the assistants questions where needed, and otherwise wait for the upcoming lecture for more explanation and details.
About social network analysis tools and packages. There exist different tools and packages for social network analysis. In this course, we will introduce you to two of them, with complementary advantages:
Learning goals. The main goal of this lab session is to become familiar with Gephi (experimental beta-software to visualize networks for research purposes) and its input format. At the end of this session you should be able to:
There is no deliverable for this lab session, but you are assumed to know the tool afterwards. Practice more at home if needed.
Instructions for today: Lab session on Gephi.
The following steps and tutorials will help you get to know Gephi.
Done? Get started with the practical part of Assignment 1. You can download the smaller datafiles medium.tsv and large.tsv. If you want to analyze huge.tsv, you will have to get it from the shared folder in the ISSC Linux or LIACS DS lab environment, as stated in the assignment.
Teams work on a course project for 60% of the course grade. The project is about a certain topic (see list below) related to social network analysis, and the project consists of:
The list of project topics is shown below. Please choose a topic (so, choose a number from 1 to 12) with your team (consisting of two students). Register your choice in Brightspace. Topics can be chosen up to 4 or 5 times; we will split the group into various parallel tracks.
A next step (somewhere around week 6) is that you choose which paper on the chosen topic you present.
If you are retaking the course because you failed the project last year, you cannot choose the same topic as last year.
Note: scientific papers (ACM, Elsevier, etc.) can often only be opened from within the university domain (or from home via university SSH/Citrix/VPN/etc.). IEEE Xplore papers can often be opened by looking them up via computer.org. Alternative links and preprints of papers can often be found through Google Scholar by searching for "Title of the paper". Contact course staff if you have tried all of these options and are still not able to access the paper (do not pay!).
Some students have expressed interest in additional reading material to help freshen up on skills and knowledge required for this course.
See the Studyguide / prospectus for a more general description.
Topics include: SNA from a CS perspective (graph representation, complexity issues, examples), Graph Structure (power law, small world phenomenon, clustering coefficient, hierarchies), Paths and Distances (neighborhoods, radius, diameter), Spidering and Sampling (BFS, forest fire, random walks), Graph Compression (graph grammars, bitwise tricks, encryption, hashing), Centrality (degree centrality, closeness centrality, betweenness centrality, rating and ranking), Centrality and Webgraphs (HITS, PageRank, structure of the web), Community Detection (spectral clustering, modularity), Visualization (force-based algorithms, Gephi), Graph Models (random graphs, preferential attachment), Link Prediction (structure, semantics, prediction algorithms, graph mining), Contagion (diffusion of information, spreading activation, gossiping), Privacy and Anonymity ((de-)anonymizing graphs, ethical aspects, privacy issues), and various other topics that have been added over the years but are not yet in the list above.
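For readers who want a concrete feel for a few of the listed concepts (BFS, radius, diameter, degree centrality, clustering coefficient), the following is a small standard-library sketch. The toy edge list is invented for illustration, the function names are our own, and the code assumes an undirected, connected graph; it is not taken from the course material.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def eccentricities(adj):
    """Eccentricity per node: distance to the farthest node (connected graph assumed)."""
    return {u: max(bfs_distances(adj, u).values()) for u in adj}


def local_clustering(adj, u):
    """Fraction of pairs of neighbors of u that are themselves connected."""
    neighbors = list(adj[u])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in adj[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


# Made-up toy network: a triangle (1, 2, 3) with a tail 3-4-5-6.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 6)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

ecc = eccentricities(adj)
radius = min(ecc.values())    # smallest eccentricity
diameter = max(ecc.values())  # largest eccentricity
n = len(adj)
degree_centrality = {u: len(adj[u]) / (n - 1) for u in adj}

print(radius, diameter, degree_centrality[3], local_clustering(adj, 3))
```

Running one BFS per node costs O(n(n + m)) time in total, which is fine for toy graphs but becomes the kind of complexity issue the course addresses for large networks.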
The course was also given in 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 and 2022.