This course deals with computer science (CS) aspects of social network analysis (SNA), and is open to all students in the master computer science programme at Leiden University.
- The final course project paper can be handed in via Dropbox. The deadline of December 12 is final and strict.
- A short guide on copying assignment files to your machine via SCP is available.
- A new version of Gephi was recently released. It requires a different input format, which may not be compatible with that of the assignment data. Therefore, keep using Gephi 0.9.1.
Lectures: Fridays from 11:00 to 12:45 (Sep. 8 - Dec. 1)
Snellius room B02
Lab sessions (not every week): Fridays from 9:00 to 10:45 in room 302/304
Lecturer: dr. Frank Takes - firstname.lastname@example.org, room 157b
Teaching assistants: Hanjo Boekhout BSc (email@example.com) and
Jesper van Engelen BSc (firstname.lastname@example.org, Snellius room 120)
Prerequisites: a CS(-related) bachelor; mainly courses on Algorithms, Data Structures and Data Mining
Literature: provided papers and book chapters (free and digitally available)
Examination: based on presentation, paper, programming, peer review and participation (no exam)
Study points: 6 ECTS
Network with 1458 nodes and 1948 edges.
|Date||First slot (9:00-10:45)||Second slot (11:00-12:45)|
|1.||Sep 8, 2017||Lecture 0: Course organization
Lecture 1: Introduction and small world phenomenon
|2.||Sep 15, 2017||Lab session: Gephi
Work on Assignment 1
|Lecture 2: Advanced network concepts & Distance-based centrality|
|3.||Sep 22, 2017||Lab session: NetworkX
Work on Assignment 1
|Lecture 3: Network formation & Community detection|
|3.5.||Sep 29, 2017||Work on Assignment 1||No lecture. Work on Assignment 1|
|Oct 2, 2017||Deadline for Assignment 1: Click here to upload your assignment|
|4.||Oct 6, 2017||Work on Assignment 2||Lecture 4: Structure of the web & Propagation-based centrality
Example presentation: Diameter computation
|5.||Oct 13, 2017||Work on Assignment 2||Lecture 5: Dynamic network models|
|6.||Oct 20, 2017||Work on Assignment 2||
Student presentation 1: Disease spread
|7.||Oct 27, 2017||Work on Assignment 2||Student presentation 3: Signed link prediction
Student presentation 4: Network motifs 1
|Deadline for Assignment 2: Click here to upload your assignment|
|8.||Nov 3, 2017||Student presentation 5: Community detection 2
Student presentation 6: Counting triangles
|Student presentation 7: Network visualization 2
Student presentation 8: Network de-anonymization
|Nov 10, 2017||Deadline for the first three chapters of the course project paper (for the peer review session)|
|9.||Nov 10, 2017||Student presentation 9: Network visualization 1
Student presentation 10: Betweenness centrality 1
|Peer review session|
|10.||Nov 17, 2017||Student presentation 11: Diameter computation
Student presentation 12: Influence spread 1
|Student presentation 13: Betweenness centrality 2
Student presentation 14: Closeness centrality 2
|Nov 24, 2017||Deadline for a substantial amount of code / experimental pipeline of the course project (for the code review session)|
|11.||Nov 24, 2017||Student presentation 15: Shortest paths 1
|Code review session: Best practices in a collaborative document|
||Deadline (optional) for preliminary version of course project paper: Click here to upload|
|12.||Dec 1, 2017||Student presentation 17: Personalized PageRank
Student presentation 18: Sampling methods
|Student presentation 19: Minimum weight cycles
Student presentation 20: Shortest paths 2
|13.||Dec 8, 2017||Student presentation 21: Link prediction
Student presentation 22: Graph compression
|Student presentation 23: Influence spread 2
Student presentation 24: Network dissolution and decline
|Dec 12, 2017||Deadline for the final version of the course project paper: Click here to upload|
|Dec 13 and 15, 2017||Course project evaluation sessions with student teams, from 9.00 to 17.00 in slots of 30 minutes.|
|Dec. 13||Dec. 15|
|9:00 Betweenness centrality 1||10:00 Minimum weight cycles|
|9:30 Betweenness centrality 2||10:30 Neighborhoods|
|10:00 Closeness centrality||11:00 Network dissolution and decline|
|10:30 Community detection|
|11:00 Counting Triangles|
|11:30 Network motifs||13:00 Personalized PageRank|
|12:00 Disease Spread||13:30 Sampling Methods|
|13:30 Graph Compression||14:00 Shortest paths 1|
|14:00 Influence spread 1||14:30 Shortest paths 2|
|14:30 Influence spread 2||15:00 Signed link prediction|
|15:00 Link prediction||15:30 Visualization 1|
|Dec. 22, 2017||Course end. Grades will be submitted to the student administration|
We will again use the UNIX/Linux workstations in Snellius room 302/304 for this lab session.
There is again no deliverable for this lab session. The main goal of this lab session is to become familiar with NetworkX (a Python package to analyze networks for research purposes).
All relevant information on NetworkX can be found in the NetworkX online documentation.
Python instructions to use NetworkX: NetworkX should be installed on all machines, but you may have to enter the right "environment". See /vol/share/software/datascience/2016-README. If you run into issues, it may be productive to try Anaconda.
Looking for a challenge? Check out Graph-Tool, a python graph analysis toolkit that leaves the hard computation to parallel (OpenMP) C++ code. Or snap, which is entirely written in C++ (although there is now also a Python plugin).
We will use the UNIX/Linux Ubuntu workstations in Snellius room 302/304 for this lab session.
If you are not familiar with these workstations, then some information is provided by the ISSC, see their pages on the
SSH access. Also see this
If you really must, you can, at the risk of bugs and problems that course personnel can probably not help you solve, attempt to use Windows or the even more unstable version of Gephi for Macintosh.
There is no deliverable for this lab session. The main goal of this lab session is to become familiar with Gephi (experimental beta-software to visualize networks for research purposes) and its input format. At the end of this session you should be able to visualize raw network data with labeled nodes and labeled and/or weighted edges (directed or undirected), and you should understand how to map edge and node size and color to structural network properties.
Students with non-CS backgrounds have expressed interest in additional reading material to help freshen up on skills and knowledge required for this course.
Teams work on a course project for 60% of the course grade. For this project they pick one of the topics listed below. Each topic has an associated list of papers which have to be studied. Papers on a particular topic all consider different algorithms or methods for solving roughly the same problem. The goal of the project is to make a comparison of at least two of these papers based on experiments using real social network data. So the project consists of:
Suggestions for topics are listed below, and you are encouraged to pick one of these topics (as a team). You are also allowed to suggest your own networks-related problem (topic) in the field of social network analysis, assuming you are have found at least two scientific CS papers that suggest different techniques or algorithms for solving this problem (so, addressing this same topic).
If you choose a topic, then you are also encouraged to look up additional papers in this problem domain to study and compare to the paper that you presented, as the papers to study are merely "simple" suggestions. For some topics, there exists an overview/review paper that discusses many more papers that study the same (type of) problem.
Note: scientific papers (ACM, Elsevier, etc.) can often only be opened from within the university domain (or from home via SSH/VPN/etc.).
IEEE Explore papers can often be opened by looking them up via computer.org. Alternative links and preprints of papers can often be found through Google Scholar.
This list of projects is shown below. Take a look at the papers, before choosing a topic. E-mail email@example.com with your topic preference and names of the two team members. First come, first serve. Your topic is not confirmed until you receive an e-mail from me. You have until September 29 to e-mail your preferred topic. After that, a final division of topics and formation of teams will be made on October 6 during the lecture. Prefix [Chosen] indicates a topic is no longer available.
This list will be updated regularly as teams are formed and topics are chosen.
See the e-Studyguide for a more general description.
Topics include: SNA from a CS perspective (graph representation, complexity issues, examples), Graph Structure (power law, small world phenomenon, clustering coefficient, hierarchies), Paths and Distances (neighborhoods, radius, diameter), Spidering and Sampling (BFS, forest fire, random walks), Graph Compression (graph grammars, bitwise tricks, encryption, hashing), Centrality (degree centrality, closeness centrality, betweenness centrality, rating and ranking), Centrality and Webgraphs (HITS, PageRank, structure of the web), Community Detection (spectral clustering, modularity), Visualization (force-based algorithms, Gephi, NodeXL), Graph Models (random graphs, preferential attachment), Link Prediction (structure, semantics, prediction algorithms, graph mining), Contagion (diffusion of information, spreading activation, gossipping) and Privacy and Anonymity ((de-)anonymizing graphs, ethical aspects, privacy issues).