News:
- The textual comments from the 2015 evaluation forms are available here.
- All work has been graded. The final grades have been sent to the student administration. You can also come pick up your grade until Friday December 18 in room 157b.
Lectures: Fridays from 11:15 to 13:00 in the fall semester of 2015 in
Snellius room 403 402
Lab sessions (not every week): Fridays from 9:00 to 10:45 in
Snellius room 302/304
Lecturer: dr. Frank Takes - f.w.takes@liacs.leidenuniv.nl, Snellius room 157b
Student assistant: Anna Latour - a.l.d.latour@umail.leidenuniv.nl
Prerequisites: basic knowledge of Algorithms and Data Mining
Literature: provided papers and perhaps some free textbooks
Examination: based on presentation, paper, programming, peer review and participation (no exam)
Study points: 6 ECTS
Network with 1458 nodes and 1948 edges.
Date | Lab session | Topic | |
---|---|---|---|
1. | Sep 4, 2015 | No lab session | Lecture 1: Introduction and small world phenomenon |
2. | Sep 11, 2015 | Gephi & Assignment 1 | Lecture 2: Network topology |
3. | Sep 18, 2015 | NetworkX & Assignment 1 | Lecture 3: Distance-based centrality measures Presentation 0: Determining the diameter of small world networks |
4. | Sep 25, 2015 | Work on Assignment 1 | Lecture 4: Web structure and propagation-based centrality |
5. | Oct 2, 2015 | Work on Assignment 2 | Deadline for Assignment 1 Lecture 5: Network evolution and Community detection |
Oct 9, 2015 | No lab session | No lecture | |
6. | Oct 16, 2015 | Work on Assignment 2 Presentation 1: Virality and Outbreak Detection (1) (Team: NetEx) |
Presentation 2: Sampling Methods (Team: NS) Presentation 3: Shortest Paths (1) (Team: /) |
7. | Oct 23, 2015 | Work on Assignment 2 |
Presentation 4: Personalized PageRank (Team: Team 0) Presentation 6: Shortest Paths (2) (Team: WY) |
8. | Oct 30, 2015 | Presentation 7: Neighborhoods (Team: Omega) Presentation 8: Betweenness Centrality (Team: :(){:|&;:) |
Guest Lecture by André Vermeij (Kenedict Analytics) |
8. | Deadline for Assignment 2 | ||
9. | Nov 6, 2015 | Presentation 9: Graph Partitioning (Team: m=1) Presentation 10: Link Prediction (Team: HJ) |
On-site code review in teams |
10. | Nov 13, 2015 | Presentation 11: Visualization (Team: n=2) Presentation 12: Graph Compression (Team: Team 2) |
On-site peer review in teams |
Nov 20, 2015 | No lab session | No lecture | |
Nov |
Optional deadline for intermediary paper "check" | ||
11. | Nov 27, 2015 | Presentation 13: Virality & Outbreak Detection (2) (Team: DC) Presentation 14: Closeness Centrality (Team: "Group") |
Presentation 15: Diameter (Team: Team 1) Presentation 16: Link prediction in signed networks (Team: "Homework") |
Dec |
Deadline for course project paper | ||
12. | Dec 4, 2015 | Course evaluation sessions (according to schedule) | |
13. | Dec 11, 2015 | Course evaluation sessions (according to schedule) |
There is no lecture on October 9 and no lecture on November 20.
- Assignment 2 asks you to apply some centrality measures. You should motivate how you apply these measures: to the giant component or the full network, and to the directed or undirected version of the network.
- Assignment 2 can also be handed in via Dropbox
- Assignment 1 datafiles are also available via http due to ongoing share access issues: medium.in and large.in (use Save-As option in your browser!).
- A clarification: exercise 3.4 should be read as "Judging from the datafile, could this network be undirected?" If you interpreted it differently, just write down your assumptions to avoid any confusion.
- If you want, you can reuse the LaTeX/TikZ source of the graph in part 2 of Assignment 1.
- Assignments can also be handed in via Dropbox (for example, when you have large visualization PDFs that cannot be e-mailed).
We will again use the UNIX/Linux workstations in Snellius room 302/304 and 305 for this lab session.
There is no deliverable for this lab session. The main goal of this lab session is to become familiar with NetworkX (a Python package to analyze networks for research purposes).
All relevant information on NetworkX can be found in the NetworkX online documentation.
Installation issues: it turned out that many machines unexpectedly did not have NetworkX installed. On the student workstations, this can be solved by installing it in a virtual environment or adding the NetworkX source folder to your homedirectory. The latter may again result in all kinds of issues. In such cases, it may be nice to try Anaconda.
Alternative tool: check out Graph-Tool, a python graph analysis toolkit that leaves the hard computation to parallel (OpenMP) C++ code.
We will use the UNIX/Linux workstations in Snellius room 302/304 for this lab session.
If you really must, you can attempt to use Windows, fully at your own risk (you will at least need to edit gephi.conf so that it points to the correct Windows JRE version 6 or 7 (and not 8)).
Some information on the workstations is provided by the ISSC, see their pages on the
workstations,
SSH access and
ICT FAQ on remote access to Linux systems with a configuration similar to the workstations.
There is no deliverable for this lab session. The main goal of this lab session is to become familiar with Gephi (experimental beta-software to visualize networks for research purposes) and its input format.
Teams work on a course project for 50% of the course grade. For this project they pick one of the topics listed below. Each topic has an associated list of papers which have to be studied and compared. One paper has to be presented. So the project consists of:
Suggestions for topics are listed below, and you are encouraged to pick one of these topics (as a team). You are also allowed to suggest your own problem (topic) in the field of social network analysis, assuming you are have found at least three scientific papers that suggest different techniques or algorithms for solving this problem (so, addressing this same topic). Please choose a topic together with your team mate before September 25, and inform the lecturer via e-mail of your choice. Also feel free to suggest a free date slot for your presentation. Your topic choices will be processed on a first-come-first-serve basis.
If you choose a topic, then you are also encouraged to look up additional papers in this problem domain to study and compare to the paper that you presented, as the papers to study are merely "simple" suggestions. For some topics, there exists an overview/review paper that discusses many more papers that study the same (type of) problem.
Scientific papers (ACM, Elsevier, etc.) can often only be opened from within the university domain (or from home via SSH/nxClient/VPN/etc.).
IEEE Explore papers can be opened by looking them up via computer.org.
Alternative links and preprints of papers can often be found through Google Scholar.
This course deals with the computer science aspects of social network analysis. With topics such as big data and data science becoming more and more popular, the study of large datasets of networks (or graphs), is becoming increasingly important. Examples of such networks include webgraphs, communication and collaboration networks and perhaps most notably (online) social networks (such as Facebook and Twitter). With millions of nodes and possible billions of links, traditional graph algorithms are often too complex and unable to solve trivial algorithmic and data mining related problems. Typical tasks in this field include community detection, clustering, outlier detection, link prediction but also more fundamental problems such as efficient retrieval, storage, and compression of graph data and computational problems such as computing shortest paths and other descriptive graph properties. Also see the description in the e-Studyguide.
Topics include: SNA from a CS perspective (graph representation, complexity issues, examples), Graph Structure (power law, small world phenomenon, clustering coefficient, hierarchies), Paths and Distances (neighborhoods, radius, diameter), Spidering and Sampling (BFS, forest fire, random walks), Graph Compression (graph grammars, bitwise tricks, encryption, hashing), Centrality (degree centrality, closeness centrality, betweenness centrality, rating and ranking), Centrality and Webgraphs (HITS, PageRank, structure of the web), Community Detection (spectral clustering, modularity), Visualization (force-based algorithms, Gephi, NodeXL), Graph Models (random graphs, preferential attachment), Link Prediction (structure, semantics, prediction algorithms, graph mining), Contagion (diffusion of information, spreading activation, gossipping) and Privacy and Anonymity ((de-)anonymizing graphs, ethical aspects, privacy issues).
The course was also given in 2014, see SNACS 2014