
SNE Master Research Projects 2017 - 2018


Contact

Cees de Laat, room: C.3.152
And the OS3 staff.
Course Codes:

Research Project 1 53841REP6Y
Networking Research Project 2 53842NRP6Y    
Security Research Project 2 53842SRP6Y

TimeLine

RP1 (January):
  • Wednesday Sep 28, 2017, 10h15-13h00: Introduction to the Research Projects.
  • Wednesday Nov 22, 2017, 10h15-13h00: Detailed discussion on chosen subjects for RP1.
  • Monday Jan 8th - Friday Feb 2nd 2018: Research Project 1.
  • Friday Jan 12th: (updated) research plan due.
  • Monday Jan 22, 2018, 16h00, progress meeting (not mandatory).
  • Monday Feb 5, 2018 13h00-17h00: Presentations RP1 in B1.23 at SP 904.
  • Tuesday Feb 6, 2018 11h00 - 17h00: Presentations RP1 in B1.23 at SP 904.
RP2 (June):
  • Wednesday May 16, 2018, 13h00-15h00, B1.23 Detailed discussion on chosen subjects for RP2.
  • Monday Jun 4th - Friday Jun 29, 2018: Research Project 2.
  • Friday Jun 8th: (updated) research plan due.
  • Tuesday Jul 3 2018, 12h00-17h00: presentations in C0.110 @ SP904.
  • Thursday Jul 5 2018, 12h00-17h00: presentations in C0.110 @ SP904.

Projects

Here is a list of student projects. Find here the left over projects this year: LeftOvers.
In a futile attempt to prevent spam "@" is replaced by "=>" in the table.
Color of cell background:
  • Project available.
  • Currently chosen project.
  • Project plan received.
  • Presentation received.
  • Report received.
  • Completed project.
  • Confidentiality was requested.
  • Blocked, not available.
  • Report but no presentation.
  • Outside normal RP timeframe.



Table columns: title, summary, supervisor contact, students, RP 1/2
1

Security By Default; A Comparative Security Evaluation of Default Configuration.

Weak default configuration settings are a serious problem on the Internet. Operators forget to alter these settings, often leading to serious security issues. In this project, you will build a structured system that analyses and compares the default configuration options for popular network services across multiple Linux distributions with a gray-box approach.

The underlying research-questions are:
  • What role do distributors play in the quality of default configuration for Internet services, i.e., are there significant differences between distributors, and which factors influence the distributors' security performance?
  • What is a suitable metric to describe the security posture of default configurations for Internet services?
In addition, this project contains a significant engineering challenge. You will build (and, if you do not yet know how, learn to build) an automated infrastructure in which you can automatically and incrementally deploy and test Internet services on different Linux distributions. You will conduct this work mainly from the OS3 labs at UvA.
Tobias Fiebig <t.fiebig=>tudelft.nl>
Ralph Koning <R.Koning=>uva.nl>

Bernardus Jansen <bernardus.jansen=>os3.nl>
RP2

5

Low-level writing to NTFS file systems.

Windows abstracts the interaction with hard drives by providing APIs to talk to the filesystem. The Win32 API provides methods to read/write data to an NTFS-formatted disk. These methods call the necessary kernel functions, which in turn call the file system driver. (Also see https://stackoverflow.com/a/11252104 for what is happening behind the scenes when writing to the disk.)
 
Additionally, SACLs and DACLs (system/discretionary access control lists) can be set on file system objects to determine the permissions for those objects. When opening files for reading/writing, SACLs and DACLs are normally enforced and access to files may be denied based on the access rights set for the object.
 
Anti-virus (AV) products can hook Win32 API functions that are responsible for filesystem writes to identify when malware is dropped to the disk. Data that is written to the disk can easily be flagged for scanning by an AV using a write hook. This can be accomplished using a File System Filter driver for example on kernel level, or using a FileSystemWatcher from userspace.
 
We are looking to investigate whether it is possible to use lower-level Windows methods to interact with the file system (e.g. by opening the raw disk device and reading/writing to it) both from userspace and kernelspace perspective. Talking to the raw disk device may mean that SACLs and DACLs are not enforced.
 
Work in this area has been done by Joe Bialek for reading raw NTFS structures from the disk device (see "Invoke-NinjaCopy"), which can be used to read locked or sensitive files while also bypassing ACLs. We are interested in whether the same or similar techniques can be used to write data back to the file system to specific files (or creating arbitrary new files). This would mean that API hooks could potentially be bypassed that listen for file system changes.
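As a rough illustration of what reading raw NTFS structures involves, the sketch below parses the first sector of an NTFS volume from an in-memory byte buffer. The field offsets follow the commonly documented NTFS boot sector layout; the function names, the synthetic demo sector, and the simplification that sectors-per-cluster is a plain unsigned count are assumptions made for this example and are not part of the project description. On a real system the bytes would come from a raw volume or disk device opened with administrative rights; the sketch itself never touches a device and never writes.

```python
import struct


def parse_ntfs_boot_sector(sector: bytes) -> dict:
    """Extract a few geometry fields from an NTFS boot sector (sketch)."""
    if len(sector) < 512:
        raise ValueError("need at least 512 bytes")
    # OEM ID at offset 3 identifies the file system.
    if sector[3:11] != b"NTFS    ":
        raise ValueError("not an NTFS boot sector")
    # 0x0B: bytes per sector (uint16), 0x0D: sectors per cluster (uint8).
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 0x0B)
    # 0x28: total sectors, 0x30: first cluster of $MFT, 0x38: first cluster of $MFTMirr.
    total_sectors, mft_cluster, mft_mirror_cluster = struct.unpack_from("<QQQ", sector, 0x28)
    cluster_size = bytes_per_sector * sectors_per_cluster
    return {
        "bytes_per_sector": bytes_per_sector,
        "cluster_size": cluster_size,
        "total_sectors": total_sectors,
        "mft_offset": mft_cluster * cluster_size,  # byte offset of $MFT on the volume
        "mft_mirror_offset": mft_mirror_cluster * cluster_size,
    }


def make_demo_sector() -> bytes:
    """Build a synthetic boot sector so the parser can be tried without a disk."""
    sector = bytearray(512)
    sector[3:11] = b"NTFS    "
    struct.pack_into("<HB", sector, 0x0B, 512, 8)
    struct.pack_into("<QQQ", sector, 0x28, 1_000_000, 786_432, 2)
    sector[510:512] = b"\x55\xaa"  # boot sector signature
    return bytes(sector)


if __name__ == "__main__":
    print(parse_ntfs_boot_sector(make_demo_sector()))
```

Locating the $MFT this way is only the first step; finding and modifying an individual file record would additionally require parsing MFT entries and their attributes.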
Cedric Van Bockhaven <cvanbockhaven=>deloitte.nl>
Rick van Gorp <Rick.vangorp=>os3.nl>
RP2

8

A Blockchain based Version Control System that incorporates the Developer’s Environment.

Version Control Systems (VCS) have been around since the mid ’80s. The purpose of a VCS is to manage changes made to data over time. The best known examples are GitHub, GitLab and Bitbucket, whose options go far beyond mere version control. So far, the main focus of VCSs has been on source code. When it comes to different types of data, e.g. multi-dimensional arrays, current solutions still fall short.

This has led companies such as Airbus to store satellite data without a proper way to revise the different versions, let alone securely verify that this data has not been tampered with. Another issue is that even when different versions of the same data are available, the environments used to process this data are subject to continuous changes and are rarely stored alongside the data. This can create issues when verifying certain steps of the data processing. This is a dilemma that not only Airbus faces, but researchers in general.

This research will focus on the viability of a VCS based on Blockchain that incorporates the Developer’s Environment. Blockchain has been chosen for the immutability, security, and inherent reproducibility that it brings. Similar proposals have been made, but none has been implemented so far, and the proposals that do exist focus on the decentralization of data that Blockchain offers for redundancy. None incorporates the Developer’s Environment with the aspiration of making data processing, and research as a whole, more reproducible. Due to the limited time span of this research, the focus will be on satellite data specifically. The goal is to create a proof-of-concept.

Sjaak Koot <s.koot=>airbusds.nl>

Sandino Moeniralam <Sandino.Moeniralam=>os3.nl>
RP2

14

Microsoft Office Upload Center Cache Files in Forensic Investigations.

The Microsoft Office suite uses a file cache for several reasons, one of which is delayed uploading and caching of documents from a SharePoint server. These cache files may contain partial or complete Office documents that have been opened on a computer. The master database in the file cache folder also contains document metadata from SharePoint sites. In this project you are asked to research the use of the Office file cache and deliver a PoC for extracting and parsing metadata from the database file, and for decoding or parsing document contents from the cache files (.FSD).
Yonne de Bruijn <yonne.debruijn=>fox-it.com>
Rick van Gorp <Rick.vanGorp=>os3.nl>
Kotaiba Alachkar <kotaiba.alachkar=>os3.nl>
RP1

16

Network Functions Virtualization.

SURFnet provides fixed wide area network connectivity services to the research and education sector in the Netherlands. The campus domain is the responsibility of each institution itself. An emerging trend among primarily smaller institutions is to outsource ICT and network operations. SURFnet could offer virtualized network functions using NFV technology to these institutions, replacing hardware in the campus domain. This could unburden these institutions from lengthy procurement processes for the hardware, free them from vendor lock-in and pave the way for pay-per-use licensing models that prove more cost-efficient than the current situation.

In this context the research assignment would address the following questions:
  • Which network functions within campus networks are suitable to be virtualized, for example a router, firewall, DNS, DHCP, load balancers etc?
  • Which technical aspects need to be considered if SURFnet would decide to provide one or more of these virtualized functions?
  • Does the distance of the virtualized platform from the campus affect the performance of the virtualized function? Is this performance dependent on the function itself?
  • How should redundancy be arranged?
  • Is it feasible to just virtualize one function or are they so inter-dependent with other network functions in the campus domain that eventually a virtualized solution should be offered for all network functions within a campus network?
Marijke Kaat <marijke.kaat=>surfnet.nl>

Bernardus Jansen <bjansen=>os3.nl>
RP1

19

An analysis of the scale-invariance of graph algorithms: A case study.

There exist sets of graphs available for research. However, these sets usually exclude big graphs. Therefore a scaling mechanism has been developed that allows experiments to be run on larger (or smaller) graphs.

It is known that the relative performance of different algorithms or implementations of algorithms depends on characteristics of the graph. When scaled in such a way as to preserve as much of these characteristics as possible, is the relative performance of different algorithms preserved as well?
Merijn Verstraaten <M.E.Verstraaten=>uva.nl>

Tim van Zalingen <Tim.vanZalingen=>os3.nl>
RP2

20

Opcode statistics for detecting compiler settings.

The Binary Analysis Tool is an open source tool that can automate analysis of binary files by fingerprinting them. For Executable and Linkable Format (ELF) files this is done by extracting string constants, function names and variable names from the various ELF sections. Sometimes compiler optimisations move the string constants to different ELF sections, and extraction then fails in the current implementation.

Your task is to find out whether it is possible, by looking at the binary, to detect and report whether optimisation flags that cause constants to be moved between ELF sections were passed to the compiler. The scope of this project is limited to Linux.

Armijn Hemel - Tjaldur Software Governance Solutions
Armijn Hemel <armijn=>tjaldur.nl>

Kenneth van Rijsbergen <Kenneth.vanRijsbergen=>os3.nl>
RP2

22

Optimal network design of SURFnet8, using TI-LFA and Segment Routing.

As there are multiple links between nodes, if only one link fails, the next router will react in unwanted ways until the new IGP topology has converged:
  • How do different implementations of TI-LFA compare if node failure, link failure, or both happen at the same time, on multiple nodes?
  • Is fate-sharing necessary for ALL links that share the same optical path, or would TI-LFA be sufficient to provide efficient backup path coverage?
  • How many SIDs per router are necessary in TI-LFA on the global scale to achieve 100% backup coverage?
  • How do IGP costs affect the above parameters?
Marijke Kaat <marijke.kaat=>surfnet.nl>
Wouter Huisman <wouter.huisman=>surfnet.nl>

Peter Prjevara <Peter.Prjevara=>os3.nl>
Fouad Makioui <fouad.makioui=>os3.nl>
RP2

23

DDoS Defense Mechanisms for IXP Infrastructures.

In the modern internet era, Internet Exchange Points play a significant role in transferring bits in a more efficient and cost-effective way. As a result, more and more large players establish interconnections on the peering LANs and try to offload as much traffic to them as possible. With the recent increase of DDoS attacks since the beginning of the year, the community has a strong need for an efficient solution that not only detects the attack in the peering LAN, but also tries to mitigate it in a smart way without shutting down the service to the member/customer.

In this project, the student first has to study the problem and understand the nature of the peering LAN and its special requirements. After that, the student has to design a solution that can detect a DDoS attack traversing the IXP network and filter the malicious traffic out from the legitimate traffic, resulting in continuous service delivery not only to big networks but also to small ones. Ideally the solution would be general enough to fit not only the AMS-IX platform but also any existing L2 peering LAN, making it an important contribution to the peering community.
Stavros Konstantaras <stavros.konstantaras=>ams-ix.net>

Lennart van Gijtenbeek <Lennart.vanGijtenbeek=>os3.nl>
Tim Dijkhuizen <tim.dijkhuizen=>os3.nl>
RP2

30

Content-based Classification of Fraudulent Webshops.

Many criminal activities focusing on the Dutch market are domain name bound. Examples include operating web shops that are fraudulent or that sell illegal wares. These are reported to number in the thousands [1]. Identifying these domains could be very beneficial to the online safety of many. SIDN is the registry governing the .nl. ccTLD and hence maintains the canonical zone file for .nl. Using this zone file, SIDN has generated a data set of all websites operating on the .nl. domain [2]. This set includes at least DNS, TLS and HTTP information.
This research is interested in the possibilities of classifying fraudulent or malicious .nl. domains, by correlating NLP and similar methods, using the dataset provided by SIDN. The research will investigate and compare different features and methodologies in order to find out which one is the most effective.

What exactly counts as fraudulent or malicious has yet to be determined and will be specified in the project proposal.

[1] https://www.consumentenbond.nl/nieuws/2018/consumentenbond-laat-850-foute-webwinkels-offline-halen
[2] https://www.sidnlabs.nl/a/weblog/crawling-nl
Marco Davids <marco.davids=>sidn.nl>
Maarten Wullink <maarten.wullink=>sidn.nl>

Sjors Haanen <sjors.haanen=>os3.nl>
Mick Cox <Mick.Cox=>os3.nl>
RP2

35

Bypassing Phishing Filters.

Over the past decades, email has become one of the major means of communication. Because of the prominent role email has taken in present society, email security has become a key aspect of the modern Internet. A common type of email threat is the 'phishing email'. Phishing emails use fraudulent social engineering techniques to elicit sensitive information from unsuspecting users. As phishing attacks exploit human weaknesses, mitigation of this type of attack is difficult. To minimize the chance that a user becomes the victim of a phishing attack, anti-spam tools include phishing detection solutions to prevent a malicious email from reaching the user's inbox. The goal of this research is to determine whether techniques exist that allow us to bypass these anti-phishing solutions.
Rick van Galen <vanGalen.Rick=>kpmg.nl>
Alex Stavroulakis <Stavroulakis.Alex=>kpmg.nl>

Shahrukh Zaidi <Shahrukh.Zaidi=>os3.nl>
RP2

37

Pentest Accountability By Analyzing Network Traffic & Network Traffic Metadata.

During security tests, it is often difficult to achieve great accountability of actions. Systems may be disrupted by a security test, or may be disrupted by unrelated bugs and administration within the organization. To prove accountability of certain actions, one must keep good records of pentest activities. One such method is to simply log and analyze network traffic. But is it feasible to do this? Does one log all network traffic, or only meta-information? And is it feasible to do this given storage requirements?
Rick van Galen <vanGalen.Rick=>kpmg.nl>

Marko Spithoff <mspithoff=>os3.nl>
Henk van Doorn <Henk.vanDoorn=>os3.nl>
RP1

42

An Analysis of Atomic Swaps on and between Ethereum Blockchains.

Blockchain technology is getting much attention, triggered by the popularity of the bitcoin cryptocurrency. Ethereum (https://ethereum.org/) is a blockchain-based computer that runs smart contracts: applications that run exactly as programmed without any possibility of downtime, censorship, fraud or third party interference. However, the unlimited openness of Ethereum poses risks. For example, bad actors can permanently put illegal content or applications on such a blockchain. This risk, and the associated legal liability, will deter legitimate businesses from running applications on or supporting such an infrastructure. Permissioned blockchains, see e.g.
allow for certain parties to have more control in who can do what and, therefore, can help mitigate this risk. The hypothesis is that such permissioned blockchains can retain many of the benefits of blockchain technology.

In this project, you will investigate the hypothesis by
  • Performing a brief risk analysis, identifying the most prominent risks of permissionless blockchains
  • Performing a brief analysis of the main (quantifiable) benefits of permissionless blockchains
  • Developing permission requirements for managing the identified risks and relating those to the (potential) loss of benefits (e.g., openness, censorship resistance).
  • Implementing a permissioned blockchain (for example by using technology provided by Eris (https://erisindustries.com/) or Tendermint (http://tendermint.com/)).
  • Demonstrating the functionality of the system with a test application
  • Evaluating the system
Oskar van Deventer <oskar.vandeventer=>tno.nl>
Maarten Everts <maarten.everts=>tno.nl>

Peter Bennink <Peter.Bennink=>os3.nl>
Lennart van Gijtenbeek <lgijtenbeek=>os3.nl>
RP1

50

Virtual infrastructure partitioning and provisioning under nearly real-time constraints.

A complex cloud application often requires resources from different data centres or providers, e.g., because of the geographical location of some specific components, particular physical elements in the Internet of Things or a sensor network, or because of limits on the available resources for optimizing system performance or for balancing workloads. Instead of letting cloud providers do the provisioning, some developers need to plan the infrastructure directly and oversee the provisioning in order to optimize system performance or cost based on their own requirements and understanding of their application. Mapping a complex infrastructure onto different data centres or providers involves several steps: partitioning the graph of the infrastructure, provisioning sub-graphs, and connecting the interstitial network. This project focuses on the first step, the graph-partitioning problem: how to effectively partition an infrastructure graph based on the constraints of data centres, application characteristics, and locations of non-Cloud components.

The students will:
  1. Review the state of the art of the problem and the existing algorithms.
  2. Evaluate the key algorithms based on characteristics of specific applications, cloud providers and quality of service (QoS) constraints.
  3. Test a prototype with the parallel provisioning components developed by a researcher in another concurrent project.
Zhiming Zhao <z.zhao=>uva.nl>
Arie Taal <a.taal=>uva.nl>

Andrey Afanasyev <aafanasyev=>os3.nl>
RP1

53

Feasibility of Cryptocurrency on Mobile devices.

Blockchain technology currently requires a steady internet connection and power source, since staying synchronized with the wider blockchain requires frequent receiving and processing of blockchain data. This currently prevents blockchain technology from being effective on mobile devices, somewhat limiting its use.
  • How would practical blockchain on mobile actually look?
  • And how can this be accomplished?
  • What are the relevant security aspects for this?
The goal of this research is to provide a literature overview of the different aspects of making blockchain tech practical on mobile.
Rick van Galen <vanGalen.Rick=>kpmg.nl>

Sander Lentink <Sander.Lentink=>os3.nl>
Anas Younis <Anas.younis=>os3.nl>
RP1

57

Capability analyses of mdtmFTP.

Internet transport protocol: mdtmFTP is a middleware solution developed by Fermilab to transfer large volumes of data, which may be contained in lots of small files, across long distances using the concept of a Data Transfer Node (DTN). At KLM a DTN has been connected to Netherlight, allowing experiments with national and international institutes. An example of a research project deploying DTNs to share Big Data is the Pacific Research Platform project, in which UvA participates. This project should evaluate the capabilities of mdtmFTP across short and long distances and compare it with other FTP implementations (e.g. GridFTP). It should also investigate whether the middleware could be adapted to serve other Big Data type applications, e.g. to allow data replication in a Hadoop File System across distance.

For more info on mdtmFTP see:
Leon Gommans <Leon.Gommans=>klm.com>

Kees de Jong <kees.dejong=>os3.nl>
RP1

61

Deanonymisation in Ethereum Using Existing Methods for Bitcoin.

Intelligence collected from a large number of sources helps to provide context and insight in various scenarios, for example:
  • Contextual querying in (forensic) investigations.
  • Tracking the activity of malicious actors and turning it into indicators of compromise that can be used to detect and counter malicious activity.
The decentralised and anonymous Bitcoin currency is exploited by actors with malicious intentions. The goal is to research the metadata that is available on a node within the Bitcoin network, and to develop code that structures and provides a real-time feed of this metadata.
Arno Bakker <Arno.Bakker=>os3.nl>
Tim Dijkhuizen <tim.dijkhuizen=>os3.nl>
Robin Klusman <robin.klusman=>os3.nl>
RP1

62

Breaking CAPTCHAs on the Dark Web.

The darkweb contains, among other things, a large amount of information about illegal activity, which might be interesting from an intelligence perspective. The intelligence can be used to monitor activity related to specific high profile organizations or specific threat actors. Since there are many different types of websites, sometimes with unique subscriber requirements, it is hard to scrape these websites. In some cases an existing member has to vouch for a new member, users have to post a message on the website at least once a month (otherwise they will be banned), or you have to pay in bitcoins to get access.
The goal of this research project is to come up with a theoretical framework for scraping (potentially) interesting darkweb websites, taking into account the different kinds of subscription models and subscriber requirements. For this research project it is not required to develop a PoC.
Yonne de Bruijn <yonne.debruijn=>fox-it.com>
Dirk Gaastra <Dirk.Gaastra=>os3.nl>
Kevin Csuka <kevin.csuka=>os3.nl>
RP1

64

Improving Machine Learning Based Intrusion Detection for SCADA Systems using case specific information.

Interconnected ICS/SCADA systems around the world are exposed to risk due to a lack of security countermeasures or misconfiguration issues. This project aims to regularly perform online scanning of a country (i.e. the Netherlands) to identify permanently or mistakenly interconnected ICS/SCADA systems by recognizing default ICS ports, vendors’ interfaces and online search engines’ results.

The research questions are:

  • "What information can be used to complement the information generated by ML algorithms, to improve the efficiency and accuracy of ML-based Intrusion and Anomaly Detection Systems, and make them useful for both linear and non-linear processes?"
  • "How can this information be best combined with the ML algorithm?"

Required area of expertise: Hacking

More info: http://werkenbijdeloitte.nl/cyber-graduate

Dima van de Wouw <DvandeWouw=>deloitte.nl>

Peter Prjevara <Peter.Prjevara=>os3.nl>
RP1

66

Framework for profiling Critical Path related Algorithms.

Critical path based algorithms are an effective means of scheduling tasks with deadlines, but it is difficult to determine which algorithm variants work in which scenarios. The critical path algorithms need to be customized for different input graphs and deadline distributions. The purpose of this project is to determine which variants are most appropriate for which graph structures. The work will be done in the context of the EU project SWITCH [1]; initial algorithms developed in SWITCH will be used as part of the test.

The student will:
  1. Review the state of the art, and identify a set of properties to characterise application graphs (workflows)
  2. Prepare a set of critical path related algorithms or strategies for testing
  3. Collect workflow graphs with certain characteristics
  4. Schedule experiments for different configurations and collect results
  5. Classify the results and discover the correlation among algorithm configuration and application characteristics
  6. (Optional) Prototype software tool for automating such profiling 
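For orientation, the basic computation that critical path algorithms build on can be sketched as a longest-path search over a task graph. The sketch below is a generic textbook version, not one of the SWITCH algorithms; the function name `critical_path`, the task names, and the durations are made up for illustration, and every task is assumed to appear in `durations`.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def critical_path(durations, predecessors):
    """Return (length, path) of the longest chain of dependent tasks.

    durations:    dict mapping task -> execution time
    predecessors: dict mapping task -> iterable of tasks that must finish first
    """
    graph = {task: tuple(predecessors.get(task, ())) for task in durations}
    finish = {}        # earliest finish time per task
    longest_pred = {}  # predecessor that determines each task's start time
    for task in TopologicalSorter(graph).static_order():
        # A task can start once its slowest predecessor has finished.
        longest_pred[task] = max(graph[task], key=finish.get, default=None)
        start = finish[longest_pred[task]] if longest_pred[task] is not None else 0
        finish[task] = start + durations[task]
    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    length = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = longest_pred[node]
    return length, path[::-1]


if __name__ == "__main__":
    durations = {"A": 3, "B": 2, "C": 4, "D": 1}
    predecessors = {"C": ["A", "B"], "D": ["C"]}
    print(critical_path(durations, predecessors))
```

Comparing the returned length with a deadline gives the slack of the workflow; profiling algorithm variants would repeat such measurements over graphs with different structures.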
Reference:
  1. http://www.switchproject.eu
  2. Wang, J., Taal, A., Martin, P., Hu, Y., Zhou, H., Pang, J., de Laat, C., Zhao, Z. (2017) Planning Virtual Infrastructures for Time Critical Applications with Multiple Deadline Constraints, International journal of Future Generation Computer System, volume 75, pages 365-375.
More info: Arie Taal, Zhiming Zhao
Zhiming Zhao <z.zhao=>uva.nl>

Henri Trenquier <Henri.Trenquier=>os3.nl>
RP1

71

Containerized Workflow Scheduling.

Operating system (OS) containers are becoming increasingly popular among the cloud and DevOps community with emerging open source container management technologies (e.g., Docker). Orchestrator tools such as Kubernetes and Swarm can automate the deployment, scaling, and management of containerized applications, and have been adopted by many enterprises, such as eBay, PHILIPS and SAMSUNG.
This research concerns an in-depth analysis of the schedulers in Kubernetes and Swarm. Their schedulers significantly impact the availability, performance, and capacity of the container cluster. For example, they can ensure that containers are only placed on nodes that have sufficient free resources, and they try to balance out the resource utilization of nodes.
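The two behaviours mentioned above (placing containers only on nodes with sufficient free resources, and balancing utilization across nodes) can be sketched as a filter step followed by a scoring step. This is a toy model: the `Node` class, the `schedule` function, and the scoring rule (prefer the node with the largest remaining free fraction) are assumptions chosen for illustration and are not the Kubernetes or Swarm implementation.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_total: float
    mem_total: float
    cpu_free: float
    mem_free: float


def schedule(nodes, cpu_req, mem_req):
    """Place one container; return the chosen node name or None."""
    # Filter: keep only nodes with enough free CPU and memory.
    feasible = [n for n in nodes if n.cpu_free >= cpu_req and n.mem_free >= mem_req]
    if not feasible:
        return None

    # Score: average fraction of resources still free after placement.
    def score(node):
        cpu_left = (node.cpu_free - cpu_req) / node.cpu_total
        mem_left = (node.mem_free - mem_req) / node.mem_total
        return (cpu_left + mem_left) / 2

    best = max(feasible, key=score)
    best.cpu_free -= cpu_req
    best.mem_free -= mem_req
    return best.name


if __name__ == "__main__":
    cluster = [
        Node("a", cpu_total=4, mem_total=8, cpu_free=1, mem_free=2),
        Node("b", cpu_total=4, mem_total=8, cpu_free=3, mem_free=6),
        Node("c", cpu_total=2, mem_total=4, cpu_free=0.5, mem_free=1),
    ]
    print(schedule(cluster, cpu_req=1, mem_req=1))
```

Swapping the scoring function (for example, to pack containers onto as few nodes as possible) is one simple way to obtain different scheduler variants for comparison.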

The students will:
  1. Review the state of the art of container orchestration and deployment scheduling technologies
  2. Investigate how many kinds of schedulers these two systems can support
  3. Compare the performance of different schedulers to understand the characteristics of those schedulers
  4. Implement a new scheduler to enhance certain properties of the system
Reading material:
  1. Kubernetes: https://kubernetes.io/
  2. SWARM: https://docs.docker.com/swarm/
  3. Deployment scheduling: Hu, Y., Wang, J., Zhou, H., Martin, P., Taal, A., de Laat, C., and Zhao, Z. (2017) Deadline-aware Deployment for Time Critical Applications in Clouds, proceedings of the Euro-Par 2017 Conference in Santiago de Compostela, August 30- September 1, 2017 https://doi.org/10.1007/978-3-319-64203-1_25
More info: Yang Hu, Spiros Koulouzis, Zhiming Zhao
Zhiming Zhao <z.zhao=>uva.nl>

Isaac Klop <Isaac.Klop=>os3.nl>
RP1

76

Improving Semantic Quality of Topic Models for Forensic Investigations.

E-mails have been the most popular means of communication in companies since the birth of the Internet. When a company is accused of running fraudulent schemes, it has to hand over the registered communications. E-mails, as the oldest communication channel, will probably represent the largest dataset for forensic investigations. Natural Language Processing (NLP) can be used as a tool to analyse and extract meaningful information out of large sets of written information. Latent Dirichlet Allocation (LDA) [1] is a statistical model that groups documents into topics based on the frequency of word occurrences. However, the quality of the results is hard to estimate.

Following the introduction, the research question is defined as:
  • How to enhance the efficiency of NLP in email forensics?
In order to answer the main research question, the following sub-questions are defined:
  • How does the chosen number of topics influence the accuracy of the results?
  • How to rate the accuracy of Topic Modelling?
  • What additional information could be used to improve the quality of the topics?
REFERENCES
[1] David M. Blei, Andrew Y. Ng, Michael I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research 3, 2003.
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>
Carlos Martinez Ortiz <c.martinez=>esciencecenter.nl>

Henri Trenquier <Henri.Trenquier=>os3.nl>
RP2

80

Categorizing container escape methodologies in multi-tenant environments.

A relatively new class of vulnerabilities are container escape vulnerabilities. Many applications and environments require that an attacker who gains access to a Docker (or rkt, LXC, or FreeBSD jails) container cannot obtain a shell on the container host system. As container technology matures, several different escape techniques have been identified. However, container escapes are complicated matters. Some container escapes are possible against container technologies. Other container escapes are possible only against certain container technologies that have certain privileges enabled for containers. Yet other container escapes are not considered security bugs in the software, as they are a consequence of the way the administrator has configured the container.

The aim of this research is to perform a literature study of container escapes and create a systematic categorization of these security bugs. This would
  1. provide a systematic way of thinking about container escapes and
  2. identify some key recommendations for container technology developers.
Rick van Galen <vanGalen.Rick=>kpmg.nl>

Rik Janssen <rik.janssen=>os3.nl>
RP1

82

Detection of Browser Fingerprinting by Static JavaScript Code Classification.

Modern web browsers send a wealth of information to webservers, which can be used to track and identify clients. The aggregate of this information is referred to as a browser fingerprint. Web browsers like Tor browser and Brave have built-in protections against so-called 'fingerprintability', but the most common browsers for end-users do not have such protections. The uniqueness of a browser has implications for user privacy, since a unique browser can be tracked across the internet even if the user does not log into any website. Even cloned machines behind firewalls are vulnerable to clock skew or other hardware characteristics analysis.
This raises certain concerns for users:
  • How unique are default browser installations - on a fresh OS?
  • Or a fresh browser on an in-use OS?
  • How much does a user need to modify their browser or OS before they are considered unique (or close to unique) by a remote server?
  • Is it likely that a typical user would make these modifications?
  • Furthermore, what are the implications for users on less customizable platforms, such as mobile devices?
As a starting point, the EFF has a tool called Panopticlick (based on the idea of the panopticon) which analyses information sent by a web browser and informs the user how unique they are.
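One common way to quantify "how unique" a browser is, is to express it in bits of identifying information (surprisal) relative to a set of observed fingerprints. The sketch below is a minimal version of that calculation; the function names and the sample fingerprints (tuples of user agent, timezone and screen size) are hypothetical and not taken from any real data set or from the EFF tool itself.

```python
import math
from collections import Counter


def surprisal_bits(fingerprint, observed):
    """Bits of identifying information carried by one fingerprint."""
    count = Counter(observed)[fingerprint]
    if count == 0:
        raise ValueError("fingerprint not present in the observed set")
    # A fingerprint shared by 1 in 2**k browsers carries k bits.
    return -math.log2(count / len(observed))


def entropy_bits(observed):
    """Shannon entropy of the fingerprint distribution (average surprisal)."""
    total = len(observed)
    return -sum(c / total * math.log2(c / total) for c in Counter(observed).values())


if __name__ == "__main__":
    common = ("Firefox", "UTC+1", "1920x1080")
    less_common = ("Chrome", "UTC+1", "1366x768")
    observed = [common] * 4 + [less_common] * 2 + [
        ("Safari", "UTC-8", "2560x1440"),
        ("Brave", "UTC+0", "1280x720"),
    ]
    print(surprisal_bits(common, observed), entropy_bits(observed))
```

A fingerprint is unique within the observed set when its surprisal equals log2 of the set size; with enough bits, a browser can be singled out without any login.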
Aidan Barrington <Barrington.Aidan=>kpmg.nl>

Sjors Haanen <sjors.haanen=>os3.nl>
Tim van Zalingen <Tim.vanZalingen=>os3.nl>
RP1

83

Large Scale Netflow Information Management.

Netflow data enables powerful tooling to see what is happening on networks and to perform network forensics. SURFcert (*) currently uses nfsen/nfdump to process this data. This tool stores the data in a separate file for every 5-minute time frame, which makes certain queries cumbersome, such as finding which IP addresses have recently been active on the network. Combining the current data with other tools, such as ELK, Hadoop+Impala or Google BigQuery, seems much more promising. We would like to know which tool is the most promising for getting more relevant information from the current data. The project includes building a proof-of-concept setup.

(*) https://www.surf.nl/en/services-and-products/surfcert/index.html
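To illustrate why the per-5-minute layout makes the "recently active" query awkward, here is a minimal sketch that has to visit every time slot in the window and merge the results by hand. The in-memory `SLOTS` dictionary merely stands in for the separate files; its structure, the function name and the addresses are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def recently_active(slots, now, window):
    """Return {ip: start of the latest slot it appeared in} within the window."""
    last_seen = {}
    for slot_start, flows in slots.items():
        if not (now - window <= slot_start <= now):
            continue  # slot lies outside the requested time window
        for src, dst in flows:
            for ip in (src, dst):
                if ip not in last_seen or slot_start > last_seen[ip]:
                    last_seen[ip] = slot_start
    return last_seen

# Hypothetical capture: three 5-minute slots, addresses from documentation ranges.
SLOTS = {
    datetime(2018, 1, 8, 12, 0): [("192.0.2.10", "198.51.100.7")],
    datetime(2018, 1, 8, 12, 5): [("192.0.2.10", "203.0.113.5")],
    datetime(2018, 1, 8, 12, 10): [("192.0.2.99", "203.0.113.5")],
}
NOW = datetime(2018, 1, 8, 12, 12)
active = recently_active(SLOTS, NOW, timedelta(minutes=10))
```

A store that indexes flows by address and time would answer the same question with a single query, which is the kind of improvement the project is looking for.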
Marijke Kaat <marijke.kaat=>surfnet.nl>
Wim Biemolt <Wim.Biemolt=>surfnet.nl>

Shahrukh Zaidi <szaidi=>os3.nl>
Adrien Raulot <Adrien.Raulot=>os3.nl>
R

P
1
84

Using Fault Injection to weaken RSA public key verification.

Embedded systems often rely on the RSA cryptosystem for their security. For example, RSA signatures may be used to verify the authenticity of code being executed on the system. A common use case of this is secure boot.
One of the ways that the RSA cryptosystem may be attacked is by flipping bits in the public modulus. Flipping certain chosen bits can make it easier to decompose the modulus into its prime factors. With this knowledge it becomes possible to create a valid private key for this new modulus. This private key can then be used to create valid signatures under the new modulus.

Using Fault Injection attacks it is possible to introduce faults into the data being used by an embedded system. This makes the above outlined RSA attack feasible. However, when using Fault Injection it is usually not possible to flip arbitrary bits. This can lead to difficulty in attaining the desired faults.

This project is about exploring the possibilities of combining the above attacks. Given the limitations imposed by Fault Injection, is this a practical way of attacking the RSA cryptosystem?

Some of the other questions that we hope to answer are:
  • How can an RSA public modulus be modified in a way that is beneficial to an attacker?
  • Under a given fault model, which faults yield moduli that are factorable?
  • Can we efficiently create valid private keys for this modified modulus?
  • Is it practical to apply this attack against RSA?
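The questions above can be made concrete with a toy example. The sketch below assumes a single-bit-flip fault model and a tiny textbook RSA key without padding; the key values, the message and the helper names are hypothetical, and trial division only works because the numbers are small. It is meant to illustrate the idea, not the method used in the project.

```python
from math import gcd

def factorize(n):
    """Trial-division factorization; only viable for toy-sized n."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def forge_key(n_faulted, e):
    """Return a private exponent for n_faulted, or None if it is unusable."""
    factors = factorize(n_faulted)
    if len(set(factors)) != len(factors):
        return None  # repeated prime factor: skipped to keep the sketch simple
    phi = 1
    for p in factors:
        phi *= p - 1
    if gcd(e, phi) != 1:
        return None  # e has no inverse modulo phi
    return pow(e, -1, phi)  # modular inverse, requires Python 3.8+

N, E = 61 * 53, 17  # toy public key (hypothetical values)
MESSAGE = 42        # stands in for a padded hash

forgeries = []
for bit in range(N.bit_length()):
    n_faulted = N ^ (1 << bit)  # flip exactly one bit of the modulus
    d = forge_key(n_faulted, E)
    if d is None:
        continue
    signature = pow(MESSAGE, d, n_faulted)
    # The forged signature verifies under the faulted modulus.
    assert pow(signature, E, n_faulted) == MESSAGE
    forgeries.append((bit, n_faulted, signature))
```

For a real 2048-bit modulus the factorization step is the hard part, so the practical question becomes how often a fault produces a modulus that can be fully factored.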
Ronan Loftus <loftus=>riscure.com>

Ivo van der Elzen <ivo.vanderelzen=>os3.nl>
R

P
2
85

Plug-and-play Raspberry Pi-based video filtration system; A novel approach to reversible PII anonymization in videostreams using commodity hardware.

NI-1772C cameras are frequently used in healthcare settings. Video frames often have to be stripped of all metadata, and facial characteristics anonymized, before they are streamed. This research project aims at building a lightweight solution and investigates novel methods for anonymizing video stream data and facial information.

This project can be of great value if implemented correctly.

The supervisor is available full time over Skype to consult with students.
Junaid Chaudhry <chaudhry=>ieee.org>
Swann Scholtes <swann.scholtes=>os3.nl>
Chris Kuipers <chris.kuipers=>os3.nl>
R

P
1
86

Probabilistic Passphrase Cracking.

Passwords are a popular method for authenticating users, but users are required to use more and more complex passwords in order to stay ahead of attackers' computational capacity. Using phrases instead is not necessarily stronger, but random sequences of common words are easier to remember than strings of random characters.

This research project will expand on existing work in the field of passphrase cracking. The goal is to create practical tooling to crack this kind of password. Radically Open Security furthermore has an interest in integrating this software in their own tooling, so it must be easy to call through a library.
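One way to read "probabilistic" cracking is to try candidate phrases in order of decreasing likelihood. The sketch below assumes words are chosen independently with known probabilities and enumerates every phrase exhaustively; the word list, the probabilities and the function name are hypothetical, and a practical tool would need a lazy, memory-bounded enumeration instead.

```python
from itertools import product

def ranked_candidates(word_probs, length):
    """All `length`-word phrases, most probable first (independence assumed)."""
    scored = []
    for words in product(word_probs, repeat=length):
        p = 1.0
        for w in words:
            p *= word_probs[w]
        scored.append((p, " ".join(words)))
    # Sort by descending probability, then alphabetically to break ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [phrase for _, phrase in scored]

# Hypothetical word frequencies for illustration only.
WORD_PROBS = {"correct": 0.5, "horse": 0.3, "battery": 0.2}
candidates = ranked_candidates(WORD_PROBS, 2)
```

Exposing a function like this, rather than a command-line tool only, is one way to meet the requirement that the software be easy to call as a library.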
Melanie Rieback <melanie=>radicallyopensecurity.com>

Luc Gommans <luc.gommans=>os3.nl>
R

P
1
88

Automated Analysis of AWS Infrastructures.

Companies are moving their infrastructure towards the cloud, making use of infrastructure as a service (IaaS) solutions such as those provided by Amazon.
Amazon Web Services (AWS) has support for fine-grained access control for their services through Identity and Access Management (IAM). For each service, it is possible to specify which other services are allowed to interact with it, where they are positioned in the network, etc.
 
Corporate Windows networks traditionally make use of directory services for management of access rights for users and systems. Redteams that simulate cyber attacks on corporate infrastructure use specialized tooling to map relationships between Active Directory (AD) objects to identify attack paths (e.g. an easy overview of which regular user has local administrative access to a certain system that is located within a management organizational unit). Bloodhound is an example of such a tool that can map relationships between AD objects.
 
We would like to investigate whether it is possible to map relationships between different AWS services to index access rights on a network level just like Bloodhound (using the Amazon APIs). This allows redteams to identify which systems they potentially need to breach to move laterally through the network to form an attack path. At the same time, blueteams that are looking to protect their networks know where potential pivotal points are in the network that they need to reinforce.
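Once such relationships have been collected, finding an attack path reduces to a graph search. The sketch below is a plain breadth-first search over a hand-written edge list; it makes no AWS API calls, and the node names and the "can access" edges are hypothetical stand-ins for what a collector would produce.

```python
from collections import deque

def shortest_attack_path(edges, start, target):
    """Breadth-first search over directed 'can access' edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # target is unreachable from start

# Hypothetical relationships between users, roles and resources.
EDGES = [
    ("user:alice", "role:deploy"),
    ("role:deploy", "ec2:build-server"),
    ("ec2:build-server", "role:admin"),
    ("role:admin", "s3:backups"),
    ("user:bob", "s3:public-assets"),
]
path = shortest_attack_path(EDGES, "user:alice", "s3:backups")
```

The same graph serves both sides: a redteam looks for the shortest path to a target, while a blueteam can look for nodes that appear on many such paths.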
Cedric Van Bockhaven <cvanbockhaven=>deloitte.nl>

Peter Bennink <Peter.Bennink=>os3.nl>
R

P
2
89

How to spot the blue team; Red Team Infrastructure Security.

The goal of the project is to be able to know whether the blue team is onto the red team’s attack. If an analysis takes place, how can we know it is done by the blue team? Simple monitoring and logging is not sufficient, as you don’t want to raise events for legitimate Command and Control (C2) traffic, nor for random scans on the Internet. If an analysis takes place, how can the red team make this analysis as hard as possible?

To realize the goal of the project, the following research question is drawn up:
  • How to secure a red team specific infrastructure?
This research question will be answered with use of the following sub-questions:
  • What does a typical red team infrastructure look like?
  • How can a blue team’s analysis be detected?

The Python code of the proof-of-concept script is available on the following GitHub page:

The configuration files used are available on the following GitHub page (also included in the appendix of the paper):

Marc Smeets <marc=>outflank.nl>

Rick Lahaye <rick.lahaye=>os3.nl>
R

P
2
91

Predicting intermittent network device failures based on network metrics from multiple data sources.

SURFnet actively monitors its infrastructure and stores all metrics in InfluxDB and Splunk. Using SNMP, SURFnet monitors the state of the individual components and stores it in InfluxDB; syslogs are stored in Splunk. Although this data is saved, it is not processed further and no predictions are made from the metrics.

SURFnet wants to further improve its proactive stance to prevent and predict disruptions and identify anomalies within the infrastructure. This includes preventing outages, predicting events and advancing capacity management. The monitoring metrics consist of unstructured data. The goal is to use current and historical metrics to predict future events. Extending upon this research, machine learning algorithms could be used to predict anomalies based on the historical data.
Marijke Kaat <marijke.kaat=>surfnet.nl>
Peter Boers <peter.boers=>surfnet.nl>

Henk van Doorn <Henk.vanDoorn=>os3.nl>
Chris Kuipers <chris.kuipers=>os3.nl>
R

P
2
92

DoS on a Bitcoin Lightning Network channel.

Bitcoin's Lightning Network has been proposed as a layer 2 technology on top of its blockchain to facilitate higher transaction throughput. According to the authors of the Lightning Network paper, it will help scale to "billions of transactions per day with the computational power available on a modern desktop computer today. (2016)".

This research investigates a potential vulnerability in its design, by which all bitcoin in a channel can be claimed in case of a successful DoS attack. The research will provide:
  1. an exploration of this attack in theory;
  2. a simulated attack using bitcoin's testnet and the Lightning Network Daemon to test its practical feasibility;
  3. a discussion about potential solutions and their impact on bitcoin's supposed properties, e.g. that of being trustless, decentralized and permissionless.
Oskar van Deventer <oskar.vandeventer=>tno.nl>
Maarten Everts <maarten.everts=>tno.nl>

Willem Rens <Willem.Rens=>os3.nl>
R

P
2
93

Privacy Analysis of DNS Resolver Solutions.

As the first step in virtually any Internet communication, the DNS is a vital yet often overlooked component when considering users’ privacy. New protocols such as DNS-over-TLS and DNS-over-HTTPS have brought encryption to the communication between a host and its recursive resolvers. New public services allow independence from the resolvers provided by the ISP. However, simply using these new technologies doesn’t necessarily improve privacy. For instance, switching to a public DNS provider exposes all DNS queries to this provider.

The project will research various possible DNS resolver setups and analyse their implications for privacy. From this analysis, it will derive recommendations for how to set up DNS resolution under various privacy requirements.
Martin Hoffmann <martin=>NLnetLabs.nl>
Ralph Dolmans <ralph=>NLnetLabs.nl>

Jeroen van Heugten <Jeroen.vanHeugten=>os3.nl>
R

P
2
95

Targeted GPS spoofing.

This work investigates whether it is possible to directionally spoof one GPS receiver over a distance. The main research question is defined as follows:
  • Is it possible to limit GPS spoofing to a single receiver?
To answer this main research question, the following supporting sub research questions are defined:
  1. Can a spoofed GPS signal be contained within a radius of 10 meters without the use of a Faraday cage?
  2. Is it possible to direct spoofed GPS signals using a directional antenna?
  3. Does the GPS receiver still compute an accurate position when dividing the spoofed GPS signal over two transmitters?
Spoofing GPS signals is known to work, but other GPS receivers that are in range are also affected. If the impact on other receivers can be limited, GPS spoofing could be used in a variety of applications, such as moving a drone that is blocking the landing of an air ambulance. By transmitting parts of the GPS signal from multiple geographically dispersed directional antennas, one could potentially limit the impact of the GPS spoofing attack to a single GPS receiver that would be present at the intersection of the signals. Only at the intersection of the directional signals would a GPS receiver be able to see enough of the spoofed satellites to compute the spoofed location.

The researchers performed a number of experiments to investigate whether this technique could be used in practice. The directional antenna used did not direct the signal sufficiently and leaked a large amount of signal on the side. Transmitting part of the signal from different antennas worked; however, due to synchronization issues, the receiver would have an error in its position of between 250 meters and 18 kilometers. A more directional antenna and more precise time synchronization are required for successful use in practice.
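To get a feel for how tight the synchronization would have to be, a rough first-order relation is that a timing offset of dt between transmitters shifts a measured range by about c·dt. The sketch below inverts that relation for the reported error bounds. It is an order-of-magnitude estimate only: the actual position error also depends on satellite geometry, and the function name is chosen for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458

def timing_offset_for_range_error(range_error_m):
    """Timing offset in seconds that corresponds to a given range error."""
    return range_error_m / SPEED_OF_LIGHT_M_PER_S

# Reported position errors: between 250 meters and 18 kilometers.
offset_low = timing_offset_for_range_error(250)      # roughly 0.83 microseconds
offset_high = timing_offset_for_range_error(18_000)  # roughly 60 microseconds
```

Under this approximation, even the smallest reported error corresponds to a sub-microsecond offset, which suggests why precise time synchronization between the transmitters matters.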
Ralph Moonen <ralph.moonen=>secura.com>

Luc Gommans <luc.gommans=>os3.nl>
Bart Hermans <bart.hermans=>os3.nl>
R

P
2
96

Verifying email security techniques for Dutch organizations.

Email has become important for organizations to communicate and exchange (sensitive) information. Email security was not taken into account during the original design of the email protocols. Therefore, different techniques have emerged to secure email communication and to validate emails. Many governments have defined guidelines that require or strongly recommend implementing these techniques to improve email security [2] [4]. Some studies have shown that not every organization or mail provider has adopted or implemented these techniques [3]. This research investigates which and how many of these techniques have been adopted or implemented by organizations within the Netherlands.
Therefore, a list of Dutch organizations has to be created. The list will be scraped and parsed from open sources. Furthermore, the parsed list should also contain the number of employees and type of industry per organization.

The main research question defined for this project is as follows:
  • How many email security techniques have been adopted or implemented within Dutch organizations?
In order to answer the research question, I have defined the following sub-questions:
  1. Which techniques exist to secure email?
  2. What is the most feasible way to generate a list of Dutch organizations that also contains the number of employees and type of industry per organization?
  3. How can you determine per organization whether or not these techniques have been adopted or implemented?
  4. Which type of industry has adopted the most and the least email security techniques?
  5. Is there a distinction between small, medium and large organizations in terms of adoption of email security techniques?
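For sub-question 3, adoption can often be inferred from what a domain publishes in DNS. The sketch below assumes SPF and DMARC are among the techniques examined (the summary does not name them) and works on TXT record strings that have already been retrieved; the DNS lookup itself is left out, and the function names and sample records are hypothetical.

```python
def has_spf(txt_records):
    """True if any TXT record at the domain is an SPF policy."""
    for record in txt_records:
        text = record.strip().lower()
        if text == "v=spf1" or text.startswith("v=spf1 "):
            return True
    return False

def dmarc_policy(txt_records):
    """Return the p= value of a DMARC record from _dmarc.<domain>, or None."""
    for record in txt_records:
        tags = [t.strip() for t in record.split(";") if t.strip()]
        # Lenient check of the version tag, which must come first.
        if not tags or tags[0].replace(" ", "").lower() != "v=dmarc1":
            continue
        for tag in tags[1:]:
            name, _, value = tag.partition("=")
            if name.strip().lower() == "p":
                return value.strip().lower()
    return None

# Hypothetical records as they might be returned for example.org.
domain_txt = ["google-site-verification=abc", "v=spf1 include:_spf.example.org -all"]
dmarc_txt = ["v=DMARC1; p=reject; rua=mailto:dmarc@example.org"]
spf_found = has_spf(domain_txt)
policy = dmarc_policy(dmarc_txt)
```

Running such checks over the scraped organization list, joined with employee counts and industry type, would give the per-industry and per-size breakdowns asked for in sub-questions 4 and 5.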
George Thessalonikefs <george=>nlnetlabs.nl>
Ralph Dolmans <ralph=>NLnetLabs.nl>

Vincent van Dongen <vincent.vandongen=>os3.nl>
R

P
2
97

Blockchain-based Sybil Attack Mitigation: A Case Study of the I2P Network.

Providing perfectly anonymous communication over the Internet is an important yet difficult goal. Among other anonymity networks, I2P, the Invisible Internet Project, aims to offer this high level of invisibility and provide anonymity for all practical purposes. Contrary to the well-known and well-researched Tor project, the number of active users of I2P is manageable and analyses of its security are rare. The fact that the I2P network runs in a completely decentralized fashion has several advantages, such as better scalability and the absence of a trusted central party. However, there are also security risks associated with this environment.
The main issue that will be addressed in this research is the mitigation of Sybil attacks. These attacks take advantage of the peer-to-peer nature of I2P by creating a lot of identities so that more control is gained over the network. Using existing blockchain mechanisms, these kinds of attacks could possibly be mitigated.

Our main research question is defined as follows:
  • How can existing blockchain mechanisms be used to mitigate Sybil attacks on the I2P network?
The following sub-questions are formulated from the main research question:
  • What are the existing blockchain mechanisms to prevent abuse of the network?
  • What is the feasibility of using blockchain mechanisms to prevent Sybil attacks against I2P?
  • How does the proposed solution extend to other anonymity networks?
Henri Hambartsumyan <HHambartsumyan=>deloitte.nl>
Vincent van Mieghem <vvanmieghem=>deloitte.nl>

Dirk Gaastra <Dirk.Gaastra=>os3.nl>
Kotaiba Alachkar <kotaiba.alachkar=>os3.nl>
R

P
2
98

Tor: Finding the Hidden Shallots.

The need for privacy and anonymity has been on the rise, partially due to the increase in cybercrime. To meet this need, overlay networks such as The Onion Router (Tor) have been created. The Tor network provides its users with privacy and anonymity by encrypting its traffic in layers, using a technique known as onion routing. This technique ensures each node is only capable of decrypting its own layer. Onion routing by itself only provides privacy and anonymity protection to the client. For the server to also be protected, the Tor project introduced the concept of hidden services. Hidden services work by using rendezvous connection points to talk on the server’s behalf. For a client to know where to find these rendezvous points, it needs the onion link to request descriptors. These descriptors are published by the server to a distributed hash table called Hidden Directory servers. This project aims to take advantage of the low requirements for becoming a Hidden Directory server in order to gain access to hidden service requests and publications. This information will then be used to extract intelligence from hidden services to enable their identification and classification. This project would allow law enforcement to find illegal activities and help monitoring solutions identify cyber threats.

Rik van Duijn <Rik.vanDuijn=>dearbytes.nl>
Leandro Velasco <Leandro.Velasco=>dearbytes.nl>

Joao Marques <Joao.Marques=>os3.nl>

R

P
2

Presentations-rp2

I would hereby like to invite you to the annual RP2 presentations, where the SNE students will be presenting their research. Considering the wide variety of presentations, the day promises to be very interesting and we hope you will join us.
Program (Printer friendly version: HTML, PDF): The event is spread over two days: Tuesday July 3 and Thursday July 5, 2018.
Tuesday July 3, 2018, Auditorium C0.110, FNWI, Sciencepark 904, Amsterdam.
Time | D | #RP | Title | Name(s) | LOC | RP | #stds
12h40 | | | Welcome, introduction. | Cees de Laat | | |
12h45 | 25 | 91 | Predicting intermittent network device failures based on network metrics from multiple data sources. | Henk van Doorn, Chris Kuipers | SURFnet | 2 | 2
13h10 | 20 | 88 | Attack path mapping on Amazon Web Services. | Peter Bennink | Deloitte | 2 | 1
13h30 | 20 | 5 | NinjaWrite to NTFS drives on Windows. | Rick van Gorp | Deloitte | 2 | 1
13h50 | 20 | 35 | Bypassing Phishing Filters. | Shahrukh Zaidi | KPMG | 2 | 1
14h10 | 20 | | (bio) break | | | |
14h30 | 25 | 97 | Mitigating Sybil Attacks on the I2P Network Using Blockchain. | Dirk Gaastra, Kotaiba Alachkar | Deloitte | 2 | 2
14h55 | 25 | 95 | Targeted GPS spoofing. | Luc Gommans, Bart Hermans | Secura | 2 | 2
15h20 | 25 | 23 | DDoS Defense Mechanisms for IXP Infrastructures. | Lennart van Gijtenbeek, Tim Dijkhuizen | AMSiX | 2 | 2
15h45 | 15 | | (bio) break | | | |
16h00 | 20 | 1 | Security By Default; A Comparative Security Evaluation of Default Configuration. | Bernardus Jansen | TUDelft/UvA | 2 | 1
16h20 | 20 | 93 | Privacy Analysis of DNS Resolver Solutions. | Jeroen van Heugten | NLNetLabs | 2 | 1
16h40 | 20 | 96 | Verifying email security techniques for Dutch organizations. | Vincent van Dongen | NLNetLabs | 2 | 1
17h00 | | * | End | | | |



Thursday July 5, 2018, Auditorium C0.110, FNWI, Sciencepark 904, Amsterdam.
Time | D | #RP | Title | Name(s) | LOC | RP | #stds
12h40 | | | Welcome, introduction. | Cees de Laat | | |
12h45 | 25 | 30 | Content-based Classification of Fraudulent Webshops. | Sjors Haanen, Mick Cox | SIDN | 2 | 2
13h10 | 20 | 92 | DoS on a Bitcoin Lightning Network channel. | Willem Rens | TNO | 2 | 1
13h30 | 20 | 84 | Using Fault Injection to weaken RSA public key verification. | Ivo van der Elzen | Riscure | 2 | 1
13h50 | 20 | 27 | Large scale Log Analytics. | Marcel den Reijer | VANCIS | 2 | 1
14h10 | 20 | | bio break | | | |
14h30 | 25 | 22 | Optimum Implementation of TI-LFA and Segment Routing on SURFnet 8. | Peter Prjevara, Fouad Makioui | SURFnet | 2 | 2
14h55 | 20 | 98 | Tor: Finding the Hidden Shallots. | Joao Marques | Dearbytes | 2 | 1
15h15 | 20 | 76 | Topic Models of Large Document Collections. | Henri Trenquier | UvA | 2 | 1
15h35 | 25 | | break | | | |
16h00 | 20 | 72 | Scientific workflow optimization using system logs; Provenance data integration for workflows. | Alexander Blaauwgeers | UvA | 1 | 1
16h20 | 20 | 71 | Container deployment scheduling in Kubernetes/Swarm. | Isaac Klop | UvA | 1 | 1
16h40 | 20 | 19 | Investigating the scale-invariance of graph algorithm performance. | Tim van Zalingen | UvA | 2 | 1
17h00 | | * | End | | | |



Presentations-rp1

Program (Printer friendly version: HTML, PDF): Monday Feb 5th 2018, 13h00 - 16h40 in B1.23 at Science Park 904 NL-1098XH Amsterdam.
(all presentations are 20 minutes for single and 25 minutes for pairs of students, yellow = requested specific day/time.)

Time | D | #RP | Title | Name(s) | LOC | RP | #stds
13h00 | | | Welcome, introduction. | Cees de Laat | | |
13h05 | 20 | 20 | Analysing ELF binaries to find compiler switches that were used. | Kenneth van Rijsbergen | Tjaldur | 2 | 1
13h25 | 20 | 16 | Network Functions Virtualization. | Bernardus Jansen | SURFnet | 1 | 1
13h45 | 25 | 83 | Netflow Information Management. | Shahrukh Zaidi, Adrien Raulot | SURFnet | 1 | 2
14h10 | 20 | | bio break | | | |
14h30 | 20 | 89 | Protecting Red Team infrastructures. | Rick Lahaye | OutFlank | 2 | 1
14h50 | 20 | 50 | Virtual infrastructure partitioning and provisioning under nearly real-time constraints. | Andrey Afanasyev | SNE | 1 | 1
15h10 | 20 | 64 | Improving Machine Learning Based Intrusion Detection for SCADA Systems using case specific information. | Peter Prjevara | Deloitte | 1 | 1
15h30 | 20 | | break | | | |
15h50 | 25 | 42 | An Analysis of Atomic Swaps on and between Ethereum Blockchains. | Peter Bennink, Lennart van Gijtenbeek | TNO | 1 | 2
16h15 | 25 | 85 | Raspberry Pi-based video filtration system. | Swann Scholtes, Chris Kuipers | JC | 1 | 2
16h40 | | * | End | | | |



Tuesday Feb 6th 2018, 10h10 - 16h20 in room B1.23 at Science Park 904 NL-1098XH Amsterdam.
Program:
Time | D | #RP | Title | Name(s) | LOC | RP | #stds
10h10 | | | Welcome, introduction. | Cees de Laat | | |
10h10 | 25 | 53 | Feasibility of Cryptocurrency on Mobile devices. | Sander Lentink, Anas Younis | KPMG | 1 | 2
10h35 | 25 | 37 | Pentest Accountability By Analyzing Network Traffic & Network Traffic Metadata. | Marko Spithoff, Henk van Doorn | KPMG | 1 | 2
11h00 | 15 | | bio/coffee break | | | |
11h15 | 25 | 82 | Detection of Browser Fingerprinting by Static JavaScript Code Classification. | Sjors Haanen, Tim van Zalingen | KPMG | 1 | 2
11h40 | 20 | 80 | Container escape methodologies. | Rik Janssen | KPMG | 1 | 1
12h00 | | | Lunch | | | |
13h00 | 20 | 57 | Capability testing of data transfer tools on a high latency 100 Gbit/s lightpath. | Kees de Jong | KLM | 1 | 1
13h20 | 20 | 86 | Practical Passphrase Cracking. | Luc Gommans | RadicallyOpenSecurity | 1 | 1
13h40 | 20 | | bio/tea break | | | |
14h00 | 25 | 62 | Breaking CAPTCHAs on the Dark Web. | Dirk Gaastra, Kevin Csuka | Fox-IT | 1 | 2
14h25 | 25 | 61 | Deanonymisation in Ethereum Using Existing Methods for Bitcoin. | Tim Dijkhuizen, Robin Klusman | Fox-IT | 1 | 2
14h50 | 20 | | bio break | | | |
15h10 | 25 | 14 | Microsoft Office Upload Center Cache Files in Forensic Investigations. | Rick van Gorp, Kotaiba Alachkar | Fox-IT | 1 | 2
15h35 | 20 | 66 | Profiling critical path related algorithms for different graph families and deadlines distributions. | Henri Trenquier | UvA | 1 | 1
15h55 | | * | End | | | |



Out of normal schedule presentations: Room B1.23 at Science Park 904 NL-1098XH Amsterdam. Program:
Date Time Place D #RP Title Name(s) LOC RP #stds












*
End