SNE Master Research Projects 2018 - 2019



Cees de Laat, room: C.3.152
And the OS3 staff.
Course Codes:

Research Project 1 53841REP6Y
Research Project 2 53842REP6Y


RP1 (January):
  • Wednesday Nov 01, 2018, 10h15-13h00: Introduction to the Research Projects.
  • Wednesday Dec 05, 2018, 10h15-13h00: Detailed discussion on selections for RP1.
  • Monday Jan 7th - Friday Feb 1st 2019: Research Project 1.
  • Friday Jan 11th: (updated) research plan due.
  • Monday Jan 21, 2019, 16h00, progress meeting (not mandatory).
  • Monday Feb 4, 2019 15h00-17h00: Presentations RP1 in B1.23 at SP 904.
  • Tuesday Feb 5, 2019 10h00 - 17h00: Presentations RP1 in B1.23 at SP 904.
  • Sunday Feb 10, 24h00: RP1 - reports due
RP2 (June):
  • Wednesday May 22, 2019, 14h00-16h00, B1.23 Detailed discussion on chosen subjects for RP2.
  • Monday Jun 3rd - Friday Jun 28, 2019: Research Project 2.
  • Friday Jun 7th: (updated) research plan due.
  • Monday Jun 17: come back day 16h00.
  • Wednesday Jul 3 2019, 12h00-17h00: presentations in C0.110 @ SP904.
  • Thursday Jul 4 2019, 12h00-17h00: presentations in C0.110 @ SP904.
  • Sunday Jul 7th 24h00: RP2 - reports due


Here is a list of student projects. Find here the left over projects this year: LeftOvers.
In a futile attempt to prevent spam "@" is replaced by "=>" in the table.
Color of cell background:
Project available Presentation received. Confidentiality was requested.
Currently chosen project. Report received. Blocked, not available.
Project plan received. Completed project. Report but no presentation
Outside normal rp timeframe project will be done in next block


supervisor contact



End-to-end automated email component testing.

Handling electronic mail in the modern age involves many different software components, as well as significant configuration skills and regular maintenance. This creates a large surface for human error. What is currently missing is an end-to-end automated email component test that system administrators running email systems can use to see if all the components in their actual setup are fully functional. The research question is defined as:
  • To what extent can we prove an e-mail server is properly set up via end-to-end component testing?
In order to answer the main research question, the following sub-questions are defined:
  • What are relevant e-mail server components?
  • Which features are missing in the current mail testing websites, that are required in an end-to-end system?
  • What tests can we run on those missing components?
Code can be found on: https://gitlab.os3.nl/Networking/pogo
Michiel Leenaars <michiel=>nlnet.nl>

Isaac Klop <Isaac.Klop=>os3.nl>
Kevin Csuka <kevin.csuka=>os3.nl>


Virtualization vs. Security Boundaries.

Traditionally, security defenses are built upon a classification of the sensitivity and criticality of data and services. This leads to a logical layering into zones, with an emphasis on command and control at the point of inter-zone traffic. The classical "defense in depth" approach applies a series of defensive measures to network traffic as it traverses the various layers.

Virtualization erodes the natural edges, and this affects the guarding of system and network boundaries. In turn, additional technology is developed to add instruments to virtual infrastructure. The question that arises is the validity of this approach in terms of fitness for purpose, maintainability, scalability and practical viability.
Jeroen Scheerder <Jeroen.Scheerder=>on2it.net>


Reducing Live Streaming Latency.

Live streaming latency refers to the delay between a camera capturing an event and that event being displayed to viewers. When using popular live streaming platforms (e.g., Periscope, Meerkat, or YouTube Live), streaming latencies of more than 30 seconds are not uncommon. For our live streaming application, we want to reduce the live streaming latency to a maximum of a few seconds.

For this project, the student is asked to:
  • configure a back end module that converts the RTMP-based video ingest to DASH-based output,
  • configure the DASH player for low-latency streaming, and
  • perform experiments to find the lowest acceptable streaming latency given different network conditions.
Omar Niamut <omar.niamut=>tno.nl>




Blockchain's Relationship with Sovrin for Digital Self-Sovereign Identities.

Summary: Sovrin (sovrin.org) is a blockchain for self-sovereign identities. TNO operates one of the nodes of the Sovrin network. Sovrin enables easy exchange and verification of identity information (e.g. “age=18+”) for business transactions. Potential savings are estimated to be over 1 B€ per year for just the Netherlands. However, Sovrin provides only an underlying infrastructure. Additional query-response protocols are needed. This is being studied in e.g. the Techruption Self-Sovereign-Identity-Framework (SSIF) project. The research question is which functionalities are needed in the protocols for this. The work includes the development of a data model, as well as an implementation that connects to the Sovrin network.
Oskar van Deventer <oskar.vandeventer=>tno.nl>


Qualitative analysis of Internet measurement methods and bias.

In the past year NLnet Labs and other organisations have run a number of measurements on DNSSEC deployment and validation. We used the RIPE Atlas infrastructure for measurements, while others used Google ads where flash code runs the measurements. The results differ as the measurement points (or observation points) differ: RIPE Atlas measurement points are mainly located in Europe, while Google ads flash measurements run globally (or with some stronger representation of East Asia).

The question is whether we can quantify the bias in the Atlas measurements, or qualitatively compare the measurements, so that we can correlate the results of both measurement platforms. This would greatly help interpret our results and the results from others based on the Atlas infrastructure. The results are highly relevant as many operational discussions on DNS and DNSSEC deployment are supported or falsified by these kinds of measurements.
Willem Toorop <willem=>nlnetlabs.nl>


Building an open-source, flexible, large-scale static code analyzer.

Background information
Data drives business, and maybe even the world. Businesses that make it their business to gather data are often aggregators of client-side generated data. Client-side generated data, however, is inherently untrustworthy. Malicious users can construct their data to exploit careless, or naive, programming and use this malicious, untrusted data to steal information or even take over systems.
It is no surprise that large companies such as Google, Facebook and Yahoo spend considerable resources in securing their own systems against would-be attackers. Generally, many methods have been developed to make untrusted data cross the trust-boundary to trusted data, and effectively make malicious data harmless. However, securing your systems against malicious data often requires expertise beyond what even skilled programmers might reasonably possess.
Problem description
Ideally, tools that analyze code for vulnerabilities would be used to detect common security issues. Such tools, or static code analyzers, exist, but are either out-dated (http://rips-scanner.sourceforge.net/) or part of very expensive commercial packages (https://www.checkmarx.com/ and http://armorize.com/). Next to the need for an open-source alternative to the previously mentioned tools, we also need to look at increasing our scope. Rather than focusing on a single codebase, the tool would ideally be able to scan many remote, large-scale repositories and report the findings back in an easily accessible way.
An interesting target for this research would be very popular, open-source (at this stage) Content Management Systems (CMSs), and specifically plug-ins created for these CMSs. CMS cores are held to a very high coding standard and are often relatively secure. Plug-ins, however, are necessarily less so, but are generally as popular as the CMSs they’re created for. This is problematic, because an insecure plug-in is as dangerous as an insecure CMS. Experienced programmers and security experts generally audit the most popular plug-ins, but this is: a) very time-intensive, b) prone to errors and c) of limited scope, i.e. not every plug-in can be audited. For example, if it was feasible to audit all aspects of a CMS repository (CMS core and plug-ins), the DigiNotar debacle could have easily been avoided.
Research proposal
Your research would consist of extending our proof-of-concept static code analyzer written in Python and using it to scan code repositories, possibly of some major CMSs and their plug-ins, for security issues and finding innovative ways of reporting on the massive amount of possible issues you are sure to find. Help others keep our data that little bit more safe.
Patrick Jagusiak <patrick.jagusiak=>dongit.nl>
Wouter van Dongen <wouter.vandongen=>dongit.nl>

Ivar Slotboom <islotboom=>os3.nl>


Collaborative work with Augmented and Virtual Reality – Unity based network infrastructure.

Although the principles have been around for some time, Augmented and Virtual Reality are finally becoming usable for the consumer market. Nowadays, the prominent game engines are used for development of Mixed Reality (AR+VR) applications. This research follows the vision that different users with different devices should be able to connect to a common server and collaborate virtually by using either AR or VR head-mounted displays or mobile devices like smartphones.
Research question:
  • How does latency impact the quality of collaboration across different visualization and device options?
There are existing network capabilities of Unity, existing AR/VR frameworks that can be built out of Unity, and existing connectors (which combine for example HTC Vive to Hololens).
The student is asked to:
  • Build a server infrastructure on which users can connect with different devices
  • Build a build-infrastructure for different devices
The software framework will be published under an open source license after the end of the project.
Doris Aschenbrenner <d.aschenbrenner=>tudelft.nl>


Sensor data streaming framework for Unity.

In order to build a Virtual Reality “digital twin” of an existing technical framework (like a smart factory), the static 3D representation needs to “play” sensor data which either is directly connected or comes from a stored snapshot. Although a specific implementation of this already exists, the student is asked to build a more generic framework for this, which is also able to “play” position data of parts of the infrastructure (for example moving robots). This will enable the research on virtually working on a digital twin factory.
Research question:
  • What are the requirements and limitations of a seamless integration of smart factory sensor data for a digital twin scenario?
There are existing network capabilities of Unity, existing connectors from Unity to ROS (Robot Operating System) for sensor data transmission and an existing 3D model which uses position data.
The student is asked to:
  • Build a generic infrastructure which can either play live data or snapshot data.
  • The sensor data will include position data, but also other properties which are displayed in graphs and should be visualized by 2D plots within Unity.
The software framework will be published under an open source license after the end of the project.
Doris Aschenbrenner <d.aschenbrenner=>tudelft.nl>


Research MS Enhanced Mitigation Experience Toolkit (EMET).

Every month new security vulnerabilities are identified and reported. Many of these vulnerabilities rely on memory corruption to compromise the system. For most vulnerabilities a patch is released after the fact to remediate the vulnerability. Nowadays there are also new preventive security measures that can prevent vulnerabilities from becoming exploitable without availability of a patch for the specific issue. One of these technologies is Microsoft’s Enhanced Mitigation Experience Toolkit (EMET), which adds additional protection to Windows, preventing many vulnerabilities from becoming exploitable. We would like to research whether this technology is effective in practice and can indeed prevent exploitation of a number of vulnerabilities without applying the specific patch. We would also like to research whether running EMET has other impact on the system, for example a noticeable performance drop or common software which does not function properly once EMET is installed. If time permits it is also interesting to see if existing exploits can be modified to work in an environment protected by EMET.
Henri Hambartsumyan <HHambartsumyan=>deloitte.nl>


Triage software.

In previous research a remote acquisition and storage solution was designed and built that allowed sparse acquisition of disks over a VPN using iSCSI. This system allows sparse reading of remote disks. The triage software should decide which parts of the disk must be read. The initial goal is to use meta-data to retrieve the blocks that are assumed to be most relevant first. This is in contrast to techniques that perform triage by running remotely while performing a full disk scan (e.g. run bulk_extractor remotely, keyword scan or do a hash based filescan remotely).

The student is asked to:
  1. Define criteria that can be used for deciding which (parts of) files to acquire
  2. Define a configuration document/language that can be used to order based on these criteria
  3. Implement a prototype for this acquisition
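
The idea of ordering acquisition by meta-data criteria can be sketched roughly as follows. This is only an illustration under stated assumptions: the `FileMeta` record, the extension weights and the scoring rule are hypothetical and are not taken from the existing acquisition system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical meta-data record for one file on the remote disk."""
    path: str
    size: int      # bytes
    mtime: float   # modification time, seconds since the epoch


# Hypothetical weights; a configuration document would supply these.
EXTENSION_WEIGHT = {".docx": 3.0, ".pdf": 3.0, ".jpg": 2.0, ".dll": 0.1}


def extension(path: str) -> str:
    """Return the lower-cased extension of the file name, or '' if none."""
    name = path.rsplit("/", 1)[-1]
    dot = name.rfind(".")
    return name[dot:].lower() if dot > 0 else ""


def score(meta: FileMeta, now: float) -> float:
    """Higher score means 'acquire earlier' (assumed rule: type and recency)."""
    weight = EXTENSION_WEIGHT.get(extension(meta.path), 1.0)
    age_days = max(0.0, (now - meta.mtime) / 86400.0)
    recency = 1.0 / (1.0 + age_days)  # newer files score higher
    return weight * (1.0 + recency)


def acquisition_order(files, now):
    """Sort files so that the most relevant ones are read first."""
    return sorted(files, key=lambda m: score(m, now), reverse=True)
```

With such a scoring function, a recently modified document would be fetched before a system library of the same age, without scanning the full disk.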
"Ruud Schramp (DT)" <schramp=>holmes.nl>
"Zeno Geradts (DT)" <zeno=>holmes.nl>
"Erwin van Eijk (DT)" <eijk=>holmes.nl>


Network Functions Virtualization and Security.

The security threat landscape is ever changing with cyber-attacks becoming increasingly sophisticated and targeted. This accompanied by the fact that an increasing number of applications and data are moving into the cloud only further complicates the situation. The traditional approach of securing an organization with a firewall is probably not sufficient anymore.

The question to be addressed by the research assignment is to investigate this issue and suggest solutions for the research and education sector in the Netherlands.
  • How are the research and education institutions in the Netherlands securing themselves from cyber-attacks today?
  • What additional measures need to be taken and which functionality needs to be added to their infrastructure (IPS, IDS?)
  • Can Network Functions Virtualization play a role in providing (part) of the security functionality?
  • Would it be useful if the required (virtualized) functionality is centrally arranged on a common NFV platform instead of each research and education institution arranging this for themselves?
Richa Malhotra <richa.malhotra=>surfnet.nl>

Rik Janssen <Rik.Janssen=>os3.nl>


Technical feasibility of Segment Routing Traffic Engineering to steer traffic through VNFs.

Steering traffic to VNFs (Virtual Network Functions) in a network makes it possible to deliver tailored services to end users, such as firewalling and traffic inspection, as well as load balancing. In this project we look at the suitability of using segment routing to deliver the traffic to the VNFs. The project is carried out at SURFnet and will use virtual and physical testbeds for the validation of the concept.
Marijke Kaat <marijke.kaat=>surfnet.nl>
Eyle Brinkhuis <eyle.brinkhuis=>surfnet.nl>

Ronald van der Gaag <rgaag=>os3.nl>
Mike Slotboom <mslotboom=>os3.nl>


Comparison of security features of major Enterprise Mobility Management solutions

For years, Gartner has identified the major EMM (formerly known as MDM) vendors. These vendors are typically rated on performance and features; security often is not addressed in detail.
This research concerns an in-depth analysis of the security features of major EMM solutions (such as MobileIron, Good, AirWatch, XenMobile, InTune, and so forth) on major mobile platforms (iOS, Android, Windows Phone). Points of interest include: protection of data at rest (containerization and encryption), protection of data in transit (e.g. VPN), local key management, and vendor-specific security features (added to platform APIs).
Paul van Iterson <vanIterson.Paul=>kpmg.nl>


Designing structured metadata for CVE reports.

Vulnerability reports such as MITRE's CVE are currently free format text, without much structure in them. This makes it hard to machine process reports and automatically extract useful information and combine it with other information sources. With tens of thousands of such reports published each year, it is increasingly hard to keep a holistic overview and see patterns. With our open source Binary Analysis Tool we aim to correlate data with firmware databases.

Your task is to analyse how we can use the information from these reports, what metadata is relevant and propose a useful metadata format for CVE reports. In your research you make an inventory of tools that can be used to convert existing CVE reports with minimal effort.

Armijn Hemel - Tjaldur Software Governance Solutions
Armijn Hemel <armijn=>tjaldur.nl>


Verification of Object Location Data through Picture Data Mining Techniques.

Shadows in the open give out more information about the location of the objects in the pictures. Based on the positioning, length, and reflection side of the shadow, the location information found in the meta data of a picture can be verified. The objective of this project is to develop algorithms that find freely available images on the internet where tampering with the location data has been performed. The deliverables from this project are the location verification algorithms, a live web service that verifies the location information of the object, and a non-public facing database that contains information about images whose location information in the meta-data was removed or falsely altered.
Junaid Chaudhry <chaudhry=>ieee.org>


Multicast delivery of HTTP Adaptive Streaming.

HTTP Adaptive Streaming (e.g. MPEG DASH, Apple HLS, Microsoft Smooth Streaming) is responsible for an ever-increasing share of streaming video, replacing traditional streaming methods such as RTP and RTMP. The main characteristic of HTTP Adaptive Streaming is that it is based on the concept of splitting content up in numerous small chunks that are independently decodable. By sequentially requesting and receiving chunks, a client can recreate the content. An advantage of this mechanism is that it allows a client to seamlessly switch between different encodings (e.g. qualities) of the same content.
There is a growing interest from both content parties as well as operators and CDNs to not only be able to deliver these chunks over unicast via HTTP, but to also allow for them to be distributed using multicast. The question is how current multicast technologies could be used, or adapted, to achieve this goal.
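
The per-chunk switching between encodings described above can be illustrated with a small sketch. It assumes a simple throughput-based rule (pick the highest bitrate that fits within a safety margin of the measured throughput); the function names and the 0.8 safety factor are hypothetical, and real players use more elaborate adaptation logic.

```python
def pick_bitrate(available_kbps, throughput_kbps, safety=0.8):
    """Choose the highest encoding that fits the measured throughput.

    Falls back to the lowest encoding when nothing fits.
    """
    budget = throughput_kbps * safety
    fitting = [b for b in available_kbps if b <= budget]
    return max(fitting) if fitting else min(available_kbps)


def plan_session(available_kbps, throughput_samples_kbps):
    """Pick an encoding for each sequentially requested chunk."""
    return [pick_bitrate(available_kbps, t) for t in throughput_samples_kbps]
```

Because every chunk is independently decodable, the client can change encoding at each chunk boundary, which is what `plan_session` models.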
Ray van Brandenburg <ray.vanbrandenburg=>tno.nl>


Generating test images for forensic file system parsers.

Traditionally, forensic file system parsers (such as The Sleuthkit and the ones contained in Encase/FTK etc.) have been focused on extracting as much information as possible. The state of software in general is lamentable — new security vulnerabilities are found every day — and forensic software is not necessarily an exception. However, software bugs that affect the results used for convictions or acquittals in criminal court are especially damning. As evidence is increasingly being processed in large automated bulk analysis systems without intervention by forensic researchers, investigators unversed in the intricacies of forensic analysis of digital materials are presented with multifaceted results that may be incomplete, incorrect, imprecise, or any combination of these.

There are multiple stages in an automated forensic analysis. The file system parser is usually one of the earlier analysis phases, and errors (in the form of faulty or missing results) produced here will influence the results of the later stages of the investigation, and not always in a predictable or detectable manner. It is relatively easy (modulo programmer quality) to create strict parsers that bomb-out on any unexpected input. But real-world data is often not well-formed, and a parser may need to be able to resync with input data and resume on a best-effort basis after having reached some unexpected input in the format. While file system images are being (semi-) hand-generated to test parsers, when doing so, testers are severely limited by their imagination in coming up with edge cases and corner cases. We need a file system chaos monkey.

The assignment consists of one of the following (may also be spawned in a separate RP):
  1. Test image generator for NTFS. Think of it as some sort of fuzzer for forensic NTFS parsers. NTFS is a complex filesystem which offers interesting possibilities to trip a parser or trick it into yielding incorrect results. For this project, familiarity with C/C++ and the use of the Windows API is required (but only as much as is necessary to create function wrappers). The goal is to automatically produce "valid" — in the sense of "the bytes went by way of ntfs.sys" — but hopefully quite bizarre NTFS images.
  2. Another interesting research avenue lies in the production of /subtly illegal/ images. For instance, in FAT, it should be possible, in the data format, to double-book clusters (akin to a hard link). It may also be possible to create circular structures in some file systems. It will be interesting to see if and how forensic filesystem parsers deal with such errors.
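
To make the two anomalies from the second option concrete, the following sketch detects double-booked clusters and circular chains. It assumes a deliberately simplified model: the FAT is a dictionary mapping each cluster number to the next one, with `None` as end-of-chain marker, whereas real FAT variants use reserved table values instead. All names are hypothetical.

```python
END = None  # end-of-chain marker in this simplified model


def walk_chain(fat, start):
    """Follow a cluster chain; return (clusters_in_order, has_cycle)."""
    seen = []
    visited = set()
    cluster = start
    while cluster is not END:
        if cluster in visited:
            return seen, True  # chain loops back onto itself
        visited.add(cluster)
        seen.append(cluster)
        cluster = fat.get(cluster, END)
    return seen, False


def find_anomalies(fat, file_starts):
    """Return (cross_linked_clusters, files_with_cycles).

    cross_linked_clusters maps a cluster to the set of files claiming it.
    """
    owners = {}
    cyclic = set()
    for name, start in file_starts.items():
        clusters, has_cycle = walk_chain(fat, start)
        if has_cycle:
            cyclic.add(name)
        for c in clusters:
            owners.setdefault(c, set()).add(name)
    cross = {c: names for c, names in owners.items() if len(names) > 1}
    return cross, cyclic
```

A test image generator could use the inverse of this check: deliberately emit tables for which `find_anomalies` reports something, and then observe how a parser reacts.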
"Wicher Minnaard (DT)" <wicher=>holmes.nl>
Zeno Geradts <zeno=>holmes.nl>


Various projects @ Deloitte.

Please follow the link below and look specifically for the one month projects. Inform me (CdL) which one you want to do and we create a separate project number for that.



(In)security of java usage in large software frameworks and middleware.

Java is used in almost all large software application packages. Examples of such packages are middleware (Tomcat, JBoss and WebSphere) and products like SAP and Oracle. The goal of this research is to investigate the possible attacks that exist on Java (e.g. RMI) as used in such large software packages and to develop a framework to securely deploy (or attack) those.
Martijn Sprengers <Sprengers.Martijn=>kpmg.nl>


Network Peering Dashboard for SURFnet.

SURFnet is the National Research and Education Network; among other services, we provide internet connectivity to research and higher education in the Netherlands. To do this as well as we can, we need tooling that gives us a good overview of our external connectivity and all our peers.

The student we are looking for would help us implement and build a peering dashboard and have this dashboard interact with external information sources (such as http://peeringdb.com), our ticketing tool and our automation environment.

Research questions include: What information should be presented in this tool to help SURFnet provide the best external connectivity? Is there a way to propose peers that are available at the IXs SURFnet connects to but that aren't peering yet? How can the setup be made as redundant as possible, and can a tool propose additional peers for the best redundancy?
Jac Kloots <jac.kloots=>surfnet.nl>
Marijke Kaat <marijke.kaat=>surfnet.nl>

David Garay <david.garay=>os3.nl>


Usage Control in the Mobile Cloud.

Mobile clouds [1] aim to integrate mobile computing and sensing with rich computational resources offered by cloud back-ends. They are particularly useful in services such as transportation, healthcare and so on when used to collect, process and present data from the physical world. In this thesis, we will focus on the usage control, in particular privacy, of the collected data pertinent to mobile clouds. Usage control [2] differs from traditional access control by not only enforcing security requirements on the release of data but also on what happens afterwards. The thesis will involve the following steps:
  • Propose an architecture over cloud for "usage control as a service" (extension of authorization as a service) for the enforcement of usage control policies
  • Implement the architecture (compatible with Openstack[3] and Android) and evaluate its performance.
[1] https://en.wikipedia.org/wiki/Mobile_cloud_computing
[2] Jaehong Park, Ravi S. Sandhu: The UCONABC usage control model. ACM Trans. Inf. Syst. Secur. 7(1): 128-174 (2004)
[3] https://en.wikipedia.org/wiki/OpenStack
[4] Slim Trabelsi, Jakub Sendor: "Sticky policies for data control in the cloud" PST 2012: 75-80
Fatih Turkmen <F.Turkmen=>uva.nl>
Yuri Demchenko <y.demchenko=>uva.nl>


Next generation Wi-Fi.

The next generation of Wi-Fi is the IEEE 802.11ax standard, also known as Wi-Fi 6. Key features introduced in order to achieve an increase in throughput and efficiency are OFDMA (Orthogonal Frequency-Division Multiple Access), MU-MIMO (Multi-User Multiple-Input and Multiple-Output), and higher modulation schemes such as 1024-QAM (Quadrature Amplitude Modulation).
This project aims to build a framework to test the physical layer of the IEEE 802.11ax standard. The first step is to build the test setup shown in the picture below and subsequently automate the tests using MATLAB. The setup will be built in the Faraday room in Schiphol-Rijk.

This research subject is not only building the setup, but also performing the measurements in order to address the following research questions regarding the key features introduced with IEEE 802.11ax:
  • What is the benefit of introducing 1024-QAM modulation compared to 256-QAM in terms of throughput?
  • What is the impact of introducing OFDMA by increasing the number of clients compared to the theoretical performance simulations (for example the MATLAB 802.11ax downlink OFDMA throughput simulation)?
We have 802.11ax reference boards for multiple vendors, the required MATLAB licenses, a spectrum analyser capable of capturing 160MHz, as well as a RF shielded room available in Schiphol-Rijk.
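
As a back-of-the-envelope reference for the first research question: a constellation of M points carries log2(M) bits per symbol, so 1024-QAM carries 10 bits against 8 for 256-QAM. The sketch below computes the resulting raw gain. It assumes an identical symbol rate and coding rate for both schemes, so it is an upper bound on paper and not a prediction of what the measurements will show; the function names are hypothetical.

```python
import math


def bits_per_symbol(constellation_size: int) -> int:
    """Bits carried by one symbol of an M-point constellation (log2 M)."""
    bits = math.log2(constellation_size)
    if not bits.is_integer():
        raise ValueError("constellation size must be a power of two")
    return int(bits)


def relative_gain(new_size: int, old_size: int) -> float:
    """Raw per-symbol gain, assuming equal symbol and coding rates."""
    return bits_per_symbol(new_size) / bits_per_symbol(old_size) - 1.0


print(bits_per_symbol(1024), bits_per_symbol(256))  # 10 8
print(f"{relative_gain(1024, 256):.0%}")            # 25%
```

In practice 1024-QAM needs a higher signal-to-noise ratio than 256-QAM, which is one reason a measured gain may stay below this figure.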
Jan-Willem van Bloem <jvanbloem=>libertyglobal.com>
Arjan van der Vegt <avdvegt=>libertyglobal.com>

Daan Weller <Daan.Weller=>os3.nl>
Raoul Dijksman <Raoul.Dijksman=>os3.nl>


Automated asset identification in large organizations.

Many large organizations are struggling to remain in control over their IT infrastructure. What would help for these organizations is automated asset identification: given an internal IP range, scan the network and based on certain heuristics identify what the server's role is (i.e. is it a web server, a database, an ERP system, an end user, or a VoIP device).
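
One of the simplest possible heuristics is a lookup from open TCP/UDP ports to a likely role. The sketch below assumes the scan results are already available as a list of open port numbers per host; the port-to-role table is a small hypothetical sample based on well-known default ports, and a real classifier would need many more signals (banners, protocols, traffic patterns).

```python
# Hypothetical sample table based on well-known default ports.
ROLE_BY_PORT = {
    80: "web server",
    443: "web server",
    1433: "database",   # Microsoft SQL Server default
    3306: "database",   # MySQL default
    5432: "database",   # PostgreSQL default
    5060: "VoIP device",  # SIP
}


def guess_roles(open_ports):
    """Return the sorted list of roles suggested by a host's open ports."""
    roles = {ROLE_BY_PORT[p] for p in open_ports if p in ROLE_BY_PORT}
    return sorted(roles) if roles else ["unknown"]
```

For example, `guess_roles([22, 443, 3306])` suggests both a web server and a database role, while a host exposing only port 22 stays "unknown".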
Alex Stavroulakis <Stavroulakis.Alex=>kpmg.nl>


Investigation of security on Chinese smartwatches.

Smartwatches are an unknown area in information risk. They are an additional display for certain sensitive data (i.e. executive mail, calendars and other notifications), but are not necessarily covered by organizations' existing mobile security products. In addition, it is often much easier to steal a watch than it is to steal a phone. What is the data that gets 'left behind' on smartwatches in case of theft, and what information risks do they pose?
Alex Stavroulakis <Stavroulakis.Alex=>kpmg.nl>
Kasper van Brakel <kbrakel=>os3.nl>
R Witsenburg <renee.witsenburg=>os3.nl>


WhatsApp End-to-End Encryption: Are Our Messages Private?

WhatsApp has recently switched to using the Signal protocol for their messaging, which should provide greatly enhanced security and privacy over their earlier, non end-to-end encrypted proprietary protocol. Of course, since WhatsApp is closed source, one has to trust WhatsApp to actually use this Signal protocol, since one cannot review the source code. What other (automated) methods are there to verify that WhatsApp actually employs this protocol? This research is about reverse engineering Android and/or iOS apps.
Alex Stavroulakis <Stavroulakis.Alex=>kpmg.nl>
Pavlos Lontorfos <Pavlos.Lontorfos=>os3.nl>
Tom Carpaij <tcarpaij=>os3.nl>


Video broadcasting manipulation detection.

This project concerns the detection of manipulation of broadcast video streams with facial morphing on the internet. Examples are provided from https://dl.acm.org/citation.cfm?id=2818122 and other online sources.
Zeno Geradts <Z.J.M.H.Geradts=>uva.nl>


Probabilistic password recognition.

Password authentication is still a very popular way of authenticating users. When a law-enforcement agency seizes the hard drive of a suspect, they have a small window of time to gather evidence from it to extend pre-trial detention. Because of the large amount of data, automated tools can be useful to scan a drive for interesting files. Files containing passwords are especially interesting because they can provide access to extra data. This research focuses on the probability that a certain input string is a password. There has been a lot of research on the strength of passwords [1][2][3] but little to no research has been done on the probability that a string could be a password.
In theory, every string that meets the requirements of the system enforcing the passwords could be a password. Consequently, there is no way to know for sure whether a string is a password or not. The purpose of this research is therefore to arrive at a probability that a string is a password.
The main question for this research is:
  • How can software calculate the probability of an input string being a password?
The research question can be divided into multiple sub-questions:
  1. What characteristics differentiate a password from ’regular’ text?
  2. How can these characteristics be used to come to an algorithm that defines a probability of a string being a password?
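
The first sub-question can be explored by extracting simple per-string characteristics. The sketch below computes a few candidate features; which features actually separate passwords from regular text is the open research question, so the selection here (length, presence of spaces, number of character classes, Shannon entropy of the character distribution) is an assumption for illustration only, and it does not yet produce a probability.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Entropy in bits per character of the string's character distribution."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def character_classes(text: str) -> int:
    """Count which of lower case, upper case, digits and symbols occur."""
    checks = (str.islower, str.isupper, str.isdigit)
    classes = sum(any(check(ch) for ch in text) for check in checks)
    # Symbols: anything that is neither alphanumeric nor whitespace.
    if any(not ch.isalnum() and not ch.isspace() for ch in text):
        classes += 1
    return classes


def features(text: str) -> dict:
    """Candidate characteristics for a later (not shown) probability model."""
    return {
        "length": len(text),
        "has_space": " " in text,
        "classes": character_classes(text),
        "entropy": shannon_entropy(text),
    }
```

A probability estimate could then be obtained by fitting a classifier on such feature vectors for known passwords and known regular text; that step is left out here.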
Zeno Geradts <Z.J.M.H.Geradts=>uva.nl>

Tiko Huizinga <tiko.huizinga=>os3.nl>


The development of a contained and user emulated malware assessment platform.

Using common tools such as Puppet, Docker or other mass-deployment solutions, create a blended Windows and Linux solution that enables the automatic creation of a virtualized test lab for the evaluation of potential malware across multiple Antivirus (A/V) products concurrently and securely. This does not involve analysis of the potential malware in a sandbox such as Cuckoo sandbox but the evaluation of an executable across multiple free and commercial A/V products.

Area of expertise: Red Teaming Operations
Vincent van Mieghem <vvanmieghem=>deloitte.nl>
Henri Hambartsumyan <HHambartsumyan=>deloitte.nl>

Siebe Hodzelmans <siebe.hodzelmans=>os3.nl>
Frank Potter <Frank.Potter=>os3.nl>


SURFwireless availability analysis.

Since 2016 SURFnet offers Wi-Fi as a service. This includes the tender process, Wi-Fi measurements, planning the location of the access points, maintenance, monitoring, and the security of the Wi-Fi network. The goal of this research project is to investigate whether wireless intrusion prevention systems can be used to protect the Wi-Fi network of SURF against well-known attacks like rogue access points, mitigation, and encryption cracking, and what measures SURF can take against these kinds of attacks.

Research questions:
  • Which common Wi-Fi attacks can be used to threaten SURFwireless?
  • How can SURFnet detect that the availability of the SURFwireless service is under threat and estimate its impact?
  • What measures can SURFnet take to improve the availability of SURFwireless?
Frans Panken <frans.panken=>surfnet.nl>

Kasper VanBrakel <Kasper.vanBrakel=>os3.nl>


In-band telemetry with P4.

The UvA is collaborating with SURFnet, UTwente and SIDN to create a nationwide P4 experimental environment, as part of the 2STiC initiative. In this project we want to investigate how to use the INT (In-band Network Telemetry) specification in a distributed P4 (http://www.p4.org) testbed. For additional information on INT and P4, see: https://p4.org/assets/INT-current-spec.pdf

The specific use case we consider is flow measurements, namely end-to-end latency and throughput over time. The research will address the following challenges:
  • The specification does not prescribe where the INT tag should be inserted in the packets. We will determine the most suitable design for INT tag insertion as a function of the considered use case.
  • We will investigate the role and the optimal behavior of the INT sinks, i.e. the elements that extract the information from the packets.
  • We will develop an initial implementation and evaluate its performance in the testbed.
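To make the role of an INT sink more concrete, the sketch below shows how end-to-end latency and throughput could be derived once the per-hop metadata has been extracted from the packets. The record layout (`HopMetadata`, `IntReport`) and the field names are hypothetical and do not follow the header layout of the INT specification. The calculation also assumes that the clocks of all switches are synchronised.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HopMetadata:
    """Per-hop data collected along the path (hypothetical layout)."""
    switch_id: int
    ingress_ts_ns: int
    egress_ts_ns: int


@dataclass
class IntReport:
    """What a sink has extracted from one packet (hypothetical layout)."""
    flow_id: str
    payload_bytes: int
    hops: List[HopMetadata]  # in path order, first hop first


def end_to_end_latency_ns(report: IntReport) -> int:
    """Time between entering the first switch and leaving the last one."""
    return report.hops[-1].egress_ts_ns - report.hops[0].ingress_ts_ns


def throughput_bps(reports: List[IntReport], flow_id: str) -> float:
    """Average throughput of one flow over the observed interval, in bit/s."""
    own = [r for r in reports if r.flow_id == flow_id]
    if len(own) < 2:
        return 0.0  # a single packet gives no meaningful interval
    start = min(r.hops[0].ingress_ts_ns for r in own)
    end = max(r.hops[-1].egress_ts_ns for r in own)
    total_bits = 8 * sum(r.payload_bytes for r in own)
    return total_bits / ((end - start) / 1e9)


if __name__ == "__main__":
    r1 = IntReport("f1", 1000, [HopMetadata(1, 0, 100), HopMetadata(2, 500, 600)])
    r2 = IntReport("f1", 1000, [HopMetadata(1, 999_000, 999_100),
                                HopMetadata(2, 999_900, 1_000_000)])
    print(end_to_end_latency_ns(r1), "ns")
    print(throughput_bps([r1, r2], "f1"), "bit/s")
```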
Joseph Hill <j.d.hill=>uva.nl>
Paola Grosso <p.grosso=>uva.nl>

Siebe Hodzelmans <siebe.hodzelmans=>os3.nl>


Scaling the configuration of AMS-IX Route Servers.

The route servers are a vital part of an IXP ecosystem, a central component that allows the exchange of prefixes between IXP peers without establishing hundreds of BGP sessions. In AMS-IX, with more than 700 established peers and 350,000 prefixes, the configuration of the Route Servers becomes harder and the current toolset has reached its scalability limits. As these numbers grow significantly year by year, the need for a new framework that can push configuration in a more scalable way becomes more critical. The new framework should use modern tools like container technology and Python templates with the Large Installation & Administration techniques in order to cover our engineering needs. At the same time, the solution should cover the customer requirements for fast re-configuration, integrity and accuracy. The background knowledge required for this project comprises the Networking track of OS3, especially the BGP protocol, and system engineering and virtualisation technologies, which will be glued together with LIA techniques and Python scripting. We advise assigning 2 students, as we require a small PoC in the given timeframe.
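As a small illustration of template-driven configuration, the sketch below renders one stanza per peer from structured data using Python's standard `string.Template`, and computes a digest that could support an integrity check. The configuration syntax is invented for this example and is not valid input for any particular routing daemon. The peer data uses documentation-only AS numbers and addresses, and the function names are hypothetical.

```python
import hashlib
from string import Template

# Invented, daemon-neutral syntax; a real framework would target the
# configuration language of the route server software actually in use.
PEER_TEMPLATE = Template(
    "peer AS${asn} {\n"
    "    neighbor ${address};\n"
    "    max-prefixes ${max_prefixes};\n"
    "}\n"
)


def render_config(peers):
    """Render all peers in a stable order so the output is reproducible."""
    ordered = sorted(peers, key=lambda p: (p["asn"], p["address"]))
    return "".join(PEER_TEMPLATE.substitute(p) for p in ordered)


def config_digest(config_text):
    """SHA-256 digest of the rendered text, usable as an integrity check."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


# Documentation-only example values (not real peers).
PEERS = [
    {"asn": 64501, "address": "192.0.2.20", "max_prefixes": 50},
    {"asn": 64500, "address": "192.0.2.10", "max_prefixes": 100},
]

if __name__ == "__main__":
    text = render_config(PEERS)
    print(text)
    print(config_digest(text))
```

Sorting the peers before rendering means the same input always yields byte-identical output, so a changed digest indicates a real configuration change rather than a reordering.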
Stavros Konstantaras <stavros.konstantaras=>ams-ix.net>

David Garay <David.Garay=>os3.nl>


The Serval Project

Making a low-cost, scalable tsunami and all-hazards warning system with integrated FM radio transmitter.

The Sulawesi earthquake reminded us of the significant gap that exists between the generation of tsunami (and other hazard) warnings, which works quite well, and the means of getting those warnings out to the small isolated coastal communities that need them. There is a need for a low-cost and scalable solution for providing early warning capabilities. Such a system also needs to be useful year-round, so that it will be maintained and work when needed. For this reason, we are building an FM radio juke-box into the system. In this project, you will help to advance this project from proof-of-concept to prototype stage, by assisting with the development of the radio juke-box client software as well as the back-end middle-ware for feeding alerts to the satellite up-link and receiving them on the terminal equipment.
Paul Gardner-Stephen <paul.gardner-stephen=>flinders.edu.au>


The Serval Project

Shaking down the Serval Mesh Extender

The Serval Mesh Extender is a ruggedised solar-powered peer-to-peer mesh communications system that allows isolated communities to have local communications and, through interconnection to HF digital radios and other means, to connect those communities together. The system now largely works, but effort is required to test the system more thoroughly under realistic conditions, and to document and then eliminate software issues that interfere with the efficient operation of the system. This will occur through interaction with a remotely accessed semi-automatic test bed network.
Paul Gardner-Stephen <paul.gardner-stephen=>flinders.edu.au>


The Serval Project

Security through Simplicity: Creating an open-source smart-phone like device.

SPECTRE and MELTDOWN, and the 20 years it took to discover the vulnerabilities, have unambiguously shown that the complexity of modern computing devices has grown to the point where verification of security is simply impossible. Yet we still have need for strong assurances of security to support many uses of modern technology. However, things were not always like this. Computers of the 80s and 90s were simple enough that both hardware and software could be verified. Therefore we are creating an open-source smart-phone like device based on an improved evolution of the well known Commodore 64 architecture implemented in FPGA. We have working bench-top prototypes, and are moving to prototype hardware, and are looking for both IT/CS as well as electronic engineering students to help move the project to prototype stage, and to test the hardware, and implement the software, so that we can have usable test devices in 2019.
Paul Gardner-Stephen <paul.gardner-stephen=>flinders.edu.au>


Cross-blockchain oracle.

Interconnection between different blockchain instances, and smart contracts residing on those, will be essential for a thriving multi-blockchain business ecosystem. Technologies like hashed timelock contracts (HTLC) enable atomic swaps of cryptocurrencies and tokens between blockchains. A next challenge is the cross-blockchain oracle, where the status of an oracle value on one blockchain enables or prevents a transaction on another blockchain.
The goal of this research project is to explore the possibilities, impossibilities, trust assumptions, security and options for a cross-blockchain oracle, as well as to provide a minimal viable implementation.
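For readers unfamiliar with hashed timelock contracts, the sketch below models the core HTLC rule in plain Python: funds can be claimed before a deadline by revealing the preimage of a hash, and refunded to the sender after the deadline otherwise. This is a simplified off-chain model with hypothetical names (`HTLC`, `claim`, `refund`), not smart-contract code for any specific blockchain. Time is passed in explicitly as an integer instead of being read from a chain.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class HTLC:
    """Simplified model of a hashed timelock contract (illustrative only)."""
    sender: str
    recipient: str
    amount: int
    hashlock: bytes   # SHA-256 digest of a secret chosen by the initiator
    timeout: int      # time after which the sender may take a refund
    state: str = "locked"

    def claim(self, preimage: bytes, now: int) -> bool:
        """Recipient claims the funds by revealing the secret in time."""
        if (self.state == "locked"
                and now < self.timeout
                and hashlib.sha256(preimage).digest() == self.hashlock):
            self.state = "claimed"
            return True
        return False

    def refund(self, now: int) -> bool:
        """Sender recovers the funds once the timeout has passed."""
        if self.state == "locked" and now >= self.timeout:
            self.state = "refunded"
            return True
        return False


if __name__ == "__main__":
    secret = b"example secret"
    lock = hashlib.sha256(secret).digest()
    contract = HTLC("alice", "bob", 10, lock, timeout=100)
    print(contract.claim(secret, now=10), contract.state)
```

In an atomic swap, two such contracts on different chains share the same hashlock, so revealing the secret to claim on one chain makes it available for the claim on the other. A cross-blockchain oracle would have to provide a comparable link for arbitrary oracle values, which is where the trust assumptions named in the goal come in.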
Oskar van Deventer <oskar.vandeventer=>tno.nl>
Maarten Everts <maarten.everts=>tno.nl>


Apple File System (APFS).

Apple recently introduced APFS with their latest version of OS X, Sierra. The new file system comes with some interesting new features that pose either challenges or opportunities for digital forensics. The goal in this project is to pick one or more relevant features (e.g. encryption, nanosecond timestamps, flexible space allocation, snapshot/cloning, etc.) and reverse engineer their inner workings to come up with a proof-of-concept parsing tool that provides useful input for forensic investigations of Apple systems.
Gina Doekhie <gina.doekhie=>fox-it.com>


Man vs the Machine.

Machine Learning has advanced to the point where our computer systems can detect malicious activity through baselining of large volumes of data and picking out the anomalies and non-conformities. As an example, the finance sector has been using machine learning to detect fraudulent transactions and has been very successful at minimizing the impact of stolen credit card numbers over the past few years.
  • As we further leverage machine learning and other advanced analytics to improve cyber security detection in other industries, what does the role of a cybersecurity analyst evolve into?
  • What are the strengths of machine learning?
  • What are its weaknesses? What activities remain after machine learning?
  • How and when does AI come into the picture?
  • What are the key skills needed to still be relevant?
  • What emerging technologies are contributing to the change?
  • What do new individuals entering into cyber security focus on?
  • And what do existing cyber security professionals develop to stay current?
  • What will the industry look like in 2 years? 5 years? 10+ years?
Rita Abrantes <Rita.Abrantes=>shell.com>


IoT DOS prevention and corporate responsibility.

The Dyn DoS attacks show a fundamental problem in internet-connected devices. Huge swathes of unpatched and improperly configured devices with access to high bandwidth are misused to bring down inter; What technical prevention and detection methods can organizations employ to make sure that they are not a contributor to this problem? And what can they do once it does appear they are inadvertently contributing to this problem? This would focus on literature research combining research in DoS prevention, asset management, patch management and network monitoring.
Alex Stavroulakis <Stavroulakis.Alex=>kpmg.nl>

Swann Scholtes <Swann.Scholtes=>os3.nl>


Penetration test dashboarding.

A penetration test is difficult for both the penetration tester and the party being tested. How does the tested party really know what is going on in their penetration test, and stay in control? How can testers themselves stay up to date on what their team members are actually doing?

Penetration testing is a creative process, but it can be dashboarded to some degree. The data is there, mostly in log files, but it requires an extraordinary amount of effort to make this log data understandable in human terms. Making it understandable can, however, be automated using penetration test tooling. But what information actually needs to be displayed in this dashboard, and what is the best way of showing this data? This research would combine literature research and interviews with the development of a small proof-of-concept.
Alex Stavroulakis <Stavroulakis.Alex=>kpmg.nl>
Tiko Huizinga <Tiko.Huizinga=>os3.nl>


Forensic investigation of wearables.

Wearables and especially smartwatches are an unfamiliar area in information risk because of their novelty. At the moment, primary concerns are aimed at privacy issues, but not at information risk issues. However, these devices are an additional display for certain sensitive data (e.g. executive mail, calendars and other notifications), but are not necessarily covered by organizations' existing mobile security processes and technology. In addition, it is often simply much easier to steal a watch or another wearable than it is to steal a phone.

This research focuses on the following question: what value could a wearable have to cyber criminals when it is stolen? What is the data that gets 'left behind' on smartwatches in case of theft, and what information risks do they pose?
Alex Stavroulakis <Stavroulakis.Alex=>kpmg.nl>


Security evaluation of glucose monitoring applications for Android smartphones.

With the “recent” health trend of fitness apps and hardware such as Fitbits, combined with the /need/ to share results with friends, family and the world through Facebook, Runkeeper, Strava and other sites, we have entered an era of potential cyber unhealthiness. What potentially valuable information about people could be retrieved from the web or through Bluetooth to influence the health insurance rates of individuals? Note: this is a broad question, and it is up to the student to choose a focus to his/her own liking (e.g. Bluetooth security of Fitbits/Mi’s; identification of individuals through Strava/Runkeeper posts; quantifying the public sharing of health information; etc.).
Ruud Verbij <Verbij.Ruud=>kpmg.nl>

Roy Vermeulen <rvermeulen=>os3.nl>
Edgar Bohte <ebohte=>os3.nl>


Inventory of smartcard-based healthcare identification solutions in Europe and beyond: technology and adoption.

For potential international adoption of Whitebox technology in the future, in particular the technique of patients carrying authorization codes with them to authorize healthcare professionals, we want to make an inventory of the current status of healthcare PKIs and smartcard technology in Europe and if possible also outside Europe.

Many countries have developed health information exchange systems over the last 1-2 decades, most of them without much regard for what other countries are doing, or for international interoperability. However, common to most systems developed today is the development of a (per-country) PKI for credentials, typically smartcards, that are provided to healthcare professionals to allow the health information exchange system to identify these professionals, and to establish their 'role' (or rather: the speciality of a doctor, such as GP, pharmacist, gynaecologist, etc.). We know a few of these smartcard systems, e.g., in Austria and France, but not all of them, and we do not know their degree of adoption.

In this project, we would like students to enquire about and report on the state of the art of healthcare smartcard systems in Europe and possibly outside Europe (e.g., Asia, Russia):
  • what products are rolled out by what companies, backed by what CAs (e.g., governmental, as is the case with the Dutch "UZI" healthcare smartcard)?
  • Is it easy to obtain the relevant CA keys?
  • And what is the adoption rate of these smartcards among GPs, emergency care wards and hospitals in different countries?
  • What are relevant new developments (e.g., contactless solutions) proposed by major stakeholders or industry players in the market?
Note that this project is probably less technical than usual for an SNE student, although it is technically interesting. For comparison, this project may also be fitting for an MBA student.

For more information, see also (in Dutch): https://whiteboxsystems.nl/sne-projecten/#project-2-onderzoek-adoptie-health-smartcards-in-europa-en-daarbuiten
General introduction
Whitebox Systems is a UvA spin-off company working on a decentralized system for health information exchange. Security and privacy protection are key concerns for the products and standards provided by the company. The main product is the Whitebox, a system owned by doctors (GPs) that is used by the GP to authorize other healthcare professionals so that they - and only they - can retrieve information about a patient when needed. Any data transfer is protected end-to-end; central components and central trust are avoided as much as possible. The system will use a published source model, meaning that although we do not give away copyright, the code can be inspected and validated externally.

The Whitebox is currently transitioning from an authorization model that started with doctor-initiated static connections/authorizations, to a model that includes patient-initiated authorizations. Essentially, patients can use an authorization code (a kind of token) that is generated by the Whitebox, to authorize a healthcare professional at any point of care (e.g., a pharmacist or a hospital). Such a code may become part of a referral letter or a prescription. This transition gives rise to a number of interesting questions, and thus to possible research projects related to the Whitebox design, implementation and use. Two of these projects are described below. If you are interested in these projects or have questions about other possibilities, please contact <guido=>whiteboxsystems.nl>.

For a more in-depth description of the projects below (in Dutch), please see https://whiteboxsystems.nl/sne-projecten/
Guido van 't Noordende <g.j.vantnoordende=>uva.nl>


Decentralized trust and key management.

Currently, the Whitebox provides a means for doctors (general practitioners, GPs) to establish static trusted connections with parties they know personally. These connections (essentially, authenticated TLS connections with known, validated keys), once established, can subsequently be used by the GP to authorize the party in question to access particular patient information. Examples are static connections to the GP post which takes care of evening/night and weekend shifts, or to a specific pharmacist. In this model, trust management is intuitive and direct. However, with dynamic authorizations established by patients (see general description above), the question arises whether the underlying (trust) connections between the GP practice (i.e., the Whitebox) and the authorized organization (e.g., hospital or pharmacist) may be re-usable as a 'trusted' connection by the GP in the future.

The basic question is:
  • what is the degree of trust a doctor can place in (trust) relations that are established by this doctor's patients, when they authorize another healthcare professional?
More in general:
  • what degree of trust can be placed in relations/connections established by a patient, also in view of possible theft of authorization tokens held by patients?
  • What kind of validation methods can exist for a GP to increase or validate a given trust relation implied by an authorization action of a patient?
Perhaps the problem can also be raised to a higher level: can (public) auditing mechanisms -- for example, using block chains -- be used to help establish and validate trust in organizations (technically: keys of such organizations), in systems that implement decentralized trust-based transactions, like the Whitebox system does?

In this project, the student(s) may either implement part of a solution or design, or model the behavior of a system inspired by the decentralized authorization model of the Whitebox.

As an example: reputation based trust management based on decentralized authorization actions by patients of multiple doctors may be an effective way to establish trust in organization keys, over time. Modeling trust networks may be an interesting contribution to understanding the problem at hand, and could thus be an interesting student project in this context.

NB: this project is a rather advanced/involved design and/or modelling project. Students should be confident in their ability to understand and design/model a complex system in the relatively short timeframe provided by an RP2 project -- this project is not for the faint of heart. Once completed, an excellent implementation or evaluation may become the basis for a research paper.

See also (in Dutch): https://whiteboxsystems.nl/sne-projecten/#project-2-ontwerp-van-een-decentraal-vertrouwensmodel
Guido van 't Noordende <g.j.vantnoordende=>uva.nl>



LDBC Graphalytics.

LDBC Graphalytics is a mature, industrial-grade benchmark for graph-processing platforms. It consists of six deterministic algorithms, standard datasets, synthetic dataset generators, and reference output, which together enable the objective comparison of graph analysis platforms. Its test harness produces deep metrics that quantify multiple kinds of system scalability, such as horizontal/vertical and weak/strong, and of robustness, such as failures and performance variability. The benchmark comes with open-source software for generating data and monitoring performance.

Until recently, graph processing used only common big data infrastructure, that is, with much local and remote memory per core and storage on disk. However, operating separate HPC and big data infrastructures is increasingly unsustainable. The energy and (human) resource costs far exceed what most organizations can afford. Instead, we see a convergence between big data and HPC infrastructure.
For example, next-generation HPC infrastructure includes more cores and hardware threads than ever before. This leads to a large search space for application developers to explore when adapting their workloads to the platform.

To take a step towards a better understanding of performance for graph processing platforms on next-generation HPC infrastructure, we would like to work together with 3-5 students on the following topics:
  1. How to configure graph processing platforms to efficiently run on many/multi-core devices, such as the Intel Knights Landing, which exhibits configurable and dynamic behavior?
  2. How to evaluate the performance of modern many-core platforms, such as the NVIDIA Tesla?
  3. How to set up a fair, reproducible experiment to compare and benchmark graph-processing platforms?
Alex Uta <a.uta=>vu.nl>
Marc X. Makkes <m.x.makkes=>vu.nl>


Normal traffic flow information distribution to detect malicious traffic.

In an era of increasingly encrypted communication it is getting harder to distinguish normal from malicious traffic. Deep packet inspection is no longer an option, unless the trusted certificate store of the monitored clients is altered. However, Netflow data might still be able to provide relevant information about the parties involved in the communication and the traffic volumes they exchange. So would it be possible to tell apart ill-intentioned traffic by looking only at the flows and using a little help from the content providers, for example website owners and mobile application vendors?

The basic idea is to research a framework or a data interchange format between the content providers described above and the monitoring devices. For both a website and a mobile application, such a description can be used to list the authorised online resources that should be used and the relative distribution of the traffic between them. If such a framework proves to be successful, it can help in alerting on covert channel malware communication, cross-site scripting and all other types of network communication not intended by the original content provider.
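A minimal sketch of how such a description could be checked against flow data is given below. The manifest format (a mapping from hostname to expected traffic share), the function name `check_flows` and the tolerance value are assumptions made for illustration; the project itself is meant to research what a suitable format would look like. Hostnames are reserved example domains.

```python
from collections import defaultdict


def check_flows(manifest, flows, tolerance=0.2):
    """Compare observed per-host traffic shares with a provider's manifest.

    manifest:  {hostname: expected share of total bytes, between 0 and 1}
    flows:     iterable of (hostname, bytes) records, e.g. derived from Netflow
    tolerance: allowed absolute deviation in share (an arbitrary example value)
    """
    totals = defaultdict(int)
    for host, nbytes in flows:
        totals[host] += nbytes

    grand_total = sum(totals.values())
    alerts = []
    if grand_total == 0:
        return alerts

    for host in sorted(totals):
        share = totals[host] / grand_total
        if host not in manifest:
            alerts.append((host, "unlisted destination"))
        elif abs(share - manifest[host]) > tolerance:
            alerts.append((host, "unexpected traffic share"))
    return alerts


if __name__ == "__main__":
    manifest = {"www.example.org": 0.7, "cdn.example.net": 0.3}
    flows = [
        ("www.example.org", 700),
        ("cdn.example.net", 300),
        ("evil.example.com", 1000),
    ]
    for alert in check_flows(manifest, flows):
        print(alert)
```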


Smart performance information discovery for Cloud resources.

The selection of virtual machines (VMs) must account for the performance requirements of applications (or application components) to be hosted on them. The performance of components on specific types of VM can be predicted based on static information (e.g. CPU, memory and storage) provided by cloud providers; however, the provisioning overhead for different VM instances and the network performance in one data centre or across different data centres are also important. Moreover, application-specific performance cannot always be easily derived from this static information.

An information catalogue is envisaged that aims to provide a service that can deliver the most up to date cloud resource information to cloud customers to help them use the Cloud better. The goal of this project will be to extend earlier work [1], but will focus on smart performance information discovery. The student will:
  1. Investigate the state of the art for cloud performance information retrieval and cataloguing.
  2. Propose Cloud performance metadata, and prototype a performance information catalogue.
  3. Customize and integrate an (existing) automated performance collection agent with the catalogue.
  4. Enable smart query of performance information from the catalogue using certain metadata.
  5. (Optional) Test the results with the use cases in on-going EU projects like SWITCH.
Some reading material:
  1. Elzinga, O., Koulouzis, S., Hu, Y., Wang, J., Zhou, H., Martin, P., Taal, A., de Laat, C., and Zhao, Z. (2017), Automatic collector for dynamic cloud performance information, IEEE Networking, Architecture and Storage (NAS), Shenzhen, China, August 7-8, 2017 https://doi.org/10.1109/NAS.2017.8026845
More info: Arie Taal, Paul Martin, Zhiming Zhao
Zhiming Zhao <z.zhao=>uva.nl>


Network aware performance optimization for Big Data applications using coflows.

Optimizing data transmission is crucial to improve the performance of data intensive applications. In many cases, network traffic control plays a key role in optimising data transmission, especially when data volumes are very large. Data-intensive jobs can often be divided into multiple successive computation stages, e.g., in MapReduce type jobs. A computation stage relies on the outputs of the previous stage and cannot start until all its required inputs are in place. Inter-stage data transfer involves a group of parallel flows, which share the same performance goal such as minimising the flow's completion time.
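Because the next stage cannot start until every input has arrived, the time that matters for such a group of flows is the finish time of the slowest one. The sketch below illustrates this with a deliberately simple model in which each flow has a fixed size and a fixed transfer rate. The function name and the numbers are made up for the example, and real coflow schedulers model shared links and rate allocation in far more detail.

```python
def coflow_completion_time(flows):
    """Completion time of a group of parallel flows.

    flows: {name: (size_bytes, rate_bytes_per_s)}
    The group is done only when its slowest flow is done.
    """
    return max(size / rate for size, rate in flows.values())


if __name__ == "__main__":
    baseline = {"a": (100, 10), "b": (400, 10)}
    # Speeding up the small flow does not help the stage start earlier ...
    faster_small = {"a": (100, 100), "b": (400, 10)}
    # ... but speeding up the largest flow does.
    faster_large = {"a": (100, 10), "b": (400, 20)}
    for name, flows in [("baseline", baseline),
                        ("faster small flow", faster_small),
                        ("faster large flow", faster_large)]:
        print(name, coflow_completion_time(flows))
```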

CoFlow is an application-aware network control model for cluster-based data centric computing. The CoFlow framework is able to schedule the network usage based on the abstract application data flows (called coflows). However, customizing CoFlow for different application patterns, e.g., choosing proper network scheduling strategies, is often difficult, in particular when the high level job scheduling tools have their own optimizing strategies.

The project aims to profile the behavior of CoFlow with different computing platforms, e.g., Hadoop and Spark etc.
  1. Review the existing CoFlow scheduling strategies and related work
  2. Prototype test applications using big data platforms (including Apache Hadoop, Spark, Hive, Tez).
  3. Set up coflow test bed (Aalo, Varys etc.) using existing CoFlow installations.
  4. Benchmark the behavior of CoFlow in different application patterns, and characterise the behavior.
Background reading:
  1. CoFlow introduction: http://www2.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-211.pdf
  2. Junchao Wang, Huan Zhou, Yang Hu, Cees de Laat and Zhiming Zhao, Deadline-Aware Coflow Scheduling in a DAG, in NetCloud 2017, Hong Kong, to appear [upon request]
More info: Junchao Wang, Spiros Koulouzis, Zhiming Zhao
Zhiming Zhao <z.zhao=>uva.nl>


Elastic data services for time critical distributed workflows.

Large-scale observations over extended periods of time are necessary for constructing and validating models of the environment. Therefore, it is necessary to provide advanced computational networked infrastructure for transporting large datasets and performing data-intensive processing. Data infrastructures manage the lifecycle of observation data and provide services for users and workflows to discover, subscribe and obtain data for different application purposes. In many cases, applications have high performance requirements, e.g., disaster early warning systems.

This project focuses on data aggregation and processing use-cases from European research infrastructures, and investigates how to optimise infrastructures to meet critical time requirements of data services, in particular for different patterns of data-intensive workflow. The student will use some initial software components [1] developed in the ENVRIPLUS [2] and SWITCH [3] projects, and will:
  1. Model the time constraints for the data services and the characteristics of data access patterns found in given use cases.
  2. Review the state of the art technologies for optimising virtual infrastructures.
  3. Propose and prototype an elastic data service solution based on a number of selected workflow patterns.
  4. Evaluate the results using a use case provided by an environmental research infrastructure.
  1. https://staff.fnwi.uva.nl/z.zhao/software/drip/
  2. http://www.envriplus.eu
  3. http://www.switchproject.eu
More info: Spiros Koulouzis, Paul Martin, Zhiming Zhao
Zhiming Zhao <z.zhao=>uva.nl>


Contextual information capture and analysis in data provenance.

Tracking the history of events and the evolution of data plays a crucial role in data-centric applications for ensuring reproducibility of results, diagnosing faults, and performing optimisation of data-flow. Data provenance systems [1] are a typical solution, capturing and recording the events generated in the course of a process workflow using contextual metadata, and providing querying and visualisation tools for use in analysing such events later.

Conceptual models such as W3C PROV (and extensions such as ProvONE), OPM and CERIF have been proposed to describe data provenance, and a number of different solutions have been developed. Choosing a suitable provenance solution for a given workflow system or data infrastructure requires consideration of not only the high-level workflow or data pipeline, but also performance issues such as the overhead of event capture and the volume of provenance data generated.

The project will be conducted in the context of EU H2020 ENVRIPLUS project [1, 2]. The goal of this project is to provide practical guidelines for choosing provenance solutions. This entails:
  1. Reviewing the state of the art for provenance systems.
  2. Prototyping sample workflows that demonstrate selected provenance models.
  3. Benchmarking the results of sample workflows, and defining guidelines for choosing between different provenance solutions (considering metadata, logging, analytics, etc.).
  1. About project: http://www.envriplus.eu
  2. Provenance background in ENVRIPLUS: https://surfdrive.surf.nl/files/index.php/s/uRa1AdyURMtYxbb
  3. Michael Gerhards, Volker Sander, Torsten Matzerath, Adam Belloum, Dmitry Vasunin, and Ammar Benabdelkader. 2011. Provenance opportunities for WS-VLAM: an exploration of an e-science and an e-business approach. In Proceedings of the 6th workshop on Workflows in support of large-scale science (WORKS '11). http://dx.doi.org/10.1145/2110497.2110505
More info: Zhiming Zhao, Adam Belloum, Paul Martin
Zhiming Zhao <z.zhao=>uva.nl>


Profiling Partitioning Mechanisms for Graphs with Different Characteristics.

In computer systems, a graph is an important model for describing many things, such as workflows, virtual infrastructures and ontological models. Partitioning is a frequently used graph operation in contexts such as parallelizing workflow execution, mapping networked infrastructures onto distributed data centers [1], and controlling the load balance of resources. However, developing an effective partitioning solution is often not easy; it is often a complex optimization problem that involves constraints such as system performance and cost.

A comprehensive benchmark of graph partitioning mechanisms helps in choosing a partitioning solver for a specific model. Such a portfolio can also give advice on how to partition based on the characteristics of the graph. This project aims at benchmarking the existing partitioning algorithms for graphs with different characteristics, and profiling their applicability for specific types of graphs.
This project will be conducted in the context of the EU SWITCH [2] project. The students will:
  1. Review the state of the art of the graph partitioning algorithms and related tools, such as Chaco, METIS and KaHIP, etc.
  2. Investigate how to define the characteristics of a graph, such as sparse graph, skewed graph, etc. This can also be discussed with different graph models, like planar graph, DAG, hypergraph, etc.
  3. Build a benchmark for different types of graphs with various partitioning mechanisms and identify the relationships between them.
  4. Discuss how to choose a partitioning mechanism based on the graph characteristics.
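A benchmark of this kind needs quality metrics for a given partition. The sketch below computes two commonly used ones, the edge cut and the load imbalance, for a graph given as an edge list. The function names and the tiny example graph are chosen for illustration. Tools such as METIS or KaHIP would produce the partition itself, which is simply supplied by hand here.

```python
from collections import Counter


def edge_cut(edges, part):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def imbalance(part, k):
    """Largest partition size divided by the ideal size n / k (1.0 is perfect)."""
    sizes = Counter(part.values())
    ideal = len(part) / k
    return max(sizes.values()) / ideal


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3, split into k = 2 parts in two different ways.
    edges = [(0, 1), (1, 2), (2, 3)]
    contiguous = {0: 0, 1: 0, 2: 1, 3: 1}
    alternating = {0: 0, 1: 1, 2: 0, 3: 1}
    print(edge_cut(edges, contiguous), imbalance(contiguous, 2))
    print(edge_cut(edges, alternating), imbalance(alternating, 2))
```

Both partitions are perfectly balanced, yet the contiguous one cuts a single edge while the alternating one cuts all three. This is the kind of difference a benchmark across graph types and partitioners would quantify.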
Reading material:
  1. Zhou, H., Hu Y., Wang, J., Martin, P., de Laat, C. and Zhao, Z., (2016) Fast and Dynamic Resource Provisioning for Quality Critical Cloud Applications, IEEE International Symposium On Real-time Computing (ISORC) 2016, York UK http://dx.doi.org/10.1109/ISORC.2016.22
  2. SWITCH: www.switchproject.eu

More info: Huan Zhou, Arie Taal, Zhiming Zhao

Zhiming Zhao <z.zhao=>uva.nl>


Auto-Tuning for GPU Pipelines and Fused Kernels.

Achieving high performance on many-core accelerators is a complex task, even for experienced programmers. This task is made even more challenging by the fact that, to achieve high performance, code optimization is not enough, and auto-tuning is often necessary. The reason for this is that computational kernels running on many-core accelerators need ad-hoc configurations that are a function of kernel, input, and accelerator characteristics to achieve high performance. However, tuning kernels in isolation may not be the best strategy for all scenarios.

Imagine having a pipeline that is composed of a certain number of computational kernels. You can tune each of these kernels in isolation, and find the optimal configuration for each of them. Then you can use these configurations in the pipeline, and achieve some level of performance. But these kernels may depend on each other, and may also influence each other. What if the choice of a certain memory layout for one kernel causes performance degradation in another kernel?

One of the existing optimization strategies to deal with pipelines is to fuse kernels together, to simplify execution patterns and decrease overhead. In this project we aim to measure the performance of accelerated pipelines in three different tuning scenarios:
  1. tuning each component in isolation,
  2. tuning the pipeline as a whole, and
  3. tuning the fused kernel.
By measuring the performance of one or more pipelines in these scenarios we hope, on one level, to determine which strategy is best for specific pipelines on different hardware platforms and, on another level, to better understand which characteristics influence this behavior.
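A toy cost model can illustrate why the first two scenarios may disagree. In this sketch, each of two kernels has a preferred memory layout, and a layout mismatch between them adds a conversion cost. All numbers, names and the cost model itself are made up for illustration and do not come from measurements.

```python
from itertools import product

LAYOUTS = ("row_major", "col_major")

# Hypothetical per-kernel run times (ms) for each memory layout.
KERNEL_A = {"row_major": 4.0, "col_major": 6.0}
KERNEL_B = {"row_major": 9.0, "col_major": 5.0}
CONVERSION_COST = 3.0  # assumed penalty when the two layouts differ

def pipeline_time(layout_a, layout_b):
    """Total time of the two-kernel pipeline under the toy cost model."""
    penalty = CONVERSION_COST if layout_a != layout_b else 0.0
    return KERNEL_A[layout_a] + KERNEL_B[layout_b] + penalty

# Scenario 1: tune each kernel in isolation, then combine the winners.
isolated = (min(LAYOUTS, key=KERNEL_A.get), min(LAYOUTS, key=KERNEL_B.get))

# Scenario 2: tune the pipeline as a whole over the joint search space.
joint = min(product(LAYOUTS, LAYOUTS), key=lambda cfg: pipeline_time(*cfg))

print(isolated, pipeline_time(*isolated))
print(joint, pipeline_time(*joint))
```

In this made-up example the per-kernel winners give a 12 ms pipeline, while the joint search finds an 11 ms configuration in which the first kernel runs with its locally slower layout.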
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>


Speeding up next generation sequencing of potatoes

Genotype and single nucleotide polymorphism (SNP) calling is a technique to find bases in next-generation sequencing data that differ from a reference genome. This technique is commonly used in (plant) genetic research. However, most algorithms focus on calling in diploid heterozygous organisms (specifically humans) only. Within the realm of plant breeding, many species are polyploid (e.g. potato with 4 copies, wheat with 6 copies and strawberry with 8 copies). For genotype and SNP calling in these organisms, only a few algorithms exist, such as freebayes (https://github.com/ekg/freebayes). However, with the increasing amount of next-generation sequencing data being generated, we are noticing limits to the scalability of this methodology, both in compute time and memory consumption (>100Gb).

We are looking for a student with a background in computer science, who will perform the following tasks:

  • Examine the current implementation of the freebayes algorithm
  • Identify bottlenecks in memory consumption and compute performance
  • Come up with an improved strategy to reduce memory consumption of the freebayes algorithm
  • Come up with an improved strategy to execute this algorithm on a cluster with multiple CPUs or on GPUs (using the memory of multiple compute nodes)
  • Implement an improved version of freebayes, according to the guidelines established above
  • Test the improved algorithm on real datasets of potato.

This is a challenging master thesis project on an important food crop (potato) on a problem which is relevant for both science and industry. As part of the thesis, you will be given the opportunity to present your progress/results to relevant industrial partners for the Dutch breeding industry.

Occasional traveling to Wageningen will be required.
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>


Auto-tuning for Power Efficiency.

Auto-tuning is a well-known optimization technique in computer science. It has been used to ease the manual optimization process that is traditionally performed by programmers, and to maximize performance portability. Auto-tuning works by executing the code that has to be tuned many times on a small problem set, with different tuning parameters. The best-performing version is then used for the real problems. Tuning can be done with application-specific parameters (different algorithms, granularity, convergence heuristics, etc.) or platform parameters (number of parallel threads used, compiler flags, etc.).
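The loop described above can be sketched in a few lines. This is a minimal exhaustive-search sketch, not the tuner used in the project; the function names, the `measure` hook and the toy kernel are assumptions for illustration.

```python
import time
from itertools import product

def autotune(kernel, param_space, problem, measure=None):
    """Run `kernel` once per configuration and return the cheapest one.

    `measure(kernel, problem, config)` returns a cost; by default it is
    wall-clock time. Any other metric could be supplied instead.
    """
    def wall_clock(kernel, problem, config):
        start = time.perf_counter()
        kernel(problem, **config)
        return time.perf_counter() - start

    measure = measure or wall_clock
    names = list(param_space)
    best_config, best_cost = None, float("inf")
    # Exhaustively try every combination of tuning parameter values.
    for values in product(*(param_space[n] for n in names)):
        config = dict(zip(names, values))
        cost = measure(kernel, problem, config)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost

# Hypothetical kernel: a blocked sum whose block size is the tuning parameter.
def blocked_sum(data, block_size):
    return sum(sum(data[i:i + block_size]) for i in range(0, len(data), block_size))

small_problem = list(range(10_000))
config, cost = autotune(blocked_sum, {"block_size": [16, 256, 4096]}, small_problem)
print(config, cost)
```

Because the cost function is a parameter, the same search could in principle minimize a quantity other than run time, provided a way to measure it is available.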

For this project, we apply auto-tuning on GPUs. We have several GPU applications where the absolute performance is not the most important bottleneck for the application in the real world. Instead the power dissipation of the total system is critical. This can be due to the enormous scale of the application, or because the application must run in an embedded device. An example of the first is the Square Kilometre Array, a large radio telescope that currently is under construction. With current technology, it will need more power than all of the Netherlands combined. In embedded systems, power usage can be critical as well. For instance, we have GPU codes that make images for radar systems in drones. The weight and power limitations are an important bottleneck (batteries are heavy).

In this project, we use power dissipation as the evaluation function for the auto-tuning system. Earlier work by others investigated this, but only for a single compute-bound application. However, many realistic applications are memory-bound. This is a problem, because loading a value from the L1 cache can already take 7-15x more energy than an instruction that only performs a computation (e.g., multiply).

There are also interesting platform parameters that can be changed in this context. It is possible to change both core and memory clock frequencies, for instance. It will be interesting to see whether we can achieve the optimal balance between these frequencies at runtime.

We want to perform auto-tuning on a set of GPU benchmark applications that we developed.
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>


Fast Data Serialization and Networking for Apache Spark.

Apache Spark is a system for large-scale data processing used for Big Data business applications, but also in many scientific applications. Spark uses Java (or Scala) object serialization to transfer data over the network. Especially if data fits in memory, the performance of serialization is the most important bottleneck in Spark applications. Spark currently offers two mechanisms for serialization: standard Java object serialization and Kryo serialization.

In the Ibis project (www.cs.vu.nl/ibis), we have developed an alternative serialization mechanism for high-performance computing applications that relies on compile-time code generation and zero-copy networking for increased performance. Performance of JVM serialization can also be compared with benchmarks: https://github.com/eishay/jvm-serializers/wiki. However, we also want to evaluate if we can increase Spark performance at the application level by using our improved object serialization system. In addition, our Ibis implementation can use fast local networks such as InfiniBand transparently. We also want to investigate if using specialized networks increases application performance. Therefore, this project involves extending Spark with our serialization and networking methods (based on existing libraries), and analyzing the performance of several real-world Spark applications.
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>
Kees de Jong <kees.dejong=>os3.nl>
Anas Younis <anas.younis=>os3.nl>


Applying and Generalizing Data Locality Abstractions for Parallel Programs.

TIDA is a library for high-level programming of parallel applications, focusing on data locality. TIDA has been shown to work well for grid-based operations, like stencils and convolutions. These are important building blocks for many simulations, for instance in astrophysics, climate research and water management. The TIDA paper gives more details on the programming model.

This project aims to achieve several things and answer several research questions:

  • TIDA currently only works with up to 3 dimensions. In many of our applications, higher dimensionalities are needed. Can we generalize the model to N dimensions?
  • The model currently only supports a two-level hierarchy of data locality. However, modern memory systems often have many more levels, both on CPUs and GPUs (e.g., L1, L2 and L3 cache, main memory, memory banks coupled to a different core, etc.). Can we generalize the model to support N-level memory hierarchies?
  • The current implementation only works on CPUs. Can we generalize it to GPUs as well?
  • Given the above generalizations, can we still implement the model efficiently? How should we perform the mapping from the abstract hierarchical model to a real physical memory system?

We want to test the new extended model on a real application. We have examples available in many domains. The student can pick one that is of interest to her/him.
Rob van Nieuwpoort <R.vanNieuwpoort=>uva.nl>


A Deep Dive into the Dark Web.


Every now and then you encounter claims that the 'surface' web is about 4% of the internet and the deep web is about 96% of the internet. Many infographics have been made to illustrate this, and it is a popular belief, see:
<https://www.google.nl/search?q=surface+web+4%25+deep+web+96%26&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjLp8e_nNTWAhVFU1AKHdmJDVoQ_AUICigB&biw=1689&bih=922 >.

However, these claims seem to originate from a white paper released in 2001 with the following claims <https://quod.lib.umich.edu/cgi/t/text/text-idx?c=jep;view=text;rgn=main;idno=3336451.0007.104>:
  • Public information on the deep Web is currently 400 to 550 times larger than the commonly defined World Wide Web.
  • The deep Web contains 7,500 terabytes of information compared to nineteen terabytes of information in the surface Web.
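A quick arithmetic check, using only the figures quoted above, shows how far the popular 4%/96% split is from the white paper's own numbers. The variable names are illustrative, and the two bullets may not measure exactly the same thing.

```python
# Figures quoted from the 2001 white paper.
deep_tb, surface_tb = 7500, 19
ratio_2001 = deep_tb / surface_tb  # deep web relative to surface web

# The popular claim: 96% deep web versus 4% surface web.
ratio_popular = 96 / 4

# Share of the total that the surface web would have under the 2001 figures.
surface_share_2001 = surface_tb / (deep_tb + surface_tb)

print(round(ratio_2001), ratio_popular, round(100 * surface_share_2001, 2))
```

Under the terabyte figures the deep web is roughly 395 times the surface web, so the surface web would be about 0.25% of the total rather than 4%; the 4%/96% split corresponds to a ratio of only 24.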
The goal of this research project is to determine how large the dark web currently is, either in absolute size or compared to the 'surface' web. Other focus points can be the different definitions of 'surface', 'dark' and 'deep' web, and how the size, popularity and/or definition of the dark web has developed since 2001.
Stijn van Winsen <vanWinsen.Stijn=>kpmg.nl>
Soufiane el Aissaoui <elAissaoui.Soufiane=>kpmg.nl>

Coen Schuijt <Coen.Schuijt=>os3.nl>


Ethereum Smart Contract Fuzz Testing.

An Ethereum smart contract can be seen as a computer program that runs on the Ethereum Virtual Machine (EVM), with the ability to accept, hold and transfer funds programmatically. Once a smart contract has been placed on the blockchain, it can be executed by anyone. Furthermore, many smart contracts accept user input. Because smart contracts operate on a cryptocurrency with real value, security of smart contracts is of the utmost importance. I would like to create a smart contract fuzzer that will check for unexpected behaviour or crashes of the EVM. Based on preliminary research, such a fuzzer does not exist yet.
Rodrigo Marcos <rodrigo.marcos=>secforce.com>


Client-side Attacks on the LastPass Browser Extension.

Tools exist for the extraction of credentials from certain popular password managers (such as KeePass 2.x, the Chrome password manager, etc.). During red team projects where a cyber attack is simulated, we make use of tooling that can extract credentials from memory (e.g. KeeThief for KeePass 2.x).
However, similar tooling appears to be missing for older KeePass 1.x databases and other popular password managers including PasswordSafe, 1Password and LastPass. We are looking to investigate which protection mechanisms these password managers employ, and whether it is possible to extract credentials in the same way. Both solutions for offline usage and online usage are of interest (especially if a desktop client is available).
Cedric Van Bockhaven <cvanbockhaven=>deloitte.nl>

Derk Barten <Derk.Barten=>os3.nl>


TCP tunneling over Citrix.

Citrix provides services for remote virtual desktop infrastructure (VDI / Xen Desktop) or application virtualization (XenApp). Citrix is sometimes used as a security measure to sandbox the execution of sensitive applications (e.g. a financial application that may only be run from a single server, with the users who require access connecting to the virtual desktop). The organization then sets additional restrictions to prevent data leaks: no access to clipboard data, no access to shared drives, and no outbound connectivity.
Citrix is built on top of traditional Windows technologies such as RDP to establish the connection to the virtualized desktop infrastructure. RDP has the capability to extend the remote desktop session with clipboard management, attaching of printers and sound devices, and drive mapping. Additionally, it is possible to create plugins to provide other functionalities.

The rdp2tcp project features the possibility to tunnel TCP connections (TCP forwarding) over a remote desktop session. This means no extra ports have to be opened.
We would like to investigate whether it is possible to establish a TCP tunnel over a Citrix virtual desktop session. This would allow routing of traffic through the Citrix server, potentially providing the ability to move laterally through the network in order to access systems connected to the Citrix server (that are not directly exposed to the Internet).
Cedric Van Bockhaven <cvanbockhaven=>deloitte.nl>




Bypassing Phishing Mail Filters (cont'd).

Email phishing is currently one of the most problematic threats in network security. By sending out emails that may be very similar to legitimate ones, attackers aim to harvest information by making users believe that they are communicating with a trusted entity. Although many techniques exist [3] to prevent phishing emails from reaching end users, studies [1] show that many of these techniques lack efficiency, are too costly and complex to be used in large environments, or are simply not used. As Ammar Almomani et al. state in their survey [1], the main technical approaches to counter email phishing are content analysis, network-level protection and authentication. The objective of this study is to examine which mechanisms are effectively used by spam filters against phishing attacks at the network and protocol level, and to determine how to bypass these mechanisms.

The main research question of this study is defined as follows:

  • Which network and authentication aspects of phishing emails can be modified in order to bypass common spam filters?

In order to answer our main research question, we will have to answer a number of sub-questions:

  • What network-level protections and authentication mechanisms are commonly used to prevent phishing attacks?
  • Which of these protections can be found in spam filters?
  • How efficient are these solutions?
  • How efficient is reputation-based email filtering?

[1] Ammar Almomani et al. “A survey of phishing email filtering techniques”. In: IEEE communications surveys & tutorials 15.4 (2013), pp. 2070–2090.

[3] Ian Fette, Norman Sadeh, and Anthony Tomasic. “Learning to detect phishing emails”. In: Proceedings of the 16th international conference on World Wide Web. ACM. 2007, pp. 649–656.
Alex Stavroulakis <Stavroulakis.Alex=>kpmg.nl>
Marat Nigmatullin <Nigmatullin.Marat=>kpmg.nl>
Adrien Raulot <Adrien.Raulot=>os3.nl>


Characterization of a Cortex-M4 microcontroller with backside optical fault injection.

Riscure produces fault injection tooling. This includes lasers that are used to alter the intended behaviour of chips. The goal of this project is to analyze the effectiveness of one of our newer lasers on embedded systems. During this project you will characterize the impact of this laser on the intended behaviour of the targeted chip.
Niek Timmers <Timmers=>riscure.com>

Jasper Hupkens <jasper.hupkens=>os3.nl> Dominika Rusek <dominika.rusek=>os3.nl>


Invisible Internet Project - I2P.

Anonymity networks, such as Tor or I2P, were built to allow users to access network resources without revealing their identity [1,2]. This project is aimed at theoretical research into existing attacks and how they hold up given I2P updates and the current network size [3,4,5]. Little is known about the I2P network, while it has potential for future growth and is publicly perceived as the most secure solution compared to Tor and Freenet [6]. This leads to our research questions:

  1. What are possible attacks against the I2P network?
  2. What is the feasibility of such attacks?

In this research, we will present a number of possible attacks against the I2P network, specifically attacks that are able to deanonymize I2P users and reveal their identities. We will study these attacks from a theoretical point of view and propose mitigation mechanisms against them. Should time and ethical considerations allow, a proof of concept can supplement the research.

Ethical Consideration:

During our research, we will be looking at the I2P network and the way it works. In addition, we will be looking at the possible attacks from a theoretical point of view. To get a better understanding of the network, we may need to do some practical reconnaissance. However, this will be mostly passive in nature and no attack shall be attempted against the live I2P network. Therefore, we do not see ethical issues where any confidential or personal data might leak.

Some related work:

  1. https://www.cs.ucsb.edu/~chris/research/doc/raid13_i2p.pdf
  2. https://static.siccegge.de/pdfs/bachelor-thesis.pdf
  3. https://www.dailydot.com/debug/tor-freenet-i2p-anonymous-network/
  4. https://hal.inria.fr/file/index/docid/653136/filename/RR-7844.pdf
  5. https://geti2p.net/en/comparison/tor
  6. https://www.tandfonline.com/doi/full/10.1080/21642583.2017.1331770
  7. https://hal.inria.fr/hal-01238453/file/I2P-design-vs-performance-security.pdf

See also: http://www.dcssproject.net/i2p/

Henri Hambartsumyan <HHambartsumyan=>deloitte.nl>
Vincent van Mieghem <vvanmieghem=>deloitte.nl>
Fons Mijnen <fmijnen=>deloitte.nl>

Vincent Breider <vincent.breider=>os3.nl>
Tim de Boer <tim.deboer=>os3.nl>


A Comparative Security Evaluation for IPv4 and IPv6 Addresses.

We currently move to ever-greater deployment rates of IPv6. However, comparative IPv4 and IPv6 security evaluations in the past have shown that the security state of multihomed systems is often worse via IPv6 than via IPv4. In this research project, you will build an Internet measurement setup that identifies IPv4/IPv6 multihomed systems and measures their security state for IPv4 and IPv6 correspondingly. The scientific contribution of your work will then be in the evaluation and analysis of the collected data, especially in the context of prior work.

Suggested reading: Czyz, J., Luckie, M.J., Allman, M. and Bailey, M., 2016, February. Don't Forget to Lock the Back Door! A Characterization of IPv6 Network Security Policy. In NDSS.
Tobias Fiebig <T.Fiebig=>tudelft.nl>

Erik Lamers <erik.lamers=>os3.nl>
Vincent van der Eijk <Vincent.vanderEijk=>os3.nl>


Cloud based simulation and data assimilation architecture.

Airbus Leiden is developing services in the domain of air quality. For an operational service, an efficient architecture is needed to run the simulations and the data assimilation. Currently the setup is based on a "traditional" MPI-only architecture. However, the execution of ensembles (for the data assimilation) and simulations (over large spatial domains) could benefit from more modern approaches like Docker, Kubernetes and the like.
The aim of this project is to develop a cloud-technology-based architecture for efficient (in time and money) execution of the simulations and assimilations.
Sjaak Koot <j.koot=>airbusds.nl>





Automated acquisition of forensic artefacts in the cloud for incident response and digital forensics purposes.

With the rise of cloud computing usage across the world, cloud platforms become more interesting for attackers as well. As incident responders we need to be able to extract and acquire information from cloud platforms like AWS/Azure and Google Cloud Platform.
In this research, we would like to develop an automated process, which allows for forensically sound acquisition of artefacts from different cloud platforms. Part of the research is identification of available data across the different cloud providers as well as building a PoC that helps in automated acquisition of that data.
There is support from the management on this topic as well as dedicated technical knowledge for this topic. If you need more details you can send me an email.
Korstiaan Stam <korstiaan.stam=>pwc.com>


Adaptive monitoring based on normality and normativity.

As humans know from common sense, and as cognitive studies confirm, events are relevant to subjects when they are exceptional (for them) or when they (potentially) might have a positive or negative impact on their desires or interests. The goal of this project is to investigate how to develop similar relevance mechanisms in computational settings in order to provide adaptive monitoring. Intuitively, the system first needs to form an idea of normality from observations, and use it to evaluate whether and to what extent a new observation is exceptional. Second, the system should be provided with a reward model (possibly specified at design time, but that could be modified or refined dynamically) and use it to evaluate the potential impact of a new observation. Once implemented, these filters of relevance could be used for instance in a monitoring application to highlight to the user where to pay further attention. The target domains of such an application can be very diverse, for instance networking, social systems, etc. The objectives of this study are to:
  • investigate computational models for relevance, drawing from existing literature (information theory, algorithmic information theory, simplicity theory, etc.)
  • decide on an application domain and settle upon an associated representational model
  • develop the functions necessary for relevance, e.g. prototyping and reward model, and the mechanisms quantifying relevance
  • build a prototype for the target application domain
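As one possible reading of the two filters described above, the sketch below learns normality as a running mean and variance, scores exceptionality as a z-score, and combines it with a reward model. The class name, the threshold, the use of `max` to combine the two scores and the latency example are all assumptions for illustration, not part of the project description.

```python
import math

class RelevanceFilter:
    """Toy relevance filter: exceptionality (z-score) plus expected impact."""

    def __init__(self, reward, threshold=3.0):
        self.reward = reward        # reward model: observation -> impact score
        self.threshold = threshold  # assumed cut-off for flagging
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def exceptionality(self, x):
        """Distance from the learned normal, in standard deviations."""
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) / std if std > 0 else 0.0

    def observe(self, x):
        """Score x against the current model, then update it (Welford's method)."""
        score = max(self.exceptionality(x), abs(self.reward(x)))
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return score >= self.threshold

# Hypothetical reward model: latencies above 500 ms hurt the user's interests.
f = RelevanceFilter(reward=lambda ms: -5.0 if ms > 500 else 0.0)
readings = [100, 102, 98, 101, 99, 100, 180, 101, 600]
flagged = [x for x in readings if f.observe(x)]
print(flagged)
```

With these made-up readings the filter flags 180 (exceptional relative to what it has seen so far) and 600 (both exceptional and penalized by the reward model).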

  • Dessalles, J. L. (2013). Algorithmic simplicity and relevance. Algorithmic Probability and Friends, 7070 LNAI, 119–130.
  • Breuker, J. (1994). Components of problem solving and types of problems. A Future for Knowledge Acquisition, 867, 118–136.
  • Lindenmayer, D. B., & Likens, G. E. (2009). Adaptive monitoring: a new paradigm for long-term research and monitoring. Trends in Ecology and Evolution, 24(9), 482–486.
  • Domshlak, C., Hüllermeier, E., Kaci, S., & Prade, H. (2011). Preferences in AI: An overview. Artificial Intelligence, 175(7–8), 1037–1052.
Giovanni Sileno <G.Sileno=>uva.nl>


Smart contracts specified as contracts.

A common critique of smart contracts is that they are neither smart nor contracts: they are nothing more than instructions to be executed in a distributed infrastructure. If their specification is to be associated with a desired model of interaction between the parties, so that users can have a clearer understanding of the consequences of their execution, they should actually share primitives with legally binding contracts. The objectives of this project are:
  • to reflect on and develop an acquisition interface based on normative positions, starting from existing logical frameworks such as deontic logic and Hohfeldian positions;
  • to elaborate on the compilation of acquired models to an implementation level (e.g. Solidity);
  • to elaborate on possible validation or verification methods.
  • Governatori, G., Idelberger, F., Milosevic, Z., Riveret, R., Sartor, G., & Xu, X. (2018). On legal contracts, imperative and declarative smart contracts, and blockchain systems. Artificial Intelligence and Law, 1–33.
  • Frantz, C. K., & Nowostawski, M. (2016). From Institutions to Code: Towards Automated Generation of Smart Contracts. IEEE International Workshops on Foundations and Applications of Self* Systems.
  • Sileno, G., Boer, A., & van Engers, T. (2014). On the Interactional Meaning of Fundamental Legal Concepts. In Proceedings of the 27th International Conference on Legal Knowledge and Information Systems (JURIX 2014) (Vol. FAIA 271, pp. 39–48).
Giovanni Sileno <G.Sileno=>uva.nl>


Cyber Theft through Social Engineering.

Social engineering is a means to gain access to people’s accounts and steal personal and confidential information. This threat has become quite significant as people continue to build larger online footprints, mainly through social media platforms like Snapchat, Facebook, Twitter and LinkedIn, but also through other resources such as email, ancestral repositories and public tax repositories. Corporate executives, key government officials and celebrities are the ones we hear about in the news getting hacked by social engineering, but anyone can fall victim to this sort of exploit. What can we do as individuals, as an organization, as a society to combat this threat? What are the risks? Where are we most exposed? What systems typically fail during a socially engineered hack? How do we prevent it from happening? How do we recover after we’ve been hacked through social engineering? Elements to explore include, but are not limited to:
  • Personal responsibility and behaviors
  • Education
  • Government regulation
  • Corporate policies
  • Industry solutions
  • Others?
Rita Abrantes <Rita.Abrantes=>shell.com>


Detecting Blue Team analyses via Adwords.

During red teaming exercises it is of vital importance for the red team to know when the blue team has recognised their actions and is investigating their artefacts. Having such knowledge gives the red team the opportunity to either brace for impact, clean up their channels and lay low, change C2 channels or otherwise adjust their attacks to perhaps remain hidden from the blue team. This is all done in order to be a better sparring partner for the blue team and give them better training. During our red team exercises we make use of many different ways of detecting blue team activities. As we believe the entire red teaming industry needs to improve, we have open sourced some of these checks in our RedELK tooling (1). More info on our approach and details of RedELK can be found here (2).

We are always looking for novel ways of detecting the blue team. Recently, research was disclosed in which Adwords are used for this purpose (3). We want students to investigate the feasibility of using Adwords for detecting blue team activity, and to deliver a fully working PoC. The analysis should cover effectiveness as well as ease of setup and the possibility of inclusion in RedELK.

We are open to other novel ways for detection of blue team activities. This RP can easily be changed to your liking if you have another novel technique and hypothesis that fits the end goal. Get in touch if you have such an idea.

Students preferably already have experience in either offensive or defensive IT operations.
  1. https://github.com/outflanknl/RedELK
  2. https://www.youtube.com/watch?v=OjtftdPts4g
  3. https://www.youtube.com/watch?v=wlKqyuefE1E
Marc Smeets <marc=>outflank.nl>


Industrial Control System research.

Interface between third-party software and an embedded OS

In industrial systems, vendors only provide support up to a certain patch level of an OS. However, OS versions rely on patching to resolve security issues. Is it possible to develop an interface between the software and the OS in such a way that it is possible to maintain both availability and security?

Vulnerability assessment of Safety instrumented system (SIS)

SIS are promoted as being more secure, reliable and redundant. Is this true or are these systems still vulnerable? How much more secure are these systems really? What are the differences between PLC and SIS?
Dima van de Wouw <dvandewouw=>deloitte.nl>


Eduroam / WPA2-Enterprise Client Testing Suite.

The eduroam wireless roaming service for research and education supports 10,000 campuses across the globe. Technologies such as the configuration assistant tool limit end-user configuration errors, but misconfigurations (accidental or deliberate) still exist in the infrastructure and are often only revealed when an eduroam user visits a particular site. For the deployment of probes as “visiting eduroam users” the research question is:
  • What is the optimal set of authentication tests from a client to determine correct deployment of a wireless hotspot?
Additional sub-questions that could be explored:
  • With multiple clients at different sites, what additional information can you deduce from authentication failures?
  • What number of probes or set of features is needed to find the root cause of a problem?
  • Which combination of other monitoring logs can be used to determine problems without client testing?

The project will be able to utilise a network of “virtual end users” to run these tests in reality, not just in theory.
Brook Schofield <brook.schofield=>geant.org>


EduGAIN and Federation Metadata Propagation Time.

Mesh identity federations and the eduGAIN Interfederation Service build their trust on the exchange of SAML metadata to limit the audience to known actors. Responding to security threats, key rollover or even updates to service configuration can be achieved with changes in the metadata configuration of a service provider (SP) or identity provider (IdP). The time that it takes for a configuration to flow from the IdP/SP to their home federation, via interfederation services such as eduGAIN, and on to other IdPs/SPs is important to ensure consistent configuration throughout the environment. The research question is:
  • What is the {best,worst,average} propagation time of metadata throughout SAML identity federations?
Additional sub-questions that could be explored:
  • Can manual vs automatic metadata updates be detected by looking at metadata propagation times?
  • What levels of cohesion within federations, and what bilateral agreement can be exposed by looking at metadata exchange?
  • Can clashing versions of metadata be detected via external assessment of metadata exchange?
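Once observation timestamps are available, the {best,worst,average} figures in the research question reduce to simple statistics over delays. The sketch below assumes a hypothetical data layout (one publication time and one first-seen time per downstream party); the names and timestamps are invented for illustration and say nothing about how the observations would actually be collected.

```python
from datetime import datetime
from statistics import mean

def propagation_stats(published, observed):
    """Best, worst and average delay (seconds) between publication and observation.

    `published` is when the entity changed its metadata; `observed` maps each
    downstream federation or entity to the time the change was first seen there.
    """
    delays = [(seen - published).total_seconds() for seen in observed.values()]
    return {"best": min(delays), "worst": max(delays), "average": mean(delays)}

# Hypothetical observations for a single metadata change.
published = datetime(2019, 1, 7, 12, 0)
observed = {
    "home-federation": datetime(2019, 1, 7, 12, 30),
    "edugain-aggregate": datetime(2019, 1, 7, 18, 0),
    "remote-sp": datetime(2019, 1, 8, 12, 0),
}
stats = propagation_stats(published, observed)
print(stats)
```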
Brook Schofield <brook.schofield=>geant.org>

Marcel.denReijer <Marcel.denReijer=>os3.nl>


Security of embedded technology.

Analyzing the security of embedded technology, which operates in an ever changing environment, is Riscure's primary business. Therefore, research and development (R&D) is of utmost importance for Riscure to stay relevant. The R&D conducted at Riscure focuses on four domains: software, hardware, fault injection and side-channel analysis. Potential SNE Master projects can be shaped around the topics of any of these fields. We would like to invite interested students to discuss a potential Research Project at Riscure in any of the mentioned fields. Projects will be shaped according to the requirements of the SNE Master.
Please have a look at our website for more information: https://www.riscure.com
Previous Research Projects conducted by SNE students:
  1. https://www.os3.nl/_media/2013-2014/courses/rp1/p67_report.pdf
  2. https://www.os3.nl/_media/2011-2012/courses/rp2/p61_report.pdf
  3. http://rp.delaat.net/2014-2015/p48/report.pdf
  4. https://www.os3.nl/_media/2011-2012/courses/rp2/p19_report.pdf
If you want to see what the atmosphere is at Riscure, please have a look at: https://vimeo.com/78065043
Please let us know if you have any additional questions!
Niek Timmers <Timmers=>riscure.com>
Albert Spruyt <Spruyt=>riscure.com>
Martijn Bogaard <bogaard=>riscure.com>
Dana Geist <geist=>riscure.com>


Intel SGX proof-of-concept.

Intel SGX offers new instructions for Intel CPUs that allow you to have a “secure enclave” in which code can be run in a compartmentalized fashion (that should be secure even if the main OS is compromised). This project could look into how SGX can be used to store e.g. documents safely so that even the admin can’t access them, and what the pitfalls of the system could be.
Gijs Hollestelle <ghollestelle=>deloitte.nl>

Robin Klusman <robin.klusman=>os3.nl>
Derk Barten <derk.barten=>os3.nl>


Mimikatz driver to R/W arbitrary kernel memory.

Mimikatz has a driver bundled that allows an attacker to read and write arbitrary kernel memory. This project would look into using the mimikatz driver in order to run privileged code via the driver. For example, working from the kernel, it is possible to unhook A/V in order to bypass endpoint protection software. However, several protections are in place (e.g. KPP) that make this difficult. It would be interesting to look into a generic way to unhook minifilter callbacks by using the mimikatz kernel driver.
Cedric Van Bockhaven <cvanbockhaven=>deloitte.nl>


Car hacking research.

Intrusion detection for car systems

Cars are becoming more connected and networked; as a result, more attack vectors are available on a car.
  • Is it possible to develop an intrusion detection system for a car and what are the possible actions that can be taken after an alert is posted?
  • Are the new protocols being developed to replace CAN (like FlexRay) secure enough?
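As one possible starting point for the first question, a detector could learn how often each CAN identifier normally appears and raise an alert when the rate in a time window deviates strongly. The sketch below is a simplified illustration under that assumption; the frame representation (timestamp, CAN ID), the window handling, and the tolerance factor are hypothetical choices, not a method taken from the project description.

```python
from collections import Counter


def learn_baseline(frames, window):
    """Average number of frames per CAN ID per window, from attack-free traffic.

    `frames` is an iterable of (timestamp_seconds, can_id) tuples.
    """
    frames = list(frames)
    if not frames:
        return {}
    duration = max(t for t, _ in frames) - min(t for t, _ in frames)
    windows = max(duration / window, 1.0)
    counts = Counter(can_id for _, can_id in frames)
    return {can_id: n / windows for can_id, n in counts.items()}


def detect(frames, baseline, tolerance=2.0):
    """Return alerts for one window of traffic.

    An ID is flagged when it is unknown, or when it appears more than
    `tolerance` times as often as in the baseline (e.g. frame injection).
    """
    counts = Counter(can_id for _, can_id in frames)
    alerts = []
    for can_id, n in sorted(counts.items()):
        expected = baseline.get(can_id)
        if expected is None:
            alerts.append((can_id, "unknown identifier"))
        elif n > tolerance * expected:
            alerts.append((can_id, "rate too high"))
    return alerts


# Made-up traffic: ID 0x100 every 100 ms, ID 0x200 every 500 ms, for 10 s.
normal = [(i * 0.1, 0x100) for i in range(100)] + [(i * 0.5, 0x200) for i in range(20)]
baseline = learn_baseline(normal, window=1.0)

# One window in which 0x100 is flooded and an unseen ID 0x7FF shows up.
suspicious = [(0.01 * i, 0x100) for i in range(50)] + [(0.5, 0x200), (0.6, 0x7FF)]
alerts = detect(suspicious, baseline)
```

What to do after an alert (log, warn the driver, drop frames at a gateway) is exactly the open part of the research question and is deliberately left out here.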
Security of upcoming ICS/IOT communication protocols
  • Assessing the security of upcoming protocols for ICS systems, comparing them to each other and also to the current industry standards.
Colin Schappin <cschappin=>deloitte.nl>


Release Windows Kerberos Credentials in Industrial Control Systems (ICS).

Industrial Control System research.

Windows saves all credentials entered into it since boot. Because ICS systems need near 100% availability, rebooting the machine to clear the credentials from memory is not possible. Is there a way to remove these credentials from memory in ICS systems securely and without any compromise to availability?
Dima van de Wouw <dvandewouw=>deloitte.nl>

Nick Offerman <Nick.Offerman=>os3.nl>
steffan.roobol <steffan.roobol=>os3.nl>


Industrial Control System research.

ICS malware network behavioral analysis.

  • What does malware look like on an ICS network?
  • Does this differ from regular IT systems and are pattern based / machine learning based solutions applicable to ICS systems?

ICS process mapping to finite state machines and analyzing system behavior.

  • Is it possible to map a process (control, safety, ...) used in ICS systems to a finite state machine (FSM)?
  • Can this process of conversion be made easier for ICS processes?
  • Is it possible to use this FSM to monitor the behavior of the system and see if it shows unusual behavior (malware or defective equipment)?
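The monitoring idea in the last question can be sketched as a table of allowed transitions that is checked against observed events. This is a minimal illustration only; the `FsmMonitor` class and the tank-filling process with its states and events are invented for the example and do not describe a real ICS process.

```python
class FsmMonitor:
    """Tracks an observed process against a table of allowed transitions."""

    def __init__(self, transitions, initial):
        # transitions: {(state, event): next_state}
        self.transitions = dict(transitions)
        self.state = initial
        self.violations = []

    def observe(self, event):
        """Advance on an allowed event; record a violation otherwise."""
        key = (self.state, event)
        if key in self.transitions:
            self.state = self.transitions[key]
            return True
        self.violations.append(key)
        return False


# Hypothetical tank-filling process, invented for illustration.
TANK = {
    ("idle", "start"): "filling",
    ("filling", "level_high"): "full",
    ("full", "open_drain"): "draining",
    ("draining", "level_low"): "idle",
}

monitor = FsmMonitor(TANK, initial="idle")
for event in ["start", "level_high", "start", "open_drain", "level_low"]:
    monitor.observe(event)
```

Here the second `start` arrives while the tank is already full, so it is recorded as a violation while the monitor keeps its current state; whether such a violation indicates malware or defective equipment is beyond what this sketch can tell.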
Bartosz Czaszynski <bczaszynski=>deloitte.nl>


Integration of EVPN in Openstack.

EVPN-VxLAN is the default overlay solution for IP-Fabrics and Cumulus has upstreamed the EVPN implementation into the FRRouting project. EVPN can also be run on a regular Linux host (https://cumulusnetworks.com/blog/evpn-host/), but Openstack doesn’t have integration with EVPN/FRR or the other changes made in the Linux kernel over the last few years (e.g. VRFs, vlan-aware bridging).
Attilla de Groot <attilla=>cumulusnetworks.com>
Frank Potter <Frank.Potter=>os3.nl>


Distributed firewalling with BGP Flowspec.

BGP Flowspec (RFC 5575) is a standard to distribute ACLs with BGP. This is mainly used in DDoS mitigation, but I think it would be suitable to implement a distributed firewall and create a microsegmentation solution in a datacenter. This could either be used in combination with the infrastructure and an OS like Cumulus Linux or (in relation to the above) when routing is done on a host/hypervisor. FRRouting currently has Flowspec partly implemented (only as a receiver), which could be used as an implementation.
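To illustrate what a distributed firewall built on such rules would have to evaluate on each host, the sketch below models a rule as a set of match fields plus an action. It is a simplified stand-in and not the RFC 5575 encoding: the `FlowRule` class, its fields, the first-match ordering, and the example subnet are all assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowRule:
    """Simplified stand-in for a Flowspec rule: match fields plus an action."""
    dst_prefix: str
    protocol: Optional[int] = None   # IP protocol number, e.g. 6 for TCP
    dst_port: Optional[int] = None
    action: str = "discard"

    def matches(self, dst_ip, protocol, dst_port):
        if ipaddress.ip_address(dst_ip) not in ipaddress.ip_network(self.dst_prefix):
            return False
        if self.protocol is not None and self.protocol != protocol:
            return False
        if self.dst_port is not None and self.dst_port != dst_port:
            return False
        return True


def decide(rules, dst_ip, protocol, dst_port, default="accept"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(dst_ip, protocol, dst_port):
            return rule.action
    return default


# Hypothetical microsegmentation policy for one database subnet.
RULES = [
    FlowRule("10.0.20.0/24", protocol=6, dst_port=5432, action="accept"),
    FlowRule("10.0.20.0/24", action="discard"),
]
```

With these rules, `decide(RULES, "10.0.20.5", 6, 5432)` is accepted while any other traffic to the subnet is discarded; in a real deployment the rules would arrive via BGP rather than being defined locally.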
Attilla de Groot <attilla=>cumulusnetworks.com>


Certificate Transparency Monitoring.

This relatively new technique to monitor newly issued certificates offers many possibilities when it comes to detecting phishing sites, and even TLS-enabled C2s (command and control). Open source projects are already "streaming" the new certificates in real time, and many security research teams (us included) are analyzing the data. The majority of these communities share the same techniques: homograph attacks (aka typosquatting), punycode, and appending a malicious domain to the name of the targeted site (www[.]github[.]com[.]longMaliciousDomain[.]it).
These techniques give good results but not without a fair amount of false positives. This research project aims to explore new (more creative) analysis techniques and implement them in our environment to verify their effectiveness.
Moreover, the student will be encouraged to think what other purposes this rich set of data could be used for.
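Two of the commonly shared techniques mentioned above, punycode labels and a legitimate domain used as a prefix of a malicious one, can be sketched as simple hostname checks. This is an illustration under stated assumptions, not the detection logic used by any team: the function name, the `brands` set, and the "last two labels" rule for the registered domain are hypothetical simplifications.

```python
def suspicious_reasons(hostname, brands):
    """Return a list of reasons why a certificate hostname looks like phishing.

    `brands` is a set of legitimate domains to protect, e.g. {"github.com"}.
    Assumption: the registered domain is the last two labels, which is wrong
    for suffixes such as "co.uk"; a real implementation would consult the
    Public Suffix List.
    """
    labels = hostname.lower().rstrip(".").split(".")
    if labels and labels[0] == "*":      # wildcard certificates
        labels = labels[1:]
    registered = ".".join(labels[-2:])
    reasons = []

    # Punycode labels may hide look-alike (homograph) characters.
    if any(label.startswith("xn--") for label in labels):
        reasons.append("punycode label")

    # Brand used as a subdomain prefix of an unrelated registered domain.
    prefix = ".".join(labels[:-2])
    for brand in sorted(brands):
        if registered == brand:
            continue
        if prefix == brand or prefix.endswith("." + brand):
            reasons.append("brand '%s' embedded in subdomain" % brand)
    return reasons
```

For example, `suspicious_reasons("www.github.com.longmaliciousdomain.it", {"github.com"})` reports the embedded brand, while `www.github.com` itself yields no reasons. Checks like these are the kind that produce the false positives mentioned above, which is why more creative techniques are the goal of the project.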

Leandro Velasco <leandro.velasco=>kpn.com>
Jeroen Klaver <jeroen.klaver=>kpn.com>


Automated (or assisted) Threat Intelligence verification.

In the last few years much research has been done in the field of Threat Intelligence. Many tools have been released to harvest, parse, aggregate, store, and share Indicators Of Compromise (IOC) (https://github.com/hslatman/awesome-threat-intelligence), yet one big problem remains when it comes to using them: *False Positives*. Commercial, open source, or even home brew feeds of threat intelligence need to go through a phase of verification. This is a tedious job, mostly done by security analysts, in which the data is analysed in order to rule out outdated, non-relevant, or wrong IOCs. The idea of this research project is to analyse the various possibilities to perform this verification phase in an automated fashion.
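As a small illustration of what an automated verification step could look like, the sketch below triages IP-address IOCs by validity, routability, and age. It is only one assumed pre-filter among many possibilities; the function name, the returned labels, and the 90-day threshold are hypothetical choices for the example.

```python
import ipaddress
from datetime import date, timedelta


def triage_ip_ioc(value, last_seen, today, max_age_days=90):
    """Classify an IP-address IOC as 'keep' or give a reason to discard it.

    The 90-day default is an arbitrary illustration value.
    """
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        return "discard: not a valid IP address"
    # Private, loopback, link-local and similar ranges say nothing about
    # an external attacker and typically cause false positives.
    if not address.is_global:
        return "discard: not a globally routable address"
    if today - last_seen > timedelta(days=max_age_days):
        return "discard: outdated"
    return "keep"
```

A call such as `triage_ip_ioc("192.168.1.10", date(2019, 6, 1), date(2019, 6, 10))` is discarded as non-routable. Judging relevance, the hardest of the three criteria named above, is not covered by a rule this simple.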
Leandro Velasco <leandro.velasco=>kpn.com>
Jeroen Klaver <jeroen.klaver=>kpn.com>


Malicious .Net application detection using Event Tracing for Windows (ETW).

The cat and mouse game between attackers (Red Teams) and defenders (Blue Teams) is a never-ending story. In the past years attackers have found that antivirus bypass was doable by performing "fileless attacks" leveraging common tooling in Windows environments. A common tool widely exploited is PowerShell. As a countermeasure the industry is slowly implementing endpoint monitoring. This practice aims to build on top of antivirus software by analyzing the events that happen in the system using software like Sysmon or other EDR tooling. Moreover, Microsoft implemented PowerShell script block logging. This allows defenders to not just monitor low-level events but also analyse the commands executed by the PowerShell engine. After noticing that their trick had started to get attention, attackers moved away and started implementing malicious .Net applications. Due to the nature of the .Net framework, attackers are able to deploy a .Net agent on the target system and send raw .Net code that will be compiled and executed by the agent from memory, thus avoiding detection.
Security researchers have found that Event Tracing for Windows (ETW), first introduced in Windows 2000, can be used to detect these new threats.
Recently the company FireEye released SilkETW, an open source tool that facilitates the use of the data generated by ETW. However, many challenges remain: vendors and blue teams need a better understanding of the events generated and need to integrate these events into their detection strategies.

The idea behind this research project is to study the effectiveness of this newly discovered technology against threats such as the Covenant framework (https://github.com/cobbr/Covenant) and webshells such as the one recently disclosed by the apt34/Oilrig dump (https://d.uijn.nl/2019/04/18/yet-another-apt34-oilrig-leak-quick-analysis/).

Leandro Velasco <leandro.velasco=>kpn.com>
Jeroen Klaver <jeroen.klaver=>kpn.com>


OSINT Washing Street.

At the moment more and more OSINT is available via all kinds of sources, and a lot of them are legitimate services that are also used by malicious actors. Examples are GitHub, Pastebin, Twitter, etc. If you look at Pastebin data you might find IOCs/TTPs, but usually the payloads are delivered in many stages, so it is important to have a system that follows the path until it finds the real payload. The question here is how you can build a generic pipeline that unravels data like a matryoshka doll, so that no matter the input, the pipeline will try to decode, query or perform whatever relevant action is needed. This would result in better insight into the later stages of an attack. An example of a framework using this method is Stoq (https://github.com/PUNCH-Cyber/stoq), but it lacks research into its usability and into whether the results add value compared to other OSINT sources.
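The matryoshka idea can be sketched as a loop that keeps applying decoders until none of them fits. The sketch below is a minimal illustration and not how Stoq works; the decoder set (gzip, hex, base64), their order, and the depth limit are assumptions chosen for the example, and following URLs or querying external services is left out.

```python
import base64
import binascii
import gzip
import zlib


def _try_gzip(data):
    if not data.startswith(b"\x1f\x8b"):   # gzip magic bytes
        return None
    try:
        return gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        return None


def _try_hex(data):
    try:
        return bytes.fromhex(data.strip().decode("ascii"))
    except (UnicodeDecodeError, ValueError):
        return None


def _try_base64(data):
    try:
        return base64.b64decode(data.strip(), validate=True)
    except (binascii.Error, ValueError):
        return None


# Hex is tried before base64 because every hex string also fits the
# base64 alphabet and would otherwise be decoded as the wrong layer.
DECODERS = [("gzip", _try_gzip), ("hex", _try_hex), ("base64", _try_base64)]


def unwrap(data, max_depth=10):
    """Peel encoding layers until no decoder applies; return (payload, layers)."""
    layers = []
    for _ in range(max_depth):
        for name, decoder in DECODERS:
            decoded = decoder(data)
            if decoded:                      # ignore failures and empty results
                data, layers = decoded, layers + [name]
                break
        else:
            break                            # no decoder matched: innermost layer
    return data, layers


# Made-up three-layer sample: base64(gzip(hex(payload))).
payload = b"GET http://malicious.example/stage2"
wrapped = base64.b64encode(gzip.compress(payload.hex().encode("ascii")))
result, layers = unwrap(wrapped)
```

A known weakness of this kind of guessing is ambiguity: a short plain-text payload that happens to be valid base64 would be decoded one layer too far, which is part of why usability and added value need research.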
Leandro Velasco <leandro.velasco=>kpn.com>
Jeroen Klaver <jeroen.klaver=>kpn.com>




I hereby would like to invite you to the annual RP2 presentations, where the SNE students will be presenting their research.
Considering the wide variety of presentations the day promises to be very interesting and we hope you will join us.
Program (Printer friendly version: HTML, PDF): The event is spread over two days: Wednesday July 3 and Thursday July 4, 2019.

Tuesday July 2, 2019, CWI Turingzaal, Sciencepark 125, Amsterdam.
Time D #RP Title Name(s) LOC RP #stds

Welcome, introduction. Cees de Laat

12h45 25

13h10 20

13h30 20

13h50 20

14h10 20
(bio) break

14h30 25

14h55 25

15h20 25

15h45 15

(bio) break

16h00 20

16h20 20

16h40 20



Thursday July 4, 2019, Auditorium H0.008, FNWI, Sciencepark 904, Amsterdam.
Time D #RP Title Name(s) LOC RP #stds

Welcome, introduction. Cees de Laat

10h00 25

10h25 25

10h50 20

11h10 25

11h35 25

12h00 25

13h10 20

13h30 20

13h50 20

14h10 20
bio break

14h30 25

14h55 20

15h15 20

15h35 25

16h00 20

16h20 20

16h40 20




Program (Printer friendly version: HTML, PDF): Monday Feb 4th 2019, 15h05 - 17h00 in B1.23 at Science Park 904, NL-1098XH Amsterdam.
(all presentations are 20 minutes for single and 25 minutes for pairs of students, yellow = requested specific day/time.)

Time D #RP Title Name(s) LOC RP #stds

Welcome, introduction. Staff

15h10 20 72 How To Reduce The Risk Of Email Data Breaches. Nick Offerman minvenj 1
15h30 20 59 Credential extraction of in-memory password managers. Derk Barten deloitte 1
15h50 20

16h10 25 28 The development of a contained and user emulated malware assessment platform. Siebe Hodzelmans, Frank Potter deloitte 1
16h35 25 63 Invisible Internet Project - I2P. Vincent Breider, Tim de Boer deloitte 1


Tuesday Feb 5th 2019, 10h00 - 17h00 in room B1.23 at Science Park 904, NL-1098XH Amsterdam.
Time D #RP Title Name(s) LOC RP #stds

Welcome, introduction. Cees de Laat

10h00 25 64 A Comparative Security Evaluation for IPv4 and IPv6 Addresses. Erik Lamers, Vincent van der Eijk tudelft 1
10h25 25 62 Characterization of a Cortex-M4 microcontroller with backside optical fault injection. Jasper Hupkens, Dominika Rusek riscure 1
10h50 20 bio/coffee break

11h10 25 27 Password Classification. Tiko Huizinga NFI
11h35 25 4 Blockchain's Relationship with Sovrin for Digital Self-Sovereign Identities. Daan Weller, Raoul Dijksman TNO


13h00 25 41 Security of diabetes monitoring apps. Roy Vermeulen, Edgar Bohte kpmg 1
13h25 25 24 Forensic investigation of Chinese smartwatches. Kasper van Brakel, Renee Witsenburg kpmg 1
13h50 20 57 A Deep Dive into the Dark Web. Coen Schuijt kpmg 1
14h10 20 bio/tea/coffee break

14h30 25 25 WhatsApp End-to-End Encryption: Are Our Messages Private? Pavlos Lontorfos, Tom Carpaij kpmg 1
14h55 25 12 Technical feasibility of Segment Routing Traffic Engineering to steer traffic through VNFs. Ronald van der Gaag, Mike Slotboom SURFnet
15h20 20 Network Peering Dashboard for SURFnet. David Garay SURFnet 1


Out of normal schedule presentations: Room B1.23 at Science Park 904, NL-1098XH Amsterdam. Program:
Date Time Place D #RP Title Name(s) LOC RP #stds
B1.23 30 1 End-to-end automated email component testing. Isaac Klop, Kevin Csuka NLnet
B1.23 20 61 Bypassing Phishing Protections. Adrien Raulot
B1.23 20 11 Network Functions Virtualization and Security. Rik Janssen SURFnet 2