SARNET Demonstrations at SuperComputing 16

This post comes to us courtesy of Prof. Cees de Laat and his team from the University of Amsterdam.

SARNET, Secure Autonomous Response NETworks, is a project funded by the
Dutch Research Foundation [1]. The University of Amsterdam, TNO, KLM, and
Ciena conduct research on automated methods against attacks on computer
network infrastructures. By using the latest techniques in Software
Defined Networking and Network Function Virtualization, a SARNET can use
advanced methods to defend against cyber-attacks and return the network
to its normal state. The research goal of SARNET is to obtain the
knowledge to create ICT systems that: 1) model the system's state based
on the emergent behavior of its components; 2) discover by observation
and reasoning whether and how an attack is developing and calculate the
associated risks; 3) have the knowledge to calculate the effect of
countermeasures on states and their risks; and 4) choose and execute the
most effective countermeasure.
Similar to the SC15 demonstration [2],[3], we showed an interactive
touch-table-based demonstration that controls a Software Defined Network
running on the ExoGENI infrastructure.

[Figure: SARNET poster presented at SC16]
In the SC16 demo [4] the visitor selects the attack type and its origin, and
the system responds and defends against the attack autonomously. The
response includes security VNFs that are deployed on the ExoGENI
infrastructure when required; the underlying Software Defined Network then
routes the attack traffic to the VNF for analysis or mitigation. The demo
showed how Network Function Virtualization and Software Defined Networks
can be useful in attack mitigation and how they can be used effectively in
setting up autonomous responses to higher-layer attacks.

[1] SARNET project page: http://www.delaat.net/sarnet
[2] SARNET demonstration at SC15 – http://sc.delaat.net/sc15/SARNET.html
[3] R. Koning et al., "Interactive analysis of SDN-driven defence
against distributed denial of service attacks," in 2016 IEEE NetSoft
Conference and Workshops (NetSoft), IEEE, 2016, pp. 483–488.
[4] SARNET demonstration at SC16 – http://sc.delaat.net/

Using ExoGENI Slice Stitching capabilities

This blog entry shows how to stitch together slices belonging to potentially different users. The video demonstrates the workflow, and a short discussion below outlines the limitations of the current implementation.

Several API operations have been introduced to support slice-to-slice stitching:

  1. permitSliceStitch – informs the controller that the owner of this reservation (node or VLAN, see below) allows other slices to be stitched to this reservation using a specific password
  2. revokeSliceStitch – the inverse of permit; removes permission to stitch to any other slice. Existing stitches are not affected.
  3. performSliceStitch – stitches a reservation in one slice to a reservation in another slice using a password set by permitSliceStitch
  4. undoSliceStitch – undoes an existing stitch between two reservations
  5. viewStitchProperties – inspects whether a given reservation allows stitching and whether there are active or inactive stitches from other slices
    1. provides information such as the slice and reservation stitched to, the DN identifier of the owner of the other slice, when the stitch was performed, and, if it has been unstitched, when that happened.

Caveats and useful facts:

  • Slices that you are attempting to stitch must have been created by the same controller
  • The reservations that you are trying to stitch together must be on the same aggregate/rack
  • You cannot stitch pieces of the same slice to each other
  • Only stitching of compute nodes (VMs and bare-metal nodes) to VLANs/links is allowed. This will not work for shared VLANs or for links connecting to storage (those aren't real reservations in the ORCA sense)
  • Stitching is asymmetric (one side issues the permit, the other side performs the stitch)
  • Unstitching is symmetric (either side can unstitch from the other; a password is not required)
  • Each stitching operation has a unique GUID, and this is how it is known in the system
  • An inactive stitch is distinguished from an active stitch by the presence of an 'undone' property indicating the date/time, in RFC 3339 format, at which the unstitching operation was performed
  • Passwords (bearer tokens) used to authorize stitching operations are stored according to best security practices (salted and transformed). They are meant to be communicated between slice owners out-of-band (phone/email/SMS/IM/pigeons).
  • Stitching to point-to-point links is allowed; however, keep in mind that if you used automatic IP assignment, the two nodes that are the endpoints of the link will have a /30 netmask, which means any additional nodes you stitch into that link will not be able to communicate with the existing nodes until you set a common, broader netmask on all nodes (a sketch follows this list)
    • Note that ORCA enforces IP address assignment via the guest-side neucad tool running in the VM. If you are reassigning an IP address manually on a node, remember to kill that daemon first; otherwise it will overwrite your changes.
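
If you do stitch additional nodes into an existing point-to-point link, a minimal sketch of widening the netmask is shown below. The interface name, address, and daemon invocation are illustrative assumptions rather than values taken from a real slice; the guest-configuration daemon is stopped first so it does not revert the change, as noted above.

$ pkill neucad                                      # assumed name of the guest-side daemon mentioned above
$ ifconfig eth1 172.16.0.1 netmask 255.255.255.0    # repeat on each node with its own address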

Known Limitations:

  • No NDL manifest support. Stitching is not visible in the manifest and can only be deduced by querying a node or a link for its stitching properties
    • Consequently no RSpec manifest support
  • No GENI API support

Using ExoGENI slice modify/dynamic slice capabilities

This short post demonstrates in video form how to use the Flukes GUI to drive ExoGENI slice modify capabilities.

There are a few items worth remembering when using ExoGENI slice modify:

  • The initial slice doesn't need to be bound; however, for all follow-on modify operations you must bind slice elements explicitly
  • Slice modify is a 'batch' operation – you can accumulate some number of modifications and then click 'Submit Changes' to realize them. For everyone's sanity it is best to keep the number of changes relatively small in each step.
    • Corollary: simply clicking 'Delete' on a node or link does not immediately remove it. You still have to click 'Submit Changes' to make it happen.
  • Modifying multi-point inter-domain links is not allowed. You can create a new multi-point link or delete an existing one, but for the time being you cannot change the degree of an existing link
  • It is not possible to link a new inter-domain path to an existing broadcast link
  • When deleting paths across multiple domains, please delete not just the ION/AL2S crossconnect but also the two neighboring crossconnects belonging to each rack, as shown in this figure:

[Figure modify-delete-1: deleting the ION/AL2S crossconnect together with the two neighboring rack crossconnects]

ExoGENI used for predicting storm landfall and impact

Recent collaborations between RENCI and the Global Environment for Network Innovations (GENI) have enabled ADCIRC-based storm surge and wave simulations to access GENI's federated network and computational resources. The collaboration has resulted in a scientific workflow that controls an ensemble of simulations running across the GENI federated infrastructure, predicting storm surge in unprecedented detail.

More details can be found in this NSF item and in this US Ignite link.

Using Docker in ExoGENI

Overview

This brief post explains how to use Docker in ExoGENI. The image built for this post is posted in the ExoGENI Image Registry and is also available in Flukes.

Name: Docker-v0.1
URL: http://geni-images.renci.org/images/ibaldin/docker/centos6.6-docker-v0.1/centos6.6-docker-v0.1.xml
Hash: b2262a8858c9c200f9f43d767e7727a152a02248

This is a CentOS 6.6-based image with Docker installed on top. Note that this post is not meant to serve as a primer on Docker; for that, please consult the Docker documentation.

Theory of operation

Docker is a platform to configure, ship, and run applications. It uses lightweight, thinly provisioned containers to isolate applications from each other. The Docker daemon uses the devicemapper storage driver to manage its disk space. By default, Docker creates a sparse 100 GB file for data, and each container can take up to 10 GB of disk space, which is clearly too much to fit inside a VM should a container try to fill it up.

To ensure this doesn't happen, the ExoGENI Docker VM limits each container to 5 GB and limits the overall space given to Docker in its sparse file to 20 GB. These settings can be adjusted by editing a line in /etc/sysconfig/docker:

other_args="--storage-opt dm.basesize=5G --storage-opt dm.loopdatasize=20G"

If you wish to resize the amount of space available to your Docker, please edit this line accordingly and then do the following:

$ service docker stop
$ cd /var/lib
$ rm -rf docker
$ service docker start

Please note that wiping out the /var/lib/docker directory as shown above will destroy all images and containers you may have created so far. If you wish to keep an image you created, run

$ docker save -o image.tar <image name>

for each image you want to keep. Once Docker's disk space has been resized and the daemon restarted, you can reload the images using the docker load command.
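
For example, assuming you have an image tagged myimage (the tag is purely illustrative):

$ docker save -o myimage.tar myimage     # before resizing
$ docker load -i myimage.tar             # after Docker has been resized and restarted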

Using the Docker image

You can simply boot some number of instances with the Docker image listed above and start pulling images from Docker Hub or creating your own.

We recommend using the larger VM sizes, like XOLarge and XOExtraLarge, to make sure you don't run out of disk space.
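
For instance, once a VM with this image is up, you can pull and run a container straight from Docker Hub; the image used below (centos) is purely an illustration:

$ docker pull centos
$ docker run -ti centos /bin/bash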

Using perfSonar in ExoGENI

Overview

Special thanks go to Brian Tierney of LBL/ESnet for his help in creating the perfSonar image.

This post describes how to use a perfSonar image in ExoGENI slices. The image built for this blog post is now posted in the ExoGENI Image Registry and available in Flukes.

Name: psImage-v0.3
URL: http://geni-images.renci.org/images/ibaldin/perfSonar/psImage-v0.3/psImage-v0.3.xml
Hash: e45a2c809729c1eb38cf58c4bff235510da7fde5

Note that this is a Level 2 perfSonar image built from a CentOS 6.6 base image with a modified ps_light Docker container from ESnet. However, registration with the perfSonar lookup service is disabled in this image.

Theory of operation

The perfSonar image uses Docker technology to deploy its components. The following elements are included in the image:

  • Client programs for nuttcp, iperf, iperf3, bwctl and owamp, included as simple RPMs accessible to all users
  • Server programs for bwctl and owamp, running inside a Docker container

The image starts Docker on boot, loads the needed Docker images, and automatically launches the 'ps_light_xo' container with the server programs in it.

-bash-4.1# docker ps
CONTAINER ID        IMAGE                COMMAND                CREATED             STATUS              PORTS               NAMES
ba28266c1aec        ps_light_xo:latest   "/bin/sh -c '/usr/bi   6 minutes ago       Up 6 minutes                            suspicious_lovelace  

Under normal operation the user should not have to interact with the server programs – the container runs in host networking mode and the server programs listen on all the interfaces the VM may have. However, if needed, the user can gain access to the container running the server programs using the following command:

$ docker exec -ti <guid> /bin/bash

where '<guid>' refers to the container ID of the automatically started container. You can find it by issuing this command:

$ docker ps
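
If ps_light_xo is the only container running on the node (as it is in a freshly booted image), the two steps can be combined into one command; this is merely a convenience and assumes nothing else has been started:

$ docker exec -ti $(docker ps -q) /bin/bash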

Using the image

You can create a topology using the perfSonar image (listed in the Image Registry and above) and then run the client programs on some nodes against the servers running on other nodes. Since the image contains both client and server programs, measurements can be made in either direction as long as IP connectivity is assured.

Once the slice has booted, try a few client programs:

-bash-4.1# owping 172.16.0.2
Approximately 13.0 seconds until results available

--- owping statistics from [172.16.0.1]:8852 to [172.16.0.2]:8966 ---
SID:	ac100002d8c6ba8674af285470d65b0b
first:	2015-04-01T14:42:15.627
last:	2015-04-01T14:42:25.314
100 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = -0.496/-0.4/-0.144 ms, (err=3.9 ms)
one-way jitter = 0.2 ms (P95-P50)
TTL not reported
no reordering


--- owping statistics from [172.16.0.2]:8938 to [172.16.0.1]:8954 ---
SID:	ac100001d8c6ba867d50999ce0a1166f
first:	2015-04-01T14:42:15.553
last:	2015-04-01T14:42:24.823
100 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 1.09/1.3/1.5 ms, (err=3.9 ms)
one-way jitter = 0.2 ms (P95-P50)
TTL not reported
no reordering

or

-bash-4.1# bwctl -c 172.16.0.2
bwctl: Using tool: iperf
bwctl: 16 seconds until test results available

RECEIVER START
------------------------------------------------------------
Server listening on TCP port 5578
Binding to local address 172.16.0.2
TCP window size: 87380 Byte (default)
------------------------------------------------------------
[ 15] local 172.16.0.2 port 5578 connected with 172.16.0.1 port 59083
[ ID] Interval       Transfer     Bandwidth
[ 15]  0.0-10.0 sec  1356333056 Bytes  1081206753 bits/sec
[ 15] MSS size 1448 bytes (MTU 1500 bytes, ethernet)

RECEIVER END
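
bwctl picked iperf above; since the image also ships iperf3 and nuttcp clients, you can request a specific tool explicitly (assuming the server side supports it), for example:

$ bwctl -T iperf3 -c 172.16.0.2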

Things to note

OWAMP in particular is sensitive to accurate time measurements, which is why the VMs come packaged with ntpd, started on boot. However, this does not solve all the problems. Measuring jitter in a VM may produce unpredictable results because the VM shares cores with other VMs on the same worker node. While in ExoGENI we do not oversubscribe cores, we also do not (yet) do any core pinning when placing VMs on workers, which means timing artifacts may occur when VMs switch cores.

The end result is that while jitter measurements using OWAMP may have high resolution, their accuracy should be questioned. To improve the accuracy try using larger instance sizes, like XOLarge and XOExtraLarge.
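
A quick way to confirm that a VM's clock is actually synchronized before trusting the OWAMP numbers is to check the NTP peer status with standard ntp tooling (nothing ExoGENI-specific):

$ ntpq -p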

Using GENI for teaching Computer Networking at NCSU

Author: Prof. Rudra Dutta of NCSU Computer Science

I have been using GENI in some form in my teaching for the last three years –
very tentatively to start with, but more extensively over time. Fall 2014
was my most ambitious use so far.

The course I was teaching was Internet Protocols – a graduate-level course on
networking that assumes at least one previous general networking course.
I assume basic working knowledge of networking, TCP/IP, socket programming,
and general knowledge of common Internet functionality such as HTTP, DNS, etc.
After a quick refresher on some of these topics, the course dives into details
of the forwarding engine, QoS issues in forwarding, programming kernel modules
with netfilter, some content about routing, SDN, etc. The first half of the
course is largely individual and group assignments on these topics, and the
second half is one large group project. Groups are assigned by me for both
assignments and the project – not self-assigned.

In this instance, I used GENI in two different ways – first, specific
questions in some of the homework assignments were required to be done on
GENI, and later, GENI was specified as one of the three platforms that
students could use for the project. More detailed information about the
administration, including the specific assignments, is available for those
interested from http://dutta.csc.ncsu.edu/csc573_fall14/index.html. I guided
the students exclusively toward ExoGENI substrates, because experience from
previous semesters indicated that they were the most consistent in providing
slice behavior. (Some other substrates would show varying slice and stitching
behavior when tried with the same RSpec multiple times – this was confusing
to students.) We also used a methodology of designing/reserving through
Flukes and accessing the nodes separately by ssh, because it went well with
the rest of ExoGENI, and again presented a unique way for the students to
negotiate the authentication/authorization issues.

Before assigning the first homework, I had briefly covered GENI operations in
class. The first assignment had the students create GENI IDs, request to join
the ncsu_teaching project, reserve a slice with a simple four-node tandem
network, and then set up routing tables at each node to get a simple ping
through. Later homework assignments were more complex, until the final one
asked them to create a seven-node topology and use both OpenFlow and kernel
module programming to build and investigate the behavior of a firewall.

There were a total of 86 students, who were eventually grouped into 22
project teams; however, the class started with a somewhat larger number of
students who attempted the early assignments. There were the usual initial
problems: students complained of resources not being available, access never
working, very sluggish access, and other similar issues. Upon investigation,
most of these could be traced to misunderstandings about ssh key files; a
lack of appreciation of how much extra bandwidth it takes to push a GUI
through two sets of network connections (many of the students had no
suitable self-owned computing to access GENI from, and were using servers
from VCL, the NCSU computing cloud, to do so); not realizing that management
interfaces were visible to them and trying to use them for their slice
traffic; and so on. There were also some actual GENI issues – over this
period ExoGENI went through some ExoSM problems, which led the operators to
advocate that anybody not using cross-rack stitching should use the
rack-specific SMs rather than the ExoSM (contrary to what the webpages
typically said), and the format of the Flukes .properties file also changed,
which the TA had to scramble to communicate to all students. By far the
problem that had the biggest impact on the students was that resources were
not always available when needed – students would wait for hours and days
without requests being provisioned. We cannot be sure, but we believe these
represented real resource crunches, not an artifact or mistake of some kind.

When the time came to propose project ideas, I was somewhat surprised (after all
the complaints) that 12 out of the 22 teams picked GENI as their platform
outright, and another 7 listed it as one of the two platforms they were going
to use (3 of these eventually ended up working on GENI). While the teams had
varied success in their projects, I was glad to see that they had all
negotiated GENI well. Some of the projects were quite impressive. Most of
them would have been possible to execute in my department’s networking lab,
but it would not have been possible to support the same number and variety of
projects.

Each of the teams that used GENI as their project platform wrote up a short
assessment of the role of GENI in their project. A few of these are appended.
Most of them speak of pros as well as cons, but on the whole they confirm
that the availability of GENI enriched this course and produced learning
benefits that would not have been attainable without it.

Team Responses