Scalably Managing ExoGENI nodes using AWS tools

Introduction and Overview

In this post we will explore how to use Amazon AWS tools to scalably manage the infrastructure of your slices. These tools allow you to manage hybrid infrastructures consisting of EC2 instances and external nodes, in our case created in the ExoGENI testbed. They allow you to perform remote command executions on multiple nodes at once, inventory the state of the nodes – their software, network and other configurations – and perform custom tasks, all without differentiating between EC2 nodes and ExoGENI nodes. In fact, it is possible to use these tools on ExoGENI nodes alone, without any EC2 nodes involved. The management tasks can be performed from the EC2 web console, using the AWS CLI tools, or programmatically, with libraries like Boto. This tutorial concentrates on the web console and command-line tools only.

In this tutorial we will be using several AWS services: EC2, CloudFormation (Infrastructure as Code), IAM (Identity and Access Management) and SSM (Simple Systems Manager). A disclaimer: the IAM, SSM and CloudFormation services are included in EC2 pricing – you pay for the EC2 instances you start, for S3 storage space and sometimes for traffic. That means that if you are starting only ExoGENI instances, there should be no costs as long as you do not use S3 buckets.

Prerequisites:

The tutorial follows this workflow:

  1. Start up an AWS stack using CloudFormation. We will create a small ‘slice’ inside AWS with 3 instances
  2. We will demonstrate the use of the SSM Run Command on those instances
  3. We will start an ExoGENI slice, whose instances automatically join SSM
  4. We will demonstrate how to manage EC2 and ExoGENI instances together using the same tools

Starting EC2 stack using CloudFormation

We begin by downloading a CloudFormation template that starts the EC2 side of our experiment. Note that EC2 instances are not required in order to use the Run Command; however, in this tutorial we show both EC2 instances and ExoGENI instances.

In our case the stack consists of three hosts, one bastion host with a public IP address and two other hosts in different subnets that communicate with the outside world using a NAT gateway.

screen-shot-2017-01-05-at-3-09-05-pm

The stack can be started using the following command:

$ aws cloudformation create-stack --stack-name GENIStack --template-body file:///path/to/downloaded/geni-vpc.template --parameters ParameterKey=InstanceType,ParameterValue=t2.small ParameterKey=KeyName,ParameterValue=<Name of your SSH Key Pair> --capabilities CAPABILITY_IAM

There are several important parameters in this command we should discuss:

  • --stack-name GENIStack is the name you are giving this stack. All EC2 instances in the stack will be tagged with this name and you will be able to invoke remote commands on them based on it
  • --template-body must be a URL of the template you are starting. In this case it is a file on the local filesystem; it could also be an S3 object
  • --parameters ParameterKey=InstanceType,ParameterValue=t2.small ParameterKey=KeyName,ParameterValue=<Name of your SSH Key Pair> specifies several parameters as Key/Value tuples. In this case we specify that our EC2 instances will be t2.small and we name the SSH key to be used with them. The name should be visible as ‘Key Pair Name’ in the EC2 console under Network & Security/Key Pairs
  • --capabilities CAPABILITY_IAM is required because this template creates roles inside AWS IAM and therefore needs these special capabilities declared explicitly

While this command is executing you can check the progress of the stack either via AWS CloudFormation web console, or using the CLI:

$ aws cloudformation describe-stacks
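
If you only want the status of this particular stack, the generic --query (JMESPath) option of the AWS CLI can extract it, as in this sketch:

$ aws cloudformation describe-stacks --stack-name GENIStack --query 'Stacks[0].StackStatus' --output text

The stack is ready for the next steps once this reports CREATE_COMPLETE.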

While the stack is being created, let’s take a look at several elements of the stack template file that are critical to this tutorial. The template is a JSON file.

Each instance in this stack is associated with the RunCmdInstanceProfile, which is tied to the RunCmdRole IAM role granting instances in the stack the limited privileges needed to use the SSM service. This is the equivalent of ‘speaks-for’ in GENI:

 "RunCmdInstanceProfile": {
    "Type": "AWS::IAM::InstanceProfile",
    "Properties": {
      "Path": "/",
      "Roles": [ { "Ref": "RunCmdRole" } ]
    }
 },
 "RunCmdRole": {
    "Type": "AWS::IAM::Role",
    "Properties": {
       "AssumeRolePolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
          {
             "Sid": "",
             "Effect": "Allow",
             "Principal": {
             "Service": "ec2.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
       }
       ]
    },
    "Path": "/"
    }
 }

Each instance assumes the RunCmdInstanceProfile, and the role uses a policy, RunCmdPolicies, that allows SSM operations.
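
The policy itself is omitted from this post for brevity; purely as an illustration, a minimal resource of this kind might look like the sketch below (the real RunCmdPolicies in the template may scope the allowed actions differently):

 "RunCmdPolicies": {
    "Type": "AWS::IAM::Policy",
    "Properties": {
       "PolicyName": "RunCmdPolicies",
       "PolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
             {
                "Effect": "Allow",
                "Action": [ "ssm:*", "ec2messages:*" ],
                "Resource": "*"
             }
          ]
       },
       "Roles": [ { "Ref": "RunCmdRole" } ]
    }
 }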

Another important aspect is the startup script used by each instance, which installs the latest SSM agent and then restarts it at boot:

 "IamInstanceProfile": {
    "Ref": "RunCmdInstanceProfile"
    },
 "UserData": { "Fn::Base64" : { "Fn::Join" : ["", [
    "#!/bin/bash -xe\n",
    "cd /tmp\n",
    "echo ", {"Ref": "AWS::Region"}, " > region.txt\n",
    "curl https://amazon-ssm-",{"Ref": "AWS::Region"}, ".s3.amazonaws.com/latest/linux_amd64/amazon-ssm-agent.rpm -o amazon-ssm-agent.rpm\n",
    "sudo yum install -y amazon-ssm-agent.rpm\n",
    "sudo restart amazon-ssm-agent\n"
 ]]}}

Once the stack completes, you should see something like this (adjusted for your parameters):

$ aws cloudformation describe-stacks
STACKS 2017-01-05T16:05:42.658Z False arn:aws:cloudformation:us-east-1:621231197516:stack/GENIStack/cf1a7e80-d360-11e6-ae3f-503f23fb559a GENIStack CREATE_COMPLETE
CAPABILITIES CAPABILITY_IAM
OUTPUTS Primary private IP of host 2 Host2 Private IP 192.168.2.200
OUTPUTS Primary private IP of host 1 Host1 Private IP 192.168.1.26
OUTPUTS Primary public IP of gateway host EIP IP Address 34.196.53.116 on subnet subnet-02b4df2f
PARAMETERS KeyName MyKeys
PARAMETERS InstanceType t2.small

And in the CloudFormation console:

screen-shot-2017-01-05-at-11-22-11-am

In the EC2 console, when you go down to ‘Systems Manager Shared Resources’ and click on ‘Managed Instances’ you should see three EC2 instances belonging to the stack you just created lit up ‘green’:

screen-shot-2017-01-05-at-11-24-01-am

Notice that the console already offers you a way to run commands on them using the ‘Run Command’ button. The Run Command operation is based on a number of pre-existing JSON document templates (SSM Documents) that are selected to run a particular type of command. AWS classifies the documents as Windows- or Linux-compatible.

The full list of currently available documents can be viewed via the EC2 console in Systems Manager Shared Resources/Documents or via the AWS CLI:

$ aws ssm list-documents

You can click on the Run Command button and select an ‘SSM Document’ that serves as a template for the command. In this case we want to run the shell command ‘ifconfig -a’, so select the ‘AWS-RunShellScript’ document and fill out the form (select all instances and enter ‘ifconfig -a’ in the command field). Run the command. SSM issues a GUID corresponding to this command, and you can inspect the output by clicking on the GUID and looking at the command output for each instance. SSM is asynchronous, so you need to wait for the command to complete on individual instances to see their output.

AWS keeps the full history of Run Command invocations; previous invocations can be explored in the EC2 console under ‘Systems Manager Services/Run Command’, with commands listed by date and GUID.

screen-shot-2017-01-05-at-11-36-23-am

We can achieve the same results from the AWS CLI by doing the following:

$ aws ssm send-command --instance-ids i-0b387c665628f5f9b i-02e2a213adfa03bab i-0e6edec45f97ede23 --document-name "AWS-RunShellScript" --comment "IP config" --parameters commands=ifconfig --output text

The above command explicitly names EC2 instances on which the command needs to be executed. Alternatively you can use this form:

$ aws ssm send-command --targets "Key=tag:aws:cloudformation:stack-name,Values=GENIStack" --document-name "AWS-RunShellScript" --comment "IP config" --parameters commands=ifconfig --output text

Notice that in this case we match instances by the name of the CloudFormation stack we gave above.

The output of the command can be examined using

$ aws ssm list-command-invocations --command-id <guid of the command invocation returned by the previous command> --details
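
To pull out just the output text of each invocation, a JMESPath query can be added, as in this sketch:

$ aws ssm list-command-invocations --command-id <guid of the command invocation> --details --query 'CommandInvocations[*].CommandPlugins[*].Output' --output text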

There are other commands in the aws ssm toolset; feel free to explore them (run ‘aws ssm help‘).

Starting ExoGENI slice with instances connected to AWS SSM

In this section we will add ExoGENI instances to the list of instances managed via SSM. Before you start the slice you must create a special kind of credential for your instances to be able to talk to AWS SSM.

We begin by creating a new role we will call SSMServiceRole to provide SSM credentials to hybrid (non-EC2) instances. First we must create a JSON trust file that allows principals to assume that role (cut and paste the contents and call it SSMService-Trust.json):

{
   "Version": "2012-10-17",
   "Statement": {
      "Effect": "Allow",
      "Principal": { "Service": "ssm.amazonaws.com" },
      "Action": "sts:AssumeRole"
   }
}

Use the file to create the role:

$ aws iam create-role --role-name SSMServiceRole --assume-role-policy-document file://SSMService-Trust.json

Associate the standard (managed) AWS policy AmazonEC2RoleforSSM, which allows SSM operations, with this role:

$ aws iam attach-role-policy --role-name SSMServiceRole --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM

If you were paying attention, you might ask “Didn’t we create a similar role with the template for EC2 instances?” and you’d be right. However, at present it doesn’t appear possible to use the role created as part of a CloudFormation stack outside the stack.

You can inspect existing roles in your account by using the AWS IAM web console and clicking on ‘Roles’ or by executing a CLI command:

$ aws iam list-roles

In this case the command should show two roles – one created by CloudFormation, and the other the one we just created.

Now we must create temporary tokens for the SSM agent on your ExoGENI instances to access the SSM service. The key word is ‘temporary’: they have an expiration date past which the instances will no longer be able to communicate with SSM. Because of that, it is critical that there is only minimal time skew between your ExoGENI instances and AWS (more on that below). Each token is associated with a number of ‘registrations’ – nodes in your slice that can use SSM (default is 1) – and an expiration date (default 24 hours). We create a new activation for our slice:

$ aws ssm create-activation --default-instance-name MyXoServers --iam-role SSMServiceRole --registration-limit 10 --expiration-date 2017-01-10T20:30:00.000Z

Examining parameters above:

  • Default instance name will be a string by which your instances are known in the SSM (each will also be issued a unique instance identifier)
  • We must include the role we defined above in the activation
  • Define the max number of nodes you plan to have in your ExoGENI slice by registration limit
  • Define the expiration date (in this case using UTC)

The command returns two strings: the first is a code (20 characters), the second a registration ID. Both are needed by the SSM agent in your ExoGENI instances to authenticate to SSM.
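
If you are scripting this step, both values can be captured in a single call using the --query option (a sketch; the fields returned by create-activation are ActivationCode and ActivationId):

$ read AWSCODE AWSREGID <<< $(aws ssm create-activation --default-instance-name MyXoServers --iam-role SSMServiceRole --registration-limit 10 --expiration-date 2017-01-10T20:30:00.000Z --query '[ActivationCode,ActivationId]' --output text)

Now we’re ready to start the ExoGENI slice.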

When starting an ExoGENI slice, you can use any topology on any rack or controller, as long as you include the following post-boot script for each instance you want managed via AWS SSM. Note that this script is for CentOS 6.x images; you may need to adapt it to Debian derivatives and systemctl-based RedHat-like distributions (a sketch of the needed changes follows the script explanation below).

#!/bin/bash

SSMDIR=/tmp/ssm
AWSREGID="<provide the activation registration guid>"
AWSCODE="<provide the activation code>"
AWSREGION=us-east-1
NTPSERVER=clock1.unc.edu

# NO NEED TO EDIT BELOW FOR CentOS 6.x

ntpdate ${NTPSERVER} > /dev/null
/etc/init.d/ntpd restart 
mkdir ${SSMDIR}
curl https://amazon-ssm-${AWSREGION}.s3.amazonaws.com/latest/linux_amd64/amazon-ssm-agent.rpm -o ${SSMDIR}/amazon-ssm-agent.rpm
yum install -y ${SSMDIR}/amazon-ssm-agent.rpm 
stop amazon-ssm-agent
amazon-ssm-agent -register -id ${AWSREGID} -code ${AWSCODE} -region ${AWSREGION}
start amazon-ssm-agent

This script downloads the SSM agent on boot, provides it with the credentials to the AWS SSM service acquired in the previous step and restarts it. Notice the invocation of ntpdate – it is critical for the operation of SSM that the clocks on the instances are reasonably accurate. If there is significant clock skew, the agent on the instance will fail to connect to SSM.
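
The script above relies on CentOS 6.x upstart commands (stop/start) and an RPM package. As a hypothetical sketch of the changes needed elsewhere (untested; in particular, verify the Debian package URL for your region before relying on it):

# Debian derivatives (assumed .deb path in the same regional bucket):
curl https://amazon-ssm-${AWSREGION}.s3.amazonaws.com/latest/debian_amd64/amazon-ssm-agent.deb -o ${SSMDIR}/amazon-ssm-agent.deb
dpkg -i ${SSMDIR}/amazon-ssm-agent.deb

# systemctl-based distributions (e.g. CentOS 7): replace the stop/start lines with
systemctl stop amazon-ssm-agent
systemctl start amazon-ssm-agent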

Define a slice topology in Flukes and be sure to cut and paste a modified version of the post-boot script above into each node you intend to manage via AWS. Notice that in addition to the code and registration ID, you may need to modify the AWS region, depending on the settings in your AWS account.

You can watch your slice come up in Flukes, but also use the EC2 Systems Manager Shared Resources/Managed Instances console to see managed instances in your slice come up and go green. Note that node names given in Flukes show up in the console as ‘Computer Name’; each ExoGENI instance also receives a unique instance ID starting with ‘mi-’. Finally, note that the IP address reported for all instances (EC2 and ExoGENI) is the private address assigned to the management interface eth0.

screen-shot-2017-01-05-at-1-52-37-pm

Using Tools to Manage the Hybrid Infrastructure

Now that we have a ‘slice’ of EC2 and an ExoGENI slice that respond to AWS management tools, we can demonstrate some of the capabilities.

First off, just as in the examples above for EC2, we can issue arbitrary commands to multiple instances in a scalable fashion, but now we can use the AWS-issued instance IDs to name our ExoGENI instances as well. First we list all managed instances using the AWS CLI:

$ aws ssm describe-instance-information
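
A compact overview of the whole fleet can be produced with a JMESPath query, for example (a sketch; PingStatus shows whether the SSM agent on an instance is currently reachable):

$ aws ssm describe-instance-information --query 'InstanceInformationList[*].[InstanceId,ComputerName,PingStatus]' --output table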

Using the instance IDs reported above, we can craft a command to send to all instances:

$ aws ssm send-command --instance-ids <space separated list of instance ids from EC2 and your slice> --document-name "AWS-RunShellScript" --comment "IP config" --parameters commands=ifconfig --output text

You can check the status of the invocations (whether they completed successfully):

$ aws ssm list-command-invocations --command-id <guid of command id returned by previous command>

If you want to see the output, add --details to the previous command. You can also run

$ aws ssm get-command-invocation --command-id <guid of command id> --instance-id <id of the instance>

to inspect the status of individual invocations on nodes.

We can also take inventory (software, network configuration) of the nodes and have it refresh periodically. The inventory can be viewed in the web console and saved into an S3 bucket (costs will apply). This can be done from the console by clicking the ‘Setup Inventory’ button in the managed instances list; here we demonstrate doing it via the CLI. Unlike the per-command invocations shown above, inventory requires creating an association between an SSM inventory document and the instances, with a cron schedule so it periodically refreshes its content:

$ aws ssm create-association --name AWS-GatherSoftwareInventory --targets "Key=instanceids,Values=<comma separated list of instance ids>" --schedule-expression "cron(0 0/30 * 1/1 * ? *)" --parameters networkConfig=Enabled,windowsUpdates=Disabled,applications=Enabled

This step takes a while (10 minutes or more) to complete; you can see the state of the association in the EC2 console under Managed Instances (by clicking on the instance). Once it completes, the inventory becomes available to view.

You can also see the state of existing associations by executing

$ aws ssm list-associations

Note that each association has a unique GUID, which can be used to query for the state of the association:

$ aws ssm describe-association --association-id <association guid>

After the association completes successfully we can query the inventory of the nodes:

$ aws ssm list-inventory-entries --instance-id <one of instance ids above> --type-name <inventory type>

The inventory type name is one of the following strings:

  • AWS:Application – lists installed packages
  • AWS:Network – lists interface configuration
  • AWS:AWSComponent – lists installed AWS components on the instance (typically SSM agent)
  • Other types are Windows-specific.
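
For example, to view the network configuration collected from one of the ExoGENI instances (the ‘mi-’ instance ID below is a placeholder):

$ aws ssm list-inventory-entries --instance-id mi-0123456789abcdef0 --type-name "AWS:Network"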

Conclusion

This tutorial demonstrated how to use AWS remote management tools to jointly manage EC2 and ExoGENI instances. Some of this functionality, particularly the remote execution, can be achieved in other ways; however, the AWS approach offers several advantages:

  • Its asynchronous, event-driven nature makes it significantly more scalable than the typically serial execution of commands via remote shell (though you can use psh to speed things up)
  • Historical information about commands is saved in AWS for review, providing an experiment progress log and aiding repeatability
  • Comprehensive software inventory per instance (with the option of keeping history)
  • A programmatic API is available for scripting, e.g. via Boto

This concludes the tutorial; the two following sections offer suggestions on troubleshooting and next steps.

Troubleshooting

  • SSM agent behavior on the instances is logged under /var/log/amazon/ssm
  • If you run out of activations, or your activation for the SSM agent in ExoGENI nodes expires, you can create a new activation and configure the SSM agent on each node with the new credentials, following the flow of the ExoGENI post-boot script above.
  • If you get stuck being unable to specify a particular CLI parameter, check this page.

Things to explore further

  • Programmatic API implementations, like Boto
  • Implementing new SSM command documents specific to your experiment

SARNET Demonstrations at SuperComputing 16

This post comes to us courtesy of Prof. Cees de Laat and his team from the University of Amsterdam.

SARNET, Secure Autonomous Response NETworks, is a project funded by the
Dutch Research Foundation [1]. The University of Amsterdam, TNO, KLM, and
Ciena conduct research on automated methods against attacks on computer
network infrastructures. By using the latest techniques in Software
Defined Networking and Network Function Virtualization, a SARNET can use
advanced methods to defend against cyber-attacks and return the network
to its normal state. The research goal of SARNET is to obtain the
knowledge to create ICT systems that: 1) model the system’s state based
on the emerging behavior of its components; 2) discover by observations
and reasoning if and how an attack is developing and calculate the
associated risks; 3) have the knowledge to calculate the effect of
countermeasures on states and their risks; 4) choose and execute the
most effective countermeasure.
Similar to the SC15 demonstration [2],[3], we showed an interactive touch-table-based
demonstration that controls a Software Defined Network running on the ExoGENI
infrastructure.

sarnet-poster-sc16

In the SC16 demo [4] the visitor selects the attack type and its origin, and the
system responds and defends against the attack autonomously. The
response includes the use of security VNFs that are deployed on
ExoGENI infrastructure when required for analysis or mitigation;
the underlying Software Defined Network routes the attack traffic to the
VNF for analysis or mitigation. The demo showed how Network Function
Virtualization and Software Defined Networks can be useful in attack
mitigation and how they can be used effectively in setting up autonomous
responses to higher-layer attacks.

[1] SARNET project page: http://www.delaat.net/sarnet
[2] SARNET demonstration at SC15 – http://sc.delaat.net/sc15/SARNET.html
[3] R. Koning et al., “Interactive analysis of SDN-driven defence
against distributed denial of service attacks,” in 2016 IEEE NetSoft
Conference and Workshops (NetSoft), IEEE, 2016, pp. 483–488.
[4] SARNET demonstration at SC16 – http://sc.delaat.net/

Using ExoGENI Slice Stitching capabilities

This blog entry shows how to stitch slices belonging to potentially different users together. The video demonstrates the workflow and a short discussion below outlines limitations of the current implementation.

Several API operations have been introduced to support slice-to-slice stitching:

  1. permitSliceStitch – informs the controller that the owner of this reservation (node or VLAN, see below) is allowing stitching of other slices to this reservation using a specific password
  2. revokeSliceStitch – the inverse of permit, removes permission to stitch to any other slice. Existing stitches are not affected.
  3. performSliceStitch – stitch a reservation in one slice to a reservation in another slice using a password set by permitSliceStitch
  4. undoSliceStitch – undo an existing stitch between two reservations
  5. viewStitchProperties – inspect whether a given reservation allows stitching and whether there are active or inactive stitches from other slices
    1. provides information such as the slice stitched to, the reservation stitched to, the DN identifier of the owner of the other slice, when the stitch was performed and, if unstitched, when it was undone.

Caveats and useful facts:

  • Slices that you are attempting to stitch must be created by the same controller
  • The reservations that you are trying to stitch together must be on the same aggregate/rack
  • You cannot stitch pieces of the same slice to itself
  • Only stitching of compute nodes (VMs and bare metal) to VLANs/links is allowed. This will not work for shared VLANs or links connecting to storage (those aren’t real reservations in the ORCA sense)
  • Stitching is asymmetric (one side issues permit, the other side performs the stitch)
  • Unstitching is symmetric (either side can unstitch from the other, password is not required)
  • Each stitching operation has a unique guid and this is how it is known in the system
  • An inactive stitch is distinguished from an active stitch by the presence of the ‘undone‘ property, indicating the date/time in RFC3339 format when the unstitching operation was performed
  • Passwords (bearer tokens) used to authorize stitching operation are stored according to best security practices (salted and transformed). They are meant to be communicated between slice owners out-of-band (phone/email/SMS/IM/pigeons).
  • Stitching to point-to-point links is allowed; however, keep in mind that if you used automatic IP assignment, the two nodes that are the endpoints of the link will have a /30 netmask, which means that if you stitch more nodes into that link, they won’t be able to communicate with the existing nodes until you set a common broader netmask on all nodes (see the sketch after this list)
    • Note that ORCA enforces IP address assignment via the guest-side neucad tool running in the VM. If you are reassigning an IP address manually on the node, remember to kill that daemon first, otherwise it will overwrite your changes.
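
For example, to widen the netmask on a node stitched into such a link, one might do the following on each node (a hypothetical sketch; the interface name and address are placeholders for your slice’s dataplane configuration):

$ pkill neucad
$ ifconfig eth1 172.16.0.1 netmask 255.255.255.0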

Known Limitations:

  • No NDL manifest support. Stitching is not visible in the manifest and can only be deduced by querying a node or a link for its stitching properties
    • Consequently no RSpec manifest support
  • No GENI API support

Using ExoGENI slice modify/dynamic slice capabilities

This short post demonstrates in video form how to use the Flukes GUI to drive ExoGENI slice modify capabilities.

There are a few items that are worth remembering when using ExoGENI slice modify:

  • The initial slice doesn’t need to be bound; however, for all follow-on modify operations you must bind slice elements explicitly
  • Slice modify is a ‘batch’ operation – you can accumulate some number of modifications and then click ‘Submit Changes’ to realize them. For everyone’s sanity it is best to keep the number of changes relatively small in each step.
    • Corollary: Simply clicking ‘Delete’ on node or link does not immediately remove it. You still have to click ‘Submit Changes’ to make it happen.
  • Modifying multi-point inter-domain links is not allowed. You can create a new multi-point link or delete an existing one, but you cannot change the degree of an existing link for the time being
  • It is not possible to link a new inter-domain path to an existing broadcast link
  • When deleting paths across multiple domains please delete not just the ION/AL2S crossconnect, but the two neighboring crossconnects belonging to each rack, as shown in this figure:

modify-delete-1

ExoGENI used for predicting storm landfall and impact

Recent collaborations between RENCI and the Global Environment for Network Innovations (GENI) have enabled ADCIRC-based storm surge and wave simulations to access GENI’s federated network and computational resources. The collaboration has resulted in a scientific workflow that controls the execution of an ensemble of simulations executing across the GENI federated infrastructure and predicting storm surge with unprecedented detail.

More details can be found in this NSF item and in this US Ignite link.

Screen Shot 2015-12-12 at 9.12.10 PM

Using Docker in ExoGENI

Overview

This brief post explains how to use Docker in ExoGENI. The image built for this post is posted in the ExoGENI Image Registry and also available in Flukes.

Name: Docker-v0.1
URL: http://geni-images.renci.org/images/ibaldin/docker/centos6.6-docker-v0.1/centos6.6-docker-v0.1.xml
Hash: b2262a8858c9c200f9f43d767e7727a152a02248

This is a CentOS 6.6-based image with a Docker install on top. Note that this post is not meant to serve as a primer on Docker. For this you should consult Docker documentation.

Theory of operation

Docker is a platform to configure, ship and run applications. It uses thin containers to isolate applications from each other. The Docker daemon uses the devicemapper storage backend to manage its disk space. By default Docker creates a sparse 100G file for data, and each container can take up to 10G of disk space – clearly too much to fit inside a VM, should the containers try to fill that space up.

To ensure this doesn’t happen, the ExoGENI Docker image limits each container to 5G and the overall space given to Docker in its sparse file to 20G. These settings are adjusted by editing a line in /etc/sysconfig/docker:

other_args="--storage-opt dm.basesize=5G --storage-opt dm.loopdatasize=20G"

If you wish to resize the amount of space available to your Docker, please edit this line accordingly and then do the following:

$ service docker stop
$ cd /var/lib
$ rm -rf docker
$ service docker start
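
You can verify that the new limits took effect by inspecting the devicemapper figures reported by the daemon (a sketch; the exact field names vary across Docker versions):

$ docker info | grep -i space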

Please note that wiping out the /var/lib/docker directory as shown above will wipe out all images and containers you may have created so far. If you wish to save the image you created, please do

$ docker save -o image.tar <image name>

and save each image this way. Once Docker’s disk space has been resized and the service restarted, you can reload the images using the docker load command.
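
For example, assuming an image was saved as shown above:

$ docker load -i image.tar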

Using the Docker image

You can simply boot some number of instances with the Docker image listed above and start loading containers from Docker Hub or creating your own.

We recommend using the larger VM sizes, like XOLarge and XOExtraLarge to make sure you don’t run out of disk space.

Using perfSonar in ExoGENI

Overview

Special thanks go to Brian Tierney of LBL/ESnet for his help in creating the perfSonar image.

This post describes how to use a perfSonar image in ExoGENI slices. The image built for this blog post is now posted in the ExoGENI Image Registry and available in Flukes.

Name: psImage-v0.3
URL: http://geni-images.renci.org/images/ibaldin/perfSonar/psImage-v0.3/psImage-v0.3.xml
Hash: e45a2c809729c1eb38cf58c4bff235510da7fde5

Note that this is a Level 2 perfSonar image, built from a CentOS 6.6 base image with a modified ps_light Docker container from ESnet. However, registration with the perfSonar lookup service is disabled in this image.

Theory of operation

The perfSonar image uses Docker technology to deploy its components. The following elements are included in the image:

  • Client programs for nuttcp, iperf, iperf3, bwctl and owamp included as simple RPMs accessible by all users
  • Server programs for bwctl and owamp running inside a Docker

The image starts Docker on boot, loads the needed Docker images and automatically launches the ‘ps_light_xo’ Docker with the server programs in it.

-bash-4.1# docker ps
CONTAINER ID        IMAGE                COMMAND                CREATED             STATUS              PORTS               NAMES
ba28266c1aec        ps_light_xo:latest   "/bin/sh -c '/usr/bi   6 minutes ago       Up 6 minutes                            suspicious_lovelace  

Under normal operation the user should not have to interact with the server programs – the Docker container runs in net=host mode and the server programs listen on all the interfaces the VM may have. However, if needed, the user can gain access to the container running the server programs using the following command:

$ docker exec -ti <guid> /bin/bash

where ‘<guid>’ refers to the automatically started Docker. You can find out the guid by issuing this command:

$ docker ps

Using the image

You can create a topology using the perfSonar image (listed in the Image Registry and above) and then run the client programs on some nodes against the server programs on other nodes. Since the image has both client and server programs, measurements can be done in any direction as long as IP connectivity is assured.

Once the slice has booted try a few client programs:

-bash-4.1# owping 172.16.0.2
Approximately 13.0 seconds until results available

--- owping statistics from [172.16.0.1]:8852 to [172.16.0.2]:8966 ---
SID:	ac100002d8c6ba8674af285470d65b0b
first:	2015-04-01T14:42:15.627
last:	2015-04-01T14:42:25.314
100 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = -0.496/-0.4/-0.144 ms, (err=3.9 ms)
one-way jitter = 0.2 ms (P95-P50)
TTL not reported
no reordering


--- owping statistics from [172.16.0.2]:8938 to [172.16.0.1]:8954 ---
SID:	ac100001d8c6ba867d50999ce0a1166f
first:	2015-04-01T14:42:15.553
last:	2015-04-01T14:42:24.823
100 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 1.09/1.3/1.5 ms, (err=3.9 ms)
one-way jitter = 0.2 ms (P95-P50)
TTL not reported
no reordering

or

-bash-4.1# bwctl -c 172.16.0.2
bwctl: Using tool: iperf
bwctl: 16 seconds until test results available

RECEIVER START
------------------------------------------------------------
Server listening on TCP port 5578
Binding to local address 172.16.0.2
TCP window size: 87380 Byte (default)
------------------------------------------------------------
[ 15] local 172.16.0.2 port 5578 connected with 172.16.0.1 port 59083
[ ID] Interval       Transfer     Bandwidth
[ 15]  0.0-10.0 sec  1356333056 Bytes  1081206753 bits/sec
[ 15] MSS size 1448 bytes (MTU 1500 bytes, ethernet)

RECEIVER END

Things to note

OWAMP in particular is sensitive to accurate timekeeping, which is why the VMs come packaged with ntpd started on boot. However, this does not solve all problems. Measuring jitter in a VM may produce unpredictable results due to the VM sharing cores with other VMs on the same worker node. While in ExoGENI we do not oversubscribe cores, we also do not (yet) do any core pinning when placing VMs inside workers, which means timing artifacts may occur when VMs switch cores.
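
Before trusting latency results, it is worth verifying that ntpd on each node has actually synchronized, for example:

$ ntpq -p

An asterisk in the first column marks the peer the daemon is currently synchronized to.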

The end result is that while jitter measurements using OWAMP may have high resolution, their accuracy should be questioned. To improve accuracy, try using larger instance sizes, like XOLarge and XOExtraLarge.