The following items from the National Science Foundation describe recent experiments in running storm surge modeling software on ExoGENI:
This brief post explains how to use Docker in ExoGENI. The image built for this post is posted in ExoGENI Image Registry and also available in Flukes.
Name: Docker-v0.1
URL: http://geni-images.renci.org/images/ibaldin/docker/centos6.6-docker-v0.1/centos6.6-docker-v0.1.xml
Hash: b2262a8858c9c200f9f43d767e7727a152a02248
This is a CentOS 6.6-based image with a Docker install on top. Note that this post is not meant to serve as a primer on Docker. For this you should consult Docker documentation.
Theory of operation
Docker is a platform to configure, ship and run applications. It uses thin containers to isolate applications from each other. The Docker daemon uses the devicemapper storage driver to manage its disk space. By default Docker creates a sparse 100G file for data, and each container can take up to 10G of disk space, which is clearly too much for a VM should a container try to fill it.
To ensure this doesn’t happen the ExoGENI Docker VM sizes each docker not to exceed 5G and the overall space given to Docker in its sparse file is limited to 20G. This setting is adjusted by editing a line in /etc/sysconfig/docker:
other_args="--storage-opt dm.basesize=5G --storage-opt dm.loopdatasize=20G"
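As a minimal sketch of that edit, the function below rewrites the other_args line without opening an editor. The name set_docker_storage and its arguments are hypothetical (nothing like it ships with the image), and it assumes the file already contains an other_args= line and that the sizes contain no '|' or '&' characters.

```shell
#!/bin/bash
# Hypothetical helper (not part of the ExoGENI image): rewrite the other_args
# line of a Docker sysconfig file with new per-container and total sizes.
# Usage: set_docker_storage <config file> <basesize> <loopdatasize>
set_docker_storage() {
    local conf="$1" base="$2" total="$3" tmp
    tmp=$(mktemp) || return 1
    # Replace the whole other_args= line; every other line is copied unchanged.
    sed "s|^other_args=.*|other_args=\"--storage-opt dm.basesize=${base} --storage-opt dm.loopdatasize=${total}\"|" "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"   # write back in place so ownership and mode are kept
    local rc=$?
    rm -f "$tmp"
    return $rc
}
```

Run as root, something like `set_docker_storage /etc/sysconfig/docker 8G 40G` would raise both limits; the new sizes only take effect after Docker's storage is recreated.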
If you wish to resize the amount of space available to your Docker, please edit this line accordingly and then do the following:
$ service docker stop
$ cd /var/lib
$ rm -rf docker
$ service docker start
Please note that wiping out the /var/lib/docker directory as shown above will wipe out all images and containers you may have created so far. If you wish to save the image you created, please do
$ docker save -o image.tar <image name>
and repeat this for each image, keeping the tar files outside /var/lib/docker. Once the Docker disk space is resized and Docker is restarted, you can reload the images using the docker load command.
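If there are several images, the save and reload steps can be scripted. The following is only a sketch: the function names and the backup directory are hypothetical, and it assumes the usual 'docker images' layout with the repository and tag in the first two columns.

```shell
#!/bin/bash
# Hypothetical helpers for backing up images before wiping /var/lib/docker.

# Turn an image name such as "repo/name:tag" into a safe file name.
image_to_tarball() {
    printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

# Save every tagged image into the given directory, one tar file per image.
save_all_images() {
    local dir="$1" repo tag rest
    mkdir -p "$dir"
    # Skip the header line of 'docker images', then read repository and tag.
    docker images | tail -n +2 | while read -r repo tag rest; do
        [ "$repo" = "<none>" ] && continue   # untagged images have no name to save by
        docker save -o "$dir/$(image_to_tarball "$repo:$tag")" "$repo:$tag"
    done
}

# Load every tar file found in the directory back into Docker.
load_all_images() {
    local dir="$1" f
    for f in "$dir"/*.tar; do
        docker load -i "$f"
    done
}
```

A typical sequence would be `save_all_images /root/docker-backup` before stopping Docker and `load_all_images /root/docker-backup` after it has been restarted; the directory must not live under /var/lib/docker.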
Using the Docker image
You can simply boot some number of instances with the Docker image listed above and start loading Dockers from Docker Hub or creating your own.
We recommend using the larger VM sizes, like XOLarge and XOExtraLarge to make sure you don’t run out of disk space.
Special thanks go to Brian Tierney of LBL/ESnet for his help in creating the perfSonar image.
This post describes how to use a perfSonar image in ExoGENI slices. The image built for this blog post is now posted in the ExoGENI Image Registry and available in Flukes.
Name: psImage-v0.3
URL: http://geni-images.renci.org/images/ibaldin/perfSonar/psImage-v0.3/psImage-v0.3.xml
Hash: e45a2c809729c1eb38cf58c4bff235510da7fde5
Note that this is a Level 2 perfSonar image built on a CentOS 6.6 base image with a modified ps_light Docker from ESnet. However, registration with the perfSonar lookup service is disabled in this image.
Theory of operation
The perfSonar image uses Docker technology to deploy its components. The following elements are included in the image:
- Client programs for nuttcp, iperf, iperf3, bwctl and owamp included as simple RPMs accessible by all users
- Server programs for bwctl and owamp running inside a Docker
The image starts Docker on boot, loads the needed Docker images and automatically launches the ‘ps_light_xo’ Docker with the server programs in it.
-bash-4.1# docker ps
CONTAINER ID        IMAGE                COMMAND                CREATED             STATUS              PORTS               NAMES
ba28266c1aec        ps_light_xo:latest   "/bin/sh -c '/usr/bi   6 minutes ago       Up 6 minutes                            suspicious_lovelace
Under normal operation the user should not have to interact with the server programs – the Docker is running in net host mode and the server programs listen on all the interfaces the VM may have. However, if needed, the user can gain access to the Docker with server programs using the following command:
$ docker exec -ti <guid> /bin/bash
where ‘<guid>’ refers to the automatically started Docker. You can find out the guid by issuing this command:
$ docker ps
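To avoid copying the guid by hand, the two commands can be combined. This is a sketch only: container_id_for_image is a hypothetical helper, and it assumes the image name appears in the second column of 'docker ps' output, as in the listing above.

```shell
#!/bin/bash
# Hypothetical helper: read 'docker ps' output on stdin and print the ID of
# the first container whose IMAGE column starts with the given image name.
container_id_for_image() {
    # NR > 1 skips the header line; index(...) == 1 means "starts with".
    awk -v img="$1" 'NR > 1 && index($2, img) == 1 { print $1; exit }'
}
```

With this defined, `docker exec -ti "$(docker ps | container_id_for_image ps_light_xo)" /bin/bash` opens a shell in the automatically started Docker.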
Using the image
You can create a topology using the perfSonar image (listed in the Image Registry and above) and then run the client programs on some nodes against the server programs on other nodes. Since the image has both client and server programs, measurements can be done in any direction as long as IP connectivity is assured.
Once the slice has booted try a few client programs:
-bash-4.1# owping 172.16.0.2
Approximately 13.0 seconds until results available
--- owping statistics from [172.16.0.1]:8852 to [172.16.0.2]:8966 ---
SID: ac100002d8c6ba8674af285470d65b0b
first: 2015-04-01T14:42:15.627
last: 2015-04-01T14:42:25.314
100 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = -0.496/-0.4/-0.144 ms, (err=3.9 ms)
one-way jitter = 0.2 ms (P95-P50)
TTL not reported
no reordering
--- owping statistics from [172.16.0.2]:8938 to [172.16.0.1]:8954 ---
SID: ac100001d8c6ba867d50999ce0a1166f
first: 2015-04-01T14:42:15.553
last: 2015-04-01T14:42:24.823
100 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 1.09/1.3/1.5 ms, (err=3.9 ms)
one-way jitter = 0.2 ms (P95-P50)
TTL not reported
no reordering
-bash-4.1# bwctl -c 172.16.0.2
bwctl: Using tool: iperf
bwctl: 16 seconds until test results available
RECEIVER START
------------------------------------------------------------
Server listening on TCP port 5578
Binding to local address 172.16.0.2
TCP window size: 87380 Byte (default)
------------------------------------------------------------
[ 15] local 172.16.0.2 port 5578 connected with 172.16.0.1 port 59083
[ ID] Interval Transfer Bandwidth
[ 15] 0.0-10.0 sec 1356333056 Bytes 1081206753 bits/sec
[ 15] MSS size 1448 bytes (MTU 1500 bytes, ethernet)
RECEIVER END
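When repeating measurements it can help to pull just the delay figures out of the owping output. The function below is a sketch under the assumption that the summary line keeps the "one-way delay min/median/max = a/b/c ms" form shown above; owping_delays is a hypothetical name, not a perfSonar tool.

```shell
#!/bin/bash
# Hypothetical helper: read owping output on stdin and print, for each
# direction, the "min median max" one-way delay in milliseconds.
owping_delays() {
    awk '/one-way delay min\/median\/max/ {
        # The three values sit in the field right after the "=" sign.
        for (i = 1; i < NF; i++) {
            if ($i == "=") {
                split($(i + 1), d, "/")
                print d[1], d[2], d[3]
            }
        }
    }'
}
```

For example, `owping 172.16.0.2 | owping_delays` prints one line per direction. A negative delay, as in the first direction above, points to a clock offset between the two VMs rather than to a real negative latency.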
Things to note
OWAMP in particular is sensitive to accurate time measurements, which is why the VMs come packaged with ntpd that starts on boot. However, this does not solve all the problems. Measuring jitter in a VM may produce unpredictable results because the VM shares cores with other VMs on the same worker node. While in ExoGENI we do not oversubscribe cores, we also do not (yet) do any core pinning when placing VMs inside workers, which means time artifacts may occur when VMs switch cores.
The end result is that while jitter measurements using OWAMP may have high resolution, their accuracy should be questioned. To improve the accuracy try using larger instance sizes, like XOLarge and XOExtraLarge.
This press release from the National Science Foundation highlights the research by energy researchers from NCSU in using ExoGENI:
Author: Prof. Rudra Dutta of NCSU Computer Science
I have been using GENI in some form in my teaching for the last three years –
very tentatively to start with, but more extensively over time. Fall, 2014
was my most ambitious use so far.
The course I was teaching was Internet Protocols – a graduate level course on
networking that assumes at least one previous general networking course.
I assume basic working knowledge of networking, TCP/IP, socket programming,
and general knowledge of common Internet functionality such as HTTP, DNS etc.
After a quick refresher of some of these topics, the course dives into details of
the forwarding engine, QoS issues in forwarding, programming kernel modules
with netfilter, some content about routing, SDN, etc. The first half of the
course is largely individual and group assignments on these topics, and the
second half is one large group project. Groups are assigned by myself for both
assignments and project – not self-assigned.
In this instance, I used GENI in two different ways – first, specific
questions out of some of the homework assignments were required to be done on
GENI, and later, GENI was specified as one of the three platforms that students
could use for the project. More detailed information about the administration,
including the specific assignments, is available for those interested from
http://dutta.csc.ncsu.edu/csc573_fall14/index.html. I exclusively guided the
students into using ExoGENI substrates, because experience from previous
semesters indicated that it was the most nearly consistent in providing
slice behaviors. (Some other substrates would have varying slice and
stitching behavior, when trying with the same RSPEC multiple times – this was
confusing to students.) We also used a methodology of designing/reserving
through Flukes and accessing separately by ssh, because it went well with the
rest of ExoGENI, and again presented a unique way to negotiate the
authentication/authorization issues for the students.
Before assigning the first homework, I had briefly covered GENI operations in
class. The first assignment actually had them create GENI IDs, request joining
the ncsu_teaching project, and they ended up simply being able to reserve a
slice with a simple four-node tandem network, then setting up routing tables at
each node to get a simple ping through. Later homework assignments were more
complex, until the final one asked them to create a seven-node topology and use
both OpenFlow and kernel module programming to build and investigate the
behavior of a firewall.
There were a total of 86 students, who were eventually
grouped into 22 project teams; however, the class started with a somewhat
larger number of students who attempted the early assignments. There were the
usual initial problems; students complained of resources not being available,
access never working, very sluggish access, and other similar issues. Upon
investigation most of these could be traced to misunderstandings about ssh
key files, lack of appreciation of how much extra bandwidth it requires to
push a GUI through two sets of networking connections (many of the
students had no suitable self-owned computing to access GENI from, and were
using servers from VCL, the NCSU computing cloud, to do so), not realizing that
management interfaces were visible to them and trying to use them for their
slice usage, etc. There were also some actual GENI issues – over this period
ExoGENI went through some ExoSM problems, which caused them to advocate that
anybody not using cross-rack stitching should use specific SMs rather than
ExoSM (contrary to what the webpages typically said), and also changed the
format of the Flukes .properties file, which the TA had to scramble to get
communicated to all students. By far the problem that had the biggest impact
on the students was that resources were not always available when needed –
students would wait for hours and days without requests being provisioned.
We cannot be sure but believe that these represent real resource crunches, not
an artifact or mistake of some kind.
When time came to propose project ideas, I was somewhat surprised (after all
the complaints) that 12 out of the 22 teams picked GENI as their platform
outright, and another 7 listed it as one of the two platforms they were going
to use (3 of these eventually ended up working on GENI). While the teams had
varied success in their projects, I was glad to see that they had all
negotiated GENI well. Some of the projects were quite impressive. Most of
them would have been possible to execute in my department’s networking lab,
but it would not have been possible to support the same number and variety of
projects.
Each of the teams that used GENI as their project platform wrote up a short
assessment of the role of GENI in their project. A few of these are appended.
Most of them speak of pros as well as cons, but on the whole they confirm that
the availability of GENI enriched this course, and produced learning benefit
that would have been unattainable without it.
As we’re upgrading the ExoGENI infrastructure to the new release of ORCA (5.0 Eastsound), there are a few things experimenters should know about the features and capabilities of this new release.
The main feature being added is the so-called state recovery, or the ability to restart the various ORCA actors and retain the state of created slices. This will allow experimenters to run long-lived experiments without concerns about interference from software updates or other disruptive events. The recovery handles many situations, although catastrophic events may still result in the loss of slice information.
Another area of attention for us has been bare-metal node provisioning – we have strengthened the code that performs bare-metal provisioning, making it more error-proof and also added the ability to attach iSCSI storage volumes to bare-metal nodes. This capability until now has only worked for virtual machine slivers.
ORCA5 has allowed us to enable hybrid mode support in the rack switches, which in simple terms means that experimenters who want to use the OpenFlow capabilities of the switch can do so, while the rest can use traditional VLANs with more predictable performance guarantees.
Finally, we introduced the ability to see the boot consoles of VMs in case of failure, a feature we hope will help in debugging stubborn image creation issues.
- Attachment to mesoscale VLANs
- Won’t work properly with current NDL converter
- Doesn’t work due to yet to be determined problems with switch hybrid configuration – packets don’t pass properly between OpenFlow and VLAN parts of the switch.
- NDL conversion for some slice manifests may not work properly. Slices may appear disconnected. This requires an update to the NDL converter, which will be done once more racks are upgraded.
We’ve described in the past how to run OpenFlow experiments in ExoGENI using virtual OVS switches hosted in virtual machines. With the release of ORCA5, it is now possible to run some experiments using the real OpenFlow switches in ExoGENI racks (for IBM-based racks they are the BNT G8264).
To do that, start your OpenFlow controller on a host with a publicly accessible IP address. Then create a slice in Flukes (remember to assign IP addresses to dataplane interfaces of the nodes) as shown in the figure below, making sure to mark the reservation as ‘OpenFlow’ and fill in the details – your email, slice password (not really important – can be any random string and you don’t need to remember it) and the URL of the controller in the form of
tcp:<hostname or ip>:<port number, typically 6633>
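A small sketch of how that string is put together, in case you script your slice requests. The controller_url function is hypothetical and only formats and sanity-checks the value; the default port of 6633 follows the typical value mentioned above.

```shell
#!/bin/bash
# Hypothetical helper: build the controller URL for the reservation form
# from a host name or IP address and an optional port (default 6633).
controller_url() {
    local host="$1" port="${2:-6633}"
    case "$port" in
        *[!0-9]*) echo "port must be a number: $port" >&2; return 1 ;;
    esac
    printf 'tcp:%s:%s\n' "$host" "$port"
}
```

For instance, `controller_url 192.0.2.10` yields tcp:192.0.2.10:6633 (192.0.2.10 is a documentation address used here as a placeholder).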
Submit the slice and wait for the Flowvisor on the rack head node to connect to your controller. You should see the various events (e.g. PACKET_IN) flowing by in the controller log. And that’s it.
Current limitations – only one link per slice. Only ORCA5 IBM racks can do this at the moment, which excludes UvA, WVnet, NCSU and Duke racks.
Author: Jeffrey L. Tilson and Jonathan Mills.
Context: Open Science for Synthesis is a unique bi-coastal training offered for early career scientists who want to learn new software and technology skills needed for open, collaborative, and reproducible synthesis research. UC Santa Barbara’s National Center for Ecological Analysis and Synthesis (NCEAS) and University of North Carolina’s Renaissance Computing Institute (RENCI) co-led this three-week intensive training workshop with participants in both Santa Barbara, CA and Chapel Hill, NC from July 21 – August 8, 2014. The training was sponsored by the Institute for Sustainable Earth and Environmental Software (ISEES) and the Water Science Software Institute (WSSI), both of which are conceptualizing an institute for sustainable scientific software.
The participants were initially clustered into research groups based, in part, upon mutual interests. Then in conjunction with their research activities, daily bi-coastal sessions were started to develop expertise in sustainable software practices in the technical aspects that underlie successful open science and synthesis – from data discovery and integration to analysis and visualization, and special techniques for collaborative scientific research as applied to the team-projects. The specific projects are described at https://github.com/NCEAS/training/wiki/OSS-2014-Synthesis-Projects.
Specifics of ExoGENI: In support of the research teams, ExoGENI provisioned a total of three slices, where a slice is defined as one or more compute resources (virtual machines or bare metal nodes) that are interconnected via a dedicated private network. The largest slice contained four virtual machines (VMs), with each VM having 75 GB of disk space, 4 CPUs, and 12 GB of RAM. A second slice, using two VMs of the same size as in the first, additionally had a 1 TB storage volume mounted via iSCSI onto each host. The last slice utilized two bare metal nodes, each with 20 CPU cores and 96 GB of RAM, and had R installed for statistical programming. These slices were allocated throughout the duration of the conference. Access by workshop participants was provided via ssh keys. Workshop staff were provided additional keys for root access.
Lessons learned: The ExoGENI provided resources were easy to assemble and make available to the research teams. Each team provided their best guess regarding memory, disk, and computation needs which resulted in three different classes of ExoGENI resources.
The ExoGENI resources that were initiated for participants were all Linux oriented. Moving forward, alternative operating systems should be considered, perhaps by getting research group feedback at the start of the workshop.
ExoGENI is a new GENI testbed that links GENI to two advances in virtual infrastructure services outside of GENI: open cloud computing (OpenStack) and dynamic circuit fabrics. ExoGENI orchestrates a federation of independent cloud sites located across the US and circuit providers, like NLR and Internet2 through their native IaaS API interfaces, and links them to other GENI tools and resources.
ExoGENI is, in effect, a widely distributed networked infrastructure-as-a-service (NIaaS) platform geared towards experimentation and computational tasks. ExoGENI employs sophisticated topology embedding algorithms that take advantage of semantic resource descriptions using NDL-OWL – a variant of Network Description Language.
Individual ExoGENI deployments consist of cloud site “racks” on host campuses, linked with national research networks through programmable exchange points. The ExoGENI sites and control software are enabled for flexible networking operations using traditional VLAN-based switching and OpenFlow. Using ORCA (Open Resource Control Architecture) control framework software, ExoGENI offers a powerful unified hosting platform for deeply networked, multi-domain, multi-site cloud applications. We intend that ExoGENI will seed a larger, evolving platform linking other third-party cloud sites, transport networks, and other infrastructure services, and that it will enable real-world deployment of innovative distributed services and new visions of a Future Internet.
To learn about how to use the testbed, please visit the ExoGENI wiki.