As we’re upgrading the ExoGENI infrastructure to the new release of ORCA (5.0 Eastsound), there are a few things experimenters should know about the features and capabilities of this new release.
The main feature being added is so-called state recovery – the ability to restart the various ORCA actors while retaining state for created slices. This will allow experimenters to run long-lived experiments without concern that software updates or other disruptive events will interfere. Recovery handles many situations, although catastrophic events may still result in the loss of slice information.
Another area of attention for us has been bare-metal node provisioning – we have strengthened the code that performs bare-metal provisioning, making it more robust, and added the ability to attach iSCSI storage volumes to bare-metal nodes. Until now, this capability was available only for virtual machine slivers.
ORCA5 has allowed us to enable hybrid mode support in the rack switches. In simple terms, experimenters who want to use the OpenFlow capabilities of the switch can do so, while the rest can use traditional VLANs with more predictable performance guarantees.
Finally, we introduced the ability to see the boot consoles of VMs in case of failure, a feature we hope will help in debugging stubborn image creation issues.
- Attachment to mesoscale VLANs:
  - Won't work properly with the current NDL converter.
  - Doesn't work due to yet-to-be-determined problems with the switch hybrid configuration – packets don't pass properly between the OpenFlow and VLAN parts of the switch.
- NDL conversion for some slice manifests may not work properly. Slices may appear disconnected. This requires an update to the NDL converter, which will be done once more racks are upgraded.
We’ve described in the past how to run OpenFlow experiments in ExoGENI using virtual OVS switches hosted in virtual machines. With the release of ORCA5, it is now possible to run some experiments using the real OpenFlow switches in ExoGENI racks (for IBM-based racks they are the BNT G8264).
To do that, start your OpenFlow controller on a host with a publicly accessible IP address. Then create a slice in Flukes as shown in the figure below (remember to assign IP addresses to the dataplane interfaces of the nodes), making sure to mark the reservation as ‘OpenFlow’ and fill in the details: your email, a slice password (not really important – it can be any random string and you don’t need to remember it), and the URL of the controller in the form
tcp:<hostname or ip>:<port number, typically 6633>
Submit the slice and wait for the Flowvisor on the rack head node to connect to your controller. You should see the various events (e.g. PACKET_IN) flowing by in the controller log. And that’s it.
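As a sketch, if your controller were hosted at a name like ctrl.example.org (a hypothetical hostname used only for illustration), starting it and checking reachability before submitting the slice might look like this. The Floodlight jar path is illustrative and depends on how you built it:

```shell
# On the controller host: start Floodlight (listens on TCP 6633 by default).
# The jar path below is illustrative; it depends on your build.
java -jar target/floodlight.jar &

# From any machine: verify the controller port is reachable before
# submitting the slice. The URL entered in Flukes would then be:
#   tcp:ctrl.example.org:6633
nc -zv ctrl.example.org 6633
```

If the port check fails, fix firewall rules or the controller configuration before submitting the slice, since the rack-side Flowvisor must be able to open a connection to it.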
Creating a slice using OpenFlow capabilities of the rack switch
Current limitations: only one link per slice. Only ORCA5 IBM racks can do this at the moment, which excludes the UvA, WVnet, NCSU and Duke racks.
Author: Jeffrey L. Tilson and Jonathan Mills.
Context: Open Science for Synthesis is a unique bi-coastal training offered for early-career scientists who want to learn new software and technology skills needed for open, collaborative, and reproducible synthesis research. UC Santa Barbara’s National Center for Ecological Analysis and Synthesis (NCEAS) and the University of North Carolina’s Renaissance Computing Institute (RENCI) co-led this three-week intensive training workshop with participants in both Santa Barbara, CA and Chapel Hill, NC from July 21 – August 8, 2014. The training was sponsored by the Institute for Sustainable Earth and Environmental Software (ISEES) and the Water Science Software Institute (WSSI), both of which are conceptualizing an institute for sustainable scientific software.
The participants were initially clustered into research groups based, in part, upon mutual interests. Then, in conjunction with their research activities, daily bi-coastal sessions were held to develop expertise in sustainable software practices and the technical aspects that underlie successful open science and synthesis – from data discovery and integration to analysis and visualization, and special techniques for collaborative scientific research as applied to the team projects. The specific projects are described at https://github.com/NCEAS/training/wiki/OSS-2014-Synthesis-Projects.
Specifics of ExoGENI: In support of the research teams, ExoGENI provisioned a total of three slices, where a slice is defined as one or more compute resources (virtual machines or bare-metal nodes) interconnected via a dedicated private network. The largest slice contained four virtual machines (VMs), each with 75 GB of disk space, 4 CPUs, and 12 GB of RAM. A second slice, using two of the same-sized VMs as the first, additionally had a 1 TB storage volume mounted via iSCSI onto each host. The last slice utilized two bare-metal nodes, each with 20 CPU cores and 96 GB of RAM, and had R installed for statistical programming. These slices were allocated throughout the duration of the conference. Access by workshop participants was provided via ssh keys. Workshop staff were provided additional keys for root access.
Lessons learned: The ExoGENI-provided resources were easy to assemble and make available to the research teams. Each team provided its best guess regarding memory, disk, and computation needs, which resulted in three different classes of ExoGENI resources.
The ExoGENI resources initiated for participants were all Linux-based. Moving forward, alternative operating systems should be considered, perhaps by gathering research-group feedback at the start of the workshop.
ExoGENI is a new GENI testbed that links GENI to two advances in virtual infrastructure services outside of GENI: open cloud computing (OpenStack) and dynamic circuit fabrics. ExoGENI orchestrates a federation of independent cloud sites located across the US and circuit providers, such as NLR and Internet2, through their native IaaS APIs, and links them to other GENI tools and resources.
ExoGENI is, in effect, a widely distributed networked infrastructure-as-a-service (NIaaS) platform geared towards experimentation and computational tasks. ExoGENI employs sophisticated topology embedding algorithms that take advantage of semantic resource descriptions using NDL-OWL – a variant of Network Description Language.
Individual ExoGENI deployments consist of cloud site “racks” on host campuses, linked with national research networks through programmable exchange points. The ExoGENI sites and control software are enabled for flexible networking operations using traditional VLAN-based switching and OpenFlow. Using ORCA (Open Resource Control Architecture) control framework software, ExoGENI offers a powerful unified hosting platform for deeply networked, multi-domain, multi-site cloud applications. We intend that ExoGENI will seed a larger, evolving platform linking other third- party cloud sites, transport networks, and other infrastructure services, and that it will enable real-world deployment of innovative distributed services and new visions of a Future Internet.
To learn about how to use the testbed, please visit the ExoGENI wiki.
Projects that power ExoGENI:
- ORCA-BEN – core development of ORCA features. ExoGENI is controlled by a specific deployment of ORCA tailored to GENI needs and requirements.
- NetworkedClouds – adapting OpenStack to a networked clouds environment.
There are no new software features. This maintenance was meant to restore stitching to the UCD rack, which was broken by the reconfiguration of Internet2 AL2S.
- There is a new rack at TAMU that is accessible only for intra-rack slices for now, due to a lack of physical connectivity to its upstream provider. Once connectivity is in place, we will enable stitching to TAMU. See the wiki for rack controller information.
- There is a new version of Flukes with minor changes. As a reminder, Flukes has been reported not to work well with Java 7 on certain platforms, specifically Mac OS. Switching to Java 6 typically solves the problem.
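On Mac OS, one way to select Java 6 (assuming Apple's Java 6 is installed) is to point JAVA_HOME at it before launching Flukes. This is a sketch; the Flukes jar name below is illustrative:

```shell
# macOS-specific helper that locates an installed Java 6 runtime:
export JAVA_HOME=$(/usr/libexec/java_home -v 1.6)

# Confirm the selected runtime reports 1.6.x:
"$JAVA_HOME/bin/java" -version

# Then launch Flukes with that runtime, e.g.:
# "$JAVA_HOME/bin/java" -jar flukes.jar   # jar name illustrative
```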
- Connectivity to the OSF rack is currently malfunctioning due to problems with the ESnet OSCARS control software instance. We will announce when those problems have been addressed.
- Stitching to UCD has been restored.
This post comes to us courtesy of Cong Wang from UMass Amherst.
ExoGENI provides the key features for stitching among multiple ExoGENI racks or connecting non-ExoGENI nodes to ExoGENI slices via a stitchport. This article provides some initial instructions on how to stitch ExoGENI slices to the external infrastructure with stitchports using VLANs.
The procedure for stitching to an external stitchport is similar to the example above, except that the local machine needs to be configured correctly to allow connection to the VLAN. In the following example, the local laptop (outside the ExoGENI testbed) is located on the University of Wisconsin–Madison campus. It connects to ExoGENI via layer 2 VLAN 920. The Flukes request is shown in the figure. The available stitchport URLs and Labels/Tags can be found on the ExoGENI wiki. More stitchports at other locations can be added upon request via the users mailing list.
After the slice is successfully reserved, the manifest view should be similar to the one shown below:
In order to attach the local (non-ExoGENI) node to the VLAN, the interface connected to the VLAN has to be configured correctly. In addition to the regular IP configuration, the attached interface has to be configured with the correct VLAN ID.
An example for Ubuntu is given below:
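As a minimal sketch for Ubuntu, assuming the VLAN-facing interface is eth1 and using VLAN 920 from the example above (adjust the interface name and IP address to match your slice's dataplane subnet):

```shell
# Load the 802.1Q VLAN kernel module:
sudo modprobe 8021q

# Create a tagged sub-interface for VLAN 920 on eth1 (interface name illustrative):
sudo ip link add link eth1 name eth1.920 type vlan id 920

# Assign an address in the slice's dataplane subnet (address illustrative):
sudo ip addr add 172.16.0.10/24 dev eth1.920

# Bring the VLAN interface up:
sudo ip link set dev eth1.920 up
```

On older systems, `vconfig add eth1 920` achieves the same as the `ip link add` step.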
At this point, you should be able to ping Node1, which means the stitchport has been successfully attached to ExoGENI.
This note summarizes the results of maintenance across the entire ExoGENI Testbed in April 2014.
- Minor fixes added to BEN to better support inter-domain topologies
- UDP performance issue addressed. See below for more details.
- FOAM updated across the racks
- Floodlight updated across the racks
- Most of the connectivity caveats from the previous note still apply.
- UvA rack is currently not reachable. We suspect a problem with the configuration in SURFnet, which we will address.
- This behavior appears to have resolved itself by 05/06 without our intervention. Please report any further problems.
The details: UDP performance
Some of you have observed very poor performance for UDP transfers – extremely high packet losses and very low transfer rates. This issue was traced to three separate causes:
- Poor implementation of the “learning switch” functionality in the version of the Floodlight OpenFlow controller we were using. It resulted in sudden packet losses after a period of time, particularly between bare-metal nodes. To resolve this issue, we upgraded Floodlight to version 0.9 and replaced its “learning switch” module with a better-behaved “forwarding” module.
- Insufficient configuration of the QEMU interface to the guest VM, which resulted in very high packet losses. We updated the Quantum agent to support the proper options.
- Sensitivity of UDP transfers to host- and guest-side transmit and receive buffers. We tuned the host-side buffers on worker nodes; however, guest-side tuning must be done by the experimenter.
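As a sketch of what guest-side tuning typically involves, the kernel's socket buffer limits can be raised inside the VM. The values below are illustrative starting points, not tested recommendations for ExoGENI:

```shell
# Inside the guest VM: raise the maximum socket receive/send buffer sizes
# so applications can request large UDP buffers (values illustrative, ~25 MB):
sudo sysctl -w net.core.rmem_max=26214400
sudo sysctl -w net.core.wmem_max=26214400

# Applications must also request larger buffers explicitly, e.g. with iperf:
#   iperf -u -s -w 25M          # server side
#   iperf -u -c <server> -w 25M # client side
```

To make the settings persist across reboots, add the `net.core.*` lines to /etc/sysctl.conf.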
We will publish a separate blog entry in the near future explaining how to get the best UDP performance out of ExoGENI.