Using GENI for teaching Computer Networking at NCSU

Author: Prof. Rudra Dutta of NCSU Computer Science

I have been using GENI in some form in my teaching for the last three years –
very tentatively to start with, but more extensively over time. Fall 2014
was my most ambitious use so far.

The course I was teaching was Internet Protocols – a graduate level course on
networking that assumes at least one previous general networking course.
I assume basic working knowledge of networking, TCP/IP, socket programming,
and general knowledge of common Internet functionality such as HTTP, DNS etc.
After a quick refresher on some of these topics, the course dives into details of
the forwarding engine, QoS issues in forwarding, programming kernel modules
with netfilter, some content about routing, SDN, etc. The first half of the
course consists largely of individual and group assignments on these topics, and
the second half is one large group project. I assign the groups for both the
assignments and the project; students do not choose their own.

In this instance, I used GENI in two different ways – first, specific
questions in some of the homework assignments were required to be done on
GENI, and later, GENI was specified as one of the three platforms that students
could use for the project. More detailed information about the administration,
including the specific assignments, is available for those interested at
http://dutta.csc.ncsu.edu/csc573_fall14/index.html. I guided the students
exclusively toward ExoGENI substrates, because experience from previous
semesters indicated that ExoGENI came closest to providing consistent
slice behavior. (Some other substrates showed varying slice and
stitching behavior when the same RSPEC was tried multiple times, which was
confusing to students.) We also used a methodology of designing and reserving
slices through Flukes and accessing them separately by ssh, because it fit well
with the rest of ExoGENI and gave the students a single way to negotiate the
authentication/authorization issues.

Before assigning the first homework, I had briefly covered GENI operations in
class. The first assignment had them create GENI IDs, request to join
the ncsu_teaching project, reserve a slice with a simple four-node tandem
network, and then set up routing tables at each node to get a simple ping
through. Later homework assignments were more complex, until the final one
asked them to create a seven-node topology and use both OpenFlow and kernel
module programming to build and investigate the behavior of a firewall.
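
To make the first assignment concrete, here is a minimal sketch of the static routes a four-node tandem (chain) network needs before an end-to-end ping can succeed. It is not taken from the course materials: the 10.0.x.0/24 addressing plan, the .1/.2 host numbering, and the function name tandem_routes are assumptions made purely for illustration. The sketch only prints standard Linux commands (ip route add, and sysctl to enable forwarding on the two middle nodes); it does not run them.

```python
import ipaddress


def tandem_routes(num_nodes, base="10.0.0.0/16"):
    """Return {node index: [shell commands]} for a chain of num_nodes nodes.

    Assumed (hypothetical) addressing: link i joins node i and node i+1 and
    uses the i-th /24 of `base`; node i is host .1 and node i+1 is host .2.
    """
    links = list(ipaddress.ip_network(base).subnets(new_prefix=24))[: num_nodes - 1]
    plan = {}
    for n in range(num_nodes):
        cmds = []
        # Middle nodes must forward packets between their two interfaces.
        if 0 < n < num_nodes - 1:
            cmds.append("sysctl -w net.ipv4.ip_forward=1")
        for j, subnet in enumerate(links):
            if j in (n - 1, n):
                continue  # directly connected link, no static route needed
            if j < n:
                # Destination lies to the left: next hop is node n-1 (.1 on link n-1).
                hop = links[n - 1].network_address + 1
            else:
                # Destination lies to the right: next hop is node n+1 (.2 on link n).
                hop = links[n].network_address + 2
            cmds.append(f"ip route add {subnet} via {hop}")
        plan[n] = cmds
    return plan


if __name__ == "__main__":
    for node, cmds in tandem_routes(4).items():
        print(f"node {node}:")
        for cmd in cmds:
            print("   ", cmd)
```

The end nodes each need a route to the two subnets they are not attached to, while the middle nodes each need one route plus forwarding; a missing entry at any hop, in either direction, is enough to make the ping fail.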

There were a total of 86 students, who were eventually
grouped into 22 project teams; however, the class started with a somewhat
larger number of students who attempted the early assignments. There were the
usual initial problems; students complained of resources not being available,
access never working, very sluggish access, and other similar issues. Upon
investigation, most of these could be traced to a few causes: misunderstandings
about ssh key files; a lack of appreciation of how much extra bandwidth it
takes to push a GUI through two sets of network connections (many of the
students had no suitable computer of their own to access GENI from, and were
using servers from VCL, the NCSU computing cloud, to do so); and not realizing
that the management interfaces were visible to them, then trying to use those
interfaces for their slice traffic. There were also some actual GENI issues.
Over this period ExoGENI went through some ExoSM problems, which led the
ExoGENI team to recommend that anybody not using cross-rack stitching should
use specific SMs rather than ExoSM (contrary to what the webpages typically
said). The format of the Flukes .properties file also changed, and the TA had
to scramble to communicate the change to all students. By far the problem with
the biggest impact on the students was that resources were not always available
when needed – students would wait for hours or even days without their requests
being provisioned. We cannot be sure, but we believe that these were real
resource crunches, not an artifact or mistake of some kind.

When the time came to propose project ideas, I was somewhat surprised (after all
the complaints) that 12 out of the 22 teams picked GENI as their platform
outright, and another 7 listed it as one of the two platforms they were going
to use (3 of these eventually ended up working on GENI). While the teams had
varied success in their projects, I was glad to see that they had all
negotiated GENI well. Some of the projects were quite impressive. Most of
them would have been possible to execute in my department’s networking lab,
but it would not have been possible to support the same number and variety of
projects.

Each of the teams that used GENI as their project platform wrote up a short
assessment of the role of GENI in their project. A few of these are appended.
Most of them speak of pros as well as cons, but on the whole they confirm that
the availability of GENI enriched this course and produced learning benefits
that would have been unattainable without it.

Team Responses

Using ExoGENI for training in reproducible synthesis research

Authors: Jeffrey L. Tilson and Jonathan Mills

Context: Open Science for Synthesis is a unique bi-coastal training offered for early-career scientists who want to learn the new software and technology skills needed for open, collaborative, and reproducible synthesis research. UC Santa Barbara’s National Center for Ecological Analysis and Synthesis (NCEAS) and the University of North Carolina’s Renaissance Computing Institute (RENCI) co-led this three-week intensive training workshop, with participants in both Santa Barbara, CA and Chapel Hill, NC, from July 21 – August 8, 2014. The training was sponsored by the Institute for Sustainable Earth and Environmental Software (ISEES) and the Water Science Software Institute (WSSI), both of which are conceptualizing an institute for sustainable scientific software.

The participants were initially clustered into research groups based, in part, upon mutual interests. Then, in conjunction with their research activities, daily bi-coastal sessions were held to develop expertise in sustainable software practices and in the technical aspects that underlie successful open science and synthesis – from data discovery and integration to analysis and visualization, and special techniques for collaborative scientific research as applied to the team projects. The specific projects are described at https://github.com/NCEAS/training/wiki/OSS-2014-Synthesis-Projects.

Specifics of ExoGENI: In support of the research teams, ExoGENI provisioned a total of three slices, where a slice is defined as one or more compute resources (virtual machines or bare metal nodes) that are interconnected via a dedicated private network. The largest slice contained four virtual machines (VMs), each with 75 GB of disk space, 4 CPUs, and 12 GB of RAM. A second slice used two VMs of the same size as those in the first, and additionally had a 1 TB storage volume mounted via iSCSI on each host. The last slice used two bare metal nodes, each with 20 CPU cores and 96 GB of RAM, and had R installed for statistical programming. These slices remained allocated for the duration of the workshop. Workshop participants were given access via ssh keys. Workshop staff were given additional keys for root access.

Lessons learned: The ExoGENI-provided resources were easy to assemble and make available to the research teams. Each team provided its best guess of its memory, disk, and computation needs, which resulted in three different classes of ExoGENI resources.

The ExoGENI resources that were set up for participants were all Linux-based. Moving forward, alternative operating systems should be considered, perhaps by getting feedback from the research groups at the start of the workshop.

Lehigh University CSE 303 Operating System HW10

Author: Dawei Li

Topic: Category-based N-gram Counting and Analysis Using Map-Reduce Framework

Class Size: 14 students

We used one GENI account (through Flukes) to create slices for all students, and distributed the SSH private key and the assigned IP address to each of them.
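
The report does not describe the assignment beyond its topic line, so the following is only a rough, single-process sketch of what category-based n-gram counting looks like when expressed as map, shuffle, and reduce steps. It does not use Hadoop, and the record format (category, text), the function names, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def map_ngrams(category, text, n=2):
    """Map step: emit ((category, ngram), 1) for every n-gram in the text."""
    words = text.lower().split()
    for i in range(len(words) - n + 1):
        yield (category, " ".join(words[i:i + n])), 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one (category, ngram) key."""
    return key, sum(values)


def count_ngrams(records, n=2):
    """Run map, shuffle, and reduce over (category, text) records."""
    pairs = (pair for cat, text in records for pair in map_ngrams(cat, text, n))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample records; the real data set is not described.
    sample = [("sports", "the game was the game"), ("news", "the game")]
    for (category, ngram), count in sorted(count_ngrams(sample).items()):
        print(category, repr(ngram), count)
```

On a real cluster the map and reduce functions would run on different nodes and the framework would perform the shuffle, but the keying idea is the same: including the category in the key keeps the counts for each category separate.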

Statistics

Total slices:      15
Total nodes:       73
Controllers used:  OSF (7 slices, 7 nodes each); WVN (8 slices, 3 nodes each)
Duration:          9 days (11 slices); 14 days (4 slices)

Comments:

The Flukes tool is really convenient. It took me only a few hours to figure out how to use it and how to create a Hadoop cluster. The one drawback is that I have to poll the status of the slice myself to know whether it is ready. As far as I know, the Flack GUI of ProtoGENI polls automatically and shows users the changing status until the slice is ready.
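
The manual polling mentioned above can be wrapped in a small loop. The sketch below is generic and does not call any Flukes or ExoGENI API: get_status is a hypothetical caller-supplied function that returns the current slice state as a string, and the state names "ready" and "failed" are placeholders rather than the names the testbed actually reports.

```python
import time


def wait_until_ready(get_status, interval=30.0, timeout=1800.0, sleep=time.sleep):
    """Call get_status() until it reports a final state or the timeout passes.

    get_status is a hypothetical callable supplied by the caller; "ready" and
    "failed" are placeholder state names, not actual testbed values.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in ("ready", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"slice still {status!r} after {timeout} seconds")
        sleep(interval)  # wait before asking again


if __name__ == "__main__":
    # Simulated slice that becomes ready on the third query.
    states = iter(["pending", "pending", "ready"])
    print(wait_until_ready(lambda: next(states), interval=0))
```

Passing the sleep function as a parameter keeps the loop easy to test without real delays; in actual use the default time.sleep and a polling interval of tens of seconds would avoid hammering the controller.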

I have heard no complaints from students about connection problems, which suggests that the testbed resources are relatively stable whether accessed from on campus or off. Some students could not log into the testbed simply because they were not familiar with SSH. However, one grader said that he could not log into the testbed on May 3rd (around midnight) using one OSF slice, but was able to log in again the next morning.