Hadoop Tutorial

In this tutorial you will create a slice composed of three virtual machines that are a Hadoop cluster. The tutorial will lead you through creating the slice, observing the properties of the slice, and running a Hadoop example which sorts a large dataset.

After completing the tutorial you should be able to:

  1. Use ExoGENI to create a virtual distributed computational cluster.
  2. Create post-boot scripts with simple template replacement.
  3. Use the basic functionality of a Hadoop filesystem.
  4. Observe resource utilization of compute nodes in a virtual distributed computational cluster.

Create the Request:

(Note: a completed RDF request file can be found here.   The request file can be loaded into Flukes by choosing File -> Open request)

1.  Start the Flukes application

2.  Create the Hadoop master node by selecting “Node” from the “Add Nodes” menu. Click in the field to place a VM in the request.

3.  Right-click the node and edit its properties.   Set:  (name: “master”,  node type: “XO medium”, image: “Hadoop 2.7.1 (Centos7)”, domain: “System select”).   The name must be “master”.

4. Create a group of Hadoop workers by choosing “Node Group” from the “Add Nodes” menu.  Click in the field to add the group to the request.

5.  Right-click the node group and edit its properties.   Set:  (name: “workers”,  node type: “XO medium”, image: “Hadoop 2.7.1 (Centos7)”, domain: “System select”, Number of Servers: “2”).   The name must be “workers”.

6.  Draw a link between the master and the workers.

7.   Edit the properties of the master and workers by right-clicking and choosing “Edit Properties”.  Set the IP address of the master to 172.16.1.1/24 and the workers group to 172.16.1.100/24.  Note that each node in the workers group will be assigned a sequential IP address starting at 172.16.1.100, as shown below.
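
For example, with the two workers requested above, the assignment will be:

workers/0 (hostname workers-0): 172.16.1.100
workers/1 (hostname workers-1): 172.16.1.101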

8.  Edit the post-boot script for the master node.   Right-click the master node and choose “Edit Properties”.  Click “PostBoot Script”.  A text box will appear.  Add the following script:

#!/bin/bash

#setup /etc/hosts
echo $master.IP("Link0") $master.Name() >> /etc/hosts
#set ( $sizeWorkerGroup = $workers.size() - 1 )
#foreach ( $j in [0..$sizeWorkerGroup] )
 echo $workers.get($j).IP("Link0") `echo $workers.get($j).Name() | sed 's/\//-/g'` >> /etc/hosts
#end

HADOOP_CONF_DIR=/home/hadoop/hadoop-2.7.1/etc/hadoop
CORE_SITE_FILE=${HADOOP_CONF_DIR}/core-site.xml
HDFS_SITE_FILE=${HADOOP_CONF_DIR}/hdfs-site.xml
MAPRED_SITE_FILE=${HADOOP_CONF_DIR}/mapred-site.xml
YARN_SITE_FILE=${HADOOP_CONF_DIR}/yarn-site.xml
SLAVES_FILE=${HADOOP_CONF_DIR}/slaves

echo "" > $CORE_SITE_FILE
cat > $CORE_SITE_FILE << EOF
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
 <name>fs.default.name</name>
 <value>hdfs://$master.Name():9000</value>
</property>
</configuration>
EOF

echo "" > $HDFS_SITE_FILE
cat > $HDFS_SITE_FILE << EOF
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
 <property>
 <name>dfs.datanode.du.reserved</name>
 <!-- cluster variant -->
 <value>20000000000</value>
 <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
 </description>
 </property>
<property>
 <name>dfs.replication</name>
 <value>2</value>
</property>
<property>
 <name>dfs.name.dir</name>
 <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
</property>
<property>
 <name>dfs.data.dir</name>
 <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
</property>
</configuration>
EOF

echo "" > $MAPRED_SITE_FILE
cat > $MAPRED_SITE_FILE << EOF
<configuration>
 <property>
 <name>mapreduce.framework.name</name>
 <value>yarn</value>
 </property>
</configuration>
EOF

echo "" > $YARN_SITE_FILE
cat > $YARN_SITE_FILE << EOF
<?xml version="1.0"?>
<configuration>
<!-- Site specific YARN configuration properties -->
 <property>
 <name>yarn.resourcemanager.resource-tracker.address</name>
 <value>master:8031</value>
 </property>
 <property>
 <name>yarn.resourcemanager.address</name>
 <value>master:8032</value>
 </property>
 <property>
 <name>yarn.resourcemanager.scheduler.address</name>
 <value>master:8030</value>
 </property>
 <property>
 <name>yarn.resourcemanager.admin.address</name>
 <value>master:8033</value>
 </property>
 <property>
 <name>yarn.resourcemanager.webapp.address</name>
 <value>master:8088</value>
 </property>
<property>
 <name>yarn.nodemanager.aux-services</name>
 <value>mapreduce_shuffle</value>
 </property>
</configuration>
EOF

echo "" > $SLAVES_FILE
cat > $SLAVES_FILE << EOF
#set ( $sizeWorkerGroup = $workers.size() - 1 )
#foreach ( $j in [0..$sizeWorkerGroup] )
 `echo $workers.get($j).Name() | sed 's/\//-/g'` 
#end
EOF


echo "" > /home/hadoop/.ssh/config
cat > /home/hadoop/.ssh/config << EOF
Host `echo $self.IP("Link0") | sed 's/.[0-9][0-9]*$//g'`.* master workers-* 0.0.0.0
 StrictHostKeyChecking no
 UserKnownHostsFile=/dev/null
EOF

chmod 600 /home/hadoop/.ssh/config
chown hadoop:hadoop /home/hadoop/.ssh/config

9.  Edit the post-boot script for the workers group and add the same script as the master.

10. Name your slice and submit it to ExoGENI.

11.  Wait for the resources to become Active.

Check the status of the VMs:

1. Login to the master node.

2.  Observe the properties of the network interfaces

[root@master ~]# ifconfig 
ens3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
 inet 10.103.0.9 netmask 255.255.255.0 broadcast 10.103.0.255
 inet6 fe80::f816:3eff:fe0c:3d02 prefixlen 64 scopeid 0x20<link>
 ether fa:16:3e:0c:3d:02 txqueuelen 1000 (Ethernet)
 RX packets 3518 bytes 823670 (804.3 KiB)
 RX errors 0 dropped 0 overruns 0 frame 0
 TX packets 1704 bytes 218947 (213.8 KiB)
 TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
 inet 172.16.1.1 netmask 255.255.255.0 broadcast 172.16.1.255
 inet6 fe80::fc16:3eff:fe00:2862 prefixlen 64 scopeid 0x20<link>
 ether fe:16:3e:00:28:62 txqueuelen 1000 (Ethernet)
 RX packets 3734 bytes 689012 (672.8 KiB)
 RX errors 0 dropped 0 overruns 0 frame 0
 TX packets 1930 bytes 215880 (210.8 KiB)
 TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
 inet 127.0.0.1 netmask 255.0.0.0
 inet6 ::1 prefixlen 128 scopeid 0x10<host>
 loop txqueuelen 0 (Local Loopback)
 RX packets 251 bytes 31259 (30.5 KiB)
 RX errors 0 dropped 0 overruns 0 frame 0
 TX packets 251 bytes 31259 (30.5 KiB)
 TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

3. Observe the contents of the NEuca user data file.

[root@master ~]# neuca-user-data 
[global]
actor_id=b0e0e413-77ef-4775-b476-040ca8377d1d
slice_id=304eaa29-272d-48e2-8640-ec7300196ac9
reservation_id=c9fd992b-a67b-4cc6-8156-a062df9e8889
unit_id=e22734de-f187-41d4-b952-f07263fbfd43
;router= Not Specified
;iscsi_initiator_iqn= Not Specified
slice_name=pruth.hadoop.1
unit_url=http://geni-orca.renci.org/owl/c9d160a4-1ac0-4b9e-9a8f-8613077a6463#master
host_name=master
management_ip=128.120.83.33
physical_host=ucd-w5
nova_id=b8a37d51-bd50-40b6-89ff-6fbb87bb5fc1
[users]
root=no:ssh-dss AAAAB3NzaC1kc3MAAACBALYdIgCmJoZt+4YN7lUlDLR0ebwfsBd+d0Tw3O18JUc2bXoTzwWBGV+BkGtGljzmr1SDrRXkOWgA//pogG7B1vpizHV6K5MoFQDoSEy/64ycEago611xtMt13xuJei0pPyAphv/NrYlD1xZBMuEG9JTe8EK/H43ZhLcK4b1HwWrTAAAAFQDnZNZVDojT0aHHgJqBncy+iBHs9wAAAIBiMfYPoDgVASgknEzBschxTzTFuhof+lxBh0v5i9OsinMuRa1K5wBbA1eo63PKxywQSnODQhItme0Tn8Pp1ETpM0YkzE48K1NxW3l9iBipSRDMEh8aUlfX5R7xfRRY7tUNXlQQAzXYX8ZvXoA+mbZ9BkBXtSNI5uD1z3Gk5k/WQwAAAIBVIVuHJVgSiCw/m8yjCVH1QgO045ACf4l9/3HaoDwFaNrL1WKQvplhz/DVqtWq/2ZAIrwXr/0IgviRRZ/iVpul3s15ecTJzHAhHMaaDn4vuphH6xbs6JHLFyvBQGJy1euoY9BPqtTFZnH7KdWoChCQXfujDrtcx/5MfBn4tO5kQQ== pruth@dhcp152-54-9-28.europa.renci.org:
pruth=yes:ssh-dss AAAAB3NzaC1kc3MAAACBALYdIgCmJoZt+4YN7lUlDLR0ebwfsBd+d0Tw3O18JUc2bXoTzwWBGV+BkGtGljzmr1SDrRXkOWgA//pogG7B1vpizHV6K5MoFQDoSEy/64ycEago611xtMt13xuJei0pPyAphv/NrYlD1xZBMuEG9JTe8EK/H43ZhLcK4b1HwWrTAAAAFQDnZNZVDojT0aHHgJqBncy+iBHs9wAAAIBiMfYPoDgVASgknEzBschxTzTFuhof+lxBh0v5i9OsinMuRa1K5wBbA1eo63PKxywQSnODQhItme0Tn8Pp1ETpM0YkzE48K1NxW3l9iBipSRDMEh8aUlfX5R7xfRRY7tUNXlQQAzXYX8ZvXoA+mbZ9BkBXtSNI5uD1z3Gk5k/WQwAAAIBVIVuHJVgSiCw/m8yjCVH1QgO045ACf4l9/3HaoDwFaNrL1WKQvplhz/DVqtWq/2ZAIrwXr/0IgviRRZ/iVpul3s15ecTJzHAhHMaaDn4vuphH6xbs6JHLFyvBQGJy1euoY9BPqtTFZnH7KdWoChCQXfujDrtcx/5MfBn4tO5kQQ== pruth@dhcp152-54-9-28.europa.renci.org:
[interfaces]
fe163e002862=up:ipv4:172.16.1.1/24
[storage]
[routes]
[scripts]bootscript=#!/bin/bash
 #setup /etc/hosts
 echo 172.16.1.1 master >> /etc/hosts
 echo 172.16.1.100 `echo workers/0 | sed 's/\//-/g'` >> /etc/hosts
 echo 172.16.1.101 `echo workers/1 | sed 's/\//-/g'` >> /etc/hosts
 HADOOP_CONF_DIR=/home/hadoop/hadoop-2.7.1/etc/hadoop
 CORE_SITE_FILE=${HADOOP_CONF_DIR}/core-site.xml
 HDFS_SITE_FILE=${HADOOP_CONF_DIR}/hdfs-site.xml
 MAPRED_SITE_FILE=${HADOOP_CONF_DIR}/mapred-site.xml
 YARN_SITE_FILE=${HADOOP_CONF_DIR}/yarn-site.xml
 SLAVES_FILE=${HADOOP_CONF_DIR}/slaves
 echo "" > $CORE_SITE_FILE
 cat > $CORE_SITE_FILE << EOF
 <?xml version="1.0" encoding="UTF-8"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 <configuration>
 <property>
 <name>fs.default.name</name>
 <value>hdfs://master:9000</value>
 </property>
 </configuration>
 EOF
 echo "" > $HDFS_SITE_FILE
 cat > $HDFS_SITE_FILE << EOF
 <?xml version="1.0" encoding="UTF-8"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 <configuration>
 <property>
 <name>dfs.datanode.du.reserved</name>
 <!-- cluster variant -->
 <value>20000000000</value>
 <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
 </description>
 </property>
 <property>
 <name>dfs.replication</name>
 <value>2</value>
 </property>
 <property>
 <name>dfs.name.dir</name>
 <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
 </property>
 <property>
 <name>dfs.data.dir</name>
 <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
 </property>
 </configuration>
 EOF
 echo "" > $MAPRED_SITE_FILE
 cat > $MAPRED_SITE_FILE << EOF
 <configuration>
 <property>
 <name>mapreduce.framework.name</name>
 <value>yarn</value>
 </property>
 </configuration>
 EOF
 echo "" > $YARN_SITE_FILE
 cat > $YARN_SITE_FILE << EOF
 <?xml version="1.0"?>
 <configuration>
 <!-- Site specific YARN configuration properties -->
 <property>
 <name>yarn.resourcemanager.resource-tracker.address</name>
 <value>master:8031</value>
 </property>
 <property>
 <name>yarn.resourcemanager.address</name>
 <value>master:8032</value>
 </property>
 <property>
 <name>yarn.resourcemanager.scheduler.address</name>
 <value>master:8030</value>
 </property>
 <property>
 <name>yarn.resourcemanager.admin.address</name>
 <value>master:8033</value>
 </property>
 <property>
 <name>yarn.resourcemanager.webapp.address</name>
 <value>master:8088</value>
 </property>
 <property>
 <name>yarn.nodemanager.aux-services</name>
 <value>mapreduce_shuffle</value>
 </property>
 </configuration>
 EOF
 echo "" > $SLAVES_FILE
 cat > $SLAVES_FILE << EOF
 `echo workers/0 | sed 's/\//-/g'` 
 `echo workers/1 | sed 's/\//-/g'` 
 EOF
 echo "" > /home/hadoop/.ssh/config
 cat > /home/hadoop/.ssh/config << EOF
 Host `echo 172.16.1.1 | sed 's/.[0-9][0-9]*$//g'`.* master workers-* 0.0.0.0
 StrictHostKeyChecking no
 UserKnownHostsFile=/dev/null
 EOF
 chmod 600 /home/hadoop/.ssh/config
chown hadoop:hadoop /home/hadoop/.ssh/config

4. Test for connectivity between the VMs.

[root@master ~]# ping workers-0
PING workers-0 (172.16.1.100) 56(84) bytes of data.
64 bytes from workers-0 (172.16.1.100): icmp_seq=1 ttl=64 time=1.21 ms
64 bytes from workers-0 (172.16.1.100): icmp_seq=2 ttl=64 time=0.967 ms
64 bytes from workers-0 (172.16.1.100): icmp_seq=3 ttl=64 time=1.03 ms
64 bytes from workers-0 (172.16.1.100): icmp_seq=4 ttl=64 time=1.06 ms
^C
--- workers-0 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 0.967/1.069/1.212/0.089 ms
[root@master ~]# ping workers-1
PING workers-1 (172.16.1.101) 56(84) bytes of data.
64 bytes from workers-1 (172.16.1.101): icmp_seq=1 ttl=64 time=1.35 ms
64 bytes from workers-1 (172.16.1.101): icmp_seq=2 ttl=64 time=1.06 ms
64 bytes from workers-1 (172.16.1.101): icmp_seq=3 ttl=64 time=0.922 ms
64 bytes from workers-1 (172.16.1.101): icmp_seq=4 ttl=64 time=1.00 ms
^C
--- workers-1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 0.922/1.083/1.351/0.163 ms

Run the Hadoop HDFS:

1.  Login to the master node as root

2.  Switch to the hadoop user

[root@master ~]# su hadoop -
[hadoop@master root]$

3.  Change to the hadoop user’s home directory

[hadoop@master root]$ cd ~

4.  Format the HDFS file system

[hadoop@master ~]$ hdfs namenode -format
15/09/17 18:50:22 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/172.16.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.7.1
...

...
15/09/17 18:50:24 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/09/17 18:50:24 INFO util.ExitUtil: Exiting with status 0
15/09/17 18:50:24 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/172.16.1.1
************************************************************/

5.  Start the HDFS dfs service

[hadoop@master ~]$ start-dfs.sh
Starting namenodes on [master]
master: Warning: Permanently added 'master,172.16.1.1' (ECDSA) to the list of known hosts.
master: starting namenode, logging to /home/hadoop/hadoop-2.7.1/logs/hadoop-hadoop-namenode-master.out
workers-1: Warning: Permanently added 'workers-1,172.16.1.101' (ECDSA) to the list of known hosts.
workers-0: Warning: Permanently added 'workers-0,172.16.1.100' (ECDSA) to the list of known hosts.
workers-1: starting datanode, logging to /home/hadoop/hadoop-2.7.1/logs/hadoop-hadoop-datanode-workers-1.out
workers-0: starting datanode, logging to /home/hadoop/hadoop-2.7.1/logs/hadoop-hadoop-datanode-workers-0.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop-2.7.1/logs/hadoop-hadoop-secondarynamenode-master.out

6.  Start the yarn service

[hadoop@master ~]$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.7.1/logs/yarn-hadoop-resourcemanager-master.out
workers-1: Warning: Permanently added 'workers-1,172.16.1.101' (ECDSA) to the list of known hosts.
workers-0: Warning: Permanently added 'workers-0,172.16.1.100' (ECDSA) to the list of known hosts.
workers-1: starting nodemanager, logging to /home/hadoop/hadoop-2.7.1/logs/yarn-hadoop-nodemanager-workers-1.out
workers-0: starting nodemanager, logging to /home/hadoop/hadoop-2.7.1/logs/yarn-hadoop-nodemanager-workers-0.out

7.   Test to see if the HDFS has started

[hadoop@master ~]$ hdfs dfsadmin -report
Configured Capacity: 14824083456 (13.81 GB)
Present Capacity: 8522616832 (7.94 GB)
DFS Remaining: 8522567680 (7.94 GB)
DFS Used: 49152 (48 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (2):

Name: 172.16.1.101:50010 (workers-1)
Hostname: workers-1
Decommission Status : Normal 
Configured Capacity: 7412041728 (6.90 GB) 
DFS Used: 24576 (24 KB) 
Non DFS Used: 3150721024 (2.93 GB)
DFS Remaining: 4261296128 (3.97 GB)
DFS Used%: 0.00%
DFS Remaining%: 57.49%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Sep 17 18:55:41 UTC 2015

Name: 172.16.1.100:50010 (workers-0)
Hostname: workers-0 
Decommission Status : Normal
Configured Capacity: 7412041728 (6.90 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 3150745600 (2.93 GB)
DFS Remaining: 4261271552 (3.97 GB)
DFS Used%: 0.00%
DFS Remaining%: 57.49%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Sep 17 18:55:41 UTC 2015

Simple HDFS tests:

1.  Create a small test file

[hadoop@master ~]$ echo Hello ExoGENI World > hello.txt

2. Put the test file into the Hadoop filesystem

[hadoop@master ~]$ hdfs dfs -put hello.txt /hello.txt

3. Check for the file’s existence

[hadoop@master ~]$ hdfs dfs -ls /
Found 1 items
-rw-r--r-- 2 hadoop supergroup 20 2015-09-17 19:11 /hello.txt

4. Check the contents of the file

[hadoop@master ~]$ hdfs dfs -cat /hello.txt
Hello ExoGENI World
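
5. (Optional) Remove the test file when you are done.  hdfs dfs -rm is the standard HDFS delete command:

[hadoop@master ~]$ hdfs dfs -rm /hello.txt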

Run the Hadoop Sort Testcase

Test the true power of the Hadoop filesystem by creating and sorting a large random dataset. It may be useful/interesting to login to the master and/or worker VMs and use tools like top, iotop, and iftop to observe the resource utilization on each of the VMs during the sort test. Note: on these VMs iotop and iftop must be run as root.
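
For example, logged in as root on the master or a worker (iotop’s -o flag shows only processes currently doing I/O; iftop’s -i flag selects an interface, eth0 being the dataplane in the ifconfig output above):

[root@master ~]# iotop -o
[root@master ~]# iftop -i eth0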

1.  Create a 1 GB random data set.  After the data is created, use the ls functionality to confirm that the data exists (see the listing example after the job output below).  Note that the data is composed of several files in a directory.

[hadoop@master ~]$ hadoop jar /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar teragen 10000000 /input
15/09/17 19:15:41 INFO client.RMProxy: Connecting to ResourceManager at master/172.16.1.1:8032
15/09/17 19:15:42 INFO terasort.TeraSort: Generating 10000000 using 2
15/09/17 19:15:42 INFO mapreduce.JobSubmitter: number of splits:2
15/09/17 19:15:42 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1442515994234_0001
15/09/17 19:15:43 INFO impl.YarnClientImpl: Submitted application application_1442515994234_0001
15/09/17 19:15:43 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1442515994234_0001/
15/09/17 19:15:43 INFO mapreduce.Job: Running job: job_1442515994234_0001
15/09/17 19:15:53 INFO mapreduce.Job: Job job_1442515994234_0001 running in uber mode : false
15/09/17 19:15:53 INFO mapreduce.Job: map 0% reduce 0%
15/09/17 19:16:05 INFO mapreduce.Job: map 1% reduce 0%
15/09/17 19:16:07 INFO mapreduce.Job: map 2% reduce 0%
15/09/17 19:16:08 INFO mapreduce.Job: map 3% reduce 0%
15/09/17 19:16:13 INFO mapreduce.Job: map 4% reduce 0%
15/09/17 19:16:17 INFO mapreduce.Job: map 5% reduce 0%
15/09/17 19:16:22 INFO mapreduce.Job: map 6% reduce 0%
...
15/09/17 19:23:10 INFO mapreduce.Job: map 96% reduce 0%
15/09/17 19:23:15 INFO mapreduce.Job: map 97% reduce 0%
15/09/17 19:23:19 INFO mapreduce.Job: map 98% reduce 0%
15/09/17 19:23:24 INFO mapreduce.Job: map 99% reduce 0%
15/09/17 19:23:28 INFO mapreduce.Job: map 100% reduce 0%
15/09/17 19:23:35 INFO mapreduce.Job: Job job_1442515994234_0001 completed successfully
15/09/17 19:23:35 INFO mapreduce.Job: Counters: 31
 File System Counters
 FILE: Number of bytes read=0
 FILE: Number of bytes written=229878
 FILE: Number of read operations=0
 FILE: Number of large read operations=0
 FILE: Number of write operations=0
 HDFS: Number of bytes read=167
 HDFS: Number of bytes written=1000000000
 HDFS: Number of read operations=8
 HDFS: Number of large read operations=0
 HDFS: Number of write operations=4
 Job Counters 
 Launched map tasks=2
 Other local map tasks=2
 Total time spent by all maps in occupied slots (ms)=916738
 Total time spent by all reduces in occupied slots (ms)=0
 Total time spent by all map tasks (ms)=916738
 Total vcore-seconds taken by all map tasks=916738
 Total megabyte-seconds taken by all map tasks=938739712
 Map-Reduce Framework
 Map input records=10000000
 Map output records=10000000
 Input split bytes=167
 Spilled Records=0
 Failed Shuffles=0
 Merged Map outputs=0
 GC time elapsed (ms)=1723
 CPU time spent (ms)=21600
 Physical memory (bytes) snapshot=292618240
 Virtual memory (bytes) snapshot=4156821504
 Total committed heap usage (bytes)=97517568
 org.apache.hadoop.examples.terasort.TeraGen$Counters
 CHECKSUM=21472776955442690
 File Input Format Counters 
 Bytes Read=0
 File Output Format Counters 
 Bytes Written=1000000000
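
To confirm that the generated data exists, list the /input directory.  With two map tasks you should see a _SUCCESS marker plus two part files (exact names and sizes may vary):

[hadoop@master ~]$ hdfs dfs -ls /input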

2. Sort the data

[hadoop@master ~]$ hadoop jar /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar terasort /input /output
15/09/17 19:25:25 INFO terasort.TeraSort: starting
15/09/17 19:25:27 INFO input.FileInputFormat: Total input paths to process : 2
Spent 216ms computing base-splits.
Spent 3ms computing TeraScheduler splits.
Computing input splits took 220ms
Sampling 8 splits of 8
Making 1 from 100000 sampled records
Computing parititions took 5347ms
Spent 5569ms computing partitions.
15/09/17 19:25:32 INFO client.RMProxy: Connecting to ResourceManager at master/172.16.1.1:8032
15/09/17 19:25:33 INFO mapreduce.JobSubmitter: number of splits:8
15/09/17 19:25:34 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1442515994234_0002
15/09/17 19:25:34 INFO impl.YarnClientImpl: Submitted application application_1442515994234_0002
15/09/17 19:25:34 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1442515994234_0002/
15/09/17 19:25:34 INFO mapreduce.Job: Running job: job_1442515994234_0002
15/09/17 19:25:42 INFO mapreduce.Job: Job job_1442515994234_0002 running in uber mode : false
15/09/17 19:25:42 INFO mapreduce.Job: map 0% reduce 0%
15/09/17 19:25:57 INFO mapreduce.Job: map 11% reduce 0%
15/09/17 19:26:00 INFO mapreduce.Job: map 17% reduce 0%
15/09/17 19:26:07 INFO mapreduce.Job: map 21% reduce 0%
15/09/17 19:26:10 INFO mapreduce.Job: map 25% reduce 0%
15/09/17 19:26:13 INFO mapreduce.Job: map 34% reduce 0%
15/09/17 19:26:14 INFO mapreduce.Job: map 37% reduce 0%
15/09/17 19:26:15 INFO mapreduce.Job: map 41% reduce 0%
15/09/17 19:26:16 INFO mapreduce.Job: map 44% reduce 0%
15/09/17 19:26:17 INFO mapreduce.Job: map 57% reduce 0%
15/09/17 19:26:18 INFO mapreduce.Job: map 58% reduce 0%
15/09/17 19:26:20 INFO mapreduce.Job: map 62% reduce 0%
15/09/17 19:26:22 INFO mapreduce.Job: map 62% reduce 8%
15/09/17 19:26:26 INFO mapreduce.Job: map 66% reduce 8%
15/09/17 19:26:27 INFO mapreduce.Job: map 67% reduce 8%
15/09/17 19:26:28 INFO mapreduce.Job: map 67% reduce 13%
15/09/17 19:26:35 INFO mapreduce.Job: map 72% reduce 13%
15/09/17 19:26:37 INFO mapreduce.Job: map 73% reduce 13%
15/09/17 19:26:38 INFO mapreduce.Job: map 79% reduce 13%
15/09/17 19:26:42 INFO mapreduce.Job: map 84% reduce 13%
15/09/17 19:26:44 INFO mapreduce.Job: map 84% reduce 17%
15/09/17 19:26:45 INFO mapreduce.Job: map 86% reduce 17%
15/09/17 19:26:48 INFO mapreduce.Job: map 89% reduce 17%
15/09/17 19:26:51 INFO mapreduce.Job: map 94% reduce 17%
15/09/17 19:26:53 INFO mapreduce.Job: map 96% reduce 17%
15/09/17 19:26:54 INFO mapreduce.Job: map 100% reduce 17%
...
15/09/17 19:34:21 INFO mapreduce.Job: map 100% reduce 87%
15/09/17 19:34:24 INFO mapreduce.Job: map 100% reduce 93%
15/09/17 19:34:27 INFO mapreduce.Job: map 100% reduce 99%
15/09/17 19:34:28 INFO mapreduce.Job: map 100% reduce 100%
15/09/17 19:34:28 INFO mapreduce.Job: Job job_1442515994234_0002 completed successfully
15/09/17 19:34:29 INFO mapreduce.Job: Counters: 51
 File System Counters
 FILE: Number of bytes read=2080000144
 FILE: Number of bytes written=3121046999
 FILE: Number of read operations=0
 FILE: Number of large read operations=0
 FILE: Number of write operations=0
 HDFS: Number of bytes read=1000000816
 HDFS: Number of bytes written=1000000000
 HDFS: Number of read operations=27
 HDFS: Number of large read operations=0
 HDFS: Number of write operations=2
 Job Counters 
 Killed map tasks=3
 Launched map tasks=11
 Launched reduce tasks=1
 Data-local map tasks=9
 Rack-local map tasks=2
 Total time spent by all maps in occupied slots (ms)=455604
 Total time spent by all reduces in occupied slots (ms)=495530
 Total time spent by all map tasks (ms)=455604
 Total time spent by all reduce tasks (ms)=495530
 Total vcore-seconds taken by all map tasks=455604
 Total vcore-seconds taken by all reduce tasks=495530
 Total megabyte-seconds taken by all map tasks=466538496
 Total megabyte-seconds taken by all reduce tasks=507422720
 Map-Reduce Framework
 Map input records=10000000
 Map output records=10000000
 Map output bytes=1020000000
 Map output materialized bytes=1040000048
 Input split bytes=816
 Combine input records=0
 Combine output records=0
 Reduce input groups=10000000
 Reduce shuffle bytes=1040000048
 Reduce input records=10000000
 Reduce output records=10000000
 Spilled Records=30000000
 Shuffled Maps =8
 Failed Shuffles=0
 Merged Map outputs=8
 GC time elapsed (ms)=2276
 CPU time spent (ms)=86920
 Physical memory (bytes) snapshot=1792032768
 Virtual memory (bytes) snapshot=18701041664
 Total committed heap usage (bytes)=1277722624
 Shuffle Errors
 BAD_ID=0
 CONNECTION=0
 IO_ERROR=0
 WRONG_LENGTH=0
 WRONG_MAP=0
 WRONG_REDUCE=0
 File Input Format Counters 
 Bytes Read=1000000000
 File Output Format Counters 
 Bytes Written=1000000000
15/09/17 19:34:29 INFO terasort.TeraSort: done

3.  Look at the output

List the output directory:

[hadoop@master ~]$ hdfs dfs -ls /output
Found 3 items
-rw-r--r-- 1 hadoop supergroup 0 2015-09-17 19:34 /output/_SUCCESS
-rw-r--r-- 10 hadoop supergroup 0 2015-09-17 19:25 /output/_partition.lst
-rw-r--r-- 1 hadoop supergroup 1000000000 2015-09-17 19:34 /output/part-r-00000

Get the sorted output file:

[hadoop@master ~]$ hdfs dfs -get /output/part-r-00000 part-r-00000

Look at the output file:

[hadoop@master ~]$ hexdump part-r-00000 | less

Does the output match your expectations?
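
If you would like Hadoop to verify the ordering itself, the same examples jar includes a teravalidate job; /validate below is an arbitrary HDFS path for the validation report:

[hadoop@master ~]$ hadoop jar /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar teravalidate /output /validate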

 

ExoGENI: Getting Started Tutorial

This post describes how to create an account on ExoGENI, install the slice creation tool Flukes, and use Flukes to create simple slices.

Background:

These instructions assume you have been given a GENI account OR have an account with an institution that participates in the InCommon Federation.  Most colleges and universities are in InCommon; check whether your university participates here.

Creating a GENI account:

  1. In a browser, go to the GENI portal (https://portal.geni.net/)
  2. Click the “Use GENI” button
  3. Under “Show a list of organizations” find your organization.   If you have been assigned a temporary account for an organized tutorial, choose “GENI Project Office”
  4. You will be redirected to your organization’s sign-on page (for example, UNC Chapel Hill’s Onyen system).  Login with the username and password you typically use to access services at your organization.  You will now be logged into the GENI Portal.
  5. Click your name in the upper right corner and choose “Profile” from the dropdown menu.
  6. If you are not part of any projects, you should request to join a project given to you by project PI or class instructor (there is a ‘Join Project’ button in the profile). Wait until you are approved.
  7. Click “SSL” in the grey bar.
  8. Click “Create an SSL certificate”.
  9. Click “Generate Combined Certificate and Key File”.
  10. Download your certificate and key. This will create a pem file that is named “geni-<your_login>.pem”. Save it to a known location on your machine (suggestion: ~/.ssl/geni-<your_login>.pem).

At this point you can use your pem file to access ExoGENI.   However, if you would like to use other GENI resources you will need to join a GENI project.

Install/Configure Flukes:

1. Install Java on your system (if it is not there already). Note that we do not recommend using OpenJDK. Instead you should use Oracle JDK 7 or 8 for your platform.

2. Download the Flukes jnlp file:

wget http://geni-images.renci.org/webstart/flukes.jnlp

3. Edit (create) a ~/.flukes.properties file.  Modify the template below to point to your GENI pem file and the SSH keys that you wish to use.

orca.xmlrpc.url=https://geni.renci.org:11443/orca/xmlrpc
user.certfile=/Users/pruth/.ssl/geni-pruth.pem
user.certkeyfile=/Users/pruth/.ssl/geni-pruth.pem
enable.modify=true
ssh.key=~/.ssh/id_dsa
# SSH Public key to install into VM instances
ssh.pubkey=~/.ssh/id_dsa.pub
# Secondary login (works with ssh.other.pubkey)
ssh.other.login=pruth
# Secondary public SSH keys 
ssh.other.pubkey=~/.ssh/id_dsa.pub
# Should the secondary account have sudo privileges
ssh.other.sudo=yes
# Path to XTerm executable on your system
xterm.path=/opt/X11/bin/xterm
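
If you do not already have an SSH key pair at the paths referenced above, you can generate one.  This is a minimal sketch that creates a DSA pair matching the ssh.key/ssh.pubkey entries in the example; substitute your own key type or paths and adjust the properties file accordingly:

ssh-keygen -t dsa -f ~/.ssh/id_dsa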

4. Run the jnlp file by double-clicking it or by using the javaws application on the command line.

javaws /path/to/the/file/flukes.jnlp

At this point the Flukes GUI should be running.  You may need to create security exceptions on your system to allow it to run.  In addition, there may be warnings about certificate authorities; usually clicking “run” or “continue” in the popup windows will work.  If Flukes is not running correctly see https://geni-orca.renci.org/trac/wiki/flukes for more information.

Simple Slice: 

  1. Start Flukes as described above.
  2. Click in the field to place a VM in the request.
  3. Right-click the new node and choose “Edit properties” from the menu.   If the application has not yet contacted the ExoGENI server it may ask for the password for your GENI key.   If you don’t know what this is, it is probably an empty password.
  4. Choose a node type:  (suggestion:  XO Medium)
  5. Choose an image:  (suggestion:  Centos 6.3 v1.0.10)
  6. Choose a domain:  (suggestion:  Leave as “System select”)
  7. Press “Ok”
  8. Name your slice by putting a short string in the text box next to the submit button:  (suggestion: <yourlogin>.test1)
  9. Press the Submit button. After a few seconds a window will pop up showing the resources you have been assigned (Note: this will take longer if you are participating in a large tutorial).   In the example, there is one VM that will be created at Pittsburgh Supercomputing Center (PSC).
  10. Click “ok”
  11. Switch to the “Manifest View” by clicking “Manifest View”.
  12. Click “My Slices”. A window will pop up listing your new slice.
  13. Select the slice and click “Query”.  A window will pop up listing the status of your requested resources.  The status could be:  Ticketed, Active, or Failed.  Ticketed resources are not yet ready.  Active resources are ready to use.   Failed resources experienced a problem that prevented them from becoming Active.
  14. If your resources are Ticketed then click “ok”, wait a couple of minutes, then query for the slice again.  Repeat until the resources are Active.
  15. If your xterm or putty configuration is correct you can login to the VM by right-clicking the VM and choosing “Login to Node”.   Alternatively, you can right-click the node and choose “View Properties” to see the public IP address of the VM.  You can then ssh to the VM using that address and the private key that corresponds to the public key you referenced in your .flukes.properties file.

Dumbbell Slice

  1. Start Flukes
  2. Click the field twice to insert two VMs into the request.
  3. Click and drag a line from one node to the other.  This creates a point-to-point Ethernet link between the nodes.
  4. Edit the properties of each node.  Set:  (node type: “XO medium”,  image: “Centos 6.3 v1.0.10”, domain: “System select”)
  5. In addition, set the Link0 IP address of one VM to 172.16.1.1/24 and the other to 172.16.1.2/24.
  6. Name your slice (suggestion:  <login_name>.dumbbell1)
  7. Submit the request
  8. Query for the manifest. Notice that there are three resources this time (2 VMs and 1 link).
  9. After all resources are Active, login to one of the VMs. Run ifconfig.   Notice that the VM has an extra interface that has been assigned the 172.16.1.x address that you specified in the request.
  10. Try pinging the other VM.

Using Dropbox for ExoGENI Images

This post describes how to create and use custom ExoGENI images stored in Dropbox.   For information about creating custom images refer to previous posts about taking snapshots of existing VMs or creating images from VirtualBox VMs.

Background:

These instructions assume you have created a working Dropbox account.

Images:

These instructions are intended to help you host your own custom ExoGENI images in your own Dropbox account.   However, for the tutorial you may find it useful to use the following image, kernel, and ramdisk that are known to work on ExoGENI.

Pushing the Images to Dropbox:

Copy the image, kernel, and ramdisk files to the public folder of your Dropbox account.   For each of the files right-click the file and choose “Copy public link…“.   Record these links for later.  These are the public html links to your files.

The links will look something like the following.  Note: 12345678 will be a unique number associated with your Dropbox account.

  • Image: https://dl.dropboxusercontent.com/u/12345678/centos6.3-v1.0.11.tgz
  • Kernel: https://dl.dropboxusercontent.com/u/12345678/vmlinuz-2.6.32-431.17.1.el6.x86_64
  • Ramdisk: https://dl.dropboxusercontent.com/u/12345678/initramfs-2.6.32-431.17.1.el6.x86_64.img

If your URLs do not match the previous pattern, you will likely need to move your files to the public folder.

Creating the Metadata file by hand:

Next you need to create the XML metadata file that describes the three files.   The metadata file contains the public URL for each file as well as the SHA1 hash of each file.

The sha1 hash can be found by typing the following on most *nix machines.

> sha1sum centos6.3-v1.0.11.tgz initramfs-2.6.32-431.17.1.el6.x86_64.img vmlinuz-2.6.32-431.17.1.el6.x86_64 
cb899c42394eecc008ada7c9b75456c7d7e1149b  centos6.3-v1.0.11.tgz
fc927a776e819b0951b5e8daf81f6991128e9abf  initramfs-2.6.32-431.17.1.el6.x86_64.img
726abdfd57dbe0ca079f3b38f8cce8b9f2323efa  vmlinuz-2.6.32-431.17.1.el6.x86_64

At this point you can create the XML metadata file to look like the following. Note: For the purposes of the tutorial we will name our metadata file centos6.3-v1.0.11.xml.

<images>
   <image>
      <type>ZFILESYSTEM</type>
      <signature>cb899c42394eecc008ada7c9b75456c7d7e1149b</signature>
      <url>https://dl.dropboxusercontent.com/u/12345678/centos6.3-v1.0.11.tgz</url>
   </image>
   <image>
      <type>KERNEL</type>
      <signature>726abdfd57dbe0ca079f3b38f8cce8b9f2323efa</signature>
      <url>https://dl.dropboxusercontent.com/u/12345678/vmlinuz-2.6.32-431.17.1.el6.x86_64</url>
   </image>
   <image>
      <type>RAMDISK</type>
      <signature>fc927a776e819b0951b5e8daf81f6991128e9abf</signature>
      <url>https://dl.dropboxusercontent.com/u/12345678/initramfs-2.6.32-431.17.1.el6.x86_64.img</url>
   </image>
</images>

Creating the Metadata file Automatically 

If you want to create the metadata file automatically you can use this script.

The script takes as parameters the three public URLs to the image files.   It will then download each file, compute its sha1sum, and create the metadata file.   Note that this may take some time because the script must download the image file, which could be very large.

> ./create-image-metadata.sh -z https://dl.dropboxusercontent.com/u/12345678/centos6.3-v1.0.11.tgz -r https://dl.dropboxusercontent.com/u/12345678/initramfs-2.6.32-431.17.1.el6.x86_64.img -k https://dl.dropboxusercontent.com/u/12345678/vmlinuz-2.6.32-431.17.1.el6.x86_64 -n centos6.3-v1.0.11.xml
getting https://dl.dropboxusercontent.com/u/12345678/centos6.3-v1.0.11.tgz
 % Total % Received % Xferd Average Speed Time Time Time Current
 Dload Upload Total Spent Left Speed
100 345M 100 345M 0 0 23.9M 0 0:00:14 0:00:14 --:--:-- 29.8M
getting https://dl.dropboxusercontent.com/u/12345678/vmlinuz-2.6.32-431.17.1.el6.x86_64
 % Total % Received % Xferd Average Speed Time Time Time Current
 Dload Upload Total Spent Left Speed
100 4033k 100 4033k 0 0 2628k 0 0:00:01 0:00:01 --:--:-- 3168k
getting https://dl.dropboxusercontent.com/u/12345678/initramfs-2.6.32-431.17.1.el6.x86_64.img
 % Total % Received % Xferd Average Speed Time Time Time Current
 Dload Upload Total Spent Left Speed
100 12.0M 100 12.0M 0 0 4470k 0 0:00:02 0:00:02 --:--:-- 4946k
Creating XML image descriptor file centos6.3-v1.0.11.xml
Metadata:
<images> 
 <image>
 <type>ZFILESYSTEM</type>
 <signature>cb899c42394eecc008ada7c9b75456c7d7e1149b</signature>
 <url>https://dl.dropboxusercontent.com/u/12345678/centos6.3-v1.0.11.tgz</url>
 </image>
 <image>
 <type>KERNEL</type> 
 <signature>726abdfd57dbe0ca079f3b38f8cce8b9f2323efa</signature>
 <url>https://dl.dropboxusercontent.com/u/12345678/vmlinuz-2.6.32-431.17.1.el6.x86_64</url>
 </image>
 <image>
 <type>RAMDISK</type>
 <signature>fc927a776e819b0951b5e8daf81f6991128e9abf</signature>
 <url>https://dl.dropboxusercontent.com/u/12345678/initramfs-2.6.32-431.17.1.el6.x86_64.img</url>
 </image>
</images>
XML image descriptor file SHA1 hash is: d964e8f63c8e48c419a9eb1db50fb657eb19b468
XML image descriptor file is: centos6.3-v1.0.11.xml

Push the Metadata File to Dropbox

Find the sha1 hash of the metadata file (or look at the stdout from the create-image-metadata.sh script):

> sha1sum centos6.3-v1.0.11.xml 
d964e8f63c8e48c419a9eb1db50fb657eb19b468  centos6.3-v1.0.11.xml

Copy the metadata file to the public folder of your Dropbox account.   Right-click the file to get its public Dropbox url.

Using the image

You can now use the URL and hash of the metadata file in your ExoGENI requests.
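
For example, an RSpec disk_image element takes the metadata file’s public URL as its name and the SHA1 hash as its version (the same pattern appears in the iSCSI storage post later in this document).  The URL and hash below are the tutorial values; replace them with your own:

<disk_image name="https://dl.dropboxusercontent.com/u/12345678/centos6.3-v1.0.11.xml" version="d964e8f63c8e48c419a9eb1db50fb657eb19b468"/>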

Creating Custom ExoGENI Images using Virtual Box

This post describes how to create a custom ExoGENI image using Virtual Box.  Note that these instructions can be used to create custom ExoGENI images from many other types of machines (virtual and otherwise).

Background:

We will be using the ExoGENI image snapshot mechanism described in a previous post. These instructions assume you have read the previous post and are familiar with the procedure it describes.

Creating a Virtual Box VM:

Virtual Box is a free, open-source application for running virtual machines.  Typically Virtual Box is used to run virtual machines on a desktop or laptop.

1.  Download the Virtual Box software.  Follow the instructions for installing Virtual Box on your machine.

2.  Download the installation ISO for your favorite Linux distribution (OR obtain a Virtual Box compatible image).

3.  Use Virtual Box to create a new virtual machine from your Linux ISO (OR import your existing image into Virtual Box)

Creating the ExoGENI image:

1.  Install the NEuca tools. The tools are available as an .rpm, .deb, or source.

Prepare the VM for the NEuca guest tools.  The NEuca tools depend on several Python packages and the iSCSI tools, which will need to be installed before the guest tools.  Also, you will need to install ssh to be able to access the VM after it boots on ExoGENI.

Debian/Ubuntu:

> sudo apt-get install python python-boto python-daemon python-ipaddr python-netaddr open-iscsi python-lockfile ssh

Get and install the NEuca guest tools:

> wget http://software.exogeni.net/repo/exogeni/neuca-guest-tools/neuca-guest-tools_1.4-1_all.deb
> dpkg -i neuca-guest-tools_1.4-1_all.deb

Centos:

> rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
> yum install python python-devel python-boto python-daemon python-ipaddr python-netaddr

Get and install the NEuca guest tools:

> wget http://software.exogeni.net/repo/exogeni/neuca-guest-tools/neuca-guest-tools-1.4-1.noarch.rpm
> rpm -ivh neuca-guest-tools-1.4-1.noarch.rpm

2.  The remainder of the instructions are the same as capturing an image from the previous post (starting at step “3. Get the snapshot script”).

Things that might go wrong:

1.   You need to allow root to login with keys.

Add to or modify /etc/ssh/sshd_config to include the line:

PermitRootLogin without-password

And you must disable SELinux.  Modify /etc/selinux/config to include the line:

SELINUX=disabled
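
After rebooting the VM you can confirm the change with the standard sestatus utility, which should report:

> sestatus
SELinux status: disabled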

Example: creating slivers of iSCSI Storage

This post introduces new iSCSI storage capabilities available to ExoGENI slices.   Many of the ExoGENI racks include a sliverable iSCSI device. Users can now allocate iSCSI targets that are made available to their compute nodes.  Optionally, NEuca tools within the compute nodes can be configured to attach to the target, format the filesystem, and mount the filesystem.

In Flukes, there is a new resource type.  In the drop-down list of resources choose “Storage”.  Add a storage node to your slice request.  The storage node is represented by a green cylinder.   After adding a storage node, add a compute node and drag a link between the storage node and the compute node. The link represents the network over which the compute node will communicate with the storage target.

Your request should now show the compute node and the storage node connected by a link.

Right-click the storage node to view and set its properties.

Set the domain to the same domain as the compute node.  For now, compute nodes can only attach to storage targets in the same rack.  Choose the capacity of the storage node in gigabytes.  Many racks are limited to 5 TB of space shared among all users; note also that large capacities can take a very long time to instantiate.   In addition, set any formatting parameters and set the mount point.

After submitting the slice and waiting for it to instantiate, the virtual machine in this example will have a 10 GB iSCSI storage device mounted at /mnt/target.
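
Once the slice is active, a quick way to confirm that the target was attached, formatted, and mounted is to log in to the VM and check the mount point (here /mnt/target, matching this example):

> df -h /mnt/target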

About RSpecs

Here is an example of how to create and attach storage to a node using RSpec:

<node client_id="master" component_manager_id="urn:publicid:IDN+exogeni.net:ucdvmsite+authority+am" exclusive="false">
  <sliver_type name="XOXlarge">
  <disk_image name="http://geni-images.renci.org/images/pruth/Shakedown/adcirc/adcirc.condor.v0.6/adcirc.condor.v0.6a.data.xml" version="03b5b1392f46a005fbe256e4cba8c90cfd6a5ab8"/>
  </sliver_type>
  <interface client_id="master:stor0">
  </interface>
</node>
<node client_id="master-stor" component_manager_id="urn:publicid:IDN+exogeni.net:ucdvmsite+authority+am" exclusive="false">
  <sliver_type name="storage">
  <storage:storage resource_type="LUN" do_format="true" fs_param="-F -b 1024" fs_type="ext4" mnt_point="/mnt" capacity="942"/>
  </sliver_type>
  <interface client_id="masterstor:if0"/>
</node>
<link client_id="storage-lan">
  <interface_ref client_id="masterstor:if0"/>
  <interface_ref client_id="master:stor0"/>
</link>

Example: Modifying a Slice (add/remove nodes)

One of the goals of ExoGENI is to enable dynamically modifiable slices.  The subject of this post is how to add and remove nodes from a group in an existing ExoGENI slice.

The long term goals for modifying slices include the ability to:

  • Add/remove compute slivers
  • Modify compute sliver attributes (memory, disk, cpu, etc.)
  • Add/remove portions of topology (add/remove links, networks, interfaces)
  • Modify topology sliver performance (link bandwidth and latency)
  • More that we have not thought of yet.  Suggestions are welcome.

The currently deployed software allows the user to add and remove compute slivers provided the new slivers are members of an existing group of nodes.  The group requirement ensures that the modification will not require topology modification, which is not yet supported.

Example

This example could be used as the base for a virtual computational cluster with a single submit node and a group of compute nodes.   The number of nodes in the group can be modified to fit the current requirements of the applications running in the slice.  Start by using Flukes to create a request that includes an individual node and a group of nodes attached with a network link.

Make sure the group settings include multiple nodes.  In this case, the group begins with four nodes.

Submit your request and wait for ExoGENI to instantiate the topology.  When it is complete you should see a manifest similar to the following.   Note that in this case the single node is on a different site than the group.   This is not a requirement but does make it easier to visually identify the group from the individual.

Now let’s add some virtual machines to the group.  Right-click any node that is part of the group and select “Increase node group size…”.    You will be presented with a dialogue box that expects you to enter the number of nodes you wish to create.  Let’s enter “4” to create four more nodes.

After you enter the number of nodes you wish to add, Flukes needs to be instructed to submit the modify request.  Do this by clicking the “Commit Modify Actions” button near the top of the Manifest View.   Flukes will present you with a text box containing the raw NDL for the modify request.   Confirm the request by clicking “OK”.

Now your request has been submitted.  To view your modified slice click “Query for Manifest”.  You should now see the modified slice containing eight nodes in the group.

Deleting individual nodes is accomplished through a similar process.  Choose the node in the group that you wish to delete.  Right-click that node and choose “Delete <node name>”.   Confirm that you wish to delete that node.   Then commit the modify request and query for the updated manifest.

Tips

  • Multiple modify actions can be composed into a single modify request.   Try adding and deleting nodes before committing the modify action.     In addition, you may wish to add and/or delete nodes from multiple groups within one slice with one modify request.
  • Don’t worry if you make a mistake and add the wrong number of nodes to a group or delete the wrong node.  Anytime before you commit the modify action you can click “Clear Modify Actions”.   This clears ALL the modify actions that you have worked on since the last time you committed a modify action.
  • If your group has an ExoGENI-defined dataplane with IP addresses, make sure that the IP address space is large enough for the largest size you expect your group to reach.  In addition, make sure that any other nodes in that subnet have IP addresses that will not conflict with any new nodes added to the group.  Note: Flukes’ “Auto IP” feature often uses small subnets that may not be large enough for your dynamic group.  A worked example follows this list.
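
As a worked example, for a group numbered out of a /24 dataplane subnet:

172.16.1.0/24 -> usable host addresses 172.16.1.1 through 172.16.1.254 (254 total)
group starting at 172.16.1.100 -> maximum group size: 254 - 100 + 1 = 155 nodes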

 

 

 

Example: Postboot Scripts for Creating Hostnames and Name Resolution

This post walks through an example slice that uses postboot scripts to assign hostnames to each VM and to add entries into /etc/hosts so that the VMs can address each other using the hostnames.   This example is useful in its own right but may also serve as an introduction to  postboot scripts.

Setting up the Request:

Start by creating the following topology in Flukes:

Configure the slice with some basic requirements by selecting your favorite node type, image, and domain for Node0 and NodeGroup0.  Select the number of VMs in the group for NodeGroup0.  These settings can be chosen by right-clicking the node or node group and choosing “Edit Properties”.

Assign IP addresses to all VMs using the “Auto IP” button.

At this point you should be able to submit your slice to ExoGENI and it will be instantiated.  However, you can only access the VMs on the dataplane (“Link0”) using IP addresses.  For many applications this is sufficient; however, some applications use hostnames and assume the existence of DNS to resolve them.   In addition, using “Auto IP” can make it difficult to find the IPs that ExoGENI has chosen (especially for large slices).

Modifying the Postboot Script:

It is desirable to programmatically assign hostnames to VMs and configure the slice to resolve the names to the appropriate IPs.  We can accomplish this using postboot scripts.  Hostnames can be based on the node or group name and the index within the group.  Name resolution can be achieved by inserting the hostnames and IPs into /etc/hosts in each VM.   Since the entire slice is maintained by the user through ExoGENI it does not need a dynamic name resolution system like DNS.

Edit the Postboot Scripts for Node0 and NodeGroup0 (right-click the node or node group -> edit properties -> click “PostBoot Script”).  In the text box add the following script:

#!/bin/bash

echo $Node0.IP("Link0") Node0 >> /etc/hosts
#set ( $max = $NodeGroup0.size() - 1 )
#foreach ( $i in [0..$max] )
 echo $NodeGroup0.get($i).IP("Link0") `echo $NodeGroup0.get($i).Name() | sed 's/\//-/g'` >> /etc/hosts
#end

echo `echo $self.Name() | sed 's/\//-/g'` > /etc/hostname
/bin/hostname -F /etc/hostname

The script is a Velocity template that will be processed by ExoGENI for each VM and executed at startup.   Submit the request to ExoGENI.   The resulting slice should have assigned hostnames to each VM and have modified /etc/hosts on each VM.

For example, the script that was executed on Node0 looks like this:

root@Node0:~# neuca-user-script 
#!/bin/bash
echo 172.16.100.1 Node0 >> /etc/hosts
echo 172.16.100.2 `echo NodeGroup0/0 | sed 's/\//-/g'` >> /etc/hosts
echo 172.16.100.3 `echo NodeGroup0/1 | sed 's/\//-/g'` >> /etc/hosts
echo 172.16.100.4 `echo NodeGroup0/2 | sed 's/\//-/g'` >> /etc/hosts
echo 172.16.100.5 `echo NodeGroup0/3 | sed 's/\//-/g'` >> /etc/hosts
echo `echo Node0 | sed 's/\//-/g'` > /etc/hostname
/bin/hostname -F /etc/hostname

The resulting /etc/hosts file on Node0 looks like this:

root@Node0:~# cat /etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.16.100.1 Node0
172.16.100.2 NodeGroup0-0
172.16.100.3 NodeGroup0-1
172.16.100.4 NodeGroup0-2
172.16.100.5 NodeGroup0-3

This configuration allows for the addressing of other VMs by hostnames like:

root@Node0:~# ping -c 4 NodeGroup0-0
PING NodeGroup0-0 (172.16.100.2) 56(84) bytes of data.
64 bytes from NodeGroup0-0 (172.16.100.2): icmp_req=1 ttl=64 time=4.73 ms
64 bytes from NodeGroup0-0 (172.16.100.2): icmp_req=2 ttl=64 time=0.593 ms
64 bytes from NodeGroup0-0 (172.16.100.2): icmp_req=3 ttl=64 time=0.529 ms
64 bytes from NodeGroup0-0 (172.16.100.2): icmp_req=4 ttl=64 time=0.317 ms
--- NodeGroup0-0 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.317/1.543/4.736/1.846 ms

Postboot Script Explanation:

1.  Specify the interpreter

#!/bin/bash

2.  Add Node0’s IP and name to /etc/hosts.  The Velocity replacement refers to Node0 as “$Node0”.   The function IP(“Link0”) is called on Node0 to get the IP address of the interface connected to “Link0”.

echo $Node0.IP("Link0") Node0 >> /etc/hosts

After template expansion the line looks like:

echo 172.16.100.1 Node0 >> /etc/hosts

3.  Add NodeGroup0’s IPs to /etc/hosts.  This requires an entry for each VM in the group, but we do not know the number of VMs a priori.  A Velocity object for each VM in the group is accessed using an index.  We use Velocity instructions to set a variable (“$max”) to the highest index and then loop through the group, adding a line to the script for each VM.  Each VM in the group is accessed by the function get(<index>).  The IP is accessed in the same way as for the stand-alone VM by using IP(“Link0”).  In this case we are also using the name of the VM via the Name() function.

#set ( $max = $NodeGroup0.size() - 1 )
#foreach ( $i in [0..$max] )
 echo $NodeGroup0.get($i).IP("Link0") `echo $NodeGroup0.get($i).Name() | sed 's/\//-/g'` >> /etc/hosts
#end

After template expansion the lines look like:

echo 172.16.100.2 `echo NodeGroup0/0 | sed 's/\//-/g'` >> /etc/hosts
echo 172.16.100.3 `echo NodeGroup0/1 | sed 's/\//-/g'` >> /etc/hosts
echo 172.16.100.4 `echo NodeGroup0/2 | sed 's/\//-/g'` >> /etc/hosts
echo 172.16.100.5 `echo NodeGroup0/3 | sed 's/\//-/g'` >> /etc/hosts

Note that the ExoGENI names of the VMs in the group contain a “/” which is not allowed in a host name.  We pipe the name to sed in order to replace the “/” with a “-“.
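
You can see the substitution in isolation by running the pipeline on its own:

root@Node0:~# echo NodeGroup0/0 | sed 's/\//-/g'
NodeGroup0-0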

4. Add the name of the VM to the /etc/hostname file and reset the hostname.   The variable “$self” refers to the current VM.

echo `echo $self.Name() | sed 's/\//-/g'` > /etc/hostname
/bin/hostname -F /etc/hostname