Introduction to using Hadoop on Gordon - In Person at SDSC

Host Site:

San Diego Supercomputer Center

Host site URL:

http://www.sdsc.edu

This workshop introduces participants to Hadoop and its use in scientific and data-intensive computing. It explains why computational and data-intensive investigators might want to learn more about Hadoop and how it works on Gordon. The introduction is geared toward researchers seeking to use Hadoop on Gordon, XSEDE’s data-intensive cluster at SDSC. During the 2-hour workshop, participants will be introduced to the options available for running Hadoop within Gordon’s normal production environment. The configuration uses the SSD storage on each compute node (available via iSER) to construct the Hadoop Distributed File System (HDFS) and the IPoIB interface for network communication.

Agenda

9AM – 9:45AM (PT) Overview

  • Gordon Architecture
  • Details of typical Hadoop configuration
  • Gordon-specific Hadoop options

9:45AM – 10:45AM (PT) Hands-on examples

  • Interactive setup of a Hadoop cluster using iSER scratch and IPoIB
  • HDFS setup, simple operations, performance benchmarking
  • TeraSort example – used to illustrate configuration options
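For readers unfamiliar with the TeraSort benchmark named above, the following is a minimal sketch of how its three stages (teragen, terasort, teravalidate) are typically invoked. It is not taken from the workshop materials. The jar name `hadoop-examples.jar` and the HDFS directory `/benchmarks/terasort` are assumptions chosen for illustration; the actual jar name depends on the Hadoop version installed. The helper only builds the command lines and does not run them.

```python
ROW_BYTES = 100  # TeraGen writes fixed-size rows of 100 bytes each


def teragen_rows(target_bytes: int) -> int:
    """Return the number of 100-byte rows needed to reach at least target_bytes."""
    return -(-target_bytes // ROW_BYTES)  # ceiling division


def terasort_commands(target_bytes: int,
                      examples_jar: str = "hadoop-examples.jar",   # assumed jar name
                      base_dir: str = "/benchmarks/terasort"):     # assumed HDFS path
    """Build the generate, sort, and validate command lines as argument lists."""
    rows = teragen_rows(target_bytes)
    gen_dir = f"{base_dir}/input"
    sort_dir = f"{base_dir}/output"
    report_dir = f"{base_dir}/report"
    return [
        # Stage 1: generate the requested number of random rows in HDFS.
        ["hadoop", "jar", examples_jar, "teragen", str(rows), gen_dir],
        # Stage 2: sort the generated rows.
        ["hadoop", "jar", examples_jar, "terasort", gen_dir, sort_dir],
        # Stage 3: check that the sorted output is globally ordered.
        ["hadoop", "jar", examples_jar, "teravalidate", sort_dir, report_dir],
    ]


if __name__ == "__main__":
    # Print the commands for a 1 GB (10**9 byte) data set.
    for cmd in terasort_commands(10**9):
        print(" ".join(cmd))
```

Because each row is 100 bytes, the data set size is controlled entirely by the row count passed to teragen, which makes it easy to scale the benchmark when comparing configuration options.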

10:45AM – 11:00AM (PT) Questions and Discussion

More information: http://www.sdsc.edu/us/resources/gordon/gordon_hadoop.html

Sessions:

In person (San Diego Supercomputer Center)

01/31/2013 09:00 - 01/31/2013 11:00 PST (SESSION HAS ENDED)
Registration CLOSED
Registration open date
12/27/2012 09:00 PST
Registration close date
01/30/2013 14:56 PST
Class size restriction
25 registrants

(5 spots left)

Waitlist

0 registrants

Contact Information
Contact
Susan Rathbun
Contact phone
858-534-8321
Contact email
susan@sdsc.edu
Location
Name
San Diego Supercomputer Center
Address
Synthesis Center - Room B143
10100 Hopkins Dr., UC San Diego
La Jolla, CA 92093-0505
Phone
858-534-8321
URL
www.sdsc.edu
Posted: 12/27/2012 01:42 UTC