Data Analysis With Hadoop And Spark

Host site: Texas Advanced Computing Center

Host site URL: https://tacc.utexas.edu

The goal of this training workshop is to provide an introduction to, and guidance on, developing and running applications on Hadoop and Spark clusters. This course is being offered to in-person attendees at the Texas Advanced Computing Center and to remote attendees via webcast. In-person registration is available at: https://portal.tacc.utexas.edu/training#/

Due to support considerations, access to Wrangler for hands-on exercises will be restricted to confirmed in-person attendees.

There are four sessions in this training workshop. The first session introduces the MapReduce programming model and shows how to develop Java applications with the MapReduce library for use on a Hadoop cluster. The second session demonstrates how to run Hadoop applications, use the Hadoop Streaming interface with other programming languages, and use other Hadoop-based libraries. The third session introduces developing Spark applications with Java. The fourth session demonstrates different ways to use Spark clusters, including running Spark applications with spark-shell and other packages.
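
For readers new to the MapReduce programming model mentioned above, the following is a minimal sketch of its three phases (map, shuffle, reduce) applied to word counting. It is written in plain Java and deliberately uses no Hadoop classes, so it is not the workshop's exercise code. The class name WordCountSketch and its method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the MapReduce word-count pattern.
// Assumption: this is a single-process sketch; no Hadoop library is involved.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key (sorted by key).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for a single key.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run map over every line, shuffle the pairs, then reduce each group.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {be=2, do=1, not=1, or=1, to=3}
        System.out.println(countWords(List.of("to be or not to be", "to do")));
    }
}
```

On a real cluster the map and reduce steps run in parallel on different nodes and the framework performs the grouping between them; the sketch only shows how data flows through the three phases.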

This training will primarily use the Java programming language. Participants are advised to have prior knowledge of and experience with Java application development. During the training sessions, in-person participants will have opportunities to practice with prepared exercises and examples on the Wrangler cluster. In-person attendees who would like to participate in class exercises should also have basic knowledge of working with the Wrangler cluster, or should review our previous training materials on this topic before the workshop starts.

Preliminary Agenda

- Introduction to Java programming with Hadoop
- Running applications and other ways to use Hadoop clusters
- Developing Java applications with Spark
- Running Spark applications and use of Spark packages

More information: https://portal.tacc.utexas.edu/training#/

Sessions:

Webcast

10/14/2016 09:00 - 10/14/2016 16:00 CDT (SESSION HAS ENDED)
Registration CLOSED
Registration open date: 09/16/2016 15:00 CDT
Registration close date: 10/11/2016 17:00 CDT
Class size restriction: 30 registrants (0 spots left)
Waitlist: 18 registrants

Contact Information

Contact: Jason Allison
Contact phone: 5124759238
Contact email: jasona@tacc.utexas.edu

Location

Name: Texas Advanced Computing Center
Phone: 5124759238
URL: https://portal.tacc.utexas.edu/training

Posted: 09/16/2016 19:58 UTC