Data Analysis With Hadoop And Spark

Host Site:

Texas Advanced Computing Center

Host site URL:

https://www.tacc.utexas.edu/

The goal of this training workshop is to provide an introduction to, and guidance on, developing and running applications on Hadoop and Spark clusters. This course is offered to in-person attendees on UT’s main campus in POB room 2.402 and to remote attendees via webcast. Due to support considerations, access to Wrangler for hands-on exercises will be restricted to confirmed in-person attendees.

The workshop has four sessions. The first session introduces the MapReduce programming model and shows how to develop Java applications with the MapReduce library for use on a Hadoop cluster. The second session demonstrates how to run Hadoop applications, how to use the Hadoop Streaming interface to work with other programming languages, and how to use other Hadoop-based libraries. The third session introduces developing Spark applications with Java. The fourth session demonstrates different ways to use Spark clusters, including running Spark applications with spark-shell and other packages.
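For readers new to the MapReduce programming model covered in the first session, the following is a minimal sketch of its three phases (map, shuffle, reduce) applied to word counting. It is written in plain Java and deliberately uses no Hadoop classes, so it is not the workshop's exercise code and does not show the Hadoop MapReduce library API. The class name `WordCountSketch` and all method names are hypothetical, chosen only for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a single-process imitation of the MapReduce phases.
// A real Hadoop job would distribute these phases across a cluster.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key (sorted by key here).
    public static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the list of values for one key into a single count.
    public static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: run map over every line, shuffle the pairs, then reduce each group.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(emitted).entrySet()) {
            counts.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hadoop and spark", "spark and more spark");
        System.out.println(wordCount(lines)); // prints {and=2, hadoop=1, more=1, spark=3}
    }
}
```

The point of the split is that `map` and `reduce` each look at only one line or one key at a time, which is what allows a framework to run many copies of them in parallel on different machines.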

This training will primarily use the Java programming language. Participants are advised to have prior knowledge of and experience with Java application development. During the training sessions, in-person participants will have opportunities to practice with prepared exercises and examples on the Wrangler cluster. In-person attendees who would like to participate in class exercises should also have basic knowledge of working with the Wrangler cluster, or should review our previous training materials on this topic before the workshop starts: https://portal.tacc.utexas.edu/training#/session/18.

Preliminary Agenda

1:00-1:50 Introduction to Java programming with Hadoop
1:50-2:00 Break
2:00-2:50 Running applications and other ways to use Hadoop clusters
2:50-3:00 Break
3:00-3:50 Developing Java applications with Spark
3:50-4:00 Break
4:00-5:00 Running Spark applications and use of Spark packages

More information: https://www.tacc.utexas.edu/

Sessions:

In person (University of Texas at Austin)

04/22/2016 13:00 - 04/22/2016 17:00 CDT (SESSION HAS ENDED)
Registration CLOSED
Registration open date: 03/22/2016 09:00 CDT
Registration close date: 04/20/2016 12:00 CDT
Class size restriction: 30 registrants (22 spots left)
Waitlist: 0 registrants

Contact Information
Contact: Jason Allison
Contact phone: 512-475-9238
Contact email: jasona@tacc.utexas.edu

Location
Name: University of Texas at Austin
Address: POB 2.402, 201 E. 24th Street, Austin, TX 78712
Phone: 512-475-9238

Webcast

04/22/2016 13:00 - 04/22/2016 17:00 CDT (SESSION HAS ENDED)
Registration CLOSED
Registration open date: 03/22/2016 09:00 CDT
Registration close date: 04/20/2016 12:00 CDT
Class size restriction: 30 registrants (-3 spots left)
Waitlist: 54 registrants

Contact Information
Contact: Jason Allison
Contact phone: 512-475-9238
Contact email: jasona@tacc.utexas.edu

Location
Name: University of Texas at Austin
Phone: 512-475-9238
URL: https://www.tacc.utexas.edu
Posted: 03/22/2016 18:23 UTC