XSEDE15 has ended
Monday, July 27 • 8:00am - 12:00pm
Tutorial: Spark: Big Data processing framework

This tutorial, suitable for attendees with basic knowledge of data analysis, will provide a comprehensive overview of Spark, a framework for big data analysis. Participants will gain insight into a large-scale data analysis tool, along with hands-on experience with real-world use cases on XSEDE resources.
Data is growing across many sectors, including scientific domains, business, and industry. Many data processing frameworks are available for Big Data analysis. Among these, Hadoop is the most popular, but it is not designed for real-time, interactive data processing. Spark is a next-generation data analytics framework that overcomes these drawbacks. It also provides in-memory computing, which can be orders of magnitude faster than Hadoop. This tutorial will introduce Spark and its libraries: Spark SQL, Spark Streaming, MLlib (machine learning), and GraphX (graph processing). It will also cover deployment of Spark on XSEDE resources, case studies, and hands-on exercises. By the end of the tutorial, attendees will have practical knowledge of Spark and its libraries that they can apply to their own domains.

Monday July 27, 2015 8:00am - 12:00pm CDT
Majestic D
