Introduction to Hadoop Administration (TTDS6503)
5122
Introduction to Hadoop Administration (TTDS6503)
Live Virtual
Private/On Site
Apache Hadoop is an open source framework for creating reliable and distributable compute clusters. Credited with the IBM Watson Jeopardy win in 2011, Hadoop can be used (with other related frameworks) to process large unstructured or semi-structured data sets from multiple sources to dissect, classify, learn from, and make suggestions for business analytics, decision support, and other advanced forms of machine intelligence. This introductory-level, hands-on lab-intensive course is geared for the administrator who is new to Hadoop and responsible for maintaining a Hadoop cluster and its related components. Hadoop is a system designed for massive scalability; its extremely fault-tolerant compared to other cluster architectures. As administrators, you will need to install, configure, and maintain Hadoop on Linux in various compute environments. This course agenda may be easily customized for addressing areas of specific interest to your team. There are lab variations that support Cloudera and Hortonworks distributions as well.
Administrators who need to maintain a Hadoop cluster and its related components in a Linux environment
1. Hadoop Overview
2. Hadoop Architecture
3. Installing Hadoop
4. Test-Running Hadoop Programs
5. Cloud Installations
6. Optimization and Tuning
7. Installing HBase
8. Previewing Hadoop 3
Questions?
Whether you need assistance scheduling a class for yourself or for your group, GCA's Education Account Manager's will craft a customized training solution to meet the needs of your organization.