MS-20775 Performing Data Engineering on Microsoft HD Insight

Key training topics according to the CBSG Polska trainer:

  • introduction to HDInsight
  • deploying HDInsight clusters
  • granting access to resources
  • ETL
  • data analysis with Spark SQL
  • real-time data processing

Przemysław Rosłon, Microsoft Certified Trainer

Professional certifications

The course covers topics required on the following path:

  • MCSA: Data Engineering with Azure

Learn more about Microsoft certifications

Training level according to Microsoft standards

100 (basic), 200 (intermediate), 300 (advanced), 400 (expert)

Duration

  • 40 teaching hours (5 days)

Course outline

  • Module 1: Getting Started with HDInsight
    What is Big Data?
    Introduction to Hadoop
    Working with MapReduce Function
    Introducing HDInsight
    Lab: Working with HDInsight

  • Module 2: Deploying HDInsight Clusters
    Identifying HDInsight cluster types
    Managing HDInsight clusters by using the Azure portal
    Managing HDInsight Clusters by using Azure PowerShell
    Lab: Managing HDInsight clusters with the Azure Portal

  • Module 3: Authorizing Users to Access Resources
    Non-domain-joined clusters
    Configuring domain-joined HDInsight clusters
    Manage domain-joined HDInsight clusters
    Lab: Authorizing Users to Access Resources

  • Module 4: Loading data into HDInsight
    Storing data for HDInsight processing
    Using data loading tools
    Maximising value from stored data
    Lab: Loading Data into your Azure account

  • Module 5: Troubleshooting HDInsight
    Analyze HDInsight logs
    YARN logs
    Heap dumps
    Operations Management Suite
    Lab: Troubleshooting HDInsight

  • Module 6: Implementing Batch Solutions
    Apache Hive storage
    HDInsight data queries using Hive and Pig
    Operationalize HDInsight
    Lab: Implement Batch Solutions

  • Module 7: Design Batch ETL solutions for big data with Spark
    What is Spark?
    ETL with Spark
    Spark performance
    Lab: Design Batch ETL solutions for big data with Spark

  • Module 8: Analyze Data with Spark SQL
    Implementing iterative and interactive queries
    Perform exploratory data analysis
    Lab: Performing exploratory data analysis by using iterative and interactive queries

  • Module 9: Analyze Data with Hive and Phoenix
    Implement interactive queries for big data with Interactive Hive
    Perform exploratory data analysis by using Hive
    Perform interactive processing by using Apache Phoenix
    Lab: Analyze data with Hive and Phoenix

  • Module 10: Stream Analytics
    Stream Analytics
    Process streaming data from Stream Analytics
    Managing Stream Analytics jobs
    Lab: Implement Stream Analytics

  • Module 11: Implementing Streaming Solutions with Kafka and HBase
    Building and Deploying a Kafka Cluster
    Publishing, Consuming, and Processing Data Using the Kafka Cluster
    Using HBase to Store and Query Data
    Lab: Implementing Streaming Solutions with Kafka and HBase

  • Module 12: Develop big data real-time processing solutions with Apache Storm
    Persist long term data
    Stream data with Storm
    Create Storm topologies
    Configure Apache Storm
    Lab: Developing big data real-time processing solutions with Apache Storm

  • Module 13: Create Spark Streaming Applications
    Working with Spark Streaming
    Creating Spark Structured Streaming Applications
    Persistence and Visualization
    Lab: Building a Spark Streaming Application

Special price: 2299 zł net

All available course dates:

  • 18.06.2018 - 22.06.2018
  • 27.08.2018 - 31.08.2018
  • 01.10.2018 - 05.10.2018
  • 12.11.2018 - 16.11.2018
  • 17.12.2018 - 21.12.2018