Course Outline
Introduction
- Apache Beam vs. MapReduce, Spark Streaming, Kafka Streams, Storm, and Flink
Installing and Configuring Apache Beam
Overview of Apache Beam Features and Architecture
- Beam Model, SDKs, Beam Pipeline Runners
- Distributed processing back-ends
Understanding the Apache Beam Programming Model
- How a pipeline is executed
Running a Sample Pipeline
- Preparing a WordCount pipeline
- Executing the pipeline locally
Designing a Pipeline
- Planning the structure, choosing the transforms, and determining the input and output methods
Creating the Pipeline
- Writing the driver program and defining the pipeline
- Using Apache Beam classes
- Data sets, transforms, I/O, data encoding, etc.
Executing the Pipeline
- Executing the pipeline locally, on remote machines, and on a public cloud
- Choosing a runner
- Runner-specific configurations
Testing and Debugging Apache Beam
- Using type hints to emulate static typing
- Managing Python pipeline dependencies
Processing Bounded and Unbounded Datasets
- Windowing and triggers
Making Your Pipelines Reusable and Maintainable
Creating New Data Sources and Sinks
- Apache Beam Source and Sink API
Integrating Apache Beam with other Big Data Systems
- Apache Hadoop, Apache Spark, Apache Kafka
Troubleshooting
Summary and Conclusion
Requirements
- Experience with Python programming
- Experience with the Linux command line
Audience
- Developers
Testimonials (4)
The explanations were very good. Although some questions could have been avoided if those points had been covered at the start of each topic, the trainer clearly showed a solid command of and experience with the subject.
Alan Jaime Rodríguez García - BANCO DE MEXICO
Course - Stream Processing with Kafka Streams
Very little. I found it very difficult, all the more because I joined late and missed the first sessions.
Rolando García - OIT para México y Cuba
Course - Apache NiFi for Administrators
The instructor's presentation
SANDRA RAMIREZ - Organización Internacional del Trabajo
Course - Apache NiFi for Developers
Sufficient hands-on practice; the trainer is knowledgeable