Full Transcript
In this course, you will build an end-to-end data engineering project that combines Apache Airflow, Docker, Spark clusters, Scala, Python and Java. You will write basic jobs in multiple programming languages, submit them to the Spark cluster for processing and see live results.

MORE DATA ENGINEERING COURSES: https://datamasterylab.com/sign_up

⏳ Timestamps:
00:00 Introduction
00:57 Creating The Spark Cluster and Airflow on Docker
11:00 Creating Spark Job with Python
28:51 Creating Spark Job with Scala
37:37 Building and Compiling Scala Jobs
43:23 Creating Spark Job with Java
58:51 Building and Compiling Java Jobs
1:06:15 Cluster computation results

🔗 Resource Links:
Github Code: https://github.com/airscholar/SparkingFlow
Java JDK: https://www.oracle.com/uk/java/technologies/downloads/
Scala SBT installation: https://www.scala-sbt.org/download.html
Maven Installation: https://maven.apache.org/install.html
Spark SQL mvn: https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.13/3.2.1

📢 Stay connected:
Follow us on Twitter(X): https://twitter.com/datamasterylab
Connect with us on LinkedIn: https://www.linkedin.com/in/yusuf-ganiyu-b90140107
Like us on Facebook: https://www.facebook.com/datamasterylab/

🏷️ HashTags: #ApacheAirflowCourse #DataEngineeringWithAirflow #AirflowOnDocker #SparkDataProcessing #ScalaForSpark #JavaDataEngineering #MavenProjects #BigDataAnalytics #WorkflowAutomation #FullCourse #FreeCourse #Educational #dataengineering

👍 If you found this course helpful, please LIKE and SHARE the video, and leave your thoughts in the COMMENTS below.
🔔 For more tutorials and complete courses, SUBSCRIBE to our channel and hit the bell icon for notifications!
