PySpark - Zero to Hero | PySpark Tutorial 2025 | Spark Tutorial 2025 | Learn from Basics to Advanced Performance Optimization - 19 Understand and Optimize Shuffle in Spark
This video explains how shuffle works in Spark and how to optimize it.
Chapters
00:00 - Introduction
00:20 - Understand Pipelining in Spark
02:18 - Demonstration
11:40 - Performance with Partitioned Data
14:19 - Few More Tips
Local PySpark Jupyter Lab setup - https://youtu.be/WhxljT3IfdM
Python Basics - https://www.learnpython.org/
GitHub URL for code - https://github.com/subhamkharwal/pyspark-zero-to-hero/blob/master/15_optimizing_shuffles.ipynb
The series provides a step-by-step guide to learning PySpark, the Python API for Apache Spark, a popular open-source distributed computing framework used for big data processing.
A new video every 3 days ❤️
#spark #pyspark #python #dataengineering