This video explains:
How to read JSON files?
How to parse JSON data?
How to flatten JSON data?
What is the explode function?
What is the from_json function?
What is the to_json function?
How to write a complex schema for JSON?

Chapters
00:00 - Introduction
02:01 - Read Single Line JSON file
03:29 - Read Multiline JSON file
04:42 - Read JSON data in Single column
05:29 - Read JSON file with Schema
07:00 - Write Schema DDL String
09:20 - from_json function
11:00 - to_json function
12:39 - Flatten JSON data

Spark JSON Documentation - https://spark.apache.org/docs/latest/sql-data-sources-json.html
Local PySpark Jupyter Lab setup - https://youtu.be/WhxljT3IfdM
Python Basics - https://www.learnpython.org/
GitHub URL for code - https://github.com/subhamkharwal/pyspark-zero-to-hero/blob/master/10_read_json_files.ipynb

This series is a step-by-step guide to learning PySpark, the Python API for Apache Spark, a popular open-source distributed computing framework used for big data processing.

New video every 3 days ❤️

#spark #pyspark #python #dataengineering
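For readers who want a feel for the parse and flatten topics before watching, here is a rough plain-Python analogy. It is not PySpark code and not the code from the video: it uses only the standard `json` module, and the sample record, its field names, and the `parse` and `explode_items` helpers are all hypothetical. It only illustrates the idea that parsing turns a JSON string into nested data, and that exploding an array yields one flat row per array element.

```python
import json

# Hypothetical single-line JSON record; the field names are invented for illustration.
raw = (
    '{"order_id": "O101", "customer_id": "C1", '
    '"items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]}'
)


def parse(raw_json):
    """Parse a JSON string into nested Python objects.

    Loosely analogous to what from_json does for a string column in Spark.
    """
    return json.loads(raw_json)


def explode_items(order):
    """Return one flat dict per element of order["items"].

    Parent fields are copied onto every row. A missing, null, or empty
    array yields no rows, which is also how Spark's explode behaves.
    """
    rows = []
    for item in order.get("items") or []:
        rows.append(
            {
                "order_id": order["order_id"],
                "customer_id": order["customer_id"],
                "sku": item["sku"],
                "qty": item["qty"],
            }
        )
    return rows


order = parse(raw)
flat_rows = explode_items(order)

# Serialize each flat row back to a JSON string (loosely analogous to to_json).
for row in flat_rows:
    print(json.dumps(row))
```

In Spark the same steps run on DataFrame columns across a cluster rather than on a single Python dict, so treat this only as a mental model of the row-per-element behavior.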
