DeepLearning.AI Courses - 📉 Turn your multimodal data into something you can actually query

5.0 (2)

18 learners

This course includes

5.5 hours of video
Certificate of completion
Access on mobile and TV

Summary

Learn more: https://bit.ly/3QcAj29

Images, audio, and video now make up a large share of the data teams work with, but most pipelines still assume everything is structured. Our latest course, Building Multimodal Data Pipelines, shows how to build pipelines that process multimodal data and turn it into LLM-ready text you can search, analyze, and use in applications.

Built in collaboration with Snowflake and taught by Gilberto Hernandez, this course will teach you how to handle each modality and bring them together into a single system.

What you’ll build:

- Pipelines that convert images and audio into structured text using OCR and ASR
- A Vision Language Model workflow that generates timestamped descriptions from video
- A multimodal RAG system that retrieves across slides, audio, and video to answer questions with citations

Along the way, you’ll see how to embed all modalities into a shared vector space, enabling cross-modal search and retrieval over real-world datasets like meeting recordings.

Enroll now: https://bit.ly/3QcAj29
