Data Exchange Podcast (Episode 240): Chang She of LanceDB

Episode Notes: https://thedataexchange.media/lance-d...
In this episode, we discuss Lance, an open-source columnar data format that tackles the unique challenges posed by modern AI and machine learning workloads.
*Sections*
Introduction to Lance and the Challenge of Unstructured Data - 00:00:05
Overcoming Limitations of Existing Formats (Parquet, ORC) - 00:02:56
Lance: A New Data Format for AI Workloads - 00:06:05
Efficient Metadata Handling and Wide Data Support in Lance - 00:07:20
Integrated Vector Indexing for AI Applications - 00:09:15
LanceDB: A Scalable Vector Database Built on Lance Format - 00:10:39
Real-World Use Cases: Images, Videos, and Large-Scale Datasets - 00:12:31
Lance as a "One-Stop Shop" for AI Data Lakes - 00:13:49
Comparison to Meta's Nimble: Similarities and Differences - 00:15:18
Open Source Ecosystem and Community Contributions - 00:18:48
Key Use Cases: Data Exploration, Training, and Vector Search - 00:21:31
Addressing the Limitations of Traditional Vector Search Systems - 00:24:32
Exploratory Data Analysis for Unstructured Data with Lance - 00:28:02
Multimodal Embeddings and Vector Search - 00:35:51
Feature Stores and Their Evolving Role in AI - 00:41:34
Putting LanceDB's Vector Search to the Test - 00:44:14
Embedding Pipelines, Ecosystem Integrations, and Deployment - 00:50:27
Open Source and Enterprise Offerings from LanceDB - 00:53:45
The Future of Lance: New Encodings, Integrations, and Governance - 00:56:38
