From Notebook to Pipeline in No Time with LineaPy and DVC - Thomas Fraunholz at PyData Berlin


Community member Thomas Fraunholz presents how to use LineaPy to transform your notebook into a reproducible pipeline with DVC at PyCon/PyData Berlin.

The nightmare before data science production: You found a working prototype for your problem using a Jupyter notebook, and now it's time to build a production-grade solution from that notebook. Unfortunately, your notebook looks anything but production-grade. The good news is, there's finally a cure!

The open-source Python package LineaPy aims to automate data science workflow generation and expedite the process of going from data science development to production. And truly, it transforms messy notebooks into data pipelines for frameworks like Apache Airflow, DVC, Argo, Kubeflow, and many more. And if you can't find your favorite orchestration framework, you are welcome to work with the creators of LineaPy to contribute a plugin for it!

In this talk, you will learn the basic concepts of LineaPy and how it supports your everyday tasks as a data practitioner. For this purpose, we will transform a notebook together, step by step, into a DVC pipeline. Finally, we will discuss what place LineaPy will take in the MLOps universe. Will you only have to check in your notebook in the future?
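
To make the notebook-to-pipeline idea concrete, here is a minimal sketch of what a multistage pipeline looks like once notebook cells are split into separate stages that hand results to each other as pickle files. It uses only the Python standard library and is not LineaPy's generated code; the function names, file names, and data (stage_load, stage_multiply, raw.pkl, result.pkl) are hypothetical and chosen for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def stage_load(out_path: Path) -> None:
    """Stage 1: produce a raw artifact and persist it for later stages."""
    values = [1, 2, 3]  # placeholder data, not taken from the talk
    out_path.write_bytes(pickle.dumps(values))


def stage_multiply(in_path: Path, out_path: Path, factor: int = 2) -> None:
    """Stage 2: read the upstream artifact, transform it, persist the result."""
    values = pickle.loads(in_path.read_bytes())
    out_path.write_bytes(pickle.dumps([v * factor for v in values]))


def run_pipeline(workdir: Path) -> list:
    """Run the stages in dependency order, as an orchestrator would."""
    raw = workdir / "raw.pkl"
    result = workdir / "result.pkl"
    stage_load(raw)
    stage_multiply(raw, result)
    return pickle.loads(result.read_bytes())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_pipeline(Path(tmp)))
```

In a DVC setup, each such stage would typically become an entry in dvc.yaml with its own command, dependencies, and outputs, so that DVC reruns only the stages whose inputs changed. Here run_pipeline simply stands in for that orchestration.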

The DVC team gratefully thanks Thomas and the LineaPy team for their work on this integration!! 🙏🏼

Timestamp Notes
00:00 Introduction
02:19 Project Overview
03:00 Field of Development
04:00 Jupyter Notebooks
05:26 Problem with Jupyter Notebooks
08:10 Simple example
09:47 Simple multiplication
10:59 Create a pipeline
12:18 Generate multiple files
13:16 Extract source code
15:00 Multistage DVC pipeline
19:30 Pickle
22:20 Multistage
28:46 Dockerfile
32:22 Conclusion and Questions
35:04 Thank you
_____

LineaPy website: https://lineapy.org/

Creating a pipeline with LineaPy and DVC: https://docs.lineapy.org/0.2/tutorial...

DVC Pipeline writer: https://docs.lineapy.org/0.2/referenc...
See more videos from PyData here: @pydatatv

Try out the DVC Extension for VS Code here: https://marketplace.visualstudio.com/...

To learn more about Iterative's open-source and SaaS tools please visit:
🧑🏽‍💻 Our online course: https://learn.iterative.ai
✍🏼 Our docs: https://dvc.org/doc (Data Version Control, Pipelines, Experiments)
https://cml.dev/doc (CI/CD for Machine Learning)
https://mlem.ai/doc (Package and Serve your models)
https://studio.iterative.ai (Team Collaboration, Experiments, Model Registry)

Join our Discord server:   / discord  

#dvc #machinelearning #datascience #generativeai
