Structured Outputs with DSPy

The code for this notebook can be found here! - https://github.com/weaviate/recipes/b...

Unfortunately, Large Language Models will not consistently follow the instructions that you give them. This is a massive problem when you are building AI systems that require a particular type of output from the previous step to feed into the next one!

For example, imagine you are building a blog post writing system that first takes a question and retrieved context and outputs a list of topics. These topics have to be formatted in a particular way, such as a comma-separated list or a JSON array of Topic objects, so that the system can continue writing the blog post!

I am SUPER excited to share the 4th video in my DSPy series, diving into 3 solutions to structuring outputs in DSPy programs: (1) TypedPredictors, (2) DSPy Assertions, and (3) Custom Guardrails with the DSPy programming model!

TypedPredictors follow the line of thinking behind JSON mode, using Pydantic BaseModels to translate types and custom objects into a JSON template for LLMs. The output can then be validated, and any validation errors are used to build a more structured retry prompt that corrects the output structure!
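The validate-then-retry idea can be sketched with plain Pydantic (v2 API assumed). The `Topic`/`Topics` names and the helper below are illustrative, not part of the DSPy library: the LLM is asked for JSON matching the schema, and the validation error becomes the retry prompt.

```python
from pydantic import BaseModel, ValidationError

class Topic(BaseModel):
    title: str
    description: str

class Topics(BaseModel):
    topics: list[Topic]

def validate_or_retry_prompt(raw_json: str):
    """Return (parsed, None) on success, or (None, retry_prompt) on failure."""
    try:
        return Topics.model_validate_json(raw_json), None
    except ValidationError as e:
        retry_prompt = (
            "Your previous output did not match the required JSON schema.\n"
            f"Errors:\n{e}\n"
            "Please re-emit valid JSON for the Topics schema."
        )
        return None, retry_prompt

good = '{"topics": [{"title": "RAG", "description": "Retrieval-Augmented Generation"}]}'
parsed, retry = validate_or_retry_prompt(good)
```

A TypedPredictor automates exactly this loop: serialize the schema into the prompt, parse, and on failure re-prompt with the errors.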

DSPy Assertions are one of the core building blocks of DSPy, offering an interface where you supply a boolean-valued function and a retry message, which is templated alongside the past output when the call to the LLM is retried!
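Here is a sketch of the boolean-valued check an Assertion wraps, using the comma-separated-topics example from above. The `dspy.Suggest` call in the comment assumes the DSPy 2.x assertions API; the validator name is illustrative.

```python
def is_comma_separated_topics(text: str, max_topics: int = 5) -> bool:
    """True iff `text` is a non-empty comma-separated list of at most `max_topics` items."""
    topics = [t.strip() for t in text.split(",")]
    return 0 < len(topics) <= max_topics and all(topics)

# Inside a dspy.Module.forward(), this would be wired up roughly as:
#   pred = self.generate_topics(question=question, context=context)
#   dspy.Suggest(is_comma_separated_topics(pred.topics),
#                "Topics must be a comma-separated list of at most 5 items.")
```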

Custom Guardrails with the DSPy Programming Model are one of the things I love the most about DSPy: we have unlimited flexibility to control these systems however we want. The video also shows you how to write custom guardrails and retry Signatures, and discusses using TypedPredictors inside your Custom Guardrails and potentially feeding your Custom Guardrails into a DSPy Assertion.
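A custom guardrail boils down to a retry loop you write yourself. A minimal sketch, where `generate` stands in for any DSPy predictor call and all names are illustrative rather than DSPy API:

```python
def run_with_guardrail(generate, validate, retry_template, max_retries=3):
    """Call `generate` until `validate` passes, feeding back a templated retry prompt."""
    feedback = ""
    output = None
    for _ in range(max_retries):
        output = generate(feedback)
        if validate(output):
            return output
        # Template the failed output into the next prompt, like a DSPy Assertion does.
        feedback = retry_template.format(past_output=output)
    raise ValueError(f"Guardrail still failing after {max_retries} retries: {output!r}")

# Toy usage: a stand-in "model" that emits bad output once, then corrects itself.
attempts = iter(["not a list", "RAG, DSPy"])
result = run_with_guardrail(
    generate=lambda feedback: next(attempts),
    validate=lambda out: "," in out,
    retry_template="Previous output {past_output!r} was not comma-separated. Try again.",
)
```

Because the loop is ordinary Python, the validator can itself be a TypedPredictor, and the whole check can be handed to a DSPy Assertion instead.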

I had so much fun exploring this topic, and further seeing how well OpenAI's GPT-4 and GPT-3.5-Turbo, Cohere's Command R, and Mistral 7B (hosted with Ollama) perform with each Structured Output strategy! I also found monitoring structured output retries to be another fantastic application of Arize Phoenix! I hope you find the video useful!

If interested, all the code examples for this DSPy series can be found here! - https://github.com/weaviate/recipes/t...

OpenAI Function Calling: https://openai.com/blog/function-call...
Gorilla LLM Function Calling Leaderboard: https://gorilla.cs.berkeley.edu/leade...
Instructor Examples: https://github.com/jxnl/instructor/tr...

*ERRATA* (Massive thank you to Thomas Ahle for sending some notes and clarifications of content covered in the video)
1. `dspy.TypedPredictor` can be used directly instead of `dspy.functional.TypedPredictor`
2. When creating a Pydantic type, `list[Topic]` can be used directly in the Signature without needing the `Topics` wrapper.
3. The default `max_retries` for TypedPredictor is 3, and can be set when creating the TypedPredictor.
4. Setting `TypedPredictor(explain_errors=True)` can help with retry errors by providing clearer descriptions of what needs to change.

Chapters
0:00 Welcome! Let’s Structure LLM Outputs!
3:12 Background (Instructor, Function Calling, JSON mode)
9:08 TypedPredictors Demo
16:00 Logging with Arize Phoenix
18:15 DSPy Assertions
22:15 Custom Guardrails
