What is DSPy?
Declarative Self-improving Python (DSPy) is an open-source Python framework [paper, github] developed by researchers at Stanford for programming, rather than prompting, language models (LMs). Unlike traditional approaches that rely heavily on fixed prompt templates and manual adjustment, DSPy enables a declarative programming style: users focus on defining tasks and metrics rather than crafting intricate prompts, which leads to more reliable and scalable AI applications.
The framework shifts the focus from crafting the perfect prompt to programming the models directly, using structured, declarative natural-language modules written in Python, and it provides optimizers that tune both prompts and model weights.
Key Features of DSPy
Declarative Programming
At the core of DSPy is its declarative programming model. Users specify what they want the model to achieve and how success will be measured. This separation of logic from text allows DSPy to automatically optimize the underlying prompts, making it easier for developers to build complex AI systems without getting bogged down in the intricacies of prompt engineering.
Self-Improving Prompts
One of the standout features of DSPy is its ability to automatically refine prompts over time. By leveraging feedback and evaluation mechanisms, DSPy continuously improves the performance of language models, reducing the need for constant manual adjustments. This self-improving capability ensures that applications become more effective with each iteration.
Modular Architecture
DSPy employs a modular architecture that allows developers to create reusable components for various natural language processing (NLP) tasks. These modules can be combined and customized according to specific needs, facilitating a more streamlined development process. Each module encapsulates a prompting technique and can be adapted to fit different tasks, making it versatile for a range of applications such as question answering, text summarization, and code generation.
How does DSPy work?
Instead of relying on manually written prompts, DSPy utilizes a modular and declarative approach to define tasks, construct pipelines, and optimize prompts automatically, offering a more streamlined and efficient way to work with LLMs.
Signatures
At the heart of DSPy lies the concept of signatures: typed, natural-language declarations that define the input/output behaviour of a module, such as "question -> answer" or "document -> summary".
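To make the idea concrete, here is a toy sketch in plain Python of how a string signature like "question -> answer: float" can be read as a typed declaration of inputs and outputs. This is not DSPy's actual parser (its real signature handling is far richer); it only illustrates the concept:

```python
# Illustrative only: a toy parser for DSPy-style string signatures.
# It shows how a signature declares named, optionally typed fields.

def parse_signature(sig: str) -> dict:
    """Split 'inputs -> outputs' into named, optionally typed fields."""
    inputs_part, outputs_part = (side.strip() for side in sig.split("->"))

    def parse_fields(part: str) -> dict:
        fields = {}
        for field in part.split(","):
            name, _, type_hint = field.partition(":")
            fields[name.strip()] = type_hint.strip() or "str"  # default type is str
        return fields

    return {"inputs": parse_fields(inputs_part),
            "outputs": parse_fields(outputs_part)}

print(parse_signature("question -> answer: float"))
# {'inputs': {'question': 'str'}, 'outputs': {'answer': 'float'}}
```

The same shorthand appears throughout the examples below, e.g. `dspy.ChainOfThought("question -> answer: float")`, where the declared type also constrains the model's output format.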
Modules
Modules are reusable building blocks for various NLP tasks. They can be combined and customised to fit different needs. Think of modules as prefabricated blocks you can snap together to build your program.
At the heart of these building blocks is the dspy.Predict module, which the other built-in modules (such as dspy.ChainOfThought) invoke internally through their forward() call. Creating your own modules is encouraged, and composing them is the core way to construct complex pipeline programs in DSPy.
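As a rough analogy in plain Python (these are stub classes, not DSPy's actual APIs), modules can be pictured as callables that snap together, with higher-level blocks composing the lowest-level predict step:

```python
# Toy analogy of DSPy's modular design: each "module" is a callable
# building block, and higher-level modules compose lower-level ones.
# None of these classes are DSPy APIs; they only mirror the structure.

class Predict:
    """Lowest-level block: turns inputs into an output (stubbed here)."""
    def __init__(self, signature: str):
        self.signature = signature

    def __call__(self, **inputs) -> dict:
        # A real implementation would call a language model; we stub it out.
        return {"answer": f"stub answer for {inputs}"}

class ChainOfThought:
    """Higher-level block that wraps Predict and adds a reasoning field."""
    def __init__(self, signature: str):
        self.predict = Predict(signature)  # composition, as in DSPy

    def __call__(self, **inputs) -> dict:
        result = self.predict(**inputs)
        result["reasoning"] = "stub reasoning"
        return result

qa = ChainOfThought("question -> answer")
out = qa(question="What is 3+5?")
print(sorted(out))  # ['answer', 'reasoning']
```

The composition mirrors what the real framework does: chain-of-thought prompting is layered on top of a plain prediction, which is why the generated prompts below always include a `reasoning` output field.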
Teleprompters
Inspired by PyTorch optimizers, teleprompters (called optimizers in recent DSPy releases) play a crucial role in automatically improving prompts over time, using feedback and evaluation to ensure that the model performs better with each iteration.
These are general-purpose optimization strategies that determine how the modules should learn from data. Given a training set of examples and a metric to evaluate performance, the DSPy compiler leverages teleprompters to generate an optimized instance of a program module, akin to how learning optimizers like SGD are used in ML frameworks like PyTorch.
Different optimizers in DSPy work by synthesizing good few-shot examples for every module, proposing and intelligently exploring better natural-language instructions for every prompt, and building datasets for your modules and using them to finetune the LM weights in your system.
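The instruction-search idea can be illustrated with a hypothetical, metric-driven loop in plain Python. This is not a DSPy teleprompter, just the underlying principle: propose candidate instructions, score each against a training set, and keep the best one (the stub "LM" below is invented for the illustration):

```python
# Illustration of the principle behind DSPy's prompt optimizers:
# propose candidate instructions, score each one with a metric on a
# training set, and keep the best. The stub "LM" is hypothetical;
# a real optimizer would query an actual model.

def stub_lm(instruction: str, question: str) -> str:
    # Pretend the model only succeeds with a sufficiently specific prompt.
    if "arithmetic" in instruction:
        return str(eval(question))  # toy questions are safe expressions
    return "I don't know"

candidates = [
    "Answer the question.",
    "Solve the arithmetic expression and return only the number.",
]
trainset = [("3+5", "8"), ("2*7", "14")]

def score(instruction: str) -> int:
    """Metric: number of training examples answered exactly."""
    return sum(stub_lm(instruction, q) == gold for q, gold in trainset)

best = max(candidates, key=score)
print(best)         # the more specific, arithmetic-aware instruction wins
print(score(best))  # 2
```

Real DSPy optimizers search a much larger space (instructions, few-shot demonstrations, even model weights), but the loop is the same: generate candidates, evaluate with the metric, keep what works.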
DSPy’s typical workflow
- Task Definition: Clearly define the task goal and metrics to measure the model’s success.
- Pipeline Construction: Select and configure appropriate modules for the task, connecting them in a logical sequence to form a pipeline.
- Optimization and Compilation: Employ teleprompters to optimize prompts through in-context learning and few-shot example generation, and optionally fine-tune smaller models. The entire pipeline is then compiled into executable Python code.
- Evaluation and Iteration: Evaluate the performance of the compiled program against the defined metrics and iterate on the data, program structure, metrics, and optimizers to refine the system.
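The evaluation step of this workflow boils down to averaging a task metric over a development set. A minimal, framework-free sketch (hypothetical names, not DSPy's own evaluation utility):

```python
# Minimal sketch of the evaluation step: run a program over a dev set,
# score each prediction with the task metric, and report the average.
# `program` here is any callable; in DSPy it would be a compiled module.

def exact_match(pred: str, gold: str) -> float:
    return float(pred.strip().lower() == gold.strip().lower())

def evaluate(program, devset, metric) -> float:
    scores = [metric(program(question), gold) for question, gold in devset]
    return sum(scores) / len(scores)

# Toy program standing in for a compiled DSPy pipeline.
answers = {"What is 3+5?": "8", "What is 2*7?": "14"}
program = lambda q: answers.get(q, "unknown")

devset = [("What is 3+5?", "8"), ("What is 2*7?", "15")]
print(evaluate(program, devset, exact_match))  # 0.5
```

When the score plateaus, the iteration step kicks in: adjust the data, program structure, metric, or optimizer and re-compile.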
Example Implementations
answerBot = dspy.ChainOfThought("question -> answer")
answerBot(question="What is 3+5?")
# answer='The result of adding 3 and 5 is 8.'
The interesting part is the prompt that is generated automatically, which is given below:
[{"content": """Your input fields are:
1. `question` (str)
Your output fields are:
1. `reasoning` (str)
2. `answer` (str)
All interactions will be structured in the following way, with the appropriate values filled in.
[[ ## question ## ]]
{question}
[[ ## reasoning ## ]]
{reasoning}
[[ ## answer ## ]]
{answer}
[[ ## completed ## ]]
In adhering to this structure, your objective is:
Given the fields `question`, produce the fields `answer`.""",
"role": "system"},
{"content": """[[ ## question ## ]]
What is 3+5?
Respond with the corresponding output fields,
starting with the field `[[ ## reasoning ## ]]`,
then `[[ ## answer ## ]]`, and then ending with the
marker for `[[ ## completed ## ]]`.""",
"role": "user"}
]
Find the detailed code for other examples and their responses below.
Example 1:
import dspy
from pprint import pprint
lm = dspy.LM('ollama_chat/llama3.2', api_base='http://localhost:11434', api_key='')
dspy.configure(lm=lm)
answerBot = dspy.ChainOfThought("question -> answer: float")
answerBot(question="What is 3+5?")
# Prediction(
# reasoning='The calculation 3+5 is a basic arithmetic operation that can be solved by adding the two numbers together.',
# answer=8.0
# )
pprint(lm.history[-1])
Generated Prompt:
{'cost': None,
'kwargs': {'max_tokens': 1000, 'temperature': 0.0},
'messages': [{'content': 'Your input fields are:\n'
'1. `question` (str)\n'
'\n'
'Your output fields are:\n'
'1. `reasoning` (str)\n'
'2. `answer` (float)\n'
'\n'
'All interactions will be structured in the '
'following way, with the appropriate values filled '
'in.\n'
'\n'
'[[ ## question ## ]]\n'
'{question}\n'
'\n'
'[[ ## reasoning ## ]]\n'
'{reasoning}\n'
'\n'
'[[ ## answer ## ]]\n'
'{answer} # note: the value you produce must '
'be a single float value\n'
'\n'
'[[ ## completed ## ]]\n'
'\n'
'In adhering to this structure, your objective is: \n'
' Given the fields `question`, produce the '
'fields `answer`.',
'role': 'system'},
{'content': '[[ ## question ## ]]\n'
'What is 3+5?\n'
'\n'
'Respond with the corresponding output fields, '
'starting with the field `[[ ## reasoning ## ]]`, '
'then `[[ ## answer ## ]]` (must be formatted as a '
'valid Python float), and then ending with the '
'marker for `[[ ## completed ## ]]`.',
'role': 'user'}],
'model': 'ollama_chat/llama3.2',
'model_type': 'chat',
'outputs': ['[[ ## reasoning ## ]]\n'
'The calculation 3+5 is a basic arithmetic operation that can be '
'solved by adding the two numbers together.\n'
'\n'
'[[ ## answer ## ]]\n'
'8.0\n'
'\n'
'[[ ## completed ## ]]'],
'prompt': None,
'response': ModelResponse(id='chatcmpl-0573813f-e8b3-45c8-ba61-91b4a5cea9fe', created=1736159100, model='ollama_chat/llama3.2', object='chat.completion', system_fingerprint=None, choices=[Choices(finish_reason='stop', index=0, message=Message(content='[[ ## reasoning ## ]]\nThe calculation 3+5 is a basic arithmetic operation that can be solved by adding the two numbers together.\n\n[[ ## answer ## ]]\n8.0\n\n[[ ## completed ## ]]', role='assistant', tool_calls=None, function_call=None))], usage=Usage(completion_tokens=42, prompt_tokens=210, total_tokens=252, completion_tokens_details=None, prompt_tokens_details=None)),
'timestamp': '2025-01-06T10:25:00.807277',
'usage': {'completion_tokens': 42,
'completion_tokens_details': None,
'prompt_tokens': 210,
'prompt_tokens_details': None,
'total_tokens': 252},
'uuid': 'd753ea62-cf99-4235-aeb9-bd1c3a061204'}
Example 2:
document = """The 21-year-old made seven appearances for the Hammers and netted his only goal for them in a Europa League qualification round match against Andorran side FC Lustrains last season. Lee had two loan spells in League One last term, with Blackpool and then Colchester United. He scored twice for the U's but was unable to save them from relegation. The length of Lee's contract with the promoted Tykes has not been revealed. Find all the latest football transfers on our dedicated page."""
summarize = dspy.ChainOfThought('document -> summary')
response = summarize(document=document)
print(response.summary)
# Generated Summary: Lee is a 21-year-old football player who has played for several teams, including the Hammers, Blackpool, and Colchester United. He currently plays for the promoted Tykes.
pprint(lm.history[-1])
Generated Prompt:
{'cost': None,
'kwargs': {'max_tokens': 1000, 'temperature': 0.0},
'messages': [{'content': 'Your input fields are:\n'
'1. `document` (str)\n'
'\n'
'Your output fields are:\n'
'1. `reasoning` (str)\n'
'2. `summary` (str)\n'
'\n'
'All interactions will be structured in the '
'following way, with the appropriate values filled '
'in.\n'
'\n'
'[[ ## document ## ]]\n'
'{document}\n'
'\n'
'[[ ## reasoning ## ]]\n'
'{reasoning}\n'
'\n'
'[[ ## summary ## ]]\n'
'{summary}\n'
'\n'
'[[ ## completed ## ]]\n'
'\n'
'In adhering to this structure, your objective is: \n'
' Given the fields `document`, produce the '
'fields `summary`.',
'role': 'system'},
{'content': '[[ ## document ## ]]\n'
'The 21-year-old made seven appearances for the '
'Hammers and netted his only goal for them in a '
'Europa League qualification round match against '
'Andorran side FC Lustrains last season. Lee had two '
'loan spells in League One last term, with Blackpool '
'and then Colchester United. He scored twice for the '
"U's but was unable to save them from relegation. "
"The length of Lee's contract with the promoted "
'Tykes has not been revealed. Find all the latest '
'football transfers on our dedicated page.\n'
'\n'
'Respond with the corresponding output fields, '
'starting with the field `[[ ## reasoning ## ]]`, '
'then `[[ ## summary ## ]]`, and then ending with '
'the marker for `[[ ## completed ## ]]`.',
'role': 'user'}],
'model': 'ollama_chat/llama3.2',
'model_type': 'chat',
'outputs': ['[[ ## reasoning ## ]]\n'
'The document provides information about a 21-year-old football '
'player named Lee, who has made appearances for several teams '
'including the Hammers, Blackpool, and Colchester United. The '
'document also mentions his loan spells and his current contract '
'with the promoted Tykes.\n'
'\n'
'[[ ## summary ## ]]\n'
'Lee is a 21-year-old football player who has played for several '
'teams, including the Hammers, Blackpool, and Colchester United. '
'He currently plays for the promoted Tykes.\n'
'\n'
'[[ ## completed ## ]]'],
'prompt': None,
'response': ModelResponse(id='chatcmpl-e0b0aec0-ca3d-49df-ad13-1bbc9825605d', created=1736157660, model='ollama_chat/llama3.2', object='chat.completion', system_fingerprint=None, choices=[Choices(finish_reason='stop', index=0, message=Message(content='[[ ## reasoning ## ]]\nThe document provides information about a 21-year-old football player named Lee, who has made appearances for several teams including the Hammers, Blackpool, and Colchester United. The document also mentions his loan spells and his current contract with the promoted Tykes.\n\n[[ ## summary ## ]]\nLee is a 21-year-old football player who has played for several teams, including the Hammers, Blackpool, and Colchester United. He currently plays for the promoted Tykes.\n\n[[ ## completed ## ]]', role='assistant', tool_calls=None, function_call=None))], usage=Usage(completion_tokens=106, prompt_tokens=281, total_tokens=387, completion_tokens_details=None, prompt_tokens_details=None)),
'timestamp': '2025-01-06T10:05:58.914785',
'usage': {'completion_tokens': 106,
'completion_tokens_details': None,
'prompt_tokens': 281,
'prompt_tokens_details': None,
'total_tokens': 387},
'uuid': '43bd24f7-3c8f-444d-ba9d-6ecaf2722a12'}
Example 3:
from typing import Literal

class Emotion(dspy.Signature):
    """Classify emotion."""
    sentence: str = dspy.InputField()
    sentiment: Literal['sadness', 'joy', 'love', 'anger', 'fear', 'surprise'] = dspy.OutputField()
sentence = "i started feeling a little vulnerable when the giant spotlight started blinding me"
classify = dspy.Predict(Emotion)
classify(sentence=sentence)
# Prediction(sentiment='fear')
pprint(lm.history[-1])
Generated prompt:
{'cost': None,
'kwargs': {'max_tokens': 1000,
'response_format': <class 'dspy.adapters.json_adapter.DSPyProgramOutputs'>,
'temperature': 0.0},
'messages': [{'content': 'Your input fields are:\n'
'1. `sentence` (str)\n'
'\n'
'Your output fields are:\n'
'1. `sentiment` (Literal[sadness, joy, love, anger, '
'fear, surprise])\n'
'\n'
'All interactions will be structured in the '
'following way, with the appropriate values filled '
'in.\n'
'\n'
'Inputs will have the following structure:\n'
'\n'
'[[ ## sentence ## ]]\n'
'{sentence}\n'
'\n'
'Outputs will be a JSON object with the following '
'fields.\n'
'\n'
'{\n'
' "sentiment": "{sentiment} # note: the '
'value you produce must be one of: sadness; joy; '
'love; anger; fear; surprise"\n'
'}\n'
'\n'
'In adhering to this structure, your objective is: \n'
' Classify emotion.',
'role': 'system'},
{'content': '[[ ## sentence ## ]]\n'
'i started feeling a little vulnerable when the '
'giant spotlight started blinding me\n'
'\n'
'Respond with a JSON object in the following order '
'of fields: `sentiment` (must be formatted as a '
'valid Python Literal[sadness, joy, love, anger, '
'fear, surprise]).',
'role': 'user'}],
'model': 'ollama_chat/llama3.2',
'model_type': 'chat',
'outputs': ['{\n "sentiment": "fear"\n}'],
'prompt': None,
'response': ModelResponse(id='chatcmpl-b85b3001-dab1-4133-8039-c58a7fe04b9a', created=1736157622, model='ollama_chat/llama3.2', object='chat.completion', system_fingerprint=None, choices=[Choices(finish_reason='stop', index=0, message=Message(content='{\n "sentiment": "fear"\n}', role='assistant', tool_calls=None, function_call=None))], usage=Usage(completion_tokens=12, prompt_tokens=220, total_tokens=232, completion_tokens_details=None, prompt_tokens_details=None)),
'timestamp': '2025-01-06T10:18:28.884511',
'usage': {'completion_tokens': 12,
'completion_tokens_details': None,
'prompt_tokens': 220,
'prompt_tokens_details': None,
'total_tokens': 232},
'uuid': 'd69357a8-0482-4ca5-a00b-469b202b8a35'}
Advantages of Using DSPy
DSPy presents a number of compelling advantages over traditional prompt engineering methods, making it a powerful tool for working with LLMs:
- Improved Reliability: DSPy’s declarative approach and automatic prompt optimization result in more reliable and predictable LLM behaviour. By focusing on defining the desired task outcome, developers can rely on DSPy to handle the complexities of prompt creation and optimization, ensuring consistency and reducing unexpected outputs.
- Simplified Development: The modular architecture and automatic prompt optimization in DSPy significantly simplify LLM application development. Developers can build complex applications by combining pre-built modules and concentrate on the application’s logic without getting bogged down in prompt engineering intricacies.
- Adaptability: DSPy readily adapts to new tasks and domains by simply adjusting the task definition, metrics, and potentially providing a few new examples. The framework automatically reconfigures itself to meet the updated requirements, promoting flexibility and reusability across different use cases.
- Scalability: DSPy’s optimization techniques shine when handling large-scale tasks and datasets. The framework automatically refines prompts and adjusts the model’s behaviour as needed, enabling seamless scalability and ensuring consistent performance even as tasks grow in complexity.
Real-World Applications of DSPy
DSPy’s versatile capabilities enable its application across a wide range of NLP tasks:
- Question Answering: Build robust QA systems combining RAG with chain-of-thought prompting.
- Text Summarization: Easily create summarization pipelines adaptable to various input lengths and writing styles.
- Code Generation: Generate code snippets from descriptions.
- Language Translation: Develop smarter, context-aware translation systems that consider cultural nuances and specialized domains.
- Chatbots and Conversational AI: Create more engaging and human-like chatbot experiences with improved contextual understanding and response generation capabilities.
Closing Thoughts
DSPy signifies a transition from tedious prompt engineering to a more intuitive and efficient programming approach. By abstracting away the complexities of prompt optimization and providing a modular framework, DSPy empowers developers to focus on high-level design and application logic, unlocking the true potential of LLMs for a wide range of NLP tasks.
While DSPy is a relatively new framework, its community is growing rapidly, and it is already changing how developers build and optimize LLM pipelines.