[Python] Trying Out PydanticOutputParser!

January 29, 2025
python

Hello.

This is Yasuo Nonaka.

In this post, I tried out PydanticOutputParser hands-on, so I'm writing it up as a note to self.

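What is PydanticOutputParser?

PydanticOutputParser is one of the output parsers provided by LangChain. It uses a Pydantic model to parse and validate the output of an LLM.

The PydanticOutputParser.get_format_instructions method returns instructions that tell the LLM to format its output as JSON conforming to the schema of the Pydantic model.

These instructions are mainly used when sending a prompt to an LLM through LangChain, to state the expected data format explicitly.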
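
To make "validation and parsing" a little more concrete, here is a minimal sketch that uses Pydantic on its own, without LangChain. It assumes Pydantic v2, and the English field descriptions and the sample JSON strings are made up for illustration. It only approximates what the parser does internally: build a JSON schema from the model, then turn JSON text into a typed object or reject it.

```python
from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: str = Field(..., description="Name of the product")
    price: float = Field(..., description="Price of the product")


# JSON schema generated from the model. The format instructions shown
# later in this post embed a schema of this shape.
schema = Product.model_json_schema()
print(schema["required"])  # ['name', 'price']

# Valid JSON text is parsed and validated into a typed Product instance.
product = Product.model_validate_json('{"name": "Notebook", "price": 1980}')
print(product.price)  # 1980.0

# JSON that does not match the schema is rejected with a ValidationError.
try:
    Product.model_validate_json('{"name": "Notebook", "price": "cheap"}')
    validation_failed = False
except ValidationError as exc:
    validation_failed = True
    print(f"Rejected with {exc.error_count()} validation error(s)")
```

The point of the sketch is that the Pydantic model plays two roles: it is the source of the schema that the LLM is asked to follow, and it is the check applied to whatever the LLM returns.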

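Basic usage of PydanticOutputParser

With the implementation below, get_format_instructions returns three things:

An instruction that the output should be formatted as JSON
A short example of how to read a JSON schema
The JSON schema that the output should conform to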

from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field


class Product(BaseModel):
    name: str = Field(..., description="商品の名前")  # "Name of the product"
    price: float = Field(..., description="商品の価格")  # "Price of the product"

# Create a PydanticOutputParser instance
output_parser = PydanticOutputParser(pydantic_object=Product)

# Build the formatting instructions from the Product schema
format_instructions = output_parser.get_format_instructions()

print(format_instructions)

#The output should be formatted as a JSON instance that conforms to the JSON schema below.
#
#As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
#the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.
#
#Here is the output schema:
#```
#{"properties": {"name": {"description": "商品の名前", "title": "Name", "type": "string"}, "price": {"description": "商品の価格", "title": "Price", "type": "number"}}, "required": ["name", "price"]}
#```

Combining PydanticOutputParser with a chain

Building on the above, let's use a chain and give the model a concrete instruction.

The implementation then looks like this.

from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

class Product(BaseModel):
    name: str = Field(..., description="商品の名前")  # "Name of the product"
    price: float = Field(..., description="商品の価格")  # "Price of the product"

output_parser = PydanticOutputParser(pydantic_object=Product)

# System prompt (in Japanese): "You are an excellent merchant AI agent.
# Return an appropriate answer to the user's input."
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "あなたは優秀な商人AIエージェントです。ユーザーの入力に対して適切な回答を返してください。\n\n{format_instructions}"),
        ("human", "{product_name}"),
    ]
)

# Fill in the format instructions ahead of time
prompt_with_format_instructions = prompt.partial(
    format_instructions=output_parser.get_format_instructions()
)

# Ask the model to reply in JSON mode
model = ChatOpenAI(model="gpt-4o-mini", temperature=0).bind(
    response_format={"type": "json_object"}
)

chain = prompt_with_format_instructions | model | output_parser

# Run the chain and get the response
# Question (in Japanese): "Please tell me the price of the 16-inch MacBook Pro in yen."
response = chain.invoke({"product_name": "MacBook Pro 16インチの価格を円ベースで教えてください。"})
print(response) # example output: name='MacBook Pro 16インチ' price=250000.0
print(type(response)) # <class '__main__.Product'>

The AI agent told us the price of the 16-inch MacBook Pro!

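Summary

I tried out the basic usage of PydanticOutputParser!

By combining Pydantic's validation with the LangChain ecosystem, it felt like LLM responses can be used more safely and reliably, since output that does not match the schema is rejected rather than passed along.

AI-powered projects will likely keep increasing, and in projects where the output tends to be free-form, introducing PydanticOutputParser should make it easier to keep the data consistent!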

References

How to use output parsers to parse an LLM response into structured format (LangChain documentation)

PydanticOutputParser definition (LangChain API documentation)
