⭐作者介绍:大二本科网络工程专业在读,持续学习Java,努力输出优质文章
⭐作者主页:@逐梦苍穹
⭐所属专栏:人工智能。
下面我们来介绍主要类型的输出解析器,PydanticOutputParser。
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator
from typing import List
model_name = 'text-davinci-003'
temperature = 0.0
model = OpenAI(model_name=model_name, temperature=temperature)
# Define your desired data structure.
class Joke(BaseModel):
setup: str = Field(description="question to set up a joke")
punchline: str = Field(description="answer to resolve the joke")
# You can add custom validation logic easily with Pydantic.
@validator('setup')
def question_ends_with_question_mark(cls, field):
if field[-1] != '?':
raise ValueError("Badly formed question!")
return field
# Set up a parser + inject instructions into the prompt template.
parser = PydanticOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
template="Answer the user query.\n{format_instructions}\n{query}\n",
input_variables=["query"],
partial_variables={"format_instructions": parser.get_format_instructions()}
)
# And a query intented to prompt a language model to populate the data structure.
joke_query = "Tell me a joke."
_input = prompt.format_prompt(query=joke_query)
output = model(_input.to_string())
parser.parse(output)
Joke(setup='Why did the chicken cross the road?', punchline='To get to the other side!')
列表解析器 comma_separated:
您想返回一个逗号分隔项的列表时,可以使用此输出解析器。
from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
output_parser = CommaSeparatedListOutputParser()
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
template="List five {subject}.\n{format_instructions}",
input_variables=["subject"],
partial_variables={"format_instructions": format_instructions}
)
model = OpenAI(temperature=0)
_input = prompt.format(subject="ice cream flavors")
output = model(_input)
output_parser.parse(output)
['Vanilla',
'Chocolate',
'Strawberry',
'Mint Chocolate Chip',
'Cookies and Cream']
自动修复解析器 output_fixing_parser:
此输出解析器包装另一个输出解析器,如果第一个解析器失败,则调用另一个LLM来修复任何错误。
但除了抛出错误之外,我们还可以做其他事情。具体来说,我们可以将格式不正确的输出和格式化的指令一起传递给模型,并要求它修复它。
对于这个例子,我们将使用上面的 Pydantic 输出解析器。如果我们传递给它一个不符合模式的结果,会发生以下情况:
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator
from typing import List
class Actor(BaseModel):
name: str = Field(description="name of an actor")
film_names: List[str] = Field(description="list of names of films they starred in")
actor_query = "Generate the filmography for a random actor."
parser = PydanticOutputParser(pydantic_object=Actor)
misformatted = "{'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}"
parser.parse(misformatted)
---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
File ~/workplace/langchain/langchain/output_parsers/pydantic.py:23, in PydanticOutputParser.parse(self, text)
22 json_str = match.group()
---> 23 json_object = json.loads(json_str)
24 return self.pydantic_object.parse_obj(json_object)
File ~/.pyenv/versions/3.9.1/lib/python3.9/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
343 if (cls is None and object_hook is None and
344 parse_int is None and parse_float is None and
345 parse_constant is None and object_pairs_hook is None and not kw):
--> 346 return _default_decoder.decode(s)
347 if cls is None:
File ~/.pyenv/versions/3.9.1/lib/python3.9/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
333 """Return the Python representation of ``s`` (a ``str`` instance
334 containing a JSON document).
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
File ~/.pyenv/versions/3.9.1/lib/python3.9/json/decoder.py:353, in JSONDecoder.raw_decode(self, s, idx)
352 try:
--> 353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
During handling of the above exception, another exception occurred:
OutputParserException Traceback (most recent call last)
Cell In[6], line 1
----> 1 parser.parse(misformatted)
File ~/workplace/langchain/langchain/output_parsers/pydantic.py:29, in PydanticOutputParser.parse(self, text)
27 name = self.pydantic_object.__name__
28 msg = f"Failed to parse {name} from completion {text}. Got: {e}"
---> 29 raise OutputParserException(msg)
OutputParserException: Failed to parse Actor from completion {'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}. Got: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
现在我们可以构建并使用一个 OutputFixingParser。这个输出解析器接受另一个输出解析器作为参数,还有一个 LLM,用来尝试纠正任何格式错误。
from langchain.output_parsers import OutputFixingParser
new_parser = OutputFixingParser.from_llm(parser=parser, llm=ChatOpenAI())
new_parser.parse(misformatted)
Actor(name='Tom Hanks', film_names=['Forrest Gump'])
结构化输出解析器 structured:
当您想要返回多个字段时,可以使用此输出解析器。尽管Pydantic/JSON解析器更强大,但我们最初尝试的数据结构仅具有文本字段。
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
这里我们定义了我们想要接收的响应模式。
response_schemas = [
ResponseSchema(name="answer", description="answer to the user's question"),
ResponseSchema(name="source", description="source used to answer the user's question, should be a website.")
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
我们现在获得一个包含响应格式化指令的字符串,然后将其插入到我们的提示中。
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
template="answer the users question as best as possible.\n{format_instructions}\n{question}",
input_variables=["question"],
partial_variables={"format_instructions": format_instructions}
)
我们现在可以使用这个来格式化一个提示,发送给语言模型,然后解析返回的结果。
model = OpenAI(temperature=0)
_input = prompt.format_prompt(question="what's the capital of france?")
output = model(_input.to_string())
output_parser.parse(output)
{'answer': 'Paris',
'source': 'https://www.worldatlas.com/articles/what-is-the-capital-of-france.html'}
这里是一个在聊天模型中使用它的例子
chat_model = ChatOpenAI(temperature=0)
prompt = ChatPromptTemplate(
messages=[
HumanMessagePromptTemplate.from_template("answer the users question as best as possible.\n{format_instructions}\n{question}")
],
input_variables=["question"],
partial_variables={"format_instructions": format_instructions}
)
_input = prompt.format_prompt(question="what's the capital of france?")
output = chat_model(_input.to_messages())
output_parser.parse(output.content)
{'answer': 'Paris', 'source': 'https://en.wikipedia.org/wiki/Paris'}