Building Systems with the ChatGPT API 课程链接:https://learn.deeplearning.ai/chatgpt-building-system/
介绍了两种 LLM 的情况:Base LLM 使用监督学习进行训练,其开发周期相当漫长,而使用 Instruction tuned LLM 开发 prompt-based AI 则可以将开发过程极大程度缩短。
讲到了 LLM 的 tokenizor 机制,导致 AI 看到的英文是 sub-word 级别而不是单个字母,进而导致 AI 没办法完成将一个单词按字母顺序倒序输出等这类字母基本的任务。我认为在中文中也会遇到类似的问题,我使用 tiktokenizor 对中文做过测试,实践表明有些中文被一整个切割,而有些可能被切为好几份。
然后讲到了 Chat Format,将对话分为三个角色 system、user、assistant。其中涉及对于 assistant 的角色风格和行为的设定最好放到 system 中。
讲到了使用 GPT 对用户的问题进行分类的实践,例子是以一个客服角色对用户问题进行二级分类并要求 GPT 以JSON 格式返回。
其中强调了 delimiter (分隔符)的作用,将需要被分类的用户问题用 delimiter 包裹效果会更好。
import os
import openai
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']
def get_completion_from_messages(messages,
model="gpt-3.5-turbo",
temperature=0,
max_tokens=500):
response = openai.ChatCompletion.create(
model=model,
messages=messages,
temperature=temperature,
max_tokens=max_tokens,
)
return response.choices[0].message["content"]
delimiter = "####"
system_message = f"""
You will be provided with customer service queries. \
The customer service query will be delimited with \
{delimiter} characters.
Classify each query into a primary category \
and a secondary category.
Provide your output in json format with the \
keys: primary and secondary.
Primary categories: Billing, Technical Support, \
Account Management, or General Inquiry.
Billing secondary categories:
Unsubscribe or upgrade
Add a payment method
Explanation for charge
Dispute a charge
Technical Support secondary categories:
General troubleshooting
Device compatibility
Software updates
Account Management secondary categories:
Password reset
Update personal information
Close account
Account security
General Inquiry secondary categories:
Product information
Pricing
Feedback
Speak to a human
"""
user_message = f"""\
I want you to delete my profile and all of my user data"""
messages = [
{'role':'system',
'content': system_message},
{'role':'user',
'content': f"{delimiter}{user_message}{delimiter}"},
]
response = get_completion_from_messages(messages)
print(response)
user_message = f"""\
Tell me more about your flat screen tvs"""
messages = [
{'role':'system',
'content': system_message},
{'role':'user',
'content': f"{delimiter}{user_message}{delimiter}"},
]
response = get_completion_from_messages(messages)
print(response)
讲到可以使用 openai 的 Moderation API 对 GPT 生成的内容进行审核,会返回一系列的打分,主要是四类:仇恨(威胁)、自残、色情、暴力。
然后讲到可以构建 Prompt 让 GPT 帮助对用户的输入进行判断是否为 prompt injection,以及一些预处理,比如可以预先移除用户的输入中可能存在的 delimiter 等。
提到了一个思维链的用例,GPT 作为一个导购分了 5 个步骤进行思考:
然后提到在 Prompt 中要求了 LLM 使用 delimiter 对每一步进行分割,那么在实际输出给用户看的时候可以将输出切分然后输出最后一段即可。
提到了可以对 Prompts 进行链式处理,用以替代 CoT 进行开发。
其好处包括:
以及更便于开发和调试等优点。
讲解了一个使用多个 prompts 来处理复杂任务的例子:
其中最后一步的代码如下:
system_message = f"""
You are a customer service assistant for a \
large electronic store. \
Respond in a friendly and helpful tone, \
with very concise answers. \
Make sure to ask the user relevant follow up questions.
"""
user_message_1 = f"""
tell me about the smartx pro phone and \
the fotosnap camera, the dslr one. \
Also tell me about your tvs"""
messages = [
{'role':'system',
'content': system_message},
{'role':'user',
'content': user_message_1},
{'role':'assistant',
'content': f"""Relevant product information:\n\
{product_information_for_user_message_1}"""},
]
final_response = get_completion_from_messages(messages)
print(final_response)
这一节讲了检查模型输出的几个办法,如使用 openai 的Moderation api 审核内容是否有害,也可以写个Promp让GPT根据提供的产品信息检查输出是否真实,案例中的代码如下:
system_message = f"""
You are an assistant that evaluates whether \
customer service agent responses sufficiently \
answer customer questions, and also validates that \
all the facts the assistant cites from the product \
information are correct.
The product information and user and customer \
service agent messages will be delimited by \
3 backticks, i.e. ```.
Respond with a Y or N character, with no punctuation:
Y - if the output sufficiently answers the question \
AND the response correctly uses product information
N - otherwise
Output a single letter only.
"""
customer_message = f"""
tell me about the smartx pro phone and \
the fotosnap camera, the dslr one. \
Also tell me about your tvs"""
product_information = """{ "name": "SmartX ProPhone", "category": "Smartphones and Accessories", "brand": "SmartX", "model_number": "SX-PP10", "warranty": "1 year", "rating": 4.6, "features": [ "6.1-inch display", "128GB storage", "12MP dual camera", "5G" ], "description": "A powerful smartphone with advanced camera features.", "price": 899.99 } { "name": "FotoSnap DSLR Camera", "category": "Cameras and Camcorders", "brand": "FotoSnap", "model_number": "FS-DSLR200", "warranty": "1 year", "rating": 4.7, "features": [ "24.2MP sensor", "1080p video", "3-inch LCD", "Interchangeable lenses" ], "description": "Capture stunning photos and videos with this versatile DSLR camera.", "price": 599.99 } { "name": "CineView 4K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-4K55", "warranty": "2 years", "rating": 4.8, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "A stunning 4K TV with vibrant colors and smart features.", "price": 599.99 } { "name": "SoundMax Home Theater", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-HT100", "warranty": "1 year", "rating": 4.4, "features": [ "5.1 channel", "1000W output", "Wireless subwoofer", "Bluetooth" ], "description": "A powerful home theater system for an immersive audio experience.", "price": 399.99 } { "name": "CineView 8K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-8K65", "warranty": "2 years", "rating": 4.9, "features": [ "65-inch display", "8K resolution", "HDR", "Smart TV" ], "description": "Experience the future of television with this stunning 8K TV.", "price": 2999.99 } { "name": "SoundMax Soundbar", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-SB50", "warranty": "1 year", "rating": 4.3, "features": [ "2.1 channel", "300W output", "Wireless subwoofer", "Bluetooth" ], "description": "Upgrade your TV's audio with this sleek and powerful soundbar.", "price": 199.99 } { "name": "CineView OLED TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-OLED55", "warranty": "2 years", "rating": 4.7, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "Experience true blacks and vibrant colors with this OLED TV.", "price": 1499.99 }"""
q_a_pair = f"""
Customer message: ```{customer_message}```
Product information: ```{product_information}```
Agent response: ```{final_response_to_customer}```
Does the response use the retrieved information correctly?
Does the response sufficiently answer the question
Output Y or N
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': q_a_pair}
]
response = get_completion_from_messages(messages, max_tokens=1)
print(response)
another_response = "life is like a box of chocolates"
q_a_pair = f"""
Customer message: ```{customer_message}```
Product information: ```{product_information}```
Agent response: ```{another_response}```
Does the response use the retrieved information correctly?
Does the response sufficiently answer the question?
Output Y or N
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': q_a_pair}
]
response = get_completion_from_messages(messages)
print(response)
展示了一个链式Prompt处理用户请求的系统的完整案例
import os
import openai
import sys
sys.path.append('../..')
import utils
import panel as pn # GUI
pn.extension()
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']
def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0, max_tokens=500):
response = openai.ChatCompletion.create(
model=model,
messages=messages,
temperature=temperature,
max_tokens=max_tokens,
)
return response.choices[0].message["content"]
def process_user_message(user_input, all_messages, debug=True):
delimiter = "```"
# Step 1: Check input to see if it flags the Moderation API or is a prompt injection
response = openai.Moderation.create(input=user_input)
moderation_output = response["results"][0]
if moderation_output["flagged"]:
print("Step 1: Input flagged by Moderation API.")
return "Sorry, we cannot process this request."
if debug: print("Step 1: Input passed moderation check.")
category_and_product_response = utils.find_category_and_product_only(user_input, utils.get_products_and_category())
#print(print(category_and_product_response)
# Step 2: Extract the list of products
category_and_product_list = utils.read_string_to_list(category_and_product_response)
#print(category_and_product_list)
if debug: print("Step 2: Extracted list of products.")
# Step 3: If products are found, look them up
product_information = utils.generate_output_string(category_and_product_list)
if debug: print("Step 3: Looked up product information.")
# Step 4: Answer the user question
system_message = f"""
You are a customer service assistant for a large electronic store. \
Respond in a friendly and helpful tone, with concise answers. \
Make sure to ask the user relevant follow-up questions.
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': f"{delimiter}{user_input}{delimiter}"},
{'role': 'assistant', 'content': f"Relevant product information:\n{product_information}"}
]
final_response = get_completion_from_messages(all_messages + messages)
if debug:print("Step 4: Generated response to user question.")
all_messages = all_messages + messages[1:]
# Step 5: Put the answer through the Moderation API
response = openai.Moderation.create(input=final_response)
moderation_output = response["results"][0]
if moderation_output["flagged"]:
if debug: print("Step 5: Response flagged by Moderation API.")
return "Sorry, we cannot provide this information."
if debug: print("Step 5: Response passed moderation check.")
# Step 6: Ask the model if the response answers the initial user query well
user_message = f"""
Customer message: {delimiter}{user_input}{delimiter}
Agent response: {delimiter}{final_response}{delimiter}
Does the response sufficiently answer the question?
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': user_message}
]
evaluation_response = get_completion_from_messages(messages)
if debug: print("Step 6: Model evaluated the response.")
# Step 7: If yes, use this answer; if not, say that you will connect the user to a human
if "Y" in evaluation_response: # Using "in" instead of "==" to be safer for model output variation (e.g., "Y." or "Yes")
if debug: print("Step 7: Model approved the response.")
return final_response, all_messages
else:
if debug: print("Step 7: Model disapproved the response.")
neg_str = "I'm unable to provide the information you're looking for. I'll connect you with a human representative for further assistance."
return neg_str, all_messages
user_input = "tell me about the smartx pro phone and the fotosnap camera, the dslr one. Also what tell me about your tvs"
response,_ = process_user_message(user_input,[])
print(response)
def collect_messages(debug=False):
user_input = inp.value_input
if debug: print(f"User Input = {user_input}")
if user_input == "":
return
inp.value = ''
global context
#response, context = process_user_message(user_input, context, utils.get_products_and_category(),debug=True)
response, context = process_user_message(user_input, context, debug=False)
context.append({'role':'assistant', 'content':f"{response}"})
panels.append(
pn.Row('User:', pn.pane.Markdown(user_input, width=600)))
panels.append(
pn.Row('Assistant:', pn.pane.Markdown(response, width=600, style={'background-color': '#F6F6F6'})))
return pn.Column(*panels)
剩下的部分是对于LLM输出进行评估的办法,以及总结。