datamonday

Notes on Andrew Ng's ChatGPT course series: "Finetuning Large Language Models"

Course link: https://learn.deeplearning.ai/finetuning-large-language-models/lesson/1/introduction

Introduction

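Motivation: writing prompts can get an LLM to follow instructions, for example extracting keywords from a text or classifying its sentiment. Fine-tuning, however, makes the model perform a specific task more consistently. For example, fine-tuning can set the tone an LLM uses in conversation.

Course outline: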

Why finetune

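Put simply, fine-tuning uses your own data and a set of techniques to turn a general-purpose model into one that performs a specific task. For example, it turns a general-purpose model such as GPT-3 into a chat-specialized model such as ChatGPT, or GPT-4 into a code-writing model such as GitHub Copilot.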

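The course illustrates this with a doctor analogy. A family doctor is like a general-purpose model; a fine-tuned (specialized) model is like a doctor with a specific specialty, such as a cardiologist or a dermatologist.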

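What does fine-tuning do to a model?

Lets the model take in far more data than fits in a prompt
Lets the model learn from the data
Lets the model produce more consistent outputs
Reduces hallucinations
Lets users turn a general-purpose model into one for a specific use case
The process is very similar to how the model was trained earlier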

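Prompt engineering vs. fine-tuning: pros and cons

The table below compares the pros and cons of prompt engineering and fine-tuning.

	Prompt Engineering	Fine-tuning
Pros	1) Works out of the box, no data needed 2) Low cost of use 3) Low barrier to entry 4) Can connect to your own data through retrieval-augmented generation (RAG)	1) In principle, an unlimited amount of data can be used for fine-tuning 2) The model can learn new information from the data 3) Corrects incorrect information the model already holds 4) Lower cost in later use 5) Can also connect to your own data through RAG
Cons	1) Cannot take in large amounts of data because of the token limit 2) With a lot of input data, the model may forget part of it 3) Hallucinations, which are hard to correct 4) RAG can miss data or retrieve the wrong data, leading to wrong outputs	1) Needs high-quality data 2) Upfront cost of fine-tuning 3) Needs specific technical skills, such as data collection and preprocessing
Scenario	General-purpose use, quick project starts and product prototypes	Industry and enterprise applications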

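Benefits of fine-tuning

Performance:

Reduces hallucinations and keeps the LLM from generating content unrelated to the domain.
Injects more domain knowledge into the model and makes its outputs more consistent and reliable. For example, some models produce high-quality content at first but then fail to keep producing content of the same quality.

Privacy:

Fine-tuning locally prevents data leaks

Cost:

Lower cost per request
More transparency
More control over cost and other factors, such as uptime and latency

Reliability:

Control over uptime
Lower latency
Moderation (less harmful output)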

Third-party libraries for fine-tuning

PyTorch

HuggingFace

Llama Library

微调后的模型对比

The instructor uses the Llama library to compare a fine-tuned model, a non-fine-tuned model, and ChatGPT.

from llama import BasicModelRunner

non_finetuned = BasicModelRunner("meta-llama/Llama-2-7b-hf")
finetuned_model = BasicModelRunner("meta-llama/Llama-2-7b-chat-hf")
chatgpt = BasicModelRunner("chat-gpt")

question1 = "Tell me how to train my dog to sit"
question2 = "What do you think of Mars?"
question3 = "taylor swift's best friend"
question4 = """Agent: I'm here to help you with your Amazon deliver order.
Customer: I didn't get my item
Agent: I'm sorry to hear that. Which item was it?
Customer: the blanket
Agent:"""

non_finetuned(question1)
non_finetuned(question2)
non_finetuned(question3)
non_finetuned(question4)

	Non-fine-tuned	Fine-tuned	ChatGPT
Example 1: Tell me how to train my dog to sit	Tell me how to train my dog to sit. I have a 10 month old puppy and I want to train him to sit. I have tried the treat method and he just sits there and looks at me like I am crazy. I have tried the “sit” command and he just looks at me like I am crazy. I have tried the “sit” command and he just looks at me like I am crazy. I have tried the “sit” command and he just looks at me like I am crazy. I have tried the “sit” command and he just looks at me like I am crazy. I have tried the “sit” command and he just looks at me like I am crazy. I have tried the “sit” command and he just looks at me like I am crazy. I have tried the “sit” command and he just looks at me like I am crazy. I have tried the “sit” command and he just looks at me like I am crazy. I have tried the “sit” command and he just looks at me like I am crazy. I have tried the “sit” command and he just looks	on command. How to Train Your Dog to Sit on Command Training your dog to sit on command is a basic obedience command that can be achieved with patience, consistency, and positive reinforcement. Here’s a step-by-step guide on how to train your dog to sit on command: 1. Choose a Quiet and Distraction-Free Area: Find a quiet area with minimal distractions where your dog can focus on you. 2. Have Treats Ready: Choose your dog’s favorite treats and have them ready to use as rewards. 3. Stand in Front of Your Dog: Stand in front of your dog and hold a treat close to their nose. 4. Move the Treat Above Your Dog’s Head: Slowly move the treat above your dog’s head, towards their tail. As your dog follows the treat with their nose, their bottom will naturally lower into a sitting position. 5. Say “Sit” and Reward: As soon as your dog’s butt touches the ground, say “Sit” and give them the treat. It’s important to say the command word as they’re performing	Training a dog to sit is a basic command that can be taught using positive reinforcement techniques. Here’s a step-by-step guide on how to train your dog to sit: 1. Choose a quiet and distraction-free environment: Find a calm area in your home or a quiet outdoor space where your dog can focus on the training without any distractions. 2. Gather treats: Use small, soft, and tasty treats that your dog loves. These treats will serve as rewards for your dog’s correct behavior. 3. Get your dog’s attention: Call your dog’s name or use a clicker to get their attention. Make sure they are looking at you before proceeding. 4. Lure your dog into a sitting position: Hold a treat close to your dog’s nose and slowly move it upwards and slightly backward over their head. As their nose follows the treat, their bottom will naturally lower into a sitting position. Once they are sitting, say “sit” in a clear and firm voice. 5. Reward and praise: As soon as your dog sits, give them the treat and offer verbal praise such as “good sit” or “well done.” This positive reinforcement helps them associate the action with the reward. 6. Repeat the process: Practice the sit command multiple times in short training
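	The model was never trained to follow this kind of instruction.	The model first auto-completes the instruction and then explains in detail, step by step, how to train a dog to sit. With Llama, if you do not want this auto-completion, wrap the instruction in [INST] and [/INST] tags.	Not noticeably better than the answer from the fine-tuned model.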
Example 2: What do you think of Mars?	I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think it’s a great planet. I think	Mars is a fascinating planet that has captured the imagination of humans for centuries. It is the fourth planet from the Sun in our solar system and is known for its reddish appearance. Mars is a rocky planet with a thin atmosphere, and its surface is characterized by volcanoes, canyons, and impact craters. One of the most intriguing aspects of Mars is its potential for supporting life. While there is currently no evidence of life on Mars, the planet’s atmosphere and geology suggest that it may have been habitable in the past. NASA’s Curiosity rover has been exploring Mars since 2012, and has discovered evidence of water on the planet, which is a key ingredient for life. Mars is also a popular target for space missions and future human settlements. Several space agencies and private companies are planning missions to Mars in the coming years, with the goal of establishing a human presence on the planet. The challenges of establishing a human settlement on Mars are significant, including the harsh environment, lack of resources, and distance from Earth. However, many experts believe that Mars is the next logical step in the expl
	The model does answer the question, but poorly: the answer is very short.	The model produces a longer, higher-quality answer.
Example 3: taylor swift’s best friend	I’m not sure if I’ve mentioned this before, but I’m a huge Taylor Swift fan. I’ve been a fan since her first album, and I’ve been a fan ever since. I’ve been a fan of her music, her style, her personality, and her music. I’ve been a fan of her music, her style, her personality, and her music. I’ve been a fan of her music, her style, her personality, and her music. I’ve been a fan of her music, her style, her personality, and her music. I’ve been a fan of her music, her style, her personality, and her music. I’ve been a fan of her music, her style, her personality, and her music. I’ve been a fan of her music, her style, her personality, and her music. I’ve been a fan of her music, her style, her personality, and her music. I’ve been a fan of her music, her style, her personality, and her music. I’ve been a fan of her music, her style, her personality, and her music. I	Taylor Swift’s best friend is a person who has been by her side through thick and thin. Here are some possible candidates: 1. Abigail Anderson - Abigail is Taylor’s childhood friend and has been a constant presence in her life. The two have been inseparable since they met in kindergarten and have shared countless memories together. 2. Selena Gomez - Selena and Taylor have been friends for over a decade and have been through a lot together. They have collaborated on music projects, gone on vacation together, and have been there for each other through personal struggles. 3. Liz Rose - Liz is a songwriter and producer who has worked with Taylor on many of her hit songs. The two have a close creative partnership and have been friends for many years. 4. Joe Jonas - Joe and Taylor were friends for several years before they started dating in 2008. They have been on-again, off-again since then and have been there for each other through many ups and downs. 5. Calvin Harris - Calvin and Taylor started dating in 2015 and
	The model's answer is off-topic.	The model offers several candidates.
Example 4: """Agent: I’m here to help you with your Amazon deliver order. Customer: I didn’t get my item Agent: I’m sorry to hear that. Which item was it? Customer: the blanket Agent:"""	I’m sorry to hear that. I’ll look into it. Customer: I’m not sure if I got the right blanket. Agent: I’m sorry to hear that. I’ll look into it. Customer: I’m not sure if I got the right blanket. I’m not sure if I got the right blanket. Agent: I’m sorry to hear that. I’ll look into it. I’ll look into it. Customer: I’m not sure if I got the right blanket. I’m not sure if I got the right blanket. I’m not sure if I got the right blanket. Agent: I’m sorry to hear that. I’ll look into it. I’ll look into it. I’ll look into it. Customer: I’m not sure if I got the right blanket. I’m not sure if I got the right blanket. I’m not sure if I got the right blanket. I’m not sure if I got the right blanket. I’m not sure if I got the right blanket. I’m not sure if I got the right blank	I see. Can you please provide me with your order number so I can look into this for you? Customer: I don’t have the order number. Agent: Okay, no worries. Can you please tell me the name of the item you didn’t receive? Customer: Blanket Agent: And what was the shipping address for this item? Customer: 123 Main St Agent: Okay, thank you. I’m going to check on the status of your order. Can you please hold for just a moment? Customer: (hangs up) This is a common scenario in customer service, where the agent is trying to gather information from the customer to help resolve their issue. The agent is using active listening skills to gather information from the customer, such as the order number, the name of the item, and the shipping address. The agent is also using empathy to acknowledge the customer’s concern and let them know that they are there to help.
	The model imitates the ChatGPT-style dialogue format, but the content is meaningless.	Closer to a real conversation, and more reasonable.
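The [INST] tags mentioned in the comparison can be added with a small helper. This is only a minimal sketch: `wrap_inst` is a hypothetical name, and the spacing follows the commonly documented Llama 2 chat prompt format rather than anything shown in these notes.

```python
def wrap_inst(prompt: str) -> str:
    """Wrap a user prompt in Llama 2 chat instruction tags (hypothetical helper)."""
    return f"[INST] {prompt.strip()} [/INST]"


wrapped = wrap_inst("Tell me how to train my dog to sit")
print(wrapped)
```

The wrapped string would then be passed to the chat model in place of the bare prompt.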

Where finetune fits in

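Fine-tuning comes after pre-training.

Pre-training

At the start, the model is completely random (its weights are randomly initialized) and cannot even form words.

The goal of pre-training is next-token prediction, which is usually described as self-supervised learning. In the course's example, the input is "Once" and we want the model to predict "upon", but the untrained model instead assembles a meaningless string of characters.

Dataset source: a huge corpus scraped from the internet, with no labels. Producing a high-quality dataset takes a great deal of data processing. For many large models the dataset is not open-sourced, even though it is really where the core competitive advantage lies. Fortunately, some open-source organizations and companies have released datasets to help the field move forward. For example, EleutherAI released a dataset called The Pile, which combines 22 different datasets scraped from the internet. A carefully curated dataset like this gives the model its knowledge.

After pre-training, the model can correctly predict the next token.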
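To make the next-token objective concrete, here is a minimal sketch of how one sentence becomes several training pairs. It assumes a simple word-level split for readability (real models work on subword tokens), and `next_token_pairs` is a hypothetical helper, not part of the course code.

```python
def next_token_pairs(tokens):
    """Return (context, target) pairs for next-token prediction."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


tokens = ["Once", "upon", "a", "time"]
pairs = next_token_pairs(tokens)
for context, target in pairs:
    print(context, "->", target)
```

Every position in the text supplies its own label, which is why no manual labeling is needed for this objective.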

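What fine-tuning changes

Fine-tuning is the step after pre-training, but you can also fine-tune a model that has already been fine-tuned.

The dataset can be unlabeled data for self-supervised learning
The dataset can also be labeled data
Much less data is needed than for pre-training

Here, fine-tuning refers specifically to fine-tuning for generative tasks. In this setting:

The weights of the entire model are updated, not just a subset of them as with some other models
The training objective is the same as in pre-training (next-token prediction); the aim is to make the model's outputs more consistent
There are many more advanced methods that reduce how much of the model has to be updated

In terms of changing the model's behavior:

Learning to produce more consistent outputs;
Learning to produce less harmful outputs;
Drawing out the model's potential, for example making it better at conversation, where previously a lot of prompt engineering was needed to tease that out.

In terms of learning new knowledge:

The model can learn domain knowledge it did not see during pre-training
The model can correct previously incorrect information along the way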

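Designing the fine-tuning task

Tasks that take text in and produce text out:

Extraction: condense a large amount of input text into shorter text
- “Reading”
- Extract keywords, extract topics, route a chat to an API or an agent
Expansion: take a small amount of text in and produce more text
- “Writing”
- Chat, write emails, write code

Key points

A clearly defined task is key to whether fine-tuning succeeds
Clear means the standard for good versus bad model output is well defined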

Building the fine-tuning dataset

import jsonlines
import itertools
import pandas as pd
from pprint import pprint

import datasets
from datasets import load_dataset


pretrained_dataset = load_dataset("c4", "en", split="train", streaming=True)

n = 2
print("Pretrained dataset:")
top_n = itertools.islice(pretrained_dataset, n)
for i in top_n:
    print(i)
    
"""
{'text': 'Foil plaid lycra and spandex shortall with metallic slinky insets. Attached metallic elastic belt with O-ring. Headband included. Great hip hop or jazz dance costume. Made in the USA.', 'timestamp': '2019-04-25T10:40:23Z', 'url': 'https://awishcometrue.com/Catalogs/Clearance/Tweens/V1960-Find-A-Way'}
{'text': "How many backlinks per day for new site?\nDiscussion in 'Black Hat SEO' started by Omoplata, Dec 3, 2010.\n1) for a newly created site, what's the max # backlinks per day I should do to be safe?\n2) how long do I have to let my site age before I can start making more blinks?\nI did about 6000 forum profiles every 24 hours for 10 days for one of my sites which had a brand new domain.\nThere is three backlinks for every of these forum profile so thats 18 000 backlinks every 24 hours and nothing happened in terms of being penalized or sandboxed. This is now maybe 3 months ago and the site is ranking on first page for a lot of my targeted keywords.\nbuild more you can in starting but do manual submission and not spammy type means manual + relevant to the post.. then after 1 month you can make a big blast..\nWow, dude, you built 18k backlinks a day on a brand new site? How quickly did you rank up? What kind of competition/searches did those keywords have?", 'timestamp': '2019-04-21T12:46:19Z', 'url': 'https://www.blackhatworld.com/seo/how-many-backlinks-per-day-for-new-site.258615/'}
"""

filename = "lamini_docs.jsonl"
instruction_dataset_df = pd.read_json(filename, lines=True)
instruction_dataset_df

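This example uses a conversational dataset for the demonstration. Such a dataset can contain several different kinds of task format, for example:

Question in, answer out
Instruction in, response out
Text in, text out
Other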

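# Convert the DataFrame into a dict of columns so fields can be looked up by name
examples = instruction_dataset_df.to_dict()

# Concatenate the first example's input and output, whichever field names the dataset uses
if "question" in examples and "answer" in examples:
    text = examples["question"][0] + examples["answer"][0]
elif "instruction" in examples and "response" in examples:
    text = examples["instruction"][0] + examples["response"][0]
elif "input" in examples and "output" in examples:
    text = examples["input"][0] + examples["output"][0]
else:
    text = examples["text"][0]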

To give the dataset more structure, format it with a fixed template. A common one is:

prompt_template_qa = """### Question:
{question}

### Answer:
{answer}"""

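To keep input and output separate, the answer is removed from the template. This makes it easier to evaluate the model or to split the dataset into training and test sets.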

prompt_template_q = """### Question:
{question}

### Answer:"""

The three # characters signal to the model what comes next. The same processing is applied to the whole dataset:

num_examples = len(examples["question"])
finetuning_dataset_text_only = []
finetuning_dataset_question_answer = []
for i in range(num_examples):
    question = examples["question"][i]
    answer = examples["answer"][i]

    text_with_prompt_template_qa = prompt_template_qa.format(question=question, answer=answer)
    finetuning_dataset_text_only.append({"text": text_with_prompt_template_qa})

    text_with_prompt_template_q = prompt_template_q.format(question=question)
    finetuning_dataset_question_answer.append({"question": text_with_prompt_template_q, "answer": answer})
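As a quick check of what the two templates produce, the sketch below formats a single made-up record (it is not taken from lamini_docs.jsonl) and confirms that the full training text is just the question-only prompt followed by the answer.

```python
prompt_template_qa = """### Question:
{question}

### Answer:
{answer}"""

prompt_template_q = """### Question:
{question}

### Answer:"""

# Made-up record, for illustration only.
record = {
    "question": "What is fine-tuning?",
    "answer": "Further training of a model on task-specific data.",
}

full_text = prompt_template_qa.format(**record)
prompt_only = prompt_template_q.format(question=record["question"])

print(full_text)
```

Keeping the prompt and the answer in separate fields makes it possible to feed only `prompt_only` to the model at evaluation time and compare its output with the stored answer.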

Finally, save the data in JSON Lines format. It can also be uploaded to Hugging Face so that it can be loaded from the cloud.

with jsonlines.open(f'lamini_docs_processed.jsonl', 'w') as writer:
    writer.write_all(finetuning_dataset_question_answer)

# Pssst! If you were curious how to upload your own dataset to Huggingface
# Here is how we did it

# !pip install huggingface_hub
# !huggingface-cli login

# import pandas as pd
# import datasets
# from datasets import Dataset

# finetuning_dataset = Dataset.from_pandas(pd.DataFrame(data=finetuning_dataset))
# finetuning_dataset.push_to_hub(dataset_path_hf)

Loading the uploaded dataset from Hugging Face:

finetuning_dataset_name = "lamini/lamini_docs"
finetuning_dataset = load_dataset(finetuning_dataset_name)
print(finetuning_dataset)

Instruction-tuning

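Instruction fine-tuning is a fine-tuning technique, and one of the techniques used for ChatGPT. It is widely applied to reasoning, routing, code generation (copilot), chat, and agents.

An instruction-tuned model is also called an instruction-following model. This kind of tuning lets the model follow instructions the way a chatbot does, which gives users a better way to interact and lowers the barrier to using large models.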

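About the datasets

Some datasets of this kind already exist, for example:

FAQs
Customer-support conversations
Messages from instant-messaging apps

If you have no such data, don't worry. One option is to use a prompt template to convert your data into a question-answer or instruction-following format; for example, a README file can be converted into a conversation-style sample.

A second option is to have another LLM do the conversion. Stanford used this approach in its work on the Alpaca model, generating the samples with the help of ChatGPT.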

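Generalization

After instruction fine-tuning, the model generalizes in two ways:

It can still access the knowledge the model already had (from the pre-training dataset)
It can carry instruction-following over to other data, not just the fine-tuning dataset

Fine-tuning workflow

Fine-tuning is an iterative process of continual improvement.

The main difference between instruction fine-tuning and other kinds of fine-tuning is data preparation. Training and evaluation are essentially the same.

The code below prepares the Alpaca dataset for instruction fine-tuning and compares model behavior before and after instruction fine-tuning.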

import itertools
import jsonlines

from datasets import load_dataset
from pprint import pprint

from llama import BasicModelRunner
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModelForSeq2SeqLM


instruction_tuned_dataset = load_dataset("tatsu-lab/alpaca", split="train", streaming=True)

To handle two kinds of prompt and task (one that needs an input and one that does not), the Alpaca authors provide two prompt templates:

prompt_template_with_input = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:"""


prompt_template_without_input = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:"""

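Processing the data in a batch:

# Take the first m examples from the streaming dataset.
# m = 5 is an assumed sample size; these notes do not show how top_m was defined.
m = 5
top_m = list(itertools.islice(instruction_tuned_dataset, m))

processed_data = []
for j in top_m: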
    if not j["input"]:
        processed_prompt = prompt_template_without_input.format(instruction=j["instruction"])
    else:
        processed_prompt = prompt_template_with_input.format(instruction=j["instruction"], input=j["input"])

    processed_data.append({"input": processed_prompt, "output": j["output"]})

pprint(processed_data[0])

{'input': 'Below is an instruction that describes a task. Write a response '
          'that appropriately completes the request.\n'
          '\n'
          '### Instruction:\n'
          'Give three tips for staying healthy.\n'
          '\n'
          '### Response:',
 'output': '1.Eat a balanced diet and make sure to include plenty of fruits '
           'and vegetables. \n'
           '2. Exercise regularly to keep your body active and strong. \n'
           '3. Get enough sleep and maintain a consistent sleep schedule.'}

Saving in JSON Lines format

with jsonlines.open(f'alpaca_processed.jsonl', 'w') as writer:
    writer.write_all(processed_data)

Comparing models before and after instruction tuning

dataset_path_hf = "lamini/alpaca"
dataset_hf = load_dataset(dataset_path_hf)
print(dataset_hf)
"""
DatasetDict({
    train: Dataset({
        features: ['input', 'output'],
        num_rows: 52002
    })
})
"""

non_instruct_model = BasicModelRunner("meta-llama/Llama-2-7b-hf")
non_instruct_output = non_instruct_model("Tell me how to train my dog to sit")
print("Not instruction-tuned output (Llama 2 Base):", non_instruct_output)


"""

Not instruction-tuned output (Llama 2 Base): .
Tell me how to train my dog to sit. I have a 10 month old puppy and I want to train him to sit. I have tried the treat method and he just sits there and looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks

"""

instruct_model = BasicModelRunner("meta-llama/Llama-2-7b-chat-hf")
instruct_output = instruct_model("Tell me how to train my dog to sit")
print("Instruction-tuned output (Llama 2): ", instruct_output)


"""
Instruction-tuned output (Llama 2):  on command.
How to Train Your Dog to Sit on Command
Training your dog to sit on command is a basic obedience command that can be achieved with patience, consistency, and positive reinforcement. Here's a step-by-step guide on how to train your dog to sit on command:
1. Choose a Quiet and Distraction-Free Area: Find a quiet area with minimal distractions where your dog can focus on you.
2. Have Treats Ready: Choose your dog's favorite treats and have them ready to use as rewards.
3. Stand in Front of Your Dog: Stand in front of your dog and hold a treat close to their nose.
4. Move the Treat Above Your Dog's Head: Slowly move the treat above your dog's head, towards their tail. As your dog follows the treat with their nose, their bottom will naturally lower into a sitting position.
5. Say "Sit" and Reward: As soon as your dog's butt touches the ground, say "Sit" and give them the treat. It's important to say the command word as they're performing
"""

chatgpt = BasicModelRunner("chat-gpt")
instruct_output_chatgpt = chatgpt("Tell me how to train my dog to sit")
print("Instruction-tuned output (ChatGPT): ", instruct_output_chatgpt)



"""
Instruction-tuned output (ChatGPT):  Training a dog to sit is a basic command that can be taught using positive reinforcement techniques. Here's a step-by-step guide on how to train your dog to sit:

1. Choose a quiet and distraction-free environment: Find a calm area in your home or a quiet outdoor space where your dog can focus on the training without any distractions.

2. Gather treats: Use small, soft, and tasty treats that your dog loves. These treats will serve as rewards for your dog's correct behavior.

3. Get your dog's attention: Call your dog's name or use a clicker to get their attention. Make sure they are looking at you before proceeding.

4. Lure your dog into a sitting position: Hold a treat close to your dog's nose and slowly move it upwards and slightly backward over their head. As their nose follows the treat, their bottom will naturally lower into a sitting position. Once they are sitting, say "sit" in a clear and firm voice.

5. Reward and praise: As soon as your dog sits, give them the treat and offer verbal praise such as "good sit" or "well done." This positive reinforcement helps them associate the action with the reward.

6. Repeat the process: Practice the sit command multiple times in short training
"""

Performance on a smaller model

The question in the sample is: Can Lamini generate technical documentation or user manuals for software projects?

def inference(text, model, tokenizer, max_input_tokens=1000, max_output_tokens=100):
    # Tokenize
    input_ids = tokenizer.encode(
          text,
          return_tensors="pt",
          truncation=True,
          max_length=max_input_tokens
    )

    # Generate
    # Note: max_length counts the prompt tokens as well as the newly generated ones
    device = model.device
    generated_tokens_with_prompt = model.generate(
        input_ids=input_ids.to(device),
        max_length=max_output_tokens
    )

    # Decode
    generated_text_with_prompt = tokenizer.batch_decode(generated_tokens_with_prompt, skip_special_tokens=True)

    # Strip the prompt
    generated_text_answer = generated_text_with_prompt[0][len(text):]

    return generated_text_answer

finetuning_dataset_path = "lamini/lamini_docs"
finetuning_dataset = load_dataset(finetuning_dataset_path)
print(finetuning_dataset)

test_sample = finetuning_dataset["test"][0]

# A small model with 70 million parameters that has not been instruction-tuned
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")
print(inference(test_sample["question"], model, tokenizer))

"""
I have a question about the following:
How do I get the correct documentation to work?

A: I think you need to use the following code:

A: You can use the following code to get the correct documentation.

A: You can use the following code to get the correct documentation.

A: You can use the following
"""

instruction_model = AutoModelForCausalLM.from_pretrained("lamini/lamini_docs_finetuned")
print(inference(test_sample["question"], instruction_model, tokenizer))

"""
Yes, Lamini can generate technical documentation or user manuals for software projects. This can be achieved by providing a prompt for a specific technical question or question to the LLM Engine, or by providing a prompt for a specific technical question or question. Additionally, Lamini can be trained on specific technical questions or questions to help users understand the process and provide feedback to the LLM Engine. Additionally, Lamini
"""

Data Preparation

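What kind of data should you collect?

How should the data be processed?

What does tokenizing do?

Note: each tokenizer is tied to the specific model it was trained for and must not be used with a different model.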
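The pairing between tokenizer and model can be illustrated with a toy example. The sketch below is not how a real subword tokenizer works: it assumes a whitespace split and two made-up vocabularies, and all function names are hypothetical. It only shows that the same ids decode to different text under a different vocabulary.

```python
def build_vocab(corpus):
    """Assign an integer id to each distinct whitespace-separated word."""
    vocab = {}
    for word in corpus.split():
        vocab.setdefault(word, len(vocab))
    return vocab


def encode(text, vocab):
    """Map each word to its id."""
    return [vocab[word] for word in text.split()]


def decode(ids, vocab):
    """Map ids back to words using the inverse vocabulary."""
    id_to_word = {i: w for w, i in vocab.items()}
    return " ".join(id_to_word[i] for i in ids)


vocab_a = build_vocab("hi how are you")
vocab_b = build_vocab("you are how hi")  # same words, different ids

ids = encode("how are you", vocab_a)
print(ids)
print(decode(ids, vocab_a))  # decoded with the matching vocabulary
print(decode(ids, vocab_b))  # decoded with the wrong vocabulary
```

A model only ever sees the ids, so ids produced by a mismatched tokenizer point at the wrong entries in its embedding table.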

import pandas as pd
import datasets

from pprint import pprint
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")

text = "Hi, how are you?"
encoded_text = tokenizer(text)["input_ids"]
print(encoded_text)

"""
[12764, 13, 849, 403, 368, 32]
"""

decoded_text = tokenizer.decode(encoded_text)
print("Decoded tokens back into text: ", decoded_text)

"""
Decoded tokens back into text:  Hi, how are you?
"""

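During training the model works on fixed-size tensors, so it is essential that the encoded inputs in a batch all have the same length. A common way to achieve this is padding.

The model's input and output lengths are also limited, so over-long text has to be handled. A common solution is truncation, which shortens the encoded text.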
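The two operations can be sketched in plain Python, independent of any tokenizer library. `pad_and_truncate` is a hypothetical helper; the first id list is the encoding of "Hi, how are you?" shown earlier, the shorter rows are illustrative, and a pad id of 0 is an assumption.

```python
def pad_and_truncate(batch, max_length, pad_id=0, truncation_side="right"):
    """Truncate each id list to max_length, then right-pad to the longest remaining length."""
    truncated = []
    for ids in batch:
        if len(ids) > max_length:
            # Left truncation keeps the end of the sequence, right truncation keeps the start.
            ids = ids[-max_length:] if truncation_side == "left" else ids[:max_length]
        truncated.append(list(ids))
    longest = max(len(ids) for ids in truncated)
    return [ids + [pad_id] * (longest - len(ids)) for ids in truncated]


batch = [[12764, 13, 849, 403, 368, 32], [42, 1353, 1175], [4374]]
left = pad_and_truncate(batch, max_length=3, truncation_side="left")
right = pad_and_truncate(batch, max_length=3, truncation_side="right")
print(left)
print(right)
```

Left-side truncation is often the better fit for prompt-plus-answer text, because it keeps the end of the sequence, where the answer sits.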

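# list_texts is a list of three strings to encode; its definition is not shown in these notes
tokenizer.pad_token = tokenizer.eos_token
encoded_texts_longest = tokenizer(list_texts, padding=True)
print("Using padding: ", encoded_texts_longest["input_ids"])

# Truncate every encoding to at most 3 tokens (right-side truncation by default)
encoded_texts_truncation = tokenizer(list_texts, max_length=3, truncation=True)
print("Using truncation: ", encoded_texts_truncation["input_ids"])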

"""
Using truncation:  [[12764, 13, 849], [42, 1353, 1175], [4374]]
"""

tokenizer.truncation_side = "left"
encoded_texts_truncation_left = tokenizer(list_texts, max_length=3, truncation=True)
print("Using left-side truncation: ", encoded_texts_truncation_left["input_ids"])

"""
Using left-side truncation:  [[403, 368, 32], [42, 1353, 1175], [4374]]
"""

encoded_texts_both = tokenizer(list_texts, max_length=3, truncation=True, padding=True)
print("Using both padding and truncation: ", encoded_texts_both["input_ids"])

"""
Using both padding and truncation:  [[403, 368, 32], [42, 1353, 1175], [4374, 0, 0]]
"""

Loading the dataset and formatting it with the template

import pandas as pd

filename = "lamini_docs.jsonl"
instruction_dataset_df = pd.read_json(filename, lines=True)
examples = instruction_dataset_df.to_dict()

if "question" in examples and "answer" in examples:
    text = examples["question"][0] + examples["answer"][0]
elif "instruction" in examples and "response" in examples:
    text = examples["instruction"][0] + examples["response"][0]
elif "input" in examples and "output" in examples:
    text = examples["input"][0] + examples["output"][0]
else:
    text = examples["text"][0]

prompt_template = """### Question:
{question}

### Answer:"""

num_examples = len(examples["question"])
finetuning_dataset = []
for i in range(num_examples):
    question = examples["question"][i]
    answer = examples["answer"][i]
    text_with_prompt_template = prompt_template.format(question=question)
    finetuning_dataset.append({"question": text_with_prompt_template, "answer": answer})

from pprint import pprint
print("One datapoint in the finetuning dataset:")
pprint(finetuning_dataset[0])

Tokenizing the instruction dataset

def tokenize_function(examples):
    if "question" in examples and "answer" in examples:
        text = examples["question"][0] + examples["answer"][0]
    elif "input" in examples and "output" in examples:
        text = examples["input"][0] + examples["output"][0]
    else:
        text = examples["text"][0]

    tokenizer.pad_token = tokenizer.eos_token
    tokenized_inputs = tokenizer(
        text,
        return_tensors="np",
        padding=True,
    )

    max_length = min(
        tokenized_inputs["input_ids"].shape[1],
        2048
    )
    tokenizer.truncation_side = "left"
    tokenized_inputs = tokenizer(
        text,
        return_tensors="np",
        truncation=True,
        max_length=max_length
    )

    return tokenized_inputs

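Splitting into training and test sets

# tokenized_dataset is the dataset after tokenize_function has been applied (that step is not shown in these notes)
split_dataset = tokenized_dataset.train_test_split(test_size=0.1, shuffle=True, seed=123)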
print(split_dataset)


"""
DatasetDict({
    train: Dataset({
        features: ['question', 'answer', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 1260
    })
    test: Dataset({
        features: ['question', 'answer', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 140
    })
})
"""

Training Process

The process is the same as training any other neural network.
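The shape of that loop can be sketched as follows. This is a toy under stated assumptions, not the course's training code: it fits a one-parameter model in plain Python instead of an LLM, and the learning rate and epoch count are arbitrary. The four steps inside the loop are the ones any neural-network training loop repeats.

```python
def train(data, learning_rate=0.05, num_epochs=50):
    """Fit y = w * x by stochastic gradient descent on squared error."""
    w = 0.0
    for epoch in range(num_epochs):
        for x, y in data:                    # one "batch" per example
            prediction = w * x               # forward pass
            loss = (prediction - y) ** 2     # compute the loss
            grad = 2 * (prediction - y) * x  # backward pass: d(loss)/dw
            w -= learning_rate * grad        # optimizer step
    return w


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data generated with w = 2
w = train(data)
print(round(w, 3))
```

In LLM fine-tuning the forward pass is the model call, the loss is the next-token cross-entropy, and the gradient and update steps are handled by the framework.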

Fine-tuning pipeline with the Llama library:

Choose base model.
Load data.
Train it. Returns a model ID, dashboard, and playground interface.

Fine-tuning in three lines of code:

from llama import BasicModelRunner

model = BasicModelRunner("EleutherAI/pythia-410m")
model.load_data_from_jsonlines("lamini_docs.jsonl")
model.train()

import datasets
import tempfile
import logging
import random
import config
import os
import yaml
import time
import torch
import transformers
import pandas as pd

# Course-provided helpers (presumably where tokenize_and_split_data comes from)
from utilities import *
from transformers import AutoTokenizer
from transformers import AutoModelForCausalLM
from transformers import TrainingArguments
from llama import BasicModelRunner

logger = logging.getLogger(__name__)
global_config = None

Loading the dataset

# Load the Lamini docs dataset
dataset_name = "lamini_docs.jsonl"
dataset_path = f"/content/{dataset_name}"
use_hf = False

dataset_path = "lamini/lamini_docs"
use_hf = True

Setting up the model, training config, and tokenizer

# Set up the model, training config, and tokenizer
model_name = "EleutherAI/pythia-70m"

training_config = {
    "model": {
        "pretrained_name": model_name,
        "max_length" : 2048
    },
    "datasets": {
        "use_hf": use_hf,
        "path": dataset_path
    },
    "verbose": True
}

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
train_dataset, test_dataset = tokenize_and_split_data(training_config, tokenizer)

print(train_dataset)
print(test_dataset)

"""
Dataset({
    features: ['question', 'answer', 'input_ids', 'attention_mask', 'labels'],
    num_rows: 1260
})

Dataset({
    features: ['question', 'answer', 'input_ids', 'attention_mask', 'labels'],
    num_rows: 140
})
"""

Loading the base model

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(model_name)
device_count = torch.cuda.device_count()
if device_count > 0:
    logger.debug("Select GPU device")
    device = torch.device("cuda")
else:
    logger.debug("Select CPU device")
    device = torch.device("cpu")
    
base_model.to(device)

"""
GPTNeoXForCausalLM(
  (gpt_neox): GPTNeoXModel(
    (embed_in): Embedding(50304, 512)
    (emb_dropout): Dropout(p=0.0, inplace=False)
    (layers): ModuleList(
      (0-5): 6 x GPTNeoXLayer(
        (input_layernorm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (post_attention_layernorm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (post_attention_dropout): Dropout(p=0.0, inplace=False)
        (post_mlp_dropout): Dropout(p=0.0, inplace=False)
        (attention): GPTNeoXAttention(
          (rotary_emb): GPTNeoXRotaryEmbedding()
          (query_key_value): Linear(in_features=512, out_features=1536, bias=True)
          (dense): Linear(in_features=512, out_features=512, bias=True)
          (attention_dropout): Dropout(p=0.0, inplace=False)
        )
        (mlp): GPTNeoXMLP(
          (dense_h_to_4h): Linear(in_features=512, out_features=2048, bias=True)
          (dense_4h_to_h): Linear(in_features=2048, out_features=512, bias=True)
          (act): GELUActivation()
        )
      )
    )
    (final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
  )
  (embed_out): Linear(in_features=512, out_features=50304, bias=False)
)
"""

Define the inference function

def inference(text, model, tokenizer, max_input_tokens=1000, max_output_tokens=100):
    # Tokenize
    input_ids = tokenizer.encode(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens
    )

    # Generate (max_length counts the prompt tokens as well as the new ones)
    device = model.device
    generated_tokens_with_prompt = model.generate(
        input_ids=input_ids.to(device),
        max_length=max_output_tokens
    )

    # Decode
    generated_text_with_prompt = tokenizer.batch_decode(generated_tokens_with_prompt, skip_special_tokens=True)

    # Strip the prompt
    generated_text_answer = generated_text_with_prompt[0][len(text):]

    return generated_text_answer

Set the training arguments

max_steps = 3
trained_model_name = f"lamini_docs_{max_steps}_steps"
output_dir = trained_model_name

training_args = TrainingArguments(
    # Learning rate
    learning_rate=1.0e-5,

    # Number of training epochs
    num_train_epochs=1,

    # Max steps to train for (each step is a batch of data)
    # Overrides num_train_epochs, if not -1
    max_steps=max_steps,

    # Batch size for training
    per_device_train_batch_size=1,

    # Directory to save model checkpoints
    output_dir=output_dir,

    # Other arguments
    overwrite_output_dir=False, # Overwrite the content of the output directory
    disable_tqdm=False, # Disable progress bars
    eval_steps=120, # Number of update steps between two evaluations
    save_steps=120, # After # steps model is saved
    warmup_steps=1, # Number of warmup steps for learning rate scheduler
    per_device_eval_batch_size=1, # Batch size for evaluation
    evaluation_strategy="steps",
    logging_strategy="steps",
    logging_steps=1,
    optim="adafactor",
    gradient_accumulation_steps = 4,
    gradient_checkpointing=False,

    # Parameters for early stopping
    load_best_model_at_end=True,
    save_total_limit=1,
    metric_for_best_model="eval_loss",
    greater_is_better=False
)

model_flops = (
  base_model.floating_point_ops(
    {
       "input_ids": torch.zeros(
           (1, training_config["model"]["max_length"])
      )
    }
  )
  * training_args.gradient_accumulation_steps
)

print(base_model)
print("Memory footprint", base_model.get_memory_footprint() / 1e9, "GB")
print("Flops", model_flops / 1e9, "GFLOPs")

"""
GPTNeoXForCausalLM(
  (gpt_neox): GPTNeoXModel(
    (embed_in): Embedding(50304, 512)
    (emb_dropout): Dropout(p=0.0, inplace=False)
    (layers): ModuleList(
      (0-5): 6 x GPTNeoXLayer(
        (input_layernorm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (post_attention_layernorm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (post_attention_dropout): Dropout(p=0.0, inplace=False)
        (post_mlp_dropout): Dropout(p=0.0, inplace=False)
        (attention): GPTNeoXAttention(
          (rotary_emb): GPTNeoXRotaryEmbedding()
          (query_key_value): Linear(in_features=512, out_features=1536, bias=True)
          (dense): Linear(in_features=512, out_features=512, bias=True)
          (attention_dropout): Dropout(p=0.0, inplace=False)
        )
        (mlp): GPTNeoXMLP(
          (dense_h_to_4h): Linear(in_features=512, out_features=2048, bias=True)
          (dense_4h_to_h): Linear(in_features=2048, out_features=512, bias=True)
          (act): GELUActivation()
        )
      )
    )
    (final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
  )
  (embed_out): Linear(in_features=512, out_features=50304, bias=False)
)
Memory footprint 0.30687256 GB
Flops 2195.667812352 GFLOPs
"""
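
The printed FLOPs figure can be reproduced by hand. The sketch below assumes that `floating_point_ops` uses the common 6 × tokens × parameters estimate for one forward and backward pass, leaving out the input embedding table. That is an assumption about the library's internals, although it does reproduce the number above exactly. The layer shapes are read from the printed architecture, and the helper name `linear_params` is made up for illustration.

```python
# Rebuild the "Flops" figure from the layer shapes printed above.
# Assumption: the estimate is 6 * tokens * parameters, not counting the
# input embedding table (embed_in).

def linear_params(in_features, out_features, bias=True):
    """Number of weights (and optional biases) in a Linear layer."""
    return in_features * out_features + (out_features if bias else 0)

hidden = 512
vocab = 50304
num_layers = 6
layer_norm = 2 * hidden  # scale + shift (elementwise_affine=True)

per_layer = (
    2 * layer_norm                       # input_layernorm + post_attention_layernorm
    + linear_params(hidden, 3 * hidden)  # query_key_value
    + linear_params(hidden, hidden)      # attention dense
    + linear_params(hidden, 4 * hidden)  # dense_h_to_4h
    + linear_params(4 * hidden, hidden)  # dense_4h_to_h
)

non_embedding_params = (
    num_layers * per_layer
    + layer_norm                                # final_layer_norm
    + linear_params(hidden, vocab, bias=False)  # embed_out
)

tokens = 1 * 2048                # batch of 1, max_length of 2048
gradient_accumulation_steps = 4  # from training_args

flops_per_step = 6 * tokens * non_embedding_params * gradient_accumulation_steps
print(non_embedding_params)      # 44670976
print(flops_per_step / 1e9)      # 2195.667812352
```

Because the figure is multiplied by `gradient_accumulation_steps`, it describes one optimizer step (four forward and backward passes), not a single pass.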

Start training

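# Trainer here takes model_flops and total_steps, which transformers.Trainer
# does not accept, so it is presumably the course's own wrapper class
# (the log lines below come from a logger named "utilities").
trainer = Trainer(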
    model=base_model,
    model_flops=model_flops,
    total_steps=max_steps,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
)

training_output = trainer.train()

2023-09-17 13:15:30,575 - DEBUG - utilities - Step (1) Logs: {'loss': 3.3406, 'learning_rate': 1e-05, 'epoch': 0.0, 'iter_time': 0.0, 'flops': 0.0, 'remaining_time': 0.0}
2023-09-17 13:15:36,577 - DEBUG - utilities - Step (2) Logs: {'loss': 3.2429, 'learning_rate': 5e-06, 'epoch': 0.01, 'iter_time': 6.002344846725464, 'flops': 365801677247.8227, 'remaining_time': 6.002344846725464}
2023-09-17 13:15:42,266 - DEBUG - utilities - Step (3) Logs: {'loss': 3.4016, 'learning_rate': 0.0, 'epoch': 0.01, 'iter_time': 5.845620155334473, 'flops': 375609046432.537, 'remaining_time': 0.0}
2023-09-17 13:15:42,267 - DEBUG - utilities - Step (3) Logs: {'train_runtime': 18.0618, 'train_samples_per_second': 0.664, 'train_steps_per_second': 0.166, 'total_flos': 262933364736.0, 'train_loss': 3.3283629417419434, 'epoch': 0.01, 'iter_time': 5.84603750705719, 'flops': 375582231503.20966, 'remaining_time': 0.0}
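
The learning rates in the log (1e-05, 5e-06, 0.0) are what a linear warmup followed by a linear decay to zero would give with `warmup_steps=1` and `max_steps=3`. The sketch below assumes that schedule was in effect; `linear_schedule_lr` is a hypothetical helper, not a library function. It also checks how many samples the three steps consumed.

```python
# Sketch of a linear warmup + linear decay schedule (assumed to be what
# produced the learning rates in the log above).

def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate after `step` optimizer updates."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

base_lr, warmup_steps, max_steps = 1.0e-5, 1, 3
for step in range(1, max_steps + 1):
    print(step, linear_schedule_lr(step, base_lr, warmup_steps, max_steps))

# Each optimizer step consumes batch_size * gradient_accumulation_steps samples.
samples_per_step = 1 * 4
samples_seen = samples_per_step * max_steps   # 12 samples in 3 steps
epoch_fraction = samples_seen / 1260          # 1260 training rows
print(samples_seen, round(epoch_fraction, 2))
```

Twelve samples over the 18.06 s runtime is about 0.664 samples per second, and 12 of 1260 rows is roughly 0.01 of an epoch, both of which agree with the final log line.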

Save the model

save_dir = f'{output_dir}/final'

trainer.save_model(save_dir)
print("Saved model to:", save_dir)

finetuned_slightly_model = AutoModelForCausalLM.from_pretrained(save_dir, local_files_only=True)

finetuned_slightly_model.to(device)

"""
GPTNeoXForCausalLM(
  (gpt_neox): GPTNeoXModel(
    (embed_in): Embedding(50304, 512)
    (emb_dropout): Dropout(p=0.0, inplace=False)
    (layers): ModuleList(
      (0-5): 6 x GPTNeoXLayer(
        (input_layernorm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (post_attention_layernorm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (post_attention_dropout): Dropout(p=0.0, inplace=False)
        (post_mlp_dropout): Dropout(p=0.0, inplace=False)
        (attention): GPTNeoXAttention(
          (rotary_emb): GPTNeoXRotaryEmbedding()
          (query_key_value): Linear(in_features=512, out_features=1536, bias=True)
          (dense): Linear(in_features=512, out_features=512, bias=True)
          (attention_dropout): Dropout(p=0.0, inplace=False)
        )
        (mlp): GPTNeoXMLP(
          (dense_h_to_4h): Linear(in_features=512, out_features=2048, bias=True)
          (dense_4h_to_h): Linear(in_features=2048, out_features=512, bias=True)
          (act): GELUActivation()
        )
      )
    )
    (final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
  )
  (embed_out): Linear(in_features=512, out_features=50304, bias=False)
)
"""

Moderation: encourage the model not to drift too far off topic.

Evaluation and iteration

Benchmarks

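Evaluating generative models is difficult: there are no clear-cut metrics, and models improve so quickly that evaluation metrics struggle to keep up. As a result, human evaluation is usually the most reliable approach, which means having domain experts assess the model's outputs. A good test dataset (high-quality, accurate, general enough, and not seen in the training set) is the foundation for making good use of that expertise.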

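One popular method is Elo comparison, which is similar to A/B testing across multiple models. A widely adopted open LLM benchmark, developed by EleutherAI, combines several evaluation methods and averages their scores to rank models. It includes: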

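ARC: mostly grade-school science questions
HellaSwag: a test of common sense
MMLU: multiple-choice questions across many subjects, ranging from elementary to professional level
TruthfulQA: measures how prone the model is to reproducing falsehoods commonly found online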
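
The Elo comparison mentioned above can be sketched with the standard Elo update rule. This is a minimal illustration, not the procedure of any particular leaderboard: the K-factor, the starting ratings, the model names, and the pairwise judgments are all made up.

```python
# Minimal Elo update for pairwise model comparisons (standard Elo formula;
# the K-factor and starting ratings are illustrative choices).

def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return new ratings; score_a is 1 (A wins), 0.5 (tie), or 0 (B wins)."""
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

# Hypothetical judgments: (winner, loser) pairs from human raters.
ratings = {"model_a": 1000.0, "model_b": 1000.0, "model_c": 1000.0}
judgments = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]

for winner, loser in judgments:
    ratings[winner], ratings[loser] = update_elo(ratings[winner], ratings[loser], 1.0)

# Print the models from highest to lowest rating.
for name, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(name, round(rating, 1))
```

Each comparison only needs a judgment of which of two outputs is better, which is why the approach pairs naturally with human evaluation.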

The FreeWilly model was produced by fine-tuning Llama-2 using the Orca method.

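Error analysis

Running error analysis on the base model before fine-tuning helps you understand how it behaves and decide which kind of data will help most during fine-tuning.

Some common error categories are listed below:

First: misspellings.

Second: overly long outputs. A concise dataset helps the model answer questions more accurately.

Third: repetitive generation. The fix is to use stop tokens or prompt templates more explicitly, and to make sure the data is diverse and not overly repetitive.

Note that you don't need to focus too much on how the model scores on these benchmarks, because they may have little to do with your business scenario. What really matters is performance on your real use case. The benchmarks above are most informative when you are studying general-purpose models: they help when choosing a base model, and mean much less for a specific fine-tuning task.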
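
A first pass over two of these categories (length and repetition) can be automated before handing outputs to human reviewers. The sketch below is one possible heuristic, not the course's method: the function names, the thresholds, and the sample outputs are all hypothetical. Spelling checks are left out because they need a dictionary or an external tool.

```python
from collections import Counter

def repeated_ngram_fraction(text, n=3):
    """Fraction of word n-grams that are repeats of an earlier n-gram."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)

def flag_outputs(outputs, max_words=50, max_repeat=0.2):
    """Tag each output with the error categories it appears to fall into."""
    report = []
    for text in outputs:
        tags = []
        if len(text.split()) > max_words:
            tags.append("too_long")
        if repeated_ngram_fraction(text) > max_repeat:
            tags.append("repetitive")
        report.append((tags, text))
    return report

# Made-up model outputs for illustration.
samples = [
    "The model supports fine-tuning.",
    "You can use the API. You can use the API. You can use the API. You can use the API.",
]
for tags, text in flag_outputs(samples):
    print(tags, "|", text[:40])
```

Counting how many outputs fall into each category gives a rough picture of which kind of data is worth adding to the fine-tuning set.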

Consideration and practical tips

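Practical steps for fine-tuning:

1) Define the task.

2) Collect data related to the task's inputs and outputs, and organize it.

3) If there isn't enough data, generate more with AI or create it using prompt templates.

4) Fine-tune a small model first (for example, 400M to 1B parameters) and see how it performs.

5) Vary the amount of data used for fine-tuning and observe how it affects the results.

6) Evaluate the model to see what it does well and what it does poorly.

7) Collect more data and keep improving the model based on the evaluation results.

8) Increase the complexity of the task.

9) Increase the model size to handle the more complex task.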

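Trade-off between the fine-tuning task and model size:

Model size required for different levels of task complexity:

Parameter-efficient fine-tuning (PEFT):

The course author recommends Low-Rank Adaptation (LoRA):

The core idea of LoRA is to freeze the main pretrained weights (blue in the course's figure) and train new weights (orange) in some layers of the model. The new weights are rank-decomposition matrices of the change to the original weights. The advantage is that these weights can be trained separately and, at inference time, merged back into the main pretrained weights, which yields the fine-tuned model efficiently.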
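
The merge step can be checked on a toy example. The sketch below uses plain Python lists and the usual LoRA formulation h = W x + (alpha / r) · B (A x). The matrix sizes, values, and the rank of 8 in the parameter count are made up for illustration, and no training happens here. In real LoRA training, B typically starts at zero so the adapter initially leaves the model unchanged.

```python
# Toy LoRA layer in plain Python: W stays frozen, only A and B would be trained.
# Follows the usual formulation h = W x + (alpha / r) * B (A x); sizes are tiny
# and values are made up for illustration.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b, scale=1.0):
    """Element-wise a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

d, k, r, alpha = 3, 4, 2, 2.0           # output dim, input dim, rank, scaling
W = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 0.0, 3.0],
     [2.0, 1.0, 1.0, 0.0]]              # frozen pretrained weight, d x k
B = [[0.5, 0.0], [0.0, 1.0], [1.0, 1.0]]            # trainable, d x r
A = [[1.0, 0.0, 1.0, 0.0], [0.0, 2.0, 0.0, 1.0]]    # trainable, r x k
x = [[1.0], [2.0], [3.0], [4.0]]        # one input column vector, k x 1

scale = alpha / r
# Training-time path: base output plus the low-rank update.
h_adapter = add(matmul(W, x), matmul(B, matmul(A, x)), scale)
# Inference-time path: fold the update into W once, then do a single matmul.
W_merged = add(W, matmul(B, A), scale)
h_merged = matmul(W_merged, x)
print(h_adapter, h_merged)

# Trainable-parameter saving for a 512 x 1536 weight such as query_key_value.
full_params = 512 * 1536
lora_params = 8 * (512 + 1536)          # rank 8 is an arbitrary example
print(full_params, lora_params)
```

Both paths give the same output, which is why the merged model runs at the same inference cost as the original. For the 512 × 1536 weight, a rank-8 adapter trains 16,384 parameters instead of 786,432.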

Conclusion

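This course covered what fine-tuning is, what it does, and why it matters. It also walked through every step from data preparation to training to evaluating the model.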
