小胡说人工智能

OpenAI-ChatGPT最新官方接口《从0到1生产最佳实例》全网最详细中英文实用指南和教程，助你零基础快速轻松掌握全新技术（十一）（附源码）

Production Best Practices 生产最佳实例

Preface 前言
Introduction 导言
Setting up your organization 设置您的组织
- Managing billing limits 管理计费限额
- API keys API密钥
- Staging accounts 预发布账户
Building your prototype 构建您的原型
- Additional tips 其它技巧
Techniques for improving reliability around prompts 用于提高提示周围的可靠性的技术
Evaluation and iteration 评估和迭代
- Evaluating language models 评估语言模型
- Automated evaluations 自动化评价
- Example procedure for evaluating a GPT-3-based system 用于评估基于GPT-3的系统的示例程序
Scaling your solution architecture 扩展您的解决方案架构
Managing rate limits 管理速率限制
Improving latencies 改善延迟
- Common factors affecting latency and possible mitigation techniques 影响延迟的常见因素和可能的缓解技术
  - Model 模型
  - Number of completion tokens 完成令牌数
  - Streaming 串流
  - Infrastructure 基础设施
  - Batching 批处理
Managing costs 管理成本
- Text generation 文本生成
MLOps strategy 机器学习操作策略
Security and compliance 安全性和合规性
Safety best practices 安全最佳实践
Other resources 其它资料下载

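Preface 前言

As a senior developer, if you need to build an application on top of ChatGPT and deploy it to production, there is groundwork to settle first: cost control, concurrency and performance monitoring, continuous evaluation and iteration of your machine learning models, and data security and compliance.
作为高级开发工程师，如果你需要开发一个使用ChatGPT的应用程序并部署到生产环境上，那么在此之前，你需要提前考虑完善各项工作。比如如何做好相应的成本控制、并发性能监控，如何持续评估和迭代机器学习模型，以及数据安全性和合规性等方面。

OpenAI's official production best practices guide covers all of these topics. Following it should help us take a product from 0 to 1 at a high standard.
值得一提的是，OpenAI关于ChatGPT的生产最佳实践官方指南覆盖了以上所有内容。相信这一最佳实践指南能够帮助我们从0到1打造出一个高水平的产品。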

Introduction 导言

This guide provides a comprehensive set of best practices to help you transition from prototype to production. Whether you are a seasoned machine learning engineer or a recent enthusiast, this guide should provide you with the tools you need to successfully put the platform to work in a production setting: from securing access to our API to designing a robust architecture that can handle high traffic volumes. Use this guide to help develop a plan for deploying your application as smoothly and effectively as possible.
本指南提供了一套全面的最佳实践，可帮助您从原型过渡到生产。无论您是经验丰富的机器学习工程师还是刚入门的爱好者，本指南都应该为您提供在生产环境中成功使用本平台所需的工具：从保护对我们API的访问，到设计一个可以处理高流量的稳健架构。您可以借助本指南制定计划，尽可能平稳有效地部署应用程序。

Setting up your organization 设置您的组织

Once you log in to your OpenAI account, you can find your organization name and ID in your organization settings. The organization name is the label for your organization, shown in user interfaces. The organization ID is the unique identifier for your organization which can be used in API requests.
登录OpenAI帐户后，您可以在组织设置中找到您的组织名称和ID。组织名称是组织的标签，显示在用户界面中。组织ID是您的组织的唯一标识符，可用于API请求。

Users who belong to multiple organizations can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization’s quota. If no header is provided, the default organization will be billed. You can change your default organization in your user settings.
属于多个组织的用户可以传递一个标头，以指定哪个组织用于API请求。这些API请求的使用量将计入指定组织的配额。如果未提供该标头，则将向默认组织计费。您可以在用户设置中更改默认组织。
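
As a rough illustration of the organization header, the sketch below builds the HTTP headers for a request. The header name `OpenAI-Organization` comes from the OpenAI API reference rather than from this guide, and the helper name `build_headers` and the placeholder key and organization ID are assumptions made up for this example.

```python
from typing import Dict, Optional


def build_headers(api_key: str, organization_id: Optional[str] = None) -> Dict[str, str]:
    """Build HTTP headers for an API request (hypothetical helper)."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Only send the organization header when one is chosen explicitly;
    # without it, usage is billed to the account's default organization.
    if organization_id:
        headers["OpenAI-Organization"] = organization_id
    return headers


# Placeholder values, not real credentials.
example_headers = build_headers("sk-example", "org-example")
```

Keeping the organization ID in configuration instead of in code makes it easier to point the same application at a different organization later.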

You can invite new members to your organization from the members settings page. Members can be readers or owners. Readers can make API requests and view basic organization information, while owners can modify billing information and manage members within an organization.
您可以从成员设置页面邀请新成员加入组织。成员可以是读者或所有者。读者可以发出API请求并查看基本组织信息，而所有者可以修改计费信息并管理组织内的成员。

Managing billing limits 管理计费限额

New free trial users receive an initial credit of $5 that expires after three months. Once the credit has been used or expires, you can choose to enter billing information to continue your use of the API. If no billing information is entered, you will still have login access but will be unable to make any further API requests.
新的免费试用用户将获得5美元的初始信用，三个月后到期。信用额度用完或到期后，您可以选择输入账单信息以继续使用API。如果未输入任何计费信息，您仍将具有登录访问权限，但将无法进行任何进一步的API请求。

Once you’ve entered your billing information, you will have an approved usage limit of $120 per month, which is set by OpenAI. To increase your quota beyond the $120 monthly billing limit, please submit a quota increase request.
一旦您输入了账单信息，您将获得每月120美元的批准使用限额，这是由OpenAI设置的。要将您的配额增加到超过每月120美元的账单限额，请提交配额增加请求。

If you’d like to be notified when your usage exceeds a certain amount, you can set a soft limit through the usage limits page. When the soft limit is reached, the owners of the organization will receive an email notification. You can also set a hard limit so that, once the hard limit is reached, any subsequent API requests will be rejected. Note that these limits are best effort, and there may be 5 to 10 minutes of delay between the usage and the limits being enforced.
如果您希望在使用量超过一定数量时收到通知，您可以通过使用限制页面设置软限制。当达到软限制时，组织的所有者将收到电子邮件通知。您还可以设置硬限制，以便一旦达到硬限制，将拒绝任何后续API请求。请注意，这些限制是尽力而为的，在使用和强制执行的限制之间可能有5到10分钟的延迟。

API keys API密钥

The OpenAI API uses API keys for authentication. Visit your API keys page to retrieve the API key you’ll use in your requests.
OpenAI API使用API密钥进行身份验证。访问您的API密钥页面以检索您将在请求中使用的API密钥。

This is a relatively straightforward way to control access, but you must be vigilant about securing these keys. Avoid exposing the API keys in your code or in public repositories; instead, store them in a secure location. You should expose your keys to your application using environment variables or secret management service, so that you don’t need to hard-code them in your codebase. Read more in our Best practices for API key safety.
这是控制访问的一种相对简单的方法，但您必须对保护这些密钥保持警惕。避免在代码或公共存储库中暴露API密钥;而是将它们存储在安全位置。您应该使用环境变量或秘密管理服务将密钥公开给应用程序，这样就不需要在代码库中硬编码它们。请阅读我们的API密钥安全最佳实践。
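
One minimal way to keep the key out of the codebase is to read it from an environment variable at startup and fail loudly if it is missing. The variable name `OPENAI_API_KEY`, the helper `load_api_key`, and the exception class are assumptions for this sketch; a secret management service would replace the environment lookup.

```python
import os
from typing import Mapping


class MissingAPIKeyError(RuntimeError):
    """Raised when the API key is not configured (hypothetical error type)."""


def load_api_key(env: Mapping[str, str] = os.environ, name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment instead of hard-coding it."""
    key = env.get(name, "").strip()
    if not key:
        # Never fall back to a literal key in source code.
        raise MissingAPIKeyError(f"Set the {name} environment variable")
    return key
```

Passing `env` as a parameter keeps the function easy to test without touching the real process environment.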

Staging accounts 预发布账户

As you scale, you may want to create separate organizations for your staging and production environments. Please note that you can sign up using two separate email addresses like [email protected] and [email protected] to create two organizations. This will allow you to isolate your development and testing work so you don’t accidentally disrupt your live application. You can also limit access to your production organization this way.
随着扩展，您可能希望为临时环境和生产环境创建单独的组织。请注意，您可以使用两个单独的电子邮件地址（如[email protected]和[email protected]）注册，以创建两个组织。这将允许您隔离开发和测试工作，这样您就不会意外地中断活动应用程序。您还可以通过这种方式限制对生产组织的访问。

Building your prototype 构建您的原型

If you haven’t gone through the quickstart guide, we recommend you start there before diving into the rest of this guide.
如果您还没有浏览过快速入门指南，我们建议您在深入阅读本指南的其余部分之前先从快速入门指南开始。

For those new to the OpenAI API, our playground can be a great resource for exploring its capabilities. Doing so will help you learn what’s possible and where you may want to focus your efforts. You can also explore our example prompts.
对于OpenAI API的新手来说，我们的Playground可以成为探索其功能的绝佳资源。这样做可以帮助您了解哪些事情是可行的，以及您可能想把精力集中在哪里。您也可以浏览我们的示例提示。

While the playground is a great place to prototype, it can also be used as an incubation area for larger projects. The playground also makes it easy to export code snippets for API requests and share prompts with collaborators, making it an integral part of your development process.
虽然Playground是一个很好的原型制作场所，但它也可以用作大型项目的孵化区。Playground还可以轻松导出API请求的代码段，并与协作者共享提示，使其成为开发过程中不可或缺的一部分。

Additional tips 其它技巧

Start by determining the core functionalities you want your application to have. Consider the types of data inputs, outputs, and processes you will need. Aim to keep the prototype as focused as possible, so that you can iterate quickly and efficiently.
首先确定您希望应用程序具有的核心功能。考虑您将需要的数据输入、输出和处理的类型。目标是尽可能地保持原型的重点，以便您可以快速有效地迭代。
Choose the programming language and framework that you feel most comfortable with and that best aligns with your goals for the project. Some popular options include Python, Java, and Node.js. See library support page to learn more about the library bindings maintained both by our team and by the broader developer community.
选择你觉得最舒服的编程语言和框架，并且最符合你的项目目标。一些流行的选项包括Python，Java和Node.js。请参阅库支持页面，了解有关我们团队和更广泛的开发人员社区维护的库绑定的更多信息。
Development environment and support: Set up your development environment with the right tools and libraries and ensure you have the resources you need to train your model. Leverage our documentation, community forum and our help center to get help with troubleshooting. If you are developing using Python, take a look at this structuring your project guide (repository structure is a crucial part of your project’s architecture). In order to connect with our support engineers, simply log in to your account and use the “Help” button to start a conversation.
开发环境及支持：使用正确的工具和库设置您的开发环境，并确保您拥有训练模型所需的资源。利用我们的文档、社区论坛和帮助中心获取故障排除帮助。如果您正在使用Python进行开发，请查看此项目结构指南（存储库结构是项目架构的关键部分）。要与我们的支持工程师联系，只需登录您的帐户并使用“帮助”按钮开始对话。

Techniques for improving reliability around prompts 用于提高提示周围的可靠性的技术

Even with careful planning, it’s important to be prepared for unexpected issues when using GPT-3 in your application. In some cases, the model may fail on a task, so it’s helpful to consider what you can do to improve the reliability of your application.
即使经过仔细的规划，在应用程序中使用GPT-3时，为意外问题做好准备也很重要。在某些情况下，模型可能会在某个任务上失败，因此考虑如何提高应用程序的可靠性是很有帮助的。

If your task involves logical reasoning or complexity, you may need to take additional steps to build more reliable prompts. For some helpful suggestions, consult our Techniques to improve reliability guide. Overall the recommendations revolve around:
如果您的任务涉及逻辑推理或复杂性，则可能需要采取其他步骤来构建更可靠的提示。有关一些有用的建议，请参阅我们的提高可靠性技术指南。总的来说，这些建议围绕着：

Decomposing unreliable operations into smaller, more reliable operations (e.g., selection-inference prompting)
将不可靠的操作分解成更小的、更可靠的操作（例如，选择推理提示）
Using multiple steps or multiple relationships to make the system’s reliability greater than any individual component (e.g., maieutic prompting)
使用多个步骤或多个关系来使系统的可靠性大于任何单个组件（例如，助产式提示，即maieutic prompting）

Evaluation and iteration 评估和迭代

One of the most important aspects of developing a system for production is regular evaluation and iterative experimentation. This process allows you to measure performance, troubleshoot issues, and fine-tune your models to improve accuracy and efficiency. A key part of this process is creating an evaluation dataset for your functionality. Here are a few things to keep in mind:
开发用于生产的系统的最重要方面之一是定期评估和迭代实验。此过程允许您测量性能、解决问题并微调模型以提高准确性和效率。此过程的关键部分是为您的功能创建评估数据集。以下是需要牢记的几点：

Make sure your evaluation set is representative of the data your model will be used on in the real world. This will allow you to assess your model’s performance on data it hasn’t seen before and help you understand how well it generalizes to new situations.
请确保评估集代表真实的世界中将使用模型的数据。这将允许您评估模型在以前没有见过的数据上的性能，并帮助您了解它对新情况的泛化能力。
Regularly update your evaluation set to ensure that it stays relevant as your model evolves and as new data becomes available.
定期更新您的评估集，以确保它随着模型的发展和新数据的可用而保持相关性。
Use a variety of metrics to evaluate your model’s performance. Depending on your application and business outcomes, this could include accuracy, precision, recall, F1 score, or mean average precision (MAP). Additionally, you can sync your fine-tunes with Weights & Biases to track experiments, models, and datasets.
使用各种指标来评估模型的性能。根据您的应用程序和业务成果，这可能包括准确率、精确率、召回率、F1分数或平均精度均值（MAP）。此外，您还可以将微调与Weights & Biases同步，以跟踪实验、模型和数据集。
Compare your model’s performance against baseline. This will give you a better understanding of your model’s strengths and weaknesses and can help guide your future development efforts.
将模型的性能与基线进行比较。这将帮助您更好地了解模型的优点和缺点，并有助于指导您未来的开发工作。

By conducting regular evaluation and iterative experimentation, you can ensure that your GPT-powered application or prototype continues to improve over time.
通过进行定期评估和迭代实验，您可以确保GPT驱动的应用程序或原型随着时间的推移不断改进。

Evaluating language models 评估语言模型

Language models can be difficult to evaluate because evaluating the quality of generated language is often subjective, and there are many different ways to communicate the same message correctly in language. For example, when evaluating a model on the ability to summarize a long passage of text, there are many correct summaries. That being said, designing good evaluations is critical to making progress in machine learning.
语言模型可能很难评估，因为评估生成的语言的质量通常是主观的，并且有许多不同的方法可以用语言正确地传达相同的消息。例如，当评估一个模型总结一长段文本的能力时，有许多正确的总结。话虽如此，设计良好的评估对于机器学习取得进展至关重要。

An eval suite needs to be comprehensive, easy to run, and reasonably fast (depending on model size). It also needs to be easy to continue to add to the suite as what is comprehensive one month will likely be out of date in another month. We should prioritize having a diversity of tasks and tasks that identify weaknesses in the models or capabilities that are not improving with scaling.
一个eval套件需要全面、易于运行，并且相当快（取决于模型大小）。它还需要很容易继续添加到套件中，因为一个月的全面内容可能在另一个月就过时了。我们应该优先考虑任务的多样性，这些任务可以识别模型中的弱点或无法随着扩展而改进的功能。

The simplest way to evaluate your system is to manually inspect its outputs. Is it doing what you want? Are the outputs high quality? Are they consistent?
评估系统的最简单方法是手动检查其输出。它在做你想做的事吗？产出是否高质量？它们是一致的吗？

Automated evaluations 自动化评价

The best way to test faster is to develop automated evaluations. However, this may not be possible in more subjective applications like summarization tasks.
加快测试速度的最佳方法是开发自动评估。然而，这在更主观的应用（如摘要任务）中可能是不可能的。

Automated evaluations work best when it’s easy to grade a final output as correct or incorrect. For example, if you’re fine-tuning a classifier to classify text strings as class A or class B, it’s fairly simple: create a test set with example input and output pairs, run your system on the inputs, and then grade the system outputs versus the correct outputs (looking at metrics like accuracy, F1 score, cross-entropy, etc.).
当很容易将最终输出分为正确或不正确时，自动评估工作最好。例如，如果你正在微调一个分类器，将文本字符串分类为A类或B类，这相当简单：使用示例输入和输出对创建一个测试集，在输入上运行系统，然后将系统输出与正确的输出进行比较（查看准确性，F1得分，交叉熵等指标）。
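
The class A / class B grading described above can be sketched in a few lines. The function name `binary_metrics` and the sample labels are made up for illustration; in practice `predicted` would hold the outputs your system produced for the test inputs.

```python
from typing import Dict, Sequence


def binary_metrics(expected: Sequence[str], predicted: Sequence[str],
                   positive: str = "A") -> Dict[str, float]:
    """Grade system outputs against correct outputs for a two-class task."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    pairs = list(zip(expected, predicted))
    tp = sum(e == positive and p == positive for e, p in pairs)  # true positives
    fp = sum(e != positive and p == positive for e, p in pairs)  # false positives
    fn = sum(e == positive and p != positive for e, p in pairs)  # false negatives
    correct = sum(e == p for e, p in pairs)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy test set: the correct labels and what a system under test returned.
scores = binary_metrics(["A", "A", "B", "B"], ["A", "B", "B", "A"])
```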

If your outputs are semi open-ended, as they might be for a meeting notes summarizer, it can be trickier to define success: for example, what makes one summary better than another? Here, possible techniques include:
如果您的输出是半开放式的，就像会议记录摘要器一样，那么定义成功可能会更棘手：例如，是什么让一个总结比另一个更好？这里，可能的技术包括：

Writing a test with ‘gold standard’ answers and then measuring some sort of similarity score between each gold standard answer and the system output (we’ve seen embeddings work decently well for this)
用“黄金标准”答案编写一个测试，然后测量每个黄金标准答案和系统输出之间的某种相似性分数（我们已经看到嵌入在这方面工作得很好）
Building a discriminator system to judge / rank outputs, and then giving that discriminator a set of outputs where one is generated by the system under test (this can even be GPT model that is asked whether the question is answered correctly by a given output)
构建一个鉴别器系统来判断/排序输出，然后给该鉴别器一组输出，其中一个输出由被测系统生成（这甚至可以是GPT模型，该模型被问及给定输出是否正确回答了问题）
Building an evaluation model that checks for the truth of components of the answer; e.g., detecting whether a quote actually appears in the piece of given text
建立一个评估模型，检查答案组成部分的真实性;例如，检测引用是否实际上出现在给定文本的片段中
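
For the first technique, a similarity score between a gold-standard answer and a system output can be computed once both have been turned into embedding vectors. The sketch below assumes the embeddings already exist as plain lists of floats; the function names and the 0.8 threshold are illustrative assumptions, not values recommended by the guide.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as "no similarity"
    return dot / (norm_a * norm_b)


def is_close_to_gold(gold_embedding: Sequence[float], output_embedding: Sequence[float],
                     threshold: float = 0.8) -> bool:
    """Pass the output if it is similar enough to the gold-standard answer."""
    return cosine_similarity(gold_embedding, output_embedding) >= threshold
```

A threshold like this needs calibrating on examples you have graded by hand, since similarity scores vary between embedding models.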

For very open-ended tasks, such as a creative story writer, automated evaluation is more difficult. Although it might be possible to develop quality metrics that look at spelling errors, word diversity, and readability scores, these metrics don’t really capture the creative quality of a piece of writing. In cases where no good automated metric can be found, human evaluations remain the best method.
对于非常开放式的任务，例如创造性的故事作者，自动评估就比较困难。尽管可以开发出质量指标来衡量拼写错误、单词多样性和可读性得分，但这些指标并不能真正反映一篇文章的创造性质量。在无法找到良好的自动化指标的情况下，人工评估仍然是最佳方法。

Example procedure for evaluating a GPT-3-based system 用于评估基于GPT-3的系统的示例程序

As an example, let’s consider the case of building a retrieval-based Q&A system.
作为一个例子，让我们考虑构建一个基于检索的问答系统的情况。

A retrieval-based Q&A system has two steps. First, a user’s query is used to rank potentially relevant documents in a knowledge base. Second, GPT-3 is given the top-ranking documents and asked to generate an answer to the query.
基于检索的Q&A系统有两个步骤。首先，使用用户的查询来对知识库中的潜在相关文档进行排名。第二，GPT-3被赋予最高排名的文档，并被要求生成查询的答案。

Evaluations can be made to measure the performance of each step.
可以进行评估以测量每个步骤的性能。

For the search step, one could:
对于搜索步骤，可以：

First, generate a test set with ~100 questions and a set of correct documents for each
首先，生成一个包含约100个问题的测试集，并为每个问题生成一组正确的文档
The questions can be sourced from user data if you have any; otherwise, you can invent a set of questions with diverse styles and difficulty.
问题可以来源于你有的任何用户数据；否则，你可以发明一套不同风格和难度的问题。
For each question, have a person manually search through the knowledge base and record the set of documents that contain the answer.
对于每个问题，让一个人手动搜索知识库并记录包含答案的文档集。
Second, use the test set to grade the system’s performance
其次，使用测试集对系统的性能进行分级
For each question, use the system to rank the candidate documents (e.g., by cosine similarity of the document embeddings with the query embedding).
对于每个问题，使用系统对候选文档进行排名（例如，通过文档嵌入与查询嵌入的余弦相似性）。
You can score the results with a binary accuracy score of 1 if the candidate documents contain at least 1 relevant document from the answer key and 0 otherwise
如果候选文档中至少包含1个标准答案中的相关文档，则将该结果的二元准确性得分记为1，否则记为0
You can also use a continuous metric like Mean Reciprocal Rank which can help distinguish between answers that were close to being right or far from being right (e.g., a score of 1 if the correct document is rank 1, a score of ½ if rank 2, a score of ⅓ if rank 3, etc.)
您还可以使用连续指标，如平均倒数排名（Mean Reciprocal Rank），它可以帮助区分接近正确和远离正确的答案（例如，如果正确的文档排在第1位，则得分为1；排在第2位，则得分为1/2；排在第3位，则得分为1/3，等等）。
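
Both scoring options for the search step can be sketched as follows. Document IDs are plain strings here, and the function names and the cutoff `k` are assumptions for the example; `ranked` is the system's ordering and `relevant` is the hand-built answer key for one question.

```python
from typing import List, Sequence, Set, Tuple


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> int:
    """Binary accuracy: 1 if any of the top-k documents is in the answer key, else 0."""
    return int(any(doc in relevant for doc in ranked[:k]))


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1/rank of the first relevant document, or 0.0 if none was retrieved."""
    for position, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(results: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over all test questions."""
    if not results:
        return 0.0
    return sum(reciprocal_rank(ranked, relevant) for ranked, relevant in results) / len(results)
```

Averaging `hit_at_k` over the test set gives the binary accuracy, while `mean_reciprocal_rank` rewards systems that place the right document nearer the top.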

For the question answering step, one could:
对于问题回答步骤，可以：

First, generate a test set with ~100 sets of {question, relevant text, correct answer}
首先，生成一个包含约100组{问题，相关文本，正确答案}的测试集
For the questions and relevant texts, use the above data
对于问题和相关文本，使用上述数据
For the correct answers, have a person write down ~100 examples of what a great answer looks like.
对于正确的答案，让一个人写下约100个示例，说明出色的答案是什么样子的。

Second, use the test set to grade the system’s performance
其次，使用测试集对系统的性能进行分级

For each question & text pair, combine them into a prompt and submit the prompt to GPT-3
对于每个问题和文本对，将它们组合成一个提示，并将提示提交给GPT-3
Next, compare GPT-3’s answers to the gold-standard answer written by a human
接下来，将GPT-3的答案与人类写的黄金标准答案进行比较
This comparison can be manual, where humans look at them side by side and grade whether the GPT-3 answer is correct/high quality
这种比较可以是手动的，人类将它们并排看，并对GPT-3答案是否正确/高质量进行评分
This comparison can also be automated, by using embedding similarity scores or another method (automated methods will likely be noisy, but noise is ok as long as it’s unbiased and equally noisy across different types of models that you’re testing against one another)
这种比较也可以通过使用嵌入相似性得分或其他方法来自动化（自动化方法可能会有噪声，但噪声是可以的，只要它是无偏差的，并且在不同类型的模型之间具有相同的噪声）
Of course, N=100 is just an example, and in early stages, you might start with a smaller set that’s easier to generate, and in later stages, you might invest in a larger set that’s more costly but more statistically reliable.
当然， N=100只是一个例子，在早期阶段，你可能会从一个更容易生成的较小集合开始，在后期阶段，你可能会投资一个更大的集合，成本更高，但在统计上更可靠。

Scaling your solution architecture 扩展您的解决方案架构

When designing your application or service for production that uses our API, it’s important to consider how you will scale to meet traffic demands. There are a few key areas you will need to consider regardless of the cloud service provider of your choice:
在设计使用我们的API的生产应用或服务时，重要的是要考虑如何扩展以满足流量需求。无论您选择哪种云服务提供商，都需要考虑以下几个关键领域：

Horizontal scaling: You may want to scale your application out horizontally to accommodate requests to your application that come from multiple sources. This could involve deploying additional servers or containers to distribute the load. If you opt for this type of scaling, make sure that your architecture is designed to handle multiple nodes and that you have mechanisms in place to balance the load between them.
水平缩放：您可能希望横向扩展应用程序，以适应来自多个源的应用程序请求。这可能涉及部署额外的服务器或容器来分配负载。如果您选择这种类型的扩展，请确保您的架构设计为处理多个节点，并且您有适当的机制来平衡它们之间的负载。
Vertical scaling: Another option is to scale your application up vertically, meaning you can beef up the resources available to a single node. This would involve upgrading your server’s capabilities to handle the additional load. If you opt for this type of scaling, make sure your application is designed to take advantage of these additional resources.
垂直缩放：另一种选择是垂直扩展应用程序，这意味着您可以增加单个节点的可用资源。这将涉及到升级服务器的功能以处理额外的负载。如果您选择这种类型的扩展，请确保您的应用程序被设计为利用这些额外的资源。
Caching: By storing frequently accessed data, you can improve response times without needing to make repeated calls to our API. Your application will need to be designed to use cached data whenever possible and invalidate the cache when new information is added. There are a few different ways you could do this. For example, you could store data in a database, filesystem, or in-memory cache, depending on what makes the most sense for your application.
缓存：通过存储频繁访问的数据，您可以缩短响应时间，而无需重复调用我们的API。您的应用程序需要设计为尽可能使用缓存数据，并在添加新信息时使缓存失效。有几种不同的方法可以做到这一点。例如，您可以将数据存储在数据库、文件系统或内存缓存中，这取决于什么对您的应用程序最有意义。
Load balancing: Finally, consider load-balancing techniques to ensure requests are distributed evenly across your available servers. This could involve using a load balancer in front of your servers or using DNS round-robin. Balancing the load will help improve performance and reduce bottlenecks.
负载均衡：最后，考虑负载平衡技术，以确保请求在可用服务器上均匀分布。这可能涉及在服务器前使用负载平衡器或使用DNS轮询。平衡负载将有助于提高性能和减少瓶颈。
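
The caching idea from the list above could look roughly like this in-memory sketch. `CompletionCache` and the `complete` callable are hypothetical: `complete` stands in for whatever function actually calls the API in your application, and a production setup would more likely use a shared store such as a database or a dedicated cache service.

```python
import hashlib
from typing import Callable, Dict


class CompletionCache:
    """In-memory cache keyed by a hash of the prompt (illustrative only)."""

    def __init__(self, complete: Callable[[str], str]) -> None:
        self._complete = complete  # the function that really calls the API
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: call the API once and remember the result.
        self.misses += 1
        result = self._complete(prompt)
        self._store[key] = result
        return result

    def invalidate(self) -> None:
        """Drop all cached answers, e.g. after the underlying data changes."""
        self._store.clear()
```

Exact-match caching only helps when users repeat the same prompt verbatim, so it is worth measuring the hit rate before relying on it.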

Managing rate limits 管理速率限制

When using our API, it’s important to understand and plan for rate limits.
在使用我们的API时，了解和规划速率限制非常重要。
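
A common way to plan for rate limits is to retry with exponential backoff. The guide does not prescribe this code; the sketch below is one possible approach, and `RateLimitError` is a locally defined stand-in for whatever error your HTTP client or SDK raises on a rate-limit response.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the error raised on a rate-limit (HTTP 429) response."""


def call_with_backoff(fn: Callable[[], T], max_retries: int = 5, base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimitError with exponentially growing, jittered delays."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Double the delay on every attempt and add random jitter so that
            # many clients do not retry at the same instant.
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            sleep(delay)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so tests can record the delays instead of actually waiting.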

Improving latencies 改善延迟

Latency is the time it takes for a request to be processed and a response to be returned. In this section, we will discuss some factors that influence the latency of our text generation models and provide suggestions on how to reduce it.
延迟是处理请求和返回响应所花费的时间。在本节中，我们将讨论影响文本生成模型延迟的一些因素，并提供有关如何减少延迟的建议。

The latency of a completion request is mostly influenced by two factors: the model and the number of tokens generated. The life cycle of a completion request looks like this:
完成请求的延迟主要受两个因素的影响：模型和生成的token的数量。完成请求的生命周期如下所示：

The bulk of the latency typically arises from the token generation step.
大部分延迟通常由token生成步骤引起。

Intuition: Prompt tokens add very little latency to completion calls. Time to generate completion tokens is much longer, as tokens are generated one at a time. Longer generation lengths will accumulate latency due to generation required for each token.
直觉：提示符token 几乎不会给完成调用增加延迟。生成完成token的时间要长得多，因为token是一次生成一个。更长的生成长度将由于每个令牌所需的生成而累积延迟。

Common factors affecting latency and possible mitigation techniques 影响延迟的常见因素和可能的缓解技术

Now that we have looked at the basics of latency, let’s take a look at various factors that can affect latency, broadly ordered from most impactful to least impactful.
现在我们已经了解了延迟的基本知识，让我们来看看可能影响延迟的各种因素，从最具影响力到最不具影响力大致排序。

Model 模型

Our API offers different models with varying levels of complexity and generality. The most capable models, such as gpt-4, can generate more complex and diverse completions, but they also take longer to process your query. Models such as gpt-3.5-turbo can generate faster and cheaper chat completions, but they may generate results that are less accurate or relevant for your query. You can choose the model that best suits your use case and the trade-off between speed and quality.
我们的API提供不同的模型，具有不同的复杂性和通用性。功能最强大的模型（如 gpt-4 ）可以生成更复杂和更多样化的补全，但它们也需要更长的时间来处理您的查询。 gpt-3.5-turbo 等模型可以更快、更便宜地生成聊天补全，但生成的结果可能不够准确，或与您的查询相关性较低。您可以选择最适合您的用例的模型，并在速度和质量之间进行权衡。

Number of completion tokens 完成令牌数

Requesting a large number of generated tokens can lead to increased latencies:
请求生成大量令牌可能会导致延迟增加：

Lower max tokens: for requests with a similar token generation count, those that have a lower max_tokens parameter incur less latency.
降低最大令牌数：对于令牌生成数量相近的请求， max_tokens 参数较低的请求延迟更低。
Include stop sequences: to prevent generating unneeded tokens, add a stop sequence. For example, you can use stop sequences to generate a list with a specific number of items. In this case, by using 11. as a stop sequence, you can generate a list with only 10 items, since the completion will stop when 11. is reached. Read our help article on stop sequences for more context on how you can do this.
包括终止序列：要防止生成不需要的令牌，请添加停止序列。例如，可以使用停止序列生成包含特定数量项的列表。在这种情况下，通过使用 11. 作为停止序列，您可以生成一个只有10个项目的列表，因为完成将在达到 11. 时停止。请阅读我们关于停止序列的帮助文章，了解如何执行此操作的更多上下文。
Generate fewer completions: lower the values of n and best_of when possible where n refers to how many completions to generate for each prompt and best_of is used to represent the result with the highest log probability per token.
生成更少的完成：尽可能降低 n 和 best_of 的值，其中 n 是指为每个提示生成多少个完成， best_of 用于表示每个令牌具有最高对数概率的结果。
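
To make the stop-sequence tip concrete, the sketch below builds request parameters for a numbered list and imitates locally what a stop sequence does to the output. `prompt`, `max_tokens`, `stop`, and `n` are completion request parameters; the helper names are made up, the model field is left out on purpose, and `truncate_at_stop` is only a local imitation of behaviour the API applies on the server side.

```python
from typing import Any, Dict, List


def build_list_request(prompt: str, items: int, max_tokens: int = 256) -> Dict[str, Any]:
    """Request parameters for a numbered list that should stop after `items` entries."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        # Generation halts when the model starts the next item, e.g. "11." for 10 items.
        "stop": [f"{items + 1}."],
        "n": 1,
    }


def truncate_at_stop(text: str, stop: List[str]) -> str:
    """Cut text at the earliest stop sequence; the stop sequence itself is dropped."""
    cut = len(text)
    for sequence in stop:
        index = text.find(sequence)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]
```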

If n and best_of both equal 1 (which is the default), the number of generated tokens will be at most, equal to max_tokens.
如果 n 和best_of 都等于1（这是默认值），则生成的令牌的数量将最多等于 max_tokens 。

If n (the number of completions returned) or best_of (the number of completions generated for consideration) are set to > 1, each request will create multiple outputs. Here, you can consider the number of generated tokens as [ max_tokens * max (n, best_of) ]
如果将 n （返回的完成数）或 best_of （生成的完成数）设置为 > 1 ，则每个请求将创建多个输出。在这里，您可以将生成的令牌数视为 [ max_tokens * max (n, best_of) ]
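
The bound above is simple enough to encode directly, which can be handy when budgeting latency or cost per request. The function name is an assumption for this sketch.

```python
def max_generated_tokens(max_tokens: int, n: int = 1, best_of: int = 1) -> int:
    """Upper bound on tokens generated by one request: max_tokens * max(n, best_of)."""
    if max_tokens < 0 or n < 1 or best_of < 1:
        raise ValueError("max_tokens must be >= 0 and n, best_of must be >= 1")
    return max_tokens * max(n, best_of)
```

With the defaults the bound is just `max_tokens`; with `n=3` and `best_of=5`, up to five completions are generated, so the bound is five times `max_tokens`.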

Streaming 串流

Setting stream: true in a request makes the model start returning tokens as soon as they are available, instead of waiting for the full sequence of tokens to be generated. It does not change the time to get all the tokens, but it reduces the time to first token for an application where we want to show partial progress or are going to stop generations. This can improve the user experience, so it's worth experimenting with streaming.
在请求中设置 stream: true 会使模型在令牌可用时立即开始返回令牌，而不是等待生成完整的令牌序列。它不会改变获取所有令牌的时间，但对于需要显示部分进度或可能提前停止生成的应用程序，它缩短了获得第一个令牌的时间。这可以改善用户体验，所以值得尝试流式传输。
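
On the client side, the benefit of streaming comes from handling each chunk as it arrives. The sketch below is independent of any particular SDK: `chunks` is assumed to be an iterable of text pieces (for example, already extracted from streamed responses), and `consume_stream` and `on_chunk` are hypothetical names.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple


def consume_stream(chunks: Iterable[str],
                   on_chunk: Callable[[str], None]) -> Tuple[str, Optional[float]]:
    """Show each chunk immediately; return the full text and the time to first chunk."""
    start = time.perf_counter()
    first_chunk_delay: Optional[float] = None
    parts: List[str] = []
    for chunk in chunks:
        if first_chunk_delay is None:
            # Time to first token is what the user perceives as responsiveness.
            first_chunk_delay = time.perf_counter() - start
        on_chunk(chunk)  # e.g. append to the UI or write to a socket
        parts.append(chunk)
    return "".join(parts), first_chunk_delay
```

Logging the time to first chunk next to the total duration is a simple way to check whether streaming helps in your own application.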

Infrastructure 基础设施

Our servers are currently located in the US. While we hope to have global redundancy in the future, in the meantime you could consider locating the relevant parts of your infrastructure in the US to minimize the roundtrip time between your servers and the OpenAI servers.
我们的服务器目前位于美国。虽然我们希望在未来实现全球冗余，但与此同时，您可以考虑将基础设施的相关部分放在美国，以最大限度地减少服务器和OpenAI服务器之间的往返时间。

Batching 批处理

Depending on your use case, batching may help. If you are sending multiple requests to the same endpoint, you can batch the prompts to be sent in the same request. This will reduce the number of requests you need to make. The prompt parameter can hold up to 20 unique prompts. We advise you to test out this method and see if it helps. In some cases, you may end up increasing the number of generated tokens which will slow the response time.
根据您的用例，批处理可能会有所帮助。如果要向同一端点发送多个请求，则可以批处理要在同一请求中发送的提示。这将减少您需要提出的请求的数量。prompt参数最多可以保存20个唯一提示。我们建议您测试一下这个方法，看看是否有帮助。在某些情况下，您最终可能会增加生成的令牌的数量，这将减慢响应时间。
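
If you try batching, the prompts first have to be grouped so that no request exceeds the limit of 20 prompts mentioned above. A minimal sketch, with a made-up helper name:

```python
from typing import List, Sequence


def batch_prompts(prompts: Sequence[str], batch_size: int = 20) -> List[List[str]]:
    """Split prompts into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(prompts[i:i + batch_size]) for i in range(0, len(prompts), batch_size)]
```

Each batch would then go out as the `prompt` value of a single request. The responses need to be matched back to their prompts, so keep the original order or an index alongside each batch.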

Managing costs 管理成本

To monitor your costs, you can set a soft limit in your account to receive an email alert once you pass a certain usage threshold. You can also set a hard limit. Please be mindful of the potential for a hard limit to cause disruptions to your application/users. Use the usage tracking dashboard to monitor your token usage during the current and past billing cycles.
为了监控您的成本，您可以在帐户中设置软限制，以便在超过特定使用阈值时收到电子邮件提醒。您也可以设置一个硬限制。请注意硬限制可能会对您的应用程序/用户造成中断。使用使用情况跟踪仪表板监控当前和过去计费周期内的令牌使用情况。

Text generation 文本生成

One of the challenges of moving your prototype into production is budgeting for the costs associated with running your application. OpenAI offers a pay-as-you-go pricing model, with prices per 1,000 tokens (roughly equal to 750 words). To estimate your costs, you will need to project the token utilization. Consider factors such as traffic levels, the frequency with which users will interact with your application, and the amount of data you will be processing.
将原型投入生产的挑战之一是为运行应用程序的相关成本进行预算。OpenAI提供了一个按需付费的定价模型，每1,000个token（大约等于750个单词）的价格。要估计成本，您需要预测token利用率。考虑一些因素，如流量水平、用户与应用程序交互的频率以及您将处理的数据量。

One useful framework for thinking about reducing costs is to consider costs as a function of the number of tokens and the cost per token. There are two potential avenues for reducing costs using this framework. First, you could work to reduce the cost per token by switching to smaller models for some tasks in order to reduce costs. Alternatively, you could try to reduce the number of tokens required. There are a few ways you could do this, such as by using shorter prompts, fine-tuning models, or caching common user queries so that they don’t need to be processed repeatedly.
考虑降低成本的一个有用框架是将成本视为token数量和每个token成本的函数。有两个潜在的途径来降低使用该框架的成本。首先，您可以通过为某些任务切换到较小的模型来降低每个令牌的成本，以降低成本。或者，您可以尝试减少所需的令牌数量。有几种方法可以做到这一点，例如使用更短的提示，微调模型，或缓存常见的用户查询，以便它们不需要重复处理。
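
Treating cost as tokens multiplied by price per token makes a rough projection easy to script. All figures below (traffic, tokens per request, and prices per 1,000 tokens) are made-up placeholders, not actual OpenAI prices; check the current pricing page before relying on any estimate.

```python
def estimate_monthly_cost(requests_per_day: int, tokens_per_request: int,
                          price_per_1k_tokens: float, days: int = 30) -> float:
    """Projected spend = total tokens / 1000 * price per 1,000 tokens."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1000 * price_per_1k_tokens


# Hypothetical comparison: same traffic, two placeholder price points.
larger_model_cost = estimate_monthly_cost(1000, 500, price_per_1k_tokens=0.03)
smaller_model_cost = estimate_monthly_cost(1000, 500, price_per_1k_tokens=0.002)
```

The same function shows the effect of the second lever: lowering `tokens_per_request` through shorter prompts or cached answers reduces the total in direct proportion.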

You can experiment with our interactive tokenizer tool to help you estimate costs. The API and playground also return token counts as part of the response. Once you've got things working with our most capable model, you can see if the other models can produce the same results with lower latency and costs. Learn more in our token usage help article.
您可以尝试使用我们的交互式分词器工具来帮助您估算成本。API和Playground还会在响应中返回令牌计数。一旦您用我们最强大的模型把功能跑通，就可以看看其他模型是否能以更低的延迟和成本产生相同的结果。在我们的token使用帮助文章中了解更多信息。

MLOps strategy 机器学习操作策略

As you move your prototype into production, you may want to consider developing an MLOps strategy. MLOps (machine learning operations) refers to the process of managing the end-to-end life cycle of your machine learning models, including any models you may be fine-tuning using our API. There are a number of areas to consider when designing your MLOps strategy. These include:
当您将原型投入生产时，您可能需要考虑制定一个MLOps策略。MLOps（机器学习运维）是指管理机器学习模型的端到端生命周期的过程，包括您可能使用我们的API进行微调的任何模型。在设计MLOps策略时，有许多方面需要考虑。其中包括：

Data and model management: managing the data used to train or fine-tune your model and tracking versions and changes.
数据和模型管理：管理用于训练或微调模型的数据，并跟踪版本和更改。
Model monitoring: tracking your model’s performance over time and detecting any potential issues or degradation.
模型监测：跟踪模型随时间推移的性能，并检测任何潜在问题或性能下降。
Model retraining: ensuring your model stays up to date with changes in data or evolving requirements and retraining or fine-tuning it as needed.
模型再训练：确保您的模型保持与数据或不断变化的需求的变化同步，并根据需要对其进行重新训练或微调。
Model deployment: automating the process of deploying your model and related artifacts into production.
模型部署：自动化将模型和相关工件部署到生产中的过程。

Thinking through these aspects of your application will help ensure your model stays relevant and performs well over time.
仔细考虑应用程序的这些方面将有助于确保您的模型保持相关性，并随着时间的推移表现良好。

Security and compliance 安全性和合规性

As you move your prototype into production, you will need to assess and address any security and compliance requirements that may apply to your application. This will involve examining the data you are handling, understanding how our API processes data, and determining what regulations you must adhere to. For reference, here is our Privacy Policy and Terms of Use.
当您将原型投入生产时，您需要评估和解决可能适用于您的应用程序的任何安全性和合规性要求。这将涉及检查您正在处理的数据，了解我们的API如何处理数据，并确定您必须遵守的法规。以下是我们的隐私政策和使用条款，供您参考。

Some common areas you’ll need to consider include data storage, data transmission, and data retention. You might also need to implement data privacy protections, such as encryption or anonymization where possible. In addition, you should follow best practices for secure coding, such as input sanitization and proper error handling.
您需要考虑的一些常见领域包括数据存储、数据传输和数据保留。您可能还需要实施数据隐私保护，例如在可能的情况下进行加密或匿名化。此外，您应该遵循安全编码的最佳实践，例如输入清理和合适的错误处理。

Safety best practices 安全最佳实践

When creating your application with our API, consider our safety best practices to ensure your application is safe and successful. These recommendations highlight the importance of testing the product extensively, being proactive about addressing potential issues, and limiting opportunities for misuse.
使用我们的API创建应用程序时，请考虑我们的安全最佳实践，以确保您的应用程序安全且成功。这些建议强调了广泛测试产品的重要性，积极主动地解决潜在问题，并限制误用的机会。

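Other resources 其它资料下载

If you would like to keep exploring learning paths and knowledge maps for artificial intelligence, see my other blog post, 《重磅 | 完备的人工智能AI 学习——基础知识学习路线，所有资料免关注免套路直接网盘下载》.
如果大家想继续了解人工智能相关学习路线和知识体系，欢迎大家翻阅我的另外一篇博客《重磅 | 完备的人工智能AI 学习——基础知识学习路线，所有资料免关注免套路直接网盘下载》
That post draws on well-known open-source projects on GitHub, AI technology platforms, and experts in the field, including Datawhale, ApacheCN, AI有道, and Dr. Huang Haiguang, and collects close to 100 GB of related material. I hope it helps.
这篇博客参考了Github知名开源平台，AI技术平台以及相关领域专家：Datawhale，ApacheCN，AI有道和黄海广博士等近100G相关资料，希望能帮助到所有小伙伴们。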

人工智能发展简史——未来是属于AI人工智能的。 AI天才研究院 ChatGPT AI人工智能与大数据人工智能
目录人工智能发展简史第一章：起步期-20世纪50年代及以前1.1计算机象棋博弈（Programmingacomputerforplayingchess）1.2图灵测试（TuringTest）1.3达特茅斯学院人工智能夏季研讨会（DartmouthSummerResearchConferenceonArtificialIntelligence）1.4感知机（Perceptrons）第二章：第一次浪潮
Python对JSON数据操作
在Python中，对JSON数据进行增删改查及加载保存操作，主要通过内置的json模块实现。一、基础操作1.加载JSON数据•从文件加载使用json.load()读取JSON文件并转换为Python对象（字典/列表）：importjsonwithopen('data.json','r',encoding='utf-8')asf:data=json.load(f)•从字符串加载使用json.load
【转载】python json
概念序列化（Serialization）：将对象的状态信息转换为可以存储或可以通过网络传输的过程，传输的格式可以是JSON、XML等。反序列化就是从存储区域（JSON，XML）读取反序列化对象的状态，重新创建该对象。JSON（JavaScriptObjectNotation）：一种轻量级数据交换格式，相对于XML而言更简单，也易于阅读和编写，机器也方便解析和生成，Json是JavaScript中的
算法化资本——智能投顾技术重构金融生态的深度解析田园Coder 人工智能科普人工智能科普
金融市场的数字化进程正经历着本质性跃迁。当传统交易大厅的开放式喊价被服务器集群的低频嗡鸣取代，当投资决策从人类直觉转向概率矩阵计算，一场由人工智能驱动的资本范式革命已悄然降临。智能投顾作为这场变革的核心载体，其技术架构不仅重塑财富管理的运作逻辑，更在认知层面挑战着金融市场的存在根基。理解这场变革的深度与广度，需要穿透技术表象，审视算法与资本结合引发的复杂生态嬗变。智能投顾系统的技术支柱建立于三重认
Python os库完全指南：文件操作必备晨曦543210 Python启航之路 python 开发语言
一、简介Python的os库。这个库主要用于和操作系统交互，比如管理文件、目录、运行系统命令等。二、导入库importos三、基础操作获取当前工作目录current_dir=os.getcwd()print("当前目录:",current_dir)切换目录os.chdir("/path/to/new/directory")列出目录内容files=os.listdir()#不传参数则默认当前目录pr
Python 爬虫实战：Selenium 爬取豆瓣相册（图片分类 + 标签提取）西攻城狮北 python 爬虫 selenium
一、引言豆瓣作为国内知名的社区平台，其相册功能允许用户上传和分享各类图片，涵盖电影海报、音乐专辑、生活记录等多个领域。这些图片数据对于了解用户兴趣、进行内容推荐和市场调研具有重要价值。然而，豆瓣对直接的数据访问设定了诸多限制，因此，本文将介绍如何通过Python爬虫技术结合Selenium自动化工具，合法高效地爬取豆瓣相册图片，并运用深度学习技术实现图片分类和标签提取。二、开发环境搭建（一）编程语
Python JSON操作完全指南
目录一、简介二、JSON和Python的对应关系三、核心函数1.json.dumps()：将Python对象→JSON字符串2.json.loads()：将JSON字符串→Python对象3.json.dump()：将Python对象→JSON文件4.json.load()：从JSON文件→Python对象四、常见错误处理1.JSON解析错误2.类型不支持错误五、总结六、常用函数1️⃣json.d
AI“大航海”时代：企业人力资源的AI-HR实践与效能提升策略
在数字化浪潮的推动下，人工智能（AI）正以前所未有的速度渗透各行各业，人力资源管理（HR）领域也不例外。AI技术的引入与应用落地，不仅提升HR管理效率，更在深层次上带来人力资源运作模式的变革。什么是AI-HR所谓AI-HR，是指将人工智能技术应用于人力资源管理，并通过机器学习、自然语言处理、数据挖掘等技术，优化招聘、培训、绩效评估、员工关系等人力资源各个业务模块。近年来，随着AI技术的成熟和普及，
华为OD机试 - 计算某字符出现次数（Python/JS/C/C++ 2025 B卷 100分）哪吒华为od python javascript 2025B卷华为OD机试
2025B卷华为OD机试统一考试题库清单（持续收录中）以及考点说明（Python/JS/C/C++）。专栏导读本专栏收录于《华为OD机试真题（Python/JS/C/C++）》。刷的越多，抽中的概率越大，私信哪吒，备注华为OD，加入华为OD刷题交流群，每一题都有详细的答题思路、详细的代码注释、3个测试用例、为什么这道题采用XX算法、XX算法的适用场景，发现新题目，随时更新。一、题目描述写出一个程序
华为OD机试 - 取零食 - 动态规划（Python/JS/C/C++ 2024 E卷 100分）哪吒华为od 动态规划 python
2025华为OD机试题库（按算法分类）：2025华为OD统一考试题库清单（持续收录中）以及考点说明（Python/JS/C/C++）。专栏导读本专栏收录于《华为OD机试真题（Python/JS/C/C++）》。刷的越多，抽中的概率越大，私信哪吒，备注华为OD，加入华为OD刷题交流群，每一题都有详细的答题思路、详细的代码注释、3个测试用例、为什么这道题采用XX算法、XX算法的适用场景，发现新题目，随
华为OD机试 - 快速人名查找 - 深度优先搜索dfs（Python/JS/C/C++ 2025 B卷 200分）哪吒华为od 深度优先 python 2025A卷华为OD机试
一、题目描述给一个字符串，表示用","分开的人名。然后给定一个字符串，进行快速人名查找，符合要求的输出。快速人名查找要求：人名的每个单词的连续前几位能组成给定字符串，一定要用到每个单词。二、输入描述第一行是人名，用“，”分开的人名第二行是查找字符串。三、输出描述输出满足要求的人名。四、测试用例测试用例1：1、输入alicebob,charliedelta,alicecharlieac2、输出ali
生成式人工智能认证（GAI认证）含金量怎么样？技能咖 GAI认证生成式人工智能认证人工智能
当生成式人工智能（GenerativeAI）的浪潮以摧枯拉朽之势重塑职业版图时，一个尖锐的问题正悬在无数人的心头：在技术迭代比眨眼更快的时代，如何证明自己具备驾驭AI的核心能力？这场认知革命的背后，一张认证证书的价值早已超越了纸面——它既是个人能力的“信用背书”，也是企业筛选人才的“技术密码”。而生成式人工智能认证（GAI认证）的诞生，恰似一把打开未来之门的密钥，其含金量究竟几何？答案藏在三个维度
全球 AI HR 浪潮下的中国实践：从效率革命到战略重构 weixin_54980836 人工智能重构
一、全球AIHR的技术跃迁与价值重构在DeepSeek、ChatGPT引发的生成式AI革命中，人力资源管理领域正经历着从“工具替代”到“认知重构”的范式转变。Gartner《2025年人力资源技术趋势报告》指出，AI在HR场景的应用已从简历筛选、薪资计算等基础效率工具，升级为支持组织战略决策的“数字伙伴”。这种转变的底层逻辑，源于大模型技术带来的三大突破：多模态交互能力：AI已能同时处理文本、语音
2025上半年最新华为OD机试与面试指南，最新2025B卷独家总结上岸技巧，答读者问！必看！【万字长文，建议收藏】（Python/JS/C/C++）
专栏导读本专栏收录于《华为OD机试真题（Python/JS/C/C++）》。刷的越多，抽中的概率越大，私信哪吒，备注华为OD，加入华为OD刷题交流群，每一题都有详细的答题思路、详细的代码注释、3个测试用例、为什么这道题采用XX算法、XX算法的适用场景，发现新题目，随时更新。2025年5月12日，华为官方已经将华为OD机试（A卷）切换为B卷。目前正在考的是B卷，按照华为OD往常的操作，B卷题目是由往
Jetson Orin NX Super安装TensorRT-LLM u013250861 #LLM/部署&推理 elasticsearch 大数据搜索引擎
根据图片中显示的JetsonOrinNXSuper系统环境（JetPack6.2+CUDA12.6+TensorRT10.7），以下是针对该平台的TensorRT-LLM安装优化方案：一、环境适配调整基于你的实际配置：JetPack6.2（含CUDA12.6,TensorRT10.7）Python3.10.12aarch64架构需选择适配的TensorRT-LLM版本。由于官方预编译包可能未覆盖此
SpringBoot多数据源动态切换方案：AbstractRoutingDataSource详解 fanxbl957 Web spring boot 后端 java
博主介绍：Java、Python、js全栈开发“多面手”，精通多种编程语言和技术，痴迷于人工智能领域。秉持着对技术的热爱与执着，持续探索创新，愿在此分享交流和学习，与大家共进步。DeepSeek-行业融合之万象视界(附实战案例详解100+)全栈开发环境搭建运行攻略：多语言一站式指南(环境搭建+运行+调试+发布+保姆级详解)感兴趣的可以先收藏起来，希望帮助更多的人SpringBoot多数据源动态切换
深入解读MaaS技术架构：从模型服务到智能部署的全流程分析 Cc不爱吃洋葱架构人工智能大语言模型大模型智能部署 MaaS技术架构 LLM
随着人工智能（AI）的迅速发展，MaaS（ModelasaService，模型即服务）技术架构应运而生。它通过将复杂的AI模型封装为标准化服务，降低了模型的开发和部署门槛，帮助企业快速实现业务场景的智能化升级。本文将深入解析MaaS技术架构，详细阐述其各个组成部分以及如何在实际应用中高效发挥其功能。一、使用方层：从应用接入到业务赋能MaaS技术架构的顶层是使用方层，它主要面向第三方应用，是企业与M
想要了解大模型，看懂这一篇就够了！大模型工作流程及核心参数介绍！ Gq.xxu qwen3 vllm transforms 大语言模型部署深度学习人工智能
若想深入探究大模型核心参数的效果与作用，就务必先弄清大模型的工作流程，明确核心参数在流程各阶段的效能与功能，知晓其具体含义。一，大模型的工作流程大模型运行时的工作原理可以概括为输入处理→特征提取→模型推理→结果生成四个核心阶段，整个过程融合了深度学习架构、自然语言处理技术以及分布式计算能力。从用户输入到大模型输出，整个工作的处理流程如下：输入文本→分词→嵌入+位置编码→Transformer多层处
「源力觉醒创作者计划」_以FastDeploy为例部署ERNIE-4.5-21B大模型全流程实践 cooldream2009 大模型基础 AI技术文心大模型 FastDeploy
目录前言1环境准备与依赖安装1.1硬件要求1.2Python环境与pip升级2下载ERNIE-4.5模型权重2.1安装HuggingFaceCLI工具2.2设置国内镜像加速（可选）2.3下载模型文件3安装FastDeploy与Paddle推理引擎3.1安装PaddlePaddle-GPU版本3.2安装FastDeploy-GPU4启动ERNIE-4.5本地服务4.1启动OpenAI兼容API服务4
web报表工具FineReport常见的数据集报错错误代码和解释老A不折腾 web报表 finereport 代码可视化工具
在使用finereport制作报表，若预览发生错误，很多朋友便手忙脚乱不知所措了，其实没什么，只要看懂报错代码和含义，可以很快的排除错误，这里我就分享一下finereport的数据集报错错误代码和解释，如果有说的不准确的地方，也请各位小伙伴纠正一下。 NS-war-remote=错误代码\:1117 压缩部署不支持远程设计 NS_LayerReport_MultiDs=错误代码
Java的WeakReference与WeakHashMap bylijinnan java 弱引用
首先看看 WeakReference wiki 上 Weak reference 的一个例子： public class ReferenceTest { public static void main(String[] args) throws InterruptedException { WeakReference r = new Wea
Linux——（hostname）主机名与ip的映射 eksliang linux hostname
一、什么是主机名无论在局域网还是INTERNET上，每台主机都有一个IP地址，是为了区分此台主机和彼台主机，也就是说IP地址就是主机的门牌号。但IP地址不方便记忆，所以又有了域名。域名只是在公网（INtERNET)中存在，每个域名都对应一个IP地址，但一个IP地址可有对应多个域名。域名类型 linuxsir.org 这样的；主机名是用于什么的呢？答：在一个局域网中，每台机器都有一个主
oracle 常用技巧 18289753290
oracle常用技巧 ①复制表结构和数据 create table temp_clientloginUser as select distinct userid from tbusrtloginlog ②仅复制数据如果表结构一样 insert into mytable select * &nb
使用c3p0数据库连接池时出现com.mchange.v2.resourcepool.TimeoutException 酷的飞上天空 exception
有一个线上环境使用的是c3p0数据库，为外部提供接口服务。最近访问压力增大后台tomcat的日志里面频繁出现 com.mchange.v2.resourcepool.TimeoutException: A client timed out while waiting to acquire a resource from com.mchange.v2.resourcepool.BasicResou
IT系统分析师如何学习大数据蓝儿唯美大数据
我是一名从事大数据项目的IT系统分析师。在深入这个项目前需要了解些什么呢？学习大数据的最佳方法就是先从了解信息系统是如何工作着手，尤其是数据库和基础设施。同样在开始前还需要了解大数据工具，如Cloudera、Hadoop、Spark、Hive、Pig、Flume、Sqoop与Mesos。系统分析师需要明白如何组织、管理和保护数据。在市面上有几十款数据管理产品可以用于管理数据。你的大数据数据库可能
spring学习——简介 a-john spring
Spring是一个开源框架，是为了解决企业应用开发的复杂性而创建的。Spring使用基本的JavaBean来完成以前只能由EJB完成的事情。然而Spring的用途不仅限于服务器端的开发，从简单性，可测试性和松耦合的角度而言，任何Java应用都可以从Spring中受益。其主要特征是依赖注入、AOP、持久化、事务、SpringMVC以及Acegi Security 为了降低Java开发的复杂性，
自定义颜色的xml文件 aijuans xml
<?xml version="1.0" encoding="utf-8"?> <resources> <color name="white">#FFFFFF</color> <color name="black">#000000</color> &
运营到底是做什么的？ aoyouzi 运营到底是做什么的？
文章来源：夏叔叔（微信号：woshixiashushu），欢迎大家关注！很久没有动笔写点东西，近些日子，由于爱狗团产品上线，不断面试，经常会被问道一个问题。问：爱狗团的运营主要做什么？答：带着用户一起嗨。为什么是带着用户玩起来呢？究竟什么是运营？运营到底是做什么的？那么，我们先来回答一个更简单的问题——互联网公司对运营考核什么？以爱狗团为例，绝大部分的移动互联网公司，对运营部门的考核分为三块——用
js面向对象类和对象百合不是茶 js 面向对象函数创建类和对象
接触js已经有几个月了,但是对js的面向对象的一些概念根本就是模糊的,js是一种面向对象的语言但又不像java一样有class,js不是严格的面向对象语言 ,js在java web开发的地位和java不相上下 ,其中web的数据的反馈现在主流的使用json,json的语法和js的类和属性的创建相似下面介绍一些js的类和对象的创建的技术一:类和对
web.xml之资源管理对象配置 resource-env-ref bijian1013 java web.xml servlet
resource-env-ref元素来指定对管理对象的servlet引用的声明，该对象与servlet环境中的资源相关联 <resource-env-ref> <resource-env-ref-name>资源名</resource-env-ref-name> <resource-env-ref-type>查找资源时返回的资源类
Create a composite component with a custom namespace sunjing
https://weblogs.java.net/blog/mriem/archive/2013/11/22/jsf-tip-45-create-composite-component-custom-namespace When you developed a composite component the namespace you would be seeing would
【MongoDB学习笔记十二】Mongo副本集服务器角色之Arbiter bit1129 mongodb
一、复本集为什么要加入Arbiter这个角色回答这个问题，要从复本集的存活条件和Aribter服务器的特性两方面来说。什么是Artiber？ An arbiter does not have a copy of data set and cannot become a primary. Replica sets may have arbiters to add a
Javascript开发笔记白糖_ JavaScript
获取iframe内的元素通常我们使用window.frames["frameId"].document.getElementById("divId").innerHTML这样的形式来获取iframe内的元素，这种写法在IE、safari、chrome下都是通过的，唯独在fireforx下不通过。其实jquery的contents方法提供了对if
Web浏览器Chrome打开一段时间后，运行alert无效 bozch Web chorme alert 无效
今天在开发的时候，突然间发现alert在chrome浏览器就没法弹出了，很是怪异。试了试其他浏览器，发现都是没有问题的。开始想以为是chorme浏览器有啥机制导致的，就开始尝试各种代码让alert出来。尝试结果是仍然没有显示出来。这样开发的结果，如果客户在使用的时候没有提示，那会带来致命的体验。哎，没啥办法了就关闭浏览器重启。结果就好了，这也太怪异了。难道是cho
编程之美-高效地安排会议图着色问题贪心算法 bylijinnan 编程之美
import java.util.ArrayList; import java.util.Collections; import java.util.List; import java.util.Random; public class GraphColoringProblem { /**编程之美高效地安排会议图着色问题贪心算法 * 假设要用很多个教室对一组
机器学习相关概念和开发工具 chenbowen00 算法 matlab 机器学习
基本概念：机器学习(Machine Learning, ML)是一门多领域交叉学科，涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科。专门研究计算机怎样模拟或实现人类的学习行为，以获取新的知识或技能，重新组织已有的知识结构使之不断改善自身的性能。它是人工智能的核心，是使计算机具有智能的根本途径，其应用遍及人工智能的各个领域，它主要使用归纳、综合而不是演绎。开发工具 M
[宇宙经济学]关于在太空建立永久定居点的可能性 comsci 经济
大家都知道,地球上的房地产都比较昂贵,而且土地证经常会因为新的政府的意志而变幻文本格式........ 所以,在地球议会尚不具有在太空行使法律和权力的力量之前,我们外太阳系统的友好联盟可以考虑在地月系的某些引力平衡点上面,修建规模较大的定居点
oracle 11g database control 证书错误 daizj oracle 证书错误 oracle 11G 安装
oracle 11g database control 证书错误 win7 安装完oracle11后打开 Database control 后，会打开em管理页面，提示证书错误，点“继续浏览此网站”，还是会继续停留在证书错误页面解决办法：是 KB2661254 这个更新补丁引起的，它限制了 RSA 密钥位长度少于 1024 位的证书的使用。具体可以看微软官方公告：
Java I/O之用FilenameFilter实现根据文件扩展名删除文件游其是你 FilenameFilter
在Java中，你可以通过实现FilenameFilter类并重写accept(File dir, String name) 方法实现文件过滤功能。在这个例子中，我们向你展示在“c:\\folder”路径下列出所有“.txt”格式的文件并删除。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
C语言数组的简单以及一维数组的简单排序算法示例，二维数组简单示例 dcj3sjt126com c array
# include <stdio.h> int main(void) { int a[5] = {1, 2, 3, 4, 5}; //a 是数组的名字 5是表示数组元素的个数，并且这五个元素分别用a[0], a[1]...a[4] int i; for (i=0; i<5; ++i) printf("%d\n",
PRIMARY, INDEX, UNIQUE 这3种是一类 PRIMARY 主键。就是唯一且不能为空。 INDEX 索引，普通的 UNIQUE 唯一索引 dcj3sjt126com primary
PRIMARY, INDEX, UNIQUE 这3种是一类PRIMARY 主键。就是唯一且不能为空。INDEX 索引，普通的UNIQUE 唯一索引。不允许有重复。FULLTEXT 是全文索引，用于在一篇文章中，检索文本信息的。举个例子来说，比如你在为某商场做一个会员卡的系统。这个系统有一个会员表有下列字段：会员编号 INT会员姓名
java集合辅助类 Collections、Arrays shuizhaosi888 Collections Arrays HashCode
Arrays、Collections 1 ）数组集合之间转换 public static <T> List<T> asList(T... a) { return new ArrayList<>(a); } a）Arrays.asL
Spring Security（10）——退出登录logout 234390216 logout Spring Security 退出登录 logout-url LogoutFilter
要实现退出登录的功能我们需要在http元素下定义logout元素，这样Spring Security将自动为我们添加用于处理退出登录的过滤器LogoutFilter到FilterChain。当我们指定了http元素的auto-config属性为true时logout定义是会自动配置的，此时我们默认退出登录的URL为“/j_spring_secu
透过源码学前端之 Backbone 三 Model 逐行分析JS源代码 backbone 源码分析 js学习
Backbone 分析第三部分 Model 概述： Model 提供了数据存储，将数据以JSON的形式保存在 Model的 attributes里，但重点功能在于其提供了一套功能强大，使用简单的存、取、删、改数据方法，并在不同的操作里加了相应的监听事件，如每次修改添加里都会触发 change，这在据模型变动来修改视图时很常用，并且与collection建立了关联。
SpringMVC源码总结（七）mvc:annotation-driven中的HttpMessageConverter 乒乓狂魔 springMVC
这一篇文章主要介绍下HttpMessageConverter整个注册过程包含自定义的HttpMessageConverter，然后对一些HttpMessageConverter进行具体介绍。 HttpMessageConverter接口介绍： public interface HttpMessageConverter<T> { /** * Indicate
分布式基础知识和算法理论 bluky999 算法 zookeeper 分布式一致性哈希 paxos
分布式基础知识和算法理论 BY [email protected] 本文永久链接：http://nodex.iteye.com/blog/2103218 在大数据的背景下，不管是做存储，做搜索，做数据分析，或者做产品或服务本身，面向互联网和移动互联网用户，已经不可避免地要面对分布式环境。笔者在此收录一些分布式相关的基础知识和算法理论介绍，在完善自我知识体系的同
Android Studio的.gitignore以及gitignore无效的解决 bell0901 android gitignore
　　github上.gitignore模板合集，里面有各种.gitignore ： https://github.com/github/gitignore 　　自己用的Android Studio下项目的.gitignore文件，对github上的android.gitignore添加了　　　　　　# OSX files　　　　　　//mac os下　　　　　　.DS_Store
成为高级程序员的10个步骤 tomcat_oracle 编程
What 软件工程师的职业生涯要历经以下几个阶段：初级、中级，最后才是高级。这篇文章主要是讲如何通过 10 个步骤助你成为一名高级软件工程师。 Why 得到更多的报酬！因为你的薪水会随着你水平的提高而增加提升你的职业生涯。成为了高级软件工程师之后，就可以朝着架构师、团队负责人、CTO 等职位前进历经更大的挑战。随着你的成长，各种影响力也会提高。
mongdb在linux下的安装 xtuhcy mongodb linux
一、查询linux版本号： lsb_release -a LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noa