【OpenAI】新功能发布

OpenAI Dev Day 提供了多项更新,总结如下:

GPT 4-Turbo

  • 现在可以通过API使用GPT 4-Turbo。
  • 提供了更长的128k令牌上下文,之前为32k。
  • 相比GPT-4,成本降低了50%以上。
  • 知识更新至2023年4月,之前为2021年9月。
  • 性能优于GPT-4。
  • API现在支持同时提供图片和文本输入。
  • 新的JSON模式可以强制GPT以纯JSON格式响应。
  • 更宽松的频率限制。

自定义GPTs

  • 用户可以构建针对特定任务的“自定义GPT”。
  • 可以无需编码、使用自然语言创建CustomGPTs,并上传文件作为上下文。
  • 企业可以制作针对公司和组织的专有Custom-GPTs。
  • OpenAI提供了两个自定义GPT示例:Canva和ZapierAI。

自定义GPT商店

  • 用户可以将他们的CustomGPTs上传到商店供他人使用。
  • OpenAI将提供收入分享计划,流行模型的作者将获得收益。

助手API

  • 助手API可以让你构建具有访问工具的自主代理。
  • OpenAI目前提供了三个工具:代码解释器(编程)、检索(自定义知识)和函数调用。
  • 可以通过自定义指令定义其角色,就像使用普通API一样。

高质量语音合成

  • OpenAI发布了tts-1和tts-1-hd模型。
  • tts-1模型优化了速度,而tts-1-hd模型优化了质量。
  • 可以从六种声音类型中选择,通过API创建逼真的人声。

版权保护

  • 当使用OpenAI的产品时,版权保护功能可以保护您和您的公司不受版权索赔的影响。

Whisper V3

  • Whisper是OpenAI的语音转文字模型,能够转录声音并输出文本。
  • Whisper是开源的,V3也以开源形式发布。
  • 目前,Whisper v3通过API(付费)还未上线。

企业定制模型

  • 对于特定公司,OpenAI研究团队将创建具有特定领域知识的企业定制模型。

总结和展望
这些更新表明OpenAI在推进其产品线向更加灵活、可定制和用户友好的方向发展。GPT 4-Turbo和自定义GPTs的引入,将使开发者和企业能够更容易地集成和利用大规模语言模型。特别是,自定义GPT的出现可能会改变企业如何利用AI,使其更贴近企业自身的特定需求。这意味着,AI将越来越多地嵌入到日常工作流程中,为特定的任务和流程提供支持。

随着助手API的引入,开发者现在可以构建更智能、更能自主运行的代理,这可能会减少对如Langchain这类抽象层的需求,因为检索功能已内建于API中。最后,通过商业化的自定义GPT和版权保护,OpenAI正在为用户提供一种更安全、合规且具有商业潜力的使用AI的方式。

Examples of GPT Applications

  • Educational Use: Code.org has created a Lesson Planner GPT to assist teachers in crafting engaging curriculum content, like explaining for-loops via video game analogies for middle schoolers.

  • Design Tool Integration: Canva has developed a GPT that starts design processes through natural language prompts, offering a more intuitive interface for design creation.

  • Workflow Automation: Zapier’s GPT enables action across 6,000 applications, showcasing a live demo by Jessica Shay, which involved integrating with her calendar to schedule and manage tasks.

Creation and Distribution of GPTs

  • Building a GPT: Sam Altman demonstrated building a GPT to provide advice to startup founders and developers, showing the simplicity of the GPT builder.

  • GPT Builder Tool: A walkthrough was provided on using the GPT builder tool, highlighting the user-friendly interface and the ability to upload transcripts for personalized advice.

  • Sharing and Discoverability: GPTs can be made private, shared publicly, or restricted to company use on ChatGPT Enterprise.

  • GPT Store Launch: The upcoming launch of the GPT Store will allow users to list and feature GPTs, with compliance to policies and revenue-sharing for creators.

Developer Opportunities

  • API Integration: The same concepts of GPT customization will be available through the API, with enthusiasm expressed for the agent-like experiences developers have been building.

Summary of Assistants API Announcement

Introduction to Assistants API

  • Shopify Sidekick, Discord’s Clyde, and Snap’s My AI have provided great custom assistant experiences but were challenging to build, often requiring months and large engineering teams.

  • A new Assistants API has been announced to simplify the creation of custom assistant experiences.

Features of the Assistants API

  • Persistent Threads: Eliminates the need to manage long conversation histories.

  • Built-In Retrieval: Allows for easy access and utilization of external data.

  • Code Interpreter: Integrates a working Python interpreter in a sandbox for executing code.

  • Improved Function Calling: Enhanced to guarantee JSON output without added latency and to allow multiple functions to be invoked simultaneously.

Demo Overview - “Wanderlust” Travel App

  • Travel App Creation: Used GPT-4 for destination ideas and DALL·E 3 API for illustrations.

  • Assistant Creation: Simple process involving naming, setting initial instructions, selecting the model, and enabling features like Code Interpreter.

  • API Primitives: Threads and messages facilitate user interactions.

  • Application Integration: Demonstrated by adding an assistant to a travel app, which can interact with maps and perform calculations for trip planning.

Retrieval and State Management

  • File Parsing: Assistants can now parse PDFs and other documents, adding retrieved information to the conversation.

  • Stateful API: Simplifies context management by removing the need for developers to handle the entire conversation history.

Developer Transparency

  • Dashboard Access: Developers can view the steps taken by the assistant within the developer dashboard, including thread activities and uploaded documents.

Code Interpreter Capability

  • Dynamic Code Execution: Allows the AI to perform calculations and generate files on the fly.

Voice Integration and Actions

  • Custom Voice Assistant: Demonstrated a voice-activated assistant using new API modalities.

  • Voice to Text and Text to Voice: Utilized Whisper for voice-to-text conversion and SSI for voice output.

  • Function Calling in Action: Executed a function to distribute OpenAI credits to event attendees.

Closing Statements

  • API Beta Access: The Assistants API enters beta, inviting developers to build with it.

  • Future of Agents: Anticipated growth of agents’ ability to plan and perform complex actions.

  • Feedback-Driven Updates: OpenAI emphasizes the iterative development process based on user feedback.

  • New Developments: Introduction of custom versions of ChatGPT, a new GPT-4 Turbo model, and deeper Microsoft partnership.

Special Announcements

  • Credits Giveaway: The assistant granted $500 in OpenAI credits to all event attendees as a demonstration of its capabilities.

你可能感兴趣的:(OpenAI,语音识别,人工智能)