Machine Learning: Penalizing the Model
GitHub Repo: ml-streamlit-demo
Bringing a Machine Learning model outside of a notebook environment and turning it into a beautiful data product used to be a lot of work. Luckily, there’s a lot of tooling being developed in this area to make prototyping easier. A while ago, I came across Streamlit, an open source Python library for building custom web apps.
It’s quick and easy to get started and took me less than 30 minutes to build an app with a pre-trained model. Since then, I have been using Streamlit for prototyping models and demonstrating their capabilities. The nice user interface is a refreshing change from using my console for predictions.
I’m going to be sharing the process in this article, using GPT-2 as an example.
In this article, we will:
- load GPT-2 from Hugging Face’s Transformers library
- serve the text generator in a simple web app using Streamlit
Load GPT-2 from Transformers
GPT-2 is a transformer model trained on a very large English corpus to predict the next word in a phrase. Its recent successor, GPT-3, has shocked the world with what it is capable of doing.
Let’s begin by installing all of the Python prerequisites. This is what my requirements.txt looks like. Although the code never imports it directly, GPT-2 requires either TensorFlow 2.0 or PyTorch to be installed.
streamlit==0.56.0
tensorflow==2.2.0
transformers==3.0.2
In this example, we are going to work out of a single Python script. Here is the class for loading GPT-2 and using it to generate text when given a starting phrase. The max_length attribute sets the maximum length of the generated text.
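The original code listing did not survive in this copy, so what follows is only a minimal sketch of what such a class might look like, not the author's exact code. The names load_generator and max_length come from the text; the class name TextGenerator, the method name generate_text, the default of 30, and the sampling settings (do_sample, top_k) are assumptions made for illustration. It targets the TensorFlow variant of the model to match the requirements above.

```python
class TextGenerator:
    """Wraps GPT-2 so it can be loaded once and reused for generation."""

    def __init__(self, max_length=30):
        # Upper bound on the output length in tokens (prompt included).
        # The default of 30 is an assumption, not a value from the article.
        self.max_length = max_length
        self.tokenizer = None
        self.model = None

    def load_generator(self):
        # Imported here so that defining the class does not pull in TensorFlow.
        from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

        self.tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
        # GPT-2 has no padding token; reusing the end-of-sequence token
        # avoids a warning during generation.
        self.model = TFGPT2LMHeadModel.from_pretrained(
            "gpt2", pad_token_id=self.tokenizer.eos_token_id
        )

    def generate_text(self, starting_text):
        if self.model is None:
            raise RuntimeError("call load_generator() before generate_text()")
        input_ids = self.tokenizer.encode(starting_text, return_tensors="tf")
        # Sampling settings are illustrative choices, not the author's.
        output = self.model.generate(
            input_ids,
            max_length=self.max_length,
            do_sample=True,
            top_k=50,
        )
        return self.tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling load_generator() downloads the pre-trained weights on first use, which is the slow step that the rest of the article works around.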
Serve the Model with Streamlit
I’m going to define another function for instantiating the generator. Its purpose is to cache the result of the load_generator method so that subsequent predictions are faster.
The st.cache decorator tells Streamlit to skip execution if the function has already been executed and to reuse the stored result instead.
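To make the caching behaviour concrete without downloading a model, here is a small sketch that uses functools.lru_cache from the standard library as a stand-in for st.cache. It is not Streamlit's implementation, and FakeGenerator is a hypothetical placeholder for the GPT-2 wrapper. In the Streamlit script the decorator would be something like @st.cache(allow_output_mutation=True), so that Streamlit does not try to hash the loaded model.

```python
import functools

load_calls = 0  # counts how often the expensive loading step actually runs


class FakeGenerator:
    """Hypothetical stand-in for the GPT-2 wrapper; loads nothing."""

    def load_generator(self):
        global load_calls
        load_calls += 1


# Stand-in for st.cache; in the app: @st.cache(allow_output_mutation=True)
@functools.lru_cache(maxsize=None)
def instantiate():
    generator = FakeGenerator()
    generator.load_generator()
    return generator


first = instantiate()   # runs the body and performs the "load"
second = instantiate()  # served from the cache; the body is skipped
print(first is second, load_calls)  # True 1
```

Streamlit reruns the whole script on every interaction, so without a cache the model would be reloaded each time the user types a new phrase.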
The entry point for this file defines the layout for the UI.
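A rough sketch of what that layout could look like is shown here. The page title, the text area label, and the function name render_app are assumptions, and the Streamlit module is passed in as a parameter only so the layout logic can be tried without a running app. The calls used (st.title, st.text_area, st.markdown) are standard Streamlit functions.

```python
def render_app(st, generator):
    """Lay out the page.

    `st` is the streamlit module; `generator` is any object with a
    generate_text(str) -> str method, such as the GPT-2 wrapper.
    """
    st.title("GPT-2 Demo")  # hypothetical title
    starting_text = st.text_area("Let GPT-2 finish your thoughts ...")
    if not starting_text:
        return None  # nothing typed yet, so skip the model call
    response = generator.generate_text(starting_text)
    st.markdown(f"Completed phrase: {response}")
    return response


# In the actual script, the entry point would look roughly like this:
#
#     import streamlit as st
#
#     if __name__ == "__main__":
#         render_app(st, instantiate())
```

Because st.text_area returns an empty string until the user types something, the early return keeps the app from calling the model on an empty prompt.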
A Quick Demo
To start the app, run streamlit run filepath.py. If you are using my repository, this would be streamlit run models/gpt_demo.py.
The app should be automatically launched in the browser. If not, it will be at http://localhost:8501/. After the generator is loaded, predictions should be much faster!
I haven’t used this yet, but if you click on the hamburger menu (3 horizontal lines in the top right corner), there is an option for recording your demo using Streamlit. Pretty neat.
Let’s have GPT-2 finish some thoughts!
I am craving …
The secret to building a time machine …
And there you have it — a pretty web app built in under 5 minutes.
Thank you for reading!
If you enjoyed this article, check out my other articles on Data Science, Math and Programming. Follow me on Medium for the latest updates.
I am also building a comprehensive set of free Data Science lessons and practice problems at www.dscrashcourse.com as a hobby project.
If you want to support my writing, consider using my affiliate link the next time you sign up for a Coursera course. Full disclosure — I receive a commission for every enrollment, but it comes at no extra cost for you.
Thank you again for reading!
Translated from: https://towardsdatascience.com/prototyping-machine-learning-models-with-streamlit-1134c34e9620