微信小程序在url简化
数据(Analytics / ML)到应用程序—零痛苦 (Data (Analytics/ML) to App — with zero pain)
We all want to showcase our data analytics skills. We all want to wrap around that cool machine learning model in a shiny package and present it to the wide world — as an App.
我们都想展示我们的数据分析技能。 我们所有人都希望将这个凉爽的机器学习模型包装成一个闪亮的包装,并将其作为一个应用程序展示给广阔的世界。
How to free yourself from that two-dimensional world of Jupyter notebook and offer your data science model/pipeline to the outside world?
如何使自己摆脱Jupyter笔记本的二维世界,并向外部世界提供数据科学模型/管道?
Streamlit makes it happen. Painlessly.
Streamlit实现了它。 无痛。
It makes the transition from data (analytics or machine learning) to a shiny app with minimal code and almost no requirement of learning and deploying web service frameworks (like Flask or Django).
它使从数据(分析或机器学习)到具有光泽的应用程序的过渡,只需最少的代码,几乎不需要学习和部署Web服务框架(例如Flask或Django )。
Data scientists can test and showcase their model or statistical analyses on a live web-server as they fiddle with the data and algorithms.
数据科学家在摆弄数据和算法时,可以在实时网络服务器上测试和展示其模型或统计分析。
To know more about the history and origin of this great Python package, see these series of videos from their CEO and founder. I am posting just the first video link, you should watch all four videos in this series.
要了解有关此出色Python包的历史和起源的更多信息,请参阅其CEO和创始人的系列视频。 我只是发布第一个视频链接,您应该观看本系列的所有四个视频。
We all want to wrap around that cool machine learning model in a shiny package and present it to the wide world — as an App.
我们所有人都希望将这个凉爽的机器学习模型包装成一个闪亮的包装,并将其作为一个应用程序展示给广阔的世界。
In this article, I want to cover some essential basics of the framework to get you started on your journey to create data-driven apps, in no time!
在本文中,我想介绍该框架的一些基本基础知识,以使您立即开始创建数据驱动的应用程序!
核心工作流程 (The core workflow)
安装 (Install)
You install Streamlit, of course. It is being updated at a rapid pace. Hence, the — U
is recommended.
当然,您将安装Streamlit。 它正在快速更新。 因此,建议使用— U
。
pip install -U streamlit
创建第一个应用 (Creating the first app)
They have a Hello World kind of app for you to try. But why bother with that when you can code? Why not create your first app with your own data/math skills?
他们有一个Hello World类的应用程序供您尝试。 但是,当您可以编码时,为什么还要打扰呢? 为什么不使用自己的数据/数学技能创建第一个应用程序?
We will create a simple web-app to check if a given number is prime or not.
我们将创建一个简单的网络应用程序,以检查给定数字是否为质数。
Open up your favorite IDE/text editor and type in…
打开您喜欢的IDE /文本编辑器,然后输入...
We can save the script as is_prime.py
. Then, just go the terminal (command prompt) and run this app with,
我们可以将脚本另存为is_prime.py
。 然后,只需进入终端(命令提示符)并使用以下命令运行该应用,
streamlit run is_prime.py
You will see a web-server started automatically serving your app! In the terminal, you will see something like this,
您会看到一个网络服务器开始自动为您的应用提供服务! 在终端中,您会看到类似这样的内容,
The port number may be different depending on what other local servers are running and how they are using specific ports.
端口号可能会有所不同,具体取决于正在运行的其他本地服务器以及它们如何使用特定端口。
Data scientists can test and showcase their model or statistical analyses on a live web-server as they fiddle with the data and algorithms.
数据科学家在摆弄数据和算法时,可以在实时网络服务器上测试和展示其模型或统计分析。
查看您的第一个应用程序并与之交互 (Seeing and interacting with your first app)
You can open your browser and head to localhost:8502
. You will see something cool, like this,
您可以打开浏览器并转到localhost:8502
。 您会看到很酷的东西,像这样,
A simple little math app complete with header and HTML formatted text and all — not bad!
一个简单的小数学应用程序,包含标题和HTML格式的文本以及所有内容-不错!
我们如何生成此应用的? (How did we generate this app?)
So, where did they come from? From your Python script, of course. Let us revisit the code piece by piece,
那么,它们来自哪里呢? 当然,从您的Python脚本开始。 让我们逐一回顾代码,
So that st.title()
function created the header (basically an H1 HTML tag).
因此,该st.title()
函数创建了标头(基本上是H1 HTML标签)。
After that, we wrote a piece of simple Markdown within those """..."""
tags. You can put any markdown inside those triple quotes and they will render beautifully. They follow, of course, the same markdown syntax that you use in your Jupyter notebook markdown cells or in the Readme.md
file for your Github repo.
之后,我们在那些"""..."""
标签内编写了一个简单的Markdown。 您可以在这些三重引号内放入任何减价,它们将呈现精美的外观。 当然,它们遵循与您在Jupyter笔记本Markdown单元或Github存储库的Readme.md
文件中使用的markdown语法相同的语法。
The next function st.header()
is also another markdown. We put a *...*
syntax to stylize it (italic).
下一个函数st.header()
也是另一个markdown。 我们使用*...*
语法对其进行样式化(斜体)。
The next portion is what creates the textbox which accepts the user input (number).
下一部分是什么创建了接受用户输入(数字)的文本框。
The function st.number_input()
essentially creates an HTML form/input box with an expectation of a numerical input type. By default, it accepts a floating-point number, and therefore, in the next line, we convert it to a int
type (just because we are checking primality of the input and they should be integers).
函数st.number_input()
本质上会创建一个带有数字输入类型期望HTML表单/输入框。 默认情况下,它接受浮点数,因此,在下一行中,我们将其转换为int
类型(仅因为我们正在检查输入的素数,并且它们应该为整数)。
The next function st.write()
is a very powerful and general-purpose method to render a wide variety of objects — starting from simple text to Pandas DataFrame and Matplotlib visualizations. Here, we are just writing a simple text value by passing in an argument similar to what we would pass on to a Python print()
function.
下一个函数st.write()
是一种非常强大且通用的方法,可以呈现各种对象-从简单的文本到Pandas DataFrame和Matplotlib可视化。 在这里,我们只是通过传递与传递给Python print()
函数类似的参数来编写一个简单的文本值。
So far, this much code will render like this (without any actual math),
到目前为止,如此多的代码将呈现为这种形式(没有任何实际的数学方法),
So, a user can just type in a number and hit ENTER. Your code will process that input.
因此,用户只需输入数字并按Enter。 您的代码将处理该输入。
Next, in your code, you have a simple Python function which checks whether the input number is prime or not. We will not talk about that function but let us see what is happening after that.
接下来,在您的代码中,您有一个简单的Python函数,该函数检查输入数字是否为质数。 我们不会讨论该功能,但让我们看看之后发生了什么。
So, we are calling that function and storing the returned values in two variables — decision
and divisor
. Then, we are writing two st.write()
functions with a if-else
clause — the first one simply indicating that the number is prime and the second one telling that the number is a non-prime along with showing the smallest divisor. Again, note how we simply pass a Python print-
like string to the second st.write()
function.
因此,我们正在调用该函数并将返回的值存储在两个变量中: decision
和divisor
。 然后,我们使用if-else
子句编写两个st.write()
函数-第一个简单地指示数字为质数,第二个简单地指示数字为非质数,并显示最小的除数。 再次注意,我们如何简单地将类似Python的print-
字符串传递给第二个st.write()
函数。
Test your web-app with various inputs and see the results. For example, if I put 35, then it shows,
使用各种输入测试您的Web应用程序并查看结果。 例如,如果我输入35,则表明
If I put 251, then it shows,
如果我输入251,则表明
I hope you got a sense of the code-flow that we have to write to build simple mathematical analysis apps with Streamlit.
希望您对使用Streamlit构建简单的数学分析应用程序所必须编写的代码流有所了解。
But we are data scientists, we strive for more than a mathematical problem-solving. Let’s get on to that!
但是我们是数据科学家,我们不仅仅致力于解决数学问题。 让我们继续吧!
构建机器学习应用程序的组件 (Components to build Machine Learning app)
组件 (Components)
Before we can build a meaningful machine learning app, we need to know how to put data objects/components onto our app with Streamlit. We show how easy it is to render some of the most data components on a Streamlit-powered app.
在构建有意义的机器学习应用程序之前,我们需要知道如何通过Streamlit将数据对象/组件放到我们的应用程序中。 我们展示了在Streamlit驱动的应用程序中渲染大多数数据组件是多么容易。
The Python script is located here. You can download it and run it. I won’t make the article monotonous by repeating the code but will just show the results on the app.
Python脚本位于此处 。 您可以下载并运行它。 我不会通过重复代码来使文章单调,而只会在应用程序上显示结果。
基本减价以创建格式正确的描述 (Basic markdown to create a nicely-formatted description)
Simple markdown syntax produced this description with multiple levels of headers, hyperlinks, and bulleted lists.
简单的markdown语法产生了具有多个级别的标题,超链接和项目符号列表的描述。
Also, this,
还有这个
显示影像 (Displaying images)
The default markdown image tag is not suitable for controlling the image size. So, we should use st.image()
method to display images.
默认的markdown图片标记不适用于控制图片尺寸。 因此,我们应该使用st.image()
方法显示图像。
Here is a screenshot from the Streamlit website. The image is hosted on my Github repo and we just pass on the URL.
这是Streamlit网站的屏幕截图。 该图片托管在我的Github存储库中 ,我们只传递了URL。
LaTeX渲染 (LaTeX rendering)
The default markdown already handles LaTeX. We can separately use st.latex()
to render LaTeX content.
默认的markdown已经可以处理LaTeX。 我们可以单独使用st.latex()
呈现LaTeX内容。
码 (Code)
We can use st.code()
to render code blocks nicely with optional syntax highlighting.
我们可以使用st.code()
通过可选的语法突出显示来很好地呈现代码块。
本机Python对象 (Native Python objects)
Python objects like lists and dictionaries are rendered in a pretty and visually appealing manner. We use the versatile st.write()
method for all such rendering.
诸如列表和字典之类的Python对象以一种美观且吸引人的方式呈现。 我们使用通用的st.write()
方法进行所有此类渲染。
numpy数组 (Numpy arrays)
Numpy arrays (one- and two-dimensional) are also rendered nicely by the st.write()
method, although for long arrays the vertical rendering can become unwieldy.
用st.write()
方法也可以很好地渲染多个数组(一维和二维),尽管对于长数组而言,垂直渲染可能会变st.write()
笨拙。
熊猫数据框 (Pandas DataFrame)
We can render a Pandas DataFrame either by using st.write()
or st.dataframe()
methods.
我们可以使用st.write()
或st.dataframe()
方法来渲染Pandas DataFrame。
Every time the page refreshes, the code generates new random data, and the table below regenerates as well.
每次页面刷新时 ,代码都会生成新的随机数据,并且下表也会重新生成 。
在DataFrame上应用过滤器 (Applying a filter on the DataFrame)
We filter the DataFrame by selecting only those rows where A1
> 0 and A3
> 3. Note that due to the random nature of the DataFrame generation, there is no guarantee that we will get a non-empty DataFrame every time we re-run the code.
我们仅通过选择A1
> 0和A3
> 3的行来过滤DataFrame。请注意,由于DataFrame生成的随机性,我们无法保证每次重新运行DataFrame时都会获得非空的DataFrame。代码 。
将DataFrame写入文件 (Write the DataFrame to a file)
We can easily ask the user a filename and write the filtered data to that file. Of course, we have to write a bit of code (e.g. user input validity, filename checking logic, etc.) behind the scene, but the resulting app interface is clean and refreshing.
我们可以轻松地向用户询问文件名,并将过滤后的数据写入该文件。 当然,我们必须在幕后编写一些代码(例如,用户输入有效性,文件名检查逻辑等),但是生成的应用程序界面干净整洁。
从网上读取数据文件 (Reading a data file from the web)
Reading data from a remotely hosted file, formatting the file a bit, and rendering in a DataFrame is as easy as the brief code below,
从远程托管的文件中读取数据,对该文件进行一些格式化以及在DataFrame中进行渲染,就像下面的简短代码一样容易,
使用Pandas内置的“绘图”模块的简单条形图 (A simple bar chart using Pandas built-in ‘plot’ module)
We can work on the dataset that we just imported from the web, and call built-in plotting methods.
我们可以处理刚从网络上导入的数据集,并调用内置绘图方法。
This results in a Matplotlib-powered plot as expected (similar to a Jupyter notebook).
这将按预期生成由Matplotlib驱动的绘图(类似于Jupyter笔记本)。
带有Altair库的折线图 (Line chart with Altair library)
Streamlit supports many advanced and highly interactive plotting libraries other than Matplotlib. One of them is Altair, which is built atop Vega (which, in turn, is built atop the ubiquitous and all-powerful D3.js).
Streamlit支持除Matplotlib之外的许多高级且高度交互的绘图库。 其中之一就是Altair,它建在Vega之上(而Vega则建在无处不在且功能强大的D3.js之上)。
We take some of the columns from the DataFrame and create a line chart. You can zoom and pan the chart and even see the HTML code behind it. See the video below.
我们从DataFrame中获取一些列,并创建一个折线图。 您可以缩放和平移图表,甚至可以查看其背后HTML代码。 请参见下面的视频。
演示地址
Streamlit supports many advanced and highly interactive plotting libraries other than Matplotlib.
Streamlit支持除Matplotlib之外的许多高级且高度交互的绘图库。
小部件魔术 (Widget magic)
We can create interactivity with Streamlit very easily. This is one of its primary attractions!
我们可以非常轻松地与Streamlit建立交互性。 这是它的主要吸引力之一!
Let’s say we have a function defined as follows,
假设我们有一个定义如下的函数,
We can create a slider with just one line of code,
我们可以只用一行代码创建一个滑块,
We can move the slider left and right and the function is evaluated dynamically and the value can be displayed live. Just like a web-app is expected to behave.
我们可以左右移动滑块,动态评估函数,并且可以实时显示该值。 就像预期的网络应用程序一样。
演示地址
该函数的Matplotlib图 (A Matplotlib graph of the function)
The code below graphs the function above using plain vanilla Matplotlib
and a single Streamlit
call st.pyplot()
for rendering.
下面的代码使用普通的香草Matplotlib
和单个 Streamlit
调用 st.pyplot()
绘制上述函数的图形以进行渲染。
This chart, unlike the Altair chart above, is not a dynamic chart. However, note that the Matplotlib
code contains a fair bit of sophistication (even a LaTeX formatted string in the title). All of that is flawlessly handled by the st.pyplot()
function.
与上面的Altair图表不同,此图表不是动态图表。 但是,请注意, Matplotlib
代码包含相当复杂的内容(即使标题中也包含LaTeX格式的字符串)。 所有这些都由st.pyplot()
函数完美地处理。
交互式线性回归问题 (An interactive linear regression problem)
Finally, we show, how to generate a linear regression dataset with a tunable level of noise using simple widgets from Streamlit.
最后,我们展示了如何使用Streamlit的简单小部件生成具有可调噪声水平的线性回归数据集。
交互式数据生成 (Interactive data generation)
In the previous two sections, we introduced the slider widget and the pyplot. In this section, we combine them in a dynamic fashion.
在前两节中,我们介绍了滑块小部件和pyplot。 在本节中,我们以动态方式将它们组合在一起。
Below, the sliders can be adjusted to tune the level of the noise. Every time you move the sliders, you essentially generate a new linear regression problem (with the same features but with slightly different observed data).
在下面,可以调节滑块以调节噪音水平。 每次移动滑块时,实际上都会产生一个新的线性回归问题 (具有相同的特征,但观察到的数据略有不同)。
The data vs. feature plot, which is dynamically updated, is shown below to illustrate this point.
动态更新的数据与特征图如下所示,以说明这一点。
Move the “Noise std. dev” slider all the way from the left end to right end to observe the impact on the observed data. Do you see the observed data becoming noisier as you move the slider towards right?
移动“噪音标准”。 “开发”滑块从左端一直到右端,以观察对观察数据的影响。 向右移动滑块时,您是否看到观察到的数据变得更嘈杂 ?
演示地址
拟合模型-交互式! (Fitting the model — interactively!)
Next, we fit a LinearRegression()
model from the famous scikit-learn
package with our data, and show the model coefficients and the R² metric.
接下来,我们适应LinearRegression()
从著名模特scikit-learn
包与我们的数据,并显示模型系数和将R²度量。
Note, how they change slightly with each iteration of a new problem generation. Note that we chose a1 = 2.5 and b1= 5 and the estimates should come close to these numbers.
请注意,它们随着新问题的每次迭代而略有变化。 请注意,我们选择a1 = 2.5和b1 = 5,并且估计值应该接近这些数字。
The noise mean value primarily impacts the bias term whereas noise std. dev primarily impacts the linear coefficient. You will also notice that the R² score generally becomes lower as the noise std. dev increases i.e. the linear model has a hard time explaining the variance in the observed data (if the spread of the noise is high).
噪声平均值主要影响偏置项,而噪声std则对其影响最大。 dev主要影响线性系数。 您还会注意到的R²一般得分变得噪音低性病。 dev增加,即线性模型很难解释所观察到的数据中的方差(如果噪声的散布很大) 。
演示地址
All these linear regression stuff is nothing new. But note how easy the modeling exercise seems with a live web-app to fit numerous models (corresponding to dynamically generated datasets) just by operating a couple of sliders!
所有这些线性回归的东西都不是新鲜事物。 但是请注意,只需通过操作几个滑块,就可以使用实时网络应用程序轻松模拟建模工作,以适应众多模型(对应于动态生成的数据集)!
摘要 (Summary)
That’s it for now. However, I am sure you got the idea that the possibilities are endless with this library.
现在就这样。 但是,我敢肯定,您会想到该库带来无限可能。
Here is their official API reference guide and here are more tutorials.
这是他们的官方API参考指南 ,还有更多教程 。
You can create really complex data analytics and machine learning pipeline in the backend (using Jupyter Notebook or a complex web of Python modules, whatever you fancy) and render all the output results and visualizations in a simple, live web-app with minimal coding overhead.
您可以在后端中创建真正复杂的数据分析和机器学习管道(使用Jupyter Notebook或复杂的Python模块网络,无论您喜欢什么),并以简单的实时网络应用程序呈现所有输出结果和可视化效果,而只需最少的编码开销。
Next, I plan to build Streamlit-powered apps with more intricate data processing and visualization at the backend and share with you here on my Medium page.
接下来,我计划在后端构建具有更复杂的数据处理和可视化功能的Streamlit驱动的应用程序,并在“中型”页面上与您共享。
Stay tuned.
敬请关注。
…how easy the modeling exercise seems with a live web-app to fit numerous models (corresponding to dynamically generated datasets) just by operating a couple of sliders!
…通过运行一个Web应用程序 ,通过操作几个滑块就能适应众多模型(对应于动态生成的数据集) ,建模工作看起来多么容易 !
Also, you can check the author’s GitHub repositories for code, ideas, and resources in machine learning and data science. If you are, like me, passionate about AI/machine learning/data science, please feel free to add me on LinkedIn or follow me on Twitter.
同样 ,您可以检查作者的GitHub 存储库 ,以获取机器学习和数据科学中的代码,思想和资源。 如果您像我一样对AI /机器学习/数据科学充满热情,请随时在LinkedIn上添加我或在Twitter上关注我 。
翻译自: https://towardsdatascience.com/data-analytics-to-web-app-streamlit-made-easy-ed687266f0e8
微信小程序在url简化