Two Tools Every Data Scientist Should Use for Their Next ML Project
The more time you spend working with machine learning models, the more you realize how important it is to understand exactly what your model is doing and how well it is doing it. In practice, keeping track of how your model is performing, especially when testing a variety of parameter combinations, can be tedious even in the best of circumstances. In most cases I find myself building my own tools to debug and analyze my machine learning models.
Recently, while working on a slew of different models for MAFAT’s doppler-pulse radar classification challenge (described in my earlier blog, linked under Sources), I found myself wasting time manually building these debugging tools. This was especially tedious because I was building an ensemble: a collection of machine learning models combined through a majority-vote classification strategy, which can be very effective if done correctly. The difficulty with creating an ensemble is the variety of models and the diversity of classification required to make the strategy effective. That means training more models, performing more analysis, and understanding the impact of more parameters on overall accuracy. All of this required me to spend even more time creating my own debugging tools and strategies. To make better use of my time and resources, I decided to turn to the range of tools available online for debugging and analyzing machine learning models. After trialing a few different options, I narrowed my list down to two great tools every data scientist should consider when developing and refining their machine learning models:
Weights and Biases and Uber’s Manifold
In this blog, I will discuss the two tools, what they are, how they are used, and how they can help you save time and stress on your next big project.
Weights and Biases
“Those who don’t track training are doomed to repeat it.” (W&B product tag)
Weights and Biases, or W&B, is a San Francisco-based company that provides a range of deep learning and machine learning tools that can be integrated into an existing or new project. Although W&B provides many functions, its main role is tracking the real-time performance of model variations within your project. To this end, it is incredibly useful. As I experimented with my project, I often lost track of which changes I had made and when, and whether those changes had a positive or negative impact on my evaluation metrics. W&B lets you store and visualize these evaluation metrics in a variety of ways. The two I found most useful were the charts and the tables:
As you can see, the line charts track the performance of various models using different metrics during training. This allows for seamless side by side comparison to check for things like overfitting or which models performed the best on the validation set.
So how does W&B link up with your project?
After creating an account on W&B’s website, you have to install the wandb package and log in to your account from the environment you are using for your project.
!pip install --upgrade wandb
!wandb login
From there, the steps differ depending on which deep learning or machine learning toolkit you are using. For my project I used Keras, but the documentation for other frameworks is clear and easy to follow:
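# Import the packages
import os
import wandb
from wandb.keras import WandbCallback

# Initialize the W&B run; the epoch count here is only an example value
wandb.init(project="tester", config={"epochs": 10})
config = wandb.config

# Link the model with W&B's tracking metrics
# (model, X_train, y_train, X_test and y_test come from your own project)
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          epochs=config.epochs, callbacks=[WandbCallback()])

# Save the trained model alongside the run's files
model.save(os.path.join(wandb.run.dir, "model.h5"))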
As your model trains, its progress is tracked and updated in real time on your W&B account, where you can easily analyze and evaluate its performance. From there you can choose to create a report that gives a more polished and digestible view of your results, which you can annotate with text and other visuals.
It cannot be overstated how helpful it is to track the performance of your models, especially when you are altering parameters and trialing a range of techniques. It is so helpful, in fact, that large companies such as OpenAI and Toyota Research regularly use and laud W&B as a flexible and useful tool for their projects.
Get started for free here.
Uber’s Manifold
For my project, I am creating an ensemble. An ensemble is a collection of machine learning algorithms that each train and predict on the same data. The advantage of an ensemble is that it brings a range of different strategies to the problem and uses a majority vote to democratize the classification across all the models. This is useful because an individual model may predict some portions of the data well yet struggle on others. An ensemble is simply the machine learning version of the "strength in numbers" adage. For an ensemble to perform well, the individual models that make it up must have diversity of prediction. Diversity of prediction is a fancy way of saying that the models can’t all be predicting exactly the same thing for the same points; rather, they should perform well on different selections of points. This raises a question, however: how do you know whether the parts of your ensemble are diversifying their predictions? This is where the transportation tech giant Uber’s Manifold comes in.
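To make the majority-vote idea concrete, here is a minimal sketch in plain Python. The function name, the labels, and the three sets of votes are all made up for illustration; this is not the code used in the challenge, only one simple way such a vote could be tallied.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per data point."""
    combined = []
    # zip(*...) regroups the labels so each tuple holds every model's vote for one point
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the most frequent label; ties go to the label seen first
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Three hypothetical models voting on four data points
model_votes = [
    ["car", "person", "car", "person"],
    ["car", "car", "car", "person"],
    ["person", "person", "car", "car"],
]
ensemble_labels = majority_vote(model_votes)
print(ensemble_labels)
```

Notice that every model is outvoted on at most one point, so the ensemble can be right even where a single member is wrong. That only works when the members make their mistakes on different points.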
Uber’s Manifold is a long-term open-source project that aims to provide a model-agnostic visual debugging tool for machine learning. In layman’s terms, Manifold lets you visualize which subsets of the data your model or models are underperforming on and which features are causing ambiguity.
As you can imagine, this is very useful when working on ensembles. The tool creates a widget that you can interact with inside your notebook for quick analysis. It is important to note, however, that this tool currently only works in classic Jupyter notebooks. It doesn’t function in JupyterLab or Google’s Colab.
Manifold works by using k-means clustering, a technique that groups nearby points together, to separate the prediction data into segments of similar performance. You can think of this as splitting the data into subcategories of similarity. The models are then plotted along each segment; the further to the left a model is, the better it performed on that segment. You can see this in the randomly generated example below:
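The segmentation step can be sketched with a small k-means (Lloyd's algorithm) written in NumPy. This is only an illustration of the idea, not Manifold's own implementation: the function name, the deterministic starting centroids, and the loss values are all assumptions, with each row holding one data point's loss under each model.

```python
import numpy as np

def kmeans_segments(losses, k, n_iter=50):
    """Group data points by their per-model loss vectors with plain k-means."""
    losses = np.asarray(losses, dtype=float)
    # Deterministic start: spread the initial centroids across points ranked by total loss
    order = np.argsort(losses.sum(axis=1))
    picks = order[np.linspace(0, len(order) - 1, k).astype(int)]
    centroids = losses[picks].copy()
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (Euclidean distance)
        distances = np.linalg.norm(losses[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its points; keep it in place if the cluster is empty
        new_centroids = np.array([
            losses[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical losses: rows are data points, columns are two models
per_point_losses = [
    [0.10, 0.15], [0.12, 0.10],  # both models do well
    [1.60, 0.20], [1.50, 0.25],  # the first model struggles
    [0.20, 1.70], [0.25, 1.55],  # the second model struggles
]
labels, centroids = kmeans_segments(per_point_losses, k=3)
print(labels)
```

Each resulting segment collects points where the models behave alike, which is what lets a per-segment plot show which model is weak where.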
In the example above we have three models, and the input data has been split into four segments. Using log-loss as our performance metric, we can see that model_1 performs poorly on segment_0, whereas model_2 performs poorly on segment_2. The shape of each line represents the performance distribution, and the height of the line represents the relative number of data points at that log-loss. So, for example, for model_1 in segment_1 we can see a small but tightly concentrated group of points with a log-loss of 1.5.
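For readers unfamiliar with the metric, the sketch below shows how a log-loss value can be computed for each individual data point. It assumes each prediction is a dictionary of class probabilities; the function name and the clipping constant are illustrative choices, not part of Manifold.

```python
import math

def per_instance_log_loss(predictions, truths, eps=1e-15):
    """Return the log-loss of each data point for a single model."""
    losses = []
    for class_probs, true_label in zip(predictions, truths):
        # Clip so a confident wrong answer gives a large finite loss instead of log(0)
        p = min(max(class_probs[true_label], eps), 1 - eps)
        losses.append(-math.log(p))
    return losses

# Two hypothetical predictions from one model and their correct labels
model_predictions = [{'false': 0.1, 'true': 0.9}, {'false': 0.8, 'true': 0.2}]
ground_truth = ['true', 'false']
losses = per_instance_log_loss(model_predictions, ground_truth)
print(losses)
```

A loss near zero means the model put almost all of its probability on the correct class, and the loss grows quickly as that probability shrinks.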
Manifold also offers a feature attribution view:
The feature attribution view highlights the distribution of features for each segmentation. In the example above, data group 0 includes clusters two and three, and we are comparing them to data group 1, which includes clusters zero and one. The x-axis shows the feature values and the y-axis shows the intensity of the cause. Feature_0 highlights these differences at small intervals, whereas feature_1 highlights the histogram of feature values. Because this is an interactive widget, the values aren’t shown unless moused over. If you are interested in a closer look, check out the example here.
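One way to put a number on how differently a feature is distributed in two data groups is the KL divergence between their histograms. The sketch below does this for a categorical feature. It is an assumption-laden illustration: the function name, the smoothing constant, and the sample values are made up, and it is not claimed to reproduce how Manifold scores features internally.

```python
import math
from collections import Counter

def kl_divergence(group_a, group_b, smoothing=1e-9):
    """KL divergence between the category distributions of two data groups."""
    categories = set(group_a) | set(group_b)
    count_a, count_b = Counter(group_a), Counter(group_b)
    total = 0.0
    for category in categories:
        # Smoothing keeps the ratio finite when a category is missing from one group
        p = (count_a[category] + smoothing) / (len(group_a) + smoothing * len(categories))
        q = (count_b[category] + smoothing) / (len(group_b) + smoothing * len(categories))
        total += p * math.log(p / q)
    return total

# Hypothetical values of one categorical feature in two data groups
same = kl_divergence(['A', 'A', 'B'], ['A', 'A', 'B'])
different = kl_divergence(['A', 'A', 'A', 'B'], ['A', 'B', 'B', 'B'])
print(same, different)
```

A feature whose score is close to zero looks the same in both groups, while a larger score flags a feature worth inspecting when one group is much harder to classify than the other.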
So how do we integrate Manifold in our project?
Manifold is still in the early stages of development, and the tool still has some bugs and quirks. This should not discourage you from trying it in your own project. In my case, I needed to install a few packages to get it to work in my Jupyter notebook. This took some trial and error but eventually resulted in the following commands:
!jupyter nbextension install --py --sys-prefix widgetsnbextension
!jupyter nbextension enable --py --sys-prefix widgetsnbextension
!pip install mlvis
!jupyter nbextension install --py --symlink --sys-prefix mlvis
!jupyter nbextension enable --py --sys-prefix mlvis
It wasn’t sufficient to just install the nbextension packages; I also had to enable them. From here we can import a few tools for our demo:
from mlvis import Manifold
import sys, json, math
from random import uniform
To use the Manifold framework, your data needs to be grouped into three specific formats. The first group is all of your x-values, which must be in a list of dictionaries:
#Example of x-values
x = [
{'feature_0': 21, 'feature_1': 'B'},
{'feature_0': 36, 'feature_1': 'A'}
]
The second group is your model predictions, which must be a list of lists, where each inner list holds one model’s predictions:
#Example of model predictions
yPred = [
[{'false': 0.1, 'true': 0.9}, {'false': 0.8, 'true': 0.2}],
[{'false': 0.3, 'true': 0.7}, {'false': 0.9, 'true': 0.1}],
[{'false': 0.6, 'true': 0.4}, {'false': 0.4, 'true': 0.6}]
]
The final group is the ground truth values or actual correct y-values, which are in a list of values:
#Example of ground truth
yTrue = [
'true',
'false'
]
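If your features and probabilities start out as plain rows, for instance from a table and from each model's class-probability output, a small helper can reshape them into the three structures described above. The helper below is a hypothetical convenience function, not part of mlvis; its name, arguments, and sample values are assumptions made for illustration.

```python
def to_manifold_inputs(feature_names, rows, class_names, probas_per_model, true_class_ids):
    """Reshape tabular features and class-probability rows into list/dict structures."""
    # One dictionary per data point, keyed by feature name
    x = [dict(zip(feature_names, row)) for row in rows]
    # One list per model, holding one {class: probability} dictionary per data point
    y_pred = [
        [dict(zip(class_names, (float(p) for p in probs))) for probs in model_probas]
        for model_probas in probas_per_model
    ]
    # Ground truth expressed as class labels rather than class indices
    y_true = [class_names[i] for i in true_class_ids]
    # Every model must score every row exactly once
    if any(len(model) != len(x) for model in y_pred) or len(y_true) != len(x):
        raise ValueError("x, yPred and yTrue must describe the same number of rows")
    return x, y_pred, y_true

# Hypothetical raw inputs: two data points scored by two models
x, yPred, yTrue = to_manifold_inputs(
    feature_names=['feature_0', 'feature_1'],
    rows=[[21, 'B'], [36, 'A']],
    class_names=['false', 'true'],
    probas_per_model=[[[0.1, 0.9], [0.8, 0.2]], [[0.3, 0.7], [0.9, 0.1]]],
    true_class_ids=[1, 0],
)
print(x, yPred, yTrue)
```

The length check catches the most common mistake, a model that was evaluated on a different number of rows than the feature list, before the data reaches the widget.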
Once your data is in this format we can pass the values into the Manifold object and execute to get the widget, which looks like the examples above:
Manifold(props={'data': {
'x': x,
'yPred': yPred,
'yTrue': yTrue
}})
Using the Manifold tool, you can then visually evaluate how your different models are performing on the same data. In my case, this was very helpful for building the ensemble because it allowed me to understand which models performed well where, and which data clusters were the hardest for the models to classify. Manifold also helped me evaluate the diversity of prediction for each model within the ensemble, allowing me to construct a more robust ensemble that was able to classify a range of different inputs.
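Alongside the visual check, diversity of prediction can be given a rough number. The sketch below measures how often each pair of models disagrees, using predictions in the same list-of-dictionaries shape as the yPred example earlier. The function names are hypothetical, and pairwise disagreement is just one simple diversity measure among several.

```python
from itertools import combinations

def predicted_labels(model_predictions):
    """Pick the most probable class for each data point."""
    return [max(probs, key=probs.get) for probs in model_predictions]

def pairwise_disagreement(y_pred):
    """Fraction of points on which each pair of models predicts different labels."""
    labels = [predicted_labels(model) for model in y_pred]
    rates = {}
    for (i, a), (j, b) in combinations(enumerate(labels), 2):
        rates[(i, j)] = sum(la != lb for la, lb in zip(a, b)) / len(a)
    return rates

# The same three-model predictions used in the yPred example
yPred = [
    [{'false': 0.1, 'true': 0.9}, {'false': 0.8, 'true': 0.2}],
    [{'false': 0.3, 'true': 0.7}, {'false': 0.9, 'true': 0.1}],
    [{'false': 0.6, 'true': 0.4}, {'false': 0.4, 'true': 0.6}],
]
rates = pairwise_disagreement(yPred)
print(rates)
```

A pair with a rate of zero adds nothing to the vote because the two models always agree. Some disagreement is what makes an extra model worth including, as long as each model is still accurate on its own.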
Conclusion
Throughout this blog, I have mentioned a few times what I have been using these tools for. As I outlined in another blog, I have been working on MAFAT’s doppler-pulse radar classification challenge, a machine learning challenge with a $40,000 prize pool. Both of the tools above have been increasingly useful to me in working through this challenge and obtaining tangible improvements in my models’ performance. Going forward, I will continue to work on this challenge, and you can expect another blog where I go into more detail about how I used these tools to create a better model.
Sources:
A link to my earlier MAFAT Challenge blog: https://medium.com/gsi-technology/training-a-model-to-use-doppler-pulse-radar-for-target-classification-2944a312148c
Li, L. (2019, August 12). Manifold: A Model-Agnostic Visual Debugging Tool for Machine Learning at Uber. Retrieved September 04, 2020, from https://eng.uber.com/manifold/
Lutins, E. (2017, August 02). Ensemble Methods in Machine Learning: What are They and Why Use Them? Retrieved September 04, 2020, from https://towardsdatascience.com/ensemble-methods-in-machine-learning-what-are-they-and-why-use-them-68ec3f9fef5f
Weights & Biases — Developer tools for ML. (n.d.). Retrieved September 04, 2020, from https://www.wandb.com/
All images used are either created by myself or used with the explicit permission of the authors. Links to the author’s material are included under each image.
Translated from: https://towardsdatascience.com/two-tools-every-data-scientist-should-use-for-their-next-ml-project-fa4fce5cf868