Weka Tutorial: GUI-based Machine Learning with Java

Nowadays, programming languages such as Python and R are undoubtedly some of the most in-demand languages in Data Science and Machine Learning.

But is it also possible to perform common Machine Learning and Data Science tasks without necessarily being proficient in coding?

Of course it is! Weka is an open-source machine learning package built around a graphical user interface. It lets you perform common Data Science tasks using the graphical interface alone.

Basics

Weka can be installed on most platforms by following the official installation instructions. The only prerequisite is having Java 8 installed on your local machine.

Once you've installed Weka, you will have access to a set of standard data processing and inference techniques, such as:

  • Data Pre-processing: once you've loaded a dataset, Weka lets you quickly explore its attributes and instances. Filters are also available to, for example, convert categorical data into numerical data or perform feature selection to reduce the dimensionality of the dataset (e.g. to shorten training times and improve performance).

  • Classification and Regression Algorithms: a collection of algorithms such as Gaussian Naive Bayes, Decision Trees, K-Nearest Neighbours, ensemble techniques, and various linear regression variants.

  • Clustering: this technique identifies the main groups in our data in an unsupervised way. Example algorithms available in the Weka collection include K-Means Clustering and Expectation Maximisation.

  • Discovering Associations: discovering rules in our dataset in order to more easily identify patterns and connections between the different features.

  • Data Visualisation: a suite of integrated visualisation tools to quickly inspect correlations between features and to display learned models such as Decision Trees and K-Means clusters.
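
To make the clustering item in the list above more concrete, here is a minimal sketch of what K-Means does on one-dimensional data. It is plain Java and does not use Weka's API; the class name, method names, and data values are made up for illustration, and Weka's own clusterer (SimpleKMeans) handles multi-attribute datasets and is configured through the GUI.

```java
import java.util.Arrays;

/** One-dimensional k-means (Lloyd's algorithm) on a toy dataset. */
final class KMeansSketch {

    /** Returns the final centroids, starting from the given initial ones. */
    static double[] cluster(double[] points, double[] initialCentroids, int maxIterations) {
        double[] centroids = initialCentroids.clone();
        for (int iter = 0; iter < maxIterations; iter++) {
            double[] sums = new double[centroids.length];
            int[] counts = new int[centroids.length];
            // Assignment step: each point joins its nearest centroid.
            for (double p : points) {
                int nearest = nearestCentroid(p, centroids);
                sums[nearest] += p;
                counts[nearest]++;
            }
            // Update step: move each centroid to the mean of its points.
            boolean changed = false;
            for (int c = 0; c < centroids.length; c++) {
                if (counts[c] == 0) {
                    continue; // leave an empty cluster where it is
                }
                double mean = sums[c] / counts[c];
                if (mean != centroids[c]) {
                    changed = true;
                }
                centroids[c] = mean;
            }
            if (!changed) {
                break; // converged: no centroid moved
            }
        }
        return centroids;
    }

    /** Index of the centroid closest to p (the lowest index wins ties). */
    static int nearestCentroid(double p, double[] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (Math.abs(p - centroids[c]) < Math.abs(p - centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] points = {1.0, 1.2, 0.8, 5.0, 5.2, 4.8}; // made-up values
        double[] centroids = cluster(points, new double[] {1.0, 1.2}, 100);
        System.out.println(Arrays.toString(centroids));
    }
}
```

With these toy values the two centroids settle close to 1.0 and 5.0, the centres of the two obvious groups, even though both started inside the first group.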

Another interesting feature of Weka is the ability to install new packages as they are created.

One example of an additional package you can install is AutoML. AutoML can be particularly useful for beginners who find it difficult to identify which Machine Learning model is best suited to a specific task.

Using the Weka AutoML package, you can easily test different Machine Learning models on the fly. It also lets you automatically tune each model's hyperparameters in order to increase performance.
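
The idea behind automatic hyperparameter tuning can be sketched in a few lines: try each candidate setting, score it on held-out data, and keep the best one. The example below is a heavily simplified stand-in, not the search strategy the Weka package uses. It tunes a single hyperparameter (k in a one-dimensional k-nearest-neighbour classifier) by leave-one-out accuracy; all names and data values are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Picks k for a 1-D k-nearest-neighbour classifier by leave-one-out accuracy. */
final class TuningSketch {

    /** Majority vote (labels 0/1) among the k points nearest to xs[held], excluding itself. */
    static int predictHeldOut(double[] xs, int[] ys, int held, int k) {
        Integer[] order = new Integer[xs.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort all indices by distance to the held-out point.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> Math.abs(xs[i] - xs[held])));
        int votesForOne = 0;
        int used = 0;
        for (int idx : order) {
            if (idx == held) {
                continue; // a point never votes for itself
            }
            votesForOne += ys[idx];
            if (++used == k) {
                break;
            }
        }
        return 2 * votesForOne > used ? 1 : 0;
    }

    /** Fraction of points classified correctly when each is held out in turn. */
    static double leaveOneOutAccuracy(double[] xs, int[] ys, int k) {
        int correct = 0;
        for (int i = 0; i < xs.length; i++) {
            if (predictHeldOut(xs, ys, i, k) == ys[i]) {
                correct++;
            }
        }
        return (double) correct / xs.length;
    }

    /** Returns the candidate k with the highest accuracy (the first one wins ties). */
    static int selectBestK(double[] xs, int[] ys, int[] candidates) {
        int bestK = candidates[0];
        double bestAccuracy = -1.0;
        for (int k : candidates) {
            double accuracy = leaveOneOutAccuracy(xs, ys, k);
            if (accuracy > bestAccuracy) {
                bestAccuracy = accuracy;
                bestK = k;
            }
        }
        return bestK;
    }

    public static void main(String[] args) {
        // Made-up data with two deliberately mislabelled points (1.5 and 5.5).
        double[] xs = {1.0, 1.3, 1.5, 2.0, 5.0, 5.3, 5.5, 6.0};
        int[] ys = {0, 0, 1, 0, 1, 1, 0, 1};
        int[] candidates = {1, 3};
        for (int k : candidates) {
            System.out.println("k=" + k + " accuracy=" + leaveOneOutAccuracy(xs, ys, k));
        }
        System.out.println("selected k=" + selectBestK(xs, ys, candidates));
    }
}
```

On this toy data k=1 scores 0.25 because the two mislabelled points mislead their neighbours, while k=3 scores 0.75, so the search selects k=3. An AutoML tool applies the same try, score, and keep-the-best loop across many models and many hyperparameters at once.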

Finally, for more expert users, Weka also offers a command-line interface and can be called from Java code. This is particularly useful if you're working with large amounts of data.

Example

We are now going to walk through a simple example in order to demonstrate how to get started with Weka.

First of all, we can start our analysis by launching the Weka Explorer and loading our dataset (in this example, the Iris Dataset).

Figure 1: Importing and visualising the data

Select the Classify tab, choose Naive Bayes as our classifier, and click Start. You'll see that we can quickly achieve 96% classification accuracy without having to write any code!

Figure 2: Naive Bayes classification results
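
For readers curious about what the classifier computes behind the Start button, the following is a minimal from-scratch sketch of Gaussian Naive Bayes. It is not Weka's implementation and does not reproduce the 96% figure above: the class name, the two-feature toy data, and the small variance floor are all assumptions made for illustration, and the sketch assumes every class has at least one training sample.

```java
/** Gaussian Naive Bayes for numeric features, written from scratch for illustration. */
final class GaussianNaiveBayesSketch {
    private final double[] logPrior;    // indexed by class
    private final double[][] mean;      // indexed by class, then feature
    private final double[][] variance;  // indexed by class, then feature

    GaussianNaiveBayesSketch(double[][] x, int[] y, int numClasses) {
        int numFeatures = x[0].length;
        logPrior = new double[numClasses];
        mean = new double[numClasses][numFeatures];
        variance = new double[numClasses][numFeatures];
        int[] count = new int[numClasses];
        // Per-class sums, then means and class priors.
        for (int i = 0; i < x.length; i++) {
            count[y[i]]++;
            for (int f = 0; f < numFeatures; f++) {
                mean[y[i]][f] += x[i][f];
            }
        }
        for (int c = 0; c < numClasses; c++) {
            logPrior[c] = Math.log((double) count[c] / x.length);
            for (int f = 0; f < numFeatures; f++) {
                mean[c][f] /= count[c];
            }
        }
        // Per-class variances, with a tiny floor to avoid dividing by zero.
        for (int i = 0; i < x.length; i++) {
            for (int f = 0; f < numFeatures; f++) {
                double d = x[i][f] - mean[y[i]][f];
                variance[y[i]][f] += d * d;
            }
        }
        for (int c = 0; c < numClasses; c++) {
            for (int f = 0; f < numFeatures; f++) {
                variance[c][f] = variance[c][f] / count[c] + 1e-9;
            }
        }
    }

    /** Returns the class with the highest log prior plus summed log likelihoods. */
    int predict(double[] sample) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < logPrior.length; c++) {
            double score = logPrior[c];
            for (int f = 0; f < sample.length; f++) {
                double d = sample[f] - mean[c][f];
                // Log of the normal density; features are treated as independent.
                score += -0.5 * Math.log(2 * Math.PI * variance[c][f])
                        - d * d / (2 * variance[c][f]);
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    /** Fraction of samples whose predicted class matches the given label. */
    static double accuracy(GaussianNaiveBayesSketch model, double[][] x, int[] y) {
        int correct = 0;
        for (int i = 0; i < x.length; i++) {
            if (model.predict(x[i]) == y[i]) {
                correct++;
            }
        }
        return (double) correct / x.length;
    }

    public static void main(String[] args) {
        // Made-up measurements for two classes; these are not Iris values.
        double[][] x = {{1.0, 2.0}, {1.2, 1.8}, {0.8, 2.2}, {3.0, 0.5}, {3.2, 0.7}, {2.8, 0.3}};
        int[] y = {0, 0, 0, 1, 1, 1};
        GaussianNaiveBayesSketch model = new GaussianNaiveBayesSketch(x, y, 2);
        System.out.println("predicted class: " + model.predict(new double[] {1.1, 2.1}));
        System.out.println("training accuracy: " + accuracy(model, x, y));
    }
}
```

Note that the sketch measures accuracy on its own training data, which is optimistic. An accuracy figure is more trustworthy when it is measured on data the model has not seen during training.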

Conclusion

If you are looking for more information about how to get started with Weka, this YouTube series by Google Developers is a great place to start.

Contact me

If you want to keep up to date with my latest articles and projects, follow me on Medium and subscribe to my mailing list. These are some of my contact details:

  • LinkedIn
  • Personal Blog
  • Personal Website
  • Medium Profile
  • GitHub
  • Kaggle

Translated from: https://www.freecodecamp.org/news/machine-learning-using-weka/
