Scripting Orange with the Data Mining Library

Original (English): http://docs.orange.biolab.si/3/data-mining-library/


Orange Data Mining Library

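Tutorial

This is a gentle introduction to scripting in Orange, a data mining library for Python 3. It assumes that you have already downloaded and installed Orange (for example, the version available from its GitHub repository) and that you have a working Python environment. Note that Python 3 is required; Python 3.4 is the recommended version at the time of writing. Start a Python shell and enter the following: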

% python
>>> import Orange
>>> Orange.version.version
'3.2.dev0+8907f35'
>>>

If this produces no errors or warnings, Orange and Python are installed correctly and you can continue with the tutorial below.
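
The same check can also be run as a script rather than typed into the shell. The following is only a minimal sketch: the helper name check_module is hypothetical and not part of Orange, and reading a __version__ attribute is an assumption (the shell session above reads Orange.version.version instead), so the sketch falls back to "unknown" when that attribute is missing.

```python
import importlib
import importlib.util
import sys


def check_module(name, min_python=(3, 0)):
    """Return (ok, message) describing whether `name` can be imported.

    `check_module` is a hypothetical helper for this sketch, not an Orange API.
    """
    # The tutorial requires Python 3, so reject older interpreters first.
    if sys.version_info < min_python:
        return False, "Python {}.{} or newer is required".format(*min_python)
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(name) is None:
        return False, "module {!r} is not installed".format(name)
    module = importlib.import_module(name)
    # Assumption: the module may expose __version__; otherwise report "unknown".
    version = getattr(module, "__version__", "unknown")
    return True, "{} {} imported successfully".format(name, version)


if __name__ == "__main__":
    ok, message = check_module("Orange")
    print(message)
```

If the script reports that the module is not installed, the tutorial steps will not work until Orange is installed into the same Python environment that runs the script.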

  • The Data

    • Data Input

    • Saving the Data

    • Exploration of the Data Domain

    • Data Instances

    • Orange Data Sets and NumPy

    • Meta Attributes

    • Missing Values

    • Data Selection and Sampling

  • Classification

    • Learners and Classifiers

    • Probabilistic Classification

    • Cross-Validation

    • Handful of Classifiers

  • Regression

    • Handful of Regressors

    • Cross Validation

Reference

Available modules and methods:

  • Data model (data)

    • Data Storage (storage)

    • Data Table (table)

    • SQL table (data.sql)

    • Domain description (domain)

    • Variable Descriptors (variable)

    • Values (value)

    • Data Instance (instance)

    • Data Filters (filter)

  • Data Preprocessing (preprocess)

    • Impute

    • Discretization

    • Continuization

    • Normalization

    • Randomization

    • Remove

    • Feature selection

    • Preprocessors

  • Classification (classification)

    • Logistic Regression

    • Random Forest

    • Simple Random Forest

    • Softmax Regression

    • k-Nearest Neighbors

    • Naive Bayes

    • Multilayer perceptron (feed-forward neural network)

    • Support Vector Machines

    • Linear Support Vector Machines

    • Nu-Support Vector Machines

    • One Class Support Vector Machines

    • Classification Tree

    • Simple Tree

    • Majority Classifier

    • Elliptic Envelope

  • Regression (regression)

    • Linear Regression

    • Mean

    • Random Forest

    • Simple Random Forest

    • Regression Tree

  • Clustering (clustering)

    • Hierarchical (hierarchical)

  • Distance (distance)

  • Evaluation (evaluation)

    • Sampling procedures for testing models (testing)

    • Scoring methods (scoring)

  • Miscellaneous (misc)

    • Symbolic constants (enum)

    • Distance Matrix (distmatrix)

