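Preface
Apple's Create ML now supports natural language processing. This post walks through a simple app that uses Core ML to classify spam messages.
The data comes from the English SMS dataset SMS Spam Collection v.1: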
Application | File format | # Spam | # Ham | Total | Link |
---|---|---|---|---|---|
General | Plain text | 747 | 4,827 | 5,574 | Link 1 |
Weka | ARFF | 747 | 4,827 | 5,574 | Link 2 |
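Building the model
First, the downloaded data is already labeled, but it has no column names. We need to add a header with two column names, label and text, so that the columns can be referenced when training the model.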
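The header step can be sketched in a few lines of Swift. This is only an illustration, not the author's actual preprocessing: it assumes the raw download stores one message per line as label, a tab, then the text, and the function name makeLabeledCSV is hypothetical.

```swift
import Foundation

/// Converts raw "label<TAB>text" lines into CSV text with a "label,text" header.
/// (Hypothetical helper; the tab-separated input layout is an assumption.)
func makeLabeledCSV(fromRaw raw: String) -> String {
    var rows = ["label,text"]
    for line in raw.split(whereSeparator: \.isNewline) {
        // Split only on the first tab so any tabs inside the message survive.
        let parts = line.split(separator: "\t", maxSplits: 1, omittingEmptySubsequences: false)
        guard parts.count == 2 else { continue }  // skip lines without a label/text pair
        let label = String(parts[0])
        // CSV quoting: wrap the text in double quotes and double any embedded quotes.
        let text = parts[1].replacingOccurrences(of: "\"", with: "\"\"")
        rows.append("\(label),\"\(text)\"")
    }
    return rows.joined(separator: "\n") + "\n"
}

// Usage with two made-up sample lines.
let sample = "ham\tOk...\nspam\tWin a \"free\" phone, call now"
print(makeLabeledCSV(fromRaw: sample))
```

Quoting the text column keeps commas inside a message from being read as extra columns. The resulting string can be written to a .csv file and loaded as in the training code.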
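Next, we use Xcode to create a macOS playground, because at the time of writing the CreateML framework is not supported on iOS. Then we write the training code, which can follow the official tutorial Creating a Text Classifier Model.
Here is the code for our project: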
import CreateML
import Cocoa

// Load the labeled CSV (columns: label, text).
let data = try MLDataTable(contentsOf: URL(fileURLWithPath: "/Users/Jiao/Desktop/SecurityKeeper/SMSSpamDetect/SMSSpamCollection.csv"))
// Keep 80% of the rows for training and 20% for testing; the fixed seed makes the split reproducible.
let (trainData, testData) = data.randomSplit(by: 0.8, seed: 10)
// Train the text classifier on the training split.
let SMSClassifier = try MLTextClassifier(trainingData: trainData, textColumn: "text", labelColumn: "label")
// Accuracy (%) on the training and validation data.
let trainAcc = (1 - SMSClassifier.trainingMetrics.classificationError) * 100
let validAcc = (1 - SMSClassifier.validationMetrics.classificationError) * 100
// Accuracy (%) on the held-out test split.
let evalMetrics = SMSClassifier.evaluation(on: testData)
let evalAcc = (1 - evalMetrics.classificationError) * 100
print(trainAcc, validAcc, evalAcc)
// Save the trained model together with its metadata.
let metadata = MLModelMetadata(author: "Jiao", shortDescription: "SMS SPAM Detect", license: "MIT", version: "1.0", additional: nil)
try SMSClassifier.write(to: URL(fileURLWithPath: "/Users/Jiao/Desktop/SecurityKeeper/SMSSpamDetect/mlmodel/SMSClassifier.mlmodel"), metadata: metadata)
// Quick test: classify a single string.
let l = try SMSClassifier.prediction(from: "free phone")
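Usage is about as simple as it gets: there is no need to care about how the classification is actually implemented, and accuracy reaches roughly 97% to 98%.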
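To put that figure in context, it helps to compare it with a trivial baseline. The sketch below is an addition, not part of the original project: it uses the class counts from the dataset table to compute the accuracy of always predicting ham, and the helper name accuracyPercent is hypothetical.

```swift
import Foundation

/// Accuracy in percent from a classification error in the range 0...1,
/// the same conversion the training code applies to its metrics.
func accuracyPercent(fromError error: Double) -> Double {
    return (1 - error) * 100
}

// Class counts taken from the dataset table (747 spam, 4,827 ham).
let spamCount = 747.0
let hamCount = 4827.0

// Accuracy of a classifier that labels every message as ham.
let baseline = hamCount / (spamCount + hamCount) * 100
print(String(format: "%.1f", baseline))  // prints "86.6"
```

Because ham makes up most of the dataset, an accuracy in the high 90s should be read against this baseline of about 86.6% rather than against 50%.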
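Using the model on iOS
With the .mlmodel file in hand, using it on a mobile device is straightforward: add the model to the project, initialize it, and call its prediction method wherever a classification is needed.
The code is as follows: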
//
//  TableViewController.m
//  MessageDetect
//
//  Created by Jiao Liu on 5/29/19.
//  Copyright © 2019 ChangHong. All rights reserved.
//

#import "TableViewController.h"
#import "SMSClassifier.h"

@interface TableViewController ()
{
    NSMutableArray *data;
    SMSClassifier *classifier;
}
@end

@implementation TableViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    classifier = [[SMSClassifier alloc] init];
    NSArray *initD = @[@"Did you hear about the new \"Divorce Barbie\"? It comes with all of Ken's stuff!",
                       @"100 dating service cal;l 09064012103 box334sk38ch",
                       @"Hello, I'm james",
                       @"Even my brother is not like to speak with me. They treat me like aids patent.",
                       @"HOT LIVE FANTASIES call now 08707509020 Just 20p per min NTT Ltd, PO Box 1327 Croydon CR9 5WB 0870..k",
                       @"Ok...",
                       @"Yeah!!",
                       @"Oh my God.",
                       @"Our brand new mobile music service is now live. The free music player will arrive shortly. Just install on your phone to browse content from the top artists."];
    data = [NSMutableArray arrayWithArray:initD];
    self.tableView.allowsSelection = NO;
}

#pragma mark - Table view data source

- (NSInteger)numberOfSectionsInTableView:(UITableView *)tableView {
    return 1;
}

- (NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section {
    return data.count;
}

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath {
    UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:@"Message" forIndexPath:indexPath];
    cell.textLabel.text = data[indexPath.row];
    NSString *type = [classifier predictionFromText:data[indexPath.row] error:nil].label;
    if ([type isEqualToString:@"spam"]) {
        cell.textLabel.textColor = [UIColor redColor];
        cell.accessoryType = UITableViewCellAccessoryDetailButton;
    } else {
        cell.textLabel.textColor = [UIColor blackColor];
        cell.accessoryType = UITableViewCellAccessoryNone;
    }
    cell.textLabel.numberOfLines = 0;
    return cell;
}

- (IBAction)AddMessage:(id)sender {
    UIAlertController *alert = [UIAlertController alertControllerWithTitle:@"New Message" message:@"" preferredStyle:UIAlertControllerStyleAlert];
    [alert addTextFieldWithConfigurationHandler:^(UITextField * _Nonnull textField) {
        textField.placeholder = @"message";
        textField.clearButtonMode = UITextFieldViewModeWhileEditing;
    }];
    UIAlertAction *okAction = [UIAlertAction actionWithTitle:@"OK" style:UIAlertActionStyleDefault handler:^(UIAlertAction * _Nonnull action) {
        NSString *newMsg = alert.textFields.firstObject.text;
        if (newMsg.length != 0) {
            [self->data insertObject:newMsg atIndex:0];
            [self.tableView reloadData];
            [self.tableView scrollToRowAtIndexPath:[NSIndexPath indexPathForRow:0 inSection:0] atScrollPosition:UITableViewScrollPositionTop animated:YES];
        }
    }];
    UIAlertAction *cancelAction = [UIAlertAction actionWithTitle:@"Cancel" style:UIAlertActionStyleCancel handler:nil];
    [alert addAction:cancelAction];
    [alert addAction:okAction];
    [self presentViewController:alert animated:YES completion:nil];
}

@end
Result
Source code: https://github.com/JiaoLiu/SMSSpamDetect