Classifying Spam Messages on iOS with Core ML

Introduction


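Apple's Create ML now supports Natural Language processing. This post walks through a simple application that uses Core ML to classify spam messages.

The data comes from the English SMS dataset SMS Spam Collection v. 1: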

Application   File format   # Spam   # Ham   Total   Link
General       Plain text    747      4,827   5,574   Link 1
Weka          ARFF          747      4,827   5,574   Link 2
Figure: spam SMS messages

Building the model


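First, the downloaded data is already labeled for the most part, but it has no column names. We need to add two column names, label and text, so the data is easier to use when training the model later.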

Figure: adding the column names
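
The column names can be added by hand in any editor, but the step is also easy to script. Below is a minimal sketch, assuming the raw file has one message per line in the form label, a tab, then the message text. The function name `makeLabeledCSV` is hypothetical, and this is not necessarily how the data for this post was prepared.

```swift
import Foundation

/// Converts raw "label<TAB>text" lines into CSV text with a "label,text" header.
/// Assumption: each input line holds a label, one tab, then the message.
func makeLabeledCSV(fromTabSeparated raw: String) -> String {
    var rows = ["label,text"]
    for line in raw.split(whereSeparator: \.isNewline) {
        // Split on the first tab only, since the message itself may contain tabs.
        let parts = line.split(separator: "\t", maxSplits: 1, omittingEmptySubsequences: false)
        guard parts.count == 2 else { continue } // skip malformed lines
        // Quote the text field and double any embedded quotes so commas stay inside the field.
        let text = parts[1].replacingOccurrences(of: "\"", with: "\"\"")
        rows.append("\(parts[0]),\"\(text)\"")
    }
    return rows.joined(separator: "\n")
}

// Example with two made-up lines in the assumed input format.
let sample = "ham\tSee you at 5, ok?\nspam\tWin a \"free\" phone now"
print(makeLabeledCSV(fromTabSeparated: sample))
```

Quoting the text field matters here because many messages contain commas, which would otherwise be read as extra columns.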

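Next, we create a macOS playground in Xcode, because the Create ML framework is not currently available on iOS. Then we write the training code; the official tutorial "Creating a Text Classifier Model" is a useful reference.
Here is the code from our project: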

import CreateML
import Cocoa

let data = try MLDataTable(contentsOf: URL(fileURLWithPath: "/Users/Jiao/Desktop/SecurityKeeper/SMSSpamDetect/SMSSpamCollection.csv"))
let (trainData, testData) = data.randomSplit(by: 0.8, seed: 10)
let SMSClassifier = try MLTextClassifier(trainingData: trainData, textColumn: "text", labelColumn: "label")
let trainAcc = (1 - SMSClassifier.trainingMetrics.classificationError) * 100
let validAcc = (1 - SMSClassifier.validationMetrics.classificationError) * 100

let evalMetrics = SMSClassifier.evaluation(on: testData)
let evalAcc = (1 - evalMetrics.classificationError) * 100
print(trainAcc, validAcc, evalAcc)

let metadata = MLModelMetadata(author: "Jiao", shortDescription: "SMS SPAM Detect", license: "MIT", version: "1.0", additional: nil)
try SMSClassifier.write(to: URL(fileURLWithPath: "/Users/Jiao/Desktop/SecurityKeeper/SMSSpamDetect/mlmodel/SMSClassifier.mlmodel"), metadata: metadata)

// test
let l = try SMSClassifier.prediction(from: "free phone")
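
The randomSplit(by: 0.8, seed: 10) call above holds out part of the data for testing, and the fixed seed makes the split repeatable between runs. The sketch below illustrates the general idea of a seeded split in plain Swift. It is not Create ML's implementation (which may, for example, produce only approximately sized parts), and `SplitMix64` and `seededSplit` are names introduced here for illustration.

```swift
/// A small deterministic generator (SplitMix64) so the shuffle is repeatable.
struct SplitMix64: RandomNumberGenerator {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

/// Shuffles the rows with a seeded generator, then cuts them at the given fraction.
func seededSplit<T>(_ rows: [T], by fraction: Double, seed: UInt64) -> (train: [T], test: [T]) {
    var rng = SplitMix64(seed: seed)
    let shuffled = rows.shuffled(using: &rng)
    // Clamp the cut point so fractions outside 0...1 cannot index out of range.
    let cut = min(max(Int((Double(rows.count) * fraction).rounded()), 0), rows.count)
    return (Array(shuffled[..<cut]), Array(shuffled[cut...]))
}

let (train, test) = seededSplit(Array(1...10), by: 0.8, seed: 10)
print(train.count, test.count) // prints "8 2"
```

Calling seededSplit again with the same seed returns the same two parts, which is what makes accuracy numbers comparable across runs.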

Using it is almost foolproof: you do not need to care about how the classification is implemented, and the accuracy reaches 97% to 98%.
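
To put the 97% to 98% figure in context, it helps to compare it with the accuracy of always predicting the most common class. The sketch below computes that baseline from the spam and ham counts listed for the dataset; `majorityBaselineAccuracy` is a hypothetical helper, not part of Create ML.

```swift
import Foundation

/// Accuracy (in percent) of a classifier that always predicts the most common label.
func majorityBaselineAccuracy(counts: [String: Int]) -> Double {
    let total = counts.values.reduce(0, +)
    guard total > 0, let largest = counts.values.max() else { return 0 }
    return Double(largest) / Double(total) * 100
}

// Counts from the SMS Spam Collection: 747 spam and 4,827 ham messages.
let baseline = majorityBaselineAccuracy(counts: ["spam": 747, "ham": 4_827])
print(String(format: "%.1f%%", baseline)) // prints "86.6%"
```

Since ham makes up about 86.6% of the messages, the trained model should be judged against that number rather than against 50%.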

Using the model on iOS


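Once we have the mlmodel file, using it on a mobile device is simple: import the model into the project, initialize it, and call its prediction method wherever it is needed.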


Figure: importing the model into Xcode

The code is as follows:

//
//  TableViewController.m
//  MessageDetect
//
//  Created by Jiao Liu on 5/29/19.
//  Copyright © 2019 ChangHong. All rights reserved.
//

#import "TableViewController.h"
#import "SMSClassifier.h"

@interface TableViewController ()
{
    NSMutableArray *data;
    SMSClassifier *classifier;
}

@end

@implementation TableViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    classifier = [[SMSClassifier alloc] init];
    NSArray *initD = @[@"Did you hear about the new \"Divorce Barbie\"? It comes with all of Ken's stuff!",
             @"100 dating service cal;l 09064012103 box334sk38ch",
             @"Hello, I'm james",
             @"Even my brother is not like to speak with me. They treat me like aids patent.",
             @"HOT LIVE FANTASIES call now 08707509020 Just 20p per min NTT Ltd, PO Box 1327 Croydon CR9 5WB 0870..k",
             @"Ok...",
             @"Yeah!!",
             @"Oh my God.",
             @"Our brand new mobile music service is now live. The free music player will arrive shortly. Just install on your phone to browse content from the top artists."];
    data = [NSMutableArray arrayWithArray:initD];
    self.tableView.allowsSelection = NO;
}

#pragma mark - Table view data source

- (NSInteger)numberOfSectionsInTableView:(UITableView *)tableView {
    return 1;
}

- (NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section {
    return data.count;
}


- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath {
    UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:@"Message" forIndexPath:indexPath];
    
    cell.textLabel.text = data[indexPath.row];
    NSString *type = [classifier predictionFromText:data[indexPath.row] error:nil].label;
    if ([type isEqualToString:@"spam"]) {
        cell.textLabel.textColor = [UIColor redColor];
        cell.accessoryType = UITableViewCellAccessoryDetailButton;
    }
    else
    {
        cell.textLabel.textColor = [UIColor blackColor];
        cell.accessoryType = UITableViewCellAccessoryNone;
    }
    cell.textLabel.numberOfLines = 0;
    
    return cell;
}

- (IBAction)AddMessage:(id)sender {
    UIAlertController *alert = [UIAlertController alertControllerWithTitle:@"New Message" message:@"" preferredStyle:UIAlertControllerStyleAlert];
    [alert addTextFieldWithConfigurationHandler:^(UITextField * _Nonnull textField) {
        textField.placeholder = @"message";
        textField.clearButtonMode = UITextFieldViewModeWhileEditing;
    }];
    UIAlertAction *okAction = [UIAlertAction actionWithTitle:@"OK" style:UIAlertActionStyleDefault handler:^(UIAlertAction * _Nonnull action) {
        NSString *newMsg = alert.textFields.firstObject.text;
        if (newMsg.length != 0) {
            [self->data insertObject:alert.textFields.firstObject.text atIndex:0];
            [self.tableView reloadData];
            [self.tableView scrollToRowAtIndexPath:[NSIndexPath indexPathForRow:0 inSection:0] atScrollPosition:UITableViewScrollPositionTop animated:YES];
        }
    }];
    UIAlertAction *cancelAction = [UIAlertAction actionWithTitle:@"Cancel" style:UIAlertActionStyleCancel handler:nil];
    [alert addAction:cancelAction];
    [alert addAction:okAction];
    [self presentViewController:alert animated:YES completion:nil];
}

@end
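
The table view code above passes nil for the error argument, so a failed prediction yields a nil label and the message is displayed like a normal one. Below is a minimal Swift sketch of one way to keep that case distinct. `MessageKind` and `classify(_:using:)` are hypothetical names, and the closure stands in for the generated model's prediction call so the snippet stays self-contained.

```swift
/// Possible outcomes when classifying one message.
enum MessageKind {
    case spam, ham, unknown
}

/// Classifies a message with the supplied prediction closure.
/// A thrown error maps to .unknown instead of being treated as ham.
func classify(_ text: String, using predictLabel: (String) throws -> String) -> MessageKind {
    do {
        return try predictLabel(text) == "spam" ? .spam : .ham
    } catch {
        return .unknown
    }
}

// Stub predictor for illustration only; it is not the trained model.
struct StubError: Error {}
let stub: (String) throws -> String = { text in
    if text.isEmpty { throw StubError() }
    return text.contains("call now") ? "spam" : "ham"
}

print(classify("HOT LIVE FANTASIES call now", using: stub)) // prints "spam"
print(classify("", using: stub))                            // prints "unknown"
```

In an app, the closure would wrap the Xcode-generated SMSClassifier class, presumably along the lines of `{ try model.prediction(text: $0).label }`; check the generated interface for the exact method name.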

Result


Source code: https://github.com/JiaoLiu/SMSSpamDetect

Figure: demo.gif
