Machine Learning: SMS Spam Filtering

Before we start:

Dataset source: http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/ (also available from UCI: https://archive.ics.uci.edu/ml/datasets/sms+spam+collection)

Dataset description:

Attribute Information:

The collection consists of a single text file in which each line holds the correct class label (ham or spam) followed by the raw message. Some examples are shown below:

ham What you doing?how are you? 
ham Ok lar... Joking wif u oni... 
ham dun say so early hor... U c already then say... 
ham MY NO. IN LUTON 0125698789 RING ME IF UR AROUND! H* 
ham Siva is in hostel aha:-. 
ham Cos i was out shopping wif darren jus now n i called him 2 ask wat present he wan lor. Then he started guessing who i was wif n he finally guessed darren lor. 
spam FreeMsg: Txt: CALL to No: 86888 & claim your reward of 3 hours talk time to use from your phone now! ubscribe6GBP/ mnth inc 3hrs 16 stop?txtStop 
spam Sunshine Quiz! Win a super Sony DVD recorder if you canname the capital of Australia? Text MQUIZ to 82277. B 
spam URGENT! Your Mobile No 07808726822 was awarded a L2,000 Bonus Caller Prize on 02/09/03! This is our 2nd attempt to contact YOU! Call 0871-872-9758 BOX95QU 

Note: the messages are not chronologically sorted.
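
The one-record-per-line layout described above can be read with a few lines of Python. This is only a minimal sketch: it assumes the label and the message are separated by whitespace (as far as I know the downloadable file uses a tab, which this also handles), and `parse_line` and `load_records` are hypothetical helper names of my own, not something shipped with the dataset.

```python
from collections import Counter

VALID_LABELS = ("ham", "spam")


def parse_line(line):
    """Split one record into (label, message).

    Assumption: the label is the first whitespace-delimited token
    and everything after it is the raw message.
    """
    label, message = line.rstrip("\n").split(None, 1)
    if label not in VALID_LABELS:
        raise ValueError("unexpected label: %r" % label)
    return label, message.strip()


def load_records(lines):
    """Parse every non-empty line into a (label, message) tuple."""
    return [parse_line(line) for line in lines if line.strip()]


# A few lines copied from the examples above, used as stand-in input.
sample = [
    "ham What you doing?how are you?",
    "ham Ok lar... Joking wif u oni...",
    "spam Sunshine Quiz! Win a super Sony DVD recorder if you canname "
    "the capital of Australia? Text MQUIZ to 82277. B",
]

records = load_records(sample)
counts = Counter(label for label, _ in records)
print(counts["ham"], counts["spam"])
```

To run it on the real data, pass an open file object instead of `sample`, for example `load_records(open(path, encoding="utf-8"))`, where `path` points to wherever you saved the downloaded file.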

Reposted from: https://www.cnblogs.com/gangzhuzi/p/7157511.html
