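Regular expressions are a tool for working with strings; their main uses are matching, splitting, replacing, and extracting text. They are used heavily in Scala, where calling .r on a string (for example "[0-9]+".r) turns it into a regular expression.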
Scala supports several ways of applying regular expressions, mainly the following:
the String.matches() method
the scala.util.matching.Regex API
val a = "studying83"
println(a.matches("[a-z0-9]+")) //true
println(a.matches("[a-z0-9]{4}"))//false
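String.matches() succeeds only when the pattern covers the entire string, which is why the second call above prints false. The following is a minimal sketch contrasting it with an unanchored search; the value names (sample, wholeMatch, exactLength, firstFour) are illustrative and not part of the original example.

```scala
val sample = "studying83"

// matches() is anchored: the pattern must cover the whole string
val wholeMatch: Boolean = sample.matches("[a-z0-9]{4}")    // false, the string has 10 characters
val exactLength: Boolean = sample.matches("[a-z0-9]{10}")  // true

// findFirstIn searches for the pattern anywhere in the string
val firstFour: Option[String] = "[a-z0-9]{4}".r.findFirstIn(sample) // Some(stud)
```

Use matches() for validation of a whole value, and the Regex search methods when the pattern may appear in the middle of a longer string.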
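val b = """([a-z0-9]+)""".r
"studying83" match {
// b(_) uses the regex as an extractor; a bare lowercase `case b` would bind a new variable and match anything
case b(_) => println("match succeeded")
case _ => println("match failed")
}
//match succeeded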
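The Regex API offers three ways to find matches:
findFirstMatchIn() returns the first match (Option[Regex.Match])
findAllMatchIn() returns all matches (Iterator[Regex.Match])
findAllIn() returns all matched strings (an iterator of String)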
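The three methods differ mostly in what each element gives you. The sketch below puts them side by side on one input, including findAllIn; the value names (digit, text, firstStart, positions, found) are illustrative assumptions.

```scala
import scala.util.matching.Regex

val digit: Regex = "[0-9]".r
val text = "abc3d2gf"

// findFirstMatchIn: Option[Regex.Match], None when nothing matches
val first: Option[Regex.Match] = digit.findFirstMatchIn(text)
val firstStart: Option[Int] = first.map(_.start)   // index of the first digit

// findAllMatchIn: Iterator[Regex.Match], each Match carries groups and positions
val positions: List[Int] = digit.findAllMatchIn(text).map(_.start).toList

// findAllIn: an iterator over the matched substrings themselves
val found: List[String] = digit.findAllIn(text).toList
```

Reach for the Match-returning variants when you need capture groups or offsets, and for findAllIn when the matched text alone is enough.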
//findFirstMatchIn()
val reg = "[0-9]".r
reg.findFirstMatchIn("abc3d2gf") match {
case Some(x) => println(x)
case None => println("no")
} //3
//findAllMatchIn()
val reg = "[0-9]".r
println(reg.findAllMatchIn("abc3d2gf").toList)
//List(3, 2)
val str = "{\"id\":\"123456\",\"friends\":{\"name\":\"zs\",\"age\":\"40\"}}"
val reg = "\\{\"id\":\"([0-9]+)\",\"friends\":\\{\"name\":\"([a-z]+)\",\"age\":\"([0-9]+)\"}}".r
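// group(n) returns the text captured by the n-th pair of parentheses; print the groups as an explicit tuple
reg.findAllMatchIn(str).foreach(x => println((x.group(1), x.group(2), x.group(3))))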
//(123456,zs,40)
val input="name:Jason,age:19,weight:100"
val studentPattern="([0-9a-zA-Z-#() ]+):([0-9a-zA-Z-#() ]+)".r
// each match is one key:value pair; group(1) is the key and group(2) the value
studentPattern.findAllMatchIn(input).foreach(x => println((x.group(1), x.group(2))))
//(name,Jason)
//(age,19)
//(weight,100)
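Once the key and value groups are captured, collecting them into a Map makes lookups by key straightforward. This is a minimal sketch under the assumption that keys and values are purely alphanumeric, so the character class is simplified relative to the one above; pairPattern, record, fields, and age are hypothetical names.

```scala
val pairPattern = "([0-9a-zA-Z]+):([0-9a-zA-Z]+)".r
val record = "name:Jason,age:19,weight:100"

// turn every key:value match into a tuple, then into a Map
val fields: Map[String, String] =
  pairPattern.findAllMatchIn(record).map(m => m.group(1) -> m.group(2)).toMap

// look a field up by key and convert it; the result is None if the key is absent
val age: Option[Int] = fields.get("age").map(_.toInt)
```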
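// Practical example: suppose a log file (at the path "path") contains lines such as
// INFO 2000-01-07 requestURI:/c?app=0&p=1
// and we want to parse each line into its fields.
import scala.io.Source
val source = Source.fromFile("path", "UTF-8")
// read all lines, then close the file even if reading fails
val lines = try source.getLines().toArray finally source.close()
val reg = """([A-Z]+) ([0-9]{4}-[0-9]{2}-[0-9]{1,2}) requestURI:(.*)""".r
// Option 1: findAllMatchIn plus group()
lines.map(line => reg.findAllMatchIn(line).toList.map(x => (x.group(1), x.group(2), x.group(3)))).foreach(println)
//List((INFO,2000-01-07,/c?app=0&p=1))
// Option 2: use the regex as an extractor; collect skips lines that do not match instead of throwing a MatchError
lines.collect { case reg(le, ld, ad) => (le, ld, ad) }
// Array[(String, String, String)] = Array((INFO,2000-01-07,/c?app=0&p=1))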
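For anything beyond a quick script, it can help to parse each line into a named type and to make the "line did not match" case explicit. The sketch below works on in-memory sample lines rather than a file; LogEntry, logLine, parse, and sampleLines are hypothetical names introduced for illustration.

```scala
final case class LogEntry(level: String, date: String, uri: String)

val logLine = """([A-Z]+) ([0-9]{4}-[0-9]{2}-[0-9]{1,2}) requestURI:(.*)""".r

// returns None for lines that do not have the expected shape
def parse(line: String): Option[LogEntry] = line match {
  case logLine(level, date, uri) => Some(LogEntry(level, date, uri))
  case _                         => None
}

val sampleLines = List("INFO 2000-01-07 requestURI:/c?app=0&p=1", "not a log line")

// flatMap drops the None results and unwraps the Some results
val entries: List[LogEntry] = sampleLines.flatMap(line => parse(line))
```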
//replaceFirstIn
val a = """([0-9]+)""".r
a.replaceFirstIn("123,go! 666","run")
// run,go! 666
//replaceAllIn
val a = """([0-9]+)""".r
a.replaceAllIn("123 you are the best!","come on!")
//come on! you are the best!
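Because the sample string above contains only one number, replaceAllIn and replaceFirstIn give the same result there. The following sketch uses an input with two numbers to show the difference, and also the overload of replaceAllIn that takes a function from Match to String; the value names are illustrative.

```scala
val digits = """([0-9]+)""".r
val phrase = "123,go! 666"

// replaceFirstIn touches only the first match
val firstOnly: String = digits.replaceFirstIn(phrase, "run")   // run,go! 666

// replaceAllIn replaces every match
val all: String = digits.replaceAllIn(phrase, "run")           // run,go! run

// the replacement can also be computed from each match
val doubled: String = digits.replaceAllIn(phrase, m => (m.matched.toInt * 2).toString) // 246,go! 1332
```

Note that in a replacement string the characters $ and \ have special meaning (for example $1 refers to the first group), so literal text containing them should be passed through Regex.quoteReplacement.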
// Extract individual groups with a pattern match; _* ignores the remaining groups
val date = """([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})""".r
"2020-5-18" match {case date(year, _*) => println(year)}
//2020
"2020-5-18" match {case date(_,mon,_*) => println(mon)}
//5
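The matches above throw a MatchError if the string is not a date. A hedged sketch of a safer variant that binds all three groups, converts them to Int, and falls back to None; datePattern and parseDate are hypothetical names, and no range checking of month or day is attempted.

```scala
val datePattern = """([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})""".r

// bind year, month, and day at once; anything else yields None
def parseDate(s: String): Option[(Int, Int, Int)] = s match {
  case datePattern(y, m, d) => Some((y.toInt, m.toInt, d.toInt))
  case _                    => None
}

val ok = parseDate("2020-5-18")     // Some((2020,5,18))
val bad = parseDate("18 May 2020")  // None
```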