Processing strings with stringr

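Much of my recent work involves string processing, and Xiaojie (小洁学姐) of 生信星球 happened to introduce the stringr package a while ago. My main task today is to summarize her code and run through it myself, so that I can then apply it in my own work.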

1. Basic string operations

1.1 String length

library(stringr)
str_length(c("a","R for bioplanet",NA))

1.2 Joining strings

str_c("x","y")
str_c("x","y","z")
str_c("x","y",sep=",")
str_c("prefix-",c("a","b"),"-suffix")
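str_replace_na(c("a",NA)) # converts NA to the literal string "NA" (it does not remove it)
str_c(c("x","y","z"),collapse="") # collapse merges a character vector into a single string; note how it differs from sep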
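
A few details of the calls above are easy to miss: str_c() recycles shorter arguments, NA is contagious, and sep and collapse do different jobs. Here is a minimal sketch; the input values are illustrative and not from Xiaojie's original code.

```r
library(stringr)

# Shorter arguments are recycled, so one prefix and suffix wrap every element.
wrapped <- str_c("prefix-", c("a", "b"), "-suffix")

# NA is contagious: a single NA piece makes the whole result NA.
with_na <- str_c("x", NA)

# str_replace_na() turns NA into the literal string "NA" so it can be joined.
filled <- str_c("x", str_replace_na(c("a", NA)))

# sep goes between the arguments of each element;
# collapse then merges the resulting vector into one string.
pairs  <- str_c(c("x", "y"), c("1", "2"), sep = "-")
merged <- str_c(c("x", "y"), c("1", "2"), sep = "-", collapse = ",")

print(wrapped)  # "prefix-a-suffix" "prefix-b-suffix"
print(with_na)  # NA
print(filled)   # "xa"  "xNA"
print(pairs)    # "x-1" "y-2"
print(merged)   # "x-1,y-2"
```

In short, reach for sep when gluing several vectors element by element, and for collapse when the goal is one single string.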

1.3 str_sub: extracting substrings

x=c("apple","banana","pear")
str_sub(x,1,3)
str_sub(x,-3,-1)
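
To make the indexing rules concrete: positions are inclusive, negative positions count from the end of the string, and str_sub() also has an assignment form. The sketch below reuses the fruit vector from above; the capitalization step is only an illustration I added.

```r
library(stringr)

x <- c("apple", "banana", "pear")

first3 <- str_sub(x, 1, 3)    # positions 1 to 3, inclusive
last3  <- str_sub(x, -3, -1)  # negative positions count from the end

# A range that runs past the end of the string is truncated, not an error.
short <- str_sub("a", 1, 5)

# Assignment form: replace the first character with its upper-case version.
capitalized <- x
str_sub(capitalized, 1, 1) <- str_to_upper(str_sub(capitalized, 1, 1))

print(first3)       # "app" "ban" "pea"
print(last3)        # "ple" "ana" "ear"
print(short)        # "a"
print(capitalized)  # "Apple"  "Banana" "Pear"
```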

1.4 Case conversion and sorting

str_to_lower("X")
str_to_upper("x")
str_to_title("dedd")
str_sort(x,locale="en")

2. Using regular expressions

The main functions used here are str_view() and str_view_all().

2.1 Basic matching

x=c("apple","banana","pear")
str_view(x,"an")
y=c("app.e","banana","pear")
str_view(y,"\\.") # I don't fully understand yet why the escape needs two backslashes
str_view(x,"^a")
str_view(x,"a$")
str_view(x,"^a$")
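
On the double backslash in the pattern above: the string passes through two layers. R's own string parser treats the backslash as an escape character, so "\\" in source code is a single backslash in the actual string, and the regular expression engine then receives \. which means a literal dot. A small sketch to check this, using the same y vector (the variable name pattern is my own):

```r
library(stringr)

pattern <- "\\."

# What the regex engine actually receives: a backslash followed by a dot.
writeLines(pattern)          # prints \.
pattern_len <- nchar(pattern)  # two characters, not three

y <- c("app.e", "banana", "pear")

escaped   <- str_detect(y, pattern)     # literal dot only
unescaped <- str_detect(y, ".")         # unescaped dot matches any character
as_fixed  <- str_detect(y, fixed("."))  # fixed() skips regex interpretation

print(pattern_len)  # 2
print(escaped)      # TRUE FALSE FALSE
print(unescaped)    # TRUE TRUE TRUE
print(as_fixed)     # TRUE FALSE FALSE
```

When the pattern is plain text rather than a regular expression, fixed() avoids the escaping question altogether.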

3. Detecting matches

x=c("apple","banana","pear")
str_detect(x,"e") # combine with sum() and mean() to count matches and get their proportion
sum(str_detect(x,"^a"))
mean(str_detect(x,"[aeiou]$"))
x[!str_detect(x,"[aeiou]$")]
str_subset(x,"[aeiou]$")
str_count(x,"a")
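
The reason sum() and mean() work here is that str_detect() returns a logical vector, and TRUE counts as 1 in arithmetic. A short sketch with the same fruit vector; the "abababa" input is an extra illustration of how str_count() treats overlapping matches.

```r
library(stringr)

x <- c("apple", "banana", "pear")

ends_in_vowel <- str_detect(x, "[aeiou]$")  # one logical value per element

n_match    <- sum(ends_in_vowel)   # number of elements that match
prop_match <- mean(ends_in_vowel)  # proportion of elements that match

# Logical subsetting and str_subset() select the same elements.
by_index  <- x[ends_in_vowel]
by_subset <- str_subset(x, "[aeiou]$")

# str_count() counts matches inside each string; matches never overlap.
a_counts <- str_count(x, "a")
overlap  <- str_count("abababa", "aba")

print(ends_in_vowel)  # TRUE TRUE FALSE
print(n_match)        # 2
print(by_subset)      # "apple" "banana"
print(a_counts)       # 1 3 1
print(overlap)        # 2
```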

3.2 Extracting matches

Here is an example:

length(sentences)
head(sentences)
colors=c("red","orange","yellow","green","blue","purple")
color_match=str_c(colors,collapse = "|")
has_color=str_subset(sentences,color_match)
more=sentences[str_count(sentences,color_match)>1]
str_view_all(more,color_match)
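
The code above only finds and displays the sentences; to actually pull out the matched text, str_extract() and str_extract_all() are the usual tools. This is a minimal sketch on three made-up sentences (demo is my own input, not part of the sentences dataset), and it also shows one pitfall of the bare alternation pattern.

```r
library(stringr)

colors <- c("red", "orange", "yellow", "green", "blue", "purple")
color_match <- str_c(colors, collapse = "|")

# Made-up sentences for illustration only.
demo <- c("The red car passed a blue van.", "Nothing here.", "A green leaf.")

# str_extract() returns the first match per string, or NA when there is none.
first_color <- str_extract(demo, color_match)

# str_extract_all() returns every match, as a list with one element per string.
all_colors <- str_extract_all(demo, color_match)

# Pitfall: the pattern also matches inside longer words.
inside_word <- str_extract("flickered", color_match)

# Wrapping the alternation in word boundaries avoids that.
whole_word <- str_c("\\b(", color_match, ")\\b")
no_match   <- str_extract("flickered", whole_word)

print(first_color)  # "red" NA "green"
print(all_colors)
print(inside_word)  # "red"
print(no_match)     # NA
```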

3.3 Replacing matches

str_replace(x,"[aeiou]","-")
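nums=c("1 house","2 cars","3 people") # illustrative input with digits; the fruit vector x has none to replace
str_replace_all(nums,c("1"="one","2"="two","3"="three"))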
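
The difference between the two functions is that str_replace() changes only the first match in each string, while str_replace_all() changes every match and also accepts a named vector of several replacements. A sketch under the assumption of small illustrative inputs (nums and the "hello world" string are mine):

```r
library(stringr)

x <- c("apple", "banana", "pear")

first_only <- str_replace(x, "[aeiou]", "-")      # first vowel in each string
every_one  <- str_replace_all(x, "[aeiou]", "-")  # every vowel

# A named vector applies several pattern = replacement pairs in one call.
nums    <- c("1 house", "2 cars", "3 people")
spelled <- str_replace_all(nums, c("1" = "one", "2" = "two", "3" = "three"))

# Backreferences (\\1, \\2) reuse captured groups, here to swap two words.
swapped <- str_replace("hello world", "(\\w+) (\\w+)", "\\2 \\1")

print(first_only)  # "-pple"  "b-nana" "p-ar"
print(every_one)   # "-ppl-"  "b-n-n-" "p--r"
print(spelled)     # "one house" "two cars" "three people"
print(swapped)     # "world hello"
```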

3.4 Splitting

sentences %>% head(5) %>% str_split(" ")
c("name:hadley","country:nz:2","age:35") %>% str_split(":",simplify = TRUE,n=2)
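
The two extra arguments in the last call are worth spelling out: n caps the number of pieces, so anything after the first separator stays together, and simplify = TRUE returns a character matrix instead of a list. A small sketch with the same input (the variable names are my own):

```r
library(stringr)

fields <- c("name:hadley", "country:nz:2", "age:35")

# Default: a list, one character vector per input string.
as_list <- str_split(fields, ":")

# n = 2 stops after the first split; simplify = TRUE gives a matrix.
as_matrix <- str_split(fields, ":", n = 2, simplify = TRUE)

print(lengths(as_list))  # 2 3 2
print(dim(as_matrix))    # 3 2
print(as_matrix[2, ])    # "country" "nz:2"
```

The matrix form is convenient when every string follows the same key:value layout, since the columns can go straight into a data frame.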
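Exploring boundaries

sentences %>% head(5) %>% str_split(boundary("word")) # boundary() takes a type such as "word", not the data itself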
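
For reference, boundary() does not take the data as its argument; it builds a pattern from a type ("character", "line_break", "sentence" or "word") that is then passed to functions such as str_split(). A minimal sketch comparing it with splitting on a space, using a sample string I chose for illustration:

```r
library(stringr)

s <- "This is a sentence.  This is another sentence."

# Splitting on a single space keeps punctuation attached and yields an
# empty piece where two spaces follow each other.
by_space <- str_split(s, " ")[[1]]

# Splitting on word boundaries returns just the words.
by_word <- str_split(s, boundary("word"))[[1]]

print(by_space)
print(by_word)  # "This" "is" "a" "sentence" "This" "is" "another" "sentence"
```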