Chapter 10: Working with Strings Using stringr, Part 3

Working with string data using stringr

Tools

Detecting matches

Detect the presence or absence of a pattern in a string.
Put simply, it checks whether each string being tested contains the pattern you are looking for.

Setup

library(tidyverse)

A simple run

x <- c("apple","banana","pear")
str_detect(x,"e")
[1]  TRUE FALSE  TRUE

Going further

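  1. In R, FALSE counts as 0 and TRUE counts as 1 in arithmetic. This means the math functions covered earlier can be applied to the result.
# Break the following statement down step by step
sum(str_detect(words, "^t"))
# > head(str_detect(words, "^t"))
# [1] FALSE FALSE FALSE FALSE FALSE FALSE
# aaa <- str_detect(words, "^t") %>% sum()
# words has 980 entries: 915 * 0 + 65 * 1 = 65
mean(str_detect(words, "[aeiou]$")) # same idea as above: the mean of a logical vector is the proportion of TRUE values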
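  2. Using regular expressions for more complex logical conditions
    Explaining the regular expression below:
    ^[^aeiou]+$
    [^aeiou] matches any single character other than a, e, i, o, u; the ^ inside the brackets negates the set (without it, [aeiou] matches exactly those vowels).
    + means repeat one or more times.
    Anchored by ^ at the start and $ at the end, the whole pattern matches words made up entirely of non-vowel characters, i.e. words that contain no vowels.
  3. Subsetting and filtering with filter()
  4. str_count() returns the number of pattern matches in each string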
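
The four points above can be tried out on a small vector. This is only a minimal sketch: the vector fruit_words and the data frame df are made up for illustration and do not come from the original notes; the functions used (str_detect(), str_count(), dplyr::filter()) are the real ones.

```r
library(stringr)
library(dplyr)

# Hypothetical example data (not from the original notes)
fruit_words <- c("apple", "banana", "pear", "try", "kiwi")

# 1. TRUE counts as 1 and FALSE as 0, so sum() counts the matches
#    and mean() gives the proportion of matches.
ends_in_vowel <- str_detect(fruit_words, "[aeiou]$")
sum(ends_in_vowel)   # 3 (apple, banana, kiwi)
mean(ends_in_vowel)  # 0.6

# 2. "^[^aeiou]+$" matches words made up only of non-vowel characters.
no_vowels <- str_detect(fruit_words, "^[^aeiou]+$")

# 3. Two ways to keep only the matching elements.
fruit_words[no_vowels]                              # logical subsetting: "try"
df <- data.frame(word = fruit_words, stringsAsFactors = FALSE)
kept <- filter(df, str_detect(word, "^[^aeiou]+$")) # same condition inside a data frame
kept

# 4. str_count() counts the matches per string instead of returning TRUE/FALSE.
a_counts <- str_count(fruit_words, "a")
a_counts             # 1 3 1 0 0
```

Logical subsetting is the shortest route for a plain vector, while filter() is the natural choice once the words live in a column next to other variables.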

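Extracting matches

What if I want to know which words the match detection returned TRUE for? Then the matching content has to be extracted.
Note that the pattern-matching functions in the str_* family need both a string and a regular expression (the pattern).
For example: str_detect(string, pattern)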

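# colors and color_match are not defined in these notes; the definitions below are an assumption, taken from R for Data Science
colors <- c("red", "orange", "yellow", "green", "blue", "purple")
color_match <- str_c(colors, collapse = "|")
has_color <- str_subset(sentences, colors)
has_color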
[1] "The spot on the blotter was made by green ink."
[2] "Torn scraps littered the stone floor."         
[3] "It is hard to erase blue or red ink."          
[4] "The box is held by a bright red snapper."      
[5] "Nine men were hired to dig the ruins."         
[6] "A man in a blue sweater sat at the desk."      
[7] "The sky in the west is tinged with orange red."
has_color <- str_subset(sentences,color_match)
has_color
 [1] "Glue the sheet to the dark blue background."       
 [2] "Two blue fish swam in the tank."                   
 [3] "The colt reared and threw the tall rider."         
 [4] "The wide road shimmered in the hot sun."           
 [5] "See the cat glaring at the scared mouse."          
 [6] "A wisp of cloud hung in the blue air."             
 [7] "Leaves turn brown and yellow in the fall."         
 [8] "He ordered peach pie with ice cream."              
 [9] "Pure bred poodles have curls."                     
[10] "The spot on the blotter was made by green ink."    
[11] "Mud was spattered on the front of his white shirt."
[12] "The sofa cushion is red and of light weight."      
[13] "The sky that morning was clear and bright blue."   
[14] "Torn scraps littered the stone floor."             
[15] "The doctor cured him with these pills."            
[16] "The new girl was fired today at noon."             
[17] "The third act was dull and tired the players."     
[18] "A blue crane is a tall wading bird."               
[19] "Lire wires should be kept covered."                
[20] "It is hard to erase blue or red ink."              
[21] "The wreck occurred by the bank on Main Street."    
[22] "The lamp shone with a steady green flame."         
[23] "The box is held by a bright red snapper."          
[24] "The prince ordered his head chopped off."          
[25] "The houses are built of red clay bricks."          
[26] "The red tape bound the smuggled food."             
[27] "Nine men were hired to dig the ruins."             
[28] "The flint sputtered and lit a pine torch."         
[29] "Hedge apples may stain your hands green."          
[30] "The old pan was covered with hard fudge."          
[31] "The plant grew large and green in the window."     
[32] "The store walls were lined with colored frocks."   
[33] "The purple tie was ten years old."                 
[34] "Bathe and relax in the cool green grass."          
[35] "The clan gathered on each dull night."             
[36] "The lake sparkled in the red hot sun."             
[37] "Mark the spot with a sign painted red."            
[38] "Smoke poured out of every crack."                  
[39] "Serve the hot rum to the tired heroes."            
[40] "The couch cover and hall drapes were blue."        
[41] "He offered proof in the form of a lsrge chart."    
[42] "A man in a blue sweater sat at the desk."          
[43] "The sip of tea revives his tired friend."          
[44] "The door was barred, locked, and bolted as well."  
[45] "A thick coat of black paint covered all."          
[46] "The small red neon lamp went out."                 
[47] "Paint the sockets in the wall dull green."         
[48] "Wake and rise, and step into the green outdoors."  
[49] "The green light in the brown box flickered."       
[50] "He put his last cartridge into the gun and fired." 
[51] "The ram scared the school children off."           
[52] "Tear a thin sheet from the yellow pad."            
[53] "Dimes showered down from all sides."               
[54] "The sky in the west is tinged with orange red."    
[55] "The red paper brightened the dim stage."           
[56] "The hail pattered on the burnt brown grass."       
[57] "The big red apple fell to the ground."  

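At this point I had a question: why do the individual color words have to be collapsed into a single string?

Then I read the help page and it clicked: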
str_subset() is a wrapper around x[str_detect(x, pattern)], and is equivalent to grep(pattern, x, value = TRUE). str_which() is a wrapper around which(str_detect(x, pattern)), and is equivalent to grep(pattern, x).
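
The equivalences quoted from the help page can be checked on a small vector. This is a sketch for illustration only; the vector x is made-up example data.

```r
library(stringr)

# Hypothetical example data
x <- c("apple", "banana", "pear")

# str_subset() returns the matching elements themselves ...
str_subset(x, "e")           # "apple" "pear"
x[str_detect(x, "e")]        # same result via logical subsetting
grep("e", x, value = TRUE)   # base R equivalent

# ... while str_which() returns their positions.
str_which(x, "e")            # 1 3
which(str_detect(x, "e"))    # same positions
grep("e", x)                 # base R equivalent
```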

sum(str_detect(sentences,colors))
[1] 7

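Because str_detect() only ever returns a logical vector of the same length as the input vector! Given a vector of patterns, it pairs each sentence with a single pattern (recycling the shorter vector, in the stringr version used here) rather than testing every color against every sentence, which is why only 7 sentences match. What a painful lesson!
From now on I will use it together with if, for, and apply.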
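
To see why the color vector has to be collapsed, it may help to compare a vector of patterns with a single alternation pattern on a tiny example. This is a sketch under assumptions: phrases is made-up data, colors is assumed to hold the six color names used in R for Data Science, and the two vectors are deliberately given the same length, because recycling a shorter pattern vector (as in the output above) may be rejected by newer stringr releases.

```r
library(stringr)

# Assumed definition (not shown in the original notes)
colors <- c("red", "orange", "yellow", "green", "blue", "purple")

# Hypothetical example data, same length as colors
phrases <- c("a red hat", "a blue car", "green tea",
             "plain text", "red and white", "purple rain")

# A vector of patterns is matched element by element:
# phrases[1] against colors[1], phrases[2] against colors[2], and so on.
pairwise <- str_detect(phrases, colors)
sum(pairwise)    # 2 (only "a red hat" and "purple rain" line up with their color)

# Collapsing the colors with "|" gives one pattern meaning "any of these".
color_match <- str_c(colors, collapse = "|")
color_match      # "red|orange|yellow|green|blue|purple"

any_color <- str_detect(phrases, color_match)
sum(any_color)   # 5 (everything except "plain text")
str_subset(phrases, color_match)
```

The single collapsed pattern is what makes every sentence get tested against every color, which matches the much longer result of str_subset(sentences, color_match) shown earlier.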
