2019-11-21 R language, day 4: R built-in functions, string-handling functions

1 Extracting or replacing

1.1 Extracting or replacing the content between a start and a stop position in each element

substr(x, start=n1, stop=n2)

x <- c("howareyou","fine","thank")
substr(x,2,4)    # "owa" "ine" "han"  i.e. characters 2 to 4 of each string
substr(x, 2, 4) <- "1234567"  # x is now "h123reyou" "f123" "t123k": characters 2 to 4 of each string are overwritten with the start of the replacement
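Two edge cases of substr are worth seeing side by side. The sketch below assumes base R only; the names long_stop and y are illustrative. A stop position past the end of a string is truncated, and assignment through substr never changes the length of a string.

```r
x <- c("howareyou", "fine", "thank")

# A stop position beyond the end of a string is silently truncated.
long_stop <- substr(x, 3, 100)
print(long_stop)   # "wareyou" "ne" "ank"

# Assignment keeps each string's original length: a replacement shorter
# than the target span only overwrites as many characters as it supplies.
y <- x
substr(y, 2, 4) <- "Z"
print(y)           # "hZwareyou" "fZne" "tZank"

# The lengths are unchanged after the assignment.
print(nchar(y) == nchar(x))   # TRUE TRUE TRUE
```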

1.2 Replacing matched elements

sub replaces only the first match within each element; gsub is global and replaces every match.

sub(pattern, replacement, x, ignore.case =FALSE, fixed=FALSE, perl=FALSE, useBytes=FALSE)

x <- c("howareyouaaa","fine","thank")
sub("a",replacement = "A",x)  #  "howAreyouaaa" "fine" "thAnk"  
gsub("a",replacement = "A",x)   # "howAreyouAAA" "fine" "thAnk"
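The fixed argument and the first-match versus all-matches distinction are easier to see on a string that contains regex metacharacters. This is a minimal sketch using base R; the variable names and sample strings are made up for illustration.

```r
dotted <- "a.b.c"
date   <- "2019-11-21"

# As a regular expression, "." matches any character, so everything is replaced.
regex_dot <- gsub(".", "_", dotted)
print(regex_dot)     # "_____"

# With fixed = TRUE the pattern is a literal dot.
literal_dot <- gsub(".", "_", dotted, fixed = TRUE)
print(literal_dot)   # "a_b_c"

# sub stops after the first match; gsub continues through the whole string.
first_only <- sub("-", "/", date)
all_hits   <- gsub("-", "/", date)
print(first_only)    # "2019/11-21"
print(all_hits)      # "2019/11/21"

# Parenthesised groups can be reused in the replacement as \\1, \\2, \\3.
reordered <- sub("([0-9]+)-([0-9]+)-([0-9]+)", "\\3/\\2/\\1", date)
print(reordered)     # "21/11/2019"
```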

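2 Searching

grep(pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE, fixed = FALSE, useBytes = FALSE, invert = FALSE)

grep returns the positions in the vector of the elements that match the regular expression or, with value = TRUE, the elements themselves. For logical values, use grepl.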

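  • invert→if TRUE, returns the indices of the elements that do not contain pattern

  • value→if TRUE, returns the matching elements themselves instead of their indices

  • fixed→if fixed = FALSE, pattern is a regular expression; if fixed = TRUE, pattern is treated as a literal string. In both cases the indices of the matches are returned.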

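    x <- c("howareyou","fine","thank")
    grep("e",x)   # 1 2   indices of the elements that contain "e"
    grep("e",x,invert = T)   # 3  indices of the elements that do not contain "e"
    grep("e",x,value = T)    # "howareyou" "fine"   the matching elements themselves
    grep("e",x,value = T, invert =T)  # "thank"   the elements that do not contain "e"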
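The remaining arguments in the signature, ignore.case and fixed, can be tried in the same way. A short sketch in base R, with a sample vector chosen for illustration:

```r
x <- c("howareyou", "Fine", "thank", "a.b")

# Matching is case-sensitive by default, so "f" finds nothing here.
no_hit <- grep("f", x)
print(no_hit)        # integer(0)

# ignore.case = TRUE lets "f" match the "F" in "Fine".
ci_hit <- grep("f", x, ignore.case = TRUE)
print(ci_hit)        # 2

# "^" anchors the pattern to the start of the string.
starts_t <- grep("^t", x, value = TRUE)
print(starts_t)      # "thank"

# As a regex, "." matches any character; with fixed = TRUE it is a literal dot.
any_char <- grep(".", x)
literal  <- grep(".", x, fixed = TRUE)
print(any_char)      # 1 2 3 4
print(literal)       # 4
```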
    

grepl(pattern, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)
Similar to grep, but returns a logical vector indicating whether each element contains pattern

  grepl("e",x)    # TRUE  TRUE FALSE   one logical value per element
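Because grepl always returns one value per element, it is convenient for subsetting and counting. The sketch below, with illustrative variable names, also shows why a logical mask is safer than a negative index built from grep when nothing matches.

```r
x <- c("howareyou", "fine", "thank")

has_e <- grepl("e", x)

# Use the logical vector directly as a subset mask.
with_e <- x[has_e]
print(with_e)        # "howareyou" "fine"

# TRUE counts as 1, so sum() gives the number of matching elements.
n_e <- sum(has_e)
print(n_e)           # 2

# When nothing matches, grep returns integer(0), and x[-integer(0)]
# selects nothing. The negated grepl mask keeps every element instead.
dropped_by_grep  <- x[-grep("z", x)]
dropped_by_grepl <- x[!grepl("z", x)]
print(length(dropped_by_grep))    # 0
print(length(dropped_by_grepl))   # 3
```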

3 Joining and splitting strings

paste (..., sep = " ", collapse = NULL)
paste0(..., collapse = NULL)

paste("a","b",sep="-")  # [1] "a-b"
paste("x",1:4,sep="")    #"x1" "x2" "x3" "x4"
x <- c("howareyouaaa","fine","thank")
y <- c("de","ta")
paste(x,y, sep = "-" )   # "howareyouaaa-de" "fine-ta" "thank-de"  (y is recycled to the length of x)
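The collapse argument and paste0 appear in the signatures above but not in the examples. A minimal sketch, assuming base R and an illustrative vector named parts: sep joins the arguments element by element, while collapse joins the resulting vector into a single string.

```r
parts <- c("a", "b", "c")

# collapse joins the elements of the result into one string.
joined <- paste(parts, collapse = "+")
print(joined)        # "a+b+c"

# paste0 behaves like paste with sep = "".
labels <- paste0("x", 1:3)
print(labels)        # "x1" "x2" "x3"

# sep is applied first (element-wise), then collapse.
both <- paste("id", parts, sep = "_", collapse = ";")
print(both)          # "id_a;id_b;id_c"
```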

strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE)

strsplit(c("a1,a2"),split = "")  
#[[1]]
# [1] "a" "1" "," "a" "2" 
strsplit(c("a1","a2"),split = "")
# [[1]]
#[1] "a" "1"
#[[2]]
#[1] "a" "2"

Question: what do the [ ] and [[ ]] at the start of each output line mean?
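One way to approach the question is to inspect the result directly. strsplit returns a list with one element per input string: [[1]] and [[2]] label the list elements, and the [1] in front of each printed line is the index of the first value shown on that line of the character vector. A short sketch, with illustrative names:

```r
pieces <- strsplit(c("a1", "a2"), split = "")

# The result is a list, one element per input string.
print(class(pieces))    # "list"
print(length(pieces))   # 2

# [[ ]] extracts one list element, which is a character vector.
second <- pieces[[2]]
print(second)           # "a" "2"

# [ ] then indexes inside that character vector.
print(second[1])        # "a"

# unlist flattens the list into a single character vector.
flat <- unlist(pieces)
print(flat)             # "a" "1" "a" "2"

# Splitting on a real separator instead of the empty string.
fields <- strsplit("a1,a2", split = ",")[[1]]
print(fields)           # "a1" "a2"
```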

4 Upper and lower case

toupper(x) converts to upper case
tolower(x) converts to lower case
toupper(c("wo"))  # returns "WO"
tolower("whNIL")  # returns "whnil"
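The case functions combine naturally with the other functions in these notes. The sketch below uses made-up strings to show a case-insensitive comparison and a first-letter capitalisation built from toupper, substr and paste0.

```r
a <- "WhNIL"
b <- "whnil"

# == is case-sensitive; normalising both sides first makes the comparison
# ignore case.
same_raw   <- a == b
same_lower <- tolower(a) == tolower(b)
print(same_raw)      # FALSE
print(same_lower)    # TRUE

# Capitalise only the first letter of a word.
word <- "hello"
capitalised <- paste0(toupper(substr(word, 1, 1)), substr(word, 2, nchar(word)))
print(capitalised)   # "Hello"
```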
