Removing duplicate rows such as AB and BA


title: "test"
author: "qliu"
date: "April 30, 2018"
output:
  html_document:
    keep_md: yes


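4.30 Removing duplicate rows such as AB and BA

Two rows count as duplicates here when they hold the same values regardless of column order, for example (A, B) and (B, A).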

Method 1: sort each row in a for loop ------------------------------

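# Example data: rows 2-3 (A, B) and rows 5-6 (B, A) hold the same pair in different order
a <- c(rep("A", 3), rep("B", 3), rep("C", 2))
b <- c("A", "B", "B", "C", "A", "A", "B", "B")
# stringsAsFactors = FALSE keeps both columns as character (before R 4.0 the default was factor)
df <- data.frame(a, b, stringsAsFactors = FALSE)
df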
##   a b
## 1 A A
## 2 A B
## 3 A B
## 4 B C
## 5 B A
## 6 B A
## 7 C B
## 8 C B
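cols <- c(1, 2)      # columns to compare
newdf <- df[, cols]  # copy that will hold each row's values in sorted order

system.time({
for (i in seq_len(nrow(df))) {
  # unlist() turns the one-row data frame into a vector that sort() can handle;
  # as.character() keeps this working if the columns are factors
  newdf[i, ] <- sort(as.character(unlist(df[i, cols])))
}

# Keep the rows of df whose sorted pair has not been seen before
df[!duplicated(newdf), ]
})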
##    user  system elapsed 
##    0.02    0.00    0.02

Method 2: vectorized pmin/pmax -------------------------------

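system.time({
# pmin()/pmax() return the smaller/larger value of each row, so AB and BA get the same key
key <- data.frame(lo = do.call(pmin, df), hi = do.call(pmax, df))
df[!duplicated(key), ]
})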
##    user  system elapsed 
##       0       0       0
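
To make Method 2 easier to follow, here is a minimal sketch that prints the intermediate key columns and the rows that survive. It rebuilds the same example data with character columns; the names `key`, `lo`, `hi` and `res` are illustrative and not part of the original post.

```r
# Same example data as above; character columns so pmin()/pmax() can compare them
df <- data.frame(
  a = c(rep("A", 3), rep("B", 3), rep("C", 2)),
  b = c("A", "B", "B", "C", "A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Order-independent key: smaller value first, larger value second
key <- data.frame(lo = pmin(df$a, df$b), hi = pmax(df$a, df$b))
key$lo  # "A" "A" "A" "B" "A" "A" "B" "B"
key$hi  # "A" "B" "B" "C" "B" "B" "C" "C"

# Keep the first row seen for each key
res <- df[!duplicated(key), ]
print(res)
##   a b
## 1 A A
## 2 A B
## 4 B C
```

Rows 5 and 6 (B, A) get the same key as row 2 (A, B), and rows 7 and 8 (C, B) the same key as row 4 (B, C), so only rows 1, 2 and 4 remain.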

Method 3: apply() with sort --------------------------------

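system.time({
# Sort the values within each row, then drop repeated rows.
# Note: newDf holds the sorted pairs, not the original rows of df.
newDf <- data.frame(t(apply(df, 1, sort)))
newDf <- newDf[!duplicated(newDf), ]
})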
##    user  system elapsed 
##       0       0       0
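
Unlike Methods 1 and 2, Method 3 returns the sorted pairs, not the original rows. One possible way to keep the original rows while still using the row-wise sort is sketched below. The helper name `dedup_unordered` and its `cols` argument are hypothetical; the sketch assumes at least two compared columns of the same type.

```r
# Hypothetical helper (name and arguments are illustrative, not from the post).
# Assumes at least two compared columns, all of the same type.
dedup_unordered <- function(data, cols = seq_along(data)) {
  # Sort the values within each row; na.last = TRUE keeps NAs so every row
  # has the same length and apply() returns a matrix
  key <- t(apply(as.matrix(data[, cols, drop = FALSE]), 1,
                 function(x) sort(x, na.last = TRUE)))
  # duplicated() on a matrix compares whole rows
  data[!duplicated(key), , drop = FALSE]
}

df <- data.frame(
  a = c(rep("A", 3), rep("B", 3), rep("C", 2)),
  b = c("A", "B", "B", "C", "A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

out <- dedup_unordered(df)
rownames(out)  # "1" "2" "4"
```

The sorted matrix is used only as a lookup key, so the returned data frame keeps the original column order and row names of `df`.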
