x <- c("aa", "aa", "aa", "bb", "cc", "cc", "cc")
y <- c(101, 102, 113, 201, 202, 344, 407)
df = data.frame(x, y)
x y
1 aa 101
2 aa 102
3 aa 113
4 bb 201
5 cc 202
6 cc 344
7 cc 407
#filter rows that contain 'Guard' or 'Forward' in the player column df %>% filter(grepl('Guard|Forward', player)) player points rebounds 1 P Guard 12 5 2 S Guard 15 7 3 S Forward 19 7 4 P Forward 22 12
#filter out rows that contain 'Guard' in the player column
df %>% filter(!grepl('Guard', player))
player points rebounds
1 S Forward 19 7
2 P Forward 22 12
3 Center 32 11
文章来源:https://www.toymoban.com/news/detail-745250.html
df %>% filter(y != grep("^1"))df %>% filter(!str_detect(y, "^1"))文章来源地址https://www.toymoban.com/news/detail-745250.html
Pipe functions together
We created multiple new data objects during our explorations of dplyr functions, above. While this works, we can produce the same results more efficiently by chaining functions together and creating only one new data object that encapsulates all of the previously sought information: filter on only females, grepl to get only Peromyscus spp., group_by individual species, and summarise the numbers of individuals.
# combine several functions to get a summary of the numbers of individuals of
# female Peromyscus species in our dataset.
# remember %>% are "pipes" that allow us to pass information from one function
# to the next.
dataBySpFem <- myData %>%
filter(grepl('Peromyscus', scientificName), sex == "F") %>%
group_by(scientificName) %>%
summarise(n_individuals = n())
## `summarise()` ungrouping output (override with `.groups` argument)
# view the data
dataBySpFem
## # A tibble: 3 x 2
## scientificName n_individuals
## <chr> <int>
## 1 Peromyscus leucopus 455
## 2 Peromyscus maniculatus 98
## 3 Peromyscus sp. 5
Cool!
Base R only
So that is nice, but we had to install a new package dplyr. You might ask, "Is it really worth it to learn new commands if I can do this is base R." While we think "yes", why don't you see for yourself. Here is the base R code needed to accomplish the same task.
# For reference, the same output but using R's base functions
# First, subset the data to only female Peromyscus
dataFemPero <- myData[myData$sex == 'F' &
grepl('Peromyscus', myData$scientificName), ]
# Option 1 --------------------------------
# Use aggregate and then rename columns
dataBySpFem_agg <-aggregate(dataFemPero$sex ~ dataFemPero$scientificName,
data = dataFemPero, FUN = length)
names(dataBySpFem_agg) <- c('scientificName', 'n_individuals')
# view output
dataBySpFem_agg
## scientificName n_individuals
## 1 Peromyscus leucopus 455
## 2 Peromyscus maniculatus 98
## 3 Peromyscus sp. 5
# Option 2 --------------------------------------------------------
# Do it by hand
# Get the unique scientificNames in the subset
sppInDF <- unique(dataFemPero$scientificName[!is.na(dataFemPero$scientificName)])
# Use a loop to calculate the numbers of individuals of each species
sciName <- vector(); numInd <- vector()
for (i in sppInDF) {
sciName <- c(sciName, i)
numInd <- c(numInd, length(which(dataFemPero$scientificName==i)))
}
#Create the desired output data frame
dataBySpFem_byHand <- data.frame('scientificName'=sciName,
'n_individuals'=numInd)
# view output
dataBySpFem_byHand
## scientificName n_individuals
## 1 Peromyscus leucopus 455
## 2 Peromyscus maniculatus 98
## 3 Peromyscus sp. 5
到了这里,关于dataframe安装某种标准过滤行dplyr gsub filter slice grep str_detect的文章就介绍完了。如果您还想了解更多内容,请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章,希望大家以后多多支持TOY模板网!