string: Finding a string in a data.frame

Question

Asked by Jonas Lindeløv

How do I search for a string in a data.frame? As a minimal example, how do I find the locations (columns and rows) of 'horse' in this data.frame?

> df = data.frame(animal=c('goat','horse','horse','two', 'five'), level=c('five','one','three',30,'horse'), length=c(10, 20, 30, 'horse', 'eight'))
> df
  animal level length
1   goat  five     10
2  horse   one     20
3  horse three     30
4    two    30  horse
5   five horse  eight

... so rows 4 and 5 have the wrong order. Any output that would let me identify that 'horse' has shifted to the level column in row 5 and to the length column in row 4 is good. Maybe:

> magic_function(df, 'horse')
col       row
'animal', 2
'animal', 3
'length', 4
'level',  5

Here's what I want to use this for: I have a very large data frame (around 60 columns, 20,000 rows) in which some columns are messed up for some rows. It's too large to eyeball in order to identify the different ways that the order can be wrong, so searching would be nice. I will use this info to move data to the correct columns for these rows.

Answer 1

Answered by thothal

What about:

which(df == "horse", arr.ind = TRUE)
#      row col
# [1,]   2   1
# [2,]   3   1
# [3,]   5   2
# [4,]   4   3
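
To get column names instead of column numbers, the index matrix can be mapped through colnames(). A minimal sketch, assuming the question's example data (rebuilt here so the snippet runs on its own); find_value is a hypothetical helper name, not an existing R function:

```r
# Example data from the question (numbers written as strings, as c() would coerce them)
df <- data.frame(animal = c("goat", "horse", "horse", "two", "five"),
                 level  = c("five", "one", "three", "30", "horse"),
                 length = c("10", "20", "30", "horse", "eight"),
                 stringsAsFactors = FALSE)

# find_value() is a hypothetical helper, named here for illustration only
find_value <- function(data, value) {
  # logical matrix of exact matches, converted to (row, col) index pairs
  hits <- which(data == value, arr.ind = TRUE)
  data.frame(col = colnames(data)[hits[, "col"]],  # column number -> column name
             row = unname(hits[, "row"]),
             stringsAsFactors = FALSE)
}

find_value(df, "horse")
```

Because this uses ==, only cells that equal the value exactly are reported; partial matches would need grepl() instead.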

Answer 2

Answered by 989

Another way around:

# grep() returns, for each column, the row positions whose value matches the pattern "horse"
l <- sapply(colnames(df), function(x) grep("horse", df[,x]))
l

$animal
[1] 2 3

$level
[1] 5

$length
[1] 4

If you want the output as a matrix:

sapply(l,'[',1:max(lengths(l)))

     animal level length
[1,]      2     5      4
[2,]      3    NA     NA
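
The per-column list can also be reshaped into one row per match. A sketch, assuming base R's stack() and the question's example data; simplify = FALSE and fixed = TRUE are additions not in the answer above, used to keep the result a list and to match the literal text rather than a regular expression:

```r
# Example data from the question (numbers written as strings, as c() would coerce them)
df <- data.frame(animal = c("goat", "horse", "horse", "two", "five"),
                 level  = c("five", "one", "three", "30", "horse"),
                 length = c("10", "20", "30", "horse", "eight"),
                 stringsAsFactors = FALSE)

# simplify = FALSE keeps a named list even if every column has the same number of hits
hits <- sapply(colnames(df),
               function(x) grep("horse", df[[x]], fixed = TRUE),
               simplify = FALSE)

# stack() turns the named list into a data frame with one row per match
res <- stack(hits)
names(res) <- c("row", "col")
res
```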

Answer 3

Answered by piyuw

Another way to do it is the following:

library(data.table)
library(zoo)
library(dplyr)
library(timeDate)
library(reshape2)
# the data frame is named tbl_Account

First, transpose it:

temp = t(tbl_Account)

Then put it into a list:

temp = list(temp)

Transposing converts the data frame into a single character matrix, so every observation becomes a string element of one object, allowing you to search the whole data frame in one go.

Then do the searching:

temp[[1]][grep("horse", temp[[1]])]  # returns the matching values themselves (grep is case-sensitive)
grep("horse", temp[[1]])             # returns the positions of the matches within the transposed matrix
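
The positions returned by grep() on a matrix are linear indices, so they still need converting back to row and column numbers. A sketch, assuming base R's arrayInd() and the question's example data, and searching the untransposed matrix so that the indices refer to the original layout:

```r
# Example data from the question (numbers written as strings, as c() would coerce them)
df <- data.frame(animal = c("goat", "horse", "horse", "two", "five"),
                 level  = c("five", "one", "three", "30", "horse"),
                 length = c("10", "20", "30", "horse", "eight"),
                 stringsAsFactors = FALSE)

m   <- as.matrix(df)                     # character matrix with the same shape as df
pos <- grep("horse", m, fixed = TRUE)    # linear positions, counted down the columns
ind <- arrayInd(pos, dim(m))             # convert each position to (row, column)
colnames(ind) <- c("row", "col")
ind
```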

Hope this helps :)

Answer 4

Answered by Ronak Shah

We can get the linear indices where the value is equal to horse. Because which() counts down the columns (column-major order), integer division of the zero-based index by the number of rows (nrow) gives the column index, and the remainder of that division gives the row index.

We use colnames to get column names instead of indices.

idx <- which(df == "horse")  # linear, column-major positions of the matches
data.frame(col = colnames(df)[(idx - 1) %/% nrow(df) + 1],
           row = (idx - 1) %% nrow(df) + 1)

#      col row
# 1 animal   2
# 2 animal   3
# 3  level   5
# 4 length   4
