string: Extract letters from a string in R

Question

Asked by Moose

I have a character vector containing variable names such as x <- c("AB.38.2", "GF.40.4", "ABC.34.2"). I want to extract the letters so that I end up with a character vector containing only the letters, e.g. c("AB", "GF", "ABC").

Because the number of letters varies, I cannot use substring to specify the first and last characters.

How can I go about this?

Answer 1

Accepted answer by Mamoun Benghezal

You can try:

sub("^([[:alpha:]]*).*", "\\1", x)
[1] "AB"  "GF"  "ABC"
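As an alternative to substituting with a backreference, the leading letters can also be pulled out directly with base R's regexpr() and regmatches(). This is only a sketch of a different route to the same result, not the answerer's method; note that regmatches() silently drops elements that have no match, so it suits inputs where every element starts with a letter.

```r
x <- c("AB.38.2", "GF.40.4", "ABC.34.2")

# regexpr() locates the first run of one or more letters anchored at the
# start of each element; regmatches() then extracts the matched text.
leading <- regmatches(x, regexpr("^[[:alpha:]]+", x))

print(leading)  # "AB" "GF" "ABC"
```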

Answer 2

Answered by Bernard Beckerman

The previous answers seem more complicated than necessary. The approach from a similar question about digits also works with letters:

> x <- c("AB.38.2", "GF.40.4", "ABC.34.2", "A B ..C 312, Fd", "  a")
> gsub("[^a-zA-Z]", "", x)
[1] "AB"    "GF"    "ABC"   "ABCFd" "a"
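The two patterns seen so far are not interchangeable: the sub() call keeps only the letters at the start of each element, while the gsub() call deletes every non-letter wherever it occurs. A small side-by-side sketch on a few of the inputs above shows the difference (the variable names are just for illustration):

```r
x <- c("AB.38.2", "A B ..C 312, Fd", "  a")

# Keep only the leading run of letters (the accepted answer's pattern).
leading <- sub("^([[:alpha:]]*).*", "\\1", x)

# Strip every non-letter character, wherever it appears.
all_letters <- gsub("[^a-zA-Z]", "", x)

print(leading)      # "AB" "A" ""
print(all_letters)  # "AB" "ABCFd" "a"
```

Which one is appropriate depends on whether letters after the first separator should count.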

Answer 3

Answered by mimoralea

None of the other answers work if letters are mixed with spaces. Here is what I do in those cases:

x <- c("AB.38.2", "GF.40.4", "ABC.34.2", "A B ..C 312, Fd")
unique(na.omit(unlist(strsplit(unlist(x), "[^a-zA-Z]+"))))

[1] "AB"  "GF"  "ABC" "A"   "B"   "C"   "Fd"

Answer 4

Answered by cephalopod

This is how I solved the problem. I use this approach because it returns the 5 items cleanly and lets me control whether there is a space between the words:

x <- c("AB.38.2", "GF.40.4", "ABC.34.2", "A B ..C 312, Fd", "  a")

extract.alpha <- function(x, space = "") {
  # purrr supplies map_chr(); magrittr supplies the %>% pipe
  require(purrr)
  require(magrittr)

  # Split each element on runs of non-letter characters
  y <- strsplit(unlist(x), "[^a-zA-Z]+")
  # Rejoin the pieces of each element, separated by `space`
  z <- y %>% map_chr(~paste(., collapse = space))
  return(z)
}

extract.alpha(x, space = " ")
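The same idea can be sketched in base R without extra packages by matching the letter runs directly instead of splitting on the non-letters. The helper name join_letters is hypothetical and not part of the answer above. Because only matches are collected, an element that begins with non-letters (such as "  a") does not pick up an empty leading piece, so no stray separator is prepended.

```r
x <- c("AB.38.2", "GF.40.4", "ABC.34.2", "A B ..C 312, Fd", "  a")

# Hypothetical helper: collect every run of letters in each element,
# then join the runs of that element with `space`.
join_letters <- function(x, space = "") {
  pieces <- regmatches(x, gregexpr("[a-zA-Z]+", x))
  vapply(pieces, paste, character(1), collapse = space)
}

joined <- join_letters(x, space = " ")
print(joined)  # "AB" "GF" "ABC" "A B C Fd" "a"
```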

Answer 5

Answered by Lenz Paul

I realize this is an old question, but since I was just looking for a similar answer and found one, I thought I'd share.

The simplest and fastest solution I found:

x <- c("AB.38.2", "GF.40.4", "ABC.34.2")
only_letters <- function(x) { gsub("^([[:alpha:]]*).*$", "\\1", x) }
only_letters(x)

The output is:

[1] "AB"  "GF"  "ABC"

Hope this helps someone!
