Case insensitive string search in golang
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors on Stack Overflow (not me).
Original: http://stackoverflow.com/questions/24836044/
Asked by user3841581
How do I search through a file for a word in a case-insensitive manner?
For example, if I'm searching for UpdaTe in the file and the file contains update, the search should pick it up and count it as a match.
Answered by 425nesp
strings.EqualFold() can check whether two strings are equal while ignoring case. It even works with Unicode. See http://golang.org/pkg/strings/#EqualFold for more info.
http://play.golang.org/p/KDdIi8c3Ar
package main

import (
	"fmt"
	"strings"
)

func main() {
	fmt.Println(strings.EqualFold("HELLO", "hello"))
	fmt.Println(strings.EqualFold("ÑOÑO", "ñoño"))
}
Both return true.
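Note that strings.EqualFold compares two whole strings, so on its own it does not find a word inside a longer text. Below is a minimal sketch of one way to apply it to the original question, assuming the text can be split on whitespace with strings.Fields. The helper name countWordFold is made up for illustration, and punctuation attached to a word is not stripped.

```go
package main

import (
	"fmt"
	"strings"
)

// countWordFold is a hypothetical helper (not part of the original answer).
// It splits text on whitespace and counts the fields that are equal to word
// under Unicode case folding.
func countWordFold(text, word string) int {
	n := 0
	for _, field := range strings.Fields(text) {
		if strings.EqualFold(field, word) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countWordFold("Please update the Update script", "UpdaTe"))
}
```

Because whole fields are compared, "updated" would not count as a match for "update" here; whether that is desirable depends on what "word" means for your file.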
Answered by joshlf
Presumably the important part of your question is the search, not the part about reading from a file, so I'll just answer that part.
Probably the simplest way to do this is to convert both strings (the one you're searching through and the one you're searching for) to all upper case or all lower case, and then search. For example:
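// CaseInsensitiveContains reports whether substr occurs in s, ignoring case.
// It needs the "strings" package.
func CaseInsensitiveContains(s, substr string) bool {
	s, substr = strings.ToUpper(s), strings.ToUpper(substr)
	return strings.Contains(s, substr)
}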
You can see it in action here.
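Since the question also asks to count matches, the same case-conversion idea can be combined with strings.Count. This is only a sketch under that assumption; the helper name countContainsFold is hypothetical and does not come from the answer.

```go
package main

import (
	"fmt"
	"strings"
)

// countContainsFold is a hypothetical helper. It lower-cases both strings and
// counts the non-overlapping occurrences of substr in s.
func countContainsFold(s, substr string) int {
	if substr == "" {
		// strings.Count returns 1 + the number of runes in s for an empty
		// substr, which is not a useful match count.
		return 0
	}
	return strings.Count(strings.ToLower(s), strings.ToLower(substr))
}

func main() {
	fmt.Println(countContainsFold("Update, updated, UPDATE", "UpdaTe"))
}
```

Unlike a word-by-word comparison, this counts substrings, so "updated" contributes a match for "update".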
Answered by chendesheng
If your file is large, you can use regexp and bufio:
// Needs the packages bufio, fmt, log, os and regexp.
// The regex `(?i)update` matches any string containing "update", case-insensitively.
reg := regexp.MustCompile("(?i)update")
f, err := os.Open("test.txt")
if err != nil {
	log.Fatal(err)
}
defer f.Close()
// Do the match operation.
// MatchReader scans the file until it finds a match.
// Using bufio here avoids loading the entire file into memory.
fmt.Println(reg.MatchReader(bufio.NewReader(f)))
The bufio package implements a buffered reader that may be useful both for its efficiency with many small reads and because of the additional reading methods it provides.
Answered by Xeoncross
Do not use strings.Contains unless you need exact matching rather than language-correct string searches
None of the current answers are correct unless you are only searching the minority of languages (like English) that lack diaereses, umlauts, or other Unicode glyph modifiers (the more "correct" way to define it, as mentioned by @snap). The standard Google phrase is "searching non-ASCII characters".
For proper support for language searching you need to use http://golang.org/x/text/search.
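// SearchForString returns the start and end byte offsets of the first match,
// or -1, -1 if substr is not found.
func SearchForString(str string, substr string) (int, int) {
	m := search.New(language.English, search.IgnoreCase)
	return m.IndexString(str, substr)
}

// Usage (inside a function):
start, end := SearchForString("foobar", "bar")
if start != -1 && end != -1 {
	fmt.Println("found at", start, end)
}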
Or if you just want the starting index:
// SearchForStringIndex returns the start offset of the first match and
// whether a match was found.
func SearchForStringIndex(str string, substr string) (int, bool) {
	m := search.New(language.English, search.IgnoreCase)
	start, _ := m.IndexString(str, substr)
	if start == -1 {
		return 0, false
	}
	return start, true
}

// Usage (inside a function):
index, found := SearchForStringIndex("foobar", "bar")
if found {
	fmt.Println("match starts at", index)
}
Search the language.Tag values in the golang.org/x/text/language package to find the language you wish to search with, or use language.Und if you are not sure.
Update
There seems to be some confusion so this following example should help clarify things.
package main

import (
	"fmt"
	"strings"

	"golang.org/x/text/language"
	"golang.org/x/text/search"
)

var s = `?`
var s2 = `?`

func main() {
	m := search.New(language.Finnish, search.IgnoreDiacritics)
	fmt.Println(m.IndexString(s, s2))
	fmt.Println(CaseInsensitiveContains(s, s2))
}

// CaseInsensitiveContains in string
func CaseInsensitiveContains(s, substr string) bool {
	s, substr = strings.ToUpper(s), strings.ToUpper(substr)
	return strings.Contains(s, substr)
}