Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/24542115/

Date: 2020-09-09 02:19:28  Source: igfitidea

How to index a String in Rust

Tags: string, indexing, rust

Asked by Sam Myers

I am attempting to index a string in Rust, but the compiler throws an error. My code (Project Euler problem 4, playground):

fn is_palindrome(num: u64) -> bool {
    let num_string = num.to_string();
    let num_length = num_string.len();

    for i in 0 .. num_length / 2 {
        if num_string[i] != num_string[(num_length - 1) - i] {
            return false;
        }
    }

    true
}

The error:

error[E0277]: the trait bound `std::string::String: std::ops::Index<usize>` is not satisfied
 --> <anon>:7:12
  |
7 |         if num_string[i] != num_string[(num_length - 1) - i] {
  |            ^^^^^^^^^^^^^
  |
  = note: the type `std::string::String` cannot be indexed by `usize`

Is there a reason why String cannot be indexed? How can I access the data then?

Accepted answer by Vladimir Matveev

Yes, indexing into a string is not available in Rust. Rust strings are encoded in UTF-8 internally, so the concept of indexing would itself be ambiguous, and people would misuse it. Byte indexing is fast but almost always incorrect: when your text contains non-ASCII symbols, a byte index may land in the middle of a character, which is really bad if you need text processing. Char indexing is not free: UTF-8 is a variable-length encoding, so you have to traverse the string from the start to find the required code point.
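
To make the ambiguity concrete, here is a minimal sketch using only the standard library. The helper names byte_and_char_len and is_safe_byte_offset are made up for illustration; str::len, str::chars and str::is_char_boundary are the real methods doing the work.

```rust
/// Returns (byte length, number of code points) for a string slice.
/// Hypothetical helper, named here only for illustration.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    // `len()` is O(1) and counts UTF-8 bytes; `chars().count()` walks the
    // whole string and counts Unicode scalar values.
    (s.len(), s.chars().count())
}

/// Returns true if byte offset `i` is the start of a character
/// or the very end of the string.
fn is_safe_byte_offset(s: &str, i: usize) -> bool {
    s.is_char_boundary(i)
}
```

For a string such as "h\u{e9}llo" (the word "hello" with an accented e), the two lengths differ, and byte offset 2 falls inside the two-byte accented character, which is exactly the situation byte indexing cannot protect you from.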

If you are certain that your strings contain ASCII characters only, you can use the as_bytes() method on &str, which returns a byte slice, and then index into this slice:

let num_string = num.to_string();

// ...

let b: u8 = num_string.as_bytes()[i];
let c: char = b as char;  // if you need to get the character as a unicode code point
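
If you would rather check the ASCII assumption than trust it, one possible sketch is shown below. The function name ascii_char_at is hypothetical; it relies on str::is_ascii and on slice get, which returns None instead of panicking when the index is out of range.

```rust
/// Returns the character at byte index `i`, but only when the whole string
/// is ASCII, so that one byte is guaranteed to be one character.
/// Hypothetical helper, not part of the standard library.
fn ascii_char_at(s: &str, i: usize) -> Option<char> {
    if !s.is_ascii() {
        // Byte indexing could land inside a multi-byte character.
        return None;
    }
    // `get` yields None for an out-of-range index instead of panicking.
    s.as_bytes().get(i).map(|&b| b as char)
}
```

Note that is_ascii scans the string, so this check costs O(n) per call; for a number formatted with to_string() the assumption always holds and the check can be skipped.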

If you do need to index code points, you have to use the chars() iterator:

num_string.chars().nth(i).unwrap()

As I said above, this requires traversing the iterator up to the i-th code point.
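
As a rough illustration of code point indexing, the sketch below wraps chars().nth() and char_indices().nth(). The names nth_char and byte_offset_of_nth_char are assumptions made for this example; both walk the string from the start, so each call is O(i).

```rust
/// Returns the `i`-th code point of `s`, walking the string from the start.
fn nth_char(s: &str, i: usize) -> Option<char> {
    s.chars().nth(i)
}

/// Returns the byte offset at which the `i`-th code point starts.
fn byte_offset_of_nth_char(s: &str, i: usize) -> Option<usize> {
    // `char_indices` yields (byte offset, char) pairs.
    s.char_indices().nth(i).map(|(offset, _)| offset)
}
```

With "h\u{e9}llo", code point 2 is 'l' but it starts at byte offset 3, because the accented character before it occupies two bytes.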

Finally, in many cases of text processing, it is actually necessary to work with grapheme clusters rather than with code points or bytes. With the help of the unicode-segmentation crate, you can index into grapheme clusters as well:

use unicode_segmentation::UnicodeSegmentation;

let string: String = ...;
UnicodeSegmentation::graphemes(&string, true).nth(i).unwrap()

Naturally, grapheme cluster indexing has the same requirement of traversing the entire string as indexing into code points.
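
To see why code points and grapheme clusters can differ, here is a small standard-library-only sketch (it deliberately does not use the unicode-segmentation crate, so it only counts code points). The helper names are hypothetical.

```rust
/// Counts code points, which is not the same as counting user-perceived
/// characters (grapheme clusters).
fn code_point_count(s: &str) -> usize {
    s.chars().count()
}

/// Returns the first code point, which may be only part of what a reader
/// would consider the first character.
fn first_code_point(s: &str) -> Option<char> {
    s.chars().next()
}
```

The string "e\u{301}" (the letter e followed by a combining acute accent) is displayed as a single accented letter but consists of two code points, so indexing by code point would return a bare 'e' and drop the accent. The precomposed form "\u{e9}" is a single code point.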

Answer by Chris Morgan

The correct approach to doing this sort of thing in Rust is not indexing but iteration. The main problem here is that Rust's strings are encoded in UTF-8, a variable-length encoding for Unicode characters. Because the encoding is variable in length, the memory position of the nth character can't be determined without looking at the string. This also means that accessing the nth character has a runtime of O(n)!

In this special case, you can iterate over the bytes, because your string is known to only contain the characters 0–9 (iterating over the characters is the more general solution but is a little less efficient).

Here is some idiomatic code to achieve this (playground):

fn is_palindrome(num: u64) -> bool {
    let num_string = num.to_string();
    let half = num_string.len() / 2;

    num_string.bytes().take(half).eq(num_string.bytes().rev().take(half))
}

We go through the bytes in the string both forwards (num_string.bytes().take(half)) and backwards (num_string.bytes().rev().take(half)) simultaneously; the .take(half) part is there to halve the amount of work done. We then simply compare one iterator to the other to ensure at each step that the nth and nth-last bytes are equal; if they all are, it returns true; if not, false.
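
For the more general case mentioned earlier, where the input is not known to be ASCII, a sketch of the same idea over characters might look like this. The function name is_palindrome_str is an assumption for this example, and for brevity it compares the whole string rather than only half of it.

```rust
/// Checks whether `s` reads the same forwards and backwards, comparing
/// code points rather than bytes.
fn is_palindrome_str(s: &str) -> bool {
    // `Chars` is a double-ended iterator, so `.rev()` needs no allocation.
    s.chars().eq(s.chars().rev())
}
```

Comparing characters matters for non-ASCII input: reversing the bytes of "\u{e9}t\u{e9}" would split the two-byte accented letters, while reversing its characters does not. This still works on code points, not grapheme clusters.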

Answer by Angel Angel

If what you are looking for is something similar to an index, you can use .chars() and .nth() on a string.

.chars() -> Returns an iterator over the chars of a string slice.

.nth() -> Returns the nth element of the iterator, in an Option.

Now you can use the above in several ways, for example:

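let s: String = String::from("abc");
let x: usize = 1; // example index, assumed here for illustration
// If you are sure the index is in range:
println!("{}", s.chars().nth(x).unwrap());
// Or, to panic with a descriptive message when it is not:
println!("{}", s.chars().nth(x).expect("message"));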
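
Both unwrap() and expect() panic when the index is out of range. If that is not acceptable, the Option can be handled explicitly; the following is a minimal sketch with hypothetical helper names char_at_or and describe_char_at.

```rust
/// Returns the `i`-th code point of `s`, or `fallback` when `i` is out of range.
fn char_at_or(s: &str, i: usize, fallback: char) -> char {
    s.chars().nth(i).unwrap_or(fallback)
}

/// Describes the `i`-th code point of `s` without panicking.
fn describe_char_at(s: &str, i: usize) -> String {
    match s.chars().nth(i) {
        Some(c) => format!("character {} is '{}'", i, c),
        None => format!("index {} is out of range", i),
    }
}
```

Which variant fits depends on whether an out-of-range index is a bug (panic is reasonable) or an expected input (return a fallback or report it).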

Answer by iceblue

You can convert a String or &str to a Vec of chars and then index that Vec.

For example:

fn main() {
    let s = "Hello world!";
    let my_vec: Vec<char> = s.chars().collect();
    println!("my_vec[0]: {}", my_vec[0]);
    println!("my_vec[1]: {}", my_vec[1]);
}

Here you have a live example
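
A slightly more defensive variant of the same idea is sketched below, assuming you want repeated lookups and no panic on a bad index. The helper names to_chars and char_in are made up for this example.

```rust
/// Collects the code points of `s` once so that later lookups are O(1).
fn to_chars(s: &str) -> Vec<char> {
    s.chars().collect()
}

/// Looks up index `i`, returning None for an out-of-range index
/// instead of panicking the way `chars[i]` would.
fn char_in(chars: &[char], i: usize) -> Option<char> {
    chars.get(i).copied()
}
```

The trade-off is memory: collecting costs one O(n) pass and every char in the Vec takes four bytes, so this pays off mainly when the same string is indexed many times.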
