Source: http://stackoverflow.com/questions/24145823/
This page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors on Stack Overflow.
How do I convert a C string into a Rust string and back via FFI?
Asked by Dirk
I'm trying to get a C string returned by a C library and convert it to a Rust string via FFI.
mylib.c
const char* hello() {
    return "Hello World!";
}
main.rs
#![feature(link_args)]

extern crate libc;
use libc::c_char;

#[link_args = "-L . -I . -lmylib"]
extern {
    fn hello() -> *c_char;
}

fn main() {
    // how do I get a str representation of hello() here?
}
Answered by Vladimir Matveev
The best way to work with C strings in Rust is to use the types from the std::ffi module, namely CStr and CString.
CStr is a dynamically sized type, so it can only be used through a pointer. This makes it very similar to the regular str type. You can construct a &CStr from a *const c_char using the unsafe CStr::from_ptr static method. This method is unsafe because there is no guarantee that the raw pointer you pass to it is valid, that it really does point to a valid C string, and that the string's lifetime is correct.
You can get a &str from a &CStr using its to_str() method, which returns an error if the bytes are not valid UTF-8.
Here is an example:
extern crate libc;

use libc::c_char;
use std::ffi::CStr;
use std::str;

extern {
    fn hello() -> *const c_char;
}

fn main() {
    let c_buf: *const c_char = unsafe { hello() };
    let c_str: &CStr = unsafe { CStr::from_ptr(c_buf) };
    let str_slice: &str = c_str.to_str().unwrap();
    let str_buf: String = str_slice.to_owned(); // if necessary
}
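The unwrap() above panics if the C library returns bytes that are not valid UTF-8, and CStr::from_ptr must never receive a null pointer. The following is a minimal sketch of a more defensive conversion, assuming that a lossy conversion is acceptable for your use case. The helper name c_to_string_lossy is hypothetical, and byte literals stand in for the pointers a real C library would return.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical helper (not part of the original answer).
/// Returns `None` for a null pointer; otherwise copies the C string into an
/// owned `String`, replacing invalid UTF-8 sequences with U+FFFD.
///
/// # Safety
/// A non-null `ptr` must point to a valid NUL-terminated string.
unsafe fn c_to_string_lossy(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    let c_str = unsafe { CStr::from_ptr(ptr) };
    Some(c_str.to_string_lossy().into_owned())
}

fn main() {
    // Byte literals stand in for pointers a C library would return.
    let valid: &[u8] = b"Hello World!\0";
    let invalid: &[u8] = b"Hi \xFF!\0";

    // The strict conversion reports invalid UTF-8 as an error.
    let strict = unsafe { CStr::from_ptr(invalid.as_ptr() as *const c_char) }.to_str();
    println!("strict conversion failed: {}", strict.is_err());

    // The lossy conversion always yields a String for non-null input.
    println!("{:?}", unsafe { c_to_string_lossy(valid.as_ptr() as *const c_char) });
    println!("{:?}", unsafe { c_to_string_lossy(invalid.as_ptr() as *const c_char) });
    println!("{:?}", unsafe { c_to_string_lossy(std::ptr::null()) });
}
```

Because the helper returns an owned String, the result does not borrow from the C buffer and stays valid even after the C side frees it.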
You need to take into account the lifetime of your *const c_char pointers and who owns them. Depending on the C API, you may need to call a special deallocation function on the string. You need to arrange conversions carefully so the slices won't outlive the pointer. The fact that CStr::from_ptr returns a &CStr with an arbitrary lifetime helps here (though it is dangerous by itself); for example, you can encapsulate your C string in a structure and provide a Deref conversion so you can use your struct as if it were a string slice:
extern crate libc;

use libc::c_char;
use std::ops::Deref;
use std::ffi::CStr;

extern "C" {
    fn hello() -> *const c_char;
    fn goodbye(s: *const c_char);
}

struct Greeting {
    message: *const c_char,
}

impl Drop for Greeting {
    fn drop(&mut self) {
        unsafe {
            goodbye(self.message);
        }
    }
}

impl Greeting {
    fn new() -> Greeting {
        Greeting { message: unsafe { hello() } }
    }
}

impl Deref for Greeting {
    type Target = str;

    fn deref<'a>(&'a self) -> &'a str {
        let c_str = unsafe { CStr::from_ptr(self.message) };
        c_str.to_str().unwrap()
    }
}
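To see the wrapper in action without linking a C library, the sketch below swaps the two C functions for Rust stand-ins. This is an assumption made purely so the example runs on its own: here hello() and goodbye() are ordinary Rust functions built on CString::into_raw and CString::from_raw, and the FREED counter is a hypothetical addition that shows the deallocation function running exactly once.

```rust
use std::ffi::{CStr, CString};
use std::ops::Deref;
use std::os::raw::c_char;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts calls to `goodbye`, purely for demonstration.
static FREED: AtomicUsize = AtomicUsize::new(0);

// Stand-in for the C function that allocates and returns a string.
unsafe fn hello() -> *const c_char {
    CString::new("Hello World!").unwrap().into_raw() as *const c_char
}

// Stand-in for the C function that releases the string.
unsafe fn goodbye(s: *const c_char) {
    drop(unsafe { CString::from_raw(s as *mut c_char) });
    FREED.fetch_add(1, Ordering::SeqCst);
}

struct Greeting {
    message: *const c_char,
}

impl Greeting {
    fn new() -> Greeting {
        Greeting { message: unsafe { hello() } }
    }
}

impl Drop for Greeting {
    fn drop(&mut self) {
        unsafe {
            goodbye(self.message);
        }
    }
}

impl Deref for Greeting {
    type Target = str;

    fn deref(&self) -> &str {
        let c_str = unsafe { CStr::from_ptr(self.message) };
        c_str.to_str().unwrap()
    }
}

fn main() {
    {
        let greeting = Greeting::new();
        // Deref lets str methods be called directly on the wrapper.
        println!("{} ({} bytes)", &*greeting, greeting.len());
        println!("{}", greeting.to_uppercase());
    } // `greeting` is dropped here, which calls `goodbye` once.
    println!("goodbye() calls: {}", FREED.load(Ordering::SeqCst));
}
```

Any &str borrowed from the wrapper is tied to the wrapper's lifetime by the Deref signature, so the borrow checker rejects uses of the slice after the string has been released.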
There is also another type in this module called CString. It has the same relationship to CStr as String has to str: CString is an owned version of CStr. This means that it "holds" the handle to the allocation of the byte data, and dropping a CString frees the memory it owns (essentially, CString wraps a Vec<u8>, and it's the latter that will be dropped). Consequently, it is useful when you want to expose data allocated in Rust as a C string.
Unfortunately, C strings always end with a zero byte and can't contain one inside them, while Rust's &[u8] and Vec<u8> are exactly the opposite: they do not end with a zero byte and can contain any number of them inside. This means that going from Vec<u8> to CString is neither error-free nor allocation-free. The CString constructor checks for zero bytes inside the data you provide, returning an error if it finds any, and appends a zero byte to the end of the byte vector, which may require a reallocation.
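The error returned by the constructor is a NulError, which reports where the first zero byte was found and can hand the original buffer back. As a minimal sketch, assuming that truncating at the first zero byte is acceptable for your data, a fallback could look like this (cstring_truncating is a hypothetical helper name):

```rust
use std::ffi::CString;

/// Hypothetical helper: builds a `CString` from arbitrary bytes by cutting
/// the input at the first zero byte instead of failing.
fn cstring_truncating(bytes: Vec<u8>) -> CString {
    match CString::new(bytes) {
        Ok(s) => s,
        Err(e) => {
            let pos = e.nul_position(); // index of the first zero byte
            let mut bytes = e.into_vec(); // recover the original buffer
            bytes.truncate(pos);
            // Nothing before `pos` is zero, so this second attempt succeeds.
            CString::new(bytes).expect("no zero bytes remain")
        }
    }
}

fn main() {
    let err = CString::new(vec![1u8, 2, 3, 0, 4, 5, 0, 6]).unwrap_err();
    println!("first zero byte at index {}", err.nul_position());

    let s = cstring_truncating(b"abc\0def".to_vec());
    println!("{:?}", s.as_bytes());
    // The terminator was appended by the constructor.
    println!("{:?}", s.as_bytes_with_nul());
}
```

Whether silent truncation is appropriate depends on the data; for anything security-sensitive, propagating the NulError to the caller is usually the safer choice.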
Like String, which implements Deref<Target = str>, CString implements Deref<Target = CStr>, so you can call methods defined on CStr directly on a CString. This is important because the as_ptr() method, which returns the *const c_char needed for C interoperation, is defined on CStr. You can call this method directly on CString values, which is convenient.
CString can be created from anything that can be converted to Vec<u8>. String, &str, Vec<u8> and &[u8] are all valid arguments for the constructor function, CString::new(). Naturally, if you pass a byte slice or a string slice, a new allocation will be created, while a Vec<u8> or String will be consumed.
extern crate libc;

use libc::c_char;
use std::ffi::CString;

fn main() {
    let c_str_1 = CString::new("hello").unwrap(); // from a &str, creates a new allocation
    let c_str_2 = CString::new(b"world" as &[u8]).unwrap(); // from a &[u8], creates a new allocation
    let data: Vec<u8> = b"12345678".to_vec(); // from a Vec<u8>, consumes it
    let c_str_3 = CString::new(data).unwrap();

    // and now you can obtain a pointer to a valid zero-terminated string
    // make sure you don't use it after c_str_2 is dropped
    let c_ptr: *const c_char = c_str_2.as_ptr();

    // the following will print an error message because the source data
    // contains zero bytes
    let data: Vec<u8> = vec![1, 2, 3, 0, 4, 5, 0, 6];
    match CString::new(data) {
        Ok(c_str_4) => println!("Got a C string: {:p}", c_str_4.as_ptr()),
        Err(e) => println!("Error getting a C string: {}", e),
    }
}
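The warning about not using the pointer after the CString is dropped matters most when the CString is a temporary. The sketch below illustrates the safe pattern of binding the CString to a variable before taking its pointer. The function c_side_len is a hypothetical stand-in written in Rust; real code would declare the actual C function in an extern block instead.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Stand-in for a C function that reads a zero-terminated string and
// returns its length in bytes.
extern "C" fn c_side_len(s: *const c_char) -> usize {
    let c_str = unsafe { CStr::from_ptr(s) };
    c_str.to_bytes().len()
}

fn main() {
    // Correct: `owned` lives until the end of `main`, so the pointer
    // stays valid for the whole call.
    let owned = CString::new("hello").unwrap();
    let len = c_side_len(owned.as_ptr());
    println!("{}", len);

    // Wrong (left commented out): the temporary CString is dropped at the
    // end of the statement, so `dangling` would point to freed memory.
    // let dangling = CString::new("hello").unwrap().as_ptr();
}
```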
If you need to transfer ownership of the CString to C code, you can call CString::into_raw. You are then required to get the pointer back and free it in Rust, because the Rust allocator is unlikely to be the same as the allocator used by malloc and free. All you need to do is call CString::from_raw and then allow the string to be dropped normally.
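A minimal sketch of that round trip follows. The function names make_greeting and free_greeting are hypothetical, and the C side is simulated by reading the pointer through CStr. To actually export these functions to C you would also need to disable name mangling, which is omitted here.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hands ownership of a Rust-allocated string to the caller.
pub extern "C" fn make_greeting() -> *mut c_char {
    CString::new("Hello from Rust").unwrap().into_raw()
}

/// Takes ownership back so Rust can free the allocation.
///
/// # Safety
/// `ptr` must be null or a pointer obtained from `make_greeting` that has
/// not been freed yet.
pub unsafe extern "C" fn free_greeting(ptr: *mut c_char) {
    if ptr.is_null() {
        return;
    }
    // Rebuilding the CString and dropping it releases the memory.
    drop(unsafe { CString::from_raw(ptr) });
}

fn main() {
    let ptr = make_greeting();

    // C code would use the string here; reading it via CStr simulates that.
    let seen = unsafe { CStr::from_ptr(ptr) }.to_str().unwrap().to_owned();
    println!("{}", seen);

    // The pointer goes back to Rust exactly once.
    unsafe { free_greeting(ptr) };
}
```

The C code must not call free() on the pointer and must not change the string's length, since CString::from_raw recomputes the length from the zero terminator.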
Answered by Des Nerger
In addition to what @vladimir-matveev has said, you can also convert between them without the aid of CStr or CString:
#![feature(link_args)]

extern crate libc;
use libc::{c_char, puts, strlen};
use std::{slice, str};

#[link_args = "-L . -I . -lmylib"]
extern "C" {
    fn hello() -> *const c_char;
}

fn main() {
    // converting a C string into a Rust string:
    let s = unsafe {
        let c_s = hello();
        str::from_utf8_unchecked(slice::from_raw_parts(c_s as *const u8, strlen(c_s) + 1))
    };
    println!("s == {:?}", s);

    // and back:
    unsafe {
        puts(s.as_ptr() as *const c_char);
    }
}
Just make sure that when converting from a &str to a C string, your &str ends with '\0'.
Notice that in the code above I use strlen(c_s)+1 instead of strlen(c_s), so s is "Hello World!\0", not just "Hello World!".
(Of course in this particular case it works even with just strlen(c_s). But with a fresh &str you couldn't guarantee that the resulting C string would terminate where expected.)
Here's the result of running the code:
s == "Hello World!\u{0}"
Hello World!
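The requirement that the &str end with '\0' can be checked at run time instead of being left to the caller. As a hedged sketch, the standard library's CStr::from_bytes_with_nul accepts a byte slice only if it ends with a zero byte and contains no other; the helper as_c_ptr is a hypothetical name, not part of the answer above.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical helper: returns a pointer suitable for C only if `s` ends
/// with a zero byte and contains no zero byte before that.
/// The pointer is valid only as long as `s` is.
fn as_c_ptr(s: &str) -> Option<*const c_char> {
    CStr::from_bytes_with_nul(s.as_bytes())
        .ok()
        .map(|c_str| c_str.as_ptr())
}

fn main() {
    // Properly terminated: accepted.
    println!("{}", as_c_ptr("Hello World!\0").is_some());
    // Missing terminator: rejected.
    println!("{}", as_c_ptr("Hello World!").is_some());
    // Interior zero byte: rejected.
    println!("{}", as_c_ptr("Hello\0World!\0").is_some());
}
```

This keeps the allocation-free property of the approach above while avoiding the case where C reads past the end of the Rust string.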

