C++ QString::split() and the "\r", "\n", and "\r\n" conventions

Question

Asked by sashoalm

I understand that QString::split should be used to get a QStringList from a multiline QString. But if I have a file and I don't know whether it comes from Mac, Windows, or Unix, I'm not sure that QString.split("\n") would work well in all cases. What is the best way to handle this situation?

Answer 1

Answered by Emanuele Bezzi

If it's acceptable to remove blank lines, you can try:

QStringList lines = text.split(QRegExp("[\r\n]"), QString::SkipEmptyParts);

This splits the string wherever a newline character (either line feed or carriage return) is found. Consecutive line breaks (e.g. \r\n\r\n or \n\n) are treated as multiple delimiters with empty parts between them, and those empty parts are skipped.
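
To make the effect of skipping empty parts concrete, here is a minimal Qt-free sketch of the same idea using only the C++ standard library. The helper name splitSkippingEmpty is hypothetical and is not part of Qt; it only imitates the behavior described above and is not a drop-in replacement for QString::split.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `text` at every '\r' or '\n' and drop empty
// pieces, imitating a split on [\r\n] that skips empty parts.
std::vector<std::string> splitSkippingEmpty(const std::string& text) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (start <= text.size()) {
        // Position of the next line-break character, or npos if none is left.
        std::string::size_type end = text.find_first_of("\r\n", start);
        if (end == std::string::npos) {
            end = text.size();
        }
        if (end > start) {  // empty parts are skipped
            parts.push_back(text.substr(start, end - start));
        }
        start = end + 1;
    }
    return parts;
}
```

For the input "a\r\n\r\nb" this yields only "a" and "b": the blank line between them is lost, which is fine only if blank lines carry no meaning in the data.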

Answer 2

Answered by Keith Thompson

Emanuele Bezzi's answer misses a couple of points.

In most cases, a string read from a text file will have been read using a text stream, which automatically translates the OS's end-of-line representation to a single '\n' character. So if you're dealing with native text files, '\n' should be the only delimiter you need to worry about. For example, if your program is running on a Windows system and reading input in text mode, line endings will be marked in memory with single \n characters; you'll never see the "\r\n" pairs that exist in the file.

But sometimes you do need to deal with "foreign" text files.

Ideally, you should probably translate any such files to the local format before reading them, which avoids the issue. Only the translation utility needs to be aware of variant line endings; everything else just deals with text.

But that's not always possible; sometimes you might want your program to handle Windows text files when running on a POSIX system (Linux, UNIX, etc.), or vice versa.

A Windows-format text file on a POSIX system will appear to have an extra '\r' character at the end of each line.

A POSIX-format text file on a Windows system will appear to consist of one very long line with embedded '\n' characters.

The most general approach is to read the file in binary mode and deal with the line endings explicitly.
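
As a rough illustration of that approach, the sketch below reads a file with std::ios::binary, so no end-of-line translation happens, and then rewrites every "\r\n" and bare '\r' as '\n'. It uses standard C++ rather than Qt, and both helper names (readFileBinary, normalizeLineEndings) are hypothetical.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: read the whole file without end-of-line translation.
// Returns an empty string if the file cannot be opened.
std::string readFileBinary(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: rewrite "\r\n" and bare '\r' as a single '\n'.
std::string normalizeLineEndings(const std::string& raw) {
    std::string out;
    out.reserve(raw.size());
    for (std::string::size_type i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r') {
            out.push_back('\n');
            // Treat "\r\n" as one line break by skipping the '\n'.
            if (i + 1 < raw.size() && raw[i + 1] == '\n') {
                ++i;
            }
        } else {
            out.push_back(raw[i]);
        }
    }
    return out;
}
```

After normalization, '\n' is the only delimiter left, so the text can be split on "\n" alone and blank lines survive.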

I'm not familiar with QString.split, but I suspect that this:

QStringList lines = text.split(QRegExp("[\r\n]"), QString::SkipEmptyParts);

will ignore empty lines, which will appear either as "\n\n" or as "\r\n\r\n", depending on the format. Empty lines are perfectly valid text data; you shouldn't ignore them unless you're certain that it makes sense to do so.

If you need to deal with text input delimited by "\n", "\r\n", or "\r", then I think something like this:

QStringList lines = text.split(QRegExp("\n|\r\n|\r"));

would do the job. (Thanks to parsley72's comment for helping me with the regular expression syntax.)
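
For readers who want to try the pattern outside Qt, here is a minimal sketch of the same alternation using std::regex from the C++ standard library. The helper name splitLines is hypothetical. In Qt itself, the pattern would be passed to QString::split instead (newer Qt versions use QRegularExpression, since QRegExp is no longer part of the core library there).

```cpp
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split at "\r\n", '\r', or '\n' and keep empty lines.
std::vector<std::string> splitLines(const std::string& text) {
    // "\r\n" is listed before "\r" so a Windows line ending is one delimiter.
    static const std::regex lineBreak(R"(\r\n|\r|\n)");
    // Submatch index -1 selects the text between matches.
    std::sregex_token_iterator first(text.begin(), text.end(), lineBreak, -1);
    std::sregex_token_iterator last;
    return std::vector<std::string>(first, last);
}
```

The order of the alternatives matters in one respect: if "\r" were tried before "\r\n", each Windows line ending would be split twice and produce a spurious empty line.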

Another point: you're not likely to encounter text files that use just '\r' to delimit lines. That's the format used by Mac OS up to version 9. Mac OS X is based on UNIX, and it uses standard UNIX-style '\n' line endings (though it probably tolerates '\r' line endings as well).
