Original source: http://stackoverflow.com/questions/11876290/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
C++: fastest way to read only the last line of a text file?
Asked by user788171
I would like to read only the last line of a text file (I'm on UNIX, can use Boost). All the methods I know require scanning through the entire file to get the last line which is not efficient at all. Is there an efficient way to get only the last line?
Also, I need this to be robust enough that it works even if the text file in question is constantly being appended to by another process.
Answer by derpface
Use seekg to jump to the end of the file, then read back until you find the first newline. Below is some sample code off the top of my head using MSVC.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;

int main()
{
    string filename = "test.txt";
    ifstream fin;
    fin.open(filename);
    if(fin.is_open()) {
        fin.seekg(-1,ios_base::end);                // go to one spot before the EOF

        bool keepLooping = true;
        while(keepLooping) {
            char ch;
            fin.get(ch);                            // Get current byte's data

            if((int)fin.tellg() <= 1) {             // If the data was at or before the 0th byte
                fin.seekg(0);                       // The first line is the last line
                keepLooping = false;                // So stop there
            }
            else if(ch == '\n') {                   // If the data was a newline
                keepLooping = false;                // Stop at the current position.
            }
            else {                                  // If the data was neither a newline nor at the 0 byte
                fin.seekg(-2,ios_base::cur);        // Move to the front of that data, then to the front of the data before it
            }
        }

        string lastLine;
        getline(fin,lastLine);                      // Read the current line
        cout << "Result: " << lastLine << '\n';     // Display it

        fin.close();
    }

    return 0;
}
And below is a test file. It succeeds with empty, one-line, and multi-line data in the text file.
This is the first line.
Some stuff.
Some stuff.
Some stuff.
This is the last line.
Answer by Will Hartung
Jump to the end, and start reading blocks backwards until you find whatever your criterion for a line is. If the last block doesn't "end" with a line, you'll probably need to try and scan forward as well (assuming a really long line in a file that is actively being appended to).
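When another process is appending, the final bytes of the file may be a line that is only partly written. One possible reading of "robust" here, and it is an assumption rather than something the answer spells out, is to return only the last line that is already terminated by '\n'. The helper name lastCompleteLine below is hypothetical, and the sketch scans byte by byte for clarity rather than speed.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: stores the last line that is already terminated by
// '\n' in `line` and returns true; returns false if there is no such line.
// Unterminated text after the final '\n' is treated as still being written.
bool lastCompleteLine(const std::string& path, std::string& line) {
    std::ifstream fs(path.c_str(), std::ios::binary);
    if (!fs.is_open()) return false;

    fs.seekg(0, std::ios_base::end);
    const std::streamoff size = fs.tellg();    // size at this moment only
    if (size <= 0) return false;               // error or empty file

    // Find the last '\n' in the file: it terminates the last complete line.
    std::streamoff term = size - 1;
    for (; term >= 0; --term) {
        fs.seekg(term, std::ios_base::beg);
        if (fs.peek() == '\n') break;
    }
    if (term < 0) return false;                // no terminated line yet

    // Find the '\n' before it; the line starts right after that one.
    std::streamoff prev = term - 1;
    for (; prev >= 0; --prev) {
        fs.seekg(prev, std::ios_base::beg);
        if (fs.peek() == '\n') break;
    }

    fs.seekg(prev + 1, std::ios_base::beg);    // prev == -1 means byte 0
    std::getline(fs, line);                    // stops at the '\n' at `term`
    return true;
}
```

Because the size is sampled once, anything appended after that point is simply picked up by the next call.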
Answer by alexandros
Initially this was designed to read the last syslog entry. Given that the last character before the EOF is '\n', we seek back to find the next occurrence of '\n' and then we store the line into a string.
#include <fstream>
#include <iostream>
#include <string>

int main()
{
    const std::string filename = "test.txt";
    std::ifstream fs;
    fs.open(filename.c_str(), std::fstream::in);
    if(fs.is_open())
    {
        // Go to the last character before EOF
        fs.seekg(-1, std::ios_base::end);
        if(fs.peek() == '\n')
        {
            // Start searching for \n occurrences, beginning one character back
            fs.seekg(-1, std::ios_base::cur);
            for(std::streamoff i = fs.tellg(); i >= 0; i--)
            {
                // Move to character i before testing it
                fs.seekg(i, std::ios_base::beg);
                if(fs.peek() == '\n')
                {
                    // Found: step past it to the start of the last line
                    fs.get();
                    break;
                }
            }
        }
        std::string lastline;
        getline(fs, lastline);
        std::cout << lastline << std::endl;
    }
    else
    {
        std::cout << "Could not open file" << std::endl;
    }
    return 0;
}
Answer by Joost Huizinga
While the answer by derpface is definitely correct, it often returns unexpected results. The reason for this is that, at least on my operating system (Mac OSX 10.9.5), many text editors terminate their files with an 'end line' character.
For example, when I open vim, type just the single character 'a' (no return), and save, the file will now contain (in hex):
61 0A
Where 61 is the letter 'a' and 0A is an end of line character.
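To check what a given file really ends with, its last few bytes can be dumped in hex from C++ as well. This is a minimal sketch, not part of the original answer; the helper name tailHex is hypothetical.

```cpp
#include <cstdio>
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: returns the last `count` bytes of `path` as
// space-separated uppercase hex, for example "61 0A".
std::string tailHex(const std::string& path, std::streamoff count) {
    std::ifstream fin(path.c_str(), std::ios::binary);
    if (!fin.is_open() || count <= 0) return "";

    fin.seekg(0, std::ios_base::end);
    const std::streamoff size = fin.tellg();
    if (size <= 0) return "";                  // error or empty file

    // Start `count` bytes before the end, or at byte 0 for short files.
    const std::streamoff start = size > count ? size - count : 0;
    fin.seekg(start, std::ios_base::beg);

    std::string out;
    char ch;
    while (fin.get(ch)) {
        char hex[4];
        std::snprintf(hex, sizeof hex, "%02X",
                      static_cast<unsigned>(static_cast<unsigned char>(ch)));
        if (!out.empty()) out += ' ';
        out += hex;
    }
    return out;
}
```

If the string returned for a file ends in 0A, the file is terminated by an end-of-line character.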
This means that the code by derpface will return an empty string on all files created by such a text editor.
While I can certainly imagine cases where a file terminated with an 'end line' should return the empty string, I think ignoring the last 'end line' character would be more appropriate when dealing with regular text files; if the file is terminated by an 'end line' character we properly ignore it, and if the file is not terminated by an 'end line' character we don't need to check it.
My code for ignoring the last character of the input file is:
#include <iostream>
#include <string>
#include <fstream>
#include <iomanip>

int main() {
    std::string result = "";
    std::ifstream fin("test.txt");

    if(fin.is_open()) {
        fin.seekg(0,std::ios_base::end);            //Start at end of file
        char ch = ' ';                              //Init ch not equal to '\n'
        while(ch != '\n'){
            fin.seekg(-2,std::ios_base::cur);       //Two steps back, this means we
                                                    //will NOT check the last character
            if((int)fin.tellg() <= 0){              //If passed the start of the file,
                fin.clear();                        //(reset the error left by a failed seek)
                fin.seekg(0);                       //this is the start of the line
                break;
            }
            fin.get(ch);                            //Check the next character
        }

        std::getline(fin,result);
        fin.close();

        std::cout << "final line length: " << result.size() <<std::endl;
        std::cout << "final line character codes: ";
        for(size_t i =0; i<result.size(); i++){
            std::cout << std::hex << (int)result[i] << " ";
        }
        std::cout << std::endl;
        std::cout << "final line: " << result <<std::endl;
    }

    return 0;
}
Which will output:
final line length: 1
final line character codes: 61
final line: a
On the single 'a' file.
EDIT: The line if((int)fin.tellg() <= 0){ actually causes problems if the file is too large (> 2GB), because tellg does not just return the number of characters from the start of the file (see the question "tellg() function give wrong size of file?"). It may be better to test separately for the start of the file, fin.tellg()==tellgValueForStartOfFile, and for errors, fin.tellg()==-1. The tellgValueForStartOfFile is probably 0, but a better way of making sure would probably be:
fin.seekg(0, std::ios_base::beg);
tellgValueForStartOfFile = fin.tellg();
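Another option, offered here as a sketch rather than as part of the original answer, is to avoid the int cast altogether by keeping every offset in a std::streamoff and always seeking from the beginning of the file. The helper name lastLineByOffset is hypothetical; like the code above, it ignores one trailing '\n'.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: returns the last line of `path`, ignoring one
// trailing '\n'. Offsets are kept in std::streamoff, never in int.
std::string lastLineByOffset(const std::string& path) {
    std::ifstream fin(path.c_str(), std::ios::binary);
    if (!fin.is_open()) return "";

    fin.seekg(0, std::ios_base::end);
    const std::streamoff size = fin.tellg();   // -1 on error
    if (size <= 0) return "";                  // error or empty file

    // Start at the last byte, skipping it if it is a newline.
    std::streamoff pos = size - 1;
    fin.seekg(pos, std::ios_base::beg);
    if (fin.peek() == '\n') --pos;

    // Walk backwards until the byte at `pos` is a newline or we pass byte 0.
    while (pos >= 0) {
        fin.seekg(pos, std::ios_base::beg);
        if (fin.peek() == '\n') break;
        --pos;
    }

    // The line starts one byte after the newline (byte 0 if pos == -1).
    fin.seekg(pos + 1, std::ios_base::beg);
    std::string line;
    std::getline(fin, line);
    return line;
}
```

Since the loop compares its own counter against 0 and never compares the result of tellg() against the start of the file, no separate tellgValueForStartOfFile is needed in this variant.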
Answer by carter2000
You can use seekg() to jump to the end of the file and read backward. The pseudocode looks like this:
ifstream fs
fs.seekg(0, ios_base::end)
bytecount = fs.tellg()
index = 1
while true
    fs.seekg(bytecount - step * index, ios_base::beg)
    fs.read(buf, step)
    if endlinecharacter in buf
        get the index of the last endlinecharacter in buf, call it ei
        fs.seekg(bytecount - step * index + ei + 1, ios_base::beg)
        fs.read(lastline, step * index - ei - 1)
        break
    ++index
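The pseudocode above can be turned into a runnable C++ sketch along the following lines. The function name lastLineByBlocks and the default block size of 4096 bytes are assumptions made for illustration; the sketch also clamps the first block at byte 0 and ignores a single trailing '\n', two details the pseudocode leaves open.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: reads the file backwards in blocks of `step` bytes
// and returns the text after the last '\n' (one trailing '\n' is ignored).
std::string lastLineByBlocks(const std::string& path, std::streamoff step = 4096) {
    std::ifstream fs(path.c_str(), std::ios::binary);
    if (!fs.is_open() || step <= 0) return "";

    fs.seekg(0, std::ios_base::end);
    std::streamoff end = fs.tellg();           // snapshot of the current size
    if (end <= 0) return "";                   // error or empty file

    // Ignore a single trailing newline so that "a\n" yields "a".
    fs.seekg(end - 1, std::ios_base::beg);
    if (fs.peek() == '\n') --end;

    std::string tail;                          // bytes collected so far
    std::vector<char> buf(static_cast<std::size_t>(step));
    std::streamoff blockEnd = end;
    while (blockEnd > 0) {
        const std::streamoff blockStart = std::max<std::streamoff>(0, blockEnd - step);
        const std::streamsize n = static_cast<std::streamsize>(blockEnd - blockStart);
        fs.seekg(blockStart, std::ios_base::beg);
        fs.read(buf.data(), n);
        if (fs.gcount() != n) return "";       // short read: give up

        // Search this block for the newline nearest to its end.
        const std::string block(buf.data(), static_cast<std::size_t>(n));
        const std::string::size_type nl = block.rfind('\n');
        if (nl != std::string::npos) {
            return block.substr(nl + 1) + tail;
        }
        tail = block + tail;                   // no newline yet: keep going back
        blockEnd = blockStart;
    }
    return tail;                               // the whole file is one line
}
```

A call such as lastLineByBlocks("test.txt") reads at most a few blocks for typical log files; a smaller step mainly helps when exercising the block boundaries in tests.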
Answer by Gary Yang
I was also struggling with this problem because I ran uberwulu's code and also got a blank line. Here is what I found. I am using the following .csv file as an example:
date test1 test2
20140908 1 2
20140908 11 22
20140908 111 235
To understand the commands in the code, please notice the following locations and their corresponding chars. (Loc, char) : ... (63,'3') , (64,'5') , (65,-) , (66,'\n'), (EOF,-).
#include<iostream>
#include<string>
#include<fstream>
#include<stdexcept>
using namespace std;

int main()
{
    std::string line;
    std::ifstream infile;
    std::string filename = "C:/projects/MyC++Practice/Test/testInput.csv";
    infile.open(filename);

    if(infile.is_open())
    {
        char ch;
        infile.seekg(-1, std::ios::end);            // move to location 65
        infile.get(ch);                             // get next char at loc 66
        if (ch == '\n')
        {
            infile.seekg(-2, std::ios::cur);        // move to loc 64 for get() to read loc 65
            infile.seekg(-1, std::ios::cur);        // move to loc 63 to avoid reading loc 65
            infile.get(ch);                         // get the char at loc 64 ('5')
            while(ch != '\n')                       // read each char backward till the next '\n'
            {
                infile.seekg(-2, std::ios::cur);
                if (!infile.get(ch))                // ran past the start of the file: only one line
                {
                    infile.clear();
                    infile.seekg(0, std::ios::beg);
                    break;
                }
            }
            string lastLine;
            std::getline(infile,lastLine);
            cout << "The last line : " << lastLine << '\n';
        }
        else
            throw std::runtime_error("check .csv file format");
    }
    std::cin.get();
    return 0;
}
Answer by Daniel Duvilanski
I took alexandros' solution and spruced it up a bit
#include <fstream>
#include <iostream>
#include <string>

bool moveToStartOfLine(std::ifstream& fs)
{
    fs.seekg(-1, std::ios_base::cur);
    for(std::streamoff i = fs.tellg(); i >= 0; i--)
    {
        // Move to character i before testing it
        fs.seekg(i, std::ios_base::beg);
        if(fs.peek() == '\n')
        {
            fs.get();
            return true;
        }
    }
    return false;
}

std::string getLastLineInFile(std::ifstream& fs)
{
    // Go to the last character before EOF
    fs.seekg(-1, std::ios_base::end);
    if (!moveToStartOfLine(fs))
        return "";

    std::string lastline = "";
    getline(fs, lastline);
    return lastline;
}

int main()
{
    const std::string filename = "test.txt";
    std::ifstream fs;
    fs.open(filename.c_str(), std::fstream::in);
    if(!fs.is_open())
    {
        std::cout << "Could not open file" << std::endl;
        return -1;
    }

    std::cout << getLastLineInFile(fs) << std::endl;

    return 0;
}