C++: How to get IOStream to perform better?
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/5166263/
Asked by Matthieu M.
Most C++ users that learned C prefer to use the printf / scanf family of functions even when they're coding in C++.
Although I admit that I find the interface way better (especially POSIX-like format and localization), it seems that an overwhelming concern is performance.
Taking a look at this question:
It seems that the best answer is to use fscanf and that the C++ ifstream is consistently 2-3 times slower.
I thought it would be great if we could compile a repository of "tips" to improve IOStreams performance: what works and what does not.
Points to consider
- buffering (rdbuf()->pubsetbuf(buffer, size))
- synchronization (std::ios_base::sync_with_stdio)
- locale handling (Could we use a trimmed-down locale, or remove it altogether?)
Of course, other approaches are welcome.
Note: a "new" implementation, by Dietmar Kuhl, was mentioned, but I was unable to locate many details about it. Previous references seem to be dead links.
Accepted answer by Matthieu M.
Here is what I have gathered so far:
Buffering:
If by default the buffer is very small, increasing the buffer size can definitely improve the performance:
- it reduces the number of HDD hits
- it reduces the number of system calls
The buffer can be set by accessing the underlying streambuf implementation.
char Buffer[N];
std::ifstream file("file.txt");
file.rdbuf()->pubsetbuf(Buffer, N);
// the pointer returned by rdbuf() is guaranteed
// to be non-null after successful construction
Warning courtesy of @iavr: according to cppreference it is best to call pubsetbuf before opening the file. Various standard library implementations otherwise have different behaviors.
Locale Handling:
Locales can perform character conversion, filtering, and more clever tricks where numbers or dates are involved. They go through a complex system of dynamic dispatch and virtual calls, so removing them can help trim down the performance penalty.
The default C locale is meant to perform no conversion and to be uniform across machines. It's a good default to use.
Synchronization:
I could not see any performance improvement using this facility.
One can access a global setting (static member of std::ios_base) using the sync_with_stdio static function.
Measurements:
Playing with this, I have toyed with a simple program, compiled using gcc 3.4.2 on SUSE 10p3 with -O2.
C : 7.76532e+06
C++: 1.0874e+07
Which represents a slowdown of about 40% (1.0874e+07 / 7.76532e+06 ≈ 1.40) for the default code. Indeed, tampering with the buffer (in either C or C++) or the synchronization parameters (C++) did not yield any improvement.
Results by others:
@Irfy on g++ 4.7.2-2ubuntu1, -O3, virtualized Ubuntu 11.10, 3.5.0-25-generic, x86_64, enough ram/cpu, 196MB of several "find / >> largefile.txt" runs
C : 634572
C++: 473222
C++ 25% faster
@Matteo Italia on g++ 4.4.5, -O3, Ubuntu Linux 10.10 x86_64 with a random 180 MB file
C : 910390
C++: 776016
C++ 17% faster
@Bogatyr on g++ i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664), mac mini, 4GB ram, idle except for this test with a 168MB datafile
C : 4.34151e+06
C++: 9.14476e+06
C++ 111% slower
@Asu on clang++ 3.8.0-2ubuntu4, Kubuntu 16.04 Linux 4.8-rc3, 8GB ram, i5 Haswell, Crucial SSD, 88MB datafile (tar.xz archive)
C : 270895
C++: 162799
C++ 66% faster
So the answer is: it's a quality of implementation issue, and really depends on the platform :/
Here is the code in full for those interested in benchmarking:
#include <clocale>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <string>
#include <sys/time.h>

// Runs f once as a warm-up, then times `iterations` further runs.
// Returns the elapsed wall-clock time in microseconds.
template <typename Func>
double benchmark(Func f, size_t iterations)
{
    f();

    timeval a, b;
    gettimeofday(&a, 0);
    for (; iterations --> 0;)
    {
        f();
    }
    gettimeofday(&b, 0);
    return (b.tv_sec * (unsigned int)1e6 + b.tv_usec) -
           (a.tv_sec * (unsigned int)1e6 + a.tv_usec);
}

// Reads the file word by word with fscanf.
struct CRead
{
    CRead(char const* filename): _filename(filename) {}

    void operator()() {
        FILE* file = fopen(_filename, "r");
        if (!file) { return; } // nothing to read if the file cannot be opened

        int count = 0;
        // the field width keeps fscanf from overflowing _buffer
        while ( fscanf(file, "%1023s", _buffer) == 1 ) { ++count; }

        fclose(file);
    }

    char const* _filename;
    char _buffer[1024];
};

// Reads the file word by word with std::ifstream.
struct CppRead
{
    CppRead(char const* filename): _filename(filename), _buffer() {}

    enum { BufferSize = 16184 };

    void operator()() {
        std::ifstream file(_filename, std::ifstream::in);

        // comment to remove extended buffer
        file.rdbuf()->pubsetbuf(_buffer, BufferSize);

        int count = 0;
        std::string s;
        while ( file >> s ) { ++count; }
    }

    char const* _filename;
    char _buffer[BufferSize];
};

int main(int argc, char* argv[])
{
    size_t iterations = 1;
    if (argc > 1) { iterations = atoi(argv[1]); }

    // setlocale returns the locale now in effect, so query the old one first
    std::string const oldLocale = setlocale(LC_ALL, NULL);
    setlocale(LC_ALL, "C");
    if (oldLocale != "C") {
        std::cout << "Replaced old locale '" << oldLocale << "' by 'C'\n";
    }

    char const* filename = "largefile.txt";

    CRead cread(filename);
    CppRead cppread(filename);

    // comment to use the default setting
    bool oldSyncSetting = std::ios_base::sync_with_stdio(false);

    double ctime = benchmark(cread, iterations);
    double cpptime = benchmark(cppread, iterations);

    // comment if oldSyncSetting's declaration is commented
    std::ios_base::sync_with_stdio(oldSyncSetting);

    std::cout << "C  : " << ctime << "\n"
                 "C++: " << cpptime << "\n";

    return 0;
}
Answer by gaazkam
Two more improvements:
Issue std::cin.tie(nullptr); before heavy input/output.
Quoting http://en.cppreference.com/w/cpp/io/cin:
Once std::cin is constructed, std::cin.tie() returns &std::cout, and likewise, std::wcin.tie() returns &std::wcout. This means that any formatted input operation on std::cin forces a call to std::cout.flush() if any characters are pending for output.
You can avoid flushing the buffer by untying std::cin from std::cout. This is relevant with multiple mixed calls to std::cin and std::cout. Note that calling std::cin.tie(nullptr); makes the program unsuitable to run interactively by a user, since output may be delayed.
Relevant benchmark:
File test1.cpp:
#include <iostream>
using namespace std;
int main()
{
    ios_base::sync_with_stdio(false);
    int i;
    while(cin >> i)
        cout << i << '\n';
}
File test2.cpp:
#include <iostream>
using namespace std;
int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(nullptr);
    int i;
    while(cin >> i)
        cout << i << '\n';
    cout.flush();
}
Both compiled by g++ -O2 -std=c++11. Compiler version: g++ (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4 (yeah, I know, pretty old).
Benchmark results:
work@mg-K54C ~ $ time ./test1 < test.in > test1.in
real 0m3.140s
user 0m0.581s
sys 0m2.560s
work@mg-K54C ~ $ time ./test2 < test.in > test2.in
real 0m0.234s
user 0m0.234s
sys 0m0.000s
(test.in consists of 1179648 lines each consisting only of a single 5. It's 2.4 MB, so sorry for not posting it here.)
I remember solving an algorithmic task where the online judge kept refusing my program without cin.tie(nullptr) but was accepting it with cin.tie(nullptr) or printf/scanf instead of cin/cout.
Use '\n' instead of std::endl.
Quoting http://en.cppreference.com/w/cpp/io/manip/endl:
Inserts a newline character into the output sequence os and flushes it as if by calling os.put(os.widen('\n')) followed by os.flush().
You can avoid flushing the buffer by printing '\n' instead of endl.
Relevant benchmark:
File test1.cpp:
#include <iostream>
using namespace std;
int main()
{
    ios_base::sync_with_stdio(false);
    for(int i = 0; i < 1179648; ++i)
        cout << i << endl;
}
File test2.cpp:
#include <iostream>
using namespace std;
int main()
{
    ios_base::sync_with_stdio(false);
    for(int i = 0; i < 1179648; ++i)
        cout << i << '\n';
}
Both compiled as above.
Benchmark results:
work@mg-K54C ~ $ time ./test1 > test1.in
real 0m2.946s
user 0m0.404s
sys 0m2.543s
work@mg-K54C ~ $ time ./test2 > test2.in
real 0m0.156s
user 0m0.135s
sys 0m0.020s
Answer by CashCow
Interesting that you say C programmers prefer printf when writing C++, as I see a lot of code that is C apart from using cout and iostream to write the output.
Users can often get better performance by using filebuf directly (Scott Meyers mentioned this in Effective STL), but there is relatively little documentation on using filebuf directly, and most developers prefer std::getline, which is simpler most of the time.
With regards to locale, if you create facets you will often get better performance by creating a locale once with all your facets, keeping it stored, and imbuing it into each stream you use.
I did see another topic on this here recently, so this is close to being a duplicate.