Original question: http://stackoverflow.com/questions/17549123/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
C# performance - Using unsafe pointers instead of IntPtr and Marshal
Asked by kol
Question
I'm porting a C application to C#. The C app calls lots of functions from a 3rd-party DLL, so I wrote P/Invoke wrappers for these functions in C#. Some of these C functions allocate data which I have to use in the C# app, so I used IntPtr, Marshal.PtrToStructure and Marshal.Copy to copy the native data (arrays and structures) into managed variables.
Unfortunately, the C# app proved to be much slower than the C version. A quick performance analysis showed that the marshaling-based data copying mentioned above is the bottleneck. I'm considering speeding up the C# code by rewriting it to use pointers instead. Since I don't have experience with unsafe code and pointers in C#, I need expert opinion regarding the following questions:
- What are the drawbacks of using unsafe code and pointers instead of IntPtr and Marshaling? For example, is it more unsafe (pun intended) in any way? People seem to prefer marshaling, but I don't know why.
- Is using pointers for P/Invoking really faster than using marshaling? Approximately how much speedup can be expected? I couldn't find any benchmark tests for this.
Example code
To make the situation clearer, I hacked together a small code example (the real code is much more complex). I hope this example shows what I mean when I'm talking about "unsafe code and pointers" vs. "IntPtr and Marshal".
C library (DLL)
MyLib.h
#ifndef _MY_LIB_H_
#define _MY_LIB_H_

/* A length-prefixed byte buffer allocated and owned by the library. */
struct MyData
{
    int length;
    unsigned char* bytes;
};

__declspec(dllexport) void CreateMyData(struct MyData** myData, int length);
__declspec(dllexport) void DestroyMyData(struct MyData* myData);

#endif // _MY_LIB_H_
MyLib.c
#include <stdlib.h>
#include "MyLib.h"

/* Allocates a MyData and fills its buffer with the pattern 0, 1, ..., 255, 0, ... */
void CreateMyData(struct MyData** myData, int length)
{
    int i;
    *myData = (struct MyData*)malloc(sizeof(struct MyData));
    if (*myData != NULL)
    {
        (*myData)->length = length;
        (*myData)->bytes = (unsigned char*)malloc(length * sizeof(char));
        if ((*myData)->bytes != NULL)
            for (i = 0; i < length; ++i)
                (*myData)->bytes[i] = (unsigned char)(i % 256);
    }
}

/* Frees the buffer and the struct; a NULL argument is ignored. */
void DestroyMyData(struct MyData* myData)
{
    if (myData != NULL)
    {
        if (myData->bytes != NULL)
            free(myData->bytes);
        free(myData);
    }
}
C application
Main.c
#include <stdio.h>
#include "MyLib.h"

int main(void)
{
    struct MyData* myData = NULL;
    int length = 100 * 1024 * 1024;

    printf("=== C++ test ===\n");
    CreateMyData(&myData, length);
    if (myData != NULL)
    {
        printf("Length: %d\n", myData->length);
        /* The data is read in place; nothing is copied. */
        if (myData->bytes != NULL)
            printf("First: %d, last: %d\n", myData->bytes[0], myData->bytes[myData->length - 1]);
        else
            printf("myData->bytes is NULL\n");
    }
    else
        printf("myData is NULL\n");
    DestroyMyData(myData);
    getchar();
    return 0;
}
C# application, which uses IntPtr and Marshal
Program.cs
using System;
using System.Runtime.InteropServices;

public static class Program
{
    [StructLayout(LayoutKind.Sequential)]
    private struct MyData
    {
        public int Length;
        public IntPtr Bytes;
    }

    // MyLib.h does not specify a calling convention, so the exports use the
    // compiler default (cdecl); DllImport defaults to stdcall on Windows,
    // which matters on 32-bit x86.
    [DllImport("MyLib.dll", CallingConvention = CallingConvention.Cdecl)]
    private static extern void CreateMyData(out IntPtr myData, int length);

    [DllImport("MyLib.dll", CallingConvention = CallingConvention.Cdecl)]
    private static extern void DestroyMyData(IntPtr myData);

    public static void Main()
    {
        Console.WriteLine("=== C# test, using IntPtr and Marshal ===");
        int length = 100 * 1024 * 1024;
        IntPtr myData1;
        CreateMyData(out myData1, length);
        if (myData1 != IntPtr.Zero)
        {
            // First copy: the native struct is copied into a managed struct.
            MyData myData2 = (MyData)Marshal.PtrToStructure(myData1, typeof(MyData));
            Console.WriteLine("Length: {0}", myData2.Length);
            if (myData2.Bytes != IntPtr.Zero)
            {
                // Second copy: the whole native buffer is copied into a managed array.
                byte[] bytes = new byte[myData2.Length];
                Marshal.Copy(myData2.Bytes, bytes, 0, myData2.Length);
                Console.WriteLine("First: {0}, last: {1}", bytes[0], bytes[myData2.Length - 1]);
            }
            else
                Console.WriteLine("myData.Bytes is IntPtr.Zero");
        }
        else
            Console.WriteLine("myData is IntPtr.Zero");
        DestroyMyData(myData1);
        Console.ReadKey(true);
    }
}
C# application, which uses unsafe code and pointers
Program.cs
using System;
using System.Runtime.InteropServices;

public static class Program
{
    [StructLayout(LayoutKind.Sequential)]
    private unsafe struct MyData
    {
        public int Length;
        public byte* Bytes;
    }

    // MyLib.h does not specify a calling convention, so the exports use the
    // compiler default (cdecl); DllImport defaults to stdcall on Windows,
    // which matters on 32-bit x86.
    [DllImport("MyLib.dll", CallingConvention = CallingConvention.Cdecl)]
    private unsafe static extern void CreateMyData(out MyData* myData, int length);

    [DllImport("MyLib.dll", CallingConvention = CallingConvention.Cdecl)]
    private unsafe static extern void DestroyMyData(MyData* myData);

    public unsafe static void Main()
    {
        Console.WriteLine("=== C# test, using unsafe code ===");
        int length = 100 * 1024 * 1024;
        MyData* myData;
        CreateMyData(out myData, length);
        if (myData != null)
        {
            // The native memory is read in place; nothing is copied.
            Console.WriteLine("Length: {0}", myData->Length);
            if (myData->Bytes != null)
                Console.WriteLine("First: {0}, last: {1}", myData->Bytes[0], myData->Bytes[myData->Length - 1]);
            else
                Console.WriteLine("myData.Bytes is null");
        }
        else
            Console.WriteLine("myData is null");
        DestroyMyData(myData);
        Console.ReadKey(true);
    }
}
Accepted answer by Xan-Kun Clark-Davis
It's a slightly old thread, but I recently ran extensive performance tests with marshaling in C#. I need to unmarshal lots of data from a serial port over many days. It was important to me to have no memory leaks (because even the smallest leak becomes significant after a couple of million calls), and I also ran a lot of statistical performance (time used) tests with very big structs (>10 KB), just for the sake of it (and no, you should never have a 10 KB struct :-) ).
I tested the following three unmarshalling strategies (I also tested the marshalling). In nearly all cases the first one (MarshalMatters) outperformed the other two. Marshal.Copy was always the slowest by far; the other two were mostly very close together in the race.
Using unsafe code can pose a significant security risk.
First:
using System;
using System.Runtime.InteropServices;

public class MarshalMatters
{
    // Pins the array with "fixed" and unmarshals straight from the pinned address.
    public static T ReadUsingMarshalUnsafe<T>(byte[] data) where T : struct
    {
        unsafe
        {
            fixed (byte* p = &data[0])
            {
                return (T)Marshal.PtrToStructure(new IntPtr(p), typeof(T));
            }
        }
    }

    public unsafe static byte[] WriteUsingMarshalUnsafe<selectedT>(selectedT structure) where selectedT : struct
    {
        byte[] byteArray = new byte[Marshal.SizeOf(structure)];
        fixed (byte* byteArrayPtr = byteArray)
        {
            // fDeleteOld is false: the array is freshly allocated, so there is no old structure to destroy.
            Marshal.StructureToPtr(structure, (IntPtr)byteArrayPtr, false);
        }
        return byteArray;
    }
}
Second:
public class Adam_Robinson
{
    // Pins the array with a GCHandle instead of "fixed", so no unsafe context is needed.
    private static T BytesToStruct<T>(byte[] rawData) where T : struct
    {
        T result = default(T);
        GCHandle handle = GCHandle.Alloc(rawData, GCHandleType.Pinned);
        try
        {
            IntPtr rawDataPtr = handle.AddrOfPinnedObject();
            result = (T)Marshal.PtrToStructure(rawDataPtr, typeof(T));
        }
        finally
        {
            handle.Free();
        }
        return result;
    }

    /// <summary>
    /// no Copy. no unsafe. Gets a GCHandle to the memory via Alloc
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="structure"></param>
    /// <returns></returns>
    public static byte[] StructToBytes<T>(T structure) where T : struct
    {
        int size = Marshal.SizeOf(structure);
        byte[] rawData = new byte[size];
        GCHandle handle = GCHandle.Alloc(rawData, GCHandleType.Pinned);
        try
        {
            IntPtr rawDataPtr = handle.AddrOfPinnedObject();
            Marshal.StructureToPtr(structure, rawDataPtr, false);
        }
        finally
        {
            handle.Free();
        }
        return rawData;
    }
}
Third:
/// <summary>
/// http://stackoverflow.com/questions/2623761/marshal-ptrtostructure-and-back-again-and-generic-solution-for-endianness-swap
/// </summary>
public class DanB
{
    /// <summary>
    /// uses Marshal.Copy! Not run in unsafe. Uses AllocHGlobal to get new memory and copies.
    /// </summary>
    public static byte[] GetBytes<T>(T structure) where T : struct
    {
        var size = Marshal.SizeOf(structure); // or Marshal.SizeOf<T>() in .NET 4.5.1
        byte[] rawData = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            // fDeleteOld is false: the freshly allocated block holds no old structure to destroy.
            Marshal.StructureToPtr(structure, ptr, false);
            Marshal.Copy(ptr, rawData, 0, size);
        }
        finally
        {
            // Release the unmanaged block even if marshaling throws.
            Marshal.FreeHGlobal(ptr);
        }
        return rawData;
    }

    public static T FromBytes<T>(byte[] bytes) where T : struct
    {
        var structure = new T();
        int size = Marshal.SizeOf(structure); // or Marshal.SizeOf<T>() in .NET 4.5.1
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(bytes, 0, ptr, size);
            structure = (T)Marshal.PtrToStructure(ptr, structure.GetType());
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
        return structure;
    }
}
Answer by Palak.Maheria
Two answers:
- Unsafe code means it is not managed by the CLR. You need to take care of the resources it uses.
- You cannot put a number on the performance gain, because there are so many factors affecting it. But using pointers will definitely be much faster.
Answer by Uldis Valneris
Just wanted to add my experience to this old thread: we used marshaling in sound recording software. We received real-time sound data from the mixer into native buffers and marshaled it to byte[]. That was a real performance killer. We were forced to move to unsafe structs as the only way to complete the task.
If you don't have large native structs and don't mind that all data is filled twice, marshaling is the more elegant and much, much safer approach.
Answer by Ken Kin
Because you stated that your code calls into a 3rd-party DLL, I think the unsafe code is better suited to your scenario. You ran into the particular situation of wrapping a variable-length array in a struct; I know this kind of usage occurs all the time, but it's not always the case after all. You might want to have a look at some questions about this, for example:
How do I marshal a struct that contains a variable-sized array to C#?
If .. I say if .. you can modify the third-party libraries a bit for this particular case, then you might consider the following usage:
using System;
using System.Runtime.InteropServices;

public static class Program { /*
    [StructLayout(LayoutKind.Sequential)]
    private struct MyData {
        public int Length;
        public byte[] Bytes;
    } */

    [DllImport("MyLib.dll")]
    // __declspec(dllexport) void WINAPI CreateMyDataAlt(BYTE bytes[], int* length);
    // (ref int is passed as a pointer, so the native parameter has to be int*)
    private static extern void CreateMyDataAlt([In, Out] byte[] myData, ref int length);

    /*
    [DllImport("MyLib.dll")]
    private static extern void DestroyMyData(byte[] myData); */

    public static void Main() {
        Console.WriteLine("=== C# test, using IntPtr and Marshal ===");
        int length = 100*1024*1024;
        // The buffer is allocated on the managed side; the native code only fills it.
        var myData1 = new byte[length];
        CreateMyDataAlt(myData1, ref length);

        if(0!=length) {
            // MyData myData2 = (MyData)Marshal.PtrToStructure(myData1, typeof(MyData));

            Console.WriteLine("Length: {0}", length);

            /*
            if(myData2.Bytes!=IntPtr.Zero) {
                byte[] bytes = new byte[myData2.Length];
                Marshal.Copy(myData2.Bytes, bytes, 0, myData2.Length); */
            Console.WriteLine("First: {0}, last: {1}", myData1[0], myData1[length-1]); /*
            }
            else {
                Console.WriteLine("myData.Bytes is IntPtr.Zero");
            } */
        }
        else {
            Console.WriteLine("myData is empty");
        }

        // DestroyMyData(myData1);
        Console.ReadKey(true);
    }
}
As you can see, much of your original marshalling code is commented out, and a CreateMyDataAlt(byte[], ref int) is declared for a corresponding modified external unmanaged function CreateMyDataAlt(BYTE [], int*); the native length parameter has to be a pointer so that it matches ref int. Some of the data copying and pointer checks turn out to be unnecessary; that is, the code can be even simpler and probably runs faster.
So, what's so different with the modification? The byte array is now marshalled directly, without being wrapped in a struct, and passed to the unmanaged side. You don't allocate the memory within the unmanaged code; rather, you just fill data into it (implementation details omitted), and after the call, the data needed is available on the managed side. If you want to indicate that the data is not filled and should not be used, you can simply set length to zero to tell the managed side. Because the byte array is allocated on the managed side, it'll be collected at some point; you don't have to take care of that.
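The answer leaves the native side out. As a rough sketch only, and assuming the third-party library could be changed as described, the modified C function might look like the following. The name CreateMyDataAlt comes from the answer; the pointer type of length, the fill pattern (borrowed from the question's CreateMyData), and the decision to omit __declspec(dllexport) and any calling-convention macro are assumptions made to keep the example portable.

```c
#include <stddef.h>

/* Hypothetical native counterpart of CreateMyDataAlt(byte[], ref int).
   The caller owns the buffer; this function only fills it.
   On entry *length is the buffer capacity; on return it is the number of
   valid bytes, or 0 to tell the caller that nothing usable was written. */
void CreateMyDataAlt(unsigned char* bytes, int* length)
{
    int i;

    if (length == NULL)
        return;                 /* nowhere to report a result */

    if (bytes == NULL || *length <= 0)
    {
        *length = 0;            /* signal "no data" to the managed side */
        return;
    }

    /* Same test pattern as CreateMyData in the question: 0, 1, ..., 255, 0, ... */
    for (i = 0; i < *length; ++i)
        bytes[i] = (unsigned char)(i % 256);
}
```

Because the function never allocates, there is nothing for a DestroyMyData counterpart to free, which is why that import is commented out in the C# code.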
Answer by Serge Pavlov
"Considerations in Interoperability" explains why and when marshaling is required and at what cost. Quote:
- Marshaling occurs when a caller and a callee cannot operate on the same instance of data.
- repeated marshaling can negatively affect the performance of your application.
Therefore, to answer your question whether
... using pointers for P/Invoking really faster than using marshaling ...
first ask yourself whether the managed code is able to operate on the instance returned by the unmanaged method. If the answer is yes, then marshaling and its associated performance cost are not required. The approximate time saving would be an O(n) function, where n is the size of the marshalled instance. In addition, not keeping both managed and unmanaged blocks of data in memory at the same time for the duration of the method (as in the "IntPtr and Marshal" example) eliminates additional overhead and memory pressure.
What are the drawbacks of using unsafe code and pointers ...
The drawback is the risk associated with accessing memory directly through pointers. It is no less safe than using pointers in C or C++. Use it if it is needed and makes sense. More details are here.
There is one "safety" concern with the presented examples: the release of the allocated unmanaged memory is not guaranteed if the managed code throws an error. The best practice is to
CreateMyData(out myData1, length);

if(myData1!=IntPtr.Zero) {
    try {
        // -> use myData1
        ...
        // <-
    }
    finally {
        DestroyMyData(myData1);
    }
}
Answer by Simon Bridge
For anyone still reading,
Something I don't think I saw in any of the answers: unsafe code does present something of a security risk. It's not a huge risk, and it would be quite challenging to exploit. However, if, like me, you work in a PCI-compliant organization, unsafe code is disallowed by policy for this reason.
Managed code is normally very secure because the CLR takes care of memory location and allocation, preventing you from accessing or writing any memory you're not supposed to.
When you use the unsafe keyword, compile with '/unsafe', and use pointers, you bypass these checks and create the potential for someone to use your application to gain some level of unauthorized access to the machine it is running on. Using something like a buffer-overrun attack, your code could be tricked into writing instructions into an area of memory that might then be reached by the program counter (i.e. code injection), or just crash the machine.
Many years ago, SQL Server actually fell prey to malicious code delivered in a TDS packet that was far longer than it was supposed to be. The method reading the packet didn't check the length and continued to write the contents past the reserved address space. The extra length and content were carefully crafted such that it wrote an entire program into memory, at the address of the next method. The attacker then had their own code being executed by SQL Server within a context that had the highest level of access. It didn't even need to break the encryption, as the vulnerability was below this point in the transport layer stack.
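To make the missing length check concrete, here is a minimal C sketch of a bounds-checked copy. It is not the SQL Server code and does not model TDS; read_packet and its parameters are hypothetical names chosen purely for illustration. The same rule applies to any C# unsafe block that copies through raw pointers: compare the incoming length against the destination capacity before writing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical packet reader, for illustration only.
   Copies packet_len bytes into dest, which can hold dest_size bytes.
   Returns 0 on success and -1 if the packet is rejected. */
int read_packet(unsigned char* dest, size_t dest_size,
                const unsigned char* packet, size_t packet_len)
{
    if (dest == NULL || packet == NULL)
        return -1;

    /* The check whose absence turns an oversized packet into a buffer
       overrun: never write more than the destination can hold. */
    if (packet_len > dest_size)
        return -1;

    memcpy(dest, packet, packet_len);
    return 0;
}
```

A caller would treat -1 as a malformed packet and drop it, rather than truncating the data and carrying on.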
Answer by Tobias Knauss
I had the same question today and I was looking for some concrete measurement values, but I couldn't find any. So I wrote my own tests.
The test copies the pixel data of a 10k x 10k RGB image. The image data is 300 MB (3*10^8 bytes). Some methods copy this data 10 times; others are faster and therefore copy it 100 times. The copying methods used include
- array access via byte pointer
- Marshal.Copy(): a) 1 * 300 MB, b) 1e8 * 3 bytes
- Buffer.BlockCopy(): a) 1 * 300 MB, b) 1e8 * 3 bytes
Test environment:
CPU: Intel Core i7-3630QM @ 2.40 GHz
OS: Win 7 Pro x64 SP1
Visual Studio 2015.3, code is C++/CLI, targeted .NET version is 4.5.2, compiled for Debug.
Test results:
The CPU load is 100% for 1 core with all methods (equals 12.5% total CPU load).
Comparison of speed and execution time:
method                         speed     exec. time
Marshal.Copy (1*300MB)         100 %     100 %
Buffer.BlockCopy (1*300MB)     98 %      102 %
Pointer                        4.4 %     2280 %
Buffer.BlockCopy (1e8*3B)      1.4 %     7120 %
Marshal.Copy (1e8*3B)          0.95 %    10600 %
The execution times and the calculated average throughput are written as comments in the code below.
using namespace System;
using namespace System::Drawing;
using namespace System::Drawing::Imaging;
using namespace System::Runtime::InteropServices;

//------------------------------------------------------------------------------
// Copies the image byte by byte through a raw pointer.
static void CopyIntoBitmap_Pointer (array<unsigned char>^ i_aui8ImageData,
                                    BitmapData^ i_ptrBitmap,
                                    int i_iBytesPerPixel)
{
  char* scan0 = (char*)(i_ptrBitmap->Scan0.ToPointer ());

  int ixCnt = 0;
  for (int ixRow = 0; ixRow < i_ptrBitmap->Height; ixRow++)
  {
    for (int ixCol = 0; ixCol < i_ptrBitmap->Width; ixCol++)
    {
      char* pPixel = scan0 + ixRow * i_ptrBitmap->Stride + ixCol * 3;
      pPixel[0] = i_aui8ImageData[ixCnt++];
      pPixel[1] = i_aui8ImageData[ixCnt++];
      pPixel[2] = i_aui8ImageData[ixCnt++];
    }
  }
}

//------------------------------------------------------------------------------
// Copies the whole image with a single Marshal::Copy call.
static void CopyIntoBitmap_MarshallLarge (array<unsigned char>^ i_aui8ImageData,
                                          BitmapData^ i_ptrBitmap)
{
  IntPtr ptrScan0 = i_ptrBitmap->Scan0;
  Marshal::Copy (i_aui8ImageData, 0, ptrScan0, i_aui8ImageData->Length);
}

//------------------------------------------------------------------------------
// Copies the image with one Marshal::Copy call per pixel.
static void CopyIntoBitmap_MarshalSmall (array<unsigned char>^ i_aui8ImageData,
                                         BitmapData^ i_ptrBitmap,
                                         int i_iBytesPerPixel)
{
  int ixCnt = 0;
  for (int ixRow = 0; ixRow < i_ptrBitmap->Height; ixRow++)
  {
    for (int ixCol = 0; ixCol < i_ptrBitmap->Width; ixCol++)
    {
      // Note: the destination offset is constant, so every block lands at
      // Scan0 + i_iBytesPerPixel. That is enough for timing the calls, but it
      // does not produce a correct image.
      IntPtr ptrScan0 = IntPtr::Add (i_ptrBitmap->Scan0, i_iBytesPerPixel);
      Marshal::Copy (i_aui8ImageData, ixCnt, ptrScan0, i_iBytesPerPixel);
      ixCnt += i_iBytesPerPixel;
    }
  }
}

//------------------------------------------------------------------------------
void main ()
{
  int iWidth = 10000;
  int iHeight = 10000;
  int iBytesPerPixel = 3;
  Bitmap^ oBitmap = gcnew Bitmap (iWidth, iHeight, PixelFormat::Format24bppRgb);
  BitmapData^ oBitmapData = oBitmap->LockBits (Rectangle (0, 0, iWidth, iHeight), ImageLockMode::WriteOnly, oBitmap->PixelFormat);
  array<unsigned char>^ aui8ImageData = gcnew array<unsigned char> (iWidth * iHeight * iBytesPerPixel);

  // Fill the source array with a test pattern.
  int ixCnt = 0;
  for (int ixRow = 0; ixRow < iHeight; ixRow++)
  {
    for (int ixCol = 0; ixCol < iWidth; ixCol++)
    {
      aui8ImageData[ixCnt++] = ixRow * 250 / iHeight;
      aui8ImageData[ixCnt++] = ixCol * 250 / iWidth;
      aui8ImageData[ixCnt++] = ixCol;
    }
  }

  //========== Pointer ==========
  // ~ 8.97 sec for 10k * 10k * 3 * 10 exec, ~ 334 MB/s
  int iExec = 10;
  DateTime dtStart = DateTime::Now;
  for (int ixExec = 0; ixExec < iExec; ixExec++)
  {
    CopyIntoBitmap_Pointer (aui8ImageData, oBitmapData, iBytesPerPixel);
  }
  TimeSpan tsDuration = DateTime::Now - dtStart;
  Console::WriteLine (tsDuration + "  " + ((double)aui8ImageData->Length * iExec / tsDuration.TotalSeconds / 1e6));

  //========== Marshal.Copy, 1 large block ==========
  // 3.94 sec for 10k * 10k * 3 * 100 exec, ~ 7617 MB/s
  iExec = 100;
  dtStart = DateTime::Now;
  for (int ixExec = 0; ixExec < iExec; ixExec++)
  {
    CopyIntoBitmap_MarshallLarge (aui8ImageData, oBitmapData);
  }
  tsDuration = DateTime::Now - dtStart;
  Console::WriteLine (tsDuration + "  " + ((double)aui8ImageData->Length * iExec / tsDuration.TotalSeconds / 1e6));

  //========== Marshal.Copy, many small 3-byte blocks ==========
  // 41.7 sec for 10k * 10k * 3 * 10 exec, ~ 72 MB/s
  iExec = 10;
  dtStart = DateTime::Now;
  for (int ixExec = 0; ixExec < iExec; ixExec++)
  {
    CopyIntoBitmap_MarshalSmall (aui8ImageData, oBitmapData, iBytesPerPixel);
  }
  tsDuration = DateTime::Now - dtStart;
  Console::WriteLine (tsDuration + "  " + ((double)aui8ImageData->Length * iExec / tsDuration.TotalSeconds / 1e6));

  //========== Buffer.BlockCopy, 1 large block ==========
  // 4.02 sec for 10k * 10k * 3 * 100 exec, ~ 7467 MB/s
  iExec = 100;
  array<unsigned char>^ aui8Buffer = gcnew array<unsigned char> (aui8ImageData->Length);
  dtStart = DateTime::Now;
  for (int ixExec = 0; ixExec < iExec; ixExec++)
  {
    Buffer::BlockCopy (aui8ImageData, 0, aui8Buffer, 0, aui8ImageData->Length);
  }
  tsDuration = DateTime::Now - dtStart;
  Console::WriteLine (tsDuration + "  " + ((double)aui8ImageData->Length * iExec / tsDuration.TotalSeconds / 1e6));

  //========== Buffer.BlockCopy, many small 3-byte blocks ==========
  // 28.0 sec for 10k * 10k * 3 * 10 exec, ~ 107 MB/s
  iExec = 10;
  dtStart = DateTime::Now;
  for (int ixExec = 0; ixExec < iExec; ixExec++)
  {
    int ixCnt = 0;
    for (int ixRow = 0; ixRow < iHeight; ixRow++)
    {
      for (int ixCol = 0; ixCol < iWidth; ixCol++)
      {
        Buffer::BlockCopy (aui8ImageData, ixCnt, aui8Buffer, ixCnt, iBytesPerPixel);
        ixCnt += iBytesPerPixel;
      }
    }
  }
  tsDuration = DateTime::Now - dtStart;
  Console::WriteLine (tsDuration + "  " + ((double)aui8ImageData->Length * iExec / tsDuration.TotalSeconds / 1e6));

  oBitmap->UnlockBits (oBitmapData);

  // Backslashes must be escaped; "\t" and "\b" would otherwise be read as tab and backspace.
  oBitmap->Save ("d:\\temp\\bitmap.bmp", ImageFormat::Bmp);
}
Related information:
Why is memcpy() and memmove() faster than pointer increments?
Array.Copy vs Buffer.BlockCopy, answer https://stackoverflow.com/a/33865267
https://github.com/dotnet/coreclr/issues/2430 "Array.Copy & Buffer.BlockCopy x2 to x3 slower < 1kB"
https://github.com/dotnet/coreclr/blob/master/src/vm/comutilnative.cpp, line 718 at the time of writing: Buffer.BlockCopy() uses memmove
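The measurements above suggest that the number of copy calls matters more than whether a pointer is involved: one bulk copy is fast, while per-byte or per-pixel copying is slow. As an illustration only, the plain C sketch below contrasts a per-byte loop with one memcpy per image row. The function names are made up for this example, and the row-wise variant was not part of the benchmark. It assumes a tightly packed source array and a destination whose rows start every dst_stride bytes, as with a bitmap stride.

```c
#include <stddef.h>
#include <string.h>

/* Per-byte copy, comparable in spirit to the "Pointer" test. */
void copy_per_byte(unsigned char* dst, size_t dst_stride,
                   const unsigned char* src,
                   size_t width, size_t height, size_t bytes_per_pixel)
{
    size_t row, i;
    size_t row_bytes = width * bytes_per_pixel;

    for (row = 0; row < height; ++row)
        for (i = 0; i < row_bytes; ++i)
            dst[row * dst_stride + i] = src[row * row_bytes + i];
}

/* One memcpy per row: far fewer calls than per-pixel copying, and any
   padding bytes at the end of each destination row are left untouched. */
void copy_per_row(unsigned char* dst, size_t dst_stride,
                  const unsigned char* src,
                  size_t width, size_t height, size_t bytes_per_pixel)
{
    size_t row;
    size_t row_bytes = width * bytes_per_pixel;

    for (row = 0; row < height; ++row)
        memcpy(dst + row * dst_stride, src + row * row_bytes, row_bytes);
}
```

Both functions produce the same destination contents. For the 10k x 10k image, a row is 30,000 bytes, which is a multiple of 4, so the stride presumably carries no padding and a single large copy is valid there; for other widths, a row-wise copy is one way to respect the stride while still moving data in large blocks.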

