Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/3080224/

Date: 2020-09-15 14:41:24  Source: igfitidea

Ways to parse XML in C++ (Win32)

Tags: c++, windows, xml, winapi

Asked by JWood

I'm looking for a way to parse XML in C++ on Windows. I've found a few options, such as MSXML, Xerces and TinyXml, but I'm wondering which is best in terms of performance and features. My requirements are that it must be possible to link it statically or to include the source in the project itself, and that it must not require any additional toolkits such as Boost. MSXML would be the obvious choice as it's an MS library, but it seems to be a COM library and rather convoluted to actually get any use out of.

Does anyone have any suggestions as to something quick and simple to use?

Thanks, J

Accepted answer by Henri

I used libxml with success. The API is a bit confusing and complicated, but once you get it, it works pretty well. Besides, it is packed with functionality, so if you need that, go with libxml. You don't have to worry about bloated binaries, since you can link only the parts you need. For example, you don't need to include the complete libxml if you only need to parse XML and don't use the XPath stuff.

Answer by M. Williams

The best library that I've used, and one that is absolutely transparent in terms of usage and understanding, is pugixml.

Extremely lightweight, very fast, flexible and convenient - what else could one expect?

Answer by Naszta

Since all supported Windows versions (including Windows XP SP3) include MSXML 6.0, you should use MSXML 6.0. You should implement your own ISAXContentHandler class, and I usually implement an ISequentialStream class as well.

An ISequentialStream implementation for parsing:

#include <windows.h>
#include <objbase.h>   // ISequentialStream, IID_ISequentialStream
#include <istream>

// Adapts a std::istream to the COM ISequentialStream interface so that
// the MSXML SAX reader can pull its input from it.
class MySequentialStream : public ISequentialStream
{
public:
  explicit MySequentialStream( std::istream &is )
    : is(is), ref_count(0)
  {
    InitializeCriticalSection( &this->critical_section );
  }
  virtual ~MySequentialStream( void )
  {
    DeleteCriticalSection( &this->critical_section );
  }
  virtual HRESULT __stdcall QueryInterface( const IID &riid, void ** ppvObject )
  {
    if ( ppvObject == 0 ) return E_POINTER;
    if ( riid == IID_ISequentialStream || riid == IID_IUnknown )
    {
      *ppvObject = static_cast<ISequentialStream*>(this);
      this->AddRef();
      return S_OK;
    }
    *ppvObject = 0;
    return E_NOINTERFACE;
  }
  virtual ULONG __stdcall AddRef( void )
  {
    return static_cast<ULONG>( InterlockedIncrement(&this->ref_count) );
  }
  virtual ULONG __stdcall Release( void )
  {
    ULONG nRefCount = static_cast<ULONG>( InterlockedDecrement(&this->ref_count) );
    if ( nRefCount == 0 ) delete this;
    return nRefCount;
  }
  virtual HRESULT __stdcall Read( void *pv, ULONG cb, ULONG *pcbRead )
  {
    if ( pv == 0 ) return STG_E_INVALIDPOINTER;
    EnterCriticalSection( &this->critical_section );
    this->is.read( static_cast<char*>(pv), cb );
    // gcount() is 0 once the end of the stream has been reached.
    ULONG bytes_read = static_cast<ULONG>( this->is.gcount() );
    LeaveCriticalSection( &this->critical_section );
    if ( pcbRead != 0 ) *pcbRead = bytes_read;   // pcbRead is optional
    return S_OK;
  }
  virtual HRESULT __stdcall Write( void const *pv, ULONG cb, ULONG *pcbWritten )
  {
    // This stream is read-only, so nothing is ever written.
    if ( pcbWritten != 0 ) *pcbWritten = 0;
    return E_NOTIMPL;
  }
private:
  std::istream &is;
  CRITICAL_SECTION critical_section;
  LONG ref_count;   // InterlockedIncrement/InterlockedDecrement operate on LONG
};
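The Read method above boils down to istream::read followed by gcount. The sketch below isolates that logic in portable C++ with no COM involved, to show how a caller can drain a stream in fixed-size chunks until a read returns zero bytes. The names ReadChunk and DrainInChunks are hypothetical and are not part of MSXML or the answer's code.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper mirroring the body of MySequentialStream::Read:
// read up to `cb` bytes and report how many were actually read.
std::size_t ReadChunk(std::istream &is, char *buffer, std::size_t cb)
{
    assert(buffer != nullptr);
    is.read(buffer, static_cast<std::streamsize>(cb));
    return static_cast<std::size_t>(is.gcount());
}

// Drain the stream the way a pull-style consumer would: keep asking
// for fixed-size chunks until a read returns zero bytes.
std::string DrainInChunks(std::istream &is, std::size_t chunk_size)
{
    std::string result;
    std::string buffer(chunk_size, '\0');
    for (;;)
    {
        std::size_t got = ReadChunk(is, &buffer[0], chunk_size);
        if (got == 0) break;            // end of stream (or read error)
        result.append(buffer, 0, got);  // keep only the bytes actually read
    }
    return result;
}
```

A short final chunk is normal: the last read simply reports fewer bytes than requested, and the read after that reports zero.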

You should implement an ISAXContentHandler class, too (of course, you should fill in the methods as you need them):

#include <msxml6.h>   // ISAXContentHandler, ISAXLocator, ISAXAttributes

// SAX content handler skeleton: every callback is a no-op that returns S_OK.
// Fill in the callbacks you are interested in.
class MyContentHandler : public ISAXContentHandler
{
public:
  MyContentHandler( void )
    : ref_count(0)
  {}
  virtual ~MyContentHandler( void ) {}
  virtual HRESULT __stdcall QueryInterface( const IID &riid, void ** ppvObject )
  {
    if ( ppvObject == 0 ) return E_POINTER;
    if ( riid == __uuidof(ISAXContentHandler) || riid == IID_IUnknown )
    {
      *ppvObject = static_cast<ISAXContentHandler*>(this);
      this->AddRef();
      return S_OK;
    }
    *ppvObject = 0;
    return E_NOINTERFACE;
  }
  virtual ULONG __stdcall AddRef( void )
  {
    return static_cast<ULONG>( InterlockedIncrement(&this->ref_count) );
  }
  virtual ULONG __stdcall Release( void )
  {
    ULONG nRefCount = static_cast<ULONG>( InterlockedDecrement(&this->ref_count) );
    if ( nRefCount == 0 ) delete this;
    return nRefCount;
  }
  virtual HRESULT __stdcall putDocumentLocator( ISAXLocator * pLocator) { return S_OK; }
  virtual HRESULT __stdcall startDocument( void ) { return S_OK; }
  virtual HRESULT __stdcall endDocument( void ) { return S_OK; }
  virtual HRESULT __stdcall startPrefixMapping( const wchar_t *pwchPrefix, int cchPrefix, const wchar_t *pwchUri, int cchUri ) { return S_OK; }
  virtual HRESULT __stdcall endPrefixMapping( const wchar_t *pwchPrefix, int cchPrefix) { return S_OK; }
  virtual HRESULT __stdcall startElement( const wchar_t *pwchNamespaceUri, int cchNamespaceUri, const wchar_t *pwchLocalName, int cchLocalName, const wchar_t *pwchQName, int cchQName, ISAXAttributes *pAttributes ) { return S_OK; }
  virtual HRESULT __stdcall endElement( const wchar_t *pwchNamespaceUri, int cchNamespaceUri, const wchar_t *pwchLocalName, int cchLocalName, const wchar_t *pwchQName, int cchQName) { return S_OK; }
  virtual HRESULT __stdcall characters( const wchar_t *pwchChars, int cchChars) { return S_OK; }
  virtual HRESULT __stdcall ignorableWhitespace( const wchar_t *pwchChars, int cchChars) { return S_OK; }
  virtual HRESULT __stdcall processingInstruction( const wchar_t *pwchTarget, int cchTarget, const wchar_t *pwchData, int cchData) { return S_OK; }
  virtual HRESULT __stdcall skippedEntity( const wchar_t *pwchName, int cchName) { return S_OK; }
protected:
  LONG ref_count;   // InterlockedIncrement/InterlockedDecrement operate on LONG
};
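To make the callback model behind the content handler easier to picture, here is a toy, portable sketch of SAX-style event delivery. It is not MSXML and not a real XML parser: ToyHandler, RecordingHandler and ScanToy are hypothetical names, and the scanner only understands plain start tags, end tags and text (no attributes, entities, comments or namespaces). It only illustrates the order in which startElement, characters and endElement style callbacks would fire.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified counterpart of ISAXContentHandler.
struct ToyHandler
{
    virtual ~ToyHandler() {}
    virtual void startElement(const std::string &name) = 0;
    virtual void endElement(const std::string &name) = 0;
    virtual void characters(const std::string &text) = 0;
};

// Records each callback so the event order can be inspected afterwards.
struct RecordingHandler : ToyHandler
{
    std::vector<std::string> events;
    void startElement(const std::string &name) override { events.push_back("start:" + name); }
    void endElement(const std::string &name) override { events.push_back("end:" + name); }
    void characters(const std::string &text) override { events.push_back("text:" + text); }
};

// Toy scanner: walks the input once and fires a callback per token.
// Returns false if a '<' is never closed by a '>'.
bool ScanToy(const std::string &xml, ToyHandler &handler)
{
    std::size_t pos = 0;
    while (pos < xml.size())
    {
        if (xml[pos] == '<')
        {
            std::size_t close = xml.find('>', pos);
            if (close == std::string::npos) return false;
            std::string tag = xml.substr(pos + 1, close - pos - 1);
            if (!tag.empty() && tag[0] == '/')
                handler.endElement(tag.substr(1));   // "</name>"
            else
                handler.startElement(tag);           // "<name>"
            pos = close + 1;
        }
        else
        {
            std::size_t next = xml.find('<', pos);
            if (next == std::string::npos) next = xml.size();
            handler.characters(xml.substr(pos, next - pos));
            pos = next;
        }
    }
    return true;
}
```

The point of the design is that the handler never sees a document tree; it only reacts to events as the input is consumed, which is why a SAX handler such as the one above keeps its own state between callbacks.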

Then you can easily parse a stream:

// Parses XML from `is` with the MSXML 6.0 SAX reader.
// Assumes <windows.h>, <msxml6.h> and <istream> are included and that
// MySequentialStream and MyContentHandler are defined as above.
bool ParseStream( std::istream &is )
{
  if ( FAILED(CoInitialize(NULL)) )
    return false;

  ISAXXMLReader * reader = 0;
  if ( FAILED( CoCreateInstance( __uuidof(SAXXMLReader60), NULL, CLSCTX_ALL, __uuidof(ISAXXMLReader), (void**) &reader ) ) )
  {
    CoUninitialize();
    return false;
  }

  ISequentialStream * my_stream = new MySequentialStream(is);
  ISAXContentHandler * content_handler = new MyContentHandler;

  // Both objects start with a reference count of 0, so take a reference.
  my_stream->AddRef();
  content_handler->AddRef();

  bool value = false;
  if ( SUCCEEDED( reader->putContentHandler( content_handler ) ) )
  {
    VARIANT var;
    VariantInit( &var );
    var.vt = VT_UNKNOWN;
    var.punkVal = my_stream;   // borrowed reference, so no VariantClear

    // parse() reports success or failure through its HRESULT.
    value = SUCCEEDED( reader->parse( var ) );
  }

  // Release everything and balance the successful CoInitialize call.
  my_stream->Release();
  content_handler->Release();
  reader->Release();
  CoUninitialize();
  return value;
}
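ParseStream relies on the COM reference-counting convention: each helper object starts with a count of 0, the caller takes a reference with AddRef, and the object deletes itself when Release brings the count back to zero. The sketch below shows that lifetime rule in portable C++. RefCounted is a hypothetical class, and std::atomic stands in for InterlockedIncrement and InterlockedDecrement, so treat it as an illustration rather than as COM code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a COM object; no Windows headers needed.
class RefCounted
{
public:
    // `destroyed_flag` is set to true when the object deletes itself,
    // so the caller can observe the moment of destruction.
    explicit RefCounted(bool *destroyed_flag)
        : ref_count_(0), destroyed_flag_(destroyed_flag)
    {
        assert(destroyed_flag != nullptr);
    }

    // Atomically increments the count and returns the new value.
    unsigned long AddRef() { return ++ref_count_; }

    // Atomically decrements the count; the last Release destroys the object.
    unsigned long Release()
    {
        unsigned long remaining = --ref_count_;
        if (remaining == 0) delete this;
        return remaining;
    }

private:
    // Private destructor: the object can only die through Release.
    ~RefCounted() { *destroyed_flag_ = true; }

    std::atomic<unsigned long> ref_count_;
    bool *destroyed_flag_;
};
```

Because the destructor is private, such an object has to be created with new and can never be deleted directly, which mirrors how ParseStream only ever calls Release on its helper objects.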

Answer by Martin York

The heavyweight daddy of XML parsers is Xerces.
A simpler, easier parser is Expat; there are C++ wrappers around.

There are a lot of XML parsers around.
A quick Google will find you plenty.
