Reading a line from an XML file using C++

Question

Asked by tech_learner

My XML File has:

< Package > xmlMetadata < /Package >

I am searching for a tag in this file, and the text between its opening and closing tags has to be printed on the console. In this case, I want xmlMetadata to be printed. The search should then continue through the file and print again whenever it encounters another < Package > tag.

Here is my code but it is printing the contents of the whole file:

{
    string line="< Package >";
    ifstream myfile (xmlFileName); //xmlFileName is xml file in which search is to done
    if (myfile.is_open())
    {
    while ( myfile.good() )
    {
      getline (myfile,line);
      std::cout<<line<< endl;
    }
    myfile.close();
    }
    else cout << "Unable to open file"; 
}

My whole XML file is shown below:

< ? xml version="1.0" ? >
< fileStructure >
< Main_Package >
   File_Navigate
< /Main_Package >
< Dependency_Details >

< Dependency >
   < Package >
      xmlMetadata
   < /Package >
   < Header >
      xmlMetadata.h
   < /Header >
   < Header_path >
      C:\Dependency\xmlMetadata\xmlMetadata.h
   < /Header_path >
   < Implementation >
      xmlMetadata.cpp
   < /Implementation >
   < Implementation_path >
      C:\Dependency\xmlMetadata\xmlMetadata.cpp
   < /Implementation_path >
< /Dependency >

< Dependency >
   < Package >
      xmlMetadata1
   < /Package >
   < Header >
      xmlMetadata1.h
   < /Header >
   < Header_path >
      C:\Dependency\xmlMetadata\xmlMetadata1.h
   < /Header_path >
   < Implementation >
      xmlMetadata1.cpp
   < /Implementation >
   < Implementation_path >
      C:\Dependency\xmlMetadata\xmlMetadata1.cpp
   < /Implementation_path >
< /Dependency >

< /Dependency_Details >
< /fileStructure >

Answer 1

Answered by Martin Beckett

getline doesn't search for a line; it simply reads each line into the variable "line". You then have to search within that "line" for the text you want.

   size_t found = line.find("Package");
   if (found != std::string::npos) {
       cout << line << endl;   // getline drops the newline, so add it back
   }
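
As a rough, self-contained sketch of that idea (the helper names lines_containing and lines_containing_text are made up for illustration and are not part of the original answer), the read-then-search loop could look like this:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect every line of `in` that contains `needle`.
// getline only reads; the search itself is done with std::string::find.
std::vector<std::string> lines_containing(std::istream& in, const std::string& needle)
{
    assert(!needle.empty());
    std::vector<std::string> matches;
    std::string line;
    while (std::getline(in, line)) {          // stops cleanly at end of file
        if (line.find(needle) != std::string::npos)
            matches.push_back(line);
    }
    return matches;
}

// Hypothetical convenience wrapper for text that is already in memory.
std::vector<std::string> lines_containing_text(const std::string& text, const std::string& needle)
{
    std::istringstream in(text);
    return lines_containing(in, needle);
}

// Possible use with a file (assumes <fstream> and <iostream> are included):
//   std::ifstream myfile(xmlFileName);
//   for (const std::string& hit : lines_containing(myfile, "Package"))
//       std::cout << hit << '\n';
```

Note that on the multi-line file from the question this matches the lines holding the opening and closing tags, not the text between them, which is one symptom of the weakness described next.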

BUT this is a bad way to handle XML: nothing stops whoever writes the XML from breaking a tag across multiple lines. Unless this is a one-off and you create the file yourself, you really should use a general XML parser to read the file and give you a list of tags.

There are a number of very easy-to-use XML parsers, such as TinyXML.

EDIT (a different XML file has now been posted): that's the problem with using regex to parse XML; you don't know how the XML will break lines. You can keep adding more and more layers of complexity until you have written your own XML parser. Just use one of the parsers listed in "What is the best open XML parser for C++?"

Answer 2

Answered by karlphillip

This is not the way you should parse an XML file, but since you don't want to use a parser library, this code might get you started.

File: demo.xml

<? xml version="1.0" ?>
<fileStructure>
<Main_Package>
   File_Navigate
</Main_Package>
<Dependency_Details>

<Dependency>
   <Package>
      xmlMetadata
   </Package>
   <Header>
      xmlMetadata.h
   </Header>
   <Header_path>
      C:\Dependency\xmlMetadata\xmlMetadata.h
   </Header_path>
   <Implementation>
      xmlMetadata.cpp
   </Implementation>
   <Implementation_path>
      C:\Dependency\xmlMetadata\xmlMetadata.cpp
   </Implementation_path>
</Dependency>

<Dependency>
   <Package>
      xmlMetadata1
   </Package>
   <Header>
      xmlMetadata1.h
   </Header>
   <Header_path>
      C:\Dependency\xmlMetadata\xmlMetadata1.h
   </Header_path>
   <Implementation>
      xmlMetadata1.cpp
   </Implementation>
   <Implementation_path>
      C:\Dependency\xmlMetadata\xmlMetadata1.cpp
   </Implementation_path>
</Dependency>

</Dependency_Details>
</fileStructure>

The basic idea of the code is this: while you are reading each line of the file, strip the whitespace at the beginning, store the stripped string in tmp, and then try to match it against one of the tags you are looking for. Once you find the begin tag, keep printing the following lines until the close tag is found.

File: parse.cpp

#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main()
{
    string line;
    ifstream in("demo.xml");

    bool begin_tag = false;
    while (getline(in,line))
    {
        // strip leading spaces from the line
        const std::string::size_type first = line.find_first_not_of(' ');
        const std::string tmp = (first == std::string::npos) ? "" : line.substr(first);

        //cout << "-->" << tmp << "<--" << endl;

        if (tmp == "<Package>")
        {
            //cout << "Found <Package>" << endl;
            begin_tag = true;
            continue;
        }
        else if (tmp == "</Package>")
        {
            begin_tag = false;
            //cout << "Found </Package>" << endl;
        }

        if (begin_tag)
        {
            cout << tmp << endl;
        }
    }
}

Outputs:

xmlMetadata
xmlMetadata1
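
The line-by-line matching above depends on each tag sitting alone on its own line. A slightly more tolerant variation, still only a sketch and still no substitute for a real parser, is to search the whole file contents for the opening and closing tag strings. The helper names trim and text_between_tags are invented for this illustration, and the code assumes tags without attributes, no nesting of the same tag, and no comments or CDATA sections:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: remove spaces, tabs and line breaks from both ends.
std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: return the trimmed text found between every
// "<tag>" ... "</tag>" pair in `xml`, in document order.
std::vector<std::string> text_between_tags(const std::string& xml, const std::string& tag)
{
    assert(!tag.empty());
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";

    std::vector<std::string> result;
    std::size_t pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        const std::size_t start = pos + open.size();
        const std::size_t end = xml.find(close, start);
        if (end == std::string::npos)
            break;                              // unmatched opening tag: stop
        result.push_back(trim(xml.substr(start, end - start)));
        pos = end + close.size();               // continue after the closing tag
    }
    return result;
}

// Possible use with a file (assumes <fstream>, <sstream>, <iostream>):
//   std::ifstream in("demo.xml");
//   std::stringstream buffer;
//   buffer << in.rdbuf();                      // read the whole file at once
//   for (const std::string& text : text_between_tags(buffer.str(), "Package"))
//       std::cout << text << '\n';
```

Because the search runs over the whole buffer, it does not matter whether the text shares a line with its tags or is spread over several lines.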

Answer 3

Answered by karlphillip

A single line of tags in a file can hardly be described as XML. Anyway, if you really want to parse an XML file, this can be done much more easily with a parser library like RapidXML. This page is an excellent resource.

The code below is my attempt to read the following XML (yes, an XML file should start with an XML declaration):

File: demo.xml

<?xml version="1.0" encoding="utf-8"?>
<rootnode version="1.0" type="example">
    <Package> xmlMetadata </Package>
</rootnode>

A quick note: RapidXML is a header-only library. On my system I unzipped it to /usr/include/rapidxml-1.13, so the code below can be compiled with:

g++ read_tag.cpp -o read_tag -I/usr/include/rapidxml-1.13/

File: read_tag.cpp

#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include <rapidxml.hpp>

using namespace std;
using namespace rapidxml;


int main()
{
    string input_xml;
    string line;
    ifstream in("demo.xml");

    // read file into input_xml
    while(getline(in,line))
        input_xml += line;

    // make a safe-to-modify copy of input_xml
    // (you should never modify the contents of an std::string directly)
    vector<char> xml_copy(input_xml.begin(), input_xml.end());
    xml_copy.push_back('\0'); // RapidXML expects a zero-terminated buffer

    // only use xml_copy from here on!
    xml_document<> doc;
    // we are choosing to parse the XML declaration
    // parse_no_data_nodes prevents RapidXML from using the somewhat surprising
    // behavior of having both values and data nodes, and having data nodes take
    // precedence over values when printing
    // >>> note that this will skip parsing of CDATA nodes <<<
    doc.parse<parse_declaration_node | parse_no_data_nodes>(&xml_copy[0]);

    // alternatively, use one of the two commented lines below to parse CDATA nodes,
    // but please note the above caveat about surprising interactions between
    // values and data nodes (also read http://www.ffuts.org/blog/a-rapidxml-gotcha/)
    // if you use one of these two declarations try to use data nodes exclusively and
    // avoid using value()
    //doc.parse<parse_declaration_node>(&xml_copy[0]); // just get the XML declaration
    //doc.parse<parse_full>(&xml_copy[0]); // parses everything (slowest)

    // since we have parsed the XML declaration, it is the first node
    // (otherwise the first node would be our root node)
    string encoding = doc.first_node()->first_attribute("encoding")->value();
    // encoding == "utf-8"

    // we didn't keep track of our previous traversal, so let's start again
    // we can match nodes by name, skipping the xml declaration entirely
    xml_node<>* cur_node = doc.first_node("rootnode");
    string rootnode_type = cur_node->first_attribute("type")->value();
    // rootnode_type == "example"

    // go straight to the first Package node
    cur_node = cur_node->first_node("Package");
    string content = cur_node->value(); // if the node doesn't exist, this line will crash

    cout << content << endl;
}

Outputs:

xmlMetadata
