Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/11569264/

Time: 2020-08-09 18:22:51  Source: igfitidea

How to parse an XSD to get the information from <xsd:simpleType> elements using C#?

Tags: c#, c#-4.0, xsd

Asked by Jyina

I have an XSD with multiple complex types and simple types (part of the file is shown below). I need to parse this document to get maxLength from each of the simple types that are referenced in the complex types. Can anyone offer some advice on how to implement this? I need to implement this in a generic way, so if I query on "Setup_Type" it should give the output below. Thank you!

NewSetup/Amount = 12 (The name attributes from element tags separated by "/" and maxLength from the nested simpleType)

NewSetup/Name = 50

<xsd:complexType name="Setup_Type">
  <xsd:sequence>
    <xsd:element name="NewSetup" type="NewSetup_Type" minOccurs="1" maxOccurs="1" />
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="NewSetup_Type">
  <xsd:sequence>
    <xsd:element name="Amount" type="Amount_Type"  minOccurs="1" maxOccurs="1" />
    <xsd:element name="Name" type="Name_Type"  minOccurs="1" maxOccurs="1" />
  </xsd:sequence>
</xsd:complexType>

<xsd:simpleType name="Amount_Type">
  <xsd:annotation>
    <xsd:documentation>Amount</xsd:documentation>
  </xsd:annotation>
  <xsd:restriction base="xsd:string">
    <xsd:maxLength value="12" />
  </xsd:restriction>
</xsd:simpleType>

<xsd:simpleType name="Name_Type">
  <xsd:annotation>
    <xsd:documentation>Name</xsd:documentation>
  </xsd:annotation>
  <xsd:restriction base="xsd:string">
    <xsd:maxLength value="50" />
  </xsd:restriction>
</xsd:simpleType>

Accepted answer by psubsee2003

I have seen similar questions asked in the past (full disclosure: I've asked a similar question myself). Parsing an XSD is not for the faint of heart.

You basically have two options. The first is easier to implement but is more easily broken by minor changes to the XSD. The second is more robust but harder to implement.

Option 1:

Parse the XSD with LINQ (or another C# XML parser if you prefer). Since an XSD is just XML, you can load it into an XDocument and read it via LINQ.

For just a sample of your own XSD:

<xsd:simpleType name="Amount_Type">
  <xsd:annotation>
    <xsd:documentation>Amount</xsd:documentation>
  </xsd:annotation>
  <xsd:restriction base="xsd:string">
    <xsd:maxLength value="12" />
  </xsd:restriction>
</xsd:simpleType>

You can access the MaxLength:

var xDoc = XDocument.Load("your XSD path");
var ns = XNamespace.Get(@"http://www.w3.org/2001/XMLSchema");

var length = (from sType in xDoc.Element(ns + "schema").Elements(ns + "simpleType")
              where sType.Attribute("name").Value == "Amount_Type"
              from r in sType.Elements(ns + "restriction")
              select r.Element(ns + "maxLength").Attribute("value")
                      .Value).FirstOrDefault();

This does not offer a very easy method for parsing by type name, especially for extended types. To use this you need to know the exact path for each element you are looking for.
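
As a rough illustration of how far the LINQ approach can be pushed, the sketch below collects the maxLength of every named top-level simpleType into a dictionary instead of hard-coding one type name. The class and method names (XsdLinqSketch, ReadMaxLengths) are made up for this example, and it assumes the document root is the xsd:schema element and that each restriction sits directly under its simpleType.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of the original answer.
public static class XsdLinqSketch
{
    static readonly XNamespace Xsd = "http://www.w3.org/2001/XMLSchema";

    // Maps each named top-level simpleType to its maxLength facet value.
    // Types without a direct restriction/maxLength (or with a non-numeric value) are skipped.
    public static Dictionary<string, int> ReadMaxLengths(XDocument xDoc)
    {
        var result = new Dictionary<string, int>();

        // Assumption: xDoc.Root is the <xsd:schema> element.
        foreach (var sType in xDoc.Root.Elements(Xsd + "simpleType"))
        {
            // Casting a missing attribute to string yields null instead of throwing.
            var name = (string)sType.Attribute("name");
            var value = (string)sType.Elements(Xsd + "restriction")
                                     .Elements(Xsd + "maxLength")
                                     .Attributes("value")
                                     .FirstOrDefault();

            int maxLength;
            if (name != null && int.TryParse(value, out maxLength))
            {
                result[name] = maxLength;
            }
        }

        return result;
    }
}
```

Calling XsdLinqSketch.ReadMaxLengths(XDocument.Load("your XSD path")) gives a lookup by type name, but it still does not follow the element references inside complex types, which is the limitation described above.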

Option 2:

This is far too complex for a quick answer (note: see the edit below; I had some time and put together a working solution), so I encourage you to look at my own question linked above. In it, I linked a great blog that shows how to break an XSD down into pieces, which might let you perform the kind of search you want. You have to decide whether it is worth the effort to develop. (The blog shows an implementation in which an XmlReader holds an XML document that is validated against the XSD in question, but you can accomplish the same thing by loading the XSD directly and parsing it.)

The two key ideas to take from the blog are:

// in the getRestriction method (reader in this context is an XmlReader that
//  contains an XML document being validated against the specific XSD)
if (reader.SchemaInfo.SchemaElement == null) return null;
simpleType = reader.SchemaInfo.SchemaElement.ElementSchemaType as XmlSchemaSimpleType;
if (simpleType == null) return null;
restriction = simpleType.Content as XmlSchemaSimpleTypeRestriction;

// then in the getMaxLength method
if (restriction == null) return null;
List<int> result = new List<int>();
foreach (XmlSchemaObject facet in restriction.Facets) {
    if (facet is XmlSchemaMaxLengthFacet) result.Add(int.Parse(((XmlSchemaFacet) facet).Value));
}

I actually tried the same thing last year, parsing an XSD as part of a complicated data validation method. It took me the better part of a week to really understand what was happening and to adapt the methods in the blog to suit my purposes. It is definitely the best way to implement exactly what you want.

If you want to try this with a standalone schema, you can load the XSD into an XmlSchemaSet object, then use the GlobalTypes property to help you find the specific type you are looking for.

EDIT: I pulled up my old code and started putting together the code to help you.

First, load your schema:

XmlSchemaSet set; // this needs to be accessible to the methods below,
                  //  so should be a class level field or property

using (var fs = new FileStream(@"your path here", FileMode.Open))
{
    var schema = XmlSchema.Read(fs, null);

    set = new XmlSchemaSet();
    set.Add(schema);
    set.Compile();
}

The following methods should get you close to what you want, based on the XSD you provided. They should be fairly easy to adapt to more complex structures.

public Dictionary<string, int> GetElementMaxLength(String xsdElementName)
{
    if (xsdElementName == null) throw new ArgumentException();
    // if your XSD has a target namespace, you need to replace null with the namespace name
    var qname = new XmlQualifiedName(xsdElementName, null);

    // find the type you want in the XmlSchemaSet    
    var parentType = set.GlobalTypes[qname];

    // call GetAllMaxLength with the parentType as parameter
    var results = GetAllMaxLength(parentType);

    return results;
}

private Dictionary<string, int> GetAllMaxLength(XmlSchemaObject obj)
{
    Dictionary<string, int> dict = new Dictionary<string, int>();

    // do some type checking on the XmlSchemaObject
    if (obj is XmlSchemaSimpleType)
    {
        // if it is a simple type, then call GetMaxLength to get the MaxLength restriction
        var st = obj as XmlSchemaSimpleType;
        dict[st.QualifiedName.Name] = GetMaxLength(st);
    }
    else if (obj is XmlSchemaComplexType)
    {

        // if obj is a complexType, cast the particle type to a sequence
        //  and iterate the sequence
        //  warning - this will fail if it is not a sequence, so you might need
        //  to make some adjustments if you have something other than a xs:sequence
        var ct = obj as XmlSchemaComplexType;
        var seq = ct.ContentTypeParticle as XmlSchemaSequence;

        foreach (var item in seq.Items)
        {
            // item will be an XmlSchemaObject, so just call this same method
            //  with item as the parameter to parse it out
            var rng = GetAllMaxLength(item);

            // add the results to the dictionary
            foreach (var kvp in rng)
            {
                dict[kvp.Key] = kvp.Value;
            }
        }
    }
    else if (obj is XmlSchemaElement)
    {
        // if obj is an XmlSchemaElement, then you need to find the type
        //  based on the SchemaTypeName property.  This is why your 
        //  XmlSchemaSet needs to have class-level scope
        var ele = obj as XmlSchemaElement;
        var type = set.GlobalTypes[ele.SchemaTypeName];

        // once you have the type, call this method again and get the dictionary result
        var rng = GetAllMaxLength(type);

        // put the results in this dictionary.  The difference here is the dictionary
        //  key is put in the format you specified
        foreach (var kvp in rng)
        {
            dict[String.Format("{0}/{1}", ele.QualifiedName.Name, kvp.Key)] = kvp.Value;
        }
    }

    return dict;
}

private Int32 GetMaxLength(XmlSchemaSimpleType xsdSimpleType)
{
    // get the content of the simple type
    var restriction = xsdSimpleType.Content as XmlSchemaSimpleTypeRestriction;

    // if it is null, then there are no restrictions and return -1 as a marker value
    if (restriction == null) return -1;

    Int32 result = -1;

    // iterate the facets in the restrictions, look for a MaxLengthFacet and parse the value
    foreach (XmlSchemaObject facet in restriction.Facets)
    {
        if (facet is XmlSchemaMaxLengthFacet)
        {
            result = int.Parse(((XmlSchemaFacet)facet).Value);
            break;
        }
    }

    return result;
}

Then the usage is pretty simple: call the GetElementMaxLength(String) method, and it returns a dictionary of the names in the format you provided, with the max length as the value:

var results = GetElementMaxLength("Setup_Type");

foreach (var item in results)
{
    Console.WriteLine("{0} | {1}", item.Key, item.Value);                
}
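
For readers who want the exact "NewSetup/Amount = 12" shape from the question, here is a minimal, self-contained sketch of the same traversal. As far as I can tell, the methods above append the simple type's name to each key (for example NewSetup/Amount/Amount_Type); this variant keys each entry by the element path only. The names MaxLengthPaths, Load, and Collect are hypothetical. The sketch assumes a schema without recursive type references (it would loop forever on one) and relies on the post-compilation properties ContentTypeParticle and ElementSchemaType.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper for illustration; not part of the original answer.
public static class MaxLengthPaths
{
    // Loads XSD text into a compiled XmlSchemaSet.
    public static XmlSchemaSet Load(string xsdText)
    {
        var set = new XmlSchemaSet();
        using (var reader = XmlReader.Create(new StringReader(xsdText)))
        {
            // null means "use the targetNamespace declared in the schema itself"
            set.Add(null, reader);
        }
        set.Compile();
        return set;
    }

    // Returns element paths ("NewSetup/Amount") mapped to maxLength values.
    public static Dictionary<string, int> Collect(XmlSchemaSet set, string typeName, string targetNamespace = "")
    {
        var result = new Dictionary<string, int>();
        var root = set.GlobalTypes[new XmlQualifiedName(typeName, targetNamespace)] as XmlSchemaType;
        if (root != null) Walk(root, "", result);
        return result;
    }

    private static void Walk(XmlSchemaType type, string path, Dictionary<string, int> result)
    {
        var simple = type as XmlSchemaSimpleType;
        if (simple != null)
        {
            // Only direct restrictions are inspected; lists and unions are ignored.
            var restriction = simple.Content as XmlSchemaSimpleTypeRestriction;
            if (restriction == null) return;
            foreach (XmlSchemaObject facet in restriction.Facets)
            {
                var max = facet as XmlSchemaMaxLengthFacet;
                if (max != null)
                {
                    result[path] = int.Parse(max.Value);
                    return;
                }
            }
            return;
        }

        var complex = type as XmlSchemaComplexType;
        if (complex == null) return;

        // XmlSchemaGroupBase covers xs:sequence, xs:choice and xs:all.
        var group = complex.ContentTypeParticle as XmlSchemaGroupBase;
        if (group == null) return;

        foreach (XmlSchemaObject item in group.Items)
        {
            var element = item as XmlSchemaElement;
            if (element == null || element.ElementSchemaType == null) continue;

            var name = element.QualifiedName.Name;
            Walk(element.ElementSchemaType, path.Length == 0 ? name : path + "/" + name, result);
        }
    }
}
```

A call such as MaxLengthPaths.Collect(MaxLengthPaths.Load(File.ReadAllText("your XSD path")), "Setup_Type") would then be iterated the same way as the results dictionary above.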

Answer by Diego De Vita

My solution may not be exactly what you are looking for. You would probably prefer using the System.Xml classes to handle this kind of information. I don't know how generic you'd like this parser to be; anyway, these are just my 2 cents. My code uses regular expressions designed to handle 99% of the possibilities correctly (I guess). Some would call this shooting a fly with a gun. Anyway, here it is:

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

static class Program
{
    static void Main()
    {
        XsdFile file = new XsdFile(@"c:\temp\test.xsd");
        Console.WriteLine(file.Query("Setup_Type"));
    }
}

public class XsdFile
{

    Dictionary<string, XsdType> types;

    public XsdFile(string path)
    {
        string xsdBody = File.ReadAllText(path);
        types = XsdType.CreateTypes(xsdBody);
    }

    public string Query(string typename) {
        return Query(typename, "");
    }

    private string Query(string typename, string parent)
    {
        XsdType type;
        if (types.TryGetValue(typename, out type))
        {
            if (type.GetType() == typeof(ComplexType))
            {
                StringBuilder sb = new StringBuilder();
                ComplexType complexType = (ComplexType)type;
                foreach (string elementName in complexType.elements.Keys)
                {
                    string elementType = complexType.elements[elementName];
                    sb.AppendLine(Query(elementType, parent + "/" + elementName));
                }
                return sb.ToString();
            }
            else if (type.GetType() == typeof(SimpleType))
            {
                SimpleType simpleType = (SimpleType)type;
                return string.Format("{0} = {1}", parent, simpleType.maxLength);
            }
            else {
                return "";
            }
        }
        else
        {
            return "";
        }
    }
}

public abstract class XsdType
{

    string name;

    public XsdType(string name)
    {
        this.name = name;
    }

    public static Dictionary<string, XsdType> CreateTypes(string xsdBody)
    {

        Dictionary<string, XsdType> types = new Dictionary<string, XsdType>();

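        // \k<kind> makes the closing tag match the kind (complex or simple) captured in the opening tag
        MatchCollection mc_types = Regex.Matches(xsdBody, @"<xsd:(?<kind>complex|simple)Type[\s\t]+(?<attributes>[^>]+)>(?<body>.+?)</xsd:\k<kind>Type>", RegexOptions.Singleline);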
        foreach (Match m_type in mc_types)
        {
            string typeKind = m_type.Groups["kind"].Value;
            string typeAttributes = m_type.Groups["attributes"].Value;
            string typeBody = m_type.Groups["body"].Value;
            string typeName;
            Match m_nameattribute = Regex.Match(typeAttributes, @"name[\s\t]*=[\s\t]*""(?<name>[^""]+)""", RegexOptions.Singleline);
            if (m_nameattribute.Success)
            {
                typeName = m_nameattribute.Groups["name"].Value;
                if (typeKind == "complex")
                {
                    ComplexType current_type = new ComplexType(typeName);
                    MatchCollection mc_elements = Regex.Matches(typeBody, @"<xsd:element(?<attributes>.+?)/>", RegexOptions.Singleline);
                    foreach (Match m_element in mc_elements)
                    {
                        Dictionary<string, string> elementAttributes = ParseAttributes(m_element.Groups["attributes"].Value);
                        string elementName;
                        string elementType;
                        if (!elementAttributes.TryGetValue("name", out elementName))
                            continue;
                        if (!elementAttributes.TryGetValue("type", out elementType))
                            continue;
                        current_type.elements.Add(elementName, elementType);
                    }
                    types.Add(current_type.name, current_type);
                }
                else if (typeKind == "simple")
                {
                    Match m_maxLength = Regex.Match(typeBody, @"<xsd:restriction[^>]+>.+?<xsd:maxLength.+?value=""(?<maxLength>[^""]+)""", RegexOptions.Singleline);
                    if (m_maxLength.Success)
                    {
                        string maxLength = m_maxLength.Groups["maxLength"].Value;
                        SimpleType current_type = new SimpleType(typeName);
                        current_type.maxLength = maxLength;
                        types.Add(current_type.name, current_type);
                    }
                }
            }
            else
            {
                continue;
            }
        }
        return types;
    }

    private static Dictionary<string, string> ParseAttributes(string value)
    {
        Dictionary<string, string> attributes = new Dictionary<string, string>();
        MatchCollection mc_attributes = Regex.Matches(value, @"(?<name>[^=\s\t]+)[\s\t]*=[\s\t]*""(?<value>[^""]+)""", RegexOptions.Singleline);
        foreach (Match m_attribute in mc_attributes)
        {
            attributes.Add(m_attribute.Groups["name"].Value, m_attribute.Groups["value"].Value);
        }
        return attributes;
    }

}

public class SimpleType : XsdType
{

    public string maxLength;

    public SimpleType(string name)
        : base(name)
    {
    }

}

public class ComplexType : XsdType
{

    //(name-type)
    public Dictionary<string, string> elements = new Dictionary<string,string>();

    public ComplexType(string name)
        : base(name)
    {
    }

}

Answer by Nathan Andrew Mullenax

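using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class result_tree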
{
    public string nodevalue = "";

    public bool IsTerminal { get { return ChildCount == 0; } }

    public List<result_tree> children = new List<result_tree>();

    public int ChildCount { get { return children.Count; } }

    public result_tree(string v) { nodevalue = v; }

    private void print_children(bool skip, string prefix)
    {
        if (IsTerminal)
            Console.WriteLine(prefix + (prefix.Length==0?"":"/") + nodevalue);
        else
            foreach (result_tree rt in children)
                rt.print_children(false,prefix + (prefix.Length == 0 ? "" : "/") + (skip?"":nodevalue));
    }

    public void print_children()
    {
        print_children(true,"");
    }
}

static class Program
{
    private static void ValidationCallBack(object sender, ValidationEventArgs args)
    {
        Console.WriteLine(args.Message);
    }

    public static result_tree results;



    static string deref_simple(XmlSchemaSimpleType simp)
    {
        XmlSchemaSimpleTypeRestriction xsstr = (XmlSchemaSimpleTypeRestriction)simp.Content;
        foreach (object o in xsstr.Facets)
        {
            if (o.GetType() == typeof(XmlSchemaMaxLengthFacet))
            {
                XmlSchemaMaxLengthFacet fac = (XmlSchemaMaxLengthFacet)o;
                return fac.Value;
            }
        }
        return "";
    }

    static result_tree deref_complex(XmlSchema xs, XmlSchemaComplexType cplx)
    {
        result_tree rt = new result_tree(cplx.Name);

        if (cplx.Particle.GetType() == typeof(XmlSchemaSequence))
        {
            XmlSchemaSequence seq = (XmlSchemaSequence)cplx.Particle;
            foreach (object o in seq.Items)
            {
                if (o.GetType() == typeof(XmlSchemaElement))
                {
                    XmlSchemaElement elem = (XmlSchemaElement)o;

                    XmlQualifiedName name = elem.SchemaTypeName;

                    result_tree branch;

                    // the lookup yields null for names the table does not contain
                    //  (e.g. built-in types such as xsd:string), so use "is" checks
                    object referto = xs.SchemaTypes[name];
                    if (referto is XmlSchemaComplexType)
                    {
                        branch = deref_complex(xs,(XmlSchemaComplexType)referto);
                        branch.nodevalue = elem.Name;
                    }
                    else if (referto is XmlSchemaSimpleType)
                    {
                        XmlSchemaSimpleType st = (XmlSchemaSimpleType)referto;

                        branch = new result_tree(elem.Name + " = " + deref_simple(st).ToString());
                    }
                    else
                    {
                        branch = null;
                    }
                    if( branch != null )
                        rt.children.Add(branch);

                }
            }
        }

        return rt;
    }

    /// <summary>
    /// The main entry point for the application.
    /// </summary>
    [STAThread]
    static void Main()
    {

        StreamReader sr = new StreamReader("aschema.xml");
        XmlSchema xs = XmlSchema.Read(sr, ValidationCallBack);
        XmlSchemaSet xss = new XmlSchemaSet();
        xss.Add(xs);
        xss.Compile();

        Console.WriteLine("Query: ");
        string q = Console.ReadLine();

        XmlQualifiedName xqn = new XmlQualifiedName(q);

        if (xs.SchemaTypes.Contains(xqn))
        {
            object o = xs.SchemaTypes[xqn];
            if (o.GetType() == typeof(XmlSchemaComplexType))
            {
                results = deref_complex(xs, (XmlSchemaComplexType)o);
                results.print_children();
            }   
        }
        else
        {
            Console.WriteLine("Not found!");
        }

    }
}