Forum Discussion

null null's avatar
null null
Copper Contributor
Jul 31, 2018

How do I extract annotation IDs in an XSD file using Excel formula?

I have a XSD file which contains a list of annotation IDs and documentation. I wanted to extract all the annotation IDs and documentations into a table. I am wondering whether I could write some excel formula to search and extract them.

 

XSD File Content: 

 

<xsd:annotation id="disney">
<xsd:documentation>This is the Disney.</xsd:documentation>
<xsd:annotation id="disney.mickeyFamily”>
<xsd:documentation>
This is the Mickey Family.
</xsd:documentation>
<xsd:attribute name="version" use="required">
<xsd:annotation id="disney.mickeyFamily.mickeyMouse”>
<xsd:documentation>
This is the Mickey Mouse.
</xsd:documentation>
<xsd:attribute name="version" use="required">
</xsd:annotation>
</xsd:annotation>
</xsd:annotation>

Expected Result:

Annotation ID                                     Documentation
disney This is the Disney.
disney.mickeyFamily This is the Mickey Family.
disney.mickeyFamily.mickeyMouse This is the Mickey Mouse.

 

1 Reply

  • SergeiBaklan's avatar
    SergeiBaklan
    Diamond Contributor

    Hi,

     

    Possible way is to use Power Query which has connector to XML files. First step is to generate XML schema for your XSD file, e.g. here http://xmlgrid.net/genXml.html

    I tried with your file, but received parser error, thus modified it a bit manually

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    
    <xsd:annotation id="disney">
    <xsd:documentation>This is the Disney.</xsd:documentation>
    </xsd:annotation>
    
    <xsd:annotation id="disney.mickeyFamily">
    <xsd:documentation>This is the Mickey Family.</xsd:documentation>
    <xsd:attribute name="version" use="required"/>
    </xsd:annotation>
    
    
    <xsd:annotation id="disney.mickeyFamily.mickeyMouse">
    <xsd:documentation>This is the Mickey Mouse.</xsd:documentation>
    <xsd:attribute name="version" use="required"/>
    </xsd:annotation>
    
    </xsd:schema>

    I'm not a specialist in XSD and XML and not sure how exactly that shall be.

    After that query above XML file and Power Query in couple of steps

    let
        Source = Xml.Tables(File.Contents("C:\Tests\test.xsd"), null, 1252),
        RemoveUnused = Table.SelectColumns(Source,{"Table"}),
        ExpandTable = Table.ExpandTableColumn(RemoveUnused, "Table", {"Attribute:id","documentation"}, {"Annotation ID", "Documentation"})
    in
        ExpandTable

    generates result like this

Resources