How do I extract annotation IDs in an XSD file using Excel formula?

Copper Contributor

I have a XSD file which contains a list of annotation IDs and documentation. I wanted to extract all the annotation IDs and documentations into a table. I am wondering whether I could write some excel formula to search and extract them.

 

XSD File Content: 

 

<xsd:annotation id="disney">
<xsd:documentation>This is the Disney.</xsd:documentation>
<xsd:annotation id="disney.mickeyFamily”>
<xsd:documentation>
This is the Mickey Family.
</xsd:documentation>
<xsd:attribute name="version" use="required">
<xsd:annotation id="disney.mickeyFamily.mickeyMouse”>
<xsd:documentation>
This is the Mickey Mouse.
</xsd:documentation>
<xsd:attribute name="version" use="required">
</xsd:annotation>
</xsd:annotation>
</xsd:annotation>

Expected Result:

Annotation ID                                     Documentation
disney This is the Disney.
disney.mickeyFamily This is the Mickey Family.
disney.mickeyFamily.mickeyMouse This is the Mickey Mouse.

 

1 Reply

Hi,

 

Possible way is to use Power Query which has connector to XML files. First step is to generate XML schema for your XSD file, e.g. here http://xmlgrid.net/genXml.html

I tried with your file, but received parser error, thus modified it a bit manually

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

<xsd:annotation id="disney">
<xsd:documentation>This is the Disney.</xsd:documentation>
</xsd:annotation>

<xsd:annotation id="disney.mickeyFamily">
<xsd:documentation>This is the Mickey Family.</xsd:documentation>
<xsd:attribute name="version" use="required"/>
</xsd:annotation>


<xsd:annotation id="disney.mickeyFamily.mickeyMouse">
<xsd:documentation>This is the Mickey Mouse.</xsd:documentation>
<xsd:attribute name="version" use="required"/>
</xsd:annotation>

</xsd:schema>

I'm not a specialist in XSD and XML and not sure how exactly that shall be.

After that query above XML file and Power Query in couple of steps

let
    Source = Xml.Tables(File.Contents("C:\Tests\test.xsd"), null, 1252),
    RemoveUnused = Table.SelectColumns(Source,{"Table"}),
    ExpandTable = Table.ExpandTableColumn(RemoveUnused, "Table", {"Attribute:id","documentation"}, {"Annotation ID", "Documentation"})
in
    ExpandTable

generates result like this

image.png