XML in Go
Decoding
In this section we’ll examine the rules of the xml decoder and provide examples for each.
1. If the struct has a field of type []byte or string with tag “,innerxml”, Unmarshal accumulates the raw XML nested inside the element in that field. The rest of the rules still apply.
x := `<address>
<street>123 Main St</street>
</address>`
type Address struct {
Contents string `xml:",innerxml"`
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Println(addr.Contents)
// <street>123 Main St</street>
2. If the struct has a field named XMLName of type Name, Unmarshal records the element name in that field.
x := `<address>
<street>123 Main St</street>
</address>`
type Address struct {
XMLName xml.Name
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%+v", addr.XMLName.Local)
// address
3. If the XMLName field has an associated tag of the form “name” or “namespace-URL name”, the XML element must have the given name (and, optionally, name space) or else Unmarshal returns an error.
x := `<address>
<street>123 Main St</street>
</address>`
type Address struct {
XMLName xml.Name `xml:"city"`
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
//expected element type <city> but have <address>
}
4. If the XML element has an attribute whose name matches a struct field name with an associated tag containing “,attr” or the explicit name in a struct field tag of the form “name,attr”, Unmarshal records the attribute value in that field.
x := `<address id="999">
<street>123 Main St</street>
</address>`
type Address struct {
Id string `xml:"id,attr"`
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Println(addr.Id)
//999
5. If the XML element has an attribute not handled by the previous rule and the struct has a field with an associated tag containing “,any,attr”, Unmarshal records the attribute value in the first such field.
x := `<address id="999" span="888">
<street>123 Main St</street>
</address>`
type Address struct {
Id string `xml:"id,attr"`
OtherAttr string `xml:",any,attr"`
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%+v", addr)
// {Id:999 OtherAttr:888}
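A string field only has room for one value, so when several attributes are left unhandled it helps to collect them all. The sketch below assumes a field of type []xml.Attr tagged “,any,attr”; the zone attribute and the decodeAddress helper are made up for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// Address keeps the id attribute in its own field and collects every
// attribute not claimed by another field in Other.
type Address struct {
	Id    string     `xml:"id,attr"`
	Other []xml.Attr `xml:",any,attr"`
}

// decodeAddress is a hypothetical helper used only in this sketch.
func decodeAddress(doc string) (Address, error) {
	var addr Address
	err := xml.Unmarshal([]byte(doc), &addr)
	return addr, err
}

func main() {
	x := `<address id="999" span="888" zone="7">
<street>123 Main St</street>
</address>`
	addr, err := decodeAddress(x)
	if err != nil {
		log.Fatal(err)
	}
	// Each leftover attribute is appended in document order.
	for _, a := range addr.Other {
		fmt.Printf("%s=%s\n", a.Name.Local, a.Value)
	}
}
```

Here Other should end up holding the span and zone attributes, while id still goes to the Id field.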
6. If the XML element contains character data, that data is accumulated in the first struct field that has tag “,chardata”. The struct field may have type []byte or string. If there is no such field, the character data is discarded.
x := `<address>
<street>123 Main St</street>
</address>`
type Address struct {
Street struct {
Text string `xml:",chardata"`
} `xml:"street"`
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Println(addr.Street.Text)
7. If the XML element contains comments, they are accumulated in the first struct field that has tag “,comment”. The struct field may have type []byte or string. If there is no such field, the comments are discarded.
x := `<address>
<!-- an xml comment -->
<street>123 Main St</street>
</address>`
type Address struct {
Comment string `xml:",comment"`
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Println(addr.Comment)
// an xml comment
8. If the XML element contains a sub-element whose name matches the prefix of a tag formatted as “a” or “a>b>c”, unmarshal will descend into the XML structure looking for elements with the given names, and will map the innermost elements to that struct field. A tag starting with “>” is equivalent to one starting with the field name followed by “>”.
This one is worth dividing into several examples. The most basic case is a sub-element whose name matches a tag.
x := `<address>
<street>123 Main St</street>
</address>`
type Address struct {
Street string `xml:"street"`
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%+v", addr.Street)
Now let’s see how we can deal with deeper nestings.
<address>
<street>
<value> 123 Main St</value>
</street>
</address>
Option 1 is to follow the rules we have already seen and use a struct.
x := `<address>
<street>
<value> 123 Main St</value>
</street>
</address>`
type Street struct {
Value string `xml:"value"`
}
type Address struct {
Street Street `xml:"street"`
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%+v", addr.Street)
Option 2 is to descend the xml using > in the struct tag.
x := `<address>
<street>
<value>123 Main St</value>
</street>
</address>`
type Address struct {
Street string `xml:"street>value"`
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Println(addr.Street)
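Rule 8 also says that a tag starting with “>” borrows the field name as the parent element. A minimal sketch of that case follows. It assumes the wrapper element is spelled exactly like the Go field (Phones, matched case-sensitively); the phone data and the decodePhones helper are invented for the example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// Address uses a tag that starts with ">", so the parent element name is
// taken from the field name: ">phone" behaves like "Phones>phone".
type Address struct {
	Phones []string `xml:">phone"`
}

// decodePhones is a hypothetical helper used only in this sketch.
func decodePhones(doc string) ([]string, error) {
	var addr Address
	if err := xml.Unmarshal([]byte(doc), &addr); err != nil {
		return nil, err
	}
	return addr.Phones, nil
}

func main() {
	x := `<address>
<Phones>
<phone>555-0100</phone>
<phone>555-0101</phone>
</Phones>
</address>`
	phones, err := decodePhones(x)
	if err != nil {
		log.Fatal(err)
	}
	// Every <phone> under <Phones> is appended to the slice.
	fmt.Println(phones)
}
```

If the wrapper were written as a lowercase phones element, this tag would not match and the slice would stay empty, so the explicit “phones>phone” form is usually the safer choice.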
The same approach works for qualified names. You might see xml like this:
<address>
<ns:street>123 Main St</ns:street>
</address>
and be tempted to write a struct like this:
type Address struct {
Street string `xml:"ns:street"`
}
That won’t decode correctly! You can leave the xml tag as street, or if for some reason you need the prefix, replace the colon with a space:
type Address struct {
Street string `xml:"ns street"`
}
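The snippet above has no xmlns declaration for the ns prefix. When a document does bind the prefix to a namespace, the decoder reports the namespace URL rather than the prefix, so the part of the tag before the space needs to be that URL. The sketch below compares both tag styles; the URL http://example.com/ns and the decodeStreets helper are placeholders invented for the example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// ByLocalName matches any <street> element, whatever its namespace.
type ByLocalName struct {
	Street string `xml:"street"`
}

// ByNamespace matches <street> only in the given namespace.
// The URL is a placeholder chosen for this sketch.
type ByNamespace struct {
	Street string `xml:"http://example.com/ns street"`
}

// decodeStreets is a hypothetical helper that decodes the same document
// with both struct definitions.
func decodeStreets(doc string) (local, namespaced string, err error) {
	var a ByLocalName
	if err = xml.Unmarshal([]byte(doc), &a); err != nil {
		return "", "", err
	}
	var b ByNamespace
	if err = xml.Unmarshal([]byte(doc), &b); err != nil {
		return "", "", err
	}
	return a.Street, b.Street, nil
}

func main() {
	x := `<address xmlns:ns="http://example.com/ns">
<ns:street>123 Main St</ns:street>
</address>`
	local, namespaced, err := decodeStreets(x)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(local)
	fmt.Println(namespaced)
}
```

Both fields should receive the street here. The namespaced tag is the stricter of the two: it skips a street element that lives in a different namespace.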
9. If the XML element contains a sub-element whose name matches a struct field’s XMLName tag and the struct field has no explicit name tag as per the previous rule, unmarshal maps the sub-element to that struct field.
The xml struct tag can be left off for nested structs that contain an XMLName field with a tag.
x := `<address>
<street>
<value> 123 Main St</value>
</street>
</address>`
type Street struct {
XMLName xml.Name `xml:"street"`
Value string `xml:"value"`
}
type Address struct {
Street Street //no xml tag needed
}
var addr Address
err := xml.Unmarshal([]byte(x), &addr)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%+v", addr.Street)
Writing a custom Unmarshaler
This example takes the value of an xml element, parses it as a date, and separates it into three struct fields.
type Time struct {
Date Date `xml:"date"`
}
type Date struct {
Year int
Month time.Month
Day int
}
func (date *Date) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
var s string
err := d.DecodeElement(&s, &start)
if err != nil {
return err
}
t, err := time.Parse("2006-01-02", s)
if err != nil {
return err
}
date.Year = t.Year()
date.Month = t.Month()
date.Day = t.Day()
return nil
}
func main() {
x := `<time>
<date>2025-02-14</date>
</time>`
var t Time
err := xml.Unmarshal([]byte(x), &t)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%+v", t.Date)
}
Debugging
Manually inspecting how an element is tokenized can be helpful when a document isn’t decoding the way you expect.
x := `<address>
<street>123 Main St</street>
</address>`
d := xml.NewDecoder(bytes.NewBuffer([]byte(x)))
for {
token, err := d.Token()
if err == io.EOF {
break
}
if err != nil {
// stop on malformed input instead of looping forever
log.Fatal(err)
}
fmt.Printf("%T, %+v\n", token, token)
}
Dealing with large xml files
The honeymoon phase of xml in Go ends when you start dealing with deeply nested xml. Do you write a struct per layer, or do you write overly long struct tags to get the one field you need?
x := `<layer1>
<layer2>
<layer3>
<layer4>Hello, World!</layer4>
</layer3>
</layer2>
</layer1>`
type Layer1 struct {
Layer2 Layer2 `xml:"layer2"`
}
type Layer2 struct {
Layer3 Layer3 `xml:"layer3"`
}
type Layer3 struct {
Layer4 string `xml:"layer4"`
}
// alternatively
type PathedLayer4 struct {
Msg string `xml:"layer2>layer3>layer4"`
}
There are libraries that can simplify either approach.