Python Tip: Convert XML Tree To A Dictionary – Eric Scrivner


I was doing a simple XML integration with a SOAP service today, and it struck me that a lot of the data manipulation would be easier if the data were a dictionary. The XML returned was also guaranteed to be fairly small, with only a handful of schemas, so a full-blown SAX parser wasn’t really necessary: there was no risk of exhausting memory with the raw XML data. So I decided to write a simple recursive algorithm to do the conversion. I’m posting it here in the hope that it saves someone else a bit of time in the future:

 

import xml.etree.ElementTree

def make_dict_from_tree(element_tree):
    """Traverse the given XML element tree to convert it into a dictionary.

    :param element_tree: Root element of the XML tree
    :type element_tree: xml.etree.ElementTree.Element
    :rtype: dict
    """
    def internal_iter(tree, accum):
        """Recursively iterate through the elements of the tree accumulating
        a dictionary result.

        :param tree: The current XML element
        :type tree: xml.etree.ElementTree.Element
        :param accum: Dictionary into which data is accumulated
        :type accum: dict
        :rtype: dict
        """
        if tree is None:
            return accum

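        # Element.getchildren() was removed in Python 3.9; list(tree)
        # returns the same direct children.
        children = list(tree)
        if children:
            accum[tree.tag] = {}
            for each in children: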
                result = internal_iter(each, {})
                if each.tag in accum[tree.tag]:
                    if not isinstance(accum[tree.tag][each.tag], list):
                        accum[tree.tag][each.tag] = [
                            accum[tree.tag][each.tag]
                        ]
                    accum[tree.tag][each.tag].append(result[each.tag])
                else:
                    accum[tree.tag].update(result)
        else:
            accum[tree.tag] = tree.text

        return accum

    return internal_iter(element_tree, {})

# Usage: xml_string holds the raw XML document as a string.
make_dict_from_tree(xml.etree.ElementTree.fromstring(xml_string))

This seems to “Do The Right Thing”. For example, if you give it the following test data:

<DATA>
  <Items>
    <Item>
      <Name>Ha</Name>
      <Name>Hu</Name>
    </Item>
    <Item>
      <Name>Da</Name>
      <Name>Du</Name>
    </Item>
  </Items>
</DATA>

You get the following dictionary out:

{
  'DATA': {
    'Items': {
      'Item': [{'Name': ['Ha', 'Hu']}, {'Name': ['Da', 'Du']}]
    }
  }
}
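
One thing to keep in mind about this output shape: a tag only becomes a list when it repeats, so a lone <Item> comes back as a plain dict rather than a one-element list. A minimal sketch of one way to cope with that when reading the result, assuming a small helper (as_list is a hypothetical name, not part of the code above) and hand-written dictionaries in the shape the function produces:

```python
def as_list(value):
    """Return value unchanged if it is already a list, else wrap it in one."""
    if isinstance(value, list):
        return value
    return [value]

# Hand-written examples of the two shapes: one <Item> versus two.
one_item = {'Items': {'Item': {'Name': 'Ha'}}}
two_items = {'Items': {'Item': [{'Name': 'Ha'}, {'Name': 'Da'}]}}

for data in (one_item, two_items):
    # The same loop works whether 'Item' is a dict or a list of dicts.
    names = [item['Name'] for item in as_list(data['Items']['Item'])]
    print(names)
```

With a helper like this, calling code can iterate without checking which shape it received.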

 

NOTE: For the CS geeks out there, this does a post-order traversal of the XML tree. Also, this does not handle attributes, and any text on an element that has children is ignored.
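
If attributes do matter, here is a rough sketch of one possible extension. It is not the function above, and the conventions are my own assumptions for illustration: attributes are stored under keys prefixed with '@', and the text of a leaf element that has attributes goes under '#text'. The names element_to_value and make_dict_with_attributes are hypothetical.

```python
import xml.etree.ElementTree as ET

def element_to_value(element):
    """Convert one element to its text (plain leaf) or to a dict."""
    children = list(element)
    if not children and not element.attrib:
        # Plain leaf: same result as the original function.
        return element.text

    # Assumed convention: attribute names are prefixed with '@'.
    value = {'@' + name: attr for name, attr in element.attrib.items()}

    if not children:
        # Leaf with attributes: keep its text under the assumed '#text' key.
        if element.text is not None:
            value['#text'] = element.text
        return value

    for child in children:
        child_value = element_to_value(child)
        if child.tag in value:
            # Repeated tag: promote the existing entry to a list.
            if not isinstance(value[child.tag], list):
                value[child.tag] = [value[child.tag]]
            value[child.tag].append(child_value)
        else:
            value[child.tag] = child_value
    return value

def make_dict_with_attributes(root):
    """Wrap the converted root under its tag, like the original function."""
    return {root.tag: element_to_value(root)}

root = ET.fromstring('<Item id="1"><Name lang="en">Ha</Name><Name>Hu</Name></Item>')
print(make_dict_with_attributes(root))
# {'Item': {'@id': '1', 'Name': [{'@lang': 'en', '#text': 'Ha'}, 'Hu']}}
```

Because '@' and '#' cannot start an XML element name, these keys cannot collide with child tags. As in the original, text on elements that have children is still ignored.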

EDIT: There’s a pretty concise answer on StackOverflow, but the results it returns are different from what I wanted.