Question 1

xmltodict converts XML to a Python dictionary and supports namespaces. I can follow the example on the homepage and successfully remove a namespace, but I cannot remove the namespace from my own XML and cannot work out why. Here is my XML:

<?xml version="1.0" encoding="UTF-8"?>
<status xmlns:mystatus="http://localhost/mystatus">
<section1 mystatus:field1="data1" mystatus:field2="data2" />
<section2 mystatus:lineA="outputA" mystatus:lineB="outputB" />
</status>

And using:

xmltodict.parse(xml, process_namespaces=True, namespaces={'http://localhost/mystatus': None})

I get:

OrderedDict([(u'status', OrderedDict([(u'section1', OrderedDict([(u'@http://localhost/mystatus:field1', u'data1'), (u'@http://localhost/mystatus:field2', u'data2')])), (u'section2', OrderedDict([(u'@http://localhost/mystatus:lineA', u'outputA'), (u'@http://localhost/mystatus:lineB', u'outputB')]))]))])

instead of:

OrderedDict([(u'status', OrderedDict([(u'section1', OrderedDict([(u'@field1', u'data1'), (u'@field2', u'data2')])), (u'section2', OrderedDict([(u'@lineA', u'outputA'), (u'@lineB', u'outputB')]))]))])

Am I making some simple mistake, or is there something about my XML that prevents the process_namespaces option from working correctly?

Question 2

xmltodict is based on expat, so the namespace should be applied to the element names, not the attribute names:

<?xml version="1.0" encoding="UTF-8"?>
<status xmlns:mystatus="http://localhost/mystatus">
<mystatus:section1 field1="data1" field2="data2" />
<mystatus:section2 lineA="outputA" lineB="outputB" />
</status>

When parsed with:

foo = xmltodict.parse(xml, process_namespaces=True, namespaces={'http://localhost/mystatus': None})

the output is:

OrderedDict([(u'status', OrderedDict([(u'section1', OrderedDict([(u'@field1', u'data1'), (u'@field2', u'data2')])), (u'section2', OrderedDict([(u'@lineA', u'outputA'), (u'@lineB', u'outputB')]))]))])

Accessing it is easy:

# Get attribute 'lineA' of element 'section2' inside element 'status'
>>> foo.get('status').get('section2').get('@lineA')
u'outputA'
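
For context on the keys in the question's output, here is a minimal sketch of what expat itself reports when namespace processing is on. It uses only the standard-library `xml.parsers.expat` module; `expanded_names` is a hypothetical helper written for this illustration, and the XML is a shortened copy of the question's document. It only shows the expat side. Whether xmltodict then applies the `namespaces` mapping to attribute names may depend on the xmltodict version, so check that against your installed release.

```python
from xml.parsers import expat

# Shortened copy of the XML from the question (prefixed attributes).
XML = (
    '<status xmlns:mystatus="http://localhost/mystatus">'
    '<section1 mystatus:field1="data1" mystatus:field2="data2" />'
    '</status>'
)


def expanded_names(xml_text, separator=":"):
    """Return (element name, attribute dict) pairs as expat reports them.

    Hypothetical helper for illustration; not part of xmltodict.
    """
    seen = []
    # Passing namespace_separator turns on expat's namespace processing.
    parser = expat.ParserCreate(namespace_separator=separator)
    parser.StartElementHandler = lambda name, attrs: seen.append((name, dict(attrs)))
    parser.Parse(xml_text, True)
    return seen


for name, attrs in expanded_names(XML):
    print(name, attrs)
```

The prefixed attributes come back keyed by the full namespace URI plus the local name, which is the same shape as the `@http://localhost/mystatus:field1` keys in the question.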

Attribute namespaces are only needed when an element carries several attributes with the same local name (e.g. multiple ids or multiple prices). In that case I couldn't get expat or xmltodict to parse it correctly, though YMMV.
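
If the XML itself can't be changed, one workaround is to post-process the parsed result. The following is a rough sketch, assuming the parsed structure has the shape shown in the question (nested dicts, attribute keys prefixed with `@` and the namespace URI). `strip_namespace` is a hypothetical helper, not an xmltodict function, and it returns plain dicts rather than OrderedDicts.

```python
NS = "http://localhost/mystatus"


def strip_namespace(node, uri):
    """Recursively drop 'uri:' from dict keys, keeping a leading '@'.

    Hypothetical helper for illustration; not part of xmltodict.
    """
    prefix = uri + ":"
    if isinstance(node, dict):
        cleaned = {}
        for key, value in node.items():
            # Attribute keys start with '@'; keep that marker in place.
            marker = "@" if key.startswith("@") else ""
            bare = key[len(marker):]
            if bare.startswith(prefix):
                bare = bare[len(prefix):]
            cleaned[marker + bare] = strip_namespace(value, uri)
        return cleaned
    if isinstance(node, list):
        # Repeated elements are represented as lists of dicts.
        return [strip_namespace(item, uri) for item in node]
    return node


# Same shape as the output in the question, written as plain dicts.
parsed = {
    "status": {
        "section1": {
            "@http://localhost/mystatus:field1": "data1",
            "@http://localhost/mystatus:field2": "data2",
        },
        "section2": {
            "@http://localhost/mystatus:lineA": "outputA",
            "@http://localhost/mystatus:lineB": "outputB",
        },
    }
}

print(strip_namespace(parsed, NS))
```

Note that if a namespaced and an un-namespaced attribute share the same local name on one element, they collapse to the same key and the later one wins, so this is only safe when attribute names are unique per element.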

Remove namespace with xmltodict in Python
