Remove namespace with xmltodict in Python

2024/9/27 23:30:09

xmltodict converts XML to a Python dictionary. It supports namespaces. I can follow the example on the homepage and successfully remove a namespace. However, I cannot remove the namespace from my XML and cannot identify why? Here is my XML:

<?xml version="1.0" encoding="UTF-8"?>
<status xmlns:mystatus="http://localhost/mystatus">
<section1mystatus:field1="data1"mystatus:field2="data2" />
<section2mystatus:lineA="outputA"mystatus:lineB="outputB" />
</status>

And using:

xmltodict.parse(xml,process_namespaces=True,namespaces={'http://localhost/mystatus':None})

I get:

OrderedDict([(u'status', OrderedDict([(u'section1', OrderedDict([(u'@http://localhost/mystatus:field1', u'data1'), (u'@http://localhost/mystatus:field2', u'data2')])), (u'section2', OrderedDict([(u'@http://localhost/mystatus:lineA', u'outputA'), (u'@http://localhost/mystatus:lineB', u'outputB')]))]))])

instead of:

OrderedDict([(u'status', OrderedDict([(u'section1', OrderedDict([(u'field1', u'data1'), (u'field2', u'data2')])), (u'section2', OrderedDict([(u'lineA', u'outputA'), (u'@lineB', u'outputB')]))]))])

Am I making some simple mistake, or is there something about my XML that prevents the process_namespace modification from working correctly?

Answer

xmltodict is based on expat, so namespaces should applied to the class name, not attribute names:

<?xml version="1.0" encoding="UTF-8"?>
<status xmlns:mystatus="http://localhost/mystatus"><mystatus:section1 field1="data1" field2="data2" /><mystatus:section2 lineA="outputA" lineB="outputB" />
</status>

When parsed with:

foo = xmltodict.parse(xml,process_namespaces=True,namespaces={'http://localhost/mystatus':None})

outputs:

OrderedDict([(u'status', OrderedDict([(u'section1', OrderedDict([(u'@field1', u'data1'), (u'@field2', u'data2')])), (u'section2', OrderedDict([(u'@lineA', u'outputA'), (u'@lineB', u'outputB')]))]))])

Accessing it is easy:

# Get attribute 'lineA' from class 'section2' from class 'status'
>>> foo.get('status').get('section2').get('@lineA')
u'outputA'

Attribute namespaces are only required when you have multiple attributes of the same name (e.g. multiple id's or multiple prices, etc), in which case, I couldn't get expat or xmltodict to parse it correctly. YMMV though.

https://en.xdnf.cn/q/71405.html

Related Q&A

Groupby count only when a certain value is present in one of the column in pandas

I have a dataframe similar to the below mentioned database:+------------+-----+--------+| time | id | status |+------------+-----+--------+| 1451606400 | id1 | Yes || 1451606400 | id1 | Yes …

how to save tensorflow model to pickle file

I want to save a Tensorflow model and then later use it for deployment purposes. I dont want to use model.save() to save it because my purpose is to somehow pickle it and use it in a different system w…

PySide2 Qt3D mesh does not show up

Im diving into Qt3D framework and have decided to replicate a simplified version of this c++ exampleUnfortunately, I dont see a torus mesh on application start. Ive created all required entities and e…

Unable to import module lambda_function: No module named psycopg2._psycopg aws lambda function

I have installed the psycopg2 with this command in my package folder : pip install --target ./package psycopg2 # Or pip install -t ./package psycopg2now psycopg2 module is in my package and I have crea…

RestrictedPython: Call other functions within user-specified code?

Using Yuri Nudelmans code with the custom _import definition to specify modules to restrict serves as a good base but when calling functions within said user_code naturally due to having to whitelist e…

TypeError: object of type numpy.int64 has no len()

I am making a DataLoader from DataSet in PyTorch. Start from loading the DataFrame with all dtype as an np.float64result = pd.read_csv(dummy.csv, header=0, dtype=DTYPE_CLEANED_DF)Here is my dataset cla…

VS Code Pylance not highlighting variables and modules

Im using VS Code with the Python and Pylance extensions. Im having a problem with the Pylance extension not doing syntax highlight for things like modules and my dataframe. I would expect the modules…

How to compute Spearman correlation in Tensorflow

ProblemI need to compute the Pearson and Spearman correlations, and use it as metrics in tensorflow.For Pearson, its trivial :tf.contrib.metrics.streaming_pearson_correlation(y_pred, y_true)But for Spe…

Pytorch loss is nan

Im trying to write my first neural network with pytorch. Unfortunately, I encounter a problem when I want to get the loss. The following error message: RuntimeError: Function LogSoftmaxBackward0 return…

How do you debug python code with kubernetes and skaffold?

I am currently running a django app under python3 through kubernetes by going through skaffold dev. I have hot reload working with the Python source code. Is it currently possible to do interactive deb…