Parsing .xsd in Python

2024/9/20 11:25:06

I need to parse an .xsd file in Python as I would parse an XML file.
I am using libxml2.
I have to parse an XSD that looks as follows:

<xs:complexType name="ClassType">
  <xs:sequence>
    <xs:element name="IeplcHeader">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="device-number" type="xs:integer" fixed="1"/>
        </xs:sequence>
        <xs:attribute name="version" type="xs:integer" use="required" fixed="0"/>
      </xs:complexType>
    </xs:element>

When I access it with

doc.xpathEval('//xs:complexType/xs:sequence/xs:element[@name="IeplcHeader"]'):

it tells me that it cannot find the path.

However, if I remove all the xs: prefixes as follows

<complexType name="ClassType">
  <sequence>
    <element name="IeplcHeader">
      <complexType>
        <sequence>
          <element name="device-number" type="xs:integer" fixed="1"/>
        </sequence>
        <attribute name="version" type="xs:integer" use="required" fixed="0"/>
      </complexType>
    </element>

then this works:

doc.xpathEval('//complexType/sequence/element[@name="IeplcHeader"]'):

Does anyone know how I can get rid of this problem by fixing a prefix? Right now I am preprocessing the file to remove the xs:, but it's a horrible solution and I really hope to find a better one.

(I have not tried py-dom-xpath yet, and I do not know whether it would work even with the xs:.)

Thanks, ste

Answer

If you have to deal with XSD files, perhaps also using them to validate XML files, I suggest switching to lxml, which has good support for XML Schema files.

Example code:

from io import BytesIO
from lxml import etree
# A small schema; the xsd prefix is bound to the XML Schema namespace URI.
f = BytesIO(b'''\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="a" type="AType"/>
<xsd:complexType name="AType">
<xsd:sequence>
<xsd:element name="b" type="xsd:string" />
</xsd:sequence>
</xsd:complexType>
</xsd:schema>
''')
xmlschema_doc = etree.parse(f)
# Map the prefix used in the XPath expression to the namespace URI.
print(xmlschema_doc.xpath('xsd:element', namespaces={"xsd": "http://www.w3.org/2001/XMLSchema"}))

results in:

[<Element {http://www.w3.org/2001/XMLSchema}element at 0x9a17f2c>]
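
The same idea applies to the schema from the question: the xs: prefix in a path only means something once it is mapped to the namespace URI. Below is a minimal sketch using the standard library's xml.etree.ElementTree instead of lxml or libxml2 (a swapped-in library, not what the question uses). It assumes the fragment sits inside an xs:schema root that binds xs to http://www.w3.org/2001/XMLSchema; the wrapper element, the closing tags, and the helper name find_elements are hypothetical additions for illustration.

```python
import xml.etree.ElementTree as ET

XS_NS = "http://www.w3.org/2001/XMLSchema"

# The fragment from the question, wrapped in an ASSUMED xs:schema root
# that binds the xs prefix (the original snippet does not show its root).
XSD_TEXT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="ClassType">
    <xs:sequence>
      <xs:element name="IeplcHeader">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="device-number" type="xs:integer" fixed="1"/>
          </xs:sequence>
          <xs:attribute name="version" type="xs:integer" use="required" fixed="0"/>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def find_elements(xsd_text, name):
    """Hypothetical helper: return xs:element nodes with the given name attribute."""
    root = ET.fromstring(xsd_text)
    # The prefix used in the path is resolved through this mapping,
    # not through the prefix that happens to appear in the file.
    namespaces = {"xs": XS_NS}
    path = ".//xs:complexType/xs:sequence/xs:element[@name='%s']" % name
    return root.findall(path, namespaces)


matches = find_elements(XSD_TEXT, "IeplcHeader")
for element in matches:
    # Tags are reported as {namespace-uri}localname, as in the lxml output above.
    print(element.tag, element.get("name"))
```

Because the lookup goes through the URI, the prefix in the path does not have to match the one in the file; mapping "xsd" or any other prefix to the same URI would find the same elements, so there is no need to strip xs: from the file first.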