lxml tree connection and properties

2024/10/7 20:29:43

I have a .dtsx file containing multiple components, each with its own connections. I need to extract the components that use a specific connection, but I can't work out how. Example:

<components>
  <component refId="Component_1 Name" componentClassID="componentClassID" contactInfo="contactInfo" description="description" name="name" usesDispositions="true" version="6">
    <properties>
      <property dataType="System.String" description="description" expressionType="Notify" name="SqlCommandParam" UITypeEditor="UITypeEditor">QUERY THAT i NEED TO GET</property>
    </properties>
    <connections>
      <connection refId="Name" connectionManagerID="Package.ConnectionManagers[BI_SYNC]" connectionManagerRefId="Package.ConnectionManagers[BI_SYNC]" description="description" name="OleDbConnection" />
    </connections>
  </component>
  <component refId="Component_2 Name" componentClassID="componentClassID" contactInfo="contactInfo" description="description" name="PartnerService" usesDispositions="true" version="6">
    <properties>
      <property dataType="System.String" description="description" expressionType="Notify" name="SqlCommandParam" UITypeEditor="UITypeEditor">QUERY THAT I DONT NEED TO GET</property>
    </properties>
    <connections>
      <connection refId="Name" connectionManagerID="Package.ConnectionManagers[BI_STG]" connectionManagerRefId="Package.ConnectionManagers[BI_STG]" description="description" name="OleDbConnection" />
    </connections>
  </component>
</components>

I need to get the query where connectionManagerID="Package.ConnectionManagers[BI_SYNC]", but I can't work out how, because properties and connections are siblings at the same level.

The code I am using is roughly:

for cnt, element in enumerate(root.xpath(".//*")):
    if cnt == 0:
        file = root.attrib["{www.microsoft.com/SqlServer/Dts}ObjectName"]
        data["file_name"] = file + ".dtsx"
    if element.tag == con_tag:
        if element.attrib.get("{www.microsoft.com/SqlServer/Dts}ObjectName"):
            if element.attrib.get("{www.microsoft.com/SqlServer/Dts}ObjectName", None) == "BI_SYNC":
                conn_name = element.attrib.get("{www.microsoft.com/SqlServer/Dts}ObjectName", None)
                conn_dtsid = element.attrib.get("{www.microsoft.com/SqlServer/Dts}DTSID", None)
                data["conn_name"] = conn_name
                data["conn_dtsid"] = conn_dtsid
    if element.tag == exec_tag:
        for cnt_0, element_0 in enumerate(element):
            if element_0.tag == execs_tag:
                for cnt_1, element_1 in enumerate(element_0):  # Get package name

Answer

Since properties and connections are both children of the same component, you can use xpath to select the component based on the connection, then select the property.

So instead of a lot of nested if and for statements, try something like...

from lxml import etree

xml = """<root xmlns="www.microsoft.com/SqlServer/Dts">
<components>
  <component refId="Component_1 Name" componentClassID="componentClassID" contactInfo="contactInfo" description="description" name="name" usesDispositions="true" version="6">
    <properties>
      <property dataType="System.String" description="description" expressionType="Notify" name="SqlCommandParam" UITypeEditor="UITypeEditor">QUERY THAT i NEED TO GET</property>
    </properties>
    <connections>
      <connection refId="Name" connectionManagerID="Package.ConnectionManagers[BI_SYNC]" connectionManagerRefId="Package.ConnectionManagers[BI_SYNC]" description="description" name="OleDbConnection"/>
    </connections>
  </component>
  <component refId="Component_2 Name" componentClassID="componentClassID" contactInfo="contactInfo" description="description" name="PartnerService" usesDispositions="true" version="6">
    <properties>
      <property dataType="System.String" description="description" expressionType="Notify" name="SqlCommandParam" UITypeEditor="UITypeEditor">QUERY THAT I DONT NEED TO GET</property>
    </properties>
    <connections>
      <connection refId="Name" connectionManagerID="Package.ConnectionManagers[BI_STG]" connectionManagerRefId="Package.ConnectionManagers[BI_STG]" description="description" name="OleDbConnection"/>
    </connections>
  </component>
</components>
</root>
"""

root = etree.fromstring(xml)

# Map a prefix to the DTS namespace so the XPath can use dts:tag names.
ns = {"dts": "www.microsoft.com/SqlServer/Dts"}

# Select each component that has a BI_SYNC connection, then its properties.
for property_elem in root.xpath(
    ".//dts:component[dts:connections/dts:connection"
    "[@connectionManagerID='Package.ConnectionManagers[BI_SYNC]']]"
    "/dts:properties/dts:property",
    namespaces=ns,
):
    # with_tail=False keeps the indentation that follows the element out of the output.
    print(etree.tostring(property_elem, with_tail=False).decode())

This outputs the following to show that it selects the correct property...

<property xmlns="www.microsoft.com/SqlServer/Dts" dataType="System.String" description="description" expressionType="Notify" name="SqlCommandParam" UITypeEditor="UITypeEditor">QUERY THAT i NEED TO GET</property>
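
If only the query text is wanted rather than the serialized element, one option is to end the XPath with text() and pass the connection ID in as an XPath variable. This is a minimal sketch under some assumptions: the helper name queries_for_connection, the trimmed-down XML, and the SELECT 1 / SELECT 2 placeholder queries are all made up for illustration, and the filter on name='SqlCommandParam' assumes that is the property holding the query, as in the sample above.

```python
from lxml import etree

# Trimmed-down, hypothetical stand-in for the package XML (same namespace as above).
XML = """<root xmlns="www.microsoft.com/SqlServer/Dts"><components>
<component name="a">
  <properties><property name="SqlCommandParam">SELECT 1</property></properties>
  <connections><connection connectionManagerID="Package.ConnectionManagers[BI_SYNC]"/></connections>
</component>
<component name="b">
  <properties><property name="SqlCommandParam">SELECT 2</property></properties>
  <connections><connection connectionManagerID="Package.ConnectionManagers[BI_STG]"/></connections>
</component>
</components></root>"""

NS = {"dts": "www.microsoft.com/SqlServer/Dts"}


def queries_for_connection(root, connection_id):
    """Return the SqlCommandParam text of every component using connection_id.

    Hypothetical helper: $conn is an XPath variable that lxml fills in from
    the conn= keyword argument, so the ID does not have to be quoted by hand.
    """
    return root.xpath(
        ".//dts:component[dts:connections/dts:connection"
        "[@connectionManagerID=$conn]]"
        "/dts:properties/dts:property[@name='SqlCommandParam']/text()",
        namespaces=NS,
        conn=connection_id,
    )


root = etree.fromstring(XML)
queries = queries_for_connection(root, "Package.ConnectionManagers[BI_SYNC]")
print(queries)
```

The result is a list of strings, one per matching property, so a component with several matching properties (or several components on the same connection) yields several entries.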

A couple of notes...

  • I added a root element with the default namespace so that my root variable would work similarly to what you already have.
  • I used the namespaces kwarg so I could use a prefix in my XPath instead of Clark notation. (Cleaner, in my opinion.)
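
For comparison, here is a rough sketch of the same lookup written with Clark notation ({namespace}tag) and lxml's iter/find methods instead of a prefixed XPath. The variable names, the trimmed-down XML, and the SELECT 1 / SELECT 2 placeholder queries are assumptions for illustration only. Note that xpath() itself does not accept Clark notation, so this version filters in Python.

```python
from lxml import etree

# Trimmed-down, hypothetical stand-in for the package XML.
XML = """<root xmlns="www.microsoft.com/SqlServer/Dts"><components>
<component name="a">
  <properties><property name="SqlCommandParam">SELECT 1</property></properties>
  <connections><connection connectionManagerID="Package.ConnectionManagers[BI_SYNC]"/></connections>
</component>
<component name="b">
  <properties><property name="SqlCommandParam">SELECT 2</property></properties>
  <connections><connection connectionManagerID="Package.ConnectionManagers[BI_STG]"/></connections>
</component>
</components></root>"""

# Clark notation prefix: the namespace URI wrapped in braces.
DTS = "{www.microsoft.com/SqlServer/Dts}"
TARGET = "Package.ConnectionManagers[BI_SYNC]"

root = etree.fromstring(XML)
clark_queries = []

for component in root.iter(DTS + "component"):
    # Collect the connection IDs of this component (attributes carry no namespace).
    connection_ids = [
        conn.get("connectionManagerID")
        for conn in component.findall(f"{DTS}connections/{DTS}connection")
    ]
    if TARGET not in connection_ids:
        continue
    # The component matches, so keep the text of each of its properties.
    for prop in component.findall(f"{DTS}properties/{DTS}property"):
        clark_queries.append(prop.text)

print(clark_queries)
```

The loop does the same job as the predicate in the XPath version, but the namespace URI has to be repeated on every tag, which is why the prefixed form tends to read more cleanly.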
