Getting text between xml tags with minidom [duplicate]

2024/9/8 10:14:33

I have this sample XML document snippet:

<root><foo>bar</foo><foo>baz</foo>
</root>

I'm using Python's minidom module from xml.dom, and I am reading in tags with getElementsByTagName("foo"). How do I get the text between the tags? And if the tags were nested, how would I get the text inside those?

Answer

To get the text out, join the data of the text nodes that are direct children of each element:

import xml.dom.minidom

document = "<root><foo>bar</foo><foo>baby</foo></root>"
dom = xml.dom.minidom.parseString(document)

def getText(nodelist):
    # Join the data of every text node in the given node list.
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return ''.join(rc)

def handleTok(tokenlist):
    # Collect the text of each element, each preceded by a space.
    texts = ""
    for token in tokenlist:
        texts += " " + getText(token.childNodes)
    return texts

foo = dom.getElementsByTagName("foo")
text = handleTok(foo)
print(text)

The Python documentation has a good example: http://docs.python.org/library/xml.dom.minidom.html

EDIT: For nested tags, check the example on the site.
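
For nested tags, one possible approach is to walk the child nodes recursively and join every text node found along the way. This is only a sketch, not the example from the documentation: the helper name get_all_text and the nested sample document are made up for illustration.

```python
import xml.dom.minidom

def get_all_text(node):
    """Concatenate the text of node and all its descendants, in document order."""
    parts = []
    for child in node.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            # Plain text or CDATA: take its data directly.
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            # Nested element: recurse into it.
            parts.append(get_all_text(child))
    return "".join(parts)

# Hypothetical sample with a tag nested inside <foo>.
nested = "<root><foo>bar<inner>baz</inner>qux</foo></root>"
dom = xml.dom.minidom.parseString(nested)
foo = dom.getElementsByTagName("foo")[0]
nested_text = get_all_text(foo)
print(nested_text)
```

Note that getElementsByTagName itself already searches all descendants, so nested foo elements are returned too; the recursion is only needed to reach text that sits inside child elements.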

