What's the difference between findall() and iterfind() of xml.etree.ElementTree?

2024/10/15 18:24:19

I wrote a program like the one below:

    import xml.etree.ElementTree as ET

    xmlroot = ET.fromstring(my_xml_content)  # my_xml_content: a string holding my XML
    for element in xmlroot.iterfind(".//mytag"):
        pass  # do something with element

It works fine on my Python (v2.7.1), but after I copied it to another computer with Python v2.6.x installed, iterfind() is not supported. The Python documentation lists the following descriptions:

findall(match)

Finds all matching subelements, by tag name or path. Returns a list containing all matching elements in document order.

iterfind(match)

Finds all matching subelements, by tag name or path. Returns an iterable yielding all matching elements in document order.

New in version 2.7.

My question is: are these two functions the same or not? What is the difference between them?

Answer

As indicated in the docs:

  1. findall returns the complete list of elements matching the match XPath expression. We can use subscripts to access them, for example:

    >>> root = ET.fromstring("<a><b>c</b></a>")
    >>> root.findall("./b")
    [<Element 'b' at 0x02048C90>]
    >>> lst = root.findall("./b")
    >>> lst[0]
    <Element 'b' at 0x02048C90>
    

We can also use a for loop to iterate through the list.

  2. iterfind returns an iterator (a generator) rather than a list. We cannot use subscripts to access the elements; we can only use it where iterators are accepted, for example in a for loop.
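
To mirror the findall example, here is a minimal sketch of how the object returned by iterfind behaves. The sample XML and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b>c</b><b>d</b></a>")
matches = root.iterfind("./b")

# Subscripting the iterator fails; only iteration is supported.
try:
    matches[0]
    subscript_ok = True
except TypeError:
    subscript_ok = False

first = next(matches)               # pull the first match on demand
rest = [el.text for el in matches]  # iterating consumes the remaining matches
leftover = list(matches)            # the iterator is now exhausted

print(subscript_ok, first.text, rest, leftover)
```

Note that the iterator can be consumed only once; call iterfind again (or use findall) if you need a second pass.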

iterfind can be faster than findall when you only want to iterate through the matches (which is most of the time, in my experience). findall has to build the complete list before returning, whereas iterfind finds (yields) the next matching element only when the iteration asks for it, that is, on each call to next() on the iterator (which is what a for loop or similar construct calls internally).

In cases where you want the list, both seem to have similar timing.

Performance test for both cases:
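
One situation where the on-demand behaviour matters is stopping at the first match that satisfies some condition. The following is only a sketch; `first_with_text` is a hypothetical helper, and the 1000-element document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Build a document with many <b> children: <a><b>0</b><b>1</b>...</a>
xml_text = "<a>" + "".join("<b>%d</b>" % i for i in range(1000)) + "</a>"
root = ET.fromstring(xml_text)


def first_with_text(elem, path, wanted):
    """Return the first element matching path whose text equals wanted."""
    for el in elem.iterfind(path):
        if el.text == wanted:
            # Returning here abandons the iterator, so the remaining
            # matches are never produced.
            return el
    return None


hit = first_with_text(root, "./b", "3")
miss = first_with_text(root, "./b", "no such text")
print(hit.text, miss)
```

With findall, the same loop would first collect all 1000 matches into a list, even though only the first four are inspected.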
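
Because both methods produce the same elements in the same order, code that also has to run on Python 2.6 (as in the question) could fall back to findall when iterfind is missing. This is a sketch under that assumption; `iter_matches` is a hypothetical helper name, not part of ElementTree.

```python
import xml.etree.ElementTree as ET


def iter_matches(elem, path):
    """Iterate over matches, using iterfind() when the element has it."""
    iterfind = getattr(elem, "iterfind", None)
    if iterfind is not None:
        return iterfind(path)
    # Fallback for versions without iterfind(): wrap the list in an iterator.
    return iter(elem.findall(path))


root = ET.fromstring("<a><b>c</b><b>d</b><b>e</b></a>")

as_list = root.findall("./b")
from_iter = list(iter_matches(root, "./b"))

# Both approaches return the very same element objects, in document order.
same = len(as_list) == len(from_iter) and all(
    x is y for x, y in zip(as_list, from_iter)
)
print(same, [el.text for el in from_iter])
```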

    In [1]: import xml.etree.ElementTree as ET

    In [2]: x = ET.fromstring('<a><b>c</b><b>d</b><b>e</b></a>')

    In [3]: def foo(root):
       ...:     d = root.findall('./b')
       ...:     for y in d:
       ...:         pass
       ...:

    In [4]: def foo1(root):
       ...:     d = root.iterfind('./b')
       ...:     for y in d:
       ...:         pass
       ...:

    In [5]: %timeit foo(x)
    100000 loops, best of 3: 9.24 µs per loop

    In [6]: %timeit foo1(x)
    100000 loops, best of 3: 6.65 µs per loop

    In [7]: def foo2(root):
       ...:     return root.findall('./b')
       ...:

    In [8]: def foo3(root):
       ...:     return list(root.iterfind('./b'))
       ...:

    In [9]: %timeit foo2(x)
    100000 loops, best of 3: 8.54 µs per loop

    In [10]: %timeit foo3(x)
    100000 loops, best of 3: 8.4 µs per loop
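
For anyone without IPython, roughly the same comparison can be run with the standard timeit module. This is a sketch, not the script used for the numbers above; the function names, `best_time`, and the repeat counts are chosen here for illustration, and the timings will vary by machine and Python version.

```python
import timeit
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b>c</b><b>d</b><b>e</b></a>")


def loop_findall():
    for _ in root.findall("./b"):
        pass


def loop_iterfind():
    for _ in root.iterfind("./b"):
        pass


def list_findall():
    return root.findall("./b")


def list_iterfind():
    return list(root.iterfind("./b"))


def best_time(func, number=10000, repeat=3):
    """Best average time per call, in seconds, over several repeats."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


results = {}
for func in (loop_findall, loop_iterfind, list_findall, list_iterfind):
    results[func.__name__] = best_time(func)

for name in sorted(results):
    print("%-14s %.2f us per call" % (name, results[name] * 1e6))
```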