AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'encode'

2024/11/13 9:31:29

I'm trying to make a desktop notifier, and for that I'm scraping news from a site. When I run the program, I get the following error.

news[child.tag] = child.encode('utf8')
AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'encode'

How do I resolve it? I'm completely new to this. I tried searching for solutions, but none of them worked for me.

Here is my code:

import requests
import xml.etree.ElementTree as ET

# url of news rss feed
RSS_FEED_URL = "http://www.hindustantimes.com/rss/topnews/rssfeed.xml"

def loadRSS():
    '''utility function to load RSS feed'''
    # create HTTP request response object
    resp = requests.get(RSS_FEED_URL)
    # return response content
    return resp.content

def parseXML(rss):
    '''utility function to parse XML format rss feed'''
    # create element tree root object
    root = ET.fromstring(rss)
    # create empty list for news items
    newsitems = []
    # iterate news items
    for item in root.findall('./channel/item'):
        news = {}
        # iterate child elements of item
        for child in item:
            # special checking for namespace object content:media
            if child.tag == '{http://search.yahoo.com/mrss/}content':
                news['media'] = child.attrib['url']
            else:
                news[child.tag] = child.encode('utf8')
        newsitems.append(news)
    # return news items list
    return newsitems

def topStories():
    '''main function to generate and return news items'''
    # load rss feed
    rss = loadRSS()
    # parse XML
    newsitems = parseXML(rss)
    return newsitems

Answer

You're trying to convert a str to bytes, and then store those bytes in a dictionary. The problem is that the object you're doing this to is an xml.etree.ElementTree.Element, not a str.

You probably meant to get the text from within or around that element, and then encode() that. The docs suggest using the itertext() method:

''.join(child.itertext())

This will evaluate to a str, which you can then encode().
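As a minimal sketch of what itertext() yields, assuming a small made-up element (the XML snippet and variable names are illustrative, not taken from the feed):

```python
import xml.etree.ElementTree as ET

# Hypothetical element with nested markup, for illustration only.
element = ET.fromstring("<title>Top <b>news</b> today</title>")

# itertext() walks the element and its descendants, yielding each text
# fragment in document order; joining them gives one str.
text = "".join(element.itertext())
print(text)  # Top news today
```

Calling child.text here would return only "Top ", which is why joining itertext() is the safer choice when an element may contain nested tags.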

Note that the text and tail attributes might not contain text (emphasis added):

Their values are usually strings but may be any application-specific object.

If you want to use those attributes, you'll have to handle None or non-string values:

head = '' if child.text is None else str(child.text)
tail = '' if child.tail is None else str(child.tail)
# Do something with head and tail...
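The same None handling can be wrapped in a small helper. This is only a sketch: safe_text is an invented name, not part of ElementTree, and the sample XML is made up to show an element whose text and tail are both None.

```python
import xml.etree.ElementTree as ET

def safe_text(value):
    """Return value as a str, mapping None to an empty string."""
    return "" if value is None else str(value)

# Hypothetical item: <title> has text and a tail, <empty/> has neither.
item = ET.fromstring("<item><title>Headline</title> trailing<empty/></item>")
title, empty = item[0], item[1]

head = safe_text(title.text)        # text inside <title>
tail = safe_text(title.tail)        # text after </title>, before <empty/>
nothing = safe_text(empty.text)     # empty.text is None, so this is ''
```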

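Even this is not really enough. If text or tail contain bytes objects, str() on Python 3 does not decode them; it returns their repr (for example "b'abc'"). Decoding them yourself with an unexpected (or plain wrong) encoding can raise a UnicodeDecodeError.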

Strings versus Bytes

I suggest leaving the text as a str, and not encoding it at all. Encoding text to a bytes object is intended as the last step before writing it to a binary file, a network socket, or some other hardware.
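A rough sketch of that "encode at the boundary" idea, assuming an in-memory io.BytesIO as a stand-in for a binary file or socket (the headline string and variable names are illustrative):

```python
import io

headline = "Ayushmann Khurrana: “a fun ride”"

# Keep the text as str while working with it...
shouted = headline.upper()

# ...and encode only at the boundary, where bytes are actually required.
buffer = io.BytesIO()  # stands in for a binary file or network socket
buffer.write(shouted.encode("utf-8"))

data = buffer.getvalue()           # bytes, as they would go out on the wire
roundtrip = data.decode("utf-8")   # decoding with the same codec restores the str
```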

For more on the difference between bytes and characters, see Ned Batchelder's "Pragmatic Unicode, or, How Do I Stop the Pain?" (36 minute video from PyCon US 2012). He covers both Python 2 and 3.

Example Output

Using the child.itertext() method, and not encoding the strings, I got a reasonable-looking list-of-dictionaries from topStories():

[...,
 {'description': 'Ayushmann Khurrana says his five-year Bollywood journey has '
                 'been “a fun ride”; adds success is a lousy teacher while '
                 'failure is “your friend, philosopher and guide”.',
  'guid': 'http://www.hindustantimes.com/bollywood/i-am-a-hardcore-realist-and-that-s-why-i-feel-my-journey-has-been-a-joyride-ayushmann-khurrana/story-KQDR7gMuvhD9AeQTA7tbmI.html',
  'link': 'http://www.hindustantimes.com/bollywood/i-am-a-hardcore-realist-and-that-s-why-i-feel-my-journey-has-been-a-joyride-ayushmann-khurrana/story-KQDR7gMuvhD9AeQTA7tbmI.html',
  'media': 'http://www.hindustantimes.com/rf/image_size_630x354/HT/p2/2017/06/26/Pictures/actor-ayushman-khurana_24f064ae-5a5d-11e7-9d38-39c470df081e.JPG',
  'pubDate': 'Mon, 26 Jun 2017 10:50:26 GMT ',
  'title': "I am a hardcore realist, and that's why I feel my journey "
           'has been a joyride: Ayushmann...'},
]
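For reference, here is a sketch of parseXML with the encode() call swapped for itertext(), run against a small made-up feed rather than the live URL. SAMPLE_RSS, its contents, and the MEDIA_CONTENT_TAG constant are assumptions for illustration; the rest follows the question's code.

```python
import xml.etree.ElementTree as ET

# Namespaced tag for <media:content>, as used in the question's code.
MEDIA_CONTENT_TAG = "{http://search.yahoo.com/mrss/}content"

def parseXML(rss):
    """Parse an RSS document into a list of dicts, one per <item>."""
    root = ET.fromstring(rss)
    newsitems = []
    for item in root.findall("./channel/item"):
        news = {}
        for child in item:
            if child.tag == MEDIA_CONTENT_TAG:
                news["media"] = child.attrib["url"]
            else:
                # Keep the text as str; no encode() needed here.
                news[child.tag] = "".join(child.itertext())
        newsitems.append(news)
    return newsitems

# Hypothetical stand-in for the bytes returned by requests.get(...).content
SAMPLE_RSS = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example headline</title>
      <link>http://example.com/story.html</link>
      <media:content url="http://example.com/image.jpg"/>
    </item>
  </channel>
</rss>"""

items = parseXML(SAMPLE_RSS)
print(items)
```

Testing against an inline sample like this avoids depending on the feed being reachable or keeping the same structure.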