Manage quotation marks in XPath (lxml)

2024/10/15 6:15:36

I want to extract values from the 'MANUFACTURING AT A GLANCE' table on the website below. The name of one row contains a ' (single quote), which breaks my XPath syntax. How do I work around this? The same code works for the other rows.

import requests
from lxml import html, etree

ism_pmi_url = 'https://www.instituteforsupplymanagement.org/ismreport/mfgrob.cfm?SSO=1'
page = requests.get(ism_pmi_url)
tree = html.fromstring(page.content)
PMI_CustomerInventories = tree.xpath('//strong[text()="Customers' Inventories"]/../../following-sibling::td/p/text()')
PMI_CustomerInventories_Curr_Val = PMI_CustomerInventories[0]

Answer

This is my approach to avoid your problem. It may not be exactly what you need, but it could help you get the idea.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests
import lxml.html
from pprint import pprint


def load_lxml(response):
    return lxml.html.fromstring(response.text)


url = 'https://www.instituteforsupplymanagement.org/ismreport/mfgrob.cfm?SSO=1'
response = requests.get(url)
root = load_lxml(response)

headers = []
data = []
# Walk every row of the table instead of matching a row by its label.
for index, row in enumerate(root.xpath('//*[@id="home_feature_container"]/div/div/div/span/table[2]/tbody/tr')):
    rows = []
    for cindex, column in enumerate(row.xpath('./th//text() | ./td//text()')):
        if cindex == 1:
            continue
        column = column.strip()
        # Skip the first row entirely, and skip empty text nodes.
        if index == 0 or not column:
            continue
        elif index == 1:
            headers.append(column)
        else:
            rows.append(column)
    # Keep only complete data rows (six values each).
    if rows and len(rows) == 6:
        data.append(rows)

data.insert(0, headers)
pprint(data)

Result:

[['Series Index', 'Feb', 'Series Index', 'Jan', 'Percentage', 'Point', 'Change', 'Direction', 'Rate of Change', 'Trend* (Months)'],
 ['65.1', '60.4', '+4.7', 'Growing', 'Faster', '6'],
 ['62.9', '61.4', '+1.5', 'Growing', 'Faster', '6'],
 ['54.2', '56.1', '-1.9', 'Growing', 'Slower', '5'],
 ['54.8', '53.6', '+1.2', 'Slowing', 'Faster', '10'],
 ['51.5', '48.5', '+3.0', 'Growing', 'From Contracting', '1'],
 ['47.5', '48.5', '-1.0', 'Too Low', 'Faster', '5'],
 ['68.0', '69.0', '-1.0', 'Increasing', 'Slower', '12'],
 ['57.0', '49.5', '+7.5', 'Growing', 'From Contracting', '1'],
 ['55.0', '54.5', '+0.5', 'Growing', 'Faster', '12'],
 ['54.0', '50.0', '+4.0', 'Growing', 'From Unchanged', '1']]
[Finished in 2.9s]
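
If you would rather keep the original label-based XPath, the quoting conflict itself can probably be sidestepped in two ways: pass the label as an XPath variable (lxml accepts keyword arguments to xpath() and exposes them as $name), or write the Python string in double quotes and escape the inner double quotes. The sketch below runs against a small made-up table rather than the live page; the SAMPLE markup and its values are assumptions for illustration, and the real page structure may differ.

```python
from lxml import html

# Hypothetical markup standing in for the real page (an assumption;
# the live table may be structured differently).
SAMPLE = """
<table>
  <tr>
    <td><p><strong>New Orders</strong></p></td>
    <td><p>65.1</p></td>
  </tr>
  <tr>
    <td><p><strong>Customers' Inventories</strong></p></td>
    <td><p>47.5</p></td>
  </tr>
</table>
"""

tree = html.fromstring(SAMPLE)

# Option 1: pass the label as an XPath variable, so the apostrophe
# never appears inside the XPath expression itself.
by_variable = tree.xpath(
    '//strong[text()=$label]/../../following-sibling::td/p/text()',
    label="Customers' Inventories",
)

# Option 2: use a double-quoted Python string and escape the double
# quotes that delimit the XPath string literal.
by_escaping = tree.xpath(
    "//strong[text()=\"Customers' Inventories\"]/../../following-sibling::td/p/text()"
)

print(by_variable)
print(by_escaping)
```

The variable form also copes with labels that contain both quote characters, which a plain XPath 1.0 string literal cannot express without concat().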