Replacing HTML tags with BeautifulSoup

2024/5/20 19:29:02

I'm currently reformatting some HTML pages with BeautifulSoup, and I ran into a bit of a problem.

My problem is that the original HTML has things like this:

<li><p>stff</p></li>

and

<li><div><p>Stuff</p></div></li>

as well as

<li><div><p><strong>stff</strong></p></div><li>

With BeautifulSoup I hope to eliminate the div and the p tags, if they exist, but keep the strong tag.

I've looked through the Beautiful Soup documentation and couldn't find a way to do this. Any ideas?

Thanks.

Answer

This question probably referred to an older version of BeautifulSoup, because with bs4 you can simply use the unwrap() method, which removes a tag but keeps its contents:

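from bs4 import BeautifulSoup

# The output below matches the lxml parser, which closes the first <li> and adds <html><body>.
s = BeautifulSoup('<li><div><p><strong>stff</strong></p></div><li>', 'lxml')
s.div.unwrap()  # returns the tag that was removed, now empty
>> <div></div>
s.p.unwrap()
>> <p></p>
s
>> <html><body><li><strong>stff</strong></li><li></li></body></html>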
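
To handle all three shapes from the question in one pass, a minimal sketch might loop over every li and unwrap any div or p found inside it. The helper name strip_wrappers and its names parameter are hypothetical, and the built-in html.parser is assumed here so that no html/body wrapper is added around the fragment.

```python
from bs4 import BeautifulSoup


def strip_wrappers(html, names=("div", "p")):
    """Unwrap the given tags inside every <li>, keeping their children.

    Hypothetical helper for illustration; assumes the built-in html.parser.
    """
    soup = BeautifulSoup(html, "html.parser")
    for li in soup.find_all("li"):
        # find_all returns a precomputed list, so unwrapping while looping is safe.
        for tag in li.find_all(list(names)):
            tag.unwrap()
    return str(soup)


samples = [
    "<li><p>stff</p></li>",
    "<li><div><p>Stuff</p></div></li>",
    "<li><div><p><strong>stff</strong></p></div></li>",
]
for html in samples:
    print(strip_wrappers(html))
```

Because unwrap() only removes the tag itself, nested tags that are not listed in names (such as strong) are left in place.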
