Extracting Field Names of an HTML Form - Python

2024/11/15 0:22:26

Assume there is a link, "http://www.someHTMLPageWithTwoForms.com", which points to an HTML page containing two forms (say Form 1 and Form 2). I have code like this:

import httplib2
from BeautifulSoup import BeautifulSoup, SoupStrainer

h = httplib2.Http('.cache')
response, content = h.request('http://www.someHTMLPageWithTwoForms.com')

# Parse only the <input> tags and print each one's name attribute
for field in BeautifulSoup(content, parseOnlyThese=SoupStrainer('input')):
    if field.has_key('name'):
        print field['name']

This returns all the field names from both Form 1 and Form 2 of the HTML page. Is there any way to get only the field names that belong to a particular form (say, Form 2 only)?

Answer

If there are only two forms, you can pick the second one by its position:

from BeautifulSoup import BeautifulSoup

forms = BeautifulSoup(content).findAll('form')
# forms[1] is the second form; search it for <input> tags at any depth
for field in forms[1].findAll('input'):
    if field.has_key('name'):
        print field['name']
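For readers on Python 3, the same "pick the form by position" idea can be sketched with only the standard library's html.parser module. This is a minimal illustration, not the answer's original method: the class name FormFieldCollector, the SAMPLE_HTML page, and the choice to also collect select and textarea names are assumptions made for the example.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect the `name` attributes of form controls, grouped per <form>."""

    # Assumption: besides <input>, named <select> and <textarea> count as fields.
    FIELD_TAGS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.forms = []       # one list of field names per <form>, in document order
        self._inside = False  # True while between <form> and </form>

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.forms.append([])
            self._inside = True
        elif self._inside and tag in self.FIELD_TAGS:
            name = dict(attrs).get("name")
            if name:  # skip controls without a name, e.g. a bare submit button
                self.forms[-1].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._inside = False


# Hypothetical sample page standing in for the downloaded content.
SAMPLE_HTML = """
<form><input name="user"><input name="password" type="password"></form>
<form><input name="query"><input type="submit"><select name="lang"></select></form>
"""

collector = FormFieldCollector()
collector.feed(SAMPLE_HTML)
second_form_fields = collector.forms[1]  # index 1 is the second form
print(second_form_fields)
```

Keeping one list per form means any form can be chosen by index after a single pass over the page.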

If you need a form other than the second one, you can be more specific and select it by an id or class attribute:

from BeautifulSoup import BeautifulSoup

# Match only the <form> whose id is 'yourFormId'
forms = BeautifulSoup(content).findAll('form', attrs={'id': 'yourFormId'})
for field in forms[0].findAll('input'):
    if field.has_key('name'):
        print field['name']
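Selecting by id can be sketched the same way in Python 3 with the standard library alone. The parser below only records input names while it is inside the form whose id matches. The names FieldsOfFormById, field_names, and SAMPLE_HTML are hypothetical, and 'yourFormId' is the placeholder id from the answer above.

```python
from html.parser import HTMLParser


class FieldsOfFormById(HTMLParser):
    """Record <input> names only while inside the <form> with a given id."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.names = []
        self._active = False  # True only inside the matching form

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._active = attrs.get("id") == self.form_id
        elif self._active and tag == "input" and attrs.get("name"):
            self.names.append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._active = False


def field_names(html, form_id):
    """Return the input names of the form with id `form_id` ([] if absent)."""
    parser = FieldsOfFormById(form_id)
    parser.feed(html)
    return parser.names


# Hypothetical sample page; the ids are made up for the example.
SAMPLE_HTML = """
<form id="login"><input name="user"><input name="password"></form>
<form id="yourFormId"><input name="query"><input type="submit"></form>
"""

print(field_names(SAMPLE_HTML, "yourFormId"))
```

An id that does not appear in the page simply yields an empty list, which avoids the IndexError that forms[0] would raise when nothing matches.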
