Target web page:
http://www.immi.gov.au/skilled/general-skilled-migration/estimated-allocation-times.htm
The section I want to extract:
<tr><td>Skilled – Independent (Residence) subclass 885<br />online</td><td>N/A</td><td>N/A</td><td>N/A</td><td>15 May 2011</td><td>N/A</td></tr>
Once the code finds this row by searching for the keyword "subclass 885 online", it should print the date in the fifth <td> cell, which is "15 May 2011" as shown above.
It's just a monitor for myself to keep an eye on the progress of my immigration application.
"Beau--ootiful Soo--oop!
Beau--ootiful Soo--oop!
Soo--oop of the e--e--evening,
Beautiful, beauti--FUL SOUP!"
--Lewis Carroll, Alice's Adventures in Wonderland
I think this is exactly what he had in mind!
The Mock Turtle would probably do something like this:
>>> from BeautifulSoup import BeautifulSoup
>>> import urllib2
>>> url = 'http://www.immi.gov.au/skilled/general-skilled-migration/estimated-allocation-times.htm'
>>> page = urllib2.urlopen(url)
>>> soup = BeautifulSoup(page)
>>> for row in soup.html.body.findAll('tr'):
...     data = row.findAll('td')
...     if data and 'subclass 885online' in data[0].text:
...         print data[4].text
...
15 May 2011
But I'm not sure it would help, since that date has already passed!
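The session above is Python 2 (urllib2 and the old BeautifulSoup 3 package). If you are on Python 3, here is a rough sketch of the same row lookup using only the standard library's html.parser, which is a swapped-in technique and not BeautifulSoup itself. The names RowCollector, find_allocation_date and SAMPLE are made up for illustration, and SAMPLE is just the row quoted in the question, so nothing is fetched from the site. The keyword is 'subclass 885online' because the <br /> contributes no text, so the two words run together.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being parsed, or None
        self._cell = None  # text fragments of the open cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

    def handle_data(self, data):
        # Only keep text that sits inside an open <td>.
        if self._cell is not None:
            self._cell.append(data)


def find_allocation_date(html, keyword="subclass 885online", column=4):
    """Return the text of cell `column` in the first row whose first
    cell contains `keyword`, or None if no such row exists."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    for cells in parser.rows:
        if len(cells) > column and keyword in cells[0]:
            return cells[column]
    return None


# The row quoted in the question, wrapped in a table for the demo.
SAMPLE = (
    "<table><tr><td>Skilled – Independent (Residence) subclass 885"
    "<br />online</td><td>N/A</td><td>N/A</td><td>N/A</td>"
    "<td>15 May 2011</td><td>N/A</td></tr></table>"
)

print(find_allocation_date(SAMPLE))  # prints: 15 May 2011
```

To run it against the live page you would pass the downloaded HTML instead of SAMPLE, for example the decoded result of urllib.request.urlopen(url).read(), assuming the page still has the same table layout.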
Good luck with the application!