Question 1

I need a Python regex to extract URLs from HTML. Example HTML:

<a href=""http://a0c5e.site.it/r"" target=_blank><font color=#808080>MailUp</font></a>
<a href=""http://www.site.it/prodottiLLPP.php?id=1"" class=""txtBlueGeorgia16"">Prodotti</a>
<a href=""http://www.site.it/terremoto.php"" target=""blank"" class=""txtGrigioScuroGeorgia12"">Terremoto</a>
<a class='mini' href='http://www.site.com/remove/professionisti.aspx?Id=65&Code=xhmyskwzse'>clicca qui.</a>

I need to extract only:

http://a0c5e.site.it/r
http://www.site.it/prodottiLLPP.php?id=1
http://www.site.it/terremoto.php
http://www.site.com/remove/professionisti.aspx?Id=65&Code=xhmyskwzse

Question 2

A regex might solve your problem, but consider using BeautifulSoup instead:

>>> html = """<a href="http://a0c5e.site.it/r" target=_blank><font color=#808080>MailUp</font></a>
<a href="http://www.site.it/prodottiLLPP.php?id=1" class=""txtBlueGeorgia16"">Prodotti</a>
<a href="http://www.site.it/terremoto.php" target=""blank"" class=""txtGrigioScuroGeorgia12"">Terremoto</a>
<a class='mini' href='http://www.site.com/remove/professionisti.aspx?Id=65&Code=xhmyskwzse'>clicca qui.</a>"""
>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup(html)
>>> [e['href'] for e in soup.findAll('a')]
[u'http://a0c5e.site.it/r', u'http://www.site.it/prodottiLLPP.php?id=1', u'http://www.site.it/terremoto.php', u'http://www.site.com/remove/professionisti.aspx?Id=65&Code=xhmyskwzse']

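From Jon Clements: to skip any <a> tag that has no href attribute (e['href'] raises KeyError on those), filter on the attribute: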

soup.findAll('a', {'href': True})
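
If installing BeautifulSoup is not an option, the same idea (collect the href of every <a> tag that has one) can be sketched with the standard library's html.parser on Python 3. This is only a minimal sketch under assumptions: the class name HrefCollector is made up for illustration, and the sample HTML uses correctly quoted attributes plus one hypothetical anchor without an href.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and names arrive lowercased.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


# Sample input: assumed to be correctly quoted, unlike the snippet in the question.
html = """<a href="http://a0c5e.site.it/r" target=_blank><font color=#808080>MailUp</font></a>
<a name="top">no href here</a>
<a href="http://www.site.it/terremoto.php" target="blank">Terremoto</a>
<a class='mini' href='http://www.site.com/remove/professionisti.aspx?Id=65&Code=xhmyskwzse'>clicca qui.</a>"""

collector = HrefCollector()
collector.feed(html)
hrefs = collector.hrefs
print(hrefs)
```

The anchor without an href is skipped rather than raising an error, which is the same effect as the {'href': True} filter.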

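On a different note, the href quoting in your HTML snippet is incorrect: the values are wrapped in doubled double quotes (href=""...""), where a single pair of quotes is expected.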
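
Since the question asked for a regex: a rough sketch that tolerates that quoting (single quotes, double quotes, doubled double quotes, or no quotes at all) could look like the following. The pattern and the name HREF_RE are my own assumptions, not something taken from the answer above. It matches any href=..., not only those on <a> tags, and it breaks on URLs that contain quotes or whitespace, which is why a real parser is the safer choice.

```python
import re

# Skip any run of quote characters after "href=", then capture everything
# up to the next quote, whitespace, or ">".
HREF_RE = re.compile(r"""href\s*=\s*["']*([^"'\s>]+)""", re.IGNORECASE)

# The HTML exactly as quoted in the question, doubled quotes included.
html = '''<a href=""http://a0c5e.site.it/r"" target=_blank><font color=#808080>MailUp</font></a>
<a href=""http://www.site.it/prodottiLLPP.php?id=1"" class=""txtBlueGeorgia16"">Prodotti</a>
<a href=""http://www.site.it/terremoto.php"" target=""blank"" class=""txtGrigioScuroGeorgia12"">Terremoto</a>
<a class='mini' href='http://www.site.com/remove/professionisti.aspx?Id=65&Code=xhmyskwzse'>clicca qui.</a>'''

urls = HREF_RE.findall(html)
for url in urls:
    print(url)
```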
