I've been able to isolate a row in an HTML table using Beautiful Soup in Python 2.7. It's been a learning experience, but I'm happy to have gotten that far. Unfortunately, I'm a bit stuck on the next step.
I need to get the link that follows the "Select document Remittance Report I format XLS" input. Because the order in which the inputs appear can change, the lookup needs to be dynamic rather than position-based. I'm not sure how to find that input and then grab the link that follows it.
I've been trying some findAll and nextSibling calls, but my inexperience with Python and Beautiful Soup is holding me back. The Beautiful Soup documentation is great, but it's going a bit over my head. This is the row I've isolated so far:
<tr class="odd"><td header="c1">Report Download</td><td header="c2"><input aria-label="Select Report format PDF" id="documentChkBx0" name="documentChkBx" type="checkbox" value="5446"/><a href="/a/document.html?key=5446"><img alt="Portable Document Format" src="/img/icons/icon_PDF.gif"></img></a><input aria-label="Select Report format XLS" id="documentChkBx1" name="documentChkBx" type="checkbox" value="5447"/><a href="/a/document.html?key=5447"><img alt="Excel Spreadsheet Format" src="/img/icons/icon_XLS.gif"></img></a></td><td header="c4">04/27/2015</td><td header="c5">05/26/2015</td><td header="c6">05/26/2015 10:00AM EDT</td>
</tr>
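
To make the goal concrete, here is a minimal sketch of the kind of lookup I'm after. It assumes the bs4 package (Beautiful Soup 4) and that the link is always a sibling of the checkbox inside the same cell, as in the row above. The helper name link_after_input is made up for illustration, and the label used is the one from the pasted row ("Select Report format XLS"); on the real page it would be the full label quoted earlier.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the cell from the row above (same tags and attributes).
html = """
<tr class="odd"><td header="c2">
<input aria-label="Select Report format PDF" id="documentChkBx0" name="documentChkBx" type="checkbox" value="5446"/>
<a href="/a/document.html?key=5446"><img alt="Portable Document Format" src="/img/icons/icon_PDF.gif"></img></a>
<input aria-label="Select Report format XLS" id="documentChkBx1" name="documentChkBx" type="checkbox" value="5447"/>
<a href="/a/document.html?key=5447"><img alt="Excel Spreadsheet Format" src="/img/icons/icon_XLS.gif"></img></a>
</td></tr>
"""


def link_after_input(soup, label):
    """Return the href of the first <a> that follows the <input> with this aria-label.

    Hypothetical helper: returns None if the input or the link is missing.
    """
    # "aria-label" contains a hyphen, so it has to go through attrs={...}
    # instead of being passed as a keyword argument.
    box = soup.find("input", attrs={"aria-label": label})
    if box is None:
        return None
    # Look only at siblings that come after the checkbox, so the result
    # does not depend on where the checkbox sits within the cell.
    link = box.find_next_sibling("a")
    return link["href"] if link is not None else None


soup = BeautifulSoup(html, "html.parser")
xls_href = link_after_input(soup, "Select Report format XLS")
print(xls_href)  # /a/document.html?key=5447
```

If the link were nested deeper rather than being a direct sibling, find_next("a") would be the call to try instead of find_next_sibling("a"), since it searches everything that follows in document order.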