BeautifulSoup - Python: finding the index of a tag in a string




HTML:

<div class="productdescriptionwrapper">
  <p>a worm worth getting hands dirty over. on 6 feet of crawl space, playhut&rsquo;s wiggly worm brightly colored , friendly play structure. </p>
  <ul>
    <li>6ft of crawl through fun</li>
    <li>18&rdquo; diameter easy crawl through</li>
    <li>bright colorful design</li>
    <li>product measures: 18&quot;&quot;diam x 60&quot;&quot;l</li>
    <li>recommended ages: 3 years &amp; up<br /> &nbsp;</li>
  </ul>
  <p><strong>intended indoor use</strong></p>

Code:

import re

def getbullets(self, soup):
    bulletlist = []
    bullets = str(soup.findAll('div', {'class': 'productdescriptionwrapper'}))
    bullets_re = re.compile('<li>(.*)</li>')
    bullets_pat = str(re.findall(bullets_re, bullets))
    # bullets_pat is a str, which has no findall() method, so this line fails
    index = bullets_pat.findall('</li>')
    print index

How can I extract the p tags and the li tags? Thanks!

Notice the following:

>>> from BeautifulSoup import BeautifulSoup
>>> html = """ <what you have above> """
>>> soup = BeautifulSoup(html)
>>> bullets = soup.findAll('div', {'class': 'productdescriptionwrapper'})
>>> ptags = bullets[0].findAll('p')
>>> print ptags
[<p>a worm worth getting hands dirty over. on 6 feet of crawl space, playhut&rsquo;s wiggly worm brightly colored , friendly play structure. </p>, <p><strong>intended indoor use</strong></p>]
>>> print ptags[0].text
a worm worth getting hands dirty over. on 6 feet of crawl space, playhut&rsquo;s wiggly worm brightly colored , friendly play structure.

You can look at the contents of the li tags in a similar manner.
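As a rough sketch of the li case, the snippet below does the same thing for the list items and uses enumerate() to get each bullet's position in the list. It assumes the newer bs4 package (BeautifulSoup 4) on Python 3 rather than the BeautifulSoup 3 import used in the session above; in bs4 the method is spelled find_all, and get_text() returns a tag's text. The HTML here is a shortened stand-in for the snippet in the question, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the HTML in the question (assumption: same structure).
html = """
<div class="productdescriptionwrapper">
  <p>Intro paragraph.</p>
  <ul>
    <li>6ft of crawl through fun</li>
    <li>bright colorful design</li>
    <li>recommended ages: 3 years &amp; up</li>
  </ul>
  <p><strong>intended indoor use</strong></p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Grab the wrapper div first, then search only inside it.
wrapper = soup.find("div", {"class": "productdescriptionwrapper"})

# get_text(strip=True) drops the markup and surrounding whitespace.
li_texts = [li.get_text(strip=True) for li in wrapper.find_all("li")]
p_texts = [p.get_text(strip=True) for p in wrapper.find_all("p")]

# enumerate() pairs each bullet with its index in the list.
for index, text in enumerate(li_texts):
    print(index, text)
```

Working on the parsed tags this way avoids converting the result back to a string and searching it with a regular expression, which is where the code in the question goes wrong.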

