python - XPath function starts-with not returning all required information
When I use XPath within the Scrapy shell to select email addresses from a paragraph (p) tag on a webpage, the XPath returns all the links within that paragraph. As such, I've attempted to use the starts-with function to further refine the information I want returned. This is successful, but it cuts off the ends of the email addresses.
hxs.select('//*[@id="rightcol02"]/p/a[starts-with(@href,"mailto")]')
The above returns incomplete email addresses.
When running hxs.select without the starts-with function, I have observed the following:
hxs.select('//*[@id="xxxxxxx"]/p/a') - (returns the links, but with the ends of the URLs and email addresses cut off.)
hxs.select('//*[@id="xxxxxxx"]/p/a/@href') - (returns the complete email address and URL.)
My question is: how do I get starts-with to capture the entire email address?
I have tried the following, but I am unsure what the syntax should be:
hxs.select('//*[@id="xxxxxxxx"]/p/a/@href[starts-with("mailto:")]')
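For reference, XPath 1.0's starts-with takes two arguments, so a predicate placed on the attribute itself would be written @href[starts-with(., "mailto:")]. The more common form filters the a element and then steps to the attribute: //*[@id="rightcol02"]/p/a[starts-with(@href,"mailto:")]/@href. In the Scrapy shell, calling .extract() on that selection should give the full strings. The cut-off values are likely just the shortened display of selector objects rather than lost data, though that is an assumption about the shell output. The sketch below illustrates the same idea using only Python's standard library. Note that it swaps in xml.etree.ElementTree, whose XPath subset has no starts-with, so the prefix test is done in Python instead. The sample markup, the example.com addresses, and the helper name mailto_hrefs are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: the id comes from the question,
# the links and addresses are made up for this example.
SAMPLE = """
<html><body>
<div id="rightcol02">
  <p>
    <a href="mailto:first.person@example.com">Email</a>
    <a href="http://www.example.com/a/long/path/page.html">Site</a>
    <a href="mailto:second.person@example.com">Email</a>
  </p>
</div>
</body></html>
"""


def mailto_hrefs(markup, container_id):
    """Return the full href of every mailto link in <p> tags under the given id."""
    root = ET.fromstring(markup.strip())
    # Select the <a> elements first (ElementTree supports this XPath subset).
    links = root.findall(".//*[@id='%s']/p/a" % container_id)
    # ElementTree has no starts-with(), so filter on the prefix in Python
    # and read the whole attribute value, which is never truncated.
    return [
        a.get("href")
        for a in links
        if a.get("href", "").startswith("mailto:")
    ]


emails = mailto_hrefs(SAMPLE, "rightcol02")
print(emails)
```

The key point in both versions is to select the @href attribute value itself (or read it with .get) after filtering, rather than printing the a element selectors.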
python xpath scrapy startswith