PHP - Executing a long-running screen scraping script
I have a screen scraping script in PHP that runs from the command line on a GoDaddy shared LAMP server.
The script scrapes each page, parses it, and stores the required info in a database. The entire process takes about 1.5 seconds per page, and it needs to scrape close to 10,000 pages. For each of those pages it also fetches 2 others to obtain cookies, making a total of about 30k pages cURLed.
The entire script takes about 5 hours to run. I have done memory profiling, and memory consumption stays more or less constant throughout the run; it does not increase.
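Memory profiling of this kind can be done with PHP's built-in memory_get_usage() and memory_get_peak_usage(). The following is only a minimal sketch: scrapePage() is a hypothetical stand-in for the real fetch/parse/store step, and the page count and logging interval are arbitrary values chosen for illustration.

```php
<?php
// Hypothetical stand-in for the real fetch, parse and store step.
function scrapePage($pageNumber)
{
    return str_repeat('x', 1024); // pretend this is the parsed page
}

// Format a byte count as megabytes for logging.
function formatMb($bytes)
{
    return number_format($bytes / 1048576, 2) . ' MB';
}

$totalPages = 100; // assumed small number for the demo
$logEvery   = 25;  // assumed logging interval
$samples    = array();

for ($page = 1; $page <= $totalPages; $page++) {
    scrapePage($page);

    // Record and print memory usage at regular intervals.
    if ($page % $logEvery === 0) {
        $samples[] = memory_get_usage();
        echo "page $page: current " . formatMb(memory_get_usage())
            . ', peak ' . formatMb(memory_get_peak_usage()) . PHP_EOL;
    }
}
```

If the logged figures stay flat over the run, as described above, the script is not leaking memory between pages.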
If I run this script overnight, will GoDaddy notice anything abnormal about it? CPU consumption should not be much, but how bad would the bandwidth consumption of fetching 3 pages every 1.5 seconds for a duration of 5 hours be? Is that enough to raise alarms on GoDaddy's end?
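A rough back-of-the-envelope estimate can help frame the bandwidth question. The sketch below uses the figures from the question (10,000 pages, 3 requests per page, 1.5 seconds per page); the average page size of 50 KB is purely an assumption, since the question does not state it, and estimateRun() is a hypothetical helper.

```php
<?php
// Rough estimate of request rate and total transfer for the run.
// $avgPageKb is an ASSUMPTION: the question does not state page sizes.
function estimateRun($pages, $requestsPerPage, $secondsPerPage, $avgPageKb)
{
    $totalRequests = $pages * $requestsPerPage;
    $totalSeconds  = $pages * $secondsPerPage;
    $totalKb       = $totalRequests * $avgPageKb;

    return array(
        'total_requests' => $totalRequests,
        'hours'          => $totalSeconds / 3600,
        'requests_per_s' => $totalRequests / $totalSeconds,
        'total_mb'       => $totalKb / 1024,
        'avg_kb_per_s'   => $totalKb / $totalSeconds,
    );
}

$estimate = estimateRun(10000, 3, 1.5, 50);
foreach ($estimate as $name => $value) {
    echo $name . ': ' . round($value, 2) . PHP_EOL;
}
```

Under that 50 KB assumption the run averages 2 requests per second and about 100 KB/s, for roughly 1.4 GB in total. Whether that is a lot depends on the hosting plan's limits, which are not known here.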
If yes, suppose I break the script up so that it runs through 1,500 pages, halts for 1 hour, and then resumes. Should I do that?
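The batch-and-pause idea could be sketched roughly as follows. runInBatches() is a hypothetical helper, not part of the original script, and the closures require PHP 5.3 or later. The pause function is passed in as a parameter so the example can be dry-run without actually sleeping; in real use the default sleep would apply.

```php
<?php
// Run $processPage over pages 1..$totalPages, pausing after every full batch.
// $pause is injectable so the pause can be replaced (e.g. in a dry run).
function runInBatches($totalPages, $batchSize, $pauseSeconds, $processPage, $pause = 'sleep')
{
    $pauses = 0;
    for ($page = 1; $page <= $totalPages; $page++) {
        call_user_func($processPage, $page);

        // Pause after each full batch, but not after the final page.
        if ($page % $batchSize === 0 && $page < $totalPages) {
            call_user_func($pause, $pauseSeconds);
            $pauses++;
        }
    }
    return $pauses;
}

// Dry run: count pages and record pauses instead of actually sleeping.
$processed = 0;
$slept     = 0;
$pauses = runInBatches(
    10000,
    1500,
    3600,
    function ($page) use (&$processed) { $processed++; },
    function ($seconds) use (&$slept) { $slept += $seconds; }
);
echo "processed $processed pages with $pauses pauses ($slept s of waiting)" . PHP_EOL;
```

With 10,000 pages and batches of 1,500 this yields six one-hour pauses, so the trade-off is about 6 extra hours of wall-clock time on top of the scraping itself.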
For the sake of not leaving this question unanswered, I'll post my own answer:
I ran the script overnight. It took 5 hours to run, and it was neither terminated by GoDaddy nor did I receive any notice, so I guess it is fine with them.
Initially I was having memory issues where the script would run out of the memory allocated to me, which is apparently a pre-PHP 5.3 bug (more details here). Once that was fixed, it hovered at 32-34 MB of RAM usage the entire while. I have no clue about CPU consumption or bandwidth usage.
php curl screen-scraping shared-hosting