php - Executing a long running screen scraping script


I have a screen scraping script in PHP on a GoDaddy shared LAMP server, running via the command line.

The script scrapes, parses, and stores the required info in a database. The entire process takes 1.5 seconds per page, and it needs to scrape close to 10,000 pages (and for each of these pages it has to fetch cookies from 2 others, making a total of 30k pages curled).

The entire script would take around 5 hours to run. I have done memory profiling, and memory consumption stays more or less constant throughout the run - it does not increase.

If I run this script overnight, will GoDaddy notice anything abnormal about it? CPU consumption should not be much, but how bad would the bandwidth consumption of fetching 3 pages per 1.5 seconds for a duration of 5 hours be? Would it be enough to raise alarms on GoDaddy's end?
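A rough way to put a number on the bandwidth question is to multiply the request count by an average page size. The sketch below does that arithmetic. The 50 KB average page size is purely an assumption for illustration (the question does not state one), and `estimateTransfer` is a hypothetical helper name.

```php
<?php
// Back-of-the-envelope transfer estimate for a scraping run.
// ASSUMPTION: the average page size is not given in the question;
// 50 KB is a placeholder to be replaced with a measured value.

function estimateTransfer(int $requests, float $avgPageKb, float $durationHours): array
{
    $totalKb = $requests * $avgPageKb;
    $seconds = $durationHours * 3600;

    return [
        'total_mb'         => $totalKb / 1024,      // total data downloaded
        'avg_kb_per_sec'   => $totalKb / $seconds,  // average sustained rate
        'requests_per_sec' => $requests / $seconds, // average request rate
    ];
}

// 30,000 requests over roughly 5 hours, as described in the question.
$estimate = estimateTransfer(30000, 50.0, 5.0);

printf(
    "Total: %.0f MB, average: %.1f KB/s, %.2f requests/s\n",
    $estimate['total_mb'],
    $estimate['avg_kb_per_sec'],
    $estimate['requests_per_sec']
);
```

Under that assumed page size the run downloads 1,500,000 KB in total, so the result scales linearly with whatever the real average page size turns out to be.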

If yes, suppose I break the script up so that it runs through 1500 pages, halts for 1 hour, and then resumes. Should I do that?
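The batch-and-pause idea could be sketched roughly as follows. This is only an illustration, not the script from the question: `runInBatches`, `$processPage` and the injectable `$pause` callback are hypothetical names, and the per-page scraping work is left as a stub.

```php
<?php
// Sketch: work through a list of URLs in fixed-size batches and pause
// between batches. All names here are hypothetical stand-ins.

function runInBatches(
    array $urls,
    callable $processPage,
    int $batchSize = 1500,
    int $pauseSeconds = 3600,
    ?callable $pause = null
): int {
    // Default to PHP's built-in sleep(); a custom callback can be passed for testing.
    $pause = $pause ?? 'sleep';

    $batches   = array_chunk($urls, $batchSize);
    $lastIndex = count($batches) - 1;
    $done      = 0;

    foreach ($batches as $i => $batch) {
        foreach ($batch as $url) {
            $processPage($url); // fetch, parse and store one page
            $done++;
        }
        // No need to wait after the final batch.
        if ($i < $lastIndex) {
            $pause($pauseSeconds);
        }
    }

    return $done;
}

// Demo with tiny numbers so it finishes immediately; the values from the
// question would be a batch size of 1500 and a pause of 3600 seconds.
$processed = runInBatches(range(1, 7), function ($url) {
    // ... scrape $url here ...
}, 3, 0);

echo "Processed {$processed} pages\n";
```

Keeping the pause inside a single long-lived process is the simplest option; an alternative would be to store the last processed offset and let a scheduler start each batch, which is not shown here.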

For the sake of not leaving this question unanswered, I'll post an answer:

I ran the script overnight. It took around 5 hours to run, and it was neither terminated by GoDaddy nor did I receive any notice, so I guess it was fine with them.

Initially I was having memory issues that caused the script to run out of the memory allocated to me, which was apparently a pre-PHP 5.3 bug (more details on it here). Once that was fixed, it hovered at 32-34 MB of RAM usage the entire while. I have no clue about the CPU consumption or bandwidth usage.
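For anyone wanting to watch memory during a run like this, one possible approach is to log PHP's own counters every few hundred pages. This is a sketch under assumptions, not the profiling method actually used above: `formatMemory` and `memoryReport` are hypothetical helpers, and the loop body is a stub.

```php
<?php
// Sketch: periodic memory logging inside a long-running loop.
// memory_get_usage() and memory_get_peak_usage() are built-in PHP functions;
// the helper names and the loop are illustrative only.

function formatMemory(int $bytes): string
{
    return sprintf('%.1f MB', $bytes / 1048576);
}

function memoryReport(int $pagesDone): string
{
    return sprintf(
        'pages=%d current=%s peak=%s',
        $pagesDone,
        formatMemory(memory_get_usage(true)),      // memory currently allocated from the system
        formatMemory(memory_get_peak_usage(true))  // highest allocation so far
    );
}

for ($page = 1; $page <= 2000; $page++) {
    // ... fetch, parse and store the page here ...

    if ($page % 500 === 0) {
        echo memoryReport($page), PHP_EOL;
    }
}
```

If the "current" figure keeps climbing between reports, something is being retained across iterations; a flat figure matches the steady usage described above.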

php curl screen-scraping shared-hosting
