• VBS to download web pages (Win98)

    #409575

    I wasn’t sure where to post this query, but thought it might lend itself to scripting. I want to cycle through a series of URLs and download and save all of the web pages. These are CGI links, with the pages generated at the server. I know that there are about 70 pages in the “series”, with their URLs varying only by one number. For example:
    http://www.somesite.com/cgi-bin/showlist.p…s&Showpage=*&Listings=30

    The * would take on any value from 0 to (about) 70, for the pages of interest.

    I know that I can type this into a browser (say, with *=34) and get the correct page. But my efforts to cycle through the series with a download manager won’t work, for whatever reason. I’d like a script to do the looping and downloading. I only need to save the HTML of each page (graphics and anything else are not needed).

    Any help with this would be appreciated.

    Alan
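
The loop being asked for can be sketched roughly as follows. This is an illustrative Python sketch rather than VBS, and not a method anyone in the thread used. The names `page_urls`, `fetch_html`, `save_pages`, the `page_NNN.html` file naming and the one-second delay are all assumptions made for the example. The URL template is copied from the post with its "…" left in place, so the real address has to be substituted before anything can be fetched.

```python
import os
import time
import urllib.request

# Copied verbatim from the post. The "…" is an elision in the original,
# so this is a placeholder and not a working address.
URL_TEMPLATE = "http://www.somesite.com/cgi-bin/showlist.p…s&Showpage=*&Listings=30"


def page_urls(template, first=0, last=70):
    """Return one URL per page number, replacing the first '*' in the template."""
    return [template.replace("*", str(n), 1) for n in range(first, last + 1)]


def fetch_html(url):
    """Download the raw bytes of one page. Only the HTML itself is requested."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def save_pages(urls, out_dir, fetch=fetch_html, delay=1.0):
    """Save each page as page_000.html, page_001.html, ... and return the paths.

    The delay between requests is an assumed courtesy to the server.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        path = os.path.join(out_dir, "page_%03d.html" % index)
        with open(path, "wb") as handle:
            handle.write(fetch(url))
        saved.append(path)
        if delay and index < len(urls) - 1:
            time.sleep(delay)
    return saved
```

With a real template in place, `save_pages(page_urls(URL_TEMPLATE), "pages")` would write 71 files (pages 0 to 70) into a `pages` folder. The `fetch` argument is there so the download step can be swapped out, for example for one that sends extra headers.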

    • #874252

      I hope you don’t plan to do this to the Lounge! Our server is working hard enough as it is. (grin)

      The ADODB Stream object, so much in the news of late as a conduit for spyware, could be your answer. I’ve never used it, but clearly it can do the job. Or should I say, could do the job. You would probably have to reverse the kill bit in the registry in order to script it; it is probably a good idea to reset it after you’re done.

      If the server denies connections that do not supply appropriate referrer or user-agent strings, or saved cookies, then the Stream object probably won’t help you. Scripting Internet Explorer itself is a bit more work, but it is another option. (I’ve only done that from Word VBA, not from VBS.)
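
The header and cookie concern above applies to any scripted download, not only to the Stream object. As a hedged illustration in Python (not VBS, and not something the thread tried), a request can carry its own `Referer` and `User-Agent` headers and keep cookies between requests. The function names `make_opener` and `build_request` are made up for this sketch, and the header values a given server expects are unknown here.

```python
import http.cookiejar
import urllib.request


def make_opener():
    """Return (opener, jar); the opener stores cookies in jar and resends them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_request(url, referer, user_agent):
    """Build a request carrying explicit Referer and User-Agent headers.

    "Referer" is the spelling used by the HTTP header itself.
    """
    headers = {"Referer": referer, "User-Agent": user_agent}
    return urllib.request.Request(url, headers=headers)
```

A page would then be read with `opener.open(build_request(url, referer, user_agent)).read()`, reusing the same opener for every page so that any cookie set by the first response is sent with the later ones.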

      • #874463

        I hope you don’t plan to do this to the Lounge!


        It hadn’t crossed my mind, but it’s an interesting idea, now that you mention it. (evil grin)

        I will have a look at the ADODB Stream object. I always forget about the VBA possibility (stupid me). I guess it’s one of those old habits – if it’s too trivial for a program, then go for a script.

        I actually managed to coerce a download manager into recursing through a list of the repetitious URLs, which I built using Excel. It wouldn’t accept a file on the hard disk, so I had to upload the list as an HTML web page and use the URL of that page as the base of the download “tree”. This particular software deals with “complex” URLs (form submissions, CGI etc.) by setting up a “special” temporary proxy address in the browser, then somehow sucking out the appropriate URLs for downloading. Don’t ask me how.

        Thanks

        Alan
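
The list of links described above was built in Excel. A scripted equivalent might look like the following Python sketch, in which `build_link_page` and the page title are illustrative assumptions and not part of the original workaround. The `&` characters in the URLs are escaped so that the page stays valid HTML.

```python
import html


def build_link_page(urls, title="Download list"):
    """Return a small HTML document containing one link per URL."""
    lines = ["<html><head><title>%s</title></head><body>" % html.escape(title)]
    for url in urls:
        safe = html.escape(url, quote=True)  # '&' becomes '&amp;' inside the markup
        lines.append('<a href="%s">%s</a><br>' % (safe, safe))
    lines.append("</body></html>")
    return "\n".join(lines)
```

The returned string can be written to a file and uploaded, giving a download manager a single page from which to follow every link.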

    • #874291

      Actually, I’ve done this, but I can’t find the details at the moment. If you do an internet search, using ‘readyState’ and ‘WSH’ as keywords, you should find some good examples of reading a web page through scripting.
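
The ‘readyState’ keyword refers to the usual pattern for scripting Internet Explorer: navigate to a page, then poll the browser’s ready state until it reports 4 (complete) before reading the document. A hedged Python sketch of that polling loop follows. The helper names are invented for this example, and `browser` is assumed to be an Internet Explorer automation object, obtained for instance through the third-party pywin32 package with `win32com.client.Dispatch("InternetExplorer.Application")` on Windows.

```python
import time

READYSTATE_COMPLETE = 4  # value Internet Explorer reports once a page has loaded


def wait_until_complete(browser, timeout=30.0, poll=0.1):
    """Poll browser.ReadyState until it equals 4; return False on timeout."""
    deadline = time.monotonic() + timeout
    while browser.ReadyState != READYSTATE_COMPLETE:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True


def page_html(browser, url, timeout=30.0):
    """Navigate to url, wait for the load to finish, and return the page HTML."""
    browser.Navigate(url)
    if not wait_until_complete(browser, timeout):
        raise TimeoutError("page did not finish loading: " + url)
    return browser.Document.documentElement.outerHTML
```

Because the page is loaded by the browser itself, its cookies, referrer and user-agent string are the browser’s own, which is the advantage of this route over a bare download.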

      • #874465

        This is a very useful tip, Briana (cheers). This page is heading towards exactly what I want – not just the pages, but processed information from them. I can spot the possibility for a recursive loop just glancing at this example. Now that I’ve managed to get the files (see post 406081), I might try to adapt this for post-processing.

        Thank you, Briana, for a great lead.

        Alan
