• Selective News Grabber


    Anyone know where I might find something like this, or how I might cobble together two apps that would give me this utility?
    Peter

    The Need:

    Every day I visit major newspaper, scientific journal and focused news-gathering sites, looking for headlines on stories that relate to the science and business of biology and genomics. Most of the sites have required me to register and to accept a cookie. Some have required me to use a password that is saved. I go only to sites that show news headlines and short summaries, with a clickable link to each full-length item. A typical day will have me open and read 30-50 such links in real time. I also later visit specialist Yahoo! message boards, scan their contents, and in any day typically find 3-5 messages I would like to capture and store cumulatively with little effort in folders already designated on my PC, without fear of overwriting their earlier contents.

    The Problem:

    Currently available Usenet and web page grabbers are indiscriminate. They download everything they find at every level of a tree, down to the designated level. The user cannot be selective within a level, cannot tell the grabber to ignore some or all of the levels higher on the tree, and cannot tell the grabber to ignore other branches on the same tree.

    My dream utility would work as follows:

    1) I would go to a msg. board or newspaper site (The New York Times is a good example http://www.nytimes.com/)

    2) I would scan down the lists of offered links on msg. threads or news items, right clicking on each item of interest – and instead of opening, each link would be accumulated on a clipboard by the utility, annotated with a source tag.

    3) When I’ve been to all the boards and newspapers I visit every day, the utility would then let me review the full list of the links I’ve ticked for opening, so that I can remove some of them if the count for the day is getting too high or stories have been duplicated on different sites.

    4) The utility would then use Task Scheduler to open IE 5 for me while I am at a meeting or after hours, automatically open each link in turn, grab the text contents and park them for me in a daily file, to be read later. I rarely want embedded pictures or graphics, but a yes/no choice for them against each target would be really neat. I would also like the utility to be compatible with the use of WebWasher, to avoid downloading advertisements and pop-ups.
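The four steps above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not an existing utility: the names `add_link`, `html_to_text`, `fetch_text` and `save_queue` and the `clippings-YYYY-MM-DD.txt` file name are all made up for the example, the script fetches pages directly rather than driving IE 5, and it does not handle site registration, cookies, passwords or ad filtering. Step 3 (reviewing the list) would simply mean removing entries from the queue before saving.

```python
import datetime
import urllib.request
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of script/style blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text of an HTML page, one text fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def add_link(queue, url, source):
    """Step 2: queue a link with its source tag; skip URLs already queued."""
    if all(item["url"] != url for item in queue):
        queue.append({"url": url, "source": source})
    return queue


def fetch_text(url):
    """Download one page and reduce it to text (no pictures or graphics)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))


def save_queue(queue, folder, fetch=fetch_text, today=None):
    """Step 4: append each item to a dated file, never overwriting it."""
    today = today or datetime.date.today()
    path = Path(folder) / f"clippings-{today.isoformat()}.txt"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as out:  # "a" = append mode
        for item in queue:
            out.write(f"=== {item['source']} | {item['url']} ===\n")
            out.write(fetch(item["url"]) + "\n\n")
    return path
```

A script built on this could be started by Task Scheduler after hours. Because the daily file is opened in append mode, running it more than once on the same day adds to the file instead of replacing earlier contents.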

    • #544456

      Could this be done by adding something to the ‘Send to’ list, i.e. highlight item to be saved, right-click and send to a destination from where it can be viewed later? Don’t know how to do this, but it seems feasible.

      • #544461

        I think that works only for the whole page, but not for items accessed by links on the page.

        Right now I’m wondering whether I can bend ClipMate Clipboard Extender to my purpose, but I haven’t figured out how to “run” the clipboard using Windows’ Task Scheduler.

    • #544521

      That’s got possibilities — especially if someone can point me to a utility that can be timed to dial up and retrieve the list of links in one pass, without my having to play nursemaid at the keyboard.

      What surprises me most is that what I’m looking for is not already long available, from ‘way back when we used to live simply and happily by the CompuServe forums …

      Is there a ClipBoard Extender out there who can show me how to get from here to ‘clunk, click – eureka!’?

    • #544610

      Maybe this would work: “Cogitum Co-Citer is a tool for creating the collections of texts from the Internet. It captures the selected text, its Internet address, its title and date of adding to the collection.” It works GREAT for me and FREE !!!!! http://www.cogitum.com I hope it works for you.

      Coinman33

      • #544617

        Three cheers for Pennsylvania! I’m on my way over there right now, and I’ll report back when I’ve given it a whirl.
        Many thanks,
        Peter

      • #544621

        Coinman,
        Co-Citer looks to be a really great and smart text grabber for formal web-based research projects. I’ve downloaded it for use in its own right, for which I think it will be invaluable.

        But it cannot yet grab and follow links, so a page I want must be already open, and I don’t see that it will be possible to get it to queue targets on a clipboard, and then have it open the browser and vacuum up the full-length news items I want while I’m away from my desk.

        Apart from being able to schedule accumulated d/l targets, I guess I’m really trying to do what Co-Citer does, but in reverse: Co-Citer presumes that the user has read the item in full, has left it open, and wants to grab some or all of it. But I want to read headlines and synopses, and to have the grabber go back later to get the full stories from the link at the top or bottom of the headline or synopsis — so that I can read the day’s clipping file when I have leisure, and not be tied to my keyboard while cycling through 30-50 items per day.

        It’ll be very interesting to watch Cogitum’s products grow, using their sophisticated information model. I expect that they will integrate the two free text and image utilities, and probably add link grabbing. When they do, the resulting products will be VERY powerful.
        Thanks for your help!
        Peter
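The "reverse" workflow described in this reply, reading headlines first and collecting the links behind them for later, could start from something like the sketch below. It is a hedged illustration only: `LinkLister`, `list_links` and `pick` are hypothetical names, and the example works on HTML that has already been downloaded rather than on a live browser page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkLister(HTMLParser):
    """Collects (headline text, absolute URL) for every <a href> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own address.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = " ".join("".join(self._text).split())
            self.links.append((title, self._href))
            self._href = None


def list_links(html, base_url):
    """Return every link on a headlines page as (title, url) pairs."""
    parser = LinkLister(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def pick(links, numbers):
    """Keep only the links the reader ticked, using 1-based numbers."""
    return [links[n - 1] for n in numbers]
```

For example, `pick(list_links(html, "http://www.nytimes.com/"), [2, 5])` would keep the second and fifth links found on the page, ready to be fetched later while away from the keyboard.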

    • #544625

      How about http://www.newsisfree.com? “This site collects headlines from 1957 sources around the web and lets you manage them in new ways.”

      • #544675

        Here the site controls access to the publications – and it also systematically collects information on the user’s interests and visit frequencies.

        But for me the biggest problem is that it just doesn’t offer access to many of the science and technology publications and boards that I follow systematically on daily/weekly/monthly bases.

        Thanks for the heads-up!
        Peter
