• Address parsing

    Author
    Topic
    #1769247

    Does anyone know of samples of address parsing and validation code? I would like to parse American addresses into their component parts in the same way that CASS-certified software must do in order to facilitate ZIP+4 lookup. I do NOT need CASS-certified software, however. The stored values will be put into fields for GIS software to use.

    Viewing 2 reply threads
    Author
    Replies
    • #1782716

      Hi Eddie –

      If the address ‘fields’ are comma delimited, I assume there is no problem.

      The problem is when the fields are unknown. One can make some educated guesses, but there are no guarantees. For example
      123 main street anytown astate azip
      or
      123 apple blossom lane SE apt a-1 name of the retirement village anytown astate azipplus4

      etc.

      What kind of case are you dealing with?

      • #1782718

        My idea would be to allow users to enter the street address as one continuous line and then have a function that parses the line into its component parts. According to the USPS, the majority of addresses have a proper structure, something like street number, pre-directional, street name, street type, post-directional, secondary designator (Apt, Bldg, and so on), and secondary number (the apartment or building number). Of course, the fun comes with Rural Routes, PO Boxes and the like. The USPS recommends parsing from the right. At first blush, I can picture the process. But I was wondering if someone has already done this.
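For what it's worth, here is a rough Python sketch of the right-to-left idea described above. It is only an illustration under stated assumptions: the function name `parse_street_line`, the dictionary keys, and the tiny lookup sets are all made up for this example, and it does not attempt Rural Routes, PO Boxes, or grid-style addresses.

```python
# Tiny illustrative lookup sets (assumptions for this sketch). A real parser
# would load the full USPS tables of suffixes and unit designators.
DIRECTIONALS = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}
STREET_TYPES = {"ST", "STREET", "AVE", "AVENUE", "RD", "ROAD",
                "LN", "LANE", "DR", "DRIVE", "BLVD", "CT", "COURT"}
UNIT_DESIGNATORS = {"APT", "BLDG", "STE", "SUITE", "UNIT", "RM"}


def parse_street_line(line):
    """Split one street address line into parts, peeling tokens off the right."""
    tokens = line.replace(",", " ").replace(".", "").split()
    parts = {
        "number": "", "pre_directional": "", "street_name": "",
        "street_type": "", "post_directional": "",
        "unit_designator": "", "unit_number": "",
    }

    # 1. Secondary unit: a designator followed by its identifier at the far right.
    if len(tokens) >= 2 and tokens[-2].upper() in UNIT_DESIGNATORS:
        parts["unit_designator"] = tokens[-2]
        parts["unit_number"] = tokens[-1]
        tokens = tokens[:-2]

    # 2. Post-directional, now the rightmost token if present.
    if len(tokens) > 2 and tokens[-1].upper() in DIRECTIONALS:
        parts["post_directional"] = tokens.pop()

    # 3. Street type (suffix).
    if len(tokens) > 2 and tokens[-1].upper() in STREET_TYPES:
        parts["street_type"] = tokens.pop()

    # 4. House number: the leftmost token, if it contains any digit.
    #    This deliberately accepts letter-prefixed numbers such as E300.
    if tokens and any(ch.isdigit() for ch in tokens[0]):
        parts["number"] = tokens.pop(0)

    # 5. Pre-directional, only if a street name would still remain after it.
    if len(tokens) > 1 and tokens[0].upper() in DIRECTIONALS:
        parts["pre_directional"] = tokens.pop(0)

    # Whatever is left in the middle is treated as the street name.
    parts["street_name"] = " ".join(tokens)
    return parts
```

Calling `parse_street_line("123 Apple Blossom Lane SE Apt A-1")` should leave "Apple Blossom" as the street name, with the unit, directional, and street type peeled off first. Anything the lookup sets don't recognize simply ends up in the street name, which is why a user confirmation step would still be needed.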

        Ideally, I could allow three or more lines, including the last line (city, state, and ZIP), and parse the whole thing. Something along the lines of how MS Outlook does it.
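A minimal Python sketch of the multi-line idea, assuming the city, state, and ZIP always sit on the final line of the block and the state is a two-letter abbreviation. The names `parse_last_line` and `parse_address_block` and the regular expression are hypothetical, written just for this illustration; it does not validate the state code or the ZIP against any database.

```python
import re

# City (anything), then a two-letter state, then a 5-digit ZIP with optional +4.
# Commas or spaces are accepted as separators. This pattern is an assumption.
LAST_LINE = re.compile(
    r"^\s*(?P<city>.+?)[\s,]+(?P<state>[A-Za-z]{2})[\s,]+"
    r"(?P<zip>\d{5})(?:-?(?P<plus4>\d{4}))?\s*$"
)


def parse_last_line(line):
    """Return city/state/zip/plus4 for a last line, or None if it does not fit."""
    m = LAST_LINE.match(line)
    if m is None:
        return None
    return {
        "city": m.group("city").strip(),
        "state": m.group("state").upper(),
        "zip": m.group("zip"),
        "plus4": m.group("plus4") or "",
    }


def parse_address_block(text):
    """Treat the final non-blank line as the last line; keep the rest as-is."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {"address_lines": [], "last_line": None}
    return {
        "address_lines": lines[:-1],              # company name, street, etc.
        "last_line": parse_last_line(lines[-1]),  # None means "ask the user"
    }
```

A `None` result for the last line is the signal to fall back to asking the user, rather than guessing.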

        Also, ideally, if any point in the parsing process were ambiguous, the user would be shown what could be deduced programmatically and prompted to finish the parsing. (For example, Illinois and Wisconsin have street numbers that begin with letters rather than digits: E300, E301, and so on.)
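One way the "prompt the user when unsure" step might look, sketched in Python. It assumes the parsed result is a plain dictionary with keys `number`, `street_name`, and `street_type`; those key names, the function name `fields_needing_review`, and the rules themselves are assumptions for illustration only, not a USPS-defined check.

```python
import re

PLAIN_NUMBER = re.compile(r"^\d+$")
# Letter-prefixed numbers such as E300. The exact pattern is a guess.
LETTER_PREFIXED = re.compile(r"^[NSEW]\d+$", re.IGNORECASE)


def fields_needing_review(parts):
    """Return (field, reason) pairs the user should confirm before saving."""
    review = []
    number = parts.get("number", "")

    if not number:
        review.append(("number", "no house number found"))
    elif LETTER_PREFIXED.match(number):
        # Could be a real number like E300, or a directional run into a number.
        review.append(("number", "letter-prefixed number, please confirm"))
    elif not PLAIN_NUMBER.match(number):
        review.append(("number", "unusual house number format"))

    if not parts.get("street_name"):
        review.append(("street_name", "no street name found"))
    if not parts.get("street_type"):
        review.append(("street_type", "no street type recognized"))

    return review
```

An empty list means nothing looked suspicious under these rules. Anything else could be shown next to the pre-filled fields so the data entry person only has to fix what the code could not work out.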

        • #1782720

          I would go with the Outlook style, simply because it is ‘well known’. Allow 3 lines for the ‘address’, and a line each for the city state zip.

          Why bother parsing when the addressee _knows_ how it’s s’posed to look?

          • #1782721

            > Why bother…

            1. Unfortunately, data entry people don’t always take the time to check it. The names are handwritten on petition sheets. Sometimes the address is garbled and they are making a best guess at it. Some of our offices are entering 20,000 names/addresses a year. Sometimes speed is more important than accuracy.

        • #1782726

          Hawaii is even worse, and Utah has addresses on a grid (24 East Temple 12 South Temple). Some companies have a whole address or zip code to themselves, and you can’t rely on the city to match the address. The spelling in the USPS database may not match the address you’ve been given, so you’ll never be able to validate it. I’ve worked with apps that attempted to do this, and they are SLOW, even the best of them. Your best bet is to give the users three lines for the company name/street address and have the Enter key cause them to advance to the next field. You can waste a lot of time trying to get this right, and it probably won’t ever work as well as just letting them put it in the way it should appear. I know, because I beat my head against this same wall for over 3 years before I learned my lesson.

    • #1782746

      Third-party software might help. Some of it was written for names but might work for addresses. One is ParseRat and the other is Splitter 2000.
      http://www.users.bigpond.com/wemba/splitter.htm
      http://www.zdnet.com/downloads/stories/info/0,,0017U8,.html
      Do a search for ParseRat.

    • #1782748

      Look for a Third Party Utility that might be adapted for addresses instead of names.

      Splitter 2000
      From InfoPlan
      http://www.users.bigpond.com/wemba/splitter.htm
      http://www.zdnet.com/downloads/stories/info/0,,0017U8,.html

      ParseRat

      You might try Excel’s Text to Columns feature. Have plenty of blank columns to the right. Back up the Excel file before you try. You can then copy/paste more than one cell into new fields. Then do an update query from Excel into Access.
