• splitting a columm (excel 2003)

    Author
    Topic
    #408978

    hi everyone

    I have a (pretty much messy) column with adresses with combined adress and number.
    I want to seperate the number from the adress.
    the problem is that some adresses consist of multiple words ( so there are multiple spaces in the field)
    so the wizard ” text to colum” doenst help
    also the numbers vary from one to 4 digits
    for instance. “van der valk straat 320” or “damrak 1”
    so trimming right also doesn’t work

    the only I thing I think could work is to find the first space from RIGHT to Left(!) because there is always a space in between number and adress
    but I can’t find any function in excel to work with that.
    does anyone know a solution or a work around?

    thanks ahead!

    diederik

    Viewing 1 reply thread
    Author
    Replies
    • #868187

      If the house number is always the last “word” in the address, you can use the formulas below. But they will fail for situations like these:

      Herensingel 13 A
      Telderskade 3 huis

      since there is a space in what constitutes the “number”. If the addresses are written like this:

      Herensingel 13A
      Telderskade 3-huis

      it will be OK. Say that the addresses are in column A, starting with A1. Enter this formula in B1 to get the “street” part:

      =LEFT(A1,MAX((MID(A1,ROW(1:100),1)=” “)*ROW(1:100))-1)

      and confirm with Ctrl+Shift+Enter (it is an array formula). Next, enter this formula in C1 to get the “number” part:

      =MID(A1,MAX((MID(A1,ROW(1:100),1)=” “)*ROW(1:100))+1,100)

      and confirm with Ctrl+Shift+Enter. Finally, fill down as far as needed. See attached example.

      • #868199

        Hans,
        you’re great!
        thank you very much!

        Diederik

      • #868200

        Hans,
        you’re great!
        thank you very much!

        Diederik

      • #868525

        Hi Hans,

        I was a little to early with my reply yesterday.

        I still have a question about the formula
        I don’t completely understand what the ROW part does in the the formula
        and something goes wrong when I paste the formula down
        see example.

        how does that ROW command exactly work?
        (I do not quite understand the Array thing in this situation)
        thanks

        diederik

        • #868527

          You can fill down the formulas if you make the reference to 1:100 absolute: $1:$100.

          The idea behind these array formulas is from Using Array Formulas in Excel OFC-10 by Bob Umlas.

          ROW($1:$100) evaluates to the array (1, 2, 3, …, 100).
          MID(A1,ROW($1:$100),1) takes A1 apart into individual characters, for example (“D”, “a”, “m”, “r”, “a”, “k”, ” “, “1”, “”, “”, …)
          MID(A1,ROW($1:$100),1)=” ” compares the characters to a space, resulting in an array of TRUE/FALSE: (FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, …)
          This is multiplied by the character index, using the fact that FALSE = 0 and TRUE = 1:
          (MID(A1,ROW($1:$100),1)=” “)*ROW($1:$100) results in (0, 0, 0, 0, 0, 0, 7, 0, 0, …)
          The non-zero entries correspond to spaces in the original string.
          MAX((MID(A1,ROW($1:$100),1)=” “)*ROW($1:$100)) returns the index of the last space in the string, in this example 7.

          For the street part (column , we take everything to the left of this character, for the number part (column C), everything to the right of it.

          • #868531

            Hi Hans,

            Thanks for the explanation.
            That link to explaning array functions is also very usefull!!

            Greetings again.

            Diederik

            • #868941

              Here’s an alternative way without arrays.

              Ken

            • #869035

              Here is the non-array formula, even another space behind the “number” still can do the work.

              Herensingel 13 A
              Telderskade 3 huis B

              Enter this formula in cell B1 to get the “street” part:

              =LEFT(A1,MIN(FIND(“~”,SUBSTITUTE(A1,{0;1;2;3;4;5;6;7;8;9},”~”)&”~”))-1)

              Then, in cell C1 enter this formula to get the “number” part:

              =SUBSTITUTE(A1,B1,””)

              Regards

            • #869077

              Welcome to Woody’s Lounge. Nice formulas! thumbup

              If the street name contains a number, the result is still incorrect, but no need to worry. The Dutch postal service uses an application to split addresses; it contains algorithms to handle most cases, but falls back on a list of exceptions for the really difficult addresses.

            • #869101

              Thanks Hans,

              Just play a way in looking for the digit position, in lieu of last space position

              Sometimes I found that it is danger to use the formula in look for last space postion, if the raw source has not cleaned ( that is the data with some hidden space or nonprintable characters )

              The formula can be modified, if the street name contains a number

              Example:
              Heren2singel 13 A

              B1 :
              =LEFT(A1,SMALL(FIND(“~”,SUBSTITUTE(A1,{0;1;2;3;4;5;6;7;8;9},”~”)&”~”),2)-1)

              Regards

            • #869102

              Thanks Hans,

              Just play a way in looking for the digit position, in lieu of last space position

              Sometimes I found that it is danger to use the formula in look for last space postion, if the raw source has not cleaned ( that is the data with some hidden space or nonprintable characters )

              The formula can be modified, if the street name contains a number

              Example:
              Heren2singel 13 A

              B1 :
              =LEFT(A1,SMALL(FIND(“~”,SUBSTITUTE(A1,{0;1;2;3;4;5;6;7;8;9},”~”)&”~”),2)-1)

              Regards

            • #869078

              Welcome to Woody’s Lounge. Nice formulas! thumbup

              If the street name contains a number, the result is still incorrect, but no need to worry. The Dutch postal service uses an application to split addresses; it contains algorithms to handle most cases, but falls back on a list of exceptions for the really difficult addresses.

            • #869036

              Here is the non-array formula, even another space behind the “number” still can do the work.

              Herensingel 13 A
              Telderskade 3 huis B

              Enter this formula in cell B1 to get the “street” part:

              =LEFT(A1,MIN(FIND(“~”,SUBSTITUTE(A1,{0;1;2;3;4;5;6;7;8;9},”~”)&”~”))-1)

              Then, in cell C1 enter this formula to get the “number” part:

              =SUBSTITUTE(A1,B1,””)

              Regards

            • #868943

              Here’s an alternative way without arrays.

              Ken

          • #868532

            Hi Hans,

            Thanks for the explanation.
            That link to explaning array functions is also very usefull!!

            Greetings again.

            Diederik

        • #868528

          You can fill down the formulas if you make the reference to 1:100 absolute: $1:$100.

          The idea behind these array formulas is from Using Array Formulas in Excel OFC-10 by Bob Umlas.

          ROW($1:$100) evaluates to the array (1, 2, 3, …, 100).
          MID(A1,ROW($1:$100),1) takes A1 apart into individual characters, for example (“D”, “a”, “m”, “r”, “a”, “k”, ” “, “1”, “”, “”, …)
          MID(A1,ROW($1:$100),1)=” ” compares the characters to a space, resulting in an array of TRUE/FALSE: (FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, …)
          This is multiplied by the character index, using the fact that FALSE = 0 and TRUE = 1:
          (MID(A1,ROW($1:$100),1)=” “)*ROW($1:$100) results in (0, 0, 0, 0, 0, 0, 7, 0, 0, …)
          The non-zero entries correspond to spaces in the original string.
          MAX((MID(A1,ROW($1:$100),1)=” “)*ROW($1:$100)) returns the index of the last space in the string, in this example 7.

          For the street part (column , we take everything to the left of this character, for the number part (column C), everything to the right of it.

      • #868526

        Hi Hans,

        I was a little to early with my reply yesterday.

        I still have a question about the formula
        I don’t completely understand what the ROW part does in the the formula
        and something goes wrong when I paste the formula down
        see example.

        how does that ROW command exactly work?
        (I do not quite understand the Array thing in this situation)
        thanks

        diederik

    • #868188

      If the house number is always the last “word” in the address, you can use the formulas below. But they will fail for situations like these:

      Herensingel 13 A
      Telderskade 3 huis

      since there is a space in what constitutes the “number”. If the addresses are written like this:

      Herensingel 13A
      Telderskade 3-huis

      it will be OK. Say that the addresses are in column A, starting with A1. Enter this formula in B1 to get the “street” part:

      =LEFT(A1,MAX((MID(A1,ROW(1:100),1)=” “)*ROW(1:100))-1)

      and confirm with Ctrl+Shift+Enter (it is an array formula). Next, enter this formula in C1 to get the “number” part:

      =MID(A1,MAX((MID(A1,ROW(1:100),1)=” “)*ROW(1:100))+1,100)

      and confirm with Ctrl+Shift+Enter. Finally, fill down as far as needed. See attached example.

    Viewing 1 reply thread
    Reply To: splitting a columm (excel 2003)

    You can use BBCodes to format your content.
    Your account can't use all available BBCodes, they will be stripped before saving.

    Your information: