I am trying to extract different parts from text strings, for the purpose of later using AutoFilter or some other filter etc.
Among other information in a sheet I have a column with text strings. A basic example could look like this:
ABC123_XY4567_Name;Another name_Third name_**.xyz
The information that is interesting is the numbers and names and eventual last codes (**), it can even be some more (_**_##). In this first step I’m working with, the first name (“Name”), is the same for all the text strings, so I’m not extracting that at the moment.
The problem is that sometimes the string looks a little different, but still with the main separators. Ex. another ABC code in string. At the moment I haven’t bothered with those who have two codes, i.e. extracting the second in another column but guess it would be simple and useful.
ABC123_ABC456_XY4567;1_Name;Another name_Third name_**.xyz
But as you can see it sometimes also is part of a set (“1_Name”) and then it has another first “;” suddenly before “Name”.
So, if we look at the first example and I extract “Another name”, I could search for “;” add 1, and get the start position for A in “Another name”. Works OK. But since it sometimes is another “;” earlier in the string I must use better approach. IF it’s an earlier “;” it’s always after XY code, I could then search for XY instead and add 6 or something to get a start position for the real search for “;”, i.e. the second. But I get some problem with the 1_.
The first parts are easy:
Let’s say the first string above is in B10, then I could set up columns for: “ABC” “XY” “Another name” and “Third name”.
ABC
=MID(B10;4;3) gives 123
XY
=MID(B10;SEARCH(“XY”;B10)+2;4) gives 4567
Another name
=MID(B10;SEARCH(“;”;B10)+1;SEARCH(“_”;B10;SEARCH(“;”;B10)+1)-SEARCH(“;”;B10)-1) gives Another name.
But if the string looks like these:
ABC789_XY4567;1_Name;Another name_Third name_**.xyz
ABC123_ABC789_XY4567;1_Name;Another name_Third name_**.xyz
with or without two ABC codes, but the odd ;1_ before name, I don’t get Another name, I get 1. It’s the first semicolon that’s the problem.
And last the Third name; many times the string ends with special codes, (not exact but as example) _** or _## or _**_##. Thus when extracting Third name I initially searched for last “_” to get end position for Third name. Worked OK.
=MID(B10;SEARCH(“_”;B10;25)+1;SEARCH(“_”;B10;SEARCH(“_”;B10;25)+1)-SEARCH(“_”;B10;25)-1) gives Third name.
But if the string ends like this, i.e. no end codes after Third name:
_Third name.xyz
I get #VALUE error.
So problems are (at the moment ): Another name if a first “;” and Third name since string can end differently (code _** or just “.”).
Any help is greatly appreciated.