• Document has box pox — where do they come from? (Word 2000)

    #389155

    I’ve attached part of a document to show you the little boxes that are appearing arbitrarily throughout. We don’t know where they come from, but we see them in documents periodically. You can’t search for them because you can’t paste them into the Find string. The only way we know to get rid of them is to scroll through the entire document manually and delete them individually.

    Has anyone seen these before?

    We’re using Word 2000 (9.0.6926 SP-3) on Windows 2000 Professional, Version 5.0.2195, Service Pack 3, Build 2195.

    Russ

    • #686202

      They seem to be the thing you get when you type Alt+0015 (using the number pad). But this isn’t searchable, so I guess that doesn’t help much. My guess is that it’s an artifact of the disintegration of a Word control code of some kind. I don’t see the pattern, but then, I don’t know what this document is supposed to look like.

      • #686211

        The document I attached was several selected portions of a larger document, so it didn’t show you an obvious pattern of where the boxes tended to appear. They mostly seem to occur at the beginning of something — a paragraph, a page, but not always, since sometimes they come right in the middle of a paragraph. However, on several that occurred in the middle of a paragraph, they appeared at the beginning of a new page where the paragraph had broken (i.e., the box was the first character on the new page).

    • #686208

      I agree with Jefferson on the ASCII code, but you can search for and replace these characters by finding ^15 and replacing with nothing.

      I don’t know how they get there by themselves but they are easy to get rid of.

      The macro I use to find the ASCII and Unicode values of the first selected character is below.
      Sub temp1()
          ' Show the ANSI (Asc) and Unicode (AscW) codes of the first selected character
          MsgBox "Ascii: " & Asc(Selection.Characters(1).Text) & vbCr & "Unicode: " & AscW(Selection.Characters(1).Text)
      End Sub

      • #686219

        Yo, Andrew!! I was just talking to the attorney who had the problem document and told him about your solution. He was able to do a Find/Replace and get them out. Thanks so much. We’ve been struggling with this for a long time.

        Russ

      • #702868

        Andrew, you helped me with this problem two months ago, but I can’t figure out how to make your solution work on this latest occurrence.

        I’ve attached a 3-line document that has a box in it (before the word “impair”). I used the macro you gave me to identify its Ascii value (63) and Unicode value (-557), but I don’t know what to do with those values. I tried searching for ^63 and -557, but no luck. What am I doing wrong?

        Russ

        • #702888

          Russ

          I don’t think my solution works with this current problem, and I don’t know why.

          But there is a search and replace solution. You can copy this object and paste it into the find box and then search (and replace) will find it.

          I am using Word 97 on this machine, so I haven’t tested this on Word 2000 or 2002, but I expect them to be the same.

          • #702890

            Andrew,

            Unfortunately, those boxes won’t copy into the Find string. That’s why I needed an alternate solution. I thought we had found the cure-all in your macro, but I guess it doesn’t always apply. Thanks for getting back. –Russ

            • #702892

              In my version of Word on this machine (Word 97 SR2) copying the box to the find does work. Are you trying to do this in the application or by using VBA code?

              I don’t know why that would have changed in Word 2000. Do you have access to an older version of Office to check whether it’s possible there?

              I will test this on another machine (Word 2002) tonight and let you know if I have any luck.

            • #702895

              Hi Andrew:
              I tested it on a Word 2000 machine & it didn’t work. I tried to search using copy & paste, ^63, ^063, & ^0063. No luck.

            • #703083

              Thanks, guys. When I get to work today, I’ll fiddle with some of what you mentioned.

              Andrew, I have Word 98 on a Mac here at home and the box won’t copy into the Find string on it either. Odd that you can do it in Word 97.

        • #702908

          I think we need Klaus Linke to weigh in here. I’m sure at one point he explained where the negative values come from and how to make them useful, but of course, I can’t access it right now.

          Even if I don’t understand how Word handles Unicode strings, it does seem to understand this, which I threw into the Immediate window and it did lock on to the critters:

          Selection.Find.Execute ChrW(-557)

          If you record a Replace All macro with some other character, open the macro editor and use this code in place of whatever you recorded with, perhaps that will do the trick.

          • #703144

            Jefferson, your suggestion was a big help. I recorded a Replace All macro with another character (“r”), opened the macro editor and replaced the “r” (including the quote marks) with “chrw(-557)” (without the quote marks). I then ran the macro, which found and deleted the box. Interestingly, when I opened the Find dialog box, that box character was still in the Find string, left over from running the macro. And sure enough, I did a Find Next and it found the box. Thanks.
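            For readers who want to try the same thing, the edited macro described above might look roughly like the sketch below. This is not the exact macro from the thread: the name RemoveUnicodeBoxes is hypothetical, and the settings mirror what the macro recorder typically produces for a Replace All, with the recorded search text swapped for ChrW(-557).

```vba
Sub RemoveUnicodeBoxes()
    ' Hypothetical macro name. ChrW(-557) is the character code that the
    ' AscW macro earlier in this thread reported for the box.
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = ChrW(-557)          ' search text: the box character, not a quoted string
        .Replacement.Text = ""      ' replace with nothing, i.e. delete
        .Forward = True
        .Wrap = wdFindContinue      ' keep going past the end of the document
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub
```

            If the AscW macro reports a different negative number in another document, substituting that number for -557 should work the same way, on the assumption that the box is a single character.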

            One additional thing I just discovered. After running the macro that deleted the box, I did an Undo to get the box back. But instead of the box reappearing, a non-breaking hyphen appeared in its place. I guess that tells us something, but I’m not sure what.

            By the way, how do you bold something in our posts here in Woody’s Lounge? I tried Ctrl-B, but it didn’t work.

            • #703146

              >> By the way, how do you bold something in our posts here in Woody’s Lounge?

              See Help 19. You can either use the 1-Click TagPanel that is available to the right of the subject box when you’re creating a post or reply, or type the “tags” yourself. For instance, to make a word bold, put [b] before it and [/b] after it: [b]this[/b] is displayed in bold.

            • #703148

              Wow, Hans, you are quick. Thanks again.

          • #705006

            Missed this; I’m just back from holidays in Switzerland.
            And I can’t add much to what you already figured out.

            VBA only knows signed “short” Integers (of length 2 bytes), so any hex number between &H8000 and &HFFFF appears as a negative decimal number.
            As Alan figured out, the decimal code -557 would “really” be character U+FDD3, which, according to the Unicode standard, belongs to the block “Arabic Presentation Forms-A”.
            None of the characters in this block are really necessary, since they are contextual variants of the primary Arabic letters, spacing forms, ligatures, and the like.
            U+FDD3 and a few others are marked as “<not a character>”, which usually means they had some character associated with them in an old code page that is better represented by some other Unicode character.
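            The arithmetic behind the negative number can be checked with a short macro. This is only a sketch: the name ShowUnsignedCodePoint is hypothetical, and -557 is simply the value reported earlier in the thread. Masking the signed Integer with &HFFFF& gives the unsigned code as a Long, and 65536 - 557 = 64979, which is FDD3 in hex.

```vba
Sub ShowUnsignedCodePoint()
    ' Hypothetical helper name. AscW returns a signed 16-bit Integer, so
    ' codes from &H8000 to &HFFFF come back as negative numbers.
    Dim signedCode As Integer
    Dim unsignedCode As Long

    signedCode = -557                       ' value reported earlier in the thread
    unsignedCode = signedCode And &HFFFF&   ' keep the low 16 bits as a Long: 64979

    ' Hex() turns the unsigned value into the usual U+ notation (FDD3).
    MsgBox "Unsigned: " & unsignedCode & vbCr & "Hex: U+" & Hex(unsignedCode)
End Sub
```

            Stored low byte first, FDD3 is the byte pair D3 FD, or 211 253 in decimal.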

            Maybe it’s some punctuation or spacing character from an old implementation of storing Arabic text.

            If the document was converted from some document containing Arabic text, that could explain these funny characters — but I’m just guessing.

            cheers Klaus

            • #705280

              Klaus, thanks for your reply. The document was something we scanned using OmniPage Pro, Version 10.0 (OCR Engine Version 10000) and then saved as a Word document (in the format “Word 97/2000” within Windows 2000).

              Russ

            • #705298

              > The document was something we scanned using OmniPage Pro

              Ouch! I’ve used OmniPage a bit myself, and the people who are responsible for the Word export filter don’t seem to have much of a clue about Word.

              So it’s probably just some random junk after all.
              cheers Klaus

            • #705302

              Hmmm... There is a certain randomness to the boxes’ appearance. They appear in several kinds of places: sometimes in place of quotation marks or page breaks, sometimes right in the middle of a line for no apparent reason. Not very consistent.

            • #705318

              I have been using OmniPage Pro since version 10, and now use version 12, and have never had this problem.

              DaveA I am so far behind, I think I am First
              Genealogy....confusing the dead and annoying the living

            • #705321

              Dave, I haven’t been able to recreate a situation where the boxes appear, so I still don’t have a handle on it.

              I tried scanning the identical text that produced these latest boxes, then saving the document in Word format, but no boxes appeared.

              Since we use both Word and WordPerfect at my firm and since we’ve never had this problem with WordPerfect, my solution for scanning is to save the document in WordPerfect, then paste as Unformatted Text into Word.

    • #702960

      Hi Russ

      I believe that this “anomaly” can occur when a character is inserted using Insert -> Symbol. The box shows up as the byte pair D3 FD in a hex editor, or 211 253 in decimal. This is out of character (pun intended) with Word’s method of storing other characters, in which the second byte is always 00.

      Don’t ask me why, but I think this is the root of the problem. It used to create problems in the RTF for the Windows Help compiler, and one had to paste symbols from the Windows Character Map rather than use Insert -> Symbol. And Asc values 40, 63 and 95 are apparently the offenders.

      Alan
