• Cleaning up HTML (97 & XP)

    Author
    Topic
    #399940

    With Office 97, I could paste text from Word or Excel into FrontPage 97 and get fairly clean HTML. Office XP is not smart enough to handle this fairly simple task. It adds a lot of extra junk, especially with Excel XP. I can run MS Filter to get rid of a little bit of this fat but a bunch remains. Any suggestions on how to either make Office XP work as well as Office 97 or a utility that will truly clean up the bloated files Office XP produces?

    Ronny

    Viewing 6 reply threads
    Author
    Replies
    • #776639

      I paste to Notepad, then copy from there and paste into FrontPage. But this might not be slick enough for your needs.

      Many solutions have been suggested in the past but I’m not sure that a satisfactory one really has been found. Boards to search include this one, as well as the application boards (Word and Excel) and the Web Design forum. Good luck!!

      • #776840

        That is a good idea for handling Word files. Thanks, I’ll start doing that. It would not, of course, work with Excel files. They get blasted a lot worse than Word files do.

        Ronny

        • #777196

          Do you do VBA? Hans posted a VBA procedure for Excel that converts it to a table for the Lounge. Lounge markup is very similar to HTML, so with a little tweaking (start by switching the [ and ] brackets to < and >), it just might do what you need (you’d paste directly into the HTML pane). Check out post 164109.
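
          The bracket swap mentioned above can be sketched in a few lines. This is a Python illustration rather than the VBA procedure from post 164109 (which is not reproduced here), and it assumes the bracket markup uses tag names that mirror HTML, such as [table], [tr], and [td]. The tag list and the function name `brackets_to_html` are hypothetical.

```python
import re

# Hypothetical set of bracket tags assumed to mirror HTML tags.
KNOWN_TAGS = ("table", "tr", "td", "th", "b", "i")

# Matches [tag] or [/tag] for the known tags only, in any letter case.
_TAG_RE = re.compile(r"\[(/?)(%s)\]" % "|".join(KNOWN_TAGS), re.IGNORECASE)


def brackets_to_html(text):
    """Replace [tag] and [/tag] with <tag> and </tag> for known tags."""
    return _TAG_RE.sub(
        lambda m: "<%s%s>" % (m.group(1), m.group(2).lower()), text
    )


if __name__ == "__main__":
    print(brackets_to_html("[table][tr][td]A[/td][/tr][/table]"))
```

          Restricting the swap to a known tag list leaves other bracketed text, such as a footnote marker like [1], untouched.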

          • #777439

            I’m afraid I don’t know the first thing about VBA. Anyway, Office 97 keeps the shading when I do the paste and I would really like to do that. I am a teacher and I am posting my grade book online. (Students have codes not names.) I shade any assignment that was not turned in as red. That helps the students spot what they are not doing.

            Ronny

            • #778126

              With a bit (okay a lot!) of work I was able to capture two color fills, yellow and red. Here’s how you’d use it:

              1. Open the attached workbook (Enable macros) and your source workbook.
              2. In your source workbook, select the cells you want in your HTML page.
              3. Press Alt+F8 to open the Macros dialog and choose the one macro (I’d include the name, but I forgot to make a note of it before closing Excel).
              4. Assuming it doesn’t mention any errors (fingers crossed), the table code is on the clipboard and you can now paste it into the HTML pane of your FrontPage document.

              I tested in Excel 2002. Does it work for you?
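
              The attached macro's code is not shown in this thread, so the following is only a rough Python sketch of the idea it describes: turn a block of cells into a plain HTML table and carry over two fills, red and yellow, as `bgcolor` attributes. The input format (rows of text/fill pairs), the color values, and the name `cells_to_table` are assumptions made for illustration.

```python
from html import escape

# Assumed mapping from a fill name to an HTML color; only the two fills
# mentioned in the post are handled.
FILL_COLORS = {"red": "#FF0000", "yellow": "#FFFF00"}


def cells_to_table(rows):
    """Build a plain HTML table from rows of (text, fill) pairs.

    fill is "red", "yellow", or None; any other value is ignored.
    """
    lines = ['<table border="1">']
    for row in rows:
        cells = []
        for text, fill in row:
            color = FILL_COLORS.get(fill)
            attr = ' bgcolor="%s"' % color if color else ""
            cells.append("<td%s>%s</td>" % (attr, escape(str(text))))
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    # One header row and one student row with a missing (red) assignment.
    print(cells_to_table([
        [("Code", None), ("HW1", None)],
        [("A17", None), ("0", "red")],
    ]))
```

              The output contains only table, row, and cell tags, so it is meant to be pasted into the HTML view rather than the normal editing view.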

            • #779435

              It is a good start. It runs without error and it transfers the data without any of the junk. With a straight Excel -> FrontPage paste (I use 97) you can paste into the regular screen. After running this macro, you have to be sure to paste in the HTML view. That is not a problem.

              The macro seems to strip out all the formatting. That is actually better than having the junk Microsoft adds and might be easier to work around. As you can see from the attached before (Excel) / after (FrontPage 97), it removes:

              1. Background colors
              2. Cell merges (minor)
              3. Number formatting
              4. Cell formatting, e.g. Bold

              I don’t know how many of these problems are due to my using the older FrontPage 97 rather than a newer version.

              Ronny

              PS: This is my gradebook so I’ve erased all names.

            • #780719

              I think it would be pretty easy for an Excel-VBA person to help you preserve the bold. I’m not sure about the number formatting; it would depend on the complexity, because the code could only handle so many options before it became too unwieldy. Regarding the background color, I took “red” to mean pure red, not pink. You or an Excel-VBA person could modify the code to do pink, once the RGB value is determined.

            • #826862

              I have found a utility that can do this for me. It is called Detagger

              http://www.jafsoft.com/

              It is shareware and costs $25.00. It was designed to remove HTML tags and convert to text but the author showed me how to use it to remove non-standard HTML tags (e.g. the XML tags) using a policy option along with the software. In case anyone else has this problem, I’ve attached the policy file to this message and I’ll show the author’s instructions below:

              >You need to select to remove non-standard attributes (to lose the “x:str=” etc.) and to remove stylesheets (to lose the “style=” etc.).
              >
              >I’ve attached a policy file that has the required option in it,

              It works perfectly. I end up with nothing but clean HTML. As soon as I had run a few tests, I went out and registered the software.

              My one limitation now is that the cell shading gets lost in the process. It is not visible when I first paste the contents into FrontPage from Excel so the problem is with the Excel to FrontPage conversion and not with the Detagger conversion. I hate it but it’s a small loss to put up with rather than having all the XML mess in the file.

              Ronny
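
              The cleanup described in the quoted instructions (dropping non-standard attributes such as `x:str` and removing style information) can be sketched with Python's standard `html.parser` module. This is not Detagger's implementation or its policy file, only an illustration of the same idea. The names `AttributeStripper` and `clean_html` are hypothetical, and the choices to also drop `class` attributes, namespaced tags such as `<o:p>`, and comments are assumptions.

```python
from html import escape
from html.parser import HTMLParser


class AttributeStripper(HTMLParser):
    """Re-emit HTML while dropping Office-specific markup.

    Comments and declarations are not re-emitted (an assumption).
    """

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip = 0  # greater than 0 while inside a <style> element

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.skip += 1
            return
        if self.skip or ":" in tag:  # drop namespaced tags such as <o:p>
            return
        kept = []
        for name, value in attrs:
            # Drop namespaced attributes (x:str, x:num) and style=.
            # Dropping class= as well is an assumption of this sketch.
            if ":" in name or name in ("style", "class"):
                continue
            if value is None:
                kept.append(" " + name)
            else:
                kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_startendtag(self, tag, attrs):
        if tag != "style":
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "style":
            self.skip = max(0, self.skip - 1)
        elif not self.skip and ":" not in tag:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip:
            self.out.append("&%s;" % name)

    def handle_charref(self, name):
        if not self.skip:
            self.out.append("&#%s;" % name)


def clean_html(fragment):
    """Return fragment with style blocks and non-standard attributes removed."""
    parser = AttributeStripper()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(clean_html('<td class=xl24 style="color:red" x:str="a">5</td>'))
```

              Plain HTML attributes such as `bgcolor` pass through unchanged; anything that was expressed only through `style=` or a stylesheet is removed.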

    • #776967

      You might want to check out post 242602. I didn’t test Excel, so I’d be interested in your results.

      Cheers

      • #777443

        His method #5 is the only one he found that worked great:

        Method 5 – Double Filter the Word Document and Insert into FrontPage.
        First save the Word document as “Filtered” HTML.

        Excel does not appear to have a Filtered HTML option, so the only good method he found cannot be used with Excel. I suppose I could paste the table into Word and then use this approach.

        I tried Excel -> Word -> MSClean -> FrontPage and it worked no better than a direct paste. In fact, it did worse, since Word created the little GIF boxes and scattered a dozen or more around my table.

        Interestingly, the HTML file was 2K larger after it was “cleaned” than before. Something strange is going on here.

        Ronny

    • #776969

      Ronny,

      (The following is a repeat of an earlier broadcast..) I also have found that Word XP (2002) puts an excessive amount of overhead code in FrontPage 2000. What I have done is copy from Word to the buffer and then in FrontPage 2000 perform a Paste Special – Normal Paragraphs with line breaks (edit/paste special). This will reduce all the ‘junk’ that Word added. Note that if you have any special formatting in Word you will lose it upon a Paste Special. Anyway, play with it until you find a satisfying setting.

      LMD

      • #777445

        That seems to be the tradeoff, lose your formatting or keep all the crap Microsoft adds to the file.

        Ronny

    • #778154

      Is there a program that can clean up FrontPage generated HTML and take out all the redundant and unnecessary tags it adds?

      • #778388

        Which version of FP are you working with?

        • #779439

          FrontPage 97. Until this pasting thing came up with Office 2000/XP, it worked great and met all my needs, so I never saw a reason to “upgrade”.

          Ronny

          • #782059

            BTW – did you try the Insert file function in FrontPage? I don’t have the older version on my system – but when I test with Excel XP/FP XP it comes across beautifully. However, I’m still struggling to get “clean” code from Word XP to FP XP. Apparently the HTML filter will not install with Office XP on first.

            Cheers

      • #778503

        Check out this site: http://www.vorburger.ch/kissfp/ KissFP does a great job. I’ve used it with FP2000, XP and 2003. You’ll have to pay for it but it’s been invaluable to me.
