• Converting multilingual PDF to Word

    • This topic has 5 replies, 5 voices, and was last updated 9 years ago.
    #505254

    Hi there. I’m trying to take text from a PDF (https://www.cms.gov/CCIIO/Resources/Regulations-and-Guidance/Downloads/Appendix-B-Sample-Translated-Taglines.pdf) and put it into Word. Copying and pasting is not quite doing it. I have Adobe Acrobat, so I’m able to save it as a Word doc and an RTF, but even that doesn’t fully help. The languages that are giving us problems are Arabic, Burmese, and Persian.

    This is in Word 2010; we are scheduled to update to 2013 later this year, but for now I don’t have it.

    Is there something I need to install to properly convert the text?

    • #1559763

      There are quite a number of online PDF-to-Word converters, which work fairly well with a page of text, but are of variable success when you feed them filled-in forms.

      I would say that you would stand no chance whatever of converting a segment of PDF in those languages into Word! (Just my opinion!)

      If what you show is the full extent of what you need to convert, I would suggest a retype by someone who knows those languages.

      BATcher

      Plethora means a lot to me.

    • #1559789

      Or maybe Select and Copy each different-language section and Paste into an online translation page/site. Some listed here:
      https://duckduckgo.com/?q=online+translator
      You’d have to choose the language to translate from for each. Then Select and Copy each English version and Paste into the document.

      Before you wonder "Am I doing things right," ask "Am I doing the right things?"
    • #1559792

      Does the attached Word doc look correct as a Burmese copy?

      If so, I managed to do this by copying from the PDF file and pasting into Word after I had installed the Burmese font.

      • #1559848

        Unfortunately I can’t tell, as the text does not render on my machine. I will have to install the Burmese font. I appreciate the effort, though. The fact that you generated a Word document that looks good to you means that I can do the same thing when I install the font. That’s great news.

        The Arabic and Persian conversion threw me off because it looked like it was just garbled (PDF-to-Word conversions are not known for being flawless). It seems that the issue is that the words are right-to-left in the PDF, but they come over to Word as left-to-right. Looks like I may have to write a quick macro to reverse the order of the words.

        Will do more digging. I appreciate your input.
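
A minimal sketch of the word-reversal idea mentioned above, written in Python rather than as a Word macro (that choice, and the function name `reverse_word_order`, are assumptions for illustration). It assumes the only damage is that the words on each line arrive in reversed order while the characters inside each word are intact; lines that mix in left-to-right content such as numbers or English would need more careful handling.

```python
def reverse_word_order(text: str) -> str:
    """Reverse the order of whitespace-separated words on each line.

    Hypothetical helper: it assumes each word is internally correct and
    only the sequence of words on a line was flipped during conversion.
    """
    fixed_lines = []
    for line in text.splitlines():
        words = line.split()  # split on any run of whitespace
        fixed_lines.append(" ".join(reversed(words)))
    return "\n".join(fixed_lines)


if __name__ == "__main__":
    # Latin placeholders stand in for the Arabic or Persian words.
    sample = "word3 word2 word1\nword5 word4"
    print(reverse_word_order(sample))
```

It may be worth first checking whether setting the paragraph direction to right-to-left in Word fixes the display, since the text might already be stored in the right order and only be laid out in the wrong direction.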

    • #1559797

      The OP didn’t say he wanted a translation, just a format conversion. My guess is that he lacks the appropriate fonts on his system. Acrobat was specifically designed to allow font embedding for accurate rendering on any system. However, that only solves the font problem for Acrobat, not for Word.

      OP needs to install fonts for Arabic, Burmese, and Persian. While fonts are plentiful and cheap, these aren’t exactly standard fonts on Western computer systems. When I check my system I have a font called “Arabic Typesetting Regular”, but nothing for the other two languages. Unless they have been named in a way that would cause me to overlook them…
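
To see which writing systems a pasted passage actually contains, and therefore which fonts are worth installing, something like the following rough Python sketch could help. It relies only on the standard `unicodedata` module; the function name `scripts_in` is hypothetical, and taking the first word of each character's Unicode name as its script is a simplification. Persian is written in the Arabic script, so both languages report as `ARABIC`, and Burmese reports as `MYANMAR`.

```python
import unicodedata
from collections import Counter


def scripts_in(text: str) -> Counter:
    """Count letters per script, using the first word of the Unicode name.

    Rough heuristic: "ARABIC LETTER MEEM" counts as ARABIC,
    "MYANMAR LETTER MA" as MYANMAR, "LATIN SMALL LETTER A" as LATIN.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits, punctuation, combining marks
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    return counts


if __name__ == "__main__":
    # Escapes are used so the sample survives any editor or font setup.
    sample = "abc \u0645\u0631\u062d\u0628\u0627 \u1019"
    for script, count in scripts_in(sample).most_common():
        print(script, count)
```

If the counts show the expected scripts, the characters survived the copy and the remaining problem is most likely fonts or text direction rather than lost data.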
