Converting pdf to docx or merging two docx

Muthuvelkumaran · January 26, 2022, 3:52pm

Can someone provide a Java solution for merging two docx files, or for converting a PDF to docx?

Currently I have byte arrays for a docx (coming from outside via a SOAP service) and a PDF (generated on the Pega side).
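
When both payloads arrive as raw byte arrays, it can help to confirm which is which before handing them to a converter. The following is a minimal sketch using only the JDK; `PayloadSniffer` and `Kind` are hypothetical names, not part of Pega or any library. It relies on two well-known file signatures: a PDF begins with `%PDF`, and a docx is a ZIP archive, so it begins with `PK` followed by bytes 3 and 4.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class PayloadSniffer {

    enum Kind { PDF, ZIP_CONTAINER, UNKNOWN }

    // Inspects the leading bytes of the payload to guess its container format.
    static Kind sniff(byte[] data) {
        if (startsWith(data, "%PDF".getBytes(StandardCharsets.US_ASCII))) {
            return Kind.PDF;
        }
        if (startsWith(data, new byte[] {'P', 'K', 3, 4})) {
            // Any ZIP archive matches here: .docx, .xlsx, or a plain .zip
            return Kind.ZIP_CONTAINER;
        }
        return Kind.UNKNOWN;
    }

    // True when data is non-null and begins with every byte of prefix.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A ZIP signature alone does not prove the payload is a Word document, so this check is only a first filter before the real parsing step.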

karthickm2078 · May 16, 2022, 7:52am

@MuthuvelkumaranK0859 Hi Muthuvel,

Create a Java function for the conversion from DOCX to PDF. The only requirement is that all of the jars below are available in Pega:

  Apache POI 3.15
    org.apache.poi.xwpf.converter.core-1.0.6.jar
    org.apache.poi.xwpf.converter.pdf-1.0.6.jar
    fr.opensagres.xdocreport.itext.extension-2.0.0.jar
    itext-2.1.7.jar
    ooxml-schemas-1.3.jar

Note: Please check whether these jars already exist in Pega. If they do, building the function is straightforward.

Sample below:

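    // Imports needed: java.io.*, org.apache.poi.xwpf.usermodel.XWPFDocument,
    // org.apache.poi.xwpf.converter.pdf.PdfConverter and PdfOptions
    public static void main(String[] args) {
        WordConvertPDF cwoWord = new WordConvertPDF();
        cwoWord.ConvertToPDF("D:/Test.docx", "D:/Test.pdf");
    }

    public void ConvertToPDF(String docPath, String pdfPath) {
        // try-with-resources closes both streams even if the conversion fails
        try (InputStream doc = new FileInputStream(new File(docPath));
             OutputStream out = new FileOutputStream(new File(pdfPath))) {
            XWPFDocument document = new XWPFDocument(doc);
            PdfOptions options = PdfOptions.create();
            PdfConverter.getInstance().convert(document, out, options);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }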
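
Since the original question mentions byte arrays rather than files on disk, the same conversion could be pointed at in-memory streams instead of file streams. The following is a minimal sketch under that assumption: `InMemoryConversion` and `StreamConverter` are hypothetical names introduced here so the example compiles with the JDK alone, and the converter call from the sample above would be plugged in as the lambda shown in the comment.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical wrapper for illustration only.
final class InMemoryConversion {

    // Stand-in for the real converter call. With the jars listed above it could be:
    //   (in, out) -> PdfConverter.getInstance()
    //                    .convert(new XWPFDocument(in), out, PdfOptions.create())
    @FunctionalInterface
    interface StreamConverter {
        void convert(InputStream in, OutputStream out) throws Exception;
    }

    // Feeds the source bytes to the converter and returns whatever it wrote.
    static byte[] convert(byte[] source, StreamConverter converter) throws Exception {
        try (InputStream in = new ByteArrayInputStream(source);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            converter.convert(in, out);
            return out.toByteArray();
        }
    }
}
```

Keeping the converter behind a small interface means the byte-array plumbing can be tested without the POI jars on the classpath.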

Muthuvelkumaran · May 16, 2022, 8:46am

@karthickm2078
Thanks for the solution, Karthi. We have already moved on to another solution, using Qoppa for the conversion.
It needs only one jar, but we have to buy license keys from Qoppa.

SinanH45 · August 15, 2022, 7:03pm

@karthickm2078 Hi, I am trying to do the same. I want to convert a Word file (.docx) generated from a Word template into a PDF file. I have already implemented and tested some simple Java code with the POI dependencies. My Java code is a Maven project. How can I integrate it into Pega, so that I can use it in a step to convert the docx to a PDF? Thanks

karthickm2078 · August 23, 2022, 12:57pm

@SinanH45 Hi, if your Java code is already written and runs, it is simpler to bring it over into Pega. Always keep these points in mind:

All the libraries must be available in Pega (whatever libraries you use in Java, the same libraries should be imported into Pega).
Instead of pasting the Java code in directly, check the Java API and turn your code into a Function that takes simple parameters.

For example: file name and type.

Also, always add a stack trace to the code for debugging, since Java code in Pega can take a long time to debug.
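
The advice above (simple parameters, plus a stack trace for debugging) could be sketched as follows. This is an illustration only, under stated assumptions: `ConversionFunction` and `Converter` are hypothetical names, the `Converter` interface stands in for the actual conversion code, and returning the trace as a string is just one way to surface it to the caller.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical wrapper for illustration only.
final class ConversionFunction {

    // Stand-in for the real conversion code (for example, the POI-based converter).
    interface Converter {
        void convert(String fileName, String fileType) throws Exception;
    }

    // Takes only simple parameters. Returns an empty string on success,
    // or the full stack trace as text when anything goes wrong.
    static String run(String fileName, String fileType, Converter converter) {
        try {
            if (fileName == null || fileName.isEmpty()) {
                throw new IllegalArgumentException("fileName is required");
            }
            converter.convert(fileName, fileType);
            return "";
        } catch (Exception e) {
            StringWriter trace = new StringWriter();
            e.printStackTrace(new PrintWriter(trace, true));
            return trace.toString();
        }
    }
}
```

Returning the trace instead of only printing it lets the calling step store it on a property or write it to a log, which tends to be easier to find than console output.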

Another way:

A simple code example:

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.OutputStream;

    import com.lowagie.text.Document;
    import com.lowagie.text.Paragraph;
    import com.lowagie.text.pdf.PdfWriter;

    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.extractor.WordExtractor;
    import org.apache.poi.poifs.filesystem.POIFSFileSystem;

    // Copies the plain text of a legacy .doc file into a PDF, one paragraph at a time.
    // Formatting (bold, italics, tables, images) is not carried over.
    public class TestCon {

        public static void main(String[] args) {
            Document document = new Document();

            try {
                System.out.println("Starting the test");
                POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("D:/Resume.doc"));

                HWPFDocument doc = new HWPFDocument(fs);
                WordExtractor we = new WordExtractor(doc);

                OutputStream file = new FileOutputStream("D:/test.pdf");

                // The writer attaches itself to the document; closing the document also closes the stream
                PdfWriter writer = PdfWriter.getInstance(document, file);

                document.open();
                writer.setPageEmpty(true);
                document.newPage();
                writer.setPageEmpty(true);

                String[] paragraphs = we.getParagraphText();
                for (int i = 0; i < paragraphs.length; i++) {
                    // strip the trailing line break from each paragraph
                    paragraphs[i] = paragraphs[i].replaceAll("\\cM?\r?\n", "");
                    System.out.println("Length:" + paragraphs[i].length());
                    System.out.println("Paragraph" + i + ": " + paragraphs[i]);

                    // add the paragraph to the document
                    document.add(new Paragraph(paragraphs[i]));
                }

                System.out.println("Document testing completed");
            } catch (Exception e) {
                System.out.println("Exception during test");
                e.printStackTrace();
            } finally {
                // close the document
                document.close();
            }
        }
    }

Convert Docx (Word document) to PDF in 8.1 and above | Support Center
