Overview
The main benefit of converting a PDF to a Word document is that you can edit the text directly in the file, which is especially helpful when you need to make significant changes. If most of the data in your PDF is tabular, you can convert it to an Excel spreadsheet instead. In the following sections, I will show how to convert a searchable PDF to Word and Excel, and how to convert a PDF to images, using Spire.PDF for Java.
Installing Spire.Pdf.jar
If you use Maven, you can import the jar into your application with the following configuration. For non-Maven projects, download the jar file from this link and manually add it as a dependency in your application.
- <repositories>
- <repository>
- <id>com.e-iceblue</id>
- <name>e-iceblue</name>
- <url>http:
- </repository>
- </repositories>
- <dependencies>
- <dependency>
- <groupId>e-iceblue</groupId>
- <artifactId>spire.pdf</artifactId>
- <version>4.1.2</version>
- </dependency>
- </dependencies>
Convert PDF to DOC or DOCX
Converting PDF to Word or Excel is straightforward with this library. Create a PdfDocument object, load the original PDF document, and then call the saveToFile() method to save it in .doc, .docx, .xls, or .xlsx format.
- import com.spire.pdf.FileFormat;
- import com.spire.pdf.PdfDocument;
-
- public class ConvertPdfToWord {
- public static void main(String[] args) {
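- // Create a PdfDocument instance
- PdfDocument pdf = new PdfDocument();
- // Load the source PDF file
- pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\original.pdf");
- // Save the document as a .docx file, then release its resources
- pdf.saveToFile("ToWord.docx", FileFormat.DOCX);
- pdf.close();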
- }
- }
Convert PDF to XLS or XLSX
- import com.spire.pdf.FileFormat;
- import com.spire.pdf.PdfDocument;
-
- public class ConvertPdfToExcel {
- public static void main(String[] args) {
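- // Create a PdfDocument instance
- PdfDocument pdf = new PdfDocument();
- // Load the source PDF file
- pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\original.pdf");
- // Save the document as an .xlsx file, then release its resources
- pdf.saveToFile("ToExcel.xlsx", FileFormat.XLSX);
- pdf.close();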
- }
- }
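Because the same saveToFile() call covers all four output formats, the format can be chosen from the output file name. The following is a minimal sketch under stated assumptions, not part of Spire.PDF: the class OutputFormats and the method formatNameFor() are hypothetical helpers, and the names DOC and XLS are assumed to match FileFormat constants (the code in this article only shows DOCX and XLSX). It uses only the Java standard library, so it runs without the Spire jar.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Spire.PDF.
public class OutputFormats {

    // File extension -> assumed name of the matching FileFormat constant.
    // DOCX and XLSX appear in this article; DOC and XLS are assumptions.
    private static final Map<String, String> FORMAT_BY_EXTENSION = Map.of(
            "doc", "DOC",
            "docx", "DOCX",
            "xls", "XLS",
            "xlsx", "XLSX");

    // Returns the format name for an output path such as "ToWord.docx".
    static String formatNameFor(String outputPath) {
        // Look only at the file name, so dots in directory names are ignored.
        int slash = Math.max(outputPath.lastIndexOf('/'), outputPath.lastIndexOf('\\'));
        String fileName = outputPath.substring(slash + 1);

        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("No file extension in: " + outputPath);
        }

        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        String formatName = FORMAT_BY_EXTENSION.get(extension);
        if (formatName == null) {
            throw new IllegalArgumentException("Unsupported extension: " + extension);
        }
        return formatName;
    }

    public static void main(String[] args) {
        System.out.println(formatNameFor("ToWord.docx"));   // prints DOCX
        System.out.println(formatNameFor("ToExcel.XLSX"));  // prints XLSX
    }
}
```

If FileFormat is a regular Java enum, which is an assumption here, the returned name could be passed on as pdf.saveToFile(path, FileFormat.valueOf(OutputFormats.formatNameFor(path))).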
Convert PDF to PNG
Converting PDF to images requires a little more code, but it is not complicated. After the PDF file is loaded, call the saveAsImage() method to render a specific page as a BufferedImage. Then write the image to a .png file with the ImageIO.write() method.
- import com.spire.pdf.PdfDocument;
- import javax.imageio.ImageIO;
- import java.awt.image.BufferedImage;
- import java.io.File;
- import java.io.IOException;
-
- public class ConvertPdfToImage {
-
- public static void main(String[] args) throws IOException {
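- // Create a PdfDocument instance
- PdfDocument pdf = new PdfDocument();
-
- // Load the source PDF file
- pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\original.pdf");
-
- // Make sure the output directory exists before writing into it
- new File("out").mkdirs();
-
- // Render each page to an image and write it out as a PNG file
- for (int i = 0; i < pdf.getPages().getCount(); i++) {
- BufferedImage image = pdf.saveAsImage(i);
- File file = new File(String.format("out/ToImage-%d.png", i));
- ImageIO.write(image, "PNG", file);
- }
-
- // Release the document's resources
- pdf.close();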
- }
- }
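The loop above names its files by zero-based page index, so ToImage-10.png sorts before ToImage-2.png in most file listings. As a hedged variation, the sketch below builds one-based, zero-padded names instead. The class PageImagePaths and the method pageImagePath() are hypothetical helpers introduced for illustration, and the page count in main() is a stand-in for pdf.getPages().getCount().

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Spire.PDF.
public class PageImagePaths {

    // Builds a path such as out/ToImage-03.png for the zero-based pageIndex.
    static Path pageImagePath(Path outputDir, int pageIndex, int pageCount) {
        // Pad to the number of digits in the largest page number.
        int width = String.valueOf(pageCount).length();
        String name = String.format("ToImage-%0" + width + "d.png", pageIndex + 1);
        return outputDir.resolve(name);
    }

    public static void main(String[] args) {
        Path outputDir = Paths.get("out");
        int pageCount = 12; // stand-in for pdf.getPages().getCount()

        for (int i = 0; i < pageCount; i++) {
            System.out.println(pageImagePath(outputDir, i, pageCount));
        }
    }
}
```

Inside the conversion loop, the resulting path could replace the String.format() call, for example ImageIO.write(image, "PNG", pageImagePath(outputDir, i, pageCount).toFile()).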
Conclusion
Many tools and libraries can perform this kind of file format conversion programmatically. This solution has proven to be a reliable one: the converted document retains the layout and almost all of the content of the original file. Apart from the formats mentioned above, Spire.PDF also supports converting PDF to HTML, SVG, PDF/A, and other formats.