How To Convert HTML to PDF with Java

Discover the most efficient methods to convert HTML to PDF using Java and select the best tool for your needs

Introduction

Creating PDFs programmatically in Java is a common requirement, and in 2024 there are various tools and libraries that make this task easier. This article explores some of the most efficient methods for converting HTML to PDF using Java.

Why HTML to PDF?

  • Market Standard: The web’s backbone is HTML, making it a familiar and ubiquitous choice for content formatting.
  • Mature Technology: With its open standards, HTML is supported by a vast array of tools and technologies.
  • Versatility: The plethora of tools simplifies the conversion process.
  • Styling with CSS: Leverage CSS for sophisticated styling in your PDFs.
  • Multimedia Inclusion: HTML provides a rich starting point for media-rich documents.

HTML to PDF using Java Libraries

1. OpenPDF

OpenPDF is a free and open-source Java library that can be used to create and manipulate PDF documents. Here is a simple example demonstrating how to convert HTML to PDF using OpenPDF:

Step 1: Add OpenPDF Dependency

First, include OpenPDF in your project. If you are using Maven, add the following dependency to your pom.xml:

<dependency>
    <groupId>com.github.librepdf</groupId>
    <artifactId>openpdf</artifactId>
    <version>1.3.26</version>
</dependency>

Step 2: Java Code to Convert HTML to PDF

Here is a basic Java code snippet that shows how to use OpenPDF to convert an HTML string to a PDF document:

import com.lowagie.text.Document;
import com.lowagie.text.html.simpleparser.HTMLWorker;
import com.lowagie.text.pdf.PdfWriter;
import java.io.FileOutputStream;
import java.io.StringReader;

public class HtmlToPdfOpenPDF {
    public static void main(String[] args) {
        String htmlString = "<html><body> This is my Project </body></html>";
        // try-with-resources closes the file even if parsing fails
        try (FileOutputStream out = new FileOutputStream("HtmlToPdf.pdf")) {
            Document document = new Document();
            // Bind the document to the output file before opening it
            PdfWriter.getInstance(document, out);
            document.open();
            // HTMLWorker is a simple parser, best suited to basic markup
            HTMLWorker htmlWorker = new HTMLWorker(document);
            htmlWorker.parse(new StringReader(htmlString));
            // Closing the document writes the finished PDF to the stream
            document.close();
            System.out.println("PDF created successfully.");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

This code snippet demonstrates a simple conversion of HTML content to a PDF document using OpenPDF.

2. Flying Saucer

Flying Saucer is a Java library that allows rendering XML or XHTML with CSS and converting it to PDF. Below is an example of how to use Flying Saucer for HTML to PDF conversion.

Step 1: Add Flying Saucer Dependency

Include Flying Saucer in your project. If you are using Maven, add the following dependencies to your pom.xml:

<dependency>
    <groupId>org.xhtmlrenderer</groupId>
    <artifactId>flying-saucer-core</artifactId>
    <version>9.1.22</version>
</dependency>
<dependency>
    <groupId>org.xhtmlrenderer</groupId>
    <artifactId>flying-saucer-pdf</artifactId>
    <version>9.1.22</version>
</dependency>

Step 2: Java Code to Convert HTML to PDF

Here’s a basic Java code snippet demonstrating the use of Flying Saucer to convert HTML to a PDF document:

import org.xhtmlrenderer.pdf.ITextRenderer;
import com.lowagie.text.DocumentException;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class HtmlToPdfFlyingSaucer {
    public static void main(String[] args) {
        String inputFile = "path/to/your/input.html";
        String outputFile = "path/to/your/output.pdf";
        // try-with-resources closes the stream even if rendering fails
        try (OutputStream os = new FileOutputStream(outputFile)) {
            // The renderer loads the document from a URL, so convert the file path
            String url = new File(inputFile).toURI().toURL().toString();
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(url);
            renderer.layout();
            renderer.createPDF(os);
            System.out.println("PDF created successfully.");
        } catch (DocumentException | IOException e) {
            e.printStackTrace();
        }
    }
}

This code snippet shows how to use Flying Saucer with the ITextRenderer class to render an HTML file as a PDF document.
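Because Flying Saucer works on XML or XHTML, markup that is not well-formed (for example an unclosed <p> or a bare <br>) is a common reason for a conversion to fail. The sketch below is a hypothetical pre-flight check, not part of Flying Saucer: the XhtmlCheck class and its isWellFormed method are names invented for this example, and it uses only the JDK's built-in XML parser. It answers every external entity lookup with an empty source so that no DTD is downloaded, which also means named entities such as &nbsp; are not resolved here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of Flying Saucer.
class XhtmlCheck {

    // Returns true when the markup parses as well-formed XML.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Answer every external entity request (such as a DOCTYPE's DTD)
            // with an empty source so that nothing is fetched over the network.
            builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Every tag is closed: prints true
        System.out.println(isWellFormed("<html><body><p>Closed paragraph</p><br/></body></html>"));
        // <p> and <br> are never closed: prints false
        System.out.println(isWellFormed("<html><body><p>Unclosed paragraph<br></body></html>"));
    }
}
```

Running a check like this before calling renderer.setDocument lets you report a clear error for invalid input instead of a failure in the middle of rendering.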

3. iText

iText is a popular Java library for creating and manipulating PDF documents. It provides extensive capabilities for HTML to PDF conversion. Here’s a basic example of how to use iText for this purpose.

Step 1: Add iText Dependency

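Include iText in your project. The HtmlConverter class used below lives in the pdfHTML add-on (artifact html2pdf), which is not part of itext7-core, so if you are using Maven, add both dependencies to your pom.xml. The html2pdf version shown is assumed to be the release paired with iText 7.1.9; check the iText release notes for the pairing that matches your core version:

<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itext7-core</artifactId>
    <version>7.1.9</version>
    <type>pom</type>
</dependency>
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>html2pdf</artifactId>
    <version>2.1.6</version>
</dependency>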

Step 2: Java Code to Convert HTML to PDF

Below is a Java code snippet demonstrating the use of iText to convert an HTML string to a PDF document:

import com.itextpdf.html2pdf.HtmlConverter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class HtmlToPdfIText {
    public static void main(String[] args) {
        String htmlContent = "<html><body><h1>Hello World</h1><p>This is a test of iText HTML to PDF conversion.</p></body></html>";
        String pdfDest = "output.pdf";
        // try-with-resources closes the file even if the conversion fails
        try (OutputStream out = new FileOutputStream(pdfDest)) {
            // Parses the HTML string and writes the resulting PDF to the stream
            HtmlConverter.convertToPdf(htmlContent, out);
            System.out.println("PDF created successfully.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This example demonstrates converting a simple HTML string to PDF using iText’s HtmlConverter class.

4. Apache PDFBox

Apache PDFBox is a powerful open-source Java library for handling PDF documents. It can create new PDF documents, manipulate existing ones, and extract their content. It does not parse HTML, so to use it in an HTML to PDF workflow you extract the content from your HTML yourself and write it to the PDF programmatically.

Step 1: Add Apache PDFBox Dependency

Include Apache PDFBox in your project. If you are using Maven, add the following dependency to your pom.xml:

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.24</version>
</dependency>

Step 2: Java Code to Convert HTML to PDF

Here’s a basic Java code snippet that creates a PDF document with text content using Apache PDFBox:

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import java.io.IOException;

public class HtmlToPdfPDFBox {
    public static void main(String[] args) {
        try (PDDocument document = new PDDocument()) {
            PDPage page = new PDPage();
            document.addPage(page);
            // The content stream must be closed before the document is saved
            try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
                contentStream.beginText();
                contentStream.setFont(PDType1Font.HELVETICA, 12);
                // Position the text cursor, measured in points from the bottom-left corner
                contentStream.newLineAtOffset(100, 700);
                contentStream.showText("This is a sample text for PDFBox HTML to PDF Conversion.");
                contentStream.endText();
            }
            document.save("PDFBoxExample.pdf");
            System.out.println("PDF created successfully.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This example illustrates a simple use case of Apache PDFBox to create a PDF document with text content. Note that Apache PDFBox does not convert HTML to PDF directly but can be used to create PDFs programmatically.
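Since PDFBox leaves the HTML handling to you, and showText writes a single line without wrapping, a typical first step is to reduce the HTML to plain text and break it into lines yourself. The sketch below shows one rough way to do that with the JDK alone. LineWrapper, stripTags, and wrap are hypothetical names made up for this example; the regular expression is a crude tag remover rather than a real HTML parser, and wrapping by character count only approximates the true width, which depends on the font metrics.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of PDFBox.
class LineWrapper {

    // Crude tag removal: replaces anything between '<' and '>' with a space.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Greedy word wrap: starts a new line when the next word would exceed maxChars.
    // A single word longer than maxChars is kept whole on its own line.
    static List<String> wrap(String text, int maxChars) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is blank
            }
            if (current.length() > 0 && current.length() + 1 + word.length() > maxChars) {
                lines.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String html = "<p>This is a <b>sample</b> text for PDFBox</p>";
        for (String line : wrap(stripTags(html), 12)) {
            System.out.println(line);
        }
    }
}
```

Each resulting line can then be written with its own showText call, moving down the page with newLineAtOffset between lines.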

Comparing the Libraries

When selecting a Java library for HTML to PDF conversion, various factors such as features, ease of use, and specific project requirements must be considered. Here’s a table comparing the libraries:

| Library | Pros | Cons | Best For |
|---|---|---|---|
| OpenPDF | Free and open-source. Good for basic PDF generation and manipulation. | Limited advanced features. May struggle with complex HTML and CSS. | Simple PDF creation tasks where complex HTML rendering is not required. |
| Flying Saucer | Good support for CSS and XHTML. Integrates well with other PDF libraries. | Does not support JavaScript. Limited to XHTML conversion. | Projects requiring accurate rendering of CSS-styled HTML. |
| iText | Extensive features for PDF manipulation. Supports HTML and CSS through the pdfHTML add-on. | Does not execute JavaScript. AGPL-licensed, so closed-source use requires a commercial license. Relatively complex API. | Robust, enterprise-level applications requiring comprehensive PDF functionalities. |
| Apache PDFBox | Powerful for general PDF operations. Open-source. Supports text extraction, creation, and manipulation. | Does not directly convert HTML to PDF. More suited for lower-level PDF operations. | Custom PDF creation where direct HTML to PDF conversion is not the primary requirement. |

Conclusion

Each library has its strengths and is suited for different use cases. OpenPDF and Apache PDFBox are great for straightforward PDF creation and manipulation, while Flying Saucer and iText offer more advanced HTML and CSS rendering. None of these libraries executes JavaScript, so your choice will depend on the complexity of the HTML content, how much CSS support you need, and the level of PDF manipulation required.

A better approach: HTML to PDF using Templated

The examples above demonstrate how to convert HTML to PDF and web pages to PDF using libraries. However, when it comes to tasks like generating PDFs using templates or keeping track of generated PDFs, additional steps are required.

For instance, to keep track of generated PDFs, you’ll need to develop your own system for tracking the files created. Similarly, if you want to use custom templates, such as those for invoice or certificates generation, you must create and manage those templates manually.
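To give a sense of what managing templates by hand involves, here is a minimal sketch of a placeholder-based HTML template filled in with plain Java. Everything in it is an assumption made for illustration: the {{name}} placeholder syntax, the SimpleHtmlTemplate class, and its fill and escapeHtml methods are not part of any library discussed above. It needs Java 9 or later. Values are HTML-escaped before insertion so that user data cannot break the markup, and unknown placeholders are replaced with an empty string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical template helper for illustration only.
class SimpleHtmlTemplate {

    // Matches placeholders of the form {{key}}, where key is letters, digits, or underscores.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Escapes the characters that have a special meaning in HTML text and attributes.
    static String escapeHtml(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
    }

    // Replaces every placeholder in a single pass; missing keys become an empty string.
    static String fill(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String value = values.getOrDefault(matcher.group(1), "");
            // quoteReplacement keeps '$' and '\' in the value from being treated as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(escapeHtml(value)));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "<html><body><h1>Certificate</h1>"
                + "<p>Awarded to {{name}} on {{date}}.</p></body></html>";
        Map<String, String> values = new LinkedHashMap<>();
        values.put("name", "John Doe");
        values.put("date", "02/10/2024");
        System.out.println(fill(template, values));
    }
}
```

The filled-in string can be passed to any of the converters above. Storing, versioning, and previewing such templates is the part you would still have to build yourself.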

An alternative solution is to utilize Templated, an API-based platform designed for PDF and Image generation, which is ideal for handling such use cases. Their PDF generation API is powered by a Chromium-based rendering engine that fully supports JavaScript, CSS, and HTML.

This approach simplifies the process and eliminates the need for extensive manual template management and tracking. To get started with PDF generation using Templated, follow the documentation and harness the power of this efficient solution.

1. Create PDFs with templates

Using Templated, you can design your PDF template with a drag-and-drop editor.

After logging in, you will see your Dashboard, where you can manage your templates or create new ones:

How to create a template from the dashboard

From your Dashboard, you can design your own templates or customize an existing one from our Template Gallery. Below is the Certificate of Achievement template you can use.
There are 100+ free templates available that you can pick and customize to your needs.

Shows the certificate template in the Template Editor

To start using the Templated API, you need your API key, which can be found on the API Integration tab of your dashboard.

Shows where to get the API key from the Dashboard

Now that you have your Templated account ready, let’s see how you can integrate your application with the API. In this example we will be using a certificate template to generate PDFs.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import org.json.JSONObject;

public class TemplatedApiRequest {
    public static void main(String[] args) {
        try {
            URL url = new URL("https://api.templated.io/v1/render");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            // Replace YOUR_API_KEY with the key from your dashboard
            conn.setRequestProperty("Authorization", "Bearer YOUR_API_KEY");
            conn.setDoOutput(true);

            JSONObject payload = new JSONObject();
            payload.put("template", "template_id");
            payload.put("format", "pdf");

            // Each entry overrides the text of the template layer with the same name
            JSONObject layers = new JSONObject();
            JSONObject dateLayer = new JSONObject();
            dateLayer.put("text", "02/10/2024");
            layers.put("date", dateLayer);
            JSONObject nameLayer = new JSONObject();
            nameLayer.put("text", "John Doe");
            layers.put("name", nameLayer);
            JSONObject signatureLayer = new JSONObject();
            signatureLayer.put("text", "Dr. Mark Brown");
            layers.put("signature", signatureLayer);
            JSONObject detailsLayer = new JSONObject();
            detailsLayer.put("text", "This certificate is awarded to John Doe in recognition of their successful completion of Computer Science Degree on 02/10/2024.");
            layers.put("details", detailsLayer);
            payload.put("layers", layers);

            // Send the JSON payload as the request body
            try (OutputStream os = conn.getOutputStream()) {
                os.write(payload.toString().getBytes(StandardCharsets.UTF_8));
            }

            int responseCode = conn.getResponseCode();
            System.out.println("Response Code : " + responseCode);

            // Read the response body line by line
            StringBuilder response = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String inputLine;
                while ((inputLine = in.readLine()) != null) {
                    response.append(inputLine);
                }
            }
            // Print the response
            System.out.println(response);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

If we check the response, we get the following:

{
    "renderUrl": "PDF_URL",
    "status": "success",
    "template": "YOUR_TEMPLATE_ID"
}

As the code above shows, generating a PDF with Templated is pretty simple. No PDF library needs to be installed; the example only uses org.json to build the request body. You make a single API call, providing your data as the request body. That’s all there is to it!

You can use the renderUrl from the response to download or distribute the generated PDF.
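As a rough illustration of the download step, the sketch below copies whatever a URL points to into a local file using only the JDK. It assumes the renderUrl can be fetched with a plain GET and no extra headers, which the response above does not confirm; RenderDownloader and its download method are hypothetical names for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; assumes the URL needs no authentication.
class RenderDownloader {

    // Streams the resource at 'source' into 'target' and returns the number of bytes written.
    static long download(URI source, Path target) throws IOException {
        try (InputStream in = source.toURL().openStream()) {
            // Overwrite the target if a previous download left a file behind
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: RenderDownloader <renderUrl> <outputFile>");
            return;
        }
        long bytes = download(URI.create(args[0]), Paths.get(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

Streaming with Files.copy avoids holding the whole PDF in memory, which matters once the generated documents get large.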

Other languages

If you want to learn how to convert HTML to PDF in other languages, here are other resources for you to explore:

Conclusion

PDF generation is now a standard part of many business applications, and it shouldn’t be a source of stress for developers.

We’ve explored how to use third-party libraries for straightforward PDF generation. However, for more complex scenarios like template management, Templated offers a seamless solution through simple API calls.

To get started, sign up for a free account and begin automating your PDFs today!
