Convert HTML to PDF in Python with 5 Popular Libraries

Learn how to convert HTML to PDF in Python using the best libraries. Compare features, performance, and find the right solution for your needs

You’ve crafted the perfect HTML template. The layout is clean, the CSS is spot on, and everything looks great in the browser.

But then, when you try to generate the PDF, things start to go wrong. Margins shift, page breaks slice tables in half, fonts don’t render the way you expect, and complex CSS acts unpredictably.

What should be a straightforward “export to PDF” turns into a marathon of debugging rendering engines and adjusting styles for hours.

Hi, I’m Pedro, the founder of Templated, an API that automates images, PDFs, and videos.

In this guide, I’m going to share five popular Python libraries for converting HTML to PDF. We’ll explore how each one works, where they excel, and where they have their limitations.

I’ll also provide you with a quick comparison to help you pick the right library based on accuracy, performance, CSS support, and deployment complexity.

Plus, at the end, I’ll introduce you to a more scalable HTML to PDF solution that runs reliably without all the hassle of managing rendering and infrastructure.

Sounds good? Let’s check them out.

Why Generate PDF from HTML in Python?

When working with Python, you typically start with structured data: JSON responses, database records, form submissions, analytics dashboards, invoices, or reports. The challenge is not generating a PDF file. The challenge is generating a well-designed, properly formatted document.

That’s where HTML becomes the most practical input format.

Instead of manually drawing text, tables, and layouts inside a PDF canvas, you define the structure using HTML and CSS, the same technologies used to design web pages. Python libraries can then render that HTML into a print-ready PDF.

This approach gives you:

  • Layout precision using CSS
  • Reusable and dynamic templates
  • Support for tables, charts, images, and branding
  • The ability to convert entire web pages via URL

In short, HTML acts as a presentation layer between your data and the final PDF output, which makes document automation far more scalable and maintainable.
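
To make the "presentation layer" idea concrete, here is a minimal sketch that turns a dictionary of invoice data into an HTML string using only the standard library. The template, the field names (invoice_no, customer, items), and the render_invoice_html helper are illustrative assumptions, not part of any library covered below. The resulting string is what you would hand to whichever converter you choose.

```python
from html import escape
from string import Template

# Hypothetical invoice template; $placeholders are filled in by Template.substitute
INVOICE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Invoice $invoice_no</title>
</head>
<body>
  <h1>Invoice $invoice_no</h1>
  <p>Billed to: $customer</p>
  <table>
    <tr><th>Item</th><th>Amount</th></tr>
$rows
  </table>
</body>
</html>""")


def render_invoice_html(invoice):
    """Build an HTML document from structured invoice data."""
    # Escape user-supplied text so characters like & and < stay valid HTML
    rows = "\n".join(
        f"    <tr><td>{escape(item['name'])}</td><td>{item['amount']:.2f}</td></tr>"
        for item in invoice["items"]
    )
    return INVOICE_TEMPLATE.substitute(
        invoice_no=escape(invoice["invoice_no"]),
        customer=escape(invoice["customer"]),
        rows=rows,
    )


# Example data, shaped like a JSON response or database record
invoice = {
    "invoice_no": "INV-001",
    "customer": "Smith & Co",
    "items": [{"name": "Design work", "amount": 49.5}],
}
invoice_html = render_invoice_html(invoice)
print(invoice_html)
```

For larger projects a dedicated template engine is the more common choice. The point here is only that the data and the layout stay separate.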

If you only need to convert a few HTML files or URLs without setting up libraries, you can also try our free HTML to PDF converter tool for quick and instant conversion.

HTML to PDF using Python Libraries

Let’s examine five popular Python libraries for HTML-to-PDF conversion and understand where each one fits best.

i. Pyppeteer

Pyppeteer is an unofficial Python port of Puppeteer, the Node library that provides a high-level API over the Chrome DevTools Protocol. In effect, your code drives a real headless Chromium browser, so it can do what a browser does: load pages, run JavaScript, scrape data, take screenshots, and print pages to PDF. Note that the project’s README describes it as unmaintained and points to Playwright as an alternative, so weigh that before adopting it for new work. Let’s see how to use Pyppeteer to generate PDFs from HTML.

First, we need to install pyppeteer with the following command:

pip install pyppeteer

Generate PDF from a website URL

import asyncio
from pyppeteer import launch

async def generate_pdf(url, pdf_path):
    browser = await launch()  # start a headless Chromium instance
    page = await browser.newPage()
    await page.goto(url)
    await page.pdf({'path': pdf_path, 'format': 'A4'})
    await browser.close()

# Run the function (asyncio.run creates and closes the event loop for us)
asyncio.run(generate_pdf('https://example.com', 'example.pdf'))

The generate_pdf function above does the following:

  1. Launches a new headless browser instance.
  2. Opens a new tab (page) in the headless browser and waits for it to be ready.
  3. Navigates to the URL passed in the url argument and waits for the page to load.
  4. Generates a PDF of the page. The PDF is saved at the location specified in pdf_path, and the format is set to A4.
  5. Closes the headless browser.

Generate PDF from Custom HTML content

import asyncio
from pyppeteer import launch

async def generate_pdf_from_html(html_content, pdf_path):
    browser = await launch()  # start a headless Chromium instance
    page = await browser.newPage()
    await page.setContent(html_content)  # load our own HTML instead of a URL
    await page.pdf({'path': pdf_path, 'format': 'A4'})
    await browser.close()

# HTML content
html_content = '''
<!DOCTYPE html>
<html>
<head>
<title>PDF Example</title>
</head>
<body>
<h1>Hello, world!</h1>
</body>
</html>
'''

# Run the function (asyncio.run creates and closes the event loop for us)
asyncio.run(generate_pdf_from_html(html_content, 'from_html.pdf'))

This second Pyppeteer example generates a PDF from our own custom HTML content. The generate_pdf_from_html function does the following:

  1. Launches a new headless browser instance.
  2. Opens a new tab (page) in the headless browser and waits for it to be ready.
  3. Sets the content of the page explicitly to our HTML content.
  4. Generates a PDF of the page. The PDF is saved at the location specified in pdf_path, and the format is set to A4.
  5. Closes the headless browser.

ii. xhtml2pdf

xhtml2pdf is another Python library that lets you generate PDFs from HTML content. Let’s see xhtml2pdf in action.

Install xhtml2pdf, along with requests (used below to fetch pages), with the following command:

pip install xhtml2pdf requests

Generate PDF from a website URL

Note that xhtml2pdf has no built-in way to fetch a URL, but we can use the requests library to download the page content first.

from xhtml2pdf import pisa
import requests

def convert_url_to_pdf(url, pdf_path):
    # Fetch the HTML content from the URL (the timeout avoids hanging forever)
    response = requests.get(url, timeout=30)
    if response.status_code != 200:
        print(f"Failed to fetch URL: {url}")
        return False
    html_content = response.text

    # Generate PDF
    with open(pdf_path, "wb") as pdf_file:
        pisa_status = pisa.CreatePDF(html_content, dest=pdf_file)
    # pisa_status.err is non-zero when the conversion reported errors
    return not pisa_status.err

# URL to fetch
url_to_fetch = "https://google.com"
# PDF path to save
pdf_path = "google.pdf"

# Generate PDF
if convert_url_to_pdf(url_to_fetch, pdf_path):
    print(f"PDF generated and saved at {pdf_path}")
else:
    print("PDF generation failed")

The convert_url_to_pdf function above does the following:

  1. Uses requests to fetch the webpage content from the URL.
  2. Takes the HTML text from the response using response.text.
  3. Generates the PDF by passing the HTML content and the open output file to pisa.CreatePDF.

Generate PDF from Custom HTML content

from xhtml2pdf import pisa

def convert_html_to_pdf(html_string, pdf_path):
    with open(pdf_path, "wb") as pdf_file:
        pisa_status = pisa.CreatePDF(html_string, dest=pdf_file)
    return not pisa_status.err

# HTML content
html = '''
<html>
<head>
<title>PDF Example</title>
</head>
<body>
<h1>Hey, this will turn into a PDF!</h1>
</body>
</html>
'''

# Create PDF
pdf_path = "example.pdf"
convert_html_to_pdf(html, pdf_path)

Creating a PDF from custom HTML content is similar to the process for URLs, with just one key difference: instead of passing a URL, we directly provide the actual HTML content to our creation method. The method then uses this custom HTML content to create the PDF.

If you’d like a more detailed, step-by-step walkthrough, check out our complete guide on how to convert HTML to PDF with xhtml2pdf.

iii. python-pdfkit

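python-pdfkit is a Python wrapper for the wkhtmltopdf utility, which uses WebKit to convert HTML to PDF.

First, let’s install python-pdfkit with pip. The package is only a wrapper, so the wkhtmltopdf binary must also be installed separately and be available on your PATH:
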
pip install pdfkit

Generate PDF from a website URL

import pdfkit
# URL to fetch
url = 'https://cnn.com'
# PDF path to save
pdf_path = 'example.pdf'
pdfkit.from_url(url, pdf_path)

pdfkit supports generating PDFs from website URLs out of the box just like Pyppeteer.

As the code above shows, pdfkit generates the PDF with a single line of code: pdfkit.from_url is all you need.

Generate PDF from Custom HTML content

import pdfkit
# HTML content
html = '''
<html>
<head>
<title>PDF Example</title>
</head>
<body>
<h1>Hey, this will turn into a PDF!</h1>
</body>
</html>
'''
# PDF path to save
pdf_path = 'example.pdf'
# Create PDF
pdfkit.from_string(html, pdf_path)

To generate a PDF from custom HTML content using python-pdfkit, you simply need to use pdfkit.from_string and provide the HTML content along with the path for the PDF file.
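
Margins and page size are a common source of surprises, and pdfkit lets you pass wkhtmltopdf options as a dictionary. The sketch below only builds such a dictionary so it runs without wkhtmltopdf installed. The build_pdfkit_options helper and its defaults are illustrative assumptions, while the keys mirror wkhtmltopdf's command-line flags.

```python
def build_pdfkit_options(page_size="A4", margin="20mm"):
    """Return a wkhtmltopdf options dict with uniform margins (hypothetical helper)."""
    return {
        "page-size": page_size,
        "margin-top": margin,
        "margin-right": margin,
        "margin-bottom": margin,
        "margin-left": margin,
        "encoding": "UTF-8",
    }


options = build_pdfkit_options()
print(options)

# With pdfkit and wkhtmltopdf installed, the dictionary would be passed like this:
#   import pdfkit
#   pdfkit.from_string(html, "example.pdf", options=options)
```

Keeping the options in one place makes it easier to apply the same page setup to every document you generate.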

For a complete setup guide, advanced configuration tips, and real-world examples, read our in-depth tutorial on how to convert HTML to PDF with python-pdfkit.

iv. Playwright

Playwright is a modern library for browser automation, maintained by Microsoft. It can drive Chromium, Firefox, and WebKit across platforms and has bindings for several languages. For PDF generation, note that page.pdf() is only supported in Chromium, which is why the examples below launch p.chromium.

To convert HTML to PDF with Playwright in Python, follow these steps:

Step 1: Install Playwright:

pip install playwright
playwright install

Step 2: Generate PDF from Website URL:

import asyncio
from playwright.async_api import async_playwright

async def url_to_pdf(url, output_path):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto(url)
        await page.pdf(path=output_path)
        await browser.close()

# Example usage
url = 'https://example.com'
output_path = 'example_url.pdf'
asyncio.run(url_to_pdf(url, output_path))

Step 3: Generate PDF from Custom HTML Content:

import asyncio
from playwright.async_api import async_playwright

async def html_to_pdf(html_content, output_path):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.set_content(html_content)
        await page.pdf(path=output_path)
        await browser.close()

html_content = '''
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Sample HTML</title>
</head>
<body>
<h1>Hello, World!</h1>
<p>This is a sample HTML content to be converted to PDF.</p>
</body>
</html>
'''

output_path = 'custom_html.pdf'
asyncio.run(html_to_pdf(html_content, output_path))

v. WeasyPrint

WeasyPrint is a visual rendering engine that follows the W3C specifications for HTML and CSS, including CSS for paged media. It’s known for its excellent CSS support and high-quality PDF output, and it does not need a browser. It does not execute JavaScript, and it relies on a few system libraries (such as Pango), so installation can involve more than pip on some platforms.

Step 1: Install WeasyPrint:

pip install weasyprint

Step 2: Generate PDF from Website URL:

from weasyprint import HTML

def url_to_pdf(url, output_path):
    HTML(url=url).write_pdf(output_path)

# Example usage
url = 'https://example.com'
output_path = 'example_url.pdf'
url_to_pdf(url, output_path)

Step 3: Generate PDF from Custom HTML Content:

from weasyprint import HTML

def html_to_pdf(html_content, output_path):
    HTML(string=html_content).write_pdf(output_path)

html_content = '''
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Sample HTML</title>
<style>
body { font-family: Arial, sans-serif; }
h1 { color: navy; }
</style>
</head>
<body>
<h1>Hello, World!</h1>
<p>This is a sample HTML content to be converted to PDF.</p>
</body>
</html>
'''

output_path = 'custom_html.pdf'
html_to_pdf(html_content, output_path)
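
The shifting margins and split tables mentioned at the start of this guide are usually tackled with print-specific CSS rather than Python code. Here is a minimal sketch, assuming a renderer that honors CSS paged-media rules (@page, break-inside, break-before). Support differs between engines, so treat the rules as a starting point and check the output of the library you use. The build_report_html helper and the class name new-page are illustrative.

```python
from html import escape

# Print-oriented CSS: page size and margins, plus hints about where pages may break
PRINT_CSS = """
@page {
    size: A4;
    margin: 2cm;
}
h1 { break-after: avoid; }         /* keep a heading with the content after it */
tr { break-inside: avoid; }        /* try not to split a table row across pages */
.new-page { break-before: page; }  /* force a new page before this element */
"""


def build_report_html(title, rows):
    """Return an HTML document with print CSS and a two-column table."""
    body_rows = "".join(
        f"<tr><td>{escape(str(label))}</td><td>{escape(str(value))}</td></tr>"
        for label, value in rows
    )
    return (
        "<!DOCTYPE html><html><head><meta charset='UTF-8'>"
        f"<title>{escape(title)}</title><style>{PRINT_CSS}</style></head>"
        f"<body><h1>{escape(title)}</h1><table>{body_rows}</table></body></html>"
    )


report_html = build_report_html("Q1 <Report>", [("Revenue", 100), ("Costs", 60)])
print(report_html)

# With WeasyPrint installed, the string could then be rendered like this:
#   from weasyprint import HTML
#   HTML(string=report_html).write_pdf("report.pdf")
```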

Performance Tips for HTML to PDF Conversion

When converting HTML to PDF in Python, keep these optimization tips in mind:

1. Batch Processing

If you need to generate multiple PDFs, process them in batches:

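import asyncio
from concurrent.futures import ThreadPoolExecutor

# For async libraries (Pyppeteer/Playwright).
# generate_pdf(url, pdf_path) is the async function from the Pyppeteer example above.
async def batch_generate_async(urls):
    tasks = [generate_pdf(url, f"output_{i}.pdf") for i, url in enumerate(urls)]
    return await asyncio.gather(*tasks)

# For sync libraries (xhtml2pdf, pdfkit).
# convert_html_to_pdf(html, pdf_path) is the function from the xhtml2pdf example above.
def batch_generate_sync(html_contents):
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = [
            executor.submit(convert_html_to_pdf, html, f"output_{i}.pdf")
            for i, html in enumerate(html_contents)
        ]
        return [f.result() for f in futures]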
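
One caveat with asyncio.gather: it starts every task at once, which for browser-based libraries can mean one browser process per URL. A minimal sketch of capping concurrency with asyncio.Semaphore follows. The batch_generate_bounded helper is hypothetical, and fake_render is a stand-in for a real coroutine such as a Pyppeteer or Playwright function taking (url, pdf_path).

```python
import asyncio


async def batch_generate_bounded(render, jobs, max_concurrent=4):
    """Run render(*job) for every job, with at most max_concurrent running at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(job):
        async with semaphore:  # wait here if too many renders are in flight
            return await render(*job)

    # gather returns results in the same order as the jobs
    return await asyncio.gather(*(run_one(job) for job in jobs))


async def fake_render(url, pdf_path):
    """Stand-in for a real renderer: pretends to work, then returns the output path."""
    await asyncio.sleep(0.01)
    return pdf_path


jobs = [(f"https://example.com/{i}", f"out_{i}.pdf") for i in range(10)]
results = asyncio.run(batch_generate_bounded(fake_render, jobs, max_concurrent=3))
print(results)
```

A sensible value for max_concurrent depends on available memory and CPU, so it is worth measuring rather than guessing.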

2. Resource Management

  • Browser-based libraries: Reuse browser instances instead of creating new ones
  • Memory: Clear cache and close pages after use
  • Timeouts: Set reasonable timeouts to prevent hanging
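
For the timeout point, one library-agnostic option for async renderers is asyncio.wait_for, which cancels the render if it runs too long. This is a sketch under assumptions: render_with_timeout is a hypothetical wrapper, slow_render stands in for a real rendering coroutine, and returning None on timeout is just one possible policy.

```python
import asyncio


async def render_with_timeout(render, *args, timeout_s=30.0):
    """Await render(*args); return None if it takes longer than timeout_s seconds."""
    try:
        return await asyncio.wait_for(render(*args), timeout=timeout_s)
    except asyncio.TimeoutError:
        # The render coroutine has been cancelled at this point
        return None


async def slow_render(pdf_path, delay):
    """Stand-in for a real renderer that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return pdf_path


fast = asyncio.run(render_with_timeout(slow_render, "fast.pdf", 0.01, timeout_s=1.0))
stuck = asyncio.run(render_with_timeout(slow_render, "stuck.pdf", 1.0, timeout_s=0.05))
print(fast, stuck)
```

Browser libraries also expose their own navigation timeouts, which are worth setting in addition to an outer limit like this.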

3. Caching

For frequently generated PDFs, implement caching:

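import hashlib
import os

CACHE_DIR = "cache"

def get_cached_pdf(html_content):
    # Hash the HTML so identical content maps to the same cache file
    content_hash = hashlib.sha256(html_content.encode("utf-8")).hexdigest()
    cache_path = os.path.join(CACHE_DIR, f"{content_hash}.pdf")

    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()

    # Cache miss: generate_pdf is a placeholder that must return the PDF as bytes
    pdf = generate_pdf(html_content)
    os.makedirs(CACHE_DIR, exist_ok=True)  # create the cache folder on first use
    with open(cache_path, "wb") as f:
        f.write(pdf)
    return pdf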

Comparing Python PDF Generation Libraries

Here’s a comparison of the five libraries to help you choose the right one for your needs:

| Library | Installation | JavaScript | CSS Support | Performance | Best For |
|---|---|---|---|---|---|
| Pyppeteer | Moderate (requires Chromium) | ✅ Full support | ✅ Excellent | Medium | Dynamic content, SPAs, modern web apps |
| xhtml2pdf | Easy (pure Python) | ❌ No support | ⚠️ Basic | Good | Simple documents, basic reports |
| python-pdfkit | Complex (needs wkhtmltopdf) | ⚠️ Limited | ✅ Good | Good | Static sites, invoices, reports |
| Playwright | Moderate (browser binaries) | ✅ Full support | ✅ Excellent | Good | Complex web apps, cross-browser needs |
| WeasyPrint | Moderate (system dependencies) | ❌ No support | ✅ Excellent | Good | Print-quality documents, precise layouts |

Quick Decision Guide

  • Choose Pyppeteer if you need JavaScript execution and already use async Python
  • Choose xhtml2pdf for simple HTML without JavaScript requirements
  • Choose python-pdfkit for a balance between features and simplicity
  • Choose Playwright for the most modern and robust browser automation
  • Choose WeasyPrint for the best CSS compliance and print-quality output

Templated (PDF Generation API): Efficient Bulk PDF Generation

Python libraries work for generating invoices, reports, certificates, and similar documents in bulk. But managing rendering engines, scaling servers, and handling formatting issues quickly becomes overhead.

At scale, a managed API can be faster, easier, and more reliable. Templated is a PDF generation API built for exactly that. Let’s learn more about it.

Why Templated?

1. Simple API - Just a Few Lines of Code

No browser setup. No dependency headaches. Just one API call:

import requests

# Generate PDF with one API call
response = requests.post(
    "https://api.templated.io/v1/render",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "template": "your-template-id",
        "format": "pdf",
        "layers": {
            "name": {
                "text": "John Doe"
            },
            "invoice_no": {
                "text": "INV-001"
            }
        }
    }
)

You pass a template ID and dynamic data. Our PDF generator API handles rendering and scaling.

2. Visual Template Editor

Templated includes a built-in editor:

  • Design templates with drag-and-drop
  • Live preview as you build
  • No coding required for template changes
  • Team collaboration features
  • Direct import of PDF documents and Canva designs

3. AI-Powered PDF Generation

You can also generate PDF templates using AI.

Simply describe what you need. The more descriptive the prompt, the better the template design layout will be. Even if you want to make changes after the prompt, you can easily do so with drag-and-drop features similar to Canva.

4. Built for Everyone

  • Developers can integrate the API directly into backend systems and generate PDFs programmatically at scale.
  • Teams can use Templated’s spreadsheet feature to generate bulk PDFs without writing code and automate workflows using no-code tools.

5. Production-Ready Features

  • Fast: Generate PDFs in under 2 seconds
  • Scalable: Handle thousands of concurrent requests
  • 99.9% Uptime: Reliable service guarantee

Python Libraries vs Templated API

| Feature | HTML to PDF with Python | Templated API |
|---|---|---|
| Setup | Requires browser engines | No setup |
| Scaling | Self-managed | Built-in scaling |
| Maintenance | Manual updates | Fully managed |
| Template Editing | Code changes required | Visual editor |
| AI Template Creation | Not available | Yes |
| Bulk Generation | Complex at scale | Built for high volume |

Conclusion

We’ve explored 5 popular Python libraries for HTML to PDF conversion. Each has its place:

  • Open-source libraries are great for:

    • Learning and prototyping
    • Simple, low-volume applications
    • Complete control over the rendering process
  • Templated API is ideal for:

    • Production applications
    • High-volume PDF generation
    • Teams that want to focus on business logic, not infrastructure

The choice depends on your specific needs. If you’re building a production application and want to avoid the complexity of managing browser dependencies and scaling issues, try Templated free with 50 API credits.

Frequently Asked Questions

Q: Which Python library is best for HTML to PDF conversion?
A: For simple documents, use xhtml2pdf. For JavaScript-heavy pages, use Playwright or Pyppeteer. For production applications, consider a managed API like Templated.

Q: Can I convert HTML to PDF without installing browsers?
A: Yes, xhtml2pdf and WeasyPrint don’t require a browser, but they do not execute JavaScript. For full JavaScript support without managing browsers, use a cloud API.

Q: How can I generate PDFs at scale in Python?
A: For high-volume PDF generation, use either Playwright with proper resource management or a managed API service that handles scaling automatically.

Q: What’s the fastest way to implement PDF generation in production?
A: While open-source libraries are great for learning, they require significant setup and maintenance (browser dependencies, memory management, error handling). For production applications, Templated’s API gets you running in under 5 minutes with just a few lines of code, handles all scaling automatically, and includes a visual template editor so non-developers can update templates without code changes. This typically saves 40+ hours of development and maintenance time.

