Convert HTML to PDF in Python with 5 Popular Libraries

You’ve crafted the perfect HTML template. The layout is clean, the CSS is spot on, and everything looks great in the browser.

But then, when you try to generate the PDF, things start to go wrong. Margins shift, page breaks slice tables in half, fonts don’t render the way you expect, and complex CSS acts unpredictably.

What should be a straightforward “export to PDF” turns into a marathon of debugging rendering engines and adjusting styles for hours.

Hi, I’m Pedro, the founder of Templated, an API that automates images, PDFs, and videos.

In this guide, I’m going to share five popular Python libraries for converting HTML to PDF. We’ll explore how each one works, where they excel, and where they have their limitations.

I’ll also provide you with a quick comparison to help you pick the right library based on accuracy, performance, CSS support, and deployment complexity.

Plus, at the end, I’ll introduce you to a more scalable HTML to PDF solution that runs reliably without all the hassle of managing rendering and infrastructure.

Sounds good? Let’s check them out.

Why Generate PDF from HTML in Python?

When working with Python, you typically start with structured data: JSON responses, database records, form submissions, analytics dashboards, invoices, or reports. The challenge is not generating a PDF file. The challenge is generating a well-designed, properly formatted document.

That’s where HTML becomes the most practical input format.

Instead of manually drawing text, tables, and layouts inside a PDF canvas, you define the structure using HTML and CSS, the same technologies used to design web pages. Python libraries can then render that HTML into a print-ready PDF.

This approach gives you:

Layout precision using CSS
Reusable and dynamic templates
Support for tables, charts, images, and branding
The ability to convert entire web pages via URL

In short, HTML acts as a presentation layer between your data and the final PDF output, which makes document automation far more scalable and maintainable.
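
To make the "presentation layer" idea concrete, here is a minimal sketch that turns plain Python data into an HTML string using only the standard library. The template, the render_invoice helper, and the sample items are all illustrative assumptions rather than part of any library covered below; in a real project you might prefer a template engine.

```python
from html import escape
from string import Template

# Hypothetical invoice template; the placeholder names are illustrative.
INVOICE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <style>
    table { border-collapse: collapse; width: 100%; }
    td, th { border: 1px solid #ccc; padding: 4px 8px; }
  </style>
</head>
<body>
  <h1>Invoice $invoice_no</h1>
  <table>
    <tr><th>Item</th><th>Price</th></tr>
    $rows
  </table>
</body>
</html>""")


def render_invoice(invoice_no, items):
    """Turn plain Python data into an HTML string ready for any converter."""
    rows = "\n    ".join(
        # escape() keeps characters like & and < from breaking the markup
        f"<tr><td>{escape(name)}</td><td>{price:.2f}</td></tr>"
        for name, price in items
    )
    return INVOICE_TEMPLATE.substitute(invoice_no=escape(invoice_no), rows=rows)


html_content = render_invoice("INV-001", [("Design & build", 1200), ("Hosting", 49.5)])
```

The resulting html_content string can be passed to any of the libraries below that accept raw HTML.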

If you only need to convert a few HTML files or URLs without setting up libraries, you can also try our free HTML to PDF converter tool for quick and instant conversion.

HTML to PDF using Python Libraries

Let’s examine five popular Python libraries for HTML-to-PDF conversion and understand where each one fits best.

i. Pyppeteer

Pyppeteer is a Python port of the Node library Puppeteer, which provides a high-level API over the Chrome DevTools Protocol. In effect, you run a headless browser from your code and can do most of the things a regular browser can do: scrape data from websites, take screenshots of a page, and much more. Note that the project’s README describes Pyppeteer as unmaintained and points to Playwright as an alternative, so consider Playwright for new projects. Let’s see how we can use Pyppeteer to generate PDFs from HTML.

First, we need to install pyppeteer with the following command:

pip install pyppeteer

Generate PDF from a website URL

import asyncio
from pyppeteer import launch

async def generate_pdf(url, pdf_path):
    browser = await launch()
    page = await browser.newPage()

    await page.goto(url)

    await page.pdf({'path': pdf_path, 'format': 'A4'})

    await browser.close()

# Run the function (asyncio.run creates and closes the event loop for us)
asyncio.run(generate_pdf('https://example.com', 'example.pdf'))

In the above code, the generate_pdf function does the following:

Launches a new headless browser instance.
Opens a new tab (page) in the headless browser and waits for it to be ready.
Navigates to the URL passed in the url argument and waits for the page to load.
Generates a PDF of the webpage, saved at the location given in pdf_path with the format set to A4.
Closes the headless browser.
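
The steps above can be extended with a few commonly used options. The following is a sketch, not a definitive setup: build_pdf_options and generate_pdf_with_options are hypothetical helper names, and the margin values are arbitrary. The option keys (printBackground, landscape, margin, waitUntil) follow Puppeteer’s camelCase naming, which Pyppeteer mirrors.

```python
import asyncio


def build_pdf_options(pdf_path, landscape=False, margin="15mm"):
    """Collect page.pdf() options in one place (keys use Puppeteer's camelCase)."""
    return {
        "path": pdf_path,
        "format": "A4",
        "landscape": landscape,
        "printBackground": True,  # keep CSS background colors and images
        "margin": {"top": margin, "right": margin, "bottom": margin, "left": margin},
    }


async def generate_pdf_with_options(url, pdf_path):
    # Imported here so the helper above can be used without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        # 'networkidle0' waits until there are no open network connections,
        # which helps with pages that load their data via JavaScript.
        await page.goto(url, {"waitUntil": "networkidle0"})
        await page.pdf(build_pdf_options(pdf_path))
    finally:
        # Close the browser even if navigation or rendering fails.
        await browser.close()


# Example usage (needs pyppeteer, a Chromium download, and network access):
# asyncio.run(generate_pdf_with_options("https://example.com", "example.pdf"))
```

Wrapping the work in try/finally means a failed navigation does not leave a Chromium process running in the background.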

Generate PDF from Custom HTML content

import asyncio
from pyppeteer import launch

async def generate_pdf_from_html(html_content, pdf_path):
    browser = await launch()
    page = await browser.newPage()

    await page.setContent(html_content)

    await page.pdf({'path': pdf_path, 'format': 'A4'})

    await browser.close()

# HTML content
html_content = '''
<!DOCTYPE html>
<html>
<head>
    <title>PDF Example</title>
</head>
<body>
    <h1>Hello, world!</h1>
</body>
</html>
'''

# Run the function (asyncio.run creates and closes the event loop for us)
asyncio.run(generate_pdf_from_html(html_content, 'from_html.pdf'))

The example above uses Pyppeteer to generate a PDF from our own HTML content. Here is what the generate_pdf_from_html function does:

Launches a new headless browser instance.
Opens a new tab (page) in the headless browser and waits for it to be ready.
Sets the content of the page explicitly to our HTML string.
Generates a PDF of the page, saved at the location given in pdf_path with the format set to A4.
Closes the headless browser.

ii. xhtml2pdf

xhtml2pdf is another Python library that lets you generate PDFs from HTML content. Let’s see xhtml2pdf in action.

Install xhtml2pdf, along with requests (which we’ll use to fetch pages), using the following command:

pip install xhtml2pdf requests

Generate PDF from a website URL

Note that xhtml2pdf has no built-in way to fetch a URL, but we can use the requests library to download the page content first.

from xhtml2pdf import pisa
import requests

def convert_url_to_pdf(url, pdf_path):
    # Fetch the HTML content from the URL
    response = requests.get(url)
    if response.status_code != 200:
        print(f"Failed to fetch URL: {url}")
        return False

    html_content = response.text

    # Generate PDF
    with open(pdf_path, "wb") as pdf_file:
        pisa_status = pisa.CreatePDF(html_content, dest=pdf_file)

    return not pisa_status.err

# URL to fetch
url_to_fetch = "https://google.com"

# PDF path to save
pdf_path = "google.pdf"

# Generate PDF
if convert_url_to_pdf(url_to_fetch, pdf_path):
    print(f"PDF generated and saved at {pdf_path}")
else:
    print("PDF generation failed")

In the above code, the convert_url_to_pdf function does the following:

Uses requests to fetch the webpage from the URL, and returns False if the response status is not 200.
Reads the HTML as text from the response using response.text.
Passes the HTML content and the open output file to pisa.CreatePDF, which writes the PDF. The function returns True when pisa reports no errors.

Generate PDF from Custom HTML content

from xhtml2pdf import pisa

def convert_html_to_pdf(html_string, pdf_path):
  with open(pdf_path, "wb") as pdf_file:
    pisa_status = pisa.CreatePDF(html_string, dest=pdf_file)
  return not pisa_status.err

# HTML content
html = '''
<html>
  <head>
      <title>PDF Example</title>
  </head>

  <body>
      <h1>Hey, this will turn into a PDF!</h1>
  </body>
</html>
'''

# Create PDF
pdf_path = "example.pdf"
convert_html_to_pdf(html, pdf_path)

Creating a PDF from custom HTML content is similar to the process for URLs, with just one key difference: instead of passing a URL, we directly provide the actual HTML content to our creation method. The method then uses this custom HTML content to create the PDF.
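
Real templates usually reference images or stylesheets by relative path, which xhtml2pdf cannot locate on its own. pisa.CreatePDF accepts a link_callback argument for this purpose. The sketch below assumes your assets live in a local folder; make_link_callback, convert_html_with_assets, and the "static" directory name are illustrative choices, not part of xhtml2pdf.

```python
import os


def make_link_callback(static_dir):
    """Build a link_callback that maps relative URIs to files under static_dir."""
    def link_callback(uri, rel):
        # Leave absolute URLs untouched so they are handled as usual.
        if uri.startswith(("http://", "https://")):
            return uri
        # Map "/img/logo.png" or "img/logo.png" to "<static_dir>/img/logo.png".
        return os.path.join(static_dir, uri.lstrip("/"))
    return link_callback


def convert_html_with_assets(html_string, pdf_path, static_dir="static"):
    # Imported lazily so the callback can be used without xhtml2pdf installed.
    from xhtml2pdf import pisa

    with open(pdf_path, "wb") as pdf_file:
        status = pisa.CreatePDF(
            html_string,
            dest=pdf_file,
            link_callback=make_link_callback(static_dir),
        )
    return not status.err
```

With this in place, a tag such as img src="/img/logo.png" in the template would be resolved to static/img/logo.png on disk.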

If you’d like a more detailed, step-by-step walkthrough, check out our complete guide on how to convert HTML to PDF with xhtml2pdf.

iii. python-pdfkit

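python-pdfkit is a Python wrapper for the wkhtmltopdf utility, which uses WebKit to convert HTML to PDF.

First, make sure the wkhtmltopdf binary is installed and available on your PATH (pdfkit only wraps it and does not bundle it), then install python-pdfkit with pip: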

pip install pdfkit

Generate PDF from a website URL

import pdfkit

# URL to fetch
url = 'https://cnn.com'

# PDF path to save
pdf_path = 'example.pdf'

pdfkit.from_url(url, pdf_path)

Like Pyppeteer, pdfkit supports generating PDFs from website URLs out of the box.

In the above code, pdfkit generates the PDF in a single line: pdfkit.from_url is all you need.

Generate PDF from Custom HTML content

import pdfkit

# HTML content
html = '''
<html>
  <head>
      <title>PDF Example</title>
  </head>

  <body>
      <h1>Hey, this will turn into a PDF!</h1>
  </body>
</html>
'''

# PDF path to save
pdf_path = 'example.pdf'

# Create PDF
pdfkit.from_string(html, pdf_path)

To generate a PDF from custom HTML content using python-pdfkit, you simply need to use pdfkit.from_string and provide the HTML content along with the path for the PDF file.
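
Because pdfkit depends on the external wkhtmltopdf binary, it can help to check for it up front and to pass page settings explicitly. The following is a sketch under a few assumptions: find_wkhtmltopdf, build_pdfkit_options, and html_to_pdf are hypothetical helper names, and the margin values are arbitrary. The option keys correspond to wkhtmltopdf command-line flags.

```python
import shutil


def find_wkhtmltopdf():
    """Return the path to the wkhtmltopdf binary, or None if it is not on PATH."""
    return shutil.which("wkhtmltopdf")


def build_pdfkit_options(page_size="A4", margin="15mm"):
    """Options that pdfkit passes through to wkhtmltopdf as command-line flags."""
    return {
        "page-size": page_size,
        "margin-top": margin,
        "margin-right": margin,
        "margin-bottom": margin,
        "margin-left": margin,
        "encoding": "UTF-8",
    }


def html_to_pdf(html, pdf_path):
    import pdfkit  # imported lazily; requires `pip install pdfkit`

    binary = find_wkhtmltopdf()
    if binary is None:
        raise RuntimeError("wkhtmltopdf not found; install it and add it to PATH")

    # Point pdfkit at the binary we found instead of relying on its own lookup.
    config = pdfkit.configuration(wkhtmltopdf=binary)
    pdfkit.from_string(
        html, pdf_path, options=build_pdfkit_options(), configuration=config
    )
```

Failing early with a clear message tends to be easier to debug than the error pdfkit raises when the binary is missing.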

For a complete setup guide, advanced configuration tips, and real-world examples, read our in-depth tutorial on how to convert HTML to PDF with python-pdfkit.

iv. Playwright

Playwright is a modern library for browser automation. It drives Chromium, Firefox, and WebKit across platforms and languages, making it versatile for tasks like PDF generation. Note that PDF generation through page.pdf() is only supported in Chromium, which is why the examples below launch p.chromium.

To convert HTML to PDF with Playwright, follow these steps:

Step 1: Install Playwright:

pip install playwright
playwright install

Step 2: Generate PDF from Website URL:

import asyncio
from playwright.async_api import async_playwright

async def url_to_pdf(url, output_path):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto(url)
        await page.pdf(path=output_path)
        await browser.close()

# Example usage
url = 'https://example.com'
output_path = 'example_url.pdf'
asyncio.run(url_to_pdf(url, output_path))

Step 3: Generate PDF from Custom HTML Content:

import asyncio
from playwright.async_api import async_playwright

async def html_to_pdf(html_content, output_path):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.set_content(html_content)
        await page.pdf(path=output_path)
        await browser.close()

html_content = '''
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Sample HTML</title>
</head>
<body>
    <h1>Hello, World!</h1>
    <p>This is a sample HTML content to be converted to PDF.</p>
</body>
</html>
'''
output_path = 'custom_html.pdf'
asyncio.run(html_to_pdf(html_content, output_path))
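
By default, page.pdf() renders the page using its print CSS. If you want the PDF to match what you see on screen, or need backgrounds and explicit margins, a few extra arguments help. This is a sketch with illustrative values; build_print_options and url_to_screen_pdf are hypothetical helper names.

```python
import asyncio


def build_print_options(output_path, margin="15mm"):
    """Keyword arguments for Playwright's page.pdf()."""
    return {
        "path": output_path,
        "format": "A4",
        "print_background": True,  # keep CSS background colors and images
        "margin": {"top": margin, "right": margin, "bottom": margin, "left": margin},
    }


async def url_to_screen_pdf(url, output_path):
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()  # page.pdf() requires Chromium
        page = await browser.new_page()
        # Wait for network activity to settle before rendering.
        await page.goto(url, wait_until="networkidle")
        # Use screen styles instead of the default print styles.
        await page.emulate_media(media="screen")
        await page.pdf(**build_print_options(output_path))
        await browser.close()


# Example usage (needs Playwright, its Chromium build, and network access):
# asyncio.run(url_to_screen_pdf("https://example.com", "example_screen.pdf"))
```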

v. WeasyPrint

WeasyPrint is a visual rendering engine for HTML and CSS that follows the W3C specifications. It’s known for its strong CSS support and its ability to generate high-quality PDFs without a browser. It does not execute JavaScript, and it relies on a few system libraries (such as Pango) that may need to be installed separately.

Step 1: Install WeasyPrint:

pip install weasyprint

Step 2: Generate PDF from Website URL:

from weasyprint import HTML

def url_to_pdf(url, output_path):
    HTML(url=url).write_pdf(output_path)

# Example usage
url = 'https://example.com'
output_path = 'example_url.pdf'
url_to_pdf(url, output_path)

Step 3: Generate PDF from Custom HTML Content:

from weasyprint import HTML

def html_to_pdf(html_content, output_path):
    HTML(string=html_content).write_pdf(output_path)

html_content = '''
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Sample HTML</title>
    <style>
        body { font-family: Arial, sans-serif; }
        h1 { color: navy; }
    </style>
</head>
<body>
    <h1>Hello, World!</h1>
    <p>This is a sample HTML content to be converted to PDF.</p>
</body>
</html>
'''
output_path = 'custom_html.pdf'
html_to_pdf(html_content, output_path)
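
Much of WeasyPrint’s appeal for print-quality output comes from CSS paged media rules such as @page. The sketch below adds a page size, margins, and a page-number footer through a separate stylesheet. build_page_css and html_to_paged_pdf are hypothetical helper names, and the sizes are arbitrary examples.

```python
def build_page_css(size="A4", margin="2cm"):
    """CSS paged media rules: page size, margins, and a page-number footer."""
    return f"""
@page {{
    size: {size};
    margin: {margin};
    @bottom-center {{
        content: "Page " counter(page) " of " counter(pages);
        font-size: 9pt;
    }}
}}
/* Ask the renderer not to split a table row across two pages. */
tr {{ break-inside: avoid; }}
"""


def html_to_paged_pdf(html_content, output_path):
    # Imported lazily; requires `pip install weasyprint` and its system libraries.
    from weasyprint import CSS, HTML

    HTML(string=html_content).write_pdf(
        output_path, stylesheets=[CSS(string=build_page_css())]
    )
```

Keeping the page rules in their own stylesheet lets you reuse the same footer and margins across different templates.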

Performance Tips for HTML to PDF Conversion

When converting HTML to PDF in Python, keep these optimization tips in mind:

1. Batch Processing

If you need to generate multiple PDFs, process them in batches:

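from concurrent.futures import ThreadPoolExecutor
import asyncio
import pdfkit

# For async libraries (Pyppeteer/Playwright)
# jobs is a list of (url, pdf_path) pairs; generate_pdf is the
# Pyppeteer function defined earlier (it launches one browser per call).
async def batch_generate_async(jobs):
    tasks = [generate_pdf(url, pdf_path) for url, pdf_path in jobs]
    return await asyncio.gather(*tasks)

# For sync libraries (shown here with pdfkit)
# jobs is a list of (html, pdf_path) pairs.
def batch_generate_sync(jobs):
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = [executor.submit(pdfkit.from_string, html, pdf_path)
                   for html, pdf_path in jobs]
        # result() re-raises any exception from the worker thread
        return [f.result() for f in futures]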

2. Resource Management

Browser-based libraries: Reuse browser instances instead of creating new ones
Memory: Clear cache and close pages after use
Timeouts: Set reasonable timeouts to prevent hanging
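
Two of these points, limiting how much runs at once and setting timeouts, can be sketched with asyncio alone. The run_bounded helper and its default values are assumptions for illustration; fake_render stands in for a real renderer such as the Pyppeteer or Playwright functions shown earlier.

```python
import asyncio


async def run_bounded(jobs, render, limit=3, timeout=30.0):
    """Run render(job) for each job, at most `limit` at a time.

    Returns one entry per job, in order: the render result, or the exception
    raised (asyncio.TimeoutError when a job exceeds `timeout` seconds).
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:  # blocks while `limit` jobs are already running
            return await asyncio.wait_for(render(job), timeout)

    return await asyncio.gather(
        *(guarded(job) for job in jobs), return_exceptions=True
    )


# Stand-in renderer: sleeps for `job` seconds instead of driving a browser.
async def fake_render(job):
    await asyncio.sleep(job)
    return f"rendered-{job}"


# The third job sleeps longer than the timeout, so it is cancelled.
results = asyncio.run(run_bounded([0.01, 0.02, 5], fake_render, limit=2, timeout=0.2))
```

Returning exceptions instead of raising them means one slow or broken page does not abort the whole batch; the caller can inspect results and retry only the failed jobs.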

3. Caching

For frequently generated PDFs, implement caching:

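import hashlib
import os

CACHE_DIR = "cache"

def get_cached_pdf(html_content):
    # Key the cache on a hash of the HTML so identical input reuses the same file
    content_hash = hashlib.md5(html_content.encode()).hexdigest()
    cache_path = os.path.join(CACHE_DIR, f"{content_hash}.pdf")

    if os.path.exists(cache_path):
        with open(cache_path, 'rb') as f:
            return f.read()

    # Generate and cache; generate_pdf is assumed to return the PDF as bytes
    pdf = generate_pdf(html_content)
    os.makedirs(CACHE_DIR, exist_ok=True)  # create the cache folder on first use
    with open(cache_path, 'wb') as f:
        f.write(pdf)
    return pdf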

Comparing Python PDF Generation Libraries

Here’s a comparison of the five libraries to help you choose the right one for your needs:

Library	Installation	JavaScript	CSS Support	Performance	Best For
Pyppeteer	Moderate (requires Chromium)	✅ Full support	✅ Excellent	Medium	Dynamic content, SPAs, modern web apps
xhtml2pdf	Easy (pure Python)	❌ No support	⚠️ Basic	Good	Simple documents, basic reports
python-pdfkit	Complex (needs wkhtmltopdf)	⚠️ Limited	✅ Good	Good	Static sites, invoices, reports
Playwright	Moderate (browser binaries)	✅ Full support	✅ Excellent	Good	Complex web apps, JavaScript-heavy pages
WeasyPrint	Moderate (system dependencies)	❌ No support	✅ Excellent	Good	Print-quality documents, precise layouts

Quick Decision Guide

Choose Pyppeteer if you need JavaScript execution and already use async Python
Choose xhtml2pdf for simple HTML without JavaScript requirements
Choose python-pdfkit for a balance between features and simplicity
Choose Playwright for the most modern and robust browser automation
Choose WeasyPrint for the best CSS compliance and print-quality output

Templated (PDF Generation API) - An Efficient Bulk PDF Solution

If you need to generate PDFs in bulk (invoices, reports, certificates, and similar documents), Python libraries will do the job. But managing rendering engines, scaling servers, and handling formatting issues quickly becomes overhead.

An API is simply faster, easier, and more reliable at scale. Templated is a PDF generation API built exactly for that. Let’s learn more about it.

Why Templated?

1. Simple API - Just a Few Lines of Code

No browser setup. No dependency headaches. Just one API call:

import requests

# Generate PDF with one API call
response = requests.post(
    "https://api.templated.io/v1/render",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "template": "your-template-id",
        "format": "pdf",
        "layers": {
            "name": {
                "text": "John Doe"
            },
            "invoice_no": {
                "text": "INV-001"
            }
        }
    }
)

You pass a template ID and dynamic data. Our PDF generation API handles rendering and scaling.

2. Visual Template Editor

Templated includes a built-in editor:
Design templates with drag-and-drop
Live preview as you build
No coding required for template changes
Team collaboration features
Import existing PDF documents and Canva designs directly

3. AI-Powered PDF Generation

You can also generate PDF templates using AI.

Simply describe what you need. The more descriptive the prompt, the better the template design layout will be. Even if you want to make changes after the prompt, you can easily do so with drag-and-drop features similar to Canva.

4. Built for Everyone

Developers can integrate the API directly into backend systems and generate PDFs programmatically at scale.
Teams can use Templated’s spreadsheet feature to generate bulk PDFs without writing code and automate workflows using no-code tools.

5. Production-Ready Features

Fast: Generate PDFs in under 2 seconds
Scalable: Handle thousands of concurrent requests
99.9% Uptime: Reliable service guarantee

Python Libraries vs Templated API

Feature	HTML to PDF with Python	Templated API
Setup	Requires browser engines	No setup
Scaling	Self-managed	Built-in scaling
Maintenance	Manual updates	Fully managed
Template Editing	Code changes required	Visual editor
AI Template Creation	Not available	Yes
Bulk Generation	Complex at scale	Built for high volume

Conclusion

We’ve explored 5 popular Python libraries for HTML to PDF conversion. Each has its place:

Open-source libraries are great for:
- Learning and prototyping
- Simple, low-volume applications
- Complete control over the rendering process
Templated API is ideal for:
- Production applications
- High-volume PDF generation
- Teams that want to focus on business logic, not infrastructure

The choice depends on your specific needs. If you’re building a production application and want to avoid the complexity of managing browser dependencies and scaling issues, try Templated free with 50 API credits.

Frequently Asked Questions

Q: Which Python library is best for HTML to PDF conversion?
A: For simple documents, use xhtml2pdf. For JavaScript-heavy pages, use Playwright or Pyppeteer. For production applications, consider a managed API like Templated.

Q: Can I convert HTML to PDF without installing browsers?
A: Yes, xhtml2pdf and WeasyPrint don’t require browsers, but they don’t execute JavaScript. For full JavaScript support without managing browsers, use a cloud API.

Q: How can I generate PDFs at scale in Python?
A: For high-volume PDF generation, use either Playwright with proper resource management or a managed API service that handles scaling automatically.

Q: What’s the fastest way to implement PDF generation in production?
A: While open-source libraries are great for learning, they require significant setup and maintenance (browser dependencies, memory management, error handling). For production applications, Templated’s API gets you running in under 5 minutes with just a few lines of code, handles all scaling automatically, and includes a visual template editor so non-developers can update templates without code changes. This typically saves 40+ hours of development and maintenance time.

Ready to generate PDFs without the hassle?
Start your free trial and get 50 free API calls per month. No credit card required.