Convert HTML to PDF in Python with 5 Popular Libraries
Learn how to convert HTML to PDF in Python using the best libraries. Compare features and performance, and find the right solution for your needs.

You’ve crafted the perfect HTML template. The layout is clean, the CSS is spot on, and everything looks great in the browser.
But then, when you try to generate the PDF, things start to go wrong. Margins shift, page breaks slice tables in half, fonts don’t render the way you expect, and complex CSS acts unpredictably.
What should be a straightforward “export to PDF” turns into a marathon of debugging rendering engines and adjusting styles for hours.
Hi, I’m Pedro, the founder of Templated, an API that automates images, PDFs, and videos.
In this guide, I’m going to share five popular Python libraries for converting HTML to PDF. We’ll explore how each one works, where they excel, and where they have their limitations.
I’ll also provide you with a quick comparison to help you pick the right library based on accuracy, performance, CSS support, and deployment complexity.
Plus, at the end, I’ll introduce you to a more scalable HTML to PDF solution that runs reliably without all the hassle of managing rendering and infrastructure.
Sounds good? Let’s check them out.
Why Generate PDF from HTML in Python?
When working with Python, you typically start with structured data: JSON responses, database records, form submissions, analytics dashboards, invoices, or reports. The challenge is not generating a PDF file. The challenge is generating a well-designed, properly formatted document.
That’s where HTML becomes the most practical input format.
Instead of manually drawing text, tables, and layouts inside a PDF canvas, you define the structure using HTML and CSS, the same technologies used to design web pages. Python libraries can then render that HTML into a print-ready PDF.
This approach gives you:
- Layout precision using CSS
- Reusable and dynamic templates
- Support for tables, charts, images, and branding
- The ability to convert entire web pages via URL
In short, HTML acts as a presentation layer between your data and the final PDF output, which makes document automation far more scalable and maintainable.
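As a minimal sketch of that presentation-layer idea, the snippet below fills an HTML template from a Python dict using only the standard library. The template text, the field names, and the `render_invoice_html` helper are illustrative assumptions, not part of any library covered in this guide. The resulting string can be handed to any of the converters below.

```python
from html import escape
from string import Template

# Hypothetical invoice template; the $-placeholders are filled from a dict.
INVOICE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head><meta charset="UTF-8"><title>Invoice $invoice_no</title></head>
<body>
  <h1>Invoice $invoice_no</h1>
  <p>Billed to: $customer</p>
  <table>
    <tr><th>Item</th><th>Amount</th></tr>
$rows
  </table>
</body>
</html>""")


def render_invoice_html(data):
    """Turn structured data (e.g. a parsed JSON response) into an HTML string."""
    # Escape every value so user data cannot inject markup into the document.
    rows = "\n".join(
        f"    <tr><td>{escape(item['name'])}</td><td>{escape(item['amount'])}</td></tr>"
        for item in data["items"]
    )
    return INVOICE_TEMPLATE.substitute(
        invoice_no=escape(data["invoice_no"]),
        customer=escape(data["customer"]),
        rows=rows,
    )


html_content = render_invoice_html({
    "invoice_no": "INV-001",
    "customer": "John Doe",
    "items": [{"name": "Design work", "amount": "$500"}],
})
```

For anything beyond a toy template, a real template engine is usually a better fit. The point here is only that the data and the layout stay separate.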
If you only need to convert a few HTML files or URLs without setting up libraries, you can also try our free HTML to PDF converter tool for quick and instant conversion.
HTML to PDF using Python Libraries
Let’s examine five popular Python libraries for HTML-to-PDF conversion and understand where each one fits best.
i. Pyppeteer
Pyppeteer is a Python port of the Node library Puppeteer, which provides a high-level API over the Chrome DevTools Protocol. In effect, you run a headless Chromium browser from your code and can do most things a regular browser can do: scrape data from websites, take screenshots, and much more. Let’s see how we can use Pyppeteer to generate PDFs from HTML.
First, we need to install pyppeteer with the following command:
```bash
pip install pyppeteer
```
Generate PDF from a website URL
```python
import asyncio
from pyppeteer import launch

async def generate_pdf(url, pdf_path):
    browser = await launch()
    page = await browser.newPage()
    await page.goto(url)
    await page.pdf({'path': pdf_path, 'format': 'A4'})
    await browser.close()

# Run the function
asyncio.run(generate_pdf('https://example.com', 'example.pdf'))
```
In the code above, the `generate_pdf` coroutine does the following:
- Launches a new headless browser instance.
- Opens a new tab (page) in the headless browser and waits for it to be ready.
- Navigates to the URL specified in the `url` argument and waits for the page to load.
- Generates a PDF of the webpage. The PDF is saved at the location specified in `pdf_path`, and the format is set to `A4`.
- Closes the headless browser.
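Pages that load content late or rely on CSS backgrounds often need a few extra options. The sketch below is one way to pass them, assuming Pyppeteer's `waitUntil`, `printBackground`, and `margin` options behave as they do in Puppeteer. The helper names `build_pdf_options` and `generate_pdf_when_idle` are hypothetical.

```python
def build_pdf_options(pdf_path, page_format="A4", margin="15mm"):
    """Assemble the options dict passed to page.pdf()."""
    return {
        "path": pdf_path,
        "format": page_format,
        "printBackground": True,  # keep CSS background colors and images
        "margin": {"top": margin, "right": margin, "bottom": margin, "left": margin},
    }


async def generate_pdf_when_idle(url, pdf_path):
    # Imported here so the helper above has no third-party dependency.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        # Wait until the network has gone idle so late-loading assets are rendered.
        await page.goto(url, {"waitUntil": "networkidle0"})
        await page.pdf(build_pdf_options(pdf_path))
    finally:
        await browser.close()  # always release the Chromium process
```

Run it with `asyncio.run(generate_pdf_when_idle('https://example.com', 'example.pdf'))`. The `try`/`finally` ensures the browser is closed even if navigation or rendering fails.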
Generate PDF from Custom HTML content
```python
import asyncio
from pyppeteer import launch

async def generate_pdf_from_html(html_content, pdf_path):
    browser = await launch()
    page = await browser.newPage()
    await page.setContent(html_content)
    await page.pdf({'path': pdf_path, 'format': 'A4'})
    await browser.close()

# HTML content
html_content = '''
<!DOCTYPE html>
<html>
<head>
    <title>PDF Example</title>
</head>
<body>
    <h1>Hello, world!</h1>
</body>
</html>
'''

# Run the function
asyncio.run(generate_pdf_from_html(html_content, 'from_html.pdf'))
```
The example above uses our own HTML content instead of a URL. Here is what happens in `generate_pdf_from_html`:
- Launches a new headless browser instance.
- Opens a new tab (page) in the headless browser and waits for it to be ready.
- Explicitly sets the content of the page to our HTML content.
- Generates a PDF of the page. The PDF is saved at the location specified in `pdf_path`, and the format is set to `A4`.
- Closes the headless browser.
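Because `setContent` accepts any HTML string, print-specific CSS can be injected just before rendering. The sketch below adds page-break hints of the kind that help with tables being sliced across pages. The `add_print_css` helper and the `.page-break` class are hypothetical, and how strictly these hints are honored depends on the rendering engine.

```python
PRINT_CSS = """
<style>
  /* Ask the renderer not to split these elements across pages. */
  tr, img, h1, h2, h3 { break-inside: avoid; page-break-inside: avoid; }
  /* Hypothetical utility class: start a new page before the element. */
  .page-break { break-before: page; page-break-before: always; }
</style>
"""


def add_print_css(html_content):
    """Insert PRINT_CSS just before </head>, or prepend it if there is no head."""
    index = html_content.lower().find("</head>")
    if index == -1:
        return PRINT_CSS + html_content
    return html_content[:index] + PRINT_CSS + html_content[index:]
```

The result of `add_print_css(html_content)` can then be passed to `page.setContent` in place of the raw HTML.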
ii. xhtml2pdf
xhtml2pdf is another Python library that lets you generate PDFs from HTML content. Let’s see xhtml2pdf in action.
Install xhtml2pdf (together with requests, which the URL example uses) with the following command:
```bash
pip install xhtml2pdf requests
```
Generate PDF from a website URL
Note that xhtml2pdf does not have an in-built feature to parse the URL, but we can use requests in Python to get the content from a URL.
```python
from xhtml2pdf import pisa
import requests

def convert_url_to_pdf(url, pdf_path):
    # Fetch the HTML content from the URL
    response = requests.get(url)
    if response.status_code != 200:
        print(f"Failed to fetch URL: {url}")
        return False

    html_content = response.text

    # Generate PDF
    with open(pdf_path, "wb") as pdf_file:
        pisa_status = pisa.CreatePDF(html_content, dest=pdf_file)

    return not pisa_status.err

# URL to fetch
url_to_fetch = "https://google.com"

# PDF path to save
pdf_path = "google.pdf"

# Generate PDF
if convert_url_to_pdf(url_to_fetch, pdf_path):
    print(f"PDF generated and saved at {pdf_path}")
else:
    print("PDF generation failed")
```
In the code above, `convert_url_to_pdf` does the following:
- First, we use `requests` to fetch the webpage content from the URL.
- Once we have the response, we take its body as text using `response.text`.
- Finally, we generate the PDF by calling `pisa.CreatePDF` with the HTML content and the open output file.
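One catch with fetched pages is that images and stylesheets are often referenced with relative paths, which mean nothing once the HTML is just a string. `pisa.CreatePDF` accepts a `link_callback` that receives each resource URI and returns where to find it. The sketch below resolves relative URIs against the page URL. The helper names are hypothetical, and it assumes your xhtml2pdf version can fetch the absolute URL the callback returns.

```python
from urllib.parse import urljoin


def make_link_callback(base_url):
    """Build a callback that resolves relative resource URIs against base_url."""
    def link_callback(uri, rel):
        # xhtml2pdf calls this with the URI found in the HTML (src/href).
        return urljoin(base_url, uri)
    return link_callback


def convert_url_to_pdf_with_assets(url, pdf_path):
    # Imported lazily so make_link_callback works without third-party packages.
    import requests
    from xhtml2pdf import pisa

    response = requests.get(url, timeout=30)
    response.raise_for_status()  # raise on 4xx/5xx instead of printing
    with open(pdf_path, "wb") as pdf_file:
        pisa_status = pisa.CreatePDF(
            response.text,
            dest=pdf_file,
            link_callback=make_link_callback(url),
        )
    return not pisa_status.err
```

Keep in mind that xhtml2pdf supports only a subset of CSS, so a complex site will still not look like it does in a browser.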
Generating PDF from custom HTML content
```python
from xhtml2pdf import pisa

def convert_html_to_pdf(html_string, pdf_path):
    with open(pdf_path, "wb") as pdf_file:
        pisa_status = pisa.CreatePDF(html_string, dest=pdf_file)
    return not pisa_status.err

# HTML content
html = '''
<html>
    <head>
        <title>PDF Example</title>
    </head>
    <body>
        <h1>Hey, this will turn into a PDF!</h1>
    </body>
</html>
'''

# Create PDF
pdf_path = "example.pdf"
convert_html_to_pdf(html, pdf_path)
```
Creating a PDF from custom HTML content is similar to the process for URLs, with one key difference: instead of fetching a page with `requests`, we pass the HTML string directly to `pisa.CreatePDF`, which uses it to create the PDF.
If you’d like a more detailed, step-by-step walkthrough, check out our complete guide on how to convert HTML to PDF with xhtml2pdf.
iii. python-pdfkit
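python-pdfkit is a Python wrapper for the wkhtmltopdf utility, which uses WebKit to convert HTML to PDF. The wkhtmltopdf binary is not bundled with the Python package, so it must be installed separately and be available on your PATH.
First, let’s install python-pdfkit with pip:
```bash
pip install pdfkit
```
Generate PDF from a website URL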
```python
import pdfkit

# URL to fetch
url = 'https://cnn.com'

# PDF path to save
pdf_path = 'example.pdf'

pdfkit.from_url(url, pdf_path)
```
pdfkit supports generating PDFs from website URLs out of the box, just like Pyppeteer.
As the code above shows, a single call to `pdfkit.from_url` is all you need to generate a PDF.
Generate PDF from Custom HTML content
```python
import pdfkit

# HTML content
html = '''
<html>
    <head>
        <title>PDF Example</title>
    </head>
    <body>
        <h1>Hey, this will turn into a PDF!</h1>
    </body>
</html>
'''

# PDF path to save
pdf_path = 'example.pdf'

# Create PDF
pdfkit.from_string(html, pdf_path)
```
To generate a PDF from custom HTML content using python-pdfkit, you simply call `pdfkit.from_string` and provide the HTML content along with the path for the PDF file.
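Page size, margins, and encoding are controlled through an `options` dict whose keys mirror wkhtmltopdf's command-line flags. The sketch below also locates the binary explicitly and returns the PDF as bytes rather than writing a file. The helper names are hypothetical, and the margin values are only an example.

```python
import shutil


def build_wkhtmltopdf_options(page_size="A4", margin="15mm"):
    """Options are wkhtmltopdf command-line flags without the leading dashes."""
    return {
        "page-size": page_size,
        "margin-top": margin,
        "margin-right": margin,
        "margin-bottom": margin,
        "margin-left": margin,
        "encoding": "UTF-8",
    }


def html_to_pdf_bytes(html):
    # Imported lazily; pdfkit also needs the wkhtmltopdf binary at runtime.
    import pdfkit

    binary = shutil.which("wkhtmltopdf")
    if binary is None:
        raise RuntimeError("wkhtmltopdf was not found on PATH")
    config = pdfkit.configuration(wkhtmltopdf=binary)
    # Passing False as the output path makes pdfkit return the PDF as bytes.
    return pdfkit.from_string(
        html, False, options=build_wkhtmltopdf_options(), configuration=config
    )
```

Returning bytes is convenient in web handlers, where the PDF is streamed to the client instead of being saved to disk.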
For a complete setup guide, advanced configuration tips, and real-world examples, read our in-depth tutorial on how to convert HTML to PDF with python-pdfkit.
iv. Playwright
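Playwright is a modern library for browser automation. It drives Chromium, Firefox, and WebKit (the engine behind Safari) across platforms and languages, and can also use installed Chrome or Edge builds. For PDF generation specifically, `page.pdf()` is only supported in headless Chromium, which is why the examples below launch Chromium.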
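Playwright also ships a synchronous API, which can be simpler in scripts that are not otherwise async. The following is a minimal sketch of URL-to-PDF with a few common `page.pdf()` options. The helper names `build_pdf_kwargs` and `url_to_pdf_sync` are hypothetical, and the margin values are only an example.

```python
def build_pdf_kwargs(output_path, page_format="A4", margin="15mm"):
    """Keyword arguments for Playwright's page.pdf()."""
    return {
        "path": output_path,
        "format": page_format,
        "print_background": True,  # keep CSS background colors and images
        "margin": {"top": margin, "right": margin, "bottom": margin, "left": margin},
    }


def url_to_pdf_sync(url, output_path):
    # Imported lazily so build_pdf_kwargs works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()  # PDF export needs Chromium
        try:
            page = browser.new_page()
            # Wait for network activity to settle before printing.
            page.goto(url, wait_until="networkidle")
            page.pdf(**build_pdf_kwargs(output_path))
        finally:
            browser.close()
```

Calling `url_to_pdf_sync('https://example.com', 'example_url.pdf')` requires the browser binaries from `playwright install` to be present.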
To use Python as a converter for HTML to PDF with Playwright, follow these steps:
Step 1: Install Playwright:
```bash
pip install playwright
playwright install
```
Step 2: Generate PDF from Website URL:
```python
import asyncio
from playwright.async_api import async_playwright

async def url_to_pdf(url, output_path):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto(url)
        await page.pdf(path=output_path)
        await browser.close()

# Example usage
url = 'https://example.com'
output_path = 'example_url.pdf'
asyncio.run(url_to_pdf(url, output_path))
```
Step 3: Generate PDF from Custom HTML Content:
```python
import asyncio
from playwright.async_api import async_playwright

async def html_to_pdf(html_content, output_path):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.set_content(html_content)
        await page.pdf(path=output_path)
        await browser.close()

html_content = '''
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Sample HTML</title>
</head>
<body>
    <h1>Hello, World!</h1>
    <p>This is a sample HTML content to be converted to PDF.</p>
</body>
</html>
'''
output_path = 'custom_html.pdf'
asyncio.run(html_to_pdf(html_content, output_path))
```
v. WeasyPrint
WeasyPrint is a visual rendering engine that follows the W3C specifications for HTML and CSS. It’s known for its strong support for print-oriented CSS and for generating high-quality PDFs without a browser. It does not execute JavaScript, and it relies on a few system libraries (such as Pango) being installed.
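To illustrate the print-oriented CSS support, the sketch below applies an `@page` rule with a page-number footer through an extra stylesheet. The `PAGE_CSS` content and the `weasyprint_pdf_bytes` helper are illustrative assumptions. The size and margins are arbitrary choices.

```python
# Paged-media CSS: page size, margins, and a running footer with page numbers.
PAGE_CSS = """
@page {
  size: A4;
  margin: 20mm;
  @bottom-center {
    content: "Page " counter(page) " of " counter(pages);
    font-size: 9pt;
  }
}
"""


def weasyprint_pdf_bytes(html_content, extra_css=PAGE_CSS):
    # Imported lazily; WeasyPrint needs its system libraries at import time.
    from weasyprint import CSS, HTML

    # write_pdf() without a target returns the PDF as bytes.
    return HTML(string=html_content).write_pdf(stylesheets=[CSS(string=extra_css)])
```

Keeping the page rules in a separate stylesheet means the same HTML can be reused for screen and print output.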
Step 1: Install WeasyPrint:
```bash
pip install weasyprint
```
Step 2: Generate PDF from Website URL:
```python
from weasyprint import HTML

def url_to_pdf(url, output_path):
    HTML(url=url).write_pdf(output_path)

# Example usage
url = 'https://example.com'
output_path = 'example_url.pdf'
url_to_pdf(url, output_path)
```
Step 3: Generate PDF from Custom HTML Content:
```python
from weasyprint import HTML

def html_to_pdf(html_content, output_path):
    HTML(string=html_content).write_pdf(output_path)

html_content = '''
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Sample HTML</title>
    <style>
        body { font-family: Arial, sans-serif; }
        h1 { color: navy; }
    </style>
</head>
<body>
    <h1>Hello, World!</h1>
    <p>This is a sample HTML content to be converted to PDF.</p>
</body>
</html>
'''
output_path = 'custom_html.pdf'
html_to_pdf(html_content, output_path)
```
Performance Tips for HTML to PDF Conversion
When converting HTML to PDF in Python, keep these optimization tips in mind:
1. Batch Processing
If you need to generate multiple PDFs, process them in batches:
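```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# For async libraries (Pyppeteer/Playwright): run the conversions concurrently.
# generate_pdf is the async (url, pdf_path) coroutine from the Pyppeteer example.
async def batch_generate_async(jobs):
    tasks = [generate_pdf(url, pdf_path) for url, pdf_path in jobs]
    return await asyncio.gather(*tasks)

# For sync libraries (xhtml2pdf, pdfkit): fan out across a thread pool.
# convert_html_to_pdf is the (html_string, pdf_path) function from the xhtml2pdf example.
def batch_generate_sync(jobs):
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = [executor.submit(convert_html_to_pdf, html, pdf_path) for html, pdf_path in jobs]
        return [f.result() for f in futures]
```
2. Resource Management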
- Browser-based libraries: Reuse browser instances instead of creating new ones
- Memory: Clear cache and close pages after use
- Timeouts: Set reasonable timeouts to prevent hanging
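The three points above can be combined in one place. The following is a sketch, not a tuned implementation: it assumes Playwright's async API, launches Chromium once, opens and closes a page per job, caps concurrency with a semaphore, and applies a per-job timeout. The function names, the concurrency limit, and the timeout value are all illustrative.

```python
import asyncio


async def render_all(jobs, render_one, max_concurrency=4, timeout_s=30):
    """Run render_one(job) for every job with bounded concurrency and a timeout."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(job):
        async with semaphore:
            try:
                return await asyncio.wait_for(render_one(job), timeout=timeout_s)
            except asyncio.TimeoutError:
                return None  # the caller can log or retry the failed job

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def render_with_shared_browser(jobs, max_concurrency=4):
    """jobs: iterable of (html, output_path). One browser is reused for all jobs."""
    from playwright.async_api import async_playwright  # imported lazily

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        try:
            async def render_one(job):
                html, output_path = job
                page = await browser.new_page()
                try:
                    await page.set_content(html)
                    await page.pdf(path=output_path)
                    return output_path
                finally:
                    await page.close()  # free the tab's memory after each job

            return await render_all(jobs, render_one, max_concurrency)
        finally:
            await browser.close()
```

`render_all` is independent of the rendering library, so the same concurrency cap and timeout can wrap a Pyppeteer coroutine as well. Jobs that time out come back as `None` in the result list.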
3. Caching
For frequently generated PDFs, implement caching:
```python
import hashlib
import os

def get_cached_pdf(html_content):
    # Key the cache on a hash of the HTML (MD5 is fine for a non-security cache key)
    content_hash = hashlib.md5(html_content.encode()).hexdigest()
    cache_path = f"cache/{content_hash}.pdf"

    if os.path.exists(cache_path):
        with open(cache_path, 'rb') as f:
            return f.read()

    # Generate and cache; generate_pdf stands for any converter that returns PDF bytes
    pdf = generate_pdf(html_content)
    os.makedirs("cache", exist_ok=True)
    with open(cache_path, 'wb') as f:
        f.write(pdf)
    return pdf
```
Comparing Python PDF Generation Libraries
Here’s a comparison of the five libraries to help you choose the right one for your needs:
| Library | Installation | JavaScript | CSS Support | Performance | Best For |
|---|---|---|---|---|---|
| Pyppeteer | Moderate (requires Chromium) | ✅ Full support | ✅ Excellent | Medium | Dynamic content, SPAs, modern web apps |
| xhtml2pdf | Easy (pure Python) | ❌ No support | ⚠️ Basic | Good | Simple documents, basic reports |
| python-pdfkit | Complex (needs wkhtmltopdf) | ⚠️ Limited | ✅ Good | Good | Static sites, invoices, reports |
| Playwright | Moderate (browser binaries) | ✅ Full support | ✅ Excellent | Good | Complex web apps, cross-browser needs |
| WeasyPrint | Moderate (system dependencies) | ❌ No support | ✅ Excellent | Good | Print-quality documents, precise layouts |
Quick Decision Guide
- Choose Pyppeteer if you need JavaScript execution and already use async Python
- Choose xhtml2pdf for simple HTML without JavaScript requirements
- Choose python-pdfkit for a balance between features and simplicity
- Choose Playwright for the most modern and robust browser automation
- Choose WeasyPrint for the best CSS compliance and print-quality output
Templated (PDF Generation API) - Efficient Bulk PDF Generation
If you need to generate PDFs in bulk (invoices, reports, certificates, and other documents), Python libraries work. But managing rendering engines, scaling servers, and handling formatting issues quickly becomes overhead.
At scale, a managed API is typically faster to integrate, easier to operate, and more reliable. Templated is a PDF generation API built exactly for that. Let’s learn more about it.
Why Templated?
1. Simple API - Just a Few Lines of Code
No browser setup. No dependency headaches. Just one API call:
```python
import requests

# Generate PDF with one API call
response = requests.post(
    "https://api.templated.io/v1/render",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "template": "your-template-id",
        "format": "pdf",
        "layers": {
            "name": {"text": "John Doe"},
            "invoice_no": {"text": "INV-001"}
        }
    }
)
```
You pass a template ID and dynamic data. Our PDF generator API handles rendering and scaling.
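In application code you would typically build the `layers` payload from your own data and check the HTTP status. The sketch below wraps the same request in two helpers. `build_render_payload` and `render_pdf` are hypothetical names, and because the response format is not shown in this guide, the parsed JSON is returned as-is on the assumption that the endpoint answers with JSON.

```python
def build_render_payload(template_id, fields):
    """Map {"layer_name": value} to the layers structure used in the request above."""
    return {
        "template": template_id,
        "format": "pdf",
        "layers": {name: {"text": str(value)} for name, value in fields.items()},
    }


def render_pdf(api_key, payload, timeout_s=30):
    # Imported lazily so build_render_payload works without requests installed.
    import requests

    response = requests.post(
        "https://api.templated.io/v1/render",
        headers={"Authorization": f"Bearer {api_key}"},
        json=payload,
        timeout=timeout_s,
    )
    response.raise_for_status()  # surface 4xx/5xx errors instead of ignoring them
    # Assumption: the API responds with JSON; its exact fields are not covered here.
    return response.json()
```

For bulk generation, the same payload builder can be called once per record, for example per row of an invoice table.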
2. Visual Template Editor
Templated includes a built-in editor:
- Design templates with drag-and-drop
- Live preview as you build
- No coding required for template changes
- Team collaboration features
- Direct import of PDF documents and Canva designs
3. AI-Powered PDF Generation
You can also generate PDF templates using AI.
Simply describe what you need. The more descriptive the prompt, the better the template design layout will be. Even if you want to make changes after the prompt, you can easily do so with drag-and-drop features similar to Canva.
4. Built for Everyone
- Developers can integrate the API directly into backend systems and generate PDFs programmatically at scale.
- Teams can use Templated’s spreadsheet feature to generate bulk PDFs without writing code and automate workflows using no-code tools.
5. Production-Ready Features
- Fast: Generate PDFs in under 2 seconds
- Scalable: Handle thousands of concurrent requests
- 99.9% Uptime: Reliable service guarantee
Python Libraries vs Templated API
| Feature | HTML to PDF with Python | Templated API |
|---|---|---|
| Setup | Requires browser engines | No setup |
| Scaling | Self-managed | Built-in scaling |
| Maintenance | Manual updates | Fully managed |
| Template Editing | Code changes required | Visual editor |
| AI Template Creation | Not available | Yes |
| Bulk Generation | Complex at scale | Built for high volume |
Conclusion
We’ve explored 5 popular Python libraries for HTML to PDF conversion. Each has its place:
Open-source libraries are great for:
- Learning and prototyping
- Simple, low-volume applications
- Complete control over the rendering process
Templated API is ideal for:
- Production applications
- High-volume PDF generation
- Teams that want to focus on business logic, not infrastructure
The choice depends on your specific needs. If you’re building a production application and want to avoid the complexity of managing browser dependencies and scaling issues, try Templated free with 50 API credits.
Other Languages
Learn how to convert HTML to PDF in other programming languages:
- Convert HTML to PDF with Java
- Convert HTML to PDF with C#
- Convert HTML to PDF with PHP
- Convert HTML to PDF with Node.js
Frequently Asked Questions
Q: Which Python library is best for HTML to PDF conversion?
A: For simple documents, use xhtml2pdf. For JavaScript-heavy pages, use Playwright or Pyppeteer. For production applications, consider a managed API like Templated.
Q: Can I convert HTML to PDF without installing browsers?
A: Yes, xhtml2pdf and WeasyPrint don’t require browsers, but they don’t execute JavaScript. For full JavaScript support without managing browsers, use a cloud API.
Q: How can I generate PDFs at scale in Python?
A: For high-volume PDF generation, use either Playwright with proper resource management or a managed API service that handles scaling automatically.
Q: What’s the fastest way to implement PDF generation in production?
A: While open-source libraries are great for learning, they require significant setup and maintenance (browser dependencies, memory management, error handling). For production applications, Templated’s API gets you running in under 5 minutes with just a few lines of code, handles all scaling automatically, and includes a visual template editor so non-developers can update templates without code changes. This typically saves 40+ hours of development and maintenance time.
Ready to generate PDFs without the hassle?
Start your free trial and get 50 free API calls per month. No credit card required.



