Convert HTML to PDF in Python with 5 Popular Libraries (Updated 2025)
Learn how to convert HTML to PDF in Python using the best libraries. Compare features and performance, and find the right solution for your needs.

We often encounter the need to create PDFs based on content. While there is no right or wrong way to generate PDFs, some approaches are more efficient and quicker to build than others.
Previously, we had to write all the boilerplate code to generate PDFs in our applications.
However, now we have many great libraries and tools that can help us quickly implement this feature.
The most important part of generating PDFs is the input data. The most common and useful approach is to generate PDFs from HTML content or based on a website URL.
In this article, we will look into some approaches that we can take to generate PDFs from HTML.
TL;DR: We provide a robust REST API designed for seamless PDF generation with popular programming languages like Python.
Why generate PDF from HTML?
Before we move on to the libraries, first let’s see why we prefer HTML as input data for generating PDFs. Some of the reasons are as follows:
- Open and Mature Technology: HTML is an open standard, which ensures that tools and technologies built around it are widely available and well-understood. Its maturity also means that most of the challenges and quirks are well-documented, making troubleshooting easier.
- Cost-effective: There are a plethora of tools, libraries, and APIs available (both free and paid) that can convert HTML to PDF, reducing the need for specialized software for PDF creation.
- Embed Multimedia: HTML supports the embedding of multimedia such as images, videos, and audio. Although not all of these can be directly translated into a PDF, having a source in HTML provides options for creating rich, multimedia-enhanced documents.
- Styling with CSS: Cascading Style Sheets (CSS) provide powerful styling options for HTML content, allowing for branding, theming, and visual consistency. These can then be reflected in the resulting PDF.
- Easy to Learn and Use: Learning the basics of HTML can be done quickly, making it accessible for many users to create content.
In summary, generating PDFs from HTML combines the best of both worlds: the flexibility, accessibility, and interactivity of HTML with the portability and standardization of PDFs.
HTML to PDF using Python Libraries
Many Python libraries can generate PDFs from HTML content; some of them are explained below.
When converting HTML to PDF in Python, we need libraries that do not compromise the formatting of the output. The open-source libraries below are all built to preserve formatting, although how faithfully each one does so depends on its HTML and CSS support.
i. Pyppeteer
Pyppeteer is a Python port of the Node library Puppeteer, which provides a high-level API over the Chrome DevTools Protocol. It is like running a browser from your code that can do most of the things your own browser can do. Puppeteer can be used to scrape data from websites, take screenshots of a website, and much more. Let’s see how we can utilize pyppeteer to generate PDFs from HTML.
First, we need to install pyppeteer with the following command:

    pip install pyppeteer

Generate PDF from a website URL

    import asyncio
    from pyppeteer import launch

    async def generate_pdf(url, pdf_path):
        browser = await launch()
        page = await browser.newPage()
        await page.goto(url)
        await page.pdf({'path': pdf_path, 'format': 'A4'})
        await browser.close()

    # Run the function
    asyncio.get_event_loop().run_until_complete(generate_pdf('https://example.com', 'example.pdf'))

In the above code, if you see the generate_pdf method, we are doing the following things:
- Launches a new headless browser instance.
- Opens a new tab or page in the headless browser and waits for it to be ready.
- Navigates to the URL specified in the url argument and waits for the page to load.
- Generates a PDF of the webpage. The PDF is saved at the location specified in pdf_path, and the format is set to A4.
- Closes the headless browser.
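The options dictionary passed to page.pdf can carry more than the path and format. The following is a minimal sketch, assuming you want background colors printed and a uniform margin; the helper names build_pdf_options and generate_styled_pdf are illustrative and not part of pyppeteer. The pyppeteer import is done lazily inside the function so the options helper can be used on its own.

```python
import asyncio


def build_pdf_options(pdf_path, landscape=False, margin="1cm"):
    """Build the options dict for pyppeteer's page.pdf (hypothetical helper)."""
    return {
        "path": pdf_path,
        "format": "A4",
        "landscape": landscape,
        "printBackground": True,  # include CSS background colors and images
        "margin": {"top": margin, "right": margin, "bottom": margin, "left": margin},
    }


async def generate_styled_pdf(url, pdf_path, landscape=False):
    # Imported here so the helper above works even where pyppeteer is not installed.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        # Wait until network activity has mostly settled before printing.
        await page.goto(url, {"waitUntil": "networkidle2"})
        await page.pdf(build_pdf_options(pdf_path, landscape=landscape))
    finally:
        # Close the browser even if navigation or printing fails.
        await browser.close()
```

To run it, call asyncio.run(generate_styled_pdf('https://example.com', 'styled.pdf')). Wrapping the work in try/finally keeps a failed page load from leaving a Chromium process behind.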
Generate PDF from Custom HTML content

    import asyncio
    from pyppeteer import launch

    async def generate_pdf_from_html(html_content, pdf_path):
        browser = await launch()
        page = await browser.newPage()
        await page.setContent(html_content)
        await page.pdf({'path': pdf_path, 'format': 'A4'})
        await browser.close()

    # HTML content
    html_content = '''
    <!DOCTYPE html>
    <html>
    <head>
        <title>PDF Example</title>
    </head>
    <body>
        <h1>Hello, world!</h1>
    </body>
    </html>
    '''

    # Run the function
    asyncio.get_event_loop().run_until_complete(generate_pdf_from_html(html_content, 'from_html.pdf'))

Above is another example using Pyppeteer on how we can use our own custom HTML content to generate PDFs. Let’s see what is happening in the method generate_pdf_from_html:
- Launches a new headless browser instance.
- Opens a new tab or page in the headless browser and waits for it to be ready.
- Explicitly sets the content of the page to our HTML content.
- Generates a PDF of the webpage. The PDF is saved at the location specified in pdf_path, and the format is set to A4.
- Closes the headless browser.
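In practice the custom HTML is rarely a fixed string; it is usually filled in from data before being handed to the converter. Here is a small sketch using only the standard library, assuming a hypothetical invoice layout. The template text and the render_invoice_html name are made up for illustration.

```python
from html import escape
from string import Template

# Hypothetical template; $title and $customer are placeholders filled at render time.
INVOICE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head><title>$title</title></head>
<body>
  <h1>$title</h1>
  <p>Customer: $customer</p>
</body>
</html>""")


def render_invoice_html(title, customer):
    """Return HTML with user-supplied values escaped so they cannot break the markup."""
    return INVOICE_TEMPLATE.substitute(
        title=escape(title),
        customer=escape(customer),
    )


html_content = render_invoice_html("Invoice INV-001", "Ada & Co")
```

The resulting html_content string can be passed to any of the HTML-based functions in this article, such as generate_pdf_from_html. Escaping the values matters because a stray < or & in the data would otherwise be interpreted as markup.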
ii. xhtml2pdf
xhtml2pdf is another Python library that lets you generate PDFs from HTML content. Let’s see xhtml2pdf in action.
The following command is to install xhtml2pdf:

    pip install xhtml2pdf requests

To generate PDF from a website URL
Note that xhtml2pdf does not have an in-built feature to parse the URL, but we can use requests in Python to get the content from a URL.

    from xhtml2pdf import pisa
    import requests

    def convert_url_to_pdf(url, pdf_path):
        # Fetch the HTML content from the URL
        response = requests.get(url)
        if response.status_code != 200:
            print(f"Failed to fetch URL: {url}")
            return False

        html_content = response.text

        # Generate PDF
        with open(pdf_path, "wb") as pdf_file:
            pisa_status = pisa.CreatePDF(html_content, dest=pdf_file)

        return not pisa_status.err

    # URL to fetch
    url_to_fetch = "https://google.com"

    # PDF path to save
    pdf_path = "google.pdf"

    # Generate PDF
    if convert_url_to_pdf(url_to_fetch, pdf_path):
        print(f"PDF generated and saved at {pdf_path}")
    else:
        print("PDF generation failed")

In the above code, we are doing the following things in our method convert_url_to_pdf:
- First, we use requests to get the webpage content from the URL.
- Once we get the content, we select the text part from the response using response.text.
- Then comes the PDF generation: we call pisa.CreatePDF, passing our HTML content and the open output file.
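The fetch step above can be made a little more defensive. The sketch below assumes you want a request timeout and an exception on HTTP errors instead of a printed message; fetch_html and url_to_pdf are hypothetical names, and the session parameter exists only so the fetch logic can be exercised without network access.

```python
def fetch_html(url, timeout=10.0, session=None):
    """Fetch a page and return its text, raising on HTTP errors or timeouts."""
    if session is None:
        import requests  # imported lazily; the requests module itself exposes get()
        session = requests
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # raises for 4xx/5xx responses
    return response.text


def url_to_pdf(url, pdf_path, timeout=10.0):
    """Fetch the URL and convert the returned HTML with xhtml2pdf."""
    from xhtml2pdf import pisa  # imported lazily so fetch_html works on its own

    html_content = fetch_html(url, timeout=timeout)
    with open(pdf_path, "wb") as pdf_file:
        pisa_status = pisa.CreatePDF(html_content, dest=pdf_file)
    return not pisa_status.err
```

Without a timeout, requests.get can wait indefinitely on an unresponsive server, which is easy to miss in a script that converts many URLs.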
Generating PDF from custom HTML content

    from xhtml2pdf import pisa

    def convert_html_to_pdf(html_string, pdf_path):
        with open(pdf_path, "wb") as pdf_file:
            pisa_status = pisa.CreatePDF(html_string, dest=pdf_file)
        return not pisa_status.err

    # HTML content
    html = '''
    <html>
      <head>
        <title>PDF Example</title>
      </head>
      <body>
        <h1>Hey, this will turn into a PDF!</h1>
      </body>
    </html>
    '''

    # Create PDF
    pdf_path = "example.pdf"
    convert_html_to_pdf(html, pdf_path)

Creating a PDF from custom HTML content is similar to the process for URLs, with just one key difference: instead of passing a URL, we directly provide the actual HTML content to our creation method. The method then uses this custom HTML content to create the PDF.
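Page size and margins for xhtml2pdf are typically set from within the HTML itself, through a CSS @page rule. The following sketch wraps a body fragment in a document carrying such a rule; wrap_with_page_css and save_pdf are illustrative names, and the exact set of @page properties honored may vary between xhtml2pdf versions.

```python
def wrap_with_page_css(body_html, size="a4 portrait", margin="2cm"):
    """Wrap an HTML fragment in a full document with an @page rule (hypothetical helper)."""
    return f"""<html>
<head>
<style>
  @page {{ size: {size}; margin: {margin}; }}
</style>
</head>
<body>
{body_html}
</body>
</html>"""


def save_pdf(body_html, pdf_path, size="a4 portrait", margin="2cm"):
    from xhtml2pdf import pisa  # imported lazily so the wrapper can be used alone

    html = wrap_with_page_css(body_html, size=size, margin=margin)
    with open(pdf_path, "wb") as pdf_file:
        pisa_status = pisa.CreatePDF(html, dest=pdf_file)
    return not pisa_status.err
```

For example, save_pdf("<h1>Report</h1>", "report.pdf", size="a4 landscape") would request a landscape page with the default 2cm margin.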
iii. python-pdfkit
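python-pdfkit is a Python wrapper for the wkhtmltopdf utility, which uses WebKit to convert HTML to PDF.
First, let’s install python-pdfkit with pip. Note that the wkhtmltopdf binary itself is not installed by pip and must be installed separately on your system: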

    pip install pdfkit

Generate PDF from a website URL

    import pdfkit

    # URL to fetch
    url = 'https://cnn.com'

    # PDF path to save
    pdf_path = 'example.pdf'

    pdfkit.from_url(url, pdf_path)

pdfkit supports generating PDFs from website URLs out of the box, just like Pyppeteer.
As the code above shows, pdfkit generates the PDF with a single line of code: pdfkit.from_url is all you need.
Generate PDF from Custom HTML content

    import pdfkit

    # HTML content
    html = '''
    <html>
      <head>
        <title>PDF Example</title>
      </head>
      <body>
        <h1>Hey, this will turn into a PDF!</h1>
      </body>
    </html>
    '''

    # PDF path to save
    pdf_path = 'example.pdf'

    # Create PDF
    pdfkit.from_string(html, pdf_path)

To generate a PDF from custom HTML content using python-pdfkit, you simply need to use pdfkit.from_string and provide the HTML content along with the path for the PDF file.
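Because pdfkit only wraps the wkhtmltopdf binary, a common failure is that the binary is missing from the PATH. The sketch below checks for it first and passes its location explicitly; find_wkhtmltopdf and html_to_pdf_checked are hypothetical helper names, and the option values are only examples.

```python
import shutil


def find_wkhtmltopdf():
    """Return the full path to the wkhtmltopdf binary, or None if it is not on PATH."""
    return shutil.which("wkhtmltopdf")


def html_to_pdf_checked(html, pdf_path):
    binary = find_wkhtmltopdf()
    if binary is None:
        raise RuntimeError("wkhtmltopdf was not found on PATH; install it first")

    import pdfkit  # imported lazily so the check above works without pdfkit

    # Point pdfkit at the binary we found and pass wkhtmltopdf command-line options.
    config = pdfkit.configuration(wkhtmltopdf=binary)
    options = {"page-size": "A4", "encoding": "UTF-8", "margin-top": "15mm"}
    return pdfkit.from_string(html, pdf_path, configuration=config, options=options)
```

The keys in options correspond to wkhtmltopdf command-line flags without the leading dashes, so anything the binary accepts on the command line can be set this way.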
iv. Playwright
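Playwright is a modern library for headless browser automation. It supports multiple browser engines (Chromium, Firefox, and WebKit) across platforms and languages, making it versatile for tasks like PDF generation. Note that PDF generation through page.pdf is only supported in headless Chromium, which is why the examples below launch Chromium.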
To use Python as a converter for HTML to PDF with Playwright, follow these steps:
Step 1: Install Playwright:

    pip install playwright
    playwright install

Step 2: Generate PDF from Website URL:

    import asyncio
    from playwright.async_api import async_playwright

    async def url_to_pdf(url, output_path):
        async with async_playwright() as p:
            browser = await p.chromium.launch()
            page = await browser.new_page()
            await page.goto(url)
            await page.pdf(path=output_path)
            await browser.close()

    # Example usage
    url = 'https://example.com'
    output_path = 'example_url.pdf'
    asyncio.run(url_to_pdf(url, output_path))

Step 3: Generate PDF from Custom HTML Content:

    import asyncio
    from playwright.async_api import async_playwright

    async def html_to_pdf(html_content, output_path):
        async with async_playwright() as p:
            browser = await p.chromium.launch()
            page = await browser.new_page()
            await page.set_content(html_content)
            await page.pdf(path=output_path)
            await browser.close()

    html_content = '''
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Sample HTML</title>
    </head>
    <body>
        <h1>Hello, World!</h1>
        <p>This is a sample HTML content to be converted to PDF.</p>
    </body>
    </html>
    '''
    output_path = 'custom_html.pdf'
    asyncio.run(html_to_pdf(html_content, output_path))

v. WeasyPrint
WeasyPrint is a visual rendering engine that follows the W3C specifications for HTML and CSS. It’s known for its excellent CSS support and ability to generate high-quality PDFs without requiring a browser, although it does depend on system libraries such as Pango.
Step 1: Install WeasyPrint:

    pip install weasyprint

Step 2: Generate PDF from Website URL:

    from weasyprint import HTML

    def url_to_pdf(url, output_path):
        HTML(url=url).write_pdf(output_path)

    # Example usage
    url = 'https://example.com'
    output_path = 'example_url.pdf'
    url_to_pdf(url, output_path)

Step 3: Generate PDF from Custom HTML Content:

    from weasyprint import HTML

    def html_to_pdf(html_content, output_path):
        HTML(string=html_content).write_pdf(output_path)

    html_content = '''
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>Sample HTML</title>
        <style>
            body { font-family: Arial, sans-serif; }
            h1 { color: navy; }
        </style>
    </head>
    <body>
        <h1>Hello, World!</h1>
        <p>This is a sample HTML content to be converted to PDF.</p>
    </body>
    </html>
    '''
    output_path = 'custom_html.pdf'
    html_to_pdf(html_content, output_path)

Performance Tips for HTML to PDF Conversion
When converting HTML to PDF in Python, keep these optimization tips in mind:
1. Batch Processing
If you need to generate multiple PDFs, process them in batches:
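
    import asyncio
    from concurrent.futures import ThreadPoolExecutor

    # For async libraries (Pyppeteer/Playwright).
    # jobs is a list of (url, pdf_path) pairs; generate_pdf is the async
    # function from the Pyppeteer example earlier in this article.
    async def batch_generate_async(jobs):
        tasks = [generate_pdf(url, pdf_path) for url, pdf_path in jobs]
        return await asyncio.gather(*tasks)

    # For sync libraries (xhtml2pdf, pdfkit).
    # jobs is a list of (html, pdf_path) pairs; convert_html_to_pdf is the
    # function from the xhtml2pdf example earlier in this article.
    def batch_generate_sync(jobs):
        with ThreadPoolExecutor(max_workers=4) as executor:
            futures = [executor.submit(convert_html_to_pdf, html, pdf_path)
                       for html, pdf_path in jobs]
            return [f.result() for f in futures]

2. Resource Management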
- Browser-based libraries: Reuse browser instances instead of creating new ones
- Memory: Clear cache and close pages after use
- Timeouts: Set reasonable timeouts to prevent hanging
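The three points above can be combined in one place. The sketch below assumes Playwright's async API and reuses a single browser for many pages, limits how many pages are open at once, sets a navigation timeout, and always closes each page. The names render_one, render_many, and main, along with the default limits, are illustrative choices and not part of Playwright.

```python
import asyncio


async def render_one(browser, url, pdf_path, semaphore, timeout_ms):
    # The semaphore caps how many pages are open at the same time.
    async with semaphore:
        page = await browser.new_page()
        try:
            await page.goto(url, timeout=timeout_ms)  # give up instead of hanging
            await page.pdf(path=pdf_path)
        finally:
            await page.close()  # release the page's memory even on failure
    return pdf_path


async def render_many(browser, jobs, max_pages=4, timeout_ms=30_000):
    """jobs is a list of (url, pdf_path) pairs rendered with one shared browser."""
    semaphore = asyncio.Semaphore(max_pages)
    tasks = [render_one(browser, url, path, semaphore, timeout_ms) for url, path in jobs]
    return await asyncio.gather(*tasks)


async def main(jobs):
    # Imported lazily so the helpers above can be used with any compatible browser object.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()  # launched once, reused for every job
        try:
            return await render_many(browser, jobs)
        finally:
            await browser.close()
```

A call such as asyncio.run(main([("https://example.com", "example.pdf")])) renders every job through the same Chromium process. Launching a browser is usually the most expensive step, so sharing one instance tends to matter more than any other tuning.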
3. Caching
For frequently generated PDFs, implement caching:

    import hashlib
    import os

    def get_cached_pdf(html_content):
        # Create hash of content
        content_hash = hashlib.md5(html_content.encode()).hexdigest()
        cache_path = f"cache/{content_hash}.pdf"

        if os.path.exists(cache_path):
            with open(cache_path, 'rb') as f:
                return f.read()

        # Generate and cache; generate_pdf is assumed here to return the PDF as bytes
        pdf = generate_pdf(html_content)
        os.makedirs("cache", exist_ok=True)
        with open(cache_path, 'wb') as f:
            f.write(pdf)
        return pdf

Comparing Python PDF Generation Libraries
Here’s a comparison of the five libraries to help you choose the right one for your needs:
| Library | Installation | JavaScript | CSS Support | Performance | Best For |
|---|---|---|---|---|---|
| Pyppeteer | Moderate (requires Chromium) | ✅ Full support | ✅ Excellent | Medium | Dynamic content, SPAs, modern web apps |
| xhtml2pdf | Easy (pure Python) | ❌ No support | ⚠️ Basic | Good | Simple documents, basic reports |
| python-pdfkit | Complex (needs wkhtmltopdf) | ⚠️ Limited | ✅ Good | Good | Static sites, invoices, reports |
| Playwright | Moderate (browser binaries) | ✅ Full support | ✅ Excellent | Good | Complex web apps, cross-browser needs |
| WeasyPrint | Moderate (system dependencies) | ❌ No support | ✅ Excellent | Good | Print-quality documents, precise layouts |
Quick Decision Guide
- Choose Pyppeteer if you need JavaScript execution and already use async Python
- Choose xhtml2pdf for simple HTML without JavaScript requirements
- Choose python-pdfkit for a balance between features and simplicity
- Choose Playwright for the most modern and robust browser automation
- Choose WeasyPrint for the best CSS compliance and print-quality output
A Better Alternative: Templated API
If you’re looking for a production-ready solution without the headaches, Templated provides a managed API that eliminates all the operational challenges of open-source libraries.
Why Templated?
1. Simple API - Just a Few Lines of Code:
import requests

    # Generate PDF with one API call
    response = requests.post(
        "https://api.templated.io/v1/render",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        json={
            "template": "your-template-id",
            "format": "pdf",
            "layers": {
                "name": {"text": "John Doe"},
                "invoice_no": {"text": "INV-001"}
            }
        }
    )

2. Visual Template Editor:
- Design templates with drag-and-drop
- Live preview as you build
- No coding required for template changes
- Team collaboration features
3. Production-Ready Features:
- 🚀 Fast: Generate PDFs in under 2 seconds
- 📈 Scalable: Handle thousands of concurrent requests
- 🔄 99.9% Uptime: Reliable service guarantee
4. Zero Maintenance:
- No browser installations
- No memory management
- No dependency updates
- No server scaling issues
Try Templated Free
🎯 Special Offer: Get 50 free credits. No credit card required.
Quick Start in 3 Steps:
Step 1: Sign Up (30 seconds)
- Create your free account at app.templated.io
- Get instant API access
Step 2: Create Template (2 minutes)
- Use our visual editor or HTML
- Add dynamic variables
- Preview instantly
Step 3: Generate PDFs (5 lines of code)

    import requests

    result = requests.post(
        "https://api.templated.io/v1/render",
        headers={"Authorization": "Bearer YOUR_KEY"},
        json={"template": "template-id", "format": "pdf", "layers": {}}
    ).json()

Free Online HTML to PDF Converter
Need to convert HTML to PDF right now? Try our free online converter - no signup required!
Conclusion
We’ve explored 5 popular Python libraries for HTML to PDF conversion. Each has its place:
Open-source libraries are great for:
- Learning and prototyping
- Simple, low-volume applications
- Complete control over the rendering process
Templated API is ideal for:
- Production applications
- High-volume PDF generation
- Teams that want to focus on business logic, not infrastructure
The choice depends on your specific needs. If you’re building a production application and want to avoid the complexity of managing browser dependencies and scaling issues, try Templated free with 50 API credits.
Other Languages
Learn how to convert HTML to PDF in other programming languages:
- Convert HTML to PDF with Java
- Convert HTML to PDF with C#
- Convert HTML to PDF with PHP
- Convert HTML to PDF with Node.js
Frequently Asked Questions
Q: Which Python library is best for HTML to PDF conversion?
A: For simple documents, use xhtml2pdf. For JavaScript-heavy pages, use Playwright or Pyppeteer. For production applications, consider a managed API like Templated.
Q: Can I convert HTML to PDF without installing browsers?
A: Yes, xhtml2pdf and WeasyPrint don’t require browsers, but they do not execute JavaScript. For full JavaScript support without managing browsers, use a cloud API.
Q: How can I generate PDFs at scale in Python?
A: For high-volume PDF generation, use either Playwright with proper resource management or a managed API service that handles scaling automatically.
Q: What’s the fastest way to implement PDF generation in production?
A: While open-source libraries are great for learning, they require significant setup and maintenance (browser dependencies, memory management, error handling). For production applications, Templated’s API gets you running in under 5 minutes with just a few lines of code, handles all scaling automatically, and includes a visual template editor so non-developers can update templates without code changes. This typically saves 40+ hours of development and maintenance time.
Ready to generate PDFs without the hassle?
Start your free trial and get 50 free API calls per month. No credit card required.