How To Generate PDFs and Images With Pyppeteer

Learn how Pyppeteer simplifies PDF generation and web page screenshot capture in Python and how to automate these tasks efficiently for your web development projects.

Understanding the Basics of Generating PDFs with Pyppeteer

Pyppeteer simplifies PDF generation from web pages. Start by installing Pyppeteer (pip install pyppeteer); on first launch it downloads a compatible Chromium build. Then write code that navigates to a URL and generates a PDF, with options for page size, orientation, and margins.

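What is the difference between Pyppeteer and Puppeteer?

Puppeteer is a Node.js library for controlling Chrome or Chromium. Pyppeteer is an unofficial Python port of it, so the method names used below (newPage(), goto(), pdf(), screenshot()) mirror the JavaScript API.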

Sample Code: Basic PDF Generation

import asyncio
from pyppeteer import launch

async def generate_pdf(url):
    # Start headless Chromium and open a new tab
    browser = await launch()
    page = await browser.newPage()
    await page.goto(url)
    # Render the loaded page to a PDF file
    await page.pdf(path='example.pdf')
    await browser.close()

url = 'https://example.com'
asyncio.run(generate_pdf(url))

Step-by-Step Guide to Generating PDFs

1. Install Pyppeteer and its dependencies.
2. Import asyncio and the launch function from pyppeteer.
3. Launch a browser instance and create a new page.
4. Navigate to the URL and wait for the page to finish loading.
5. Generate the PDF with the pdf() method, which saves it to the path you pass in.

Sample Code: Detailed PDF Generation Steps

import asyncio
from pyppeteer import launch

async def detailed_pdf_generation(url):
    browser = await launch()
    page = await browser.newPage()
    # networkidle2: treat the page as loaded once at most two connections remain open
    await page.goto(url, {'waitUntil': 'networkidle2'})
    # Save the rendered page as an A4 PDF
    await page.pdf(path='detailed_example.pdf', format='A4')
    await browser.close()

url = 'https://example.com'
asyncio.run(detailed_pdf_generation(url))
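
If navigation or rendering raises an exception, the steps above never reach browser.close(), which can leave a Chromium process running. The following is a minimal sketch of one way to guard against that with try/finally. The function name render_pdf and its launcher parameter are assumptions made for illustration; the parameter exists only so the function can be exercised with a stand-in browser, and by default it falls back to pyppeteer's launch.

```python
async def render_pdf(url, output_path, launcher=None):
    """Render url to a PDF and always close the browser (hypothetical helper)."""
    if launcher is None:
        # Imported lazily so this module loads even where pyppeteer is absent
        from pyppeteer import launch as launcher
    browser = await launcher()
    try:
        page = await browser.newPage()
        # Wait until at most two network connections remain open
        await page.goto(url, {'waitUntil': 'networkidle2'})
        await page.pdf({'path': output_path, 'format': 'A4'})
    finally:
        # Runs on success and on failure, so Chromium is not left running
        await browser.close()
    return output_path
```

It could be called as asyncio.run(render_pdf('https://example.com', 'detailed_example.pdf')), which starts a real browser.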

Customizing PDF Generation

Pyppeteer offers options like page format, landscape mode, and scale factor. You can also inject custom CSS styles or manipulate page content before generating the PDF.

Sample Code: Customizing PDF Appearance

import asyncio
from pyppeteer import launch

async def customize_pdf(url):
    browser = await launch()
    page = await browser.newPage()
    await page.goto(url)
    # A4 paper, landscape orientation, content rendered at 80% size
    await page.pdf(path='customized.pdf', format='A4', landscape=True, scale=0.8)
    await browser.close()

url = 'https://example.com'
asyncio.run(customize_pdf(url))

Tips for Optimizing PDF Generation

To keep PDFs small and reliable, avoid large images and complex layouts where you can, use waitForSelector() to wait for key elements before rendering, and experiment with the scale option to balance file size against readability.

Sample Code: Optimizing PDF File Size

import asyncio
from pyppeteer import launch

async def optimize_pdf(url):
    browser = await launch()
    page = await browser.newPage()
    await page.goto(url)
    # Wait until the main content element exists before rendering
    await page.waitForSelector('#content')
    # Render at 75% scale
    await page.pdf(path='optimized.pdf', scale=0.75)
    await browser.close()

url = 'https://example.com'
asyncio.run(optimize_pdf(url))
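
How much the scale option changes file size depends on the page, so it may be worth measuring rather than guessing. Below is a small sketch that writes one PDF per scale factor and reports the resulting sizes. The function name pdf_sizes_by_scale, the default scales, and the output file naming are assumptions for illustration; page is expected to be an already-loaded Pyppeteer page.

```python
import os


async def pdf_sizes_by_scale(page, scales=(0.5, 0.75, 1.0), prefix='scaled'):
    """Write one PDF per scale factor and return {scale: size_in_bytes}."""
    sizes = {}
    for scale in scales:
        # Chromium's print-to-PDF accepts scale factors between 0.1 and 2
        if not 0.1 <= scale <= 2:
            raise ValueError(f'scale {scale} is outside 0.1..2')
        path = f'{prefix}_{scale}.pdf'
        await page.pdf({'path': path, 'scale': scale})
        sizes[scale] = os.path.getsize(path)
    return sizes
```

Comparing the returned sizes alongside a visual check of each file gives a concrete basis for picking a scale.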

Generating Images with Pyppeteer

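Pyppeteer also supports capturing screenshots of web pages. Use the screenshot() method with the path option to choose the output file and fullPage to capture the entire scrollable page rather than just the visible viewport. To capture a specific element, use the clip option or call screenshot() on an element handle.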

Sample Code: Capturing a Full-Page Screenshot

import asyncio
from pyppeteer import launch

async def capture_screenshot(url):
    browser = await launch()
    page = await browser.newPage()
    await page.goto(url)
    # fullPage captures the whole scrollable page, not just the viewport
    await page.screenshot({'path': 'fullpage.png', 'fullPage': True})
    await browser.close()

url = 'https://example.com'
asyncio.run(capture_screenshot(url))
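
Capturing a single element rather than the full page can be sketched as follows, using querySelector() to get an element handle and then the handle's own screenshot() method. The helper name screenshot_element and the LookupError it raises are assumptions for illustration; page is expected to be an already-loaded Pyppeteer page.

```python
async def screenshot_element(page, selector, path):
    """Capture only the first element matching selector (hypothetical helper)."""
    element = await page.querySelector(selector)
    if element is None:
        # querySelector returns None when nothing matches the selector
        raise LookupError(f'no element matches {selector!r}')
    # The element handle's screenshot() crops the image to the element
    await element.screenshot({'path': path})
    return path
```

On https://example.com, a call such as await screenshot_element(page, 'h1', 'heading.png') would save just the page heading.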

Enhancing Image Generation

Adjust the viewport size or emulate specific devices to capture images that reflect the appearance on different screens.

Sample Code: Emulating Devices for Screenshots

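import asyncio
from pyppeteer import launch

# Device descriptor written out by hand (values approximate an iPhone 6);
# emulate() only needs a viewport and a user agent.
devices = {
    'iPhone 6': {
        'viewport': {'width': 375, 'height': 667, 'deviceScaleFactor': 2,
                     'isMobile': True, 'hasTouch': True},
        'userAgent': ('Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) '
                      'AppleWebKit/604.1.38 (KHTML, like Gecko) '
                      'Version/11.0 Mobile/15A372 Safari/604.1'),
    },
}

async def emulate_device_screenshot(url, device_name):
    device = devices[device_name]
    browser = await launch()
    page = await browser.newPage()
    # Apply the viewport and user agent before navigating
    await page.emulate(device)
    await page.goto(url)
    await page.screenshot({'path': f'{device_name.replace(" ", "_")}.png'})
    await browser.close()

url = 'https://example.com'
device_name = 'iPhone 6'
asyncio.run(emulate_device_screenshot(url, device_name))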
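
When only the screen size matters, adjusting the viewport with setViewport() may be enough without a full device descriptor. The sketch below captures the same URL at several viewport sizes. The VIEWPORTS table, its dimensions, and the function name screenshot_viewports are illustrative assumptions, not an official device list; page is expected to be a Pyppeteer page.

```python
# Illustrative sizes only; pick the breakpoints that matter for your site
VIEWPORTS = {
    'mobile': {'width': 375, 'height': 667},
    'tablet': {'width': 768, 'height': 1024},
    'desktop': {'width': 1440, 'height': 900},
}


async def screenshot_viewports(page, url, viewports=VIEWPORTS):
    """Save one screenshot per named viewport and return the file paths."""
    paths = []
    for name, viewport in viewports.items():
        await page.setViewport(viewport)
        # Load the page after resizing so the layout matches the new viewport
        await page.goto(url)
        path = f'{name}.png'
        await page.screenshot({'path': path})
        paths.append(path)
    return paths
```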

Advanced Techniques for Dynamic Images

Use evaluate() to run JavaScript in the page before taking the screenshot, for example to update content or render dynamic data, so that the image reflects the modified page.

Sample Code: Using JavaScript for Dynamic Screenshots

import asyncio
from pyppeteer import launch

async def dynamic_screenshot(url):
    browser = await launch()
    page = await browser.newPage()
    await page.goto(url)
    # Execute JavaScript to modify visible content (example.com has an <h1>);
    # changing document.title alone would not show up in the image
    await page.evaluate('() => { document.querySelector("h1").textContent = "Screenshot Title"; }')
    await page.screenshot({'path': 'dynamic.png'})
    await browser.close()

url = 'https://example.com'
asyncio.run(dynamic_screenshot(url))
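
evaluate() can also take arguments from Python and return a value from the page, which is useful when the image should include data computed at run time. The following sketch inserts a banner with caller-supplied text and returns the resulting document height. The names BANNER_JS and capture_with_banner, and the banner styling, are assumptions for illustration; page is expected to be an already-loaded Pyppeteer page.

```python
# JavaScript run inside the page: adds a banner and reports the page height
BANNER_JS = '''(text) => {
    const banner = document.createElement('div');
    banner.textContent = text;
    banner.style.cssText = 'padding:8px;background:#ffd;font:14px sans-serif';
    document.body.prepend(banner);
    return document.body.scrollHeight;
}'''


async def capture_with_banner(page, text, path):
    """Stamp text at the top of the page, then take a full-page screenshot."""
    # Extra arguments to evaluate() are serialized and passed to the JS function
    height = await page.evaluate(BANNER_JS, text)
    await page.screenshot({'path': path, 'fullPage': True})
    return height
```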

Other Python libraries

Other Python libraries can also convert HTML to PDF. For more information, see the article How To Convert HTML to PDF with Python.

Conclusion

Pyppeteer is a powerful tool for automating PDF and image generation, suitable for a variety of applications from business processes to visualizations and web scraping. With these examples, you can start generating custom PDFs and images for your projects.
