1tomany/pdf-to-image-bundle

This package is abandoned and no longer maintained. The author suggests using the 1tomany/pdf-ai-bundle package instead.

Symfony bundle for the 1tomany/pdf-ai library

Type: symfony-bundle

v0.5.1 2025-09-11 19:41 UTC

This package is auto-updated.

Last update: 2025-09-11 20:28:31 UTC


README

PDFAI is a simple PHP library that makes it easy to extract data from PDFs for large language models.

Install PDFAI

composer require 1tomany/pdf-ai-bundle

Usage

Symfony autowires the necessary classes once the bundle is installed. Type a constructor argument with OneToMany\PDFAI\Contract\Action\ExtractDataActionInterface or OneToMany\PDFAI\Contract\Action\ReadMetadataActionInterface, then call its act() method to interact with the concrete extractor client.

<?php

namespace App\File\Action\Handler;

use OneToMany\PDFAI\Contract\Action\ExtractDataActionInterface;
use OneToMany\PDFAI\Contract\Action\ReadMetadataActionInterface;
use OneToMany\PDFAI\Request\ExtractDataRequest;
use OneToMany\PDFAI\Request\ExtractTextRequest;
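use OneToMany\PDFAI\Request\ReadMetadataRequest;
// NOTE: OutputType (used below) must be imported as well; its namespace is not shown in this README.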

final readonly class UploadFileHandler
{
    public function __construct(
        private ReadMetadataActionInterface $readMetadataAction,
        private ExtractDataActionInterface $extractDataAction,
    ) {
    }

    public function handle(string $filePath): void
    {
        // Read PDF metadata like page count
        $metadata = $this->readMetadataAction->act(
            new ReadMetadataRequest($filePath)
        );

        // Rasterize all pages of a PDF to a 150 DPI PNG
        $request = new ExtractDataRequest(
            $filePath,       // Full path to PDF file
            1,               // First page to extract
            null,            // Last page to extract, NULL for all pages
            OutputType::Png, // Jpg and Txt are other options
            150,             // Output resolution in dots per inch
        );
        
        // @see OneToMany\PDFAI\Contract\Response\ExtractedDataResponseInterface
        foreach ($this->extractDataAction->act($request) as $image) {
            // $image->getData() or $image->toDataUri()
        }
        
        // Extract text from pages 2 through 8
        $request = new ExtractTextRequest($filePath, 2, 8);
        
        // @see OneToMany\PDFAI\Contract\Response\ExtractedDataResponseInterface
        foreach ($this->extractDataAction->act($request) as $text) {
            // $text->getData() or $text->toDataUri()
        }
    }
}
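
The loops above leave the handling of each result up to you. As a minimal sketch, the helper below writes every extracted page to disk. It assumes only that each item exposes getData(), as the comments above suggest, and that getData() returns the raw bytes as a string. The function name saveExtractedPages() and the file naming scheme are hypothetical and not part of the library.

```php
<?php

declare(strict_types=1);

/**
 * Writes each extracted item to $directory and returns the written paths.
 *
 * Assumption: every item has a getData() method returning raw bytes as a
 * string. Files are numbered by iteration order, which is not necessarily
 * the page number inside the PDF.
 *
 * @param iterable<object> $pages
 *
 * @return list<string>
 */
function saveExtractedPages(iterable $pages, string $directory, string $extension = 'png'): array
{
    // Create the target directory if needed (re-check to tolerate races).
    if (!is_dir($directory) && !mkdir($directory, 0775, true) && !is_dir($directory)) {
        throw new \RuntimeException(sprintf('Unable to create directory "%s".', $directory));
    }

    $paths = [];
    $index = 0;

    foreach ($pages as $page) {
        ++$index;
        $path = sprintf('%s/page-%03d.%s', rtrim($directory, '/'), $index, $extension);

        if (false === file_put_contents($path, $page->getData())) {
            throw new \RuntimeException(sprintf('Unable to write "%s".', $path));
        }

        $paths[] = $path;
    }

    return $paths;
}
```

Inside handle(), this could be called as saveExtractedPages($this->extractDataAction->act($request), '/tmp/pages'). Because the results are consumed one at a time, only a single page needs to be held in memory at once.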

Testing

To avoid running an external process in your test environment, use the MockExtractorClient: set the 1tomany.pdfai_extractor_client parameter to mock in your Symfony service configuration for the test environment.

when@test:
    parameters:
        1tomany.pdfai_extractor_client: 'mock'

Without changing any other code, Symfony will automatically inject the MockExtractorClient instead of the default PopplerExtractorClient for your tests.

Extending

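Don't want to use Poppler? No problem! Create your own extractor class that implements the OneToMany\PDFAI\Contract\Client\ExtractorClientInterface interface, tag it with 1tomany.pdfai_extractor_client and a unique key, then set the 1tomany.pdfai_extractor_client parameter to that key.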

<?php

namespace App\File\Service\PDFAI\Client\Magick;

use OneToMany\PDFAI\Contract\Client\ExtractorClientInterface;
use OneToMany\PDFAI\Contract\Request\ExtractDataRequestInterface;
use OneToMany\PDFAI\Contract\Request\ReadMetadataRequestInterface;
use OneToMany\PDFAI\Contract\Response\MetadataResponseInterface;

class MagickExtractorClient implements ExtractorClientInterface
{
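    public function readMetadata(ReadMetadataRequestInterface $request): MetadataResponseInterface
    {
        // Add your implementation here
        throw new \LogicException('readMetadata() is not implemented yet.');
    }

    public function extractData(ExtractDataRequestInterface $request): \Generator
    {
        // Add your implementation here
        throw new \LogicException('extractData() is not implemented yet.');
    }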
}

parameters:
    1tomany.pdfai_extractor_client: 'magick'

services:
    App\File\Service\PDFAI\Client\Magick\MagickExtractorClient:
        tags:
            - { name: 1tomany.pdfai_extractor_client, key: magick }

That's it! Again, without changing any code, Symfony will automatically inject the correct extractor client for the action interfaces outlined above.
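
Since extractData() is declared to return a \Generator, a custom client will most likely yield one result per page rather than build a full array. The sketch below shows that pattern in isolation. The function extractPages() and the $renderPage callback are hypothetical stand-ins for whatever tool renders a single page, and the yielded strings stand in for the library's response objects.

```php
<?php

declare(strict_types=1);

/**
 * Lazily yields one rendered result per page in [$firstPage, $lastPage].
 *
 * $renderPage is a hypothetical callback that renders a single page.
 * Because this function is a generator, none of its body (including the
 * range check) runs until the caller starts iterating.
 *
 * @param callable(int): string $renderPage
 *
 * @return \Generator<int, string>
 */
function extractPages(int $firstPage, int $lastPage, callable $renderPage): \Generator
{
    if ($firstPage < 1 || $lastPage < $firstPage) {
        throw new \InvalidArgumentException('Invalid page range.');
    }

    for ($page = $firstPage; $page <= $lastPage; ++$page) {
        // Each page is rendered only when the caller requests the next item.
        yield $page => $renderPage($page);
    }
}

// Usage: nothing is rendered until the foreach starts pulling items.
foreach (extractPages(2, 4, static fn (int $page): string => "page {$page}") as $number => $data) {
    // $number is the page number, $data the rendered result
}
```

A real client would also need to resolve a NULL last page to the document's page count, for example by reading the metadata first.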

Run Static Analysis

./vendor/bin/phpstan

Credits

License

The MIT License