codeinc/pdf2txt-client

This package is abandoned and no longer maintained. The author suggests using the codeinc/document-cloud-client package instead.

A PHP client for the pdf2txt service

v1.13 2024-12-10 18:06 UTC

Last update: 2024-12-11 01:39:34 UTC


README

Important

This client is deprecated and has been replaced by the Document Cloud PHP Client.

This repository contains a PHP 8.2+ library for converting PDF files to text using the pdf2txt service.

Installation

The library is available on Packagist. The recommended way to install it is via Composer:

composer require codeinc/pdf2txt-client

Usage

This client requires a running instance of the pdf2txt service. The service can be run locally using Docker or deployed to a server.
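
Because the client depends on a running service, it can help to confirm that something is listening at the base URI before sending a document. The sketch below is not part of this library: `serviceHostAndPort()` and `isServiceReachable()` are hypothetical helpers built only on PHP's standard `parse_url()` and `fsockopen()`. The check only shows that a TCP port accepts connections, not that the listener is actually pdf2txt.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of codeinc/pdf2txt-client): splits a base URI
 * into [host, port], falling back to 443 for https and 80 otherwise.
 */
function serviceHostAndPort(string $baseUri): array
{
    $parts = parse_url($baseUri);
    if ($parts === false || !isset($parts['host'])) {
        throw new InvalidArgumentException("Invalid base URI: $baseUri");
    }
    $defaultPort = ($parts['scheme'] ?? 'http') === 'https' ? 443 : 80;

    return [$parts['host'], $parts['port'] ?? $defaultPort];
}

/**
 * Hypothetical helper: returns true when a TCP connection to the host and
 * port of $baseUri can be opened within the timeout.
 */
function isServiceReachable(string $baseUri, float $timeoutSeconds = 2.0): bool
{
    [$host, $port] = serviceHostAndPort($baseUri);

    // Suppress the warning fsockopen() emits on failure; the return value is enough.
    $socket = @fsockopen($host, $port, $errorCode, $errorMessage, $timeoutSeconds);
    if ($socket === false) {
        return false;
    }
    fclose($socket);

    return true;
}
```

Calling `isServiceReachable('http://localhost:3000/')` before constructing `Pdf2TxtClient` lets a script fail early with a clear message instead of surfacing a connection error in the middle of a conversion.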

Examples

Extracting text from a local file:

use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\Exception;

$apiBaseUri = 'http://localhost:3000/';
$localPdfPath = '/path/to/local/file.pdf';

try {
    // convert
    $client = new Pdf2TxtClient($apiBaseUri);
    $stream = $client->extract(
        $client->createStreamFromFile($localPdfPath)
    );
    
    // display the text
    echo (string)$stream;
}
catch (Exception $e) {
    // handle exception
}
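
Before uploading, a script may want to reject paths that are missing, unreadable, or clearly not PDF files, so that such mistakes do not cost a round trip to the service. The following is a minimal sketch using only PHP built-ins; `looksLikePdf()` is a hypothetical helper, not something this library provides, and it performs a strict check for the `%PDF-` signature at the very start of the file.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of codeinc/pdf2txt-client): returns true when
 * $path is a readable regular file whose first five bytes are "%PDF-".
 */
function looksLikePdf(string $path): bool
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    try {
        // Compare the first five bytes with the PDF file signature.
        return fread($handle, 5) === '%PDF-';
    } finally {
        // Runs even though the try block returns, so the handle is always closed.
        fclose($handle);
    }
}
```

In the example above, the check would go just before `createStreamFromFile($localPdfPath)`, for instance by throwing an exception when `looksLikePdf($localPdfPath)` returns false.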

With additional options:

use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\ConvertOptions;
use CodeInc\Pdf2TxtClient\Format;
use CodeInc\Pdf2TxtClient\Exception;

$apiBaseUri = 'http://localhost:3000/';
$localPdfPath = '/path/to/local/file.pdf';
$convertOption = new ConvertOptions(
    firstPage: 2,
    lastPage: 3,
    format: Format::json
);

try {
    $client = new Pdf2TxtClient($apiBaseUri);

    // convert 
    $jsonResponse = $client->extract(
        $client->createStreamFromFile($localPdfPath),
        $convertOption
    );
    
    // display the text in a JSON format
    $decodedJson = $client->processJsonResponse($jsonResponse);
    var_dump($decodedJson);
}
catch (Exception $e) {
    // handle exception
}

Saving the extracted text to a file:

use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\Exception;

$apiBaseUri = 'http://localhost:3000/';
$localPdfPath = '/path/to/local/file.pdf';
$destinationTextPath = '/path/to/local/file.txt';

try {
    $client = new Pdf2TxtClient($apiBaseUri);

    // convert
    $stream = $client->extract(
        $client->createStreamFromFile($localPdfPath)
    );
    
    // save the text to a file
    $client->saveStreamToFile($stream, $destinationTextPath);
}
catch (Exception $e) {
    // handle exception
}

License

The library is published under the MIT license (see LICENSE file).