php-llm/llm-chain

A slim PHP component with tooling around LLMs.

0.7.1 2024-10-04 09:47 UTC

PHP library for building LLM-based features and applications.

This library is not stable yet and is still rather experimental. Feel free to try it out, give feedback, ask questions, contribute or share your use cases. Abstractions, concepts and interfaces are not final and potentially subject to change.

Requirements

  • PHP 8.2 or higher

Installation

The recommended way to install LLM Chain is through Composer:

composer require php-llm/llm-chain

When using Symfony Framework, check out the integration bundle php-llm/llm-chain-bundle.

Examples

See the examples folder to run example implementations using this library. Depending on the example, you need to export different environment variables for API keys or deployment configurations, or create a .env.local based on the .env file.

To run all examples, use make run-all-examples or php example.
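When wiring up your own scripts, it can help to fail early if a required environment variable is missing. The following is a minimal sketch in plain PHP; the helper name requireEnv is hypothetical and not part of LLM Chain:

```php
<?php

/**
 * Returns the value of an environment variable or throws if it is not set.
 * Hypothetical helper for illustration, not provided by the library.
 */
function requireEnv(string $name): string
{
    // $_ENV may be empty depending on the variables_order setting, so fall back to getenv()
    $value = $_ENV[$name] ?? getenv($name);

    if (!is_string($value) || '' === $value) {
        throw new RuntimeException(sprintf('Environment variable "%s" is not set.', $name));
    }

    return $value;
}

// Usage: $apiKey = requireEnv('OPENAI_API_KEY');
```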

Basic Concepts & Usage

Models & Platforms

LLM Chain distinguishes two main types of models: Language Models and Embeddings Models.

Language Models, like GPT, Claude and Llama, are the essential centerpiece of LLM applications, and Embeddings Models are supporting models that provide vector representations of text.

These models are provided by different platforms, like OpenAI, Azure, Replicate, and others.

Example Instantiation

use PhpLlm\LlmChain\OpenAI\Model\Embeddings;
use PhpLlm\LlmChain\OpenAI\Model\Gpt;
use PhpLlm\LlmChain\OpenAI\Model\Gpt\Version;
use PhpLlm\LlmChain\OpenAI\Platform\OpenAI;
use Symfony\Component\HttpClient\HttpClient;

// Platform: OpenAI
$platform = new OpenAI(HttpClient::create(), $_ENV['OPENAI_API_KEY']);

// Language Model: GPT (OpenAI)
$llm = new Gpt($platform, Version::gpt4oMini()); 

// Embeddings Model: Embeddings (OpenAI)
$embeddings = new Embeddings($platform);

Supported Models & Platforms

See issue #28 for planned support of other models and platforms.

Chain & Messages

The core feature of LLM Chain is to interact with language models via messages. This interaction is done by sending a MessageBag to a Chain, which takes care of LLM invocation and response handling.

Messages can be of different types, most importantly UserMessage, SystemMessage, or AssistantMessage, and can also have different content types, like Text or Image.

Example Chain call with messages

use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\Message\MessageBag;
use PhpLlm\LlmChain\Message\SystemMessage;
use PhpLlm\LlmChain\Message\UserMessage;

// LLM instantiation

$chain = new Chain($llm);
$messages = new MessageBag(
    new SystemMessage('You are a helpful chatbot answering questions about LLM Chain.'),
    new UserMessage('Hello, how are you?'),
);
$response = $chain->call($messages);

echo $response->getContent(); // "I'm fine, thank you. How can I help you today?"

The MessageInterface and Content interfaces let you customize this process if needed, e.g. for additional state handling.

Code Examples

  1. Anthropic's Claude: chat-claude-anthropic.php
  2. OpenAI's GPT with Azure: chat-gpt-azure.php
  3. OpenAI's GPT: chat-gpt-openai.php
  4. OpenAI's o1: chat-o1-openai.php

Tools

To integrate LLMs with your application, LLM Chain supports tool calling out of the box. Tools are services that can be called by the LLM to provide additional features or process data.

Tool calling can be enabled by registering the processors in the chain:

use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\ToolBox\ChainProcessor;
use PhpLlm\LlmChain\ToolBox\ToolAnalyzer;
use PhpLlm\LlmChain\ToolBox\ToolBox;

$yourTool = new YourTool();

$toolBox = new ToolBox(new ToolAnalyzer(), [$yourTool]);
$toolProcessor = new ChainProcessor($toolBox);

$chain = new Chain($llm, inputProcessor: [$toolProcessor], outputProcessor: [$toolProcessor]);

Custom tools can basically be any class, but must be configured with the #[AsTool] attribute.

use PhpLlm\LlmChain\ToolBox\Attribute\AsTool;

#[AsTool('company_name', 'Provides the name of your company')]
final class CompanyName
{
    public function __invoke(): string
    {
        return 'ACME Corp.';
    }
}
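Tools can also accept arguments and depend on other services. The following sketch assumes a hypothetical OrderStatus tool backed by a plain array; only the #[AsTool] attribute is taken from the library, everything else (class name, tool name, data) is made up for illustration:

```php
<?php

use PhpLlm\LlmChain\ToolBox\Attribute\AsTool;

#[AsTool('order_status', 'Returns the shipping status for a given order number')]
final class OrderStatus
{
    /**
     * @param array<string, string> $statuses Map of order number to status (stand-in for a real repository)
     */
    public function __construct(private array $statuses)
    {
    }

    /**
     * @param string $orderNumber The order number to look up
     */
    public function __invoke(string $orderNumber): string
    {
        // Return a readable fallback instead of failing, so the LLM can relay it to the user
        return $this->statuses[$orderNumber] ?? sprintf('No order found for "%s".', $orderNumber);
    }
}

// The tool is a regular invokable class and can be exercised without an LLM
$tool = new OrderStatus(['A-1001' => 'shipped']);
echo $tool('A-1001'), PHP_EOL; // prints "shipped"
```

Because the tool is an ordinary class, it can be unit tested on its own before it is registered in the ToolBox.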

Code Examples (with built-in tools)

  1. Clock Tool: toolbox-clock.php
  2. SerpAPI Tool: toolbox-serpapi.php
  3. Weather Tool: toolbox-weather.php
  4. Wikipedia Tool: toolbox-wikipedia.php
  5. YouTube Transcriber Tool: toolbox-youtube.php (with streaming)

Document Embedding, Vector Stores & Similarity Search (RAG)

LLM Chain supports document embedding and similarity search using vector stores like ChromaDB, Azure AI Search, MongoDB Atlas Search, or Pinecone.

For populating a vector store, LLM Chain provides the service DocumentEmbedder, which requires an instance of an EmbeddingsModel and an instance of StoreInterface, and works with a collection of Document objects as input:

use PhpLlm\LlmChain\DocumentEmbedder;
use PhpLlm\LlmChain\OpenAI\Model\Embeddings;
use PhpLlm\LlmChain\OpenAI\Platform\OpenAI;
use PhpLlm\LlmChain\Store\Pinecone\Store;
use Probots\Pinecone\Pinecone;
use Symfony\Component\HttpClient\HttpClient;

$embedder = new DocumentEmbedder(
    new Embeddings(new OpenAI(HttpClient::create(), $_ENV['OPENAI_API_KEY'])),
    new Store(Pinecone::client($_ENV['PINECONE_API_KEY'], $_ENV['PINECONE_HOST'])),
);
$embedder->embed($documents);

The collection of Document instances is usually created from text representations of your domain entities:

use PhpLlm\LlmChain\Document\Metadata;
use PhpLlm\LlmChain\Document\TextDocument;

$documents = [];
foreach ($entities as $entity) {
    $documents[] = new TextDocument(
        id: $entity->getId(),                       // UUID instance
        content: $entity->toString(),               // Text representation of relevant data for embedding
        metadata: new Metadata($entity->toArray()), // Array representation of entity to be stored additionally
    );
}

Note

Not all data needs to be stored in the vector store. You can also hydrate the original data entry based on the ID or metadata after retrieval from the store.

In the end the chain is used in combination with a retrieval tool on top of the vector store, e.g. the built-in SimilaritySearch tool provided by the library:

use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;
use PhpLlm\LlmChain\OpenAI\Model\Gpt;
use PhpLlm\LlmChain\ToolBox\ChainProcessor;
use PhpLlm\LlmChain\ToolBox\Tool\SimilaritySearch;
use PhpLlm\LlmChain\ToolBox\ToolAnalyzer;
use PhpLlm\LlmChain\ToolBox\ToolBox;

// Initialize Platform and LLM

$similaritySearch = new SimilaritySearch($embeddings, $store);
$toolBox = new ToolBox(new ToolAnalyzer(), [$similaritySearch]);
$processor = new ChainProcessor($toolBox);
$chain = new Chain(new Gpt($platform), [$processor], [$processor]);

$messages = new MessageBag(
    Message::forSystem(<<<PROMPT
        Please answer all user questions only using the similarity_search tool. Do not add information and if you cannot
        find an answer, say so.
        PROMPT),
    Message::ofUser('...') // The user's question.
);
$response = $chain->call($messages);

Code Examples

  1. MongoDB Store: store-mongodb-similarity-search.php
  2. Pinecone Store: store-pinecone-similarity-search.php

Supported Stores

See issue #28 for planned support of other models and platforms.

Advanced Usage & Features

Structured Output

A typical use-case of LLMs is to classify and extract data from unstructured sources. Some models support this through features like Structured Output or by providing a Response Format.

PHP Classes as Output

LLM Chain supports that use-case by abstracting the hassle of defining and providing schemas to the LLM and converting the response back to PHP objects.

To achieve this, a specific chain processor needs to be registered:

use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;
use PhpLlm\LlmChain\StructuredOutput\ChainProcessor;
use PhpLlm\LlmChain\StructuredOutput\ResponseFormatFactory;
use PhpLlm\LlmChain\Tests\StructuredOutput\Data\MathReasoning;
use Symfony\Component\Serializer\Encoder\JsonEncoder;
use Symfony\Component\Serializer\Normalizer\ObjectNormalizer;
use Symfony\Component\Serializer\Serializer;

// Initialize Platform and LLM

$serializer = new Serializer([new ObjectNormalizer()], [new JsonEncoder()]);
$processor = new ChainProcessor(new ResponseFormatFactory(), $serializer);
$chain = new Chain($llm, [$processor], [$processor]);

$messages = new MessageBag(
    Message::forSystem('You are a helpful math tutor. Guide the user through the solution step by step.'),
    Message::ofUser('how can I solve 8x + 7 = -23'),
);
$response = $chain->call($messages, ['output_structure' => MathReasoning::class]);

dump($response->getContent()); // returns an instance of `MathReasoning` class
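To give an idea of what an output class could look like, here is a sketch with two hypothetical classes, MathSolution and SolutionStep. They are not the library's MathReasoning test class, and the JSON below is a hand-written sample rather than real model output. The manual mapping only illustrates the shape of the data; in the chain, the conversion is done by the serializer passed to the processor:

```php
<?php

// Hypothetical output classes for illustration; names and fields are assumptions
final class SolutionStep
{
    public function __construct(
        public string $explanation,
        public string $output,
    ) {
    }
}

final class MathSolution
{
    /**
     * @param SolutionStep[] $steps
     */
    public function __construct(
        public array $steps,
        public string $finalAnswer,
    ) {
    }
}

// Hand-written sample of a JSON payload matching the structure above
$json = <<<'JSON'
    {
        "steps": [
            {"explanation": "Subtract 7 from both sides.", "output": "8x = -30"},
            {"explanation": "Divide both sides by 8.", "output": "x = -3.75"}
        ],
        "finalAnswer": "x = -3.75"
    }
    JSON;

$data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

// Manual mapping from the decoded array to typed objects
$solution = new MathSolution(
    array_map(
        static fn (array $step): SolutionStep => new SolutionStep($step['explanation'], $step['output']),
        $data['steps'],
    ),
    $data['finalAnswer'],
);

echo $solution->finalAnswer, PHP_EOL; // prints "x = -3.75"
```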

Array Structures as Output

PHP array structures are also supported as response_format, which requires the chain processor mentioned above as well:

use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;

// Initialize Platform, LLM and Chain with processors and Clock tool

$messages = new MessageBag(Message::ofUser('What date and time is it?'));
$response = $chain->call($messages, ['response_format' => [
    'type' => 'json_schema',
    'json_schema' => [
        'name' => 'clock',
        'strict' => true,
        'schema' => [
            'type' => 'object',
            'properties' => [
                'date' => ['type' => 'string', 'description' => 'The current date in the format YYYY-MM-DD.'],
                'time' => ['type' => 'string', 'description' => 'The current time in the format HH:MM:SS.'],
            ],
            'required' => ['date', 'time'],
            'additionalProperties' => false,
        ],
    ],
]]);

dump($response->getContent()); // returns an array

Code Examples

  1. Structured Output (PHP class): structured-output-math.php
  2. Structured Output (array): structured-output-clock.php

Tool Parameters

LLM Chain generates a JSON Schema representation for all tools in the ToolBox based on the #[AsTool] attribute, the method arguments and the doc block. Additionally, JSON Schema supports validation rules, which are partially supported by LLMs like GPT.

To leverage this, configure the #[ToolParameter] attribute on the method arguments of your tool:

use PhpLlm\LlmChain\ToolBox\Attribute\AsTool;
use PhpLlm\LlmChain\ToolBox\Attribute\ToolParameter;

#[AsTool('my_tool', 'Example tool with parameters requirements.')]
final class MyTool
{
    /**
     * @param string $name   The name of an object
     * @param int    $number The number of an object
     */
    public function __invoke(
        #[ToolParameter(pattern: '/([a-z0-1]){5}/')]
        string $name,
        #[ToolParameter(minimum: 0, maximum: 10)]   
        int $number,
    ): string {
        // ...
    }
}

Note

Please be aware that these rules are only converted into a JSON Schema for the LLM to respect; they are not validated by LLM Chain.
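Since the constraints are not enforced by the library, a tool that relies on them may want to re-check its arguments itself. The following is one possible sketch; the class name, tool name and exception messages are hypothetical, and the checks simply mirror the attribute values from the example above:

```php
<?php

use PhpLlm\LlmChain\ToolBox\Attribute\AsTool;
use PhpLlm\LlmChain\ToolBox\Attribute\ToolParameter;

#[AsTool('my_validated_tool', 'Example tool that re-checks its parameter constraints.')]
final class MyValidatedTool
{
    /**
     * @param string $name   The name of an object
     * @param int    $number The number of an object
     */
    public function __invoke(
        #[ToolParameter(pattern: '/([a-z0-1]){5}/')]
        string $name,
        #[ToolParameter(minimum: 0, maximum: 10)]
        int $number,
    ): string {
        // Repeat the pattern check, because the LLM is not guaranteed to respect the schema
        if (1 !== preg_match('/([a-z0-1]){5}/', $name)) {
            throw new InvalidArgumentException('The name does not match the expected pattern.');
        }

        // Repeat the range check for the same reason
        if ($number < 0 || $number > 10) {
            throw new InvalidArgumentException('The number must be between 0 and 10.');
        }

        return sprintf('%s #%d', $name, $number);
    }
}

echo (new MyValidatedTool())('abcde', 3), PHP_EOL; // prints "abcde #3"
```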

Response Streaming

Since LLMs usually generate a response word by word, most of them also support streaming the response using Server-Sent Events. LLM Chain supports that by abstracting the conversion and returning a Generator as content of the response.

use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;

// Initialize Platform and LLM

$chain = new Chain($llm);
$messages = new MessageBag(
    Message::forSystem('You are a thoughtful philosopher.'),
    Message::ofUser('What is the purpose of an ant?'),
);
$response = $chain->call($messages, [
    'stream' => true, // enable streaming of response text
]);

foreach ($response->getContent() as $word) {
    echo $word;
}

In a terminal application this generator can be used directly, but with a web app an additional layer like Mercure needs to be used.
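If the complete text is needed after streaming, for example to store it, the chunks can be collected while they are printed. This is a sketch in plain PHP: fakeStream is a stand-in generator, not the library's response, and consumeStream is a hypothetical helper:

```php
<?php

/**
 * Stand-in for the generator returned as streamed content.
 */
function fakeStream(): Generator
{
    yield from ['Ants ', 'keep ', 'ecosystems ', 'healthy.'];
}

/**
 * Forwards every chunk to a callback and returns the concatenated text.
 * Hypothetical helper, not part of LLM Chain.
 *
 * @param iterable<string>       $chunks
 * @param callable(string): void $onChunk
 */
function consumeStream(iterable $chunks, callable $onChunk): string
{
    $full = '';
    foreach ($chunks as $chunk) {
        $onChunk($chunk);
        $full .= $chunk;
    }

    return $full;
}

$text = consumeStream(fakeStream(), static function (string $chunk): void {
    echo $chunk;
    flush(); // push the chunk to the terminal immediately
});
```

In a real script, the generator from $response->getContent() would take the place of fakeStream().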

Code Examples

  1. Streaming Claude: stream-claude-anthropic.php
  2. Streaming GPT: stream-gpt-openai.php

Image Processing

Some LLMs also support images as input, which LLM Chain supports as Content type within the UserMessage:

use PhpLlm\LlmChain\Message\Content\Image;
use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;

// Initialize Platform, LLM & Chain

$messages = new MessageBag(
    Message::forSystem('You are an image analyzer bot that helps identify the content of images.'),
    Message::ofUser(
        'Describe the image as a comedian would do it.',
        new Image(dirname(__DIR__).'/tests/Fixture/image.png'), // Path to an image file
        new Image('https://foo.com/bar.png'), // URL to an image
        new Image('data:image/png;base64,...'), // Data URL of an image
    ),
);
$response = $chain->call($messages);

Code Examples

  1. Image Description: image-describer-binary.php (with binary file)
  2. Image Description: image-describer-url.php (with URL)

Embeddings

Creating embeddings of words, sentences or paragraphs is a typical use case around the interaction with LLMs, and therefore LLM Chain implements an EmbeddingsModel interface with various models, see above.

The standalone usage results in a Vector instance:

use PhpLlm\LlmChain\OpenAI\Model\Embeddings;
use PhpLlm\LlmChain\OpenAI\Model\Embeddings\Version;

// Initialize Platform

$embeddings = new Embeddings($platform, Version::textEmbedding3Small());

$vector = $embeddings->create($textInput);

dump($vector->getData()); // Array of float values
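A common way to compare two such float arrays is cosine similarity. The function below is a plain PHP sketch of that measure; it is not part of LLM Chain and does not describe how any particular vector store ranks results:

```php
<?php

/**
 * Computes the cosine similarity of two equally sized float arrays.
 * Illustrative helper, not provided by the library.
 *
 * @param array<int|float> $a
 * @param array<int|float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    $a = array_values($a);
    $b = array_values($b);

    if ([] === $a || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if (0.0 === $normA || 0.0 === $normB) {
        throw new InvalidArgumentException('Cosine similarity is undefined for zero vectors.');
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Values close to 1.0 indicate similar direction, values close to 0.0 indicate unrelated vectors
$similarity = cosineSimilarity([1.0, 0.0], [0.0, 1.0]);
```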

Code Examples

  1. OpenAI's Embeddings: embeddings-openai.php
  2. Voyage's Embeddings: embeddings-voyage.php

Input & Output Processing

The behavior of the Chain is extendable with services that implement the InputProcessor and/or OutputProcessor interfaces. They are provided while instantiating the Chain instance:

use PhpLlm\LlmChain\Chain;

// Initialize LLM and processors

$chain = new Chain($llm, $inputProcessors, $outputProcessors);
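The order of execution can be pictured with a small stand-alone sketch. SketchChain below is a simplified stand-in and not the library's Chain class: it uses plain callables that return values, whereas the real processors receive Input and Output objects. It only illustrates that input processors run before the model call and output processors run after it:

```php
<?php

declare(strict_types=1);

// Simplified stand-in for illustration only; this is NOT the library's Chain class
final class SketchChain
{
    /**
     * @param \Closure $model       Stand-in for the LLM call: receives messages, returns text
     * @param array    $inputSteps  Callables run before the model, each returns the new message list
     * @param array    $outputSteps Callables run after the model, each returns the new response text
     */
    public function __construct(
        private \Closure $model,
        private array $inputSteps = [],
        private array $outputSteps = [],
    ) {
    }

    /**
     * @param list<string> $messages
     */
    public function call(array $messages): string
    {
        // 1. Input steps may change the messages before the model sees them
        foreach ($this->inputSteps as $step) {
            $messages = $step($messages);
        }

        // 2. The model produces a response
        $response = ($this->model)($messages);

        // 3. Output steps may change or replace the response
        foreach ($this->outputSteps as $step) {
            $response = $step($response);
        }

        return $response;
    }
}

$chain = new SketchChain(
    model: static fn (array $messages): string => 'echo: '.implode(' / ', $messages),
    inputSteps: [static fn (array $messages): array => [...$messages, 'Please answer in English.']],
    outputSteps: [static fn (string $response): string => strtoupper($response)],
);

echo $chain->call(['Hello']), PHP_EOL; // prints "ECHO: HELLO / PLEASE ANSWER IN ENGLISH."
```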

InputProcessor

InputProcessor instances are called in the chain before handing over the MessageBag and the $options array to the LLM, and can mutate both via the provided Input instance.

use PhpLlm\LlmChain\Chain\Input;
use PhpLlm\LlmChain\Chain\InputProcessor;
use PhpLlm\LlmChain\Message\AssistantMessage;

final class MyProcessor implements InputProcessor
{
    public function __construct(private string $locale)
    {
    }

    public function processInput(Input $input): void
    {
        // mutate options
        $options = $input->getOptions();
        $options['foo'] = 'bar';
        $input->setOptions($options);
        
        // mutate MessageBag
        $input->messages->append(new AssistantMessage(sprintf('Please answer using the locale %s', $this->locale)));
    }
}

OutputProcessor

OutputProcessor instances are called after the LLM has provided a response and can, in addition to options and messages, mutate or replace the given response:

use PhpLlm\LlmChain\Chain\Output;
use PhpLlm\LlmChain\Chain\OutputProcessor;
// TextResponse also needs a use statement; its namespace is not shown in this README.

final class MyProcessor implements OutputProcessor
{
    private const STOP_WORD = 'STOP'; // hypothetical marker, adjust to your use case

    public function processOutput(Output $output): void
    {
        // mutate response
        if (str_contains($output->response->getContent(), self::STOP_WORD)) {
            $output->response = new TextResponse('Sorry, we were unable to find relevant information.');
        }
    }
}

Chain Awareness

Both Input and Output instances provide access to the LLM used by the Chain. The chain itself is only provided if the processor implements the ChainAwareProcessor interface, which can be combined with the ChainAwareTrait:

use PhpLlm\LlmChain\Chain\ChainAwareProcessor;
use PhpLlm\LlmChain\Chain\ChainAwareTrait;
use PhpLlm\LlmChain\Chain\Output;
use PhpLlm\LlmChain\Chain\OutputProcessor;
use PhpLlm\LlmChain\Message\AssistantMessage;

final class MyProcessor implements OutputProcessor, ChainAwareProcessor
{
    use ChainAwareTrait;

    public function processOutput(Output $out): void
    {
        // additional chain interaction
        $response = $this->chain->call(...);
    }
}

Contributions

Contributions are always welcome, so feel free to join the development of this library.
