A CLI, PHP library and Symfony bundle that helps you get structured data out of GPT.

0.1.0 2023-04-07 09:40 UTC

Last update: 2024-12-11 17:21:23 UTC



Given a JSON Schema, GPT can generally output JSON that conforms to it. This makes it possible to use GPT programmatically for non-conversational use cases.

For example, before parsing a user-uploaded CSV, you could ask GPT to map its headers to the ones your code supports:

$ bin/portal ./examples/csv_headers.yaml '{
  "supportedHeaders":["firstName","age"], 
  "headers":["Prenom","Nom Famille","Annees"]}'
...

Completion Results:
===================

{
    "mappedHeaders": {
        "Prenom": "firstName",
        "Nom Famille": null,
        "Annees": "age"
    }
}
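
Once decoded, a completion like the one above can be applied directly to the CSV rows. The following is a minimal sketch in plain PHP, independent of this library; the remapRow helper and the sample row are made up for illustration:

```php
<?php

// Completion value from the example above, decoded into a PHP array.
$mappedHeaders = [
    'Prenom' => 'firstName',
    'Nom Famille' => null,
    'Annees' => 'age',
];

/**
 * Renames the keys of a CSV row using the mapping and drops columns
 * that are mapped to null or not mapped at all.
 *
 * @param array<string, string> $row
 * @param array<string, string|null> $mapping
 * @return array<string, string>
 */
function remapRow(array $row, array $mapping): array
{
    $result = [];
    foreach ($row as $header => $value) {
        $target = $mapping[$header] ?? null;
        if ($target !== null) {
            $result[$target] = $value;
        }
    }

    return $result;
}

// Made-up sample row, keyed by the original CSV headers.
$row = ['Prenom' => 'Ada', 'Nom Famille' => 'Lovelace', 'Annees' => '36'];

var_dump(remapRow($row, $mappedHeaders));
```

Columns that GPT could not map ("Nom Famille" here) are simply skipped, so the rest of the import code only sees supported headers.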

⚠️ Note that this library is experimental, and the API will change.

You are welcome to contribute by submitting issues, ideas, PRs, etc. 🙂

Installation

composer require sourceability/portal

Trying out

You can try out YAML spells with Docker:

git clone https://github.com/sourceability/portal.git
cd portal
make php
bin/portal ./examples/csv_headers.yaml

Symfony support

The library includes a Symfony bundle.

Add the bundle to config/bundles.php:

return [
    // ...
    Sourceability\Portal\Bundle\SourceabilityPortalBundle::class => ['all' => true],
];

Then define the OPENAI_API_KEY=sk-XXX environment variable, for example in .env.local.

You can also configure the bundle:

# config/packages/sourceability_portal.yaml
sourceability_portal:
    openai_api_key: '%my_openai_api_key%'

You can invoke your service spells using their FQCN with the cast command (don't forget the quotes):

bin/console portal:cast 'App\Portal\MySpell'

You can also define a short name with the #[AutoconfigureSpell] attribute:

use Sourceability\Portal\Bundle\DependencyInjection\Attribute\AutoconfigureSpell;
use Sourceability\Portal\Spell\Spell;

#[AutoconfigureSpell('Categorize')]
class CategorizeSpell implements Spell
{
    // ...
}

Then invoke the spell with bin/console portal:cast Categorize.

Static YAML

You can invoke portal with the path to a .yaml file in the following format:

schema:
    properties:
        barbar:
            type: string
examples:
    - foobar: hello
    - foobar: world
prompt: |
    Do something.
  
    {{ foobar }}

vendor/bin/portal my_spell.yaml
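
To make the relationship between examples and prompt concrete: each entry under examples supplies the variables for the {{ ... }} placeholders in prompt. The sketch below illustrates that substitution in plain PHP. It is not the library's renderer (this README does not say which template engine is used), and renderPrompt is a hypothetical helper:

```php
<?php

/**
 * Replaces {{ name }} placeholders with values from $variables.
 * Placeholders without a matching variable are left untouched.
 *
 * @param array<string, scalar> $variables
 */
function renderPrompt(string $template, array $variables): string
{
    return preg_replace_callback(
        '/\{\{\s*(\w+)\s*\}\}/',
        static fn (array $m): string => array_key_exists($m[1], $variables)
            ? (string) $variables[$m[1]]
            : $m[0],
        $template
    );
}

$template = "Do something.\n\n{{ foobar }}";
$examples = [
    ['foobar' => 'hello'],
    ['foobar' => 'world'],
];

// One rendered prompt per example.
foreach ($examples as $example) {
    echo renderPrompt($template, $example), "\n---\n";
}
```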

Spell

The Spell interface is the main way to interact with this library.

You can think of a Spell as a way to create a function whose "implementation" is a GPT prompt:

$spell = new StaticSpell(
    schema: [
        'type' => 'array',
        'items' => ['type' => 'string']
    ],
    prompt: 'Synonyms of {{ input }}'
);

/** @var callable(string): array<string> $generateSynonyms */
$generateSynonyms = $portal->callableFromSpell($spell);

dump($generateSynonyms('car'));

array:5 [▼
  0 => "automobile"
  1 => "vehicle"
  2 => "motorcar"
  3 => "machine"
  4 => "transport"
]

use Sourceability\Portal\Spell\Spell;

/**
 * @implements Spell<TInput, TOutput>
 */
class MySpell implements Spell

A spell is defined by its input and output types, TInput and TOutput. For example, a spell that accepts a number and returns an array of strings would use Spell<int, array<string>>.

getSchema

The getSchema method returns a JSON Schema:

/**
 * @return string|array<string, mixed>|JsonSerializable The JSON-Schema of the desired completion output.
 */
public function getSchema(): string|array|JsonSerializable;

Make sure to leverage the description and examples properties to give GPT more context and instructions:

public function getSchema(): array
{
    return [
        'type' => 'object',
        'properties' => [
            'release' => [
                'description' => 'The release reference/key.',
                'examples' => ['v1.0.1', 'rc3', '2022.48.2'],
            ]
        ],
    ];
}
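
It can help to look at the schema as the JSON document it will become. The sketch below is plain PHP with no dependency on this library. It extends the example above with the standard JSON Schema keywords type and required; whether these tighter constraints improve GPT's output is an assumption worth testing on your own prompts:

```php
<?php

$schema = [
    'type' => 'object',
    'properties' => [
        'release' => [
            'type' => 'string',
            'description' => 'The release reference/key.',
            'examples' => ['v1.0.1', 'rc3', '2022.48.2'],
        ],
    ],
    // Standard JSON Schema keyword: the object must contain "release".
    'required' => ['release'],
];

// Print the schema as JSON, keeping "/" in the description unescaped.
echo json_encode($schema, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```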

Note that you can also leverage libraries that define a DSL to build schemas.

getPrompt

The getPrompt method is where you describe the desired behaviour:

/**
 * @param TInput $input
 */
public function getPrompt($input): string
{
    return sprintf('Do something with %s', $input);
}
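
When TInput is an array rather than a string, one option is to embed it in the prompt as JSON. This is a sketch of that idea in plain PHP; buildPrompt and the prompt wording are hypothetical, and a real spell would do this inside getPrompt:

```php
<?php

/**
 * Builds a prompt that embeds structured input as pretty-printed JSON.
 *
 * @param array<string, mixed> $input
 */
function buildPrompt(array $input): string
{
    return sprintf(
        "Assess the following user for fraud.\n\nUser data (JSON):\n%s",
        // The JSON is passed as an argument, never as part of the format
        // string, so "%" characters in the data are harmless.
        json_encode($input, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR)
    );
}

echo buildPrompt(['user' => ['id' => 1, 'discount' => '100%']]), "\n";
```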

transcribe

Finally, you can transform the JSON-decoded GPT output into your output type:

/**
 * @param array<mixed> $completionValue
 * @return array<TOutput>
 */
public function transcribe(array $completionValue): array
{
    return array_map(fn ($item) => new Money($item), $completionValue);
}
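
Since the completion comes from a language model, it may be worth validating each item before building objects from it. The sketch below is plain PHP; the Money class is a hypothetical value object standing in for the one used above, and the amount/currency shape is an assumption:

```php
<?php

// Hypothetical value object, standing in for the Money class used above.
final class Money
{
    public function __construct(
        public int $amount,
        public string $currency,
    ) {
    }
}

/**
 * @param array<mixed> $completionValue
 * @return array<Money>
 */
function transcribe(array $completionValue): array
{
    $result = [];
    foreach ($completionValue as $item) {
        // Skip entries that do not have the expected shape.
        if (
            !is_array($item)
            || !is_int($item['amount'] ?? null)
            || !is_string($item['currency'] ?? null)
        ) {
            continue;
        }
        $result[] = new Money($item['amount'], $item['currency']);
    }

    return $result;
}

// Simulated completion: the second item is malformed and gets skipped.
$decoded = json_decode('[{"amount": 1250, "currency": "EUR"}, {"amount": "oops"}]', true);

var_dump(transcribe($decoded));
```

Skipping malformed items is only one possible policy; throwing an exception may be preferable when partial results are not acceptable.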

getExamples

The getExamples method returns zero or more example inputs. This is very useful when iterating on a prompt.

/**
 * @return array<TInput>
 */
public function getExamples(): array;
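
Putting the four methods together, a complete spell might look like the sketch below. To keep it runnable without the package installed, it declares a local stand-in interface, SpellLike, that only mirrors the signatures shown in this README; in a real project you would implement Sourceability\Portal\Spell\Spell instead. The normalization in transcribe is an illustrative choice, not something the library requires:

```php
<?php

// Local stand-in mirroring the four methods described above (assumption:
// the real interface may declare more than what this README shows).
interface SpellLike
{
    public function getSchema(): string|array|JsonSerializable;

    public function getPrompt($input): string;

    public function transcribe(array $completionValue): array;

    public function getExamples(): array;
}

final class SynonymSpell implements SpellLike
{
    public function getSchema(): array
    {
        return ['type' => 'array', 'items' => ['type' => 'string']];
    }

    public function getPrompt($input): string
    {
        return sprintf('Synonyms of %s', $input);
    }

    public function transcribe(array $completionValue): array
    {
        // Keep strings only, then trim, lowercase and de-duplicate them.
        $words = array_filter($completionValue, 'is_string');
        $words = array_map(static fn (string $w): string => strtolower(trim($w)), $words);

        return array_values(array_unique($words));
    }

    public function getExamples(): array
    {
        return ['car', 'house'];
    }
}

$spell = new SynonymSpell();

// Print the prompt that each example input would produce.
foreach ($spell->getExamples() as $example) {
    echo $spell->getPrompt($example), "\n";
}

// Simulated completion value, as if GPT had already answered.
var_dump($spell->transcribe(['Automobile', ' vehicle ', 'automobile', 42]));
```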

Casting

Once you've done all that, you can try casting your spell examples:

vendor/bin/portal 'App\Portal\FraudSpell'

Or invoke your spell with the PHP API:

$portal = new Portal(...);

$result = $portal->cast(
    new FraudSpell(),
    ['user' => $user->toArray()] // This contains TInput
);

// $result->value contains array<TOutput>
actOnThe($result->value);

$portal->transfer

If you don't need a Spell's getExamples and transcribe methods, you can use transfer:

$transferResult = $portal->transfer(
    ['type' => 'string'], // output schema
    'The prompt'
);
$transferResult->value; // the JSON-decoded value

CLI

You can pass your own JSON examples to the portal CLI:

bin/portal spell.yaml '[{"hello":["worlds"]},{"hello":[]}]'

Use -v, -vv, -vvv to print more information like the prompts or the OpenAI API requests/responses.
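
If the examples already exist as PHP data, the JSON argument does not have to be written by hand. The following sketch only builds and prints the command line using PHP built-ins; it does not run the CLI, and the spell.yaml path is the placeholder from the command above:

```php
<?php

$examples = [
    ['hello' => ['worlds']],
    ['hello' => []],
];

// Encode the examples and quote both arguments for the shell.
$command = sprintf(
    'bin/portal %s %s',
    escapeshellarg('spell.yaml'),
    escapeshellarg(json_encode($examples, JSON_THROW_ON_ERROR))
);

echo $command, "\n";
```

Quoting with escapeshellarg matters here because the JSON contains double quotes and brackets that a shell would otherwise interpret.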

ApiPlatformSpell

The ApiPlatformSpell uses API Platform both to generate the JSON Schema and to deserialize the JSON result.

You must implement the following methods:

  • getClass
  • getPrompt

The following are optional:

  • isCollection returns false by default; override it to return true when the output is a collection
  • getExamples returns no examples by default; override it to add your own

use Sourceability\Portal\Spell\ApiPlatformSpell;

/**
 * @extends ApiPlatformSpell<string, array<Part>>
 */
class PartListSpell extends ApiPlatformSpell
{
    public function getExamples(): array
    {
        return [
            'smartwatch',
            'bookshelf speaker',
        ];
    }

    public function getPrompt($input): string
    {
        return sprintf('A list of parts to build a %s.', $input);
    }
    
    protected function isCollection(): bool
    {
        return true;
    }
    
    protected function getClass(): string
    {
        return Part::class;
    }
}

You can then use the #[ApiProperty] attribute to add context to your schema:

use ApiPlatform\Metadata\ApiProperty;

class Part
{
    #[ApiProperty(
        description: 'Product description',
        schema: ['maxLength' => 100],
    )]
    public string $description;
}

Examples

See ./examples/.