arnapou / encoder
Library - Basic encoders with a common interface.
Requires
- php: ~8.3.0 || ~8.4.0
Requires (Dev)
- ext-zlib: *
- friendsofphp/php-cs-fixer: ^3.52
- phpstan/extension-installer: ^1.3
- phpstan/phpstan: ^2.0
- phpstan/phpstan-deprecation-rules: ^2.0
- phpstan/phpstan-phpunit: ^2.0
- phpstan/phpstan-strict-rules: ^2.0
- phpunit/php-code-coverage: ^11.0
- phpunit/phpunit: ^11.0
README
This library exposes basic encoders with a common interface.
Installation
composer require arnapou/encoder
packagist 👉️ arnapou/encoder
Book Encoders
These are predictive encoders which offer great compression on a specific, known book of words.
They are a poor fit for strings that are not in the book: each byte outside the book takes 2 bytes instead of 1.
One byte encoder (8 bits)
Maximum number of words : 255
use Arnapou\Encoder\Book\ArrayBook;
use Arnapou\Encoder\Book\OneByteBookEncoder;
$book = new ArrayBook(['Hello', 'World']);
$encoder = new OneByteBookEncoder($book);
// 11 to 4 bytes (-63%)
var_dump(bin2hex($encoder->encode('Hello World')));
// string(8) "00ff2001"
var_dump($encoder->decode(hex2bin('00ff2001')));
// string(11) "Hello World"
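The output above (`00`, `ff20`, `01`) suggests that a word is replaced by its index in the book and that a byte outside the book is prefixed with `0xFF`. The following is a minimal sketch of that idea using only standard PHP functions. `sketchEncode` and `sketchDecode` are hypothetical names, the escape rule is an assumption inferred from the example, and the first-match lookup is a simplification; this is not the library's implementation.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch, NOT the arnapou/encoder implementation.
 * Assumption: a book word becomes its one-byte index, and any byte not
 * covered by a word is escaped as 0xFF followed by the raw byte.
 *
 * @param list<string> $book
 */
function sketchEncode(string $input, array $book): string
{
    $output = '';
    $offset = 0;
    $length = strlen($input);

    while ($offset < $length) {
        foreach ($book as $index => $word) {
            // Empty words are skipped so the offset always advances.
            if ($word !== '' && substr_compare($input, $word, $offset, strlen($word)) === 0) {
                $output .= chr($index);
                $offset += strlen($word);
                continue 2; // word matched: resume at the next position
            }
        }
        // No word matched: the raw byte costs 2 bytes instead of 1.
        $output .= "\xFF" . $input[$offset];
        ++$offset;
    }

    return $output;
}

/**
 * @param list<string> $book
 */
function sketchDecode(string $encoded, array $book): string
{
    $output = '';
    $length = strlen($encoded);

    for ($i = 0; $i < $length; ++$i) {
        if ($encoded[$i] === "\xFF") {
            // The byte after the escape marker is copied as-is.
            $output .= $encoded[++$i] ?? '';
        } else {
            $output .= $book[ord($encoded[$i])];
        }
    }

    return $output;
}

var_dump(bin2hex(sketchEncode('Hello World', ['Hello', 'World'])));
// string(8) "00ff2001"
```

Reserving `0xFF` as the escape marker would be consistent with the 255-word limit stated above, since only the values 0 to 254 remain available as word indexes.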
Two bytes encoder (16 bits)
Maximum number of words : 65535
This is mostly useful when you have a lot of words to store (> 255) and each of them is at least 2 bytes long.
use Arnapou\Encoder\Book\ArrayBook;
use Arnapou\Encoder\Book\TwoBytesBookEncoder;
$book = new ArrayBook(['https://', 'arnapou.net']);
$encoder = new TwoBytesBookEncoder($book);
// 19 to 4 bytes (-79%)
var_dump(bin2hex($encoder->encode('https://arnapou.net')));
// string(8) "00010000"
var_dump($encoder->decode(hex2bin('00010000')));
// string(19) "https://arnapou.net"
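To illustrate why a two-byte code only pays off for words of more than 2 bytes, here is a small sketch of storing a word index on 16 bits with PHP's `pack()` and `unpack()`. `packIndex` and `unpackIndex` are hypothetical helpers, and the big-endian byte order is an assumption; the library's actual layout and reserved values may differ.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers, NOT part of arnapou/encoder.
// Assumption: the index is stored big-endian (format "n" of pack()).
function packIndex(int $index): string
{
    if ($index < 0 || $index > 0xFFFF) {
        throw new InvalidArgumentException('Index does not fit in 16 bits.');
    }

    return pack('n', $index);
}

function unpackIndex(string $bytes): int
{
    $parts = unpack('n', $bytes);
    if ($parts === false) {
        throw new InvalidArgumentException('Expected at least 2 bytes.');
    }

    return $parts[1];
}

var_dump(bin2hex(packIndex(1)));
// string(4) "0001"
var_dump(unpackIndex("\x01\x00"));
// int(256)

// A 2-byte word replaced by a 2-byte code saves nothing.
var_dump(strlen('ab') - strlen(packIndex(0)));
// int(0)
```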
Example
You can use this encoder to shorten predefined text patterns.
I use it to shorten URLs that follow the same patterns.
Look at available books in src/Book :
- ArrayBook : simple implementation of generic iterable book
- EnglishBook : built-in book for English text
- WebBook : built-in book for URIs
Feel free to submit dedicated books.
use Arnapou\Encoder\Book\EnglishBook;
use Arnapou\Encoder\Book\OneByteBookEncoder;
$book = new EnglishBook();
$encoder = new OneByteBookEncoder($book);
// 22 to 11 bytes (-50%)
var_dump(bin2hex($encoder->encode('This is a small string')));
// string(22) "ff54446e5675aeac75d446"
var_dump($encoder->decode(hex2bin('ff54446e5675aeac75d446')));
// string(22) "This is a small string"
Miscellaneous Encoders
You will find other useful encoders, such as:
Encoder | Description |
---|---|
Identity | no encoding |
Hexadecimal | hexadecimal encoding |
Base64 | encoding in base 64 with optional trimming |
Base64UrlSafe | encoding in base 64 for URLs (+/ => -_) |
Zlib | gzdeflate / gzencode / gzcompress family |
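As an illustration of the `+/ => -_` substitution listed for Base64UrlSafe, here is a minimal sketch built on PHP's standard `base64_encode()` and `strtr()`. `base64UrlEncode` and `base64UrlDecode` are hypothetical helpers, and stripping the `=` padding is an assumption; this is not the library's own class.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers, NOT the arnapou/encoder classes.
function base64UrlEncode(string $data): string
{
    // Swap the two URL-unsafe characters, then drop the "=" padding
    // (dropping the padding is an assumption of this sketch).
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64UrlDecode(string $data): string
{
    $base64 = strtr($data, '-_', '+/');
    // Restore the padding removed by base64UrlEncode().
    $base64 = str_pad($base64, strlen($base64) + (4 - strlen($base64) % 4) % 4, '=');

    $decoded = base64_decode($base64, true);
    if ($decoded === false) {
        throw new InvalidArgumentException('Invalid base64url input.');
    }

    return $decoded;
}

var_dump(base64_encode("\xfb\xff"));
// string(4) "+/8="
var_dump(base64UrlEncode("\xfb\xff"));
// string(3) "-_8"
```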
There is also a PipelineEncoder, which is very useful for chaining encoders:
use Arnapou\Encoder\Base64\Base64Encoder;
use Arnapou\Encoder\PipelineEncoder;
use Arnapou\Encoder\Zlib\ZlibEncoder;
use Arnapou\Encoder\Zlib\ZlibEncoding;
$encoder = new PipelineEncoder(
new ZlibEncoder(encoding: ZlibEncoding::raw),
new Base64Encoder(),
);
var_dump($encoder->encode('Lorem ipsum dolor sit amet'));
// string(38) "88kvSs1VyCwoLs1VSMnPyS9SKM4sUUjMTS0BAA"
var_dump($encoder->decode('88kvSs1VyCwoLs1VSMnPyS9SKM4sUUjMTS0BAA'));
// string(26) "Lorem ipsum dolor sit amet"
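The chaining idea can be sketched with plain PHP functions: encoding applies each stage in order, and decoding applies the inverse stages in reverse order. `pipelineEncode`, `pipelineDecode` and the `$stages` layout are hypothetical and only illustrate the principle; the sketch assumes ext-zlib is loaded and does not claim to produce the same bytes as the library's PipelineEncoder.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch of chaining, NOT the library's PipelineEncoder.
// Each stage is a pair of callables: [encode, decode].
$stages = [
    [
        fn (string $s): string => (string) gzdeflate($s),
        fn (string $s): string => (string) gzinflate($s),
    ],
    [
        // The "=" padding is trimmed; base64_decode() tolerates its absence.
        fn (string $s): string => rtrim(base64_encode($s), '='),
        fn (string $s): string => (string) base64_decode($s),
    ],
];

/**
 * @param list<array{callable(string): string, callable(string): string}> $stages
 */
function pipelineEncode(string $input, array $stages): string
{
    foreach ($stages as [$encode]) {
        $input = $encode($input);
    }

    return $input;
}

/**
 * @param list<array{callable(string): string, callable(string): string}> $stages
 */
function pipelineDecode(string $input, array $stages): string
{
    // Undo the stages last-to-first.
    foreach (array_reverse($stages) as [, $decode]) {
        $input = $decode($input);
    }

    return $input;
}

$encoded = pipelineEncode('Lorem ipsum dolor sit amet', $stages);
var_dump(pipelineDecode($encoded, $stages) === 'Lorem ipsum dolor sit amet');
// bool(true)
```

Compressing before the base 64 step matters: base 64 output is text-safe but about a third larger than its input, so it belongs at the end of the chain.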
PHP versions
Date | Ref | 8.4 | 8.3 | 8.2 |
---|---|---|---|---|
24/11/2024 | 2.3.x, main | × | × | |
25/11/2023 | 2.0 - 2.2 | × | ||
04/09/2024 | 1.3.x | × | × | |
13/12/2022 | 1.0 - 1.2 | × |