bakame / aide-ndjson
A simple class to handle Jsonlines in PHP
Requires
- php: ^8.1.2
- league/csv: ^9.25
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.75.0
- phpstan/phpstan: ^1.12.27
- phpstan/phpstan-deprecation-rules: ^1.2.1
- phpstan/phpstan-phpunit: ^1.4.2
- phpstan/phpstan-strict-rules: ^1.6.2
- phpunit/phpunit: ^10.5 || ^11.5 || ^12.3
- symfony/var-dumper: ^6.4 || ^7.3
README
A robust NDJSON/JSONL Encoder/Decoder for PHP
Introduction
NdJson is a robust PHP utility for encoding, decoding, streaming, and tabular parsing of NDJSON (Newline-Delimited JSON), also commonly known as JSON Lines (JSONL). Both names refer to the same format: one JSON value (typically an object) per line, separated by \n.
It supports both object-row and list-header formats, streaming iterators, and static-analysis-friendly types for PHPStan/Psalm.
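To make the format concrete, here is a minimal sketch in plain PHP that relies only on json_encode and json_decode. The helper names sketchEncode and sketchDecode are hypothetical and are not part of this package; they only illustrate the one-record-per-line idea that the package builds on.

```php
<?php

// Plain-PHP illustration of the NDJSON format; this is NOT the package's implementation.

/** Encode each record as one JSON document per line (hypothetical helper). */
function sketchEncode(iterable $records): string
{
    $lines = '';
    foreach ($records as $record) {
        // JSON_THROW_ON_ERROR turns encoding failures into exceptions.
        $lines .= json_encode($record, JSON_THROW_ON_ERROR) . "\n";
    }

    return $lines;
}

/** Decode NDJSON text into a list of PHP arrays, skipping blank lines (hypothetical helper). */
function sketchDecode(string $ndjson): array
{
    $records = [];
    foreach (explode("\n", $ndjson) as $line) {
        if (trim($line) === '') {
            continue;
        }
        $records[] = json_decode($line, true, 512, JSON_THROW_ON_ERROR);
    }

    return $records;
}

$data = [
    ['name' => 'Alice', 'score' => 42],
    ['name' => 'Bob', 'score' => 27],
];

$ndjson = sketchEncode($data);
$roundTrip = sketchDecode($ndjson);
```

Because every line is a complete JSON document, a consumer can process the content line by line without loading the whole payload first.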
Installation
Composer
composer require bakame/aide-ndjson
System Requirements
You need:
- PHP >= 8.1.2; the latest stable version of PHP is recommended
- league/csv >= 9.25
Usage
Builder
New feature introduced in version 1.1.0
The Builder class provides a fluent, immutable API for configuring and working with NdJson. It lets you define reusable defaults (such as mappers, formatters, chunk size, encoding flags, and record format) and then apply them consistently across multiple operations.
Overview
use Bakame\Aide\NdJson\Builder;

$builder = new Builder();
Each configuration method returns a new instance. This ensures immutability and allows for safely sharing base configurations.
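The following sketch shows, under stated assumptions, what "each configuration method returns a new instance" means in practice. SketchConfig is a hypothetical class written for illustration; it is not the package's Builder and only mirrors two of its method names.

```php
<?php

// Hypothetical class illustrating the immutable "wither" pattern;
// it is NOT the package's Builder.
final class SketchConfig
{
    public function __construct(
        public readonly int $chunkSize = 1,
        public readonly int $flags = 0,
    ) {
    }

    public function chunkSize(int $size): self
    {
        // Return a fresh instance instead of mutating $this.
        return new self($size, $this->flags);
    }

    public function flags(int $flags): self
    {
        return new self($this->chunkSize, $flags);
    }
}

$base = new SketchConfig();

// Two configurations derived from the same base; $base itself never changes.
$bulk = $base->chunkSize(100);
$strict = $base->flags(JSON_THROW_ON_ERROR);
```

With this design a shared base configuration can be handed to several parts of an application without any of them affecting the others.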
Configuration
- Mapper: mapper(?callable $mapper): self
  Transforms each decoded row into another representation.
- Formatter: formatter(?callable $formatter): self
  Transforms each record before encoding.
- Chunk Size: chunkSize(int $size): self
  Controls how many records are grouped together per JSON chunk (default: 1).
- Flags: flags(int $flags): self
  JSON encoding flags (same as json_encode; JSON_PRETTY_PRINT is ignored).
- Format: format(RecordFormat $format): self
  Defines the output representation (Object or List) when working with tabular data.
Example:
$builder = (new Builder())
    ->formatter(fn ($row) => ['value' => $row])
    ->mapper(fn (array $row) => $row['value'])
    ->chunkSize(100)
    ->flags(JSON_THROW_ON_ERROR)
    ->format(RecordFormat::List);
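One plausible reason for ignoring JSON_PRETTY_PRINT is that pretty-printed JSON spans several lines, which conflicts with the one-record-per-line rule. The sketch below uses plain PHP to show the effect and one way an encoder could clear the flag; it is an assumption about the motivation, not a description of the package's internals.

```php
<?php

// Illustration only: why pretty-printing and NDJSON do not mix,
// and how a flag can be cleared while keeping the others.
$record = ['name' => 'Alice', 'score' => 42];
$requested = JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES;

// Pretty-printed output contains newlines, so it cannot be a single NDJSON line.
$pretty = json_encode($record, $requested);

// Clearing the JSON_PRETTY_PRINT bit keeps every other flag and yields one line.
$effective = $requested & ~JSON_PRETTY_PRINT;
$line = json_encode($record, $effective);
```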
Encoding NDJSON
Different encoding strategies are supported, depending on how you want to generate your NDJSON content. You can encode using:
- the encode() method to output a string;
- the write() method to store the output into a file;
- the download() method to encode and make the file downloadable via any HTTP client.
Encode an array of data to NDJSON/JSONL string
$data = [
    ['name' => 'Alice', 'score' => 42],
    ['name' => 'Bob', 'score' => 27],
];

$ndjson = (new Builder())->encode($data);
echo $ndjson;

/*
{"name":"Alice","score":42}
{"name":"Bob","score":27}
*/
Write NDJSON/JSONL directly to a file
$data = [
    ['user' => 'Charlie', 'active' => true],
    ['user' => 'Diana', 'active' => false],
];

(new Builder())->write($data, __DIR__ . '/users.ndjson');
Make NDJSON downloadable via HTTP
// In a controller:
(new Builder())->download(
    [['id' => 1, 'value' => 'foo'], ['id' => 2, 'value' => 'bar']],
    filename: 'export.ndjson'
);
Decoding NDJSON
You can decode using:
- the decode() method to decode an NDJSON/JSONL string;
- the read() method to retrieve NDJSON/JSONL content from a file.
Decode a string
$content = <<<NDJSON
{"name":"Alice","score":42}
{"name":"Bob","score":27}
NDJSON;

// Note the extra parentheses around the instantiation.
foreach ((new Builder())->decode($content) as $row) {
    var_dump($row);
}

/*
array(2) {
  ["name"]=>
  string(5) "Alice"
  ["score"]=>
  int(42)
}
array(2) {
  ["name"]=>
  string(3) "Bob"
  ["score"]=>
  int(27)
}
*/
Decode with mapper
$content = <<<NDJSON
{"value":1}
{"value":2}
{"value":3}
NDJSON;

// The mapper transforms each decoded row; the formatter only applies when encoding.
$iterator = (new Builder())
    ->mapper(fn (array $row): int => $row['value'] * 10)
    ->decode($content);

foreach ($iterator as $num) {
    echo $num, PHP_EOL;
}

/*
10
20
30
*/
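The idea of decoding lazily and applying a mapper to each row can be sketched in plain PHP with a generator. The function sketchStreamDecode is a hypothetical helper for illustration and does not reflect how the package implements decode() or read().

```php
<?php

/**
 * Lazily yield one decoded row per non-empty line of a stream.
 * Hypothetical helper, NOT part of the package API.
 *
 * @param resource $stream
 */
function sketchStreamDecode($stream, ?callable $mapper = null): Generator
{
    while (($line = fgets($stream)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        $row = json_decode($line, true, 512, JSON_THROW_ON_ERROR);

        // Apply the mapper, if any, to the decoded row before yielding it.
        yield $mapper === null ? $row : $mapper($row);
    }
}

// An in-memory stream stands in for a file on disk.
$stream = fopen('php://memory', 'r+');
fwrite($stream, "{\"value\":1}\n{\"value\":2}\n{\"value\":3}\n");
rewind($stream);

$values = iterator_to_array(
    sketchStreamDecode($stream, fn (array $row): int => $row['value'] * 10),
    false
);
fclose($stream);
```

Only one line is held in memory at a time while iterating, which is the main benefit of a streaming approach for large NDJSON files.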
Working with Tabular data
The Builder can parse and encode tabular data in two forms:
- Object rows: {"Name":"Gilbert","Score":24}
- List rows with header: ["Name","Score"] followed by values
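The relationship between the two layouts can be sketched in plain PHP as follows. This is an illustration of the data shapes only, with made-up sample rows; it does not use or describe the package's own conversion code.

```php
<?php

// Illustration of the two tabular layouts in plain PHP; NOT the package's code.
$objectRows = [
    ['Name' => 'Gilbert', 'Score' => 24],
    ['Name' => 'Alice', 'Score' => 42],
];

// Object rows -> list rows: emit the header once, then only the values.
$header = array_keys($objectRows[0]);
$listLines = [json_encode($header)];
foreach ($objectRows as $row) {
    $listLines[] = json_encode(array_values($row));
}
$listNdjson = implode("\n", $listLines) . "\n";

// List rows -> object rows: the first line supplies the keys for every later line.
$lines = array_filter(
    explode("\n", $listNdjson),
    fn (string $line): bool => $line !== ''
);
$decodedHeader = json_decode(array_shift($lines), true);
$rebuilt = array_values(array_map(
    fn (string $line): array => array_combine($decodedHeader, json_decode($line, true)),
    $lines
));
```

The list layout avoids repeating the column names on every line, at the cost of requiring the reader to know (or detect) the header.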
decodeTabularData(Stringable|string $content, array $header = []): TabularData
encodeTabularData(TabularData|TabularDataProvider $tabularData): string
readTabularData(mixed $path, array $header = [], $context = null): TabularData
writeTabularData(TabularData|TabularDataProvider $tabularData, mixed $to, $context = null): int
readTabularData() parses a file or stream as tabular data, and decodeTabularData() does the same for a string. Both auto-detect the header if $header is empty.
$tabular = (new Builder())->decodeTabularData($ldjsonString);

foreach ($tabular as $record) {
    var_dump($record);
}
The returned TabularData can be further processed using all the features from the League\Csv package.
use Bakame\Aide\NdJson\Builder;
use League\Csv\Statement;

// Filtering is done case-sensitively on the first character of the column value.
$query = (new Statement())
    ->andWhere('score', '=', 10)
    ->whereNot('name', 'starts_with', 'P');

$tabular = (new Builder())->readTabularData('/tmp/scores.ldjson');

// Player is a user-defined class that each record is mapped to.
foreach ($query->process($tabular)->getRecordsAsObject(Player::class) as $player) {
    echo $player->name, PHP_EOL;
}
Conversely, you can encode the TabularData as shown below:
use Bakame\Aide\NdJson\Builder;
use Bakame\Aide\NdJson\RecordFormat;
use League\Csv\Reader;

$csv = <<<CSV
Name,Score
Alice,42
Bob,27
CSV;

$document = Reader::createFromString($csv);
$document->setHeaderOffset(0);

echo (new Builder())
    ->format(RecordFormat::ListWithHeader)
    ->encodeTabularData($document);

$builder = new Builder();
$json = $builder->encodeTabularData($document);
// NDJSON with object records
// {"Name":"Alice","Score":42}
// {"Name":"Bob","Score":27}

// The format is set on the builder; encodeTabularData() takes only the tabular data.
$json = $builder
    ->format(RecordFormat::ListWithHeader)
    ->encodeTabularData($document);
// NDJSON with array list
// ["Name","Score"]
// ["Alice",42]
// ["Bob",27]

$decoded = $builder->decodeTabularData($json);