bakame/aide-ndjson

A simple class to handle Jsonlines in PHP

1.0.0 2025-09-11 08:40 UTC

Last update: 2025-09-13 14:35:45 UTC


A robust NDJSON/JSONL Encoder/Decoder for PHP

Introduction

NdJson is a robust PHP utility for encoding, decoding, streaming, and tabular parsing of NDJSON (Newline-Delimited JSON), also commonly known as JSON Lines (JSONL). Both names refer to the same format: one JSON object per line, separated by \n.

It supports both object-row and list-header formats, streaming iterators, and static-analysis-friendly types for PHPStan/Psalm.
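To make the format concrete, here is a minimal sketch in plain PHP that produces and consumes NDJSON using only json_encode() and json_decode(). It does not use this package; the variable names are illustrative.

```php
<?php

declare(strict_types=1);

// Plain-PHP illustration of the NDJSON layout; it does not use this package.
$records = [
    ['name' => 'Alice', 'score' => 42],
    ['name' => 'Bob', 'score' => 27],
];

// Encode: one compact JSON document per line, each terminated by "\n".
$lines = array_map(
    static fn (array $record): string => json_encode($record, JSON_THROW_ON_ERROR),
    $records
);
$ndjson = implode("\n", $lines) . "\n";

// Decode: split on "\n", skip empty lines, decode each line independently.
$decoded = [];
foreach (explode("\n", $ndjson) as $line) {
    if ($line === '') {
        continue;
    }
    $decoded[] = json_decode($line, true, 512, JSON_THROW_ON_ERROR);
}

echo $ndjson;
```

Because every line is a complete JSON document, a consumer can process the content line by line without loading the whole payload.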

Installation

Composer

composer require bakame-php/aide-ndjson

System Requirements

You need:

  • PHP >= 8.1 (the latest stable version of PHP is recommended)
  • the latest version of league/csv

Usage

Builder

New feature introduced in version 1.1.0

The Builder class provides a fluent, immutable API for configuring and working with NdJson. It lets you define reusable defaults (such as mappers, formatters, chunk size, encoding flags, and record format) and then apply them consistently across multiple operations.

Overview

use Bakame\Aide\NdJson\Builder;

$builder = new Builder();

Each configuration method returns a new instance. This ensures immutability and allows for safely sharing base configurations.
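The pattern behind this guarantee can be sketched with a small stand-in class. ImmutableConfig below is hypothetical and is not the package's Builder; it only shows how a configuration method can return a modified copy while leaving the original untouched.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in class, NOT the package's Builder: it only
// demonstrates the "each setter returns a new instance" pattern.
final class ImmutableConfig
{
    public function __construct(
        public readonly int $chunkSize = 1,
        public readonly int $flags = 0,
    ) {
    }

    public function chunkSize(int $size): self
    {
        // Return a modified copy; $this is left untouched.
        return new self($size, $this->flags);
    }

    public function flags(int $flags): self
    {
        return new self($this->chunkSize, $flags);
    }
}

$base = new ImmutableConfig();
$bulk = $base->chunkSize(100);              // derived from $base
$strict = $base->flags(JSON_THROW_ON_ERROR); // also derived from $base
```

Here $bulk and $strict each diverge from $base independently, which is what makes a shared base configuration safe to reuse.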

Configuration

  • Mapper: mapper(?callable $mapper): self. Transforms each decoded row into another representation.
  • Formatter: formatter(?callable $formatter): self. Transforms each record before encoding.
  • Chunk Size: chunkSize(int $size): self. Controls how many records are grouped together per JSON chunk (default: 1).
  • Flags: flags(int $flags): self. Sets the JSON encoding flags (same as json_encode; JSON_PRETTY_PRINT is ignored).
  • Format: format(RecordFormat $format): self. Defines the output representation (Object or List) when working with tabular data.
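A plausible reason for ignoring JSON_PRETTY_PRINT is that pretty-printed output spreads one record over several lines, which breaks the one-record-per-line rule. The sketch below uses plain json_encode() to show the effect and one way to mask the flag out; how the package handles this internally is an assumption, not something the README states.

```php
<?php

declare(strict_types=1);

$record = ['name' => 'Alice', 'score' => 42];
$requested = JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR;

// With JSON_PRETTY_PRINT the record spans several lines, which breaks NDJSON.
$pretty = json_encode($record, $requested);

// Clearing the bit keeps the other flags and yields a single line.
$effective = $requested & ~JSON_PRETTY_PRINT;
$compact = json_encode($record, $effective);
```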

Example:

$builder = (new Builder())
    ->formatter(fn ($row) => ['value' => $row])
    ->mapper(fn (array $row) => $row['value'])
    ->chunkSize(100)
    ->flags(JSON_THROW_ON_ERROR)
    ->format(RecordFormat::List);

Encoding NDJSON

Different encoding strategies are supported, depending on how you want to generate your NDJSON content. You can encode using:

  • the encode() method to output a string;
  • the write() method to store the output in a file;
  • the download() method to encode the data and make the file downloadable via any HTTP client.

Encode an array of data to NDJSON/JSONL string

$data = [
    ['name' => 'Alice', 'score' => 42],
    ['name' => 'Bob', 'score' => 27],
];

$ndjson = (new Builder())->encode($data);
echo $ndjson;

/*
{"name":"Alice","score":42}
{"name":"Bob","score":27}
*/

Write NDJSON/JSONL directly to a file

$data = [
    ['user' => 'Charlie', 'active' => true],
    ['user' => 'Diana', 'active' => false],
];

(new Builder())->write($data, __DIR__ . '/users.ndjson');

Make NDJSON downloadable via HTTP

// In a controller:
(new Builder())->download(
    [['id' => 1, 'value' => 'foo'], ['id' => 2, 'value' => 'bar']],
    filename: 'export.ndjson'
);

Decoding NDJSON

Two decoding methods are available:

  • the decode() method to decode an NDJSON/JSONL string;
  • the read() method to retrieve NDJSON/JSONL content from a file.

Decode a string

$content = <<<NDJSON
{"name":"Alice","score":42}
{"name":"Bob","score":27}
NDJSON;

foreach ((new Builder())->decode($content) as $row) {
    var_dump($row);
}
/*
array(2) { ["name"]=> string(5) "Alice" ["score"]=> int(42) }
array(2) { ["name"]=> string(3) "Bob"   ["score"]=> int(27) }
*/

Decode with mapper

$content = <<<NDJSON
{"value":1}
{"value":2}
{"value":3}
NDJSON;

$iterator = (new Builder())
    ->mapper(fn (array $row): int => $row['value'] * 10)
    ->decode($content);

foreach ($iterator as $num) {
    echo $num, PHP_EOL;
}
/*
10
20
30
*/
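The idea of lazily decoding rows and passing each one through a mapper can be sketched in plain PHP with a generator. The iterateNdjson() helper below is hypothetical and is not part of the package; it only illustrates the technique on a stream resource.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of the package: lazily decodes NDJSON
 * from a stream and applies an optional mapper to each decoded row.
 *
 * @param resource $stream
 */
function iterateNdjson($stream, ?callable $mapper = null): Generator
{
    while (($line = fgets($stream)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue; // ignore blank lines
        }
        $row = json_decode($line, true, 512, JSON_THROW_ON_ERROR);
        yield $mapper === null ? $row : $mapper($row);
    }
}

// An in-memory stream stands in for a file opened with fopen().
$stream = fopen('php://memory', 'r+');
fwrite($stream, "{\"value\":1}\n{\"value\":2}\n{\"value\":3}\n");
rewind($stream);

$values = [];
foreach (iterateNdjson($stream, fn (array $row): int => $row['value'] * 10) as $num) {
    $values[] = $num;
}
fclose($stream);
```

Only one line is held in memory at a time, so the same loop works for inputs far larger than the available memory.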

Working with Tabular data

The Builder can parse and encode tabular data in two forms:

  • Object rows: {"Name":"Gilbert","Score":24}
  • List rows with header: ["Name","Score"] followed by rows of values

The following methods are available:

decodeTabularData(Stringable|string $content, array $header = []): TabularData
encodeTabularData(TabularData|TabularDataProvider $tabularData): string
readTabularData(mixed $path, array $header = [], $context = null): TabularData
writeTabularData(TabularData|TabularDataProvider $tabularData, mixed $to, $context = null): int

decodeTabularData() parses a string and readTabularData() parses a file or stream as tabular data. The header is auto-detected if $header is empty.

$tabular = (new Builder())->decodeTabularData($ldjsonString);
foreach ($tabular as $record) {
    var_dump($record);
}

The returned TabularData can be further processed using all the features from the League\Csv package.

use League\Csv\Statement;

$query = (new Statement())
    ->andWhere('score', '=', 10)
    ->whereNot('name', 'starts_with', 'P'); // filtering is case-sensitive and applies to the first character of the column value

$tabular = (new Builder())->readTabularData('/tmp/scores.ldjson');
foreach ($query->process($tabular)->getRecordsAsObject(Player::class) as $player) {
    echo $player->name, PHP_EOL;
}

Conversely, you can encode the TabularData as shown below:

use Bakame\Aide\NdJson\Builder;
use Bakame\Aide\NdJson\RecordFormat;
use League\Csv\Reader;

$csv = <<<CSV
Name,Score
Alice,42
Bob,27
CSV;

$document = Reader::createFromString($csv);
$document->setHeaderOffset(0);

echo (new Builder())
    ->format(RecordFormat::ListWithHeader)
    ->encodeTabularData($document);

$builder = new Builder();
$json = $builder->encodeTabularData($document);
// NDJSON with object records
// {"Name":"Alice","Score":42}
// {"Name":"Bob","Score":27}
$json = $builder->format(RecordFormat::ListWithHeader)->encodeTabularData($document);
// NDJSON with array list
// ["Name","Score"]
// ["Alice",42]
// ["Bob",27]

$decoded = $builder->decodeTabularData($json);
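
To illustrate how the two tabular layouts relate, the plain-PHP sketch below normalises both into the same associative rows. The tabularRows() helper is hypothetical and is not the package's parser; in particular, treating the first list row as the header when none is supplied is an assumption made for this example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not the package's parser: normalises both NDJSON
 * tabular layouts into associative rows.
 *
 * @return list<array<string, mixed>>
 */
function tabularRows(string $ndjson, array $header = []): array
{
    $rows = [];
    foreach (preg_split('/\R/', trim($ndjson)) as $line) {
        if ($line === '') {
            continue; // nothing to decode on a blank line
        }
        $record = json_decode($line, true, 512, JSON_THROW_ON_ERROR);
        if (!array_is_list($record)) {
            $rows[] = $record; // object row: keys already name the columns
            continue;
        }
        if ($header === []) {
            $header = $record; // assumption: the first list row is the header
            continue;
        }
        $rows[] = array_combine($header, $record);
    }

    return $rows;
}

$objects = "{\"Name\":\"Alice\",\"Score\":42}\n{\"Name\":\"Bob\",\"Score\":27}";
$lists = "[\"Name\",\"Score\"]\n[\"Alice\",42]\n[\"Bob\",27]";
```

Both inputs describe the same table; the list form states the column names once, which avoids repeating the keys on every line.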