wiki-connect / parsewiki
A library that helps parse wikitext template data
2.0
2025-06-29 21:46 UTC
Requires (Dev)
- phpstan/phpstan: ^2.1
- phpunit/phpunit: ^12.2
README
A powerful PHP library for parsing MediaWiki-style content from raw wiki text.
Overview
This library allows you to extract:
- Templates (single, multiple, nested)
- Internal wiki links
- External links
- Citations (references)
- Categories (with or without display text)
Perfect for handling wiki-formatted text in PHP projects.
Project Structure
- ParserTemplates: Parses multiple templates.
- ParserTemplate: Parses a single template.
- ParserInternalLinks: Parses internal wiki links.
- ParserExternalLinks: Parses external links.
- ParserCitations: Parses citations and references.
- ParserCategories: Parses categories from wiki text.
- DataModel classes:
  - Template
  - InternalLink
  - ExternalLink
  - Citation
- tests/: Contains PHPUnit test files:
  - ParserTemplatesTest
  - ParserTemplateTest
  - ParserInternalLinksTest
  - ParserExternalLinksTest
  - ParserCitationsTest
  - ParserCategoriesTest
Features
- Parse single and multiple templates.
- Support nested templates.
- Handle named and unnamed template parameters.
- Extract internal links with or without display text.
- Extract external links with or without labels.
- Parse citations, including attributes and special characters.
- Parse categories, with support for custom namespaces, whitespace, and special characters.
- Full PHPUnit test coverage.
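Nested templates are the case where a single regular expression tends to fall short, because the closing braces of an inner template would end the match too early. The sketch below illustrates one common approach, tracking brace depth while scanning the text. It is a standalone illustration only: `findTopLevelTemplates` is a hypothetical helper, not part of this library, and the library's own parsers may work differently.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of wiki-connect/parsewiki).
 * Returns each top-level {{...}} span found in $text; nested templates
 * stay inside the span of their parent. Unbalanced openers are ignored.
 *
 * @return string[]
 */
function findTopLevelTemplates(string $text): array
{
    $templates = [];
    $depth = 0;   // how many "{{" are currently open
    $start = 0;   // offset of the outermost open "{{"
    $length = strlen($text);

    for ($i = 0; $i < $length - 1; $i++) {
        $pair = substr($text, $i, 2);

        if ($pair === '{{') {
            if ($depth === 0) {
                $start = $i;
            }
            $depth++;
            $i++; // skip the second brace of the pair
        } elseif ($pair === '}}' && $depth > 0) {
            $depth--;
            $i++; // skip the second brace of the pair
            if ($depth === 0) {
                // $i now points at the last "}" of the outermost template
                $templates[] = substr($text, $start, $i - $start + 1);
            }
        }
    }

    return $templates;
}

$text = 'Intro {{Infobox|name={{Nickname|Jo}}|age=30}} and {{Stub}}.';
print_r(findTopLevelTemplates($text));
// Two entries: '{{Infobox|name={{Nickname|Jo}}|age=30}}' and '{{Stub}}'
```

The inner `{{Nickname|Jo}}` is kept as part of the `Infobox` span rather than reported separately, which is the behavior a nested-template parser needs before it can split a template into name and parameters.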
Requirements
- PHP 8.0 or higher
- PHPUnit 9 or higher
Installation
composer require wiki-connect/parsewiki
Make sure Composer's autoloader (vendor/autoload.php) is included in your entry script so that classes in the WikiConnect\ParseWiki namespace are loaded via PSR-4.
Running Tests
vendor/bin/phpunit tests
Test Coverage:
- Templates: Single, multiple, nested, named/unnamed parameters.
- Internal Links: Simple, with display text, special characters.
- External Links: With/without labels, multiple links, whitespace handling.
- Citations: With/without attributes, special characters.
- Categories: Simple, with display text, custom namespaces, whitespace, special characters.
Example Usage
Parsing Templates
    use WikiConnect\ParseWiki\ParserTemplates;

    $text = '{{Infobox person|name=John Doe|birth_date=1990-01-01}}';
    $parser = new ParserTemplates($text);
    $templates = $parser->getTemplates();

    // Print each template's name followed by its parameters
    foreach ($templates as $template) {
        echo $template->getName() . PHP_EOL;
        print_r($template->getParameters());
    }
Parsing Internal Links
    use WikiConnect\ParseWiki\ParserInternalLinks;

    $text = 'See [[Main Page|the main page]] and [[Help]].';
    $parser = new ParserInternalLinks($text);
    $links = $parser->getTargets();

    foreach ($links as $link) {
        echo 'Target: ' . $link->getTarget() . PHP_EOL;
        // Fall back to the target when getText() returns null
        echo 'Text: ' . ($link->getText() ?? $link->getTarget()) . PHP_EOL;
    }
Parsing External Links
    use WikiConnect\ParseWiki\ParserExternalLinks;

    $text = 'Visit [https://example.com Example Site] and [https://nolabel.com].';
    $parser = new ParserExternalLinks($text);
    $links = $parser->getLinks();

    foreach ($links as $link) {
        echo 'URL: ' . $link->getLink() . PHP_EOL;
        // Print a placeholder when the label is empty
        echo 'Label: ' . ($link->getText() ?: 'No label') . PHP_EOL;
    }
Parsing Citations
    use WikiConnect\ParseWiki\ParserCitations;

    $text = 'Some text with a citation.<ref name="source">This is a citation</ref>';
    $parser = new ParserCitations($text);
    $citations = $parser->getCitations();

    foreach ($citations as $citation) {
        echo 'Content: ' . $citation->getContent() . PHP_EOL;
        echo 'Attributes: ' . $citation->getAttributes() . PHP_EOL;
    }
Parsing Categories
    use WikiConnect\ParseWiki\ParserCategories;

    $text = 'Some text [[Category:Science]] and [[Category:Math|Displayed]].';
    $parser = new ParserCategories($text);
    $categories = $parser->getCategories();

    foreach ($categories as $category) {
        echo 'Category: ' . $category . PHP_EOL;
    }
Author
Developed with ❤️ by Gerges.