innmind/crawler-app

Crawl the web and publish the graph to an API

Type: project

pkg:composer/innmind/crawler-app

1.5.2 2020-10-25 17:28 UTC

This is an app that crawls the web and publishes resource attributes to a Library.

Installation

composer install
docker-compose up -d

Copy config/.env.dist to config/.env and adapt the URL of the AMQP server to your needs.
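
The copy step can be scripted so that it never overwrites a file you have already edited. The sketch below is only an illustration: init_env is a hypothetical helper that does not ship with this project, and it assumes the template lives at config/.env.dist as described above. The name of the setting that holds the AMQP server URL is not shown here; check the template for it.

```shell
#!/bin/sh
# init_env is a hypothetical helper, not part of the project.
# It copies <dir>/.env.dist to <dir>/.env and never overwrites an existing file.
# <dir> defaults to "config", relative to the current directory.
init_env() {
    env_dir="${1:-config}"

    # Without the template there is nothing to copy.
    if [ ! -f "$env_dir/.env.dist" ]; then
        echo "template $env_dir/.env.dist not found" >&2
        return 1
    fi

    # Keep an existing .env so local edits are not lost.
    if [ -f "$env_dir/.env" ]; then
        echo "$env_dir/.env already exists, leaving it untouched"
        return 0
    fi

    cp "$env_dir/.env.dist" "$env_dir/.env"
    echo "created $env_dir/.env, now edit the AMQP server URL in it"
}
```

Called as init_env from the project root, it targets config/; a different directory can be passed as the first argument.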

Usage

bin/crawler consume crawler

This will launch a consumer that reads the URLs to crawl.

bin/console crawl http://the.url/to/crawl https://innmind_library.host/

This will crawl http://the.url/to/crawl, extract the resource attributes and publish them to the library at https://innmind_library.host/. It will automatically detect the API resource to publish to.