glow / robots
Robots.txt parser and generator toolset
Installs: 1 309
Dependents: 0
Suggesters: 0
Security: 0
Stars: 11
Watchers: 6
Forks: 1
Open Issues: 3
Requires
- php: >=5.5.0
- tomverran/robots-txt-checker: 1.02
Requires (Dev)
- codacy/coverage: dev-master
- fzaninotto/faker: ^1.5
- phpunit/phpunit: ^4.8
This package is not auto-updated.
Last update: 2018-06-27 18:59:32 UTC
README
A PHP 5.5 (or greater) toolset for parsing, validating, and generating a robots.txt file.
Installing
The recommended way to install Glow\Robots is to use Composer.
```bash
composer require glow/robots
```
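Once installed, the classes are available through Composer's autoloader. A minimal bootstrap sketch, assuming the default `vendor/` layout:

```php
<?php

// Minimal bootstrap sketch: pull in Composer's autoloader so the
// Glow\Robots classes can be used (assumes the default vendor/ layout).
require __DIR__ . '/vendor/autoload.php';
```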
Usage
Parser
The Parser class parses the robots.txt contents into human-readable arrays. It gracefully skips errors it encounters and, in some cases, fixes them during parsing.
Methods
Method | Visibility | Description |
---|---|---|
__construct | public | Class constructor |
parse | protected | Parse the contents of the robots.txt source |
parseLine | protected | Workhorse method for parsing individual robots.txt lines |
getParsed | public | Returns the parsed contents |
getErrors | public | Returns all of the errors that occurred |
setError | protected | Sets an error |
incrementCounter | protected | Increments counters used throughout the parsing process |
isAllowed | public | Determines whether a URL path is allowed to be crawled |
isDisallowed | public | Determines whether a URL path is not allowed to be crawled |
getTR | public | Returns the tomverran/robots-txt-checker instance |
getElements | public | Returns the elements that are searched for during parsing |
setElements | public | Sets the elements that are searched for during parsing |
getMeta | public | Returns the metadata gathered during parsing |
getSitemaps | public | Returns an array of sitemaps discovered during parsing |
getUserAgentData | public | Returns all of the parsed directives for a specified user agent |
getUserAgentAllow | public | Returns all of the allowed elements for a specified user agent |
getUserAgentDisallow | public | Returns all of the disallowed elements for a specified user agent |
Basic Usage
A basic example of parser usage:
```php
$p = new Glow\Robots\Parser(file_get_contents('http://cnn.com/robots.txt'));
```
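Building on that, a sketch of inspecting the parsed result. The method names come from the table above, but the exact `isAllowed()` signature shown here is an assumption, not something the README confirms:

```php
$p = new Glow\Robots\Parser(file_get_contents('http://cnn.com/robots.txt'));

// Inspect what the parser produced (method names from the table above;
// the isAllowed() parameter shown here is an assumption).
$rules    = $p->getParsed();    // parsed directives as arrays
$errors   = $p->getErrors();    // any issues encountered while parsing
$sitemaps = $p->getSitemaps();  // sitemap URLs discovered in the file

if ($p->isAllowed('/some/path')) {
    // the path may be crawled according to the parsed rules
}
```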
Validate
The Validate class is used to scan the robots.txt contents for errors and confirm that the file is valid.
Methods
Method | Visibility | Description |
---|---|---|
__construct | public | Class constructor |
check | public | Check the source for errors |
Basic Usage
A basic example of validate usage:
```php
$p = new Glow\Robots\Validate(file_get_contents('http://cnn.com/robots.txt'));

if ($p->check() === false) {
    // something is wrong with our robots.txt file
} else {
    // hooray, everything is good with our file!
}
```