README

🚀 Laravel Vector Indexer

AI-Powered Search for Text, Images, Documents, Audio & Video

Semantic search across all your content - powered by OpenAI Vision, Whisper & Embeddings

Features • Installation • Quick Start • Documentation • Examples

🎯 What is This?

Laravel Vector Indexer adds the following to your Laravel application:

🔍 Semantic Text Search - Search by meaning, not just keywords
📸 Image Search - Find images by visual content using GPT-4 Vision
📄 Document Search - Search PDFs, DOCX, and more by content
🎙️ Audio Search - Transcribe and search audio with Whisper
🎬 Video Search - Search videos by audio + visual content
🤖 AI Chat with RAG - ChatGPT-style conversations with your data
🔐 Enterprise Security - Row-level security and permission-based access
🏢 Multi-Tenant Ready - Perfect for SaaS applications
⚡ Production Ready - Queue support, Horizon integration, caching

⚡ Before & After

Each pair below shows ❌ Traditional Search first, then the ✅ With Vector Indexer equivalent.

// Exact keyword matching only
$posts = Post::where('title', 'LIKE', '%AI%')
    ->orWhere('content', 'LIKE', '%AI%')
    ->get();

// Returns: Only posts with "AI"
// Misses: "artificial intelligence",
//         "machine learning", etc.

// Semantic understanding
$posts = Post::vectorSearch(
    'artificial intelligence'
);

// Returns: AI, ML, neural networks,
//          deep learning, GPT, etc.
// Understands context & meaning!

// Audio content not searchable
$podcasts = Podcast::where(
    'title', 'LIKE', '%climate%'
)->get();

// Can't search audio content
// Manual transcription needed

// Search transcribed audio
$podcasts = Podcast::vectorSearch(
    'climate change discussion'
);

// Automatically transcribes audio
// Searches across content!

// Can't search images
$products = Product::where(
    'name', 'LIKE', '%laptop%'
)->get();

// Only searches text fields
// Visual content ignored

// Search by visual content
$products = Product::vectorSearch(
    'red laptop with backlit keyboard'
);

// Searches images, text, PDFs, videos
// Multi-modal AI search!

🎯 Supported File Types

Type	Formats	Count
📸 Images	JPG, PNG, GIF, WEBP, SVG, HEIC, etc.	9 formats
📄 Documents	PDF, DOCX, TXT, CSV, XLSX, PPT, etc.	11 formats
🎵 Audio	MP3, WAV, OGG, FLAC, M4A, AAC, etc.	7 formats
🎬 Video	MP4, AVI, MOV, MKV, WEBM, etc.	8 formats

Total: 35+ file formats with AI-powered search!

📑 Table of Contents

✨ Features
📦 Installation
🚀 Quick Start
📸 Media Embedding ← NEW!
🎙️ Speech-to-Text
🤖 AI Chat with RAG
🔐 Security & Authorization
⚙️ Commands
🔧 Configuration
🎨 Advanced Usage
🏢 Multi-Tenant & SaaS
⚡ Performance
📚 Documentation
💡 Use Cases
🔍 Real-World Examples
❓ FAQ
🐛 Troubleshooting

✨ Features

🔍 Semantic Search

Natural Language - Search by meaning, not keywords
Relevance Scoring - Results ranked by similarity
Find Similar - Discover related content
Multi-Field - Search across multiple fields
Relationship Support - Index related data

📸 Media Embedding

Image Search - GPT-4 Vision describes images
Document Search - PDF, DOCX, TXT, CSV, etc.
Audio Search - Whisper transcription
Video Search - Audio + visual content
35+ Formats - All major file types

🔐 Enterprise Security

Row-Level Security - User-specific data filtering
Spatie Permission - Role & permission integration
Multi-Tenant - Organization isolation
Audit Logging - Track all operations
Custom Gates - Fine-grained control

⚡ Performance

Queue Support - Async processing
Smart Caching - Reduce API calls
Batch Operations - Process in chunks
N+1 Prevention - Intelligent eager loading
Optimized - <100ms search times

🤖 Automation

Auto-Analysis - Suggests optimal config
Real-Time Indexing - Auto-index on save
Relationship Tracking - Auto-reindex related
Duplicate Prevention - Smart deduplication
Status Monitoring - Track progress

🎨 Developer Experience

Simple API - Easy to use traits
Artisan Commands - CLI for everything
Comprehensive Docs - Detailed guides
Examples - Real-world use cases
Laravel 9/10/11 - Full compatibility

🤖 AI & Chat

RAG Support - Retrieval Augmented Generation
Context-Aware AI - AI knows YOUR data
laravel-ai-support - Seamless integration
Conversation Memory - Multi-turn conversations
Source Citations - Traceable responses

📦 Installation

Quick Install

1. Install the Package

# Add the repository to composer.json
composer config repositories.bites-vector-indexer vcs https://github.com/bites-development/laravel-vector-indexer.git

# Install the package
composer require bites/laravel-vector-indexer

# Or, to install from the main development branch
composer require bites/laravel-vector-indexer:dev-main

📖 For detailed installation instructions, see INSTALLATION.md

2. Publish Config (Optional)

The package config has sensible defaults. Only publish if you need to customize settings:

# Publish config (optional)
php artisan vendor:publish --tag=vector-indexer-config

Note: Migrations are auto-loaded from the package. Do NOT publish them unless you need to modify the schema.

3. Run Migrations

php artisan migrate

4. Configure Environment

Add to your .env:

# OpenAI Configuration
OPENAI_API_KEY=your-openai-api-key
OPENAI_EMBEDDING_MODEL=text-embedding-3-large

# Qdrant Configuration
QDRANT_HOST=http://localhost:6333
QDRANT_API_KEY=your-qdrant-api-key  # Optional

# Queue Configuration (optional)
VECTOR_QUEUE_NAME=vector-indexing

🚀 Quick Start

⚡ 5-Minute Setup

1. Add Traits to Your Model

use Bites\VectorIndexer\Traits\Vectorizable;
use Bites\VectorIndexer\Traits\HasVectorSearch;

class Post extends Model
{
    use Vectorizable, HasVectorSearch;
    
    // Your model code...
}

2. Generate Configuration

# Analyze model and generate config
php artisan vector:analyze "App\Models\Post"
php artisan vector:generate-config "App\Models\Post"

3. Start Watching for Changes

# Enable auto-indexing on model changes
php artisan vector:watch "App\Models\Post"

4. Index Existing Records

# Index all existing records
php artisan vector:index "App\Models\Post"

# Or with queue
php artisan vector:index "App\Models\Post" --queue

5. Search!

// Simple search
$posts = Post::vectorSearch("Laravel best practices");

// With filters
$posts = Post::vectorSearch("Laravel", filters: [
    'status' => 'published',
    'author_id' => 123
]);

// With limit and threshold
$posts = Post::vectorSearch("Laravel", limit: 5, threshold: 0.7);

// Find similar
$similar = $post->findSimilar(limit: 10);

// With user context (for permission checks)
$posts = Post::vectorSearch("Laravel", user: $request->user());
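
The limit and threshold arguments are easiest to reason about in terms of similarity scores. The sketch below is plain PHP and does not call the package. It assumes results are ranked by cosine similarity between a query embedding and each record's embedding, that threshold is a minimum score, and that limit caps the number of results. The function names and the toy 3-dimensional vectors are hypothetical; in the package, embeddings come from OpenAI and ranking happens in Qdrant.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity between two equal-length numeric vectors.
 */
function cosineSimilarity(array $a, array $b): float
{
    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and the same length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    // A zero vector has no direction, so treat it as "not similar".
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Score every record, drop scores below the threshold, return the best $limit.
 *
 * @param array<string, float[]> $records id => embedding (hypothetical shape)
 * @return array<string, float> id => score, highest first
 */
function rankBySimilarity(array $query, array $records, int $limit = 10, float $threshold = 0.0): array
{
    $scores = [];

    foreach ($records as $id => $embedding) {
        $score = cosineSimilarity($query, $embedding);
        if ($score >= $threshold) {
            $scores[$id] = $score;
        }
    }

    arsort($scores); // sort by score, descending, keeping ids as keys

    return array_slice($scores, 0, $limit, true);
}

$records = [
    'post-1' => [0.9, 0.1, 0.0],
    'post-2' => [0.0, 1.0, 0.0],
    'post-3' => [0.7, 0.7, 0.1],
];

$results = rankBySimilarity([1.0, 0.0, 0.0], $records, limit: 5, threshold: 0.7);

print_r(array_keys($results));
```

Raising the threshold trades recall for precision: here post-3 scores just above 0.7 and is kept, while post-2 is orthogonal to the query and is dropped.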

🔐 Security & Authorization

Setup Permissions

# Install Spatie Permission (if not already installed)
composer require spatie/laravel-permission

# Setup vector permissions
php artisan vector:setup-permissions --role=admin

Permission-Based Search

The package integrates seamlessly with Spatie Permission:

// Give user permission to search
$user->givePermissionTo('view posts');

// Search automatically checks permissions
$results = Post::vectorSearch('query', user: $user);

Row-Level Security

Implement user-specific filtering in your models:

class Post extends Model
{
    use Vectorizable, HasVectorSearch;
    
    protected static function applyUserFilters(array $filters, ?object $user = null): array
    {
        $user = $user ?? auth()->user();
        
        if (!$user) {
            return $filters;
        }
        
        // Users only see their organization's posts
        if (!$user->hasRole('admin')) {
            $filters['organization_id'] = $user->organization_id;
        }
        
        return $filters;
    }
}

// Now searches automatically filter by user's organization
$results = Post::vectorSearch('query');
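
To see what the filter logic above does to a caller's filters, here is a standalone sketch of the same rules in plain PHP. StubUser is a hypothetical stand-in for an authenticated user with Spatie's hasRole(), and the free function mirrors the model method; how and when the package invokes applyUserFilters() is not shown in this README, so treat the call site as an assumption.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for an authenticated user; not part of the package.
final class StubUser
{
    public function __construct(
        public int $organization_id,
        private array $roles = [],
    ) {
    }

    public function hasRole(string $role): bool
    {
        return in_array($role, $this->roles, true);
    }
}

// Same rules as the applyUserFilters() model method, as a free function.
function applyUserFilters(array $filters, ?StubUser $user = null): array
{
    if (!$user) {
        return $filters;
    }

    if (!$user->hasRole('admin')) {
        // Assigning the key means a caller-supplied organization_id
        // cannot override the enforced one.
        $filters['organization_id'] = $user->organization_id;
    }

    return $filters;
}

$member = new StubUser(organization_id: 7);
$admin = new StubUser(organization_id: 7, roles: ['admin']);

// A member asking for another organization's data is pinned to their own.
$memberFilters = applyUserFilters(['status' => 'published', 'organization_id' => 99], $member);

// An admin's filters pass through unchanged.
$adminFilters = applyUserFilters(['status' => 'published'], $admin);

var_dump($memberFilters, $adminFilters);
```

Note that the guest branch returns the filters unchanged, so a call without a user is not restricted by this method.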

See SPATIE_INTEGRATION.md and ROW_LEVEL_SECURITY.md for complete guides.

📸 Media Embedding (Images, Documents, Audio, Video)

🎯 Search Across All Content Types

Not just text! Search images, PDFs, audio, and video using AI-powered embeddings.

🚀 Quick Start

1. Add the trait:

use Bites\VectorIndexer\Traits\HasMediaEmbeddings;

class Product extends Model
{
    use Vectorizable, HasVectorSearch, HasMediaEmbeddings;
    
    public function getMediaFilesForEmbedding(): array
    {
        return [
            storage_path('app/products/' . $this->image),
            storage_path('app/brochures/' . $this->pdf),
            storage_path('app/videos/' . $this->demo_video),
        ];
    }
}

2. Install dependencies:

# For PDF support
sudo apt-get install poppler-utils

# For video/audio support
sudo apt-get install ffmpeg

3. Test media embedding:

# Test an image
php artisan vector:test-media storage/app/products/laptop.jpg

# Test a document
php artisan vector:test-media storage/app/docs/contract.pdf

# Test audio
php artisan vector:test-media storage/app/audio/podcast.mp3

# Test video
php artisan vector:test-media storage/app/videos/tutorial.mp4

4. Index and search:

php artisan vector:index "App\Models\Product"

// Search by visual content
$products = Product::vectorSearch('red laptop with backlit keyboard');

// Search by document content
$products = Product::vectorSearch('warranty 2 years international');

// Search by video content
$products = Product::vectorSearch('unboxing tutorial');

📊 How It Works

File Type	Process	API Used
📸 Images	Describe image → Embed description	GPT-4 Vision
📄 Documents	Extract text → Embed text	Text extraction
🎵 Audio	Transcribe → Embed transcription	Whisper
🎬 Video	Extract audio + frames → Transcribe + Describe → Combine	Whisper + Vision
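
The table above amounts to a dispatch on file type. A minimal sketch of that idea, assuming the type is decided from the file extension (the package may inspect MIME types instead), could look like the following. The function name and the extension lists are illustrative and cover only some of the formats listed earlier.

```php
<?php

declare(strict_types=1);

/**
 * Map a file path to the processing category from the table above.
 * Hypothetical helper; not the package's API.
 */
function mediaCategoryFor(string $path): string
{
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    return match ($extension) {
        // Describe the image, then embed the description
        'jpg', 'jpeg', 'png', 'gif', 'webp' => 'image',
        // Extract text, then embed the text
        'pdf', 'docx', 'txt', 'csv' => 'document',
        // Transcribe, then embed the transcription
        'mp3', 'wav', 'ogg', 'flac', 'm4a' => 'audio',
        // Transcribe the audio track and describe frames, then combine
        'mp4', 'avi', 'mov', 'mkv', 'webm' => 'video',
        default => throw new InvalidArgumentException("Unsupported file type: .{$extension}"),
    };
}

foreach (['laptop.jpg', 'contract.pdf', 'podcast.mp3', 'tutorial.mp4'] as $file) {
    echo $file, ' => ', mediaCategoryFor($file), "\n";
}
```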

💡 Use Cases

E-Commerce: Search products by image ("blue running shoes")
Document Management: Search PDFs by content ("privacy policy")
Video Platform: Search videos by transcript ("Laravel tutorial")
Podcast Platform: Search episodes by content ("startup founders")
Asset Management: Find any file by its content

📖 Full Documentation

See MEDIA_EMBEDDING_GUIDE.md for:

Detailed usage instructions
Performance optimization
Cost estimates
Troubleshooting
Advanced examples

🎙️ Speech-to-Text (Audio Transcription)

🎯 Why Use STT?

Transform audio content into searchable text:

Podcasts - Search across episode transcriptions
Voicemails - Find specific messages instantly
Meetings - Search recorded discussions
Videos - Index video audio tracks
Customer Calls - Compliance and quality monitoring

⚡ Quick Setup

Add the trait to models with audio fields:

use Bites\VectorIndexer\Traits\HasAudioTranscription;

class Podcast extends Model
{
    use Vectorizable, HasVectorSearch, HasAudioTranscription;
    
    protected $audioField = 'audio_file';
    protected $transcriptionField = 'transcription';
}

Usage

// Transcribe audio
$transcription = $podcast->transcribeAudio();

// With timestamps
$result = $podcast->transcribeWithTimestamps();

// Translate to English
$englishText = $podcast->translateAudio();

// Search transcribed audio
$results = Podcast::vectorSearch('machine learning discussion');

Supported formats: MP3, M4A, WAV, WEBM, OGG, FLAC, MP4, MPEG
Cost: $0.006 per minute
Languages: 50+ languages supported
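
The per-minute price above makes transcription cost easy to estimate before indexing a library of audio. This is a rough sketch that assumes cost scales linearly with duration; the constant and function name are hypothetical, and actual billing granularity may differ.

```php
<?php

declare(strict_types=1);

// Price quoted in this README for audio transcription.
const TRANSCRIPTION_USD_PER_MINUTE = 0.006;

/**
 * Estimate transcription cost in USD for an audio duration in seconds.
 */
function estimateTranscriptionCost(float $durationSeconds): float
{
    if ($durationSeconds < 0) {
        throw new InvalidArgumentException('Duration cannot be negative.');
    }

    return round(($durationSeconds / 60) * TRANSCRIPTION_USD_PER_MINUTE, 4);
}

printf("60-minute episode: $%.2f\n", estimateTranscriptionCost(3600));
printf("90-second voicemail: $%.4f\n", estimateTranscriptionCost(90));
```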

See STT_INTEGRATION.md for the complete guide.

🤖 AI Chat with RAG (Retrieval Augmented Generation)

🎯 What is RAG?

Combine your vector search with AI chat for intelligent, context-aware conversations:

Your Data - AI knows about YOUR content, not just general knowledge
Source Citations - Responses reference specific records
Permission-Aware - Users only see authorized data

⚡ Quick Setup

1. Install laravel-ai-engine:

composer require m-tech-stack/laravel-ai-engine

2. Add trait to your model:

use Bites\VectorIndexer\Traits\HasVectorChat;

class Post extends Model
{
    use Vectorizable, HasVectorSearch, HasVectorChat;
}

3. Chat with AI:

use MTechStack\AiEngine\Facades\AiEngine;

// Get context from your data
$ragData = Post::getRAGContext('What are the latest features?', auth()->user());

// AI responds using YOUR content
$response = AiEngine::conversation('user-123')
    ->systemPrompt($ragData['system_prompt'])
    ->generate('What are the latest features?');

echo $response->content;
// "Based on your posts, the latest features include..."

💡 Real-World Example

// Customer support bot with context
$ragData = Ticket::getRAGContext(
    query: 'How do I reset my password?',
    user: auth()->user(),
    options: ['filters' => ['status' => 'resolved']]
);

$response = AiEngine::conversation($userId)
    ->systemPrompt($ragData['system_prompt'])
    ->generate('How do I reset my password?');

// AI answers using resolved tickets as examples
// "According to Source 1 (Ticket #123), you can reset your password by..."
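
The examples above use $ragData['system_prompt'] without showing what it contains. As an illustration only, the sketch below shows one way a prompt with numbered, citable sources could be assembled from retrieved records. The function name, the input shape, and the wording of the prompt are all assumptions; the package's real prompt format is documented in RAG_INTEGRATION.md.

```php
<?php

declare(strict_types=1);

/**
 * Build a system prompt that lists retrieved records as numbered sources.
 *
 * @param array<int, array{label: string, text: string}> $sources hypothetical shape
 */
function buildSystemPrompt(array $sources): string
{
    if ($sources === []) {
        return 'No relevant records were found. Say so instead of guessing.';
    }

    $lines = ['Answer using only the sources below and cite them by number.', ''];

    foreach (array_values($sources) as $index => $source) {
        $lines[] = sprintf('Source %d (%s): %s', $index + 1, $source['label'], $source['text']);
    }

    return implode("\n", $lines);
}

$prompt = buildSystemPrompt([
    ['label' => 'Ticket #123', 'text' => 'Use the "Forgot password" link on the login page.'],
    ['label' => 'Ticket #87', 'text' => 'Reset emails can take a few minutes to arrive.'],
]);

echo $prompt, "\n";
```

Numbering the sources in the prompt is what lets the model produce answers such as "According to Source 1 (Ticket #123)".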

See RAG_INTEGRATION.md for the complete guide.

⚙️ Commands

📋 All Available Commands

Analyze Model

php artisan vector:analyze "App\Models\Post"

Analyzes model structure and suggests configuration.

Generate Configuration

php artisan vector:generate-config "App\Models\Post"

Creates vector indexing configuration for the model.

Watch Model

php artisan vector:watch "App\Models\Post"

Enables auto-indexing on model changes.

Unwatch Model

php artisan vector:unwatch "App\Models\Post"

Disables auto-indexing.

Index Records

# Synchronous
php artisan vector:index "App\Models\Post"

# With queue
php artisan vector:index "App\Models\Post" --queue

# Specific IDs
php artisan vector:index "App\Models\Post" --ids=1,2,3

# Force re-index
php artisan vector:index "App\Models\Post" --force

Check Status

php artisan vector:status "App\Models\Post"

Setup Permissions

php artisan vector:setup-permissions --role=admin

Creates all required permissions and assigns them to a role.

Test Speech-to-Text

php artisan vector:test-stt /path/to/audio.mp3
php artisan vector:test-stt --url "https://example.com/audio.mp3"
php artisan vector:test-stt audio.mp3 --storage --timestamps

Test audio transcription with detailed output and cost estimation.

Test RAG Integration

php artisan vector:test-rag "App\Models\Post" "What are the latest features?"
php artisan vector:test-rag "App\Models\Post" "Laravel tips" --chat
php artisan vector:test-rag "App\Models\Document" "contracts" --user=1

Test RAG (Retrieval Augmented Generation) with vector context and AI chat.

🔧 Configuration

The config/vector-indexer.php file contains all configuration options:

return [
    // OpenAI settings
    'openai' => [
        'api_key' => env('OPENAI_API_KEY'),
        'model' => env('OPENAI_EMBEDDING_MODEL', 'text-embedding-3-large'),
        'dimensions' => 3072,
    ],

    // Qdrant settings
    'qdrant' => [
        'host' => env('QDRANT_HOST', 'http://localhost:6333'),
        'api_key' => env('QDRANT_API_KEY'),
    ],

    // Queue settings
    'queue' => [
        'enabled' => env('VECTOR_QUEUE_ENABLED', true),
        'queue_name' => env('VECTOR_QUEUE_NAME', 'vector-indexing'),
    ],

    // Chunking settings
    'chunking' => [
        'max_chunk_size' => 1000,
        'overlap' => 100,
    ],
    
    // Authorization settings
    'authorization' => [
        'enabled' => env('VECTOR_AUTH_ENABLED', true),
        'admin_roles' => ['super-admin', 'admin', 'developer'],
        'allow_console_without_auth' => env('VECTOR_ALLOW_CONSOLE', false),
        'allow_authenticated_search' => env('VECTOR_ALLOW_AUTH_SEARCH', true),
    ],
];
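
The chunking settings control how long text is split before embedding. The sketch below shows overlapping chunking in plain PHP, assuming max_chunk_size and overlap are measured in characters (the README does not state the unit; they could equally be tokens). The function name is hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Split text into chunks of at most $maxChunkSize characters, where each
 * chunk repeats the last $overlap characters of the previous one.
 *
 * @return string[]
 */
function chunkText(string $text, int $maxChunkSize = 1000, int $overlap = 100): array
{
    if ($maxChunkSize < 1 || $overlap < 0 || $overlap >= $maxChunkSize) {
        throw new InvalidArgumentException('Require 0 <= overlap < maxChunkSize.');
    }

    $length = mb_strlen($text);
    $step = $maxChunkSize - $overlap; // how far each new chunk advances
    $chunks = [];

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = mb_substr($text, $start, $maxChunkSize);

        // Stop once the current chunk reaches the end of the text.
        if ($start + $maxChunkSize >= $length) {
            break;
        }
    }

    return $chunks;
}

// Small sizes make the overlap visible: each chunk starts with the
// last character of the previous one.
print_r(chunkText('abcdefghij', 4, 1));
```

The overlap keeps a sentence that straddles a chunk boundary from being lost to both chunks, at the cost of embedding some text twice.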

🎨 Advanced Usage

Custom Field Weights

// In your VectorConfiguration
'fields' => [
    'title' => ['weight' => 3],      // Higher weight = more important
    'content' => ['weight' => 1],
    'excerpt' => ['weight' => 2],
]
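
The README does not say how weights are applied internally. One plausible reading, shown here purely as an assumption, is a weighted average of per-field similarity scores, so that a match in a weight-3 field counts three times as much as a match in a weight-1 field. The function name and the score values are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Combine per-field similarity scores into one score using field weights.
 * Fields without a configured weight default to 1.
 *
 * @param array<string, float> $fieldScores field => similarity score
 * @param array<string, array{weight: int|float}> $fieldConfig same shape as the 'fields' config
 */
function weightedScore(array $fieldScores, array $fieldConfig): float
{
    $weightedSum = 0.0;
    $totalWeight = 0.0;

    foreach ($fieldScores as $field => $score) {
        $weight = (float) ($fieldConfig[$field]['weight'] ?? 1);
        $weightedSum += $weight * $score;
        $totalWeight += $weight;
    }

    return $totalWeight > 0.0 ? $weightedSum / $totalWeight : 0.0;
}

$fields = [
    'title' => ['weight' => 3],
    'content' => ['weight' => 1],
    'excerpt' => ['weight' => 2],
];

// (3 * 0.9 + 1 * 0.3 + 2 * 0.6) / 6 = 0.7
echo weightedScore(['title' => 0.9, 'content' => 0.3, 'excerpt' => 0.6], $fields), "\n";
```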

Relationship Indexing

The package automatically indexes relationships:

// Post model with relationships
public function author() { return $this->belongsTo(User::class); }
public function tags() { return $this->belongsToMany(Tag::class); }
public function comments() { return $this->hasMany(Comment::class); }

// All relationships are automatically indexed!

Search with Filters

$posts = Post::vectorSearch("Laravel tutorials", filters: [
    'status' => 'published',
    'author_id' => $userId,
    'created_at' => ['gte' => now()->subDays(30)]
]);
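
The filter array mixes exact matches with a range condition. To make the semantics concrete, here is a small in-memory sketch of how such filters could be evaluated against a record's stored fields. Only 'gte' appears in this README; the 'lte' operator, the function name, and the use of Unix timestamps are assumptions for illustration, and the package itself delegates filtering to Qdrant.

```php
<?php

declare(strict_types=1);

/**
 * Return true when a record satisfies every filter.
 * Scalars mean "equals"; arrays map an operator to a bound.
 */
function matchesFilters(array $record, array $filters): bool
{
    foreach ($filters as $field => $condition) {
        if (!array_key_exists($field, $record)) {
            return false;
        }

        $value = $record[$field];

        if (!is_array($condition)) {
            if ($value !== $condition) {
                return false;
            }
            continue;
        }

        foreach ($condition as $operator => $bound) {
            $passes = match ($operator) {
                'gte' => $value >= $bound,
                'lte' => $value <= $bound, // assumed operator, not shown in the README
                default => throw new InvalidArgumentException("Unsupported operator: {$operator}"),
            };

            if (!$passes) {
                return false;
            }
        }
    }

    return true;
}

$posts = [
    ['id' => 1, 'status' => 'published', 'created_at' => 1700000000],
    ['id' => 2, 'status' => 'draft', 'created_at' => 1700000000],
    ['id' => 3, 'status' => 'published', 'created_at' => 1600000000],
];

$filters = ['status' => 'published', 'created_at' => ['gte' => 1650000000]];

$matching = array_values(array_filter(
    $posts,
    fn (array $post): bool => matchesFilters($post, $filters)
));

print_r(array_column($matching, 'id'));
```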

Batch Processing

# Process in batches
php artisan vector:index "App\Models\Post" --batch=50
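
Conceptually, batching means splitting the set of record IDs into fixed-size groups and handling one group at a time, which bounds memory use and the size of each API request. A minimal sketch, assuming the --batch value is the number of records per group (the function name and handler are hypothetical):

```php
<?php

declare(strict_types=1);

/**
 * Pass IDs to $handler in groups of $batchSize and return the number of batches.
 */
function processInBatches(array $ids, int $batchSize, callable $handler): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $batches = array_chunk($ids, $batchSize);

    foreach ($batches as $batch) {
        $handler($batch); // e.g. embed and upsert this group of records
    }

    return count($batches);
}

$sizes = [];

$batchCount = processInBatches(range(1, 120), 50, function (array $batch) use (&$sizes): void {
    $sizes[] = count($batch);
});

echo $batchCount, ' batches: ', implode(', ', $sizes), "\n";
```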

Queue Configuration

For production, configure Laravel Horizon:

// config/horizon.php
'vector-supervisor' => [
    'connection' => 'redis',
    'queue' => ['vector-indexing'],
    'balance' => 'auto',
    'maxProcesses' => 5,
    'memory' => 256,
    'timeout' => 300,
    'tries' => 3,
],

Testing

composer test

Module Indexing

The package automatically indexes module names for better organization:

// For module models: Bites\Modules\MailBox\Models\EmailCache
// Module: "MailBox"

// For app models: App\Models\User
// Module: "users" (table name fallback)

// Search with module filter
$results = EmailCache::vectorSearch("budget", filters: [
    'module' => 'MailBox'
]);
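
The naming rule described in the comments above can be sketched as follows: take the namespace segment after "Modules" when there is one, otherwise fall back to the table name. This is an illustration of the rule as stated, not the package's implementation, and the function name is hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Derive a module name from a model class, falling back to its table name.
 */
function resolveModuleName(string $modelClass, string $tableName): string
{
    $segments = explode('\\', $modelClass);
    $index = array_search('Modules', $segments, true);

    // Use the segment right after "Modules", e.g. Bites\Modules\MailBox\...
    if ($index !== false && isset($segments[$index + 1])) {
        return $segments[$index + 1];
    }

    return $tableName;
}

echo resolveModuleName('Bites\Modules\MailBox\Models\EmailCache', 'email_caches'), "\n";
echo resolveModuleName('App\Models\User', 'users'), "\n";
```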

API Controller Example

use Illuminate\Http\Request;
use App\Models\Post;

class SearchController extends Controller
{
    public function search(Request $request)
    {
        try {
            $results = Post::vectorSearch(
                // $request->query is the query-string bag, so read the field with input()
                query: $request->input('query'),
                limit: $request->limit ?? 20,
                user: $request->user() // Automatic permission check
            );
            
            return response()->json([
                'results' => $results,
                'count' => $results->count(),
            ]);
            
        } catch (\Illuminate\Auth\Access\AuthorizationException $e) {
            return response()->json([
                'error' => 'Unauthorized',
                'message' => $e->getMessage(),
            ], 403);
        }
    }
}

🏢 Multi-Tenant & SaaS Applications

Perfect for multi-tenant applications with automatic data isolation:

class Document extends Model
{
    use Vectorizable, HasVectorSearch;
    
    protected static function applyUserFilters(array $filters, ?object $user = null): array
    {
        $user = $user ?? auth()->user();
        
        if (!$user) {
            return $filters;
        }
        
        // Isolate by workspace
        $filters['workspace_id'] = $user->workspace_id;
        
        return $filters;
    }
}

// Searches automatically filtered to user's workspace
$results = Document::vectorSearch('contract');

⚡ Performance

Memory Usage

8 users: ~40MB (includes Laravel overhead)
100 users: ~50-60MB (scales efficiently)
Optimized: Batch processing and caching

Speed

Indexing: ~6 seconds per record (with embeddings)
Search: <100ms for most queries
Queue: Async processing for large datasets

Optimization Tips

Use Queue Workers for production
Reduce Relationships - Limit depth to 1-2 levels
Cache Accessible IDs for permission checks
Batch Processing - Use --batch flag

Requirements

PHP 8.1+
Laravel 9.x, 10.x, or 11.x
OpenAI API Key
Qdrant instance (local or cloud)
Spatie Permission (optional, for authorization)

📚 Documentation

📖 Complete Guides

QUICK_START.md - Get started in 5 minutes
SPATIE_INTEGRATION.md - Spatie Permission integration
ROW_LEVEL_SECURITY.md - User-specific data filtering
SEARCH_AUTHORIZATION.md - Search permission control
AUTHORIZATION.md - Complete authorization guide
SECURITY_SETUP.md - Security quick setup
PERMISSION_QUICK_REFERENCE.md - Permission cheat sheet
PUBLISHING_GUIDE.md - Asset publishing guide
STT_INTEGRATION.md - Speech-to-Text integration guide
RAG_INTEGRATION.md - 🆕 AI Chat with RAG (laravel-ai-support)

Examples

examples/UserModelWithFilters.php - User model with filtering
examples/RowLevelSecurity.php - 7 real-world RLS examples
examples/PodcastWithSTT.php - Podcast with audio transcription
examples/RAGChatExample.php - 🆕 15 RAG chat examples

💡 Use Cases

✅ Content Management

Search articles, posts, and pages
Find similar content
Multi-language support

✅ E-commerce

Product search with semantic understanding
"Find similar products"
Customer support ticket search

✅ SaaS Applications

Multi-tenant data isolation
Workspace-specific search
Organization-level filtering

✅ Healthcare

HIPAA-compliant patient record search
Doctor-patient relationship filtering
Department-level access control

✅ Document Management

Semantic document search
Shared folder access
Time-based access control

✅ Customer Support

Ticket search and categorization
Knowledge base search
Similar issue detection

✅ Media & Podcasts

Transcribe audio/video content
Search across podcast transcriptions
Multi-language content support
Timestamp-based navigation

✅ Voicemail & Call Centers

Transcribe voicemails automatically
Search call recordings
Sentiment analysis on transcriptions
Compliance and quality monitoring

🔍 Real-World Examples

Example 1: E-Commerce Product Search

class Product extends Model
{
    use Vectorizable, HasVectorSearch;
}

// Traditional search - limited
$products = Product::where('name', 'LIKE', '%laptop%')->get();

// Semantic search - understands intent
$products = Product::vectorSearch('portable computer for programming');
// Returns: laptops, notebooks, ultrabooks, workstations, etc.

// Find similar products
$similar = $product->findSimilar(limit: 5);

Example 2: Multi-Tenant SaaS

class Document extends Model
{
    use Vectorizable, HasVectorSearch;
    
    protected static function applyUserFilters(array $filters, ?object $user = null): array
    {
        $user = $user ?? auth()->user();
        
        // Automatic workspace isolation
        $filters['workspace_id'] = $user->workspace_id;
        
        // Department-level access
        if (!$user->hasRole('admin')) {
            $filters['department_id'] = $user->department_id;
        }
        
        return $filters;
    }
}

// Users only see their workspace documents
$results = Document::vectorSearch('contract agreement', user: auth()->user());

Example 3: Podcast Platform

class Podcast extends Model
{
    use Vectorizable, HasVectorSearch, HasAudioTranscription;
    
    protected $audioField = 'audio_file';
    protected $transcriptionField = 'transcription';
}

// Automatically transcribe and index
$podcast = Podcast::create([
    'title' => 'AI in 2024',
    'audio_file' => 's3://podcasts/ai-2024.mp3',
]);

$transcription = $podcast->transcribeAudio();

// Search across transcriptions
$results = Podcast::vectorSearch('machine learning trends');

// Find similar episodes
$similar = $podcast->findSimilar(limit: 5);

Example 4: Healthcare (HIPAA Compliant)

class PatientRecord extends Model
{
    use Vectorizable, HasVectorSearch;
    
    protected static function applyUserFilters(array $filters, ?object $user = null): array
    {
        $user = $user ?? auth()->user();
        
        if ($user->hasRole('doctor')) {
            // Doctors see their assigned patients
            $filters['doctor_id'] = $user->id;
        } elseif ($user->hasRole('nurse')) {
            // Nurses see department patients
            $filters['department_id'] = $user->department_id;
        }
        
        return $filters;
    }
}

// HIPAA-compliant search with automatic filtering
$records = PatientRecord::vectorSearch('diabetes treatment', user: auth()->user());

Example 5: Customer Support

class Ticket extends Model
{
    use Vectorizable, HasVectorSearch;
}

// Find similar tickets
$similarTickets = $currentTicket->findSimilar(limit: 10);

// Search knowledge base
$solutions = Ticket::vectorSearch('password reset not working', filters: [
    'status' => 'resolved',
]);

// Suggest solutions to agents
foreach ($similarTickets as $ticket) {
    echo "Similar: {$ticket->title} - Resolution: {$ticket->resolution}\n";
}

❓ FAQ

General Questions

Q: Do I need to publish migrations?
A: No, migrations auto-load from the package. Only publish if you need to modify the schema.

Q: Can I use this with existing Spatie permissions?
A: Yes! It integrates seamlessly with your existing permission structure.

Q: How do I implement multi-tenant search?
A: Use applyUserFilters() in your model to filter by organization/workspace.

Q: What's the cost of OpenAI embeddings?
A: ~$0.13 per 1M tokens. A typical record costs <$0.001 to index.

Q: Can users only see their own data?
A: Yes! Implement row-level security with applyUserFilters().

Q: Does it work with Laravel 9?
A: Yes, supports Laravel 9.x, 10.x, and 11.x.
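
The embedding price quoted in the answers above can be turned into a quick budget estimate. The sketch assumes cost is linear in token count at the quoted rate; the constant and function name are hypothetical, and token counts per record depend on your content.

```php
<?php

declare(strict_types=1);

// Price quoted in this FAQ for embeddings.
const EMBEDDING_USD_PER_MILLION_TOKENS = 0.13;

/**
 * Estimate embedding cost in USD for a number of tokens.
 */
function estimateEmbeddingCost(int $tokens): float
{
    if ($tokens < 0) {
        throw new InvalidArgumentException('Token count cannot be negative.');
    }

    return $tokens / 1_000_000 * EMBEDDING_USD_PER_MILLION_TOKENS;
}

// One record of about 1,000 tokens
printf("Per record: $%.5f\n", estimateEmbeddingCost(1_000));

// 50,000 such records
printf("Full index: $%.2f\n", estimateEmbeddingCost(50_000 * 1_000));
```

At this rate a record stays under $0.001 as long as it is shorter than roughly 7,700 tokens.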

Audio/STT Questions

Q: What audio formats are supported?
A: MP3, M4A, WAV, WEBM, OGG, FLAC, MP4, MPEG (max 25MB).

Q: How much does audio transcription cost?
A: $0.006 per minute. A 60-minute podcast costs ~$0.36.

Q: Can I transcribe non-English audio?
A: Yes! Supports 50+ languages including Spanish, French, German, Chinese, Japanese, etc.

Q: How long does transcription take?
A: ~0.1x real-time. A 60-minute podcast takes ~6 minutes to transcribe.

Q: Are transcriptions cached?
A: Yes, by default for 24 hours. Configurable via STT_CACHE_DURATION.

Performance Questions

Q: How fast is semantic search?
A: <100ms for most queries with proper indexing.

Q: Can I process large datasets?
A: Yes! Use queue workers and batch processing.

Q: How much memory does it use?
A: ~40-60MB for typical use cases with proper optimization.

Q: Should I use queues in production?
A: Yes, highly recommended for async processing.

🐛 Troubleshooting

"Class App\Models\VectorIndexQueue not found"

Solution: Clear autoload cache

composer dump-autoload
php artisan config:clear

"You do not have permission to search"

Solution: Setup permissions

php artisan vector:setup-permissions --role=admin

$user->givePermissionTo('view users');

High Memory Usage

Solution: Optimize relationships

// Reduce relationship depth in config
'max_relationship_depth' => 1,

# Or use queue workers
php artisan vector:index "App\Models\User" --queue

Slow Indexing

Solution: Use batch processing and queues

php artisan vector:index "App\Models\Post" --batch=50 --queue

📝 Changelog

v1.0.0 (2025-11-23) - Initial Release

Core Features

✅ Automatic model analysis and indexing
✅ Semantic search with OpenAI embeddings
✅ Qdrant vector database integration
✅ Real-time indexing on model events
✅ Queue support for async processing

Security & Authorization

✅ Spatie Permission integration
✅ Row-level security (RLS)
✅ Multi-tenant support
✅ Custom Gates and policies
✅ Audit logging

Audio & Media

✅ Speech-to-Text (OpenAI Whisper)
✅ 50+ language support
✅ Word/segment timestamps
✅ Audio translation
✅ Automatic transcription caching

Developer Experience

✅ 8 Artisan commands
✅ Comprehensive documentation (9 guides)
✅ Real-world examples
✅ Testing command
✅ Laravel 9/10/11 support

🤝 Contributing

We welcome contributions! Here's how you can help:

Report Bugs - Open an issue
Suggest Features - Start a discussion
Submit PRs - Fork, code, test, and submit!
Improve Docs - Documentation PRs are always welcome

Development Setup

git clone https://github.com/bites-development/laravel-vector-indexer.git
cd laravel-vector-indexer
composer install

🔒 Security

If you discover any security vulnerabilities, please email security@bites.app instead of using the issue tracker. All security vulnerabilities will be promptly addressed.

📄 License

This package is open-sourced software licensed under the MIT license.

💖 Credits

Developed with ❤️ by Bites Team

Built With

Laravel - The PHP Framework
OpenAI - Embeddings & Whisper API
Qdrant - Vector Database
Spatie Permission - Authorization

Special Thanks

Laravel community for amazing tools and support
OpenAI for powerful AI APIs
Qdrant team for excellent vector database
All contributors and users

📞 Support & Community

If you find this package useful, please consider giving it a ⭐ on GitHub!

Made with ❤️ by the Bites Team

Website • GitHub • Documentation
