Goal
Enable document upload and querying through Supabase to support complex dataset parsing and knowledge base access via custom tools in Voice AI and chat interfaces. This system unlocks voice-accessible knowledge bases and enhanced querying for chat-based assistants.
Resources
- Buildship Remix Link: https://app.buildship.com/remix/f1643e2b-bd80-48b0-b556-49dea270b2f9
- AI Assistant Upload Portal: https://createassistants.com/supabase-knowledge
Prerequisites
OpenAI Account
- API Key for embedding and processing capabilities
Supabase Account
- API Key (Project Settings > API)
- Project URL (Project Settings > API)
Buildship Account
- Access to the Remix project (linked above)
AI Assistant Account
- Active assistant and custom tool setup
Implementation Steps
Step 1: Set Up Buildship Project
Open and Duplicate the Remix Project
- Open the Remix Link in your Buildship account
- Duplicate the project from the link provided
- This will create a copy of the pre-configured workflows in your account
Configure API Keys
Add your API credentials to the project.
Required Keys:
- OpenAI API Key for embedding generation
- Supabase API Key for database access
- Supabase Project URL (Found in Supabase → Project Settings > API, right at the top)
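Before pasting the credentials into the Buildship nodes, it can help to sanity-check them locally. The following is a minimal sketch, not part of the Remix project. It assumes the three values are kept in environment variables named OPENAI_API_KEY, SUPABASE_API_KEY, and SUPABASE_URL (hypothetical names chosen for this example) and that the project URL follows the usual hosted form https://<project-ref>.supabase.co; a custom domain or self-hosted instance would need a different pattern.

```python
import re
from typing import Mapping

# Hypothetical variable names; use whatever names your own setup defines.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SUPABASE_API_KEY", "SUPABASE_URL")

# Assumes the standard hosted-project form: https://<project-ref>.supabase.co
SUPABASE_URL_PATTERN = re.compile(r"^https://[a-z0-9]+\.supabase\.co/?$")


def check_credentials(env: Mapping[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key in REQUIRED_KEYS:
        if not env.get(key, "").strip():
            problems.append(f"{key} is missing or empty")
    url = env.get("SUPABASE_URL", "").strip()
    if url and not SUPABASE_URL_PATTERN.match(url):
        problems.append(
            "SUPABASE_URL does not look like https://<project-ref>.supabase.co"
        )
    return problems
```

Calling check_credentials(os.environ) after importing os runs the check against the current shell environment.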
Update Supabase Nodes
Update all Supabase nodes with your credentials.
Node Locations:
- 5 nodes in the “Add Document Chunks” workflow
- 1 node in the “RAG using Supabase” workflow
Deploy the Project
- Confirm that all Supabase nodes (project URL and API key) and OpenAI nodes (API key) have been updated
- Click “Ship” in the top right corner to save changes
- This will generate the API URLs needed for later steps
Step 2: Configure Supabase Database
Enable Vector Extension
- In Supabase, click on the Database tab
- Add the extension “vector” to enable knowledge embedding functionality
- This extension is required for storing and querying document embeddings
Set Up Database Tables
- Go to the “SQL Editor” tab
- Run the following SQL commands individually
- Each command should result in the output “Success. No rows returned”
Create Files Table
Create Chunks Table
Create Index for Full-Text Search
Create Hybrid Search Function
Expected Output: Each command should return “Success. No rows returned”
Step 3: Configure Upload Portal
Set Up the Upload Interface
- Go to: https://createassistants.com/supabase-knowledge
- Add your PDF Upload workflow API URL into the “Buildship API PDF Upload URL” field under the “Upload” tab
- This URL should be generated from your Buildship project after shipping
Upload Documents
Document Format Requirements:
- Primary format: PDF
- Other formats: Convert to PDF first
- Google Docs: File > Download > PDF
- MS Word: File > Save As > PDF
- Online converters: Use any reliable PDF converter
Upload Steps:
- Select your PDF files for upload
- Submit the upload - this will schedule the processing
- Check status in the Buildship workflow logs
- Monitor progress through the Buildship dashboard
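A quick local check can catch files that are not really PDFs before they are submitted. The sketch below is an illustration only: it relies on the fact that PDF files begin with the bytes %PDF-, and it assumes a size limit of 20 MB, which is a hypothetical value because the portal's actual limit is not stated in this guide.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"          # PDF files begin with this signature
MAX_BYTES = 20 * 1024 * 1024  # hypothetical limit; the portal's real limit is not stated


def looks_like_pdf(data: bytes, max_bytes: int = MAX_BYTES) -> bool:
    """True if the bytes start with the PDF signature and fit the size limit."""
    return data.startswith(PDF_MAGIC) and len(data) <= max_bytes


def check_file(path: str) -> bool:
    """Read a file from disk and apply the same check."""
    return looks_like_pdf(Path(path).read_bytes())
```

A renamed .docx file, for example, fails this check because it starts with a ZIP signature rather than %PDF-.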
Step 4: Verify Database Population
Check Uploaded Data
- Go to your Supabase database
- Open the “Table Editor” tab
- Click on the “chunks” table
- Refresh the page to see uploaded data
- Verify that the data appears: you should see processed document chunks with embeddings
Data Validation
What to look for:
- Document chunks with extracted text
- Embedding vectors (1536 dimensions)
- File metadata including original names
- Proper indexing for search functionality
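The first two checks in this list can be automated on rows exported from the table. The following is a minimal sketch under stated assumptions: the column names "content" and "embedding" are hypothetical and should be matched to the columns in your own chunks table, and the embedding may arrive either as a list or in pgvector's text form (for example "[0.1,0.2]"), which parses as JSON.

```python
import json
from typing import Any, Mapping

EXPECTED_DIMENSIONS = 1536  # from the checklist above


def validate_chunk(row: Mapping[str, Any]) -> list[str]:
    """Return problems found in one exported row of the chunks table.

    The column names "content" and "embedding" are assumptions; match them
    to the columns your own table actually uses.
    """
    problems = []
    content = row.get("content")
    if not isinstance(content, str) or not content.strip():
        problems.append("missing or empty text")

    embedding = row.get("embedding")
    if isinstance(embedding, str):
        # pgvector's text form, e.g. "[0.1,0.2]", is valid JSON.
        try:
            embedding = json.loads(embedding)
        except ValueError:
            embedding = None
    if not isinstance(embedding, list):
        problems.append("embedding is not a list of numbers")
    elif len(embedding) != EXPECTED_DIMENSIONS:
        problems.append(
            f"embedding has {len(embedding)} dimensions, expected {EXPECTED_DIMENSIONS}"
        )
    return problems
```

An empty result for every row suggests that text extraction and embedding generation both completed.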
Step 5: Integrate with AI Assistant
Create Custom Tool
- Add a custom tool to your assistant
- Configure the tool to query your RAG database
- Set up proper parameters for search queries
Tool Configuration Example
- Tool Name: query_knowledge_base
- Description: “Search the custom knowledge base for relevant information using both semantic and full-text search capabilities.”
- Endpoint: Your Buildship “RAG using Supabase” workflow URL
- Parameters:
- query: The search query or question
- match_count: Number of results to return (default: 5)
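The configuration above can be written down as a JSON-schema-style definition, which many assistant platforms accept for custom tools. The sketch below is an assumption about shape, not the required format of any particular platform: the tool name, description, and the two parameter names come from this guide, while the schema layout and the idea that the workflow accepts a JSON body with these two fields are illustrative.

```python
import json

# Schema layout is an assumption; check the format your assistant platform expects.
TOOL_DEFINITION = {
    "name": "query_knowledge_base",
    "description": (
        "Search the custom knowledge base for relevant information using "
        "both semantic and full-text search capabilities."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query or question",
            },
            "match_count": {
                "type": "integer",
                "description": "Number of results to return",
                "default": 5,
            },
        },
        "required": ["query"],
    },
}


def build_request_body(query: str, match_count: int = 5) -> str:
    """Serialize the tool arguments as a JSON body for the workflow endpoint."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if match_count < 1:
        raise ValueError("match_count must be at least 1")
    return json.dumps({"query": query.strip(), "match_count": match_count})
```

Validating the arguments before the request is sent keeps empty voice transcripts from reaching the search function.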
Testing the System
Test in multiple interfaces:
- Voice AI: Ask questions about uploaded documents
- Chat AI: Query the knowledge base through text
- Web Orbs: Access knowledge through web interface
System Architecture
Data Flow
- Document Upload → PDF processing in Buildship
- Text Extraction → Document chunking and preprocessing
- Embedding Generation → OpenAI embeddings for semantic search
- Database Storage → Supabase with vector and full-text search
- Query Processing → Hybrid search combining semantic and keyword matching
- Result Delivery → Formatted responses through AI assistant
Search Capabilities
Hybrid Search Features:
- Semantic Search: Using vector embeddings for meaning-based matching
- Full-Text Search: Traditional keyword-based search
- Weighted Combination: Configurable balance between search types
- Ranking Algorithm: RRF (Reciprocal Rank Fusion) merges the semantic and full-text rankings into a single result list
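To make the weighted combination concrete, here is a small sketch of weighted Reciprocal Rank Fusion in plain Python. It is an illustration of the general technique, not the SQL function used in this project: each document scores weight / (rrf_k + rank) in every ranking where it appears, and the scores are summed. The function name, the default weights, and the default rrf_k of 60 are assumptions for this example.

```python
from typing import Sequence


def rrf_fuse(
    semantic: Sequence[str],
    full_text: Sequence[str],
    semantic_weight: float = 1.0,
    full_text_weight: float = 1.0,
    rrf_k: int = 60,
    match_count: int = 5,
) -> list[str]:
    """Fuse two ranked lists of document ids with weighted RRF.

    Each input is ordered best-first. Ranks are 1-based.
    """
    scores: dict[str, float] = {}
    for weight, ranking in (
        (semantic_weight, semantic),
        (full_text_weight, full_text),
    ):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rrf_k + rank)
    # Highest score first; ties broken by id so the order is deterministic.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:match_count]
```

With semantic results ["a", "b", "c"] and full-text results ["b", "d", "a"], document "b" ranks first because it sits near the top of both lists. Setting full_text_weight to 0 reduces the ordering to the semantic ranking alone.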
Advanced Configuration
Customization Options
Search Parameters
- Full-text weight: Adjust importance of keyword matching
- Semantic weight: Adjust importance of meaning-based search
- RRF constant: Fine-tune ranking algorithm
- Match count: Control number of returned results
Document Processing
- Chunk size: Optimize for your document types
- Overlap settings: Ensure context preservation
- File type support: Extend beyond PDF if needed
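The interaction between chunk size and overlap is easiest to see in code. The sketch below splits text by character count, which is an assumption made for simplicity: the chunking logic inside the Buildship workflow is not described in this guide and may split on tokens or sentences instead. The default values are likewise hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at
    a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

For example, chunk_text("abcdefghij", chunk_size=4, overlap=2) yields "abcd", "cdef", "efgh", "ghij". A larger overlap preserves more context across boundaries at the cost of storing and embedding more chunks.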
Performance Optimization
Database Performance
- Indexing strategy: Optimize for your query patterns
- Connection pooling: Manage database connections efficiently
- Query optimization: Monitor and improve search performance
Embedding Efficiency
- Batch processing: Process multiple documents efficiently
- Caching strategy: Store frequently accessed embeddings
- Model selection: Choose appropriate embedding models
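Batch processing and caching can be combined in one helper. The following is a sketch under stated assumptions: embed_batch stands for any function that takes a list of texts and returns one vector per text (for instance, a thin wrapper around an embeddings API call), the cache is a plain dictionary keyed by a SHA-256 hash of the text, and the batch size of 64 is an arbitrary example value.

```python
import hashlib
from typing import Callable, MutableMapping, Sequence

# Hypothetical type: any callable that embeds a batch of texts.
Embedder = Callable[[Sequence[str]], list[list[float]]]


def _key(text: str) -> str:
    """Stable cache key derived from the text content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def embed_with_cache(
    texts: Sequence[str],
    embed_batch: Embedder,
    cache: MutableMapping[str, list[float]],
    batch_size: int = 64,
) -> list[list[float]]:
    """Embed texts in batches, skipping any text already in the cache."""
    pending: list[str] = []
    seen: set[str] = set()
    for text in texts:
        key = _key(text)
        if key not in cache and key not in seen:
            seen.add(key)
            pending.append(text)

    for i in range(0, len(pending), batch_size):
        batch = pending[i:i + batch_size]
        for text, vector in zip(batch, embed_batch(batch)):
            cache[_key(text)] = vector

    return [cache[_key(text)] for text in texts]
```

Duplicate and previously seen chunks never reach the embedding function, so re-uploading a lightly edited document only pays for the chunks that changed.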
Troubleshooting
Common Issues
Upload failures:
- Verify Buildship API URL is correct
- Check PDF file format and size
- Monitor Buildship workflow logs for errors
Database connection errors:
- Confirm Supabase API keys are correct
- Verify project URL formatting
- Check database permissions
Missing or empty search results:
- Ensure documents were processed successfully
- Verify embeddings were generated
- Check database table population
Slow queries:
- Monitor database query performance
- Optimize search parameters
- Consider document chunking strategy
Debugging Steps
Verify Setup
- Check Buildship logs for processing errors
- Inspect Supabase tables for data integrity
- Test API endpoints individually
- Validate search function with simple queries
Performance Monitoring
- Query response times
- Database resource usage
- Search result relevance
- User satisfaction metrics
Security Considerations
Data Protection
- API key security: Store credentials securely
- Access control: Implement proper permissions
- Data encryption: Ensure sensitive information protection
- Audit logging: Track database access and changes
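One small habit that supports API key security is never writing a full key to logs or screenshots. The helper below is a minimal sketch of that idea; the function name and the choice to reveal the last four characters are assumptions for this example, not a feature of Buildship or Supabase.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a secret.

    Secrets no longer than `visible` characters are masked entirely.
    """
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

Logging mask_secret(api_key) instead of the key itself still lets you tell two keys apart while debugging.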
Privacy Compliance
- Document handling: Ensure compliance with data regulations
- User consent: Obtain appropriate permissions for data processing
- Data retention: Implement appropriate retention policies
- Cross-border considerations: Handle international data transfer requirements
Benefits and Use Cases
Key Benefits
- ✅ Voice-accessible knowledge bases for hands-free information access
- ✅ Enhanced query capabilities with hybrid search
- ✅ Scalable document processing for large knowledge bases
- ✅ Multi-interface support across voice, chat, and web
- ✅ Real-time information retrieval from uploaded documents
- ✅ Semantic understanding for intelligent question answering
Common Use Cases
- Customer support knowledge bases
- Technical documentation systems
- Educational content libraries
- Company policy and procedure databases
- Research paper repositories
- Product information systems
Maintenance and Updates
Regular Maintenance
- Monitor database performance
- Update embeddings for modified documents
- Clean up unused data
- Backup database regularly
System Updates
- Keep dependencies current
- Monitor API changes
- Update search algorithms
- Optimize based on usage patterns