```typescript
// --- AI Services Routing ---
const aiMatch = subpath.match(
  /^ai\/(chat\/completions|embeddings|images\/generations|training\/jobs(\/([0-9]+))?)$/,
);
if (aiMatch) {
  const endpoint = aiMatch[1];
  if (endpoint === "chat/completions") return handleAIChatCompletions(req, version);
  if (endpoint === "embeddings") return handleAIEmbeddings(req, version);
  if (endpoint === "images/generations") return handleAIImageGeneration(req, version);
  if (endpoint.startsWith("training/jobs")) return handleTrainingJobsService(req, version, jobId);
}

/** Handles OpenAI embeddings generation (e.g., POST /v1/ai/embeddings). */
function handleAIEmbeddings(req: Request, version: string) {
  if (req.method === "POST") {
    return new Response(
      JSON.stringify({
        data: [{ embedding: [0.1, 0.2, 0.3], index: 0 }],
        model: "simulated-model",
      }),
      { status: 200, headers: { "Content-Type": "application/json" } },
    );
  } else {
    return new Response("Method Not Allowed for AI embeddings.", { status: 405 });
  }
}
```

A simple interface for making and querying Pinecone vector databases. Uses OpenAI embeddings to vectorize and search.

## Quickstart

```typescript
async function generateQueryEmbedding(query: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: EMBEDDING_MODEL,
    input: query,
  });
  return response.data[0].embedding;
}

async function generateEmbeddings(texts: string[]): Promise<number[][]> {
  const batchSize = 100;
  const allEmbeddings: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    const response = await openai.embeddings.create({
      model: EMBEDDING_MODEL,
      input: batch,
    });
    allEmbeddings.push(...response.data.map((d) => d.embedding));
  }
  return allEmbeddings;
}
```

```typescript
console.log(`Created ${chunks.length} chunks`);

// Generate embeddings
console.log(`Generating embeddings for ${chunks.length} chunks`);
const embeddings = await generateEmbeddings(chunks);

// Store chunks and embeddings
console.log(`Storing chunks in database`);
for (let i = 0; i < chunks.length; i++) {
  // ... store each chunk with its embedding and metadata:
  //   chunks[i],
  //   i,
  //   JSON.stringify(embeddings[i]),
  //   JSON.stringify({ chunk_size: chunks[i].length }),
}
```

```typescript
tech: ["RAG", "Milvus", "LLM"],
desc: "ChatDoc is a web application enabling users to upload documents (PDF, TXT, CSV, XLSX, PPTX, DOCX), extract and chunk text, store embeddings in Milvus, and query with state-of-the-art LLMs. It provides both a REST API and a web-based interface for seamless integration.",
icon: "MessageSquareText",
links: {
```
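The excerpts above only cover the indexing side (chunking, embedding, storing). A minimal sketch of the query side, assuming the chunk rows are loaded back with their embeddings parsed from the stored JSON arrays, could look like the following; `ChunkRow`, `searchChunks`, and `topK` are illustrative names rather than code from the original vals, and `generateQueryEmbedding` is the quickstart helper above.

```typescript
// Hypothetical row shape: the snippet above stores each embedding as a JSON string.
interface ChunkRow {
  content: string;
  embedding: number[]; // parsed from the stored JSON array
}

// Plain cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored chunks against a query embedding and keep the best matches.
async function searchChunks(query: string, chunks: ChunkRow[], topK = 5) {
  const queryEmbedding = await generateQueryEmbedding(query);
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```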
```typescript
async function embed(text: string) {
  const response = await axios.post(
    "https://api.openai.com/v1/embeddings",
    { model, input: text },
    {
```

```typescript
const { model, modelToken } = this.options;
const response = await axios.post(
  "https://api.openai.com/v1/embeddings",
  { model, input: text },
  {
```

```typescript
const PINECONE_API_KEY = Deno.env.get("PINECONE_API_KEY");
const PINECONE_INDEX_NAME = Deno.env.get("PINECONE_INDEX_NAME") || "aws-ring-embeddings-small";
const PINECONE_DIMENSIONS = parseInt(
  Deno.env.get("PINECONE_DIMENSIONS") || "1536",
);
const OPENAI_EMBEDDINGS_MODEL = Deno.env.get("OPENAI_EMBEDDINGS_MODEL") || "text-embedding-3-small";
const ANTHROPIC_API_KEY = Deno.env.get("ANTHROPIC_API_KEY");

async function embedQuestion(question: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: OPENAI_EMBEDDINGS_MODEL,
    input: question,
    dimensions: PINECONE_DIMENSIONS,
  });
  return response.data[0].embedding;
}
```

```typescript
// Step 1: embeddings
const { result: embedding, ms: embedMs } = await timeStep(
  "embedQuestion",
```

```typescript
// Load actual pages with embeddings for search
const loadPagesStart = performance.now();
const pagesWithEmbeddings = await getAllPagesForSearch(true); // Enable timing
timings.loadPages = performance.now() - loadPagesStart;
console.log(`⏱️ Total pages load duration: ${timings.loadPages?.toFixed(2)}ms\n`);

const queryStart = performance.now();
const testResult = await runSearchTest(query, pagesWithEmbeddings, {
  limit: 10,
  minScore: 0,
```

- Fetches documentation pages from Groq's console
- Caches page content, metadata, token counts, and embeddings in SQLite
- Token counting using tiktoken (GPT-4 encoding)
- AI-generated metadata (categories, tags, use cases, sample questions)
- Content embeddings generation with multiple strategies (local ONNX, Transformers.js, API-based)
- Semantic search with configurable strategies (embeddings + cosine similarity)
- **RAG-based question answering** with configurable answer strategies (search + LLM)
- Hash-based change detection to skip unchanged pages during recalculation

- Calculate token counts for each page
- Generate AI metadata (categories, tags, use cases, questions)
- Generate embeddings for each page
- Calculate content hashes for change detection
- Store everything in the SQLite cache

3. **Content updates** - Documentation pages have been updated and you want fresh data
4. **Token count needed** - You need accurate token counts for new content
5. **Metadata refresh** - You want to regenerate AI metadata or embeddings

### 🔄 Default Mode (Smart Recalculation)

- **Skips pages with unchanged content** (saves time and API calls)
- Only processes pages that have changed
- Still generates embeddings and metadata for changed pages
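The smart recalculation above hinges on hash-based change detection, but the hashing code itself is not part of the excerpt. A rough sketch using the Web Crypto API available in Deno; the helper names are assumptions, and the cached value corresponds to the `contentHash` field described further down.

```typescript
// Hash page content so unchanged pages can be skipped during recalculation.
async function sha256Hex(content: string): Promise<string> {
  const data = new TextEncoder().encode(content);
  const digest = await crypto.subtle.digest("SHA-256", data);
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

// Skip logic: only regenerate metadata and embeddings when the hash changed.
async function needsRecalculation(content: string, cachedHash: string | null): Promise<boolean> {
  return (await sha256Hex(content)) !== cachedHash;
}
```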
**Response includes:**

**Use cases:**

- Regenerating all metadata/embeddings even if content unchanged
- After updating metadata generation prompts
- When you want to ensure everything is fresh
- Uses cached pages when available for faster results

**Note**: Currently uses embeddings-based semantic search. Multiple strategies available (see Search section).

#### `GET /answer`

#### `GET /cache/recalculate`

Recalculate pages with AI metadata and embeddings generation.

**Query Parameters:**

- Calculates token counts
- Generates AI metadata (categories, tags, use cases, questions)
- Generates embeddings (currently fake, ready for Groq API)
- Calculates content hashes for change detection
- Stores everything in cache

```sql
  metadata TEXT,
  contentHash TEXT,
  embeddings TEXT,
  cachedAt INTEGER NOT NULL
)
```

- `metadata` - AI-generated metadata (categories, tags, use cases, questions)
- `contentHash` - SHA-256 hash of content (for change detection)
- `embeddings` - Content embeddings vector (JSON array)
- `cachedAt` - Timestamp when cached

2. Activate in `search/index.ts`:

   ```typescript
   import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
   ```

| Strategy | File | Speed | Cost | Pros |
|----------|------|-------|------|------|
| **Mixedbread** | `mixedbread-embeddings-cosine.ts` | ~50-100ms | Free tier | High quality, 1024 dims |
| **OpenAI** | `openai-cosine.ts` | ~100-200ms | Paid | High quality, reliable |
| **HuggingFace** | `hf-inference-qwen3-cosine.ts` | ~150-300ms | Free tier | Qwen3-8B model |

```typescript
// Comment out current strategy
// import { searchStrategy, generateEmbeddings } from "./transformers-cosine.ts";

// Uncomment desired strategy
import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
```

### Current Implementation (Semantic Search)

The search system uses semantic embeddings for intelligent search:

- Understands meaning, not just keywords
- Finds relevant results even with different wording

1. **Embedding Generation**: Content is converted to 384-dimensional vectors
2. **Cosine Similarity**: Query embeddings compared against page embeddings
3. **Ranking**: Results sorted by similarity score
4. **Snippet Generation**: Context-aware snippets around relevant content

- Adjust system prompts

## Embeddings

Content embeddings are generated for each page using the active search strategy (see Search section above).

**Current Default**: Local ONNX models (`transformers-local-onnx.ts`)

- Storage: Cached as JSON arrays in SQLite

Embeddings are:

- Generated during `/cache/recalculate`
- Stored in cache for fast retrieval

```sh
deno task recalc-f

# Recalculate with Mixedbread embeddings strategy
deno task recalc-mxbai

# Force recalculation with Mixedbread embeddings
deno task recalc-mxbai-f
```
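Each strategy file listed in the table above is imported for its `searchStrategy` and `generateEmbeddings` exports, but none of their contents appear in this excerpt. As an illustration only, the embedding half of a local Transformers.js/ONNX strategy might look roughly like this; the model `Xenova/all-MiniLM-L6-v2` is an assumption chosen because it happens to produce the 384-dimensional vectors mentioned above, and the `npm:` specifier assumes a Deno runtime.

```typescript
// Rough sketch of a local embedding strategy module; not the actual transformers-local-onnx.ts.
import { pipeline } from "npm:@huggingface/transformers";

// Loads an ONNX model once and runs it locally (no API calls).
const extractor = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

export async function generateEmbeddings(texts: string[]): Promise<number[][]> {
  // Mean pooling + normalization gives one unit-length vector per input text.
  const output = await extractor(texts, { pooling: "mean", normalize: true });
  return output.tolist() as number[][];
}
```

Because the vectors come back normalized, cosine similarity reduces to a dot product, which keeps the ranking step cheap.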
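The `GET /answer` endpoint is described only as "search + LLM", and its answer strategies are likewise not shown. A hypothetical sketch of that pattern, feeding the top search results into a chat completion; the client setup, model name, and prompt are placeholders, not code from the repo.

```typescript
// Illustrative "search + LLM" answer flow; every name here is a placeholder.
import OpenAI from "npm:openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function answerQuestion(
  question: string,
  search: (q: string) => Promise<{ content: string }[]>,
): Promise<string> {
  // Step 1: semantic search over the cached pages.
  const results = await search(question);
  const context = results.slice(0, 5).map((r) => r.content).join("\n---\n");

  // Step 2: ask the LLM to answer strictly from the retrieved context.
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model
    messages: [
      { role: "system", content: "Answer using only the provided documentation context." },
      { role: "user", content: `Context:\n${context}\n\nQuestion: ${question}` },
    ],
  });

  return completion.choices[0].message.content ?? "";
}
```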