Search

Results include substring matches and semantically similar vals.
blogPostEmbeddingsDimensionalityReduction
@janpaul123
An interactive, runnable TypeScript val by janpaul123
Script
import OpenAI from "npm:openai";
export default async function blogPostEmbeddingsDimensionalityReduction() {
const points = [
async function getEmbedding(str) {
return (await openai.embeddings.create({
model: "text-embedding-3-large",
})).data[0].embedding;
let embeddings = await blob.getJSON("blogPostEmbeddings");
if (!embeddings) {
embeddings = await Promise.all(points.map((point) => getEmbedding(point)));
await blob.setJSON("blogPostEmbeddings", embeddings);
const matrix = druid.Matrix.from(embeddings);
const dr = new druid.UMAP(matrix, {
return Array.from(dr.transform().iterate_rows()).map((row, index) => [row[0], row[1], points[index]]);
console.log(await blogPostEmbeddingsDimensionalityReduction());
compareEmbeddings
@janpaul123
An interactive, runnable TypeScript val by janpaul123
Script
const cache = {};
async function getEmbedding(str) {
cache[str] = cache[str] || (await openai.embeddings.create({
model: "text-embedding-3-large",
input: str,
embeddingsSearchExample
@yawnxyz
This is an example of in-memory search, using a combination of lunr, OpenAI embeddings, and cosine similarity
Script
// Step 1: Get Embeddings
// Function to get embeddings for multiple texts
async function getEmbeddings(texts) {
console.log(`Getting embeddings for texts: ${texts}`);
const { embeddings } = await embedMany({
console.log(`Embeddings: ${embeddings}`);
return embeddings;
// Step 2: Store Embeddings with Documents
async function prepareDocumentsWithEmbeddings() {
const embeddings = await getEmbeddings(contents);
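The description above mentions ranking documents by cosine similarity over their embeddings. As a minimal sketch of that step (not the val's actual code), the following assumes each document already carries a precomputed embedding; the `EmbeddedDoc` shape and the `rankBySimilarity` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical document shape: an id plus a precomputed embedding vector.
interface EmbeddedDoc {
  id: number;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every document against the query embedding and keep the topK best matches.
function rankBySimilarity(query: number[], docs: EmbeddedDoc[], topK: number) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A linear scan like this is usually adequate for small in-memory prototypes; larger collections tend to call for a vector index instead.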
debugValEmbeddings
@janpaul123
An interactive, runnable TypeScript val by janpaul123
Script
const openai = new OpenAI();
const queryEmbedding = (await openai.embeddings.create({
model: "text-embedding-3-small",
console.log(queryEmbedding.slice(0, 4));
const embedding = await openai.embeddings.create({
model: "text-embedding-3-small",
console.log("Hash is the same, no email sent.", { dynamiclandWebsiteHash });
const queryEmbeddingVal = (await openai.embeddings.create({
model: "text-embedding-3-small",
generateEmbeddings
@thomasatflexos
An interactive, runnable TypeScript val by thomasatflexos
Script
import process from "node:process";
export let generateEmbeddings = async (
req: express.Request,
message: "The URL parameter is required for this end point",
const { OpenAIEmbeddings } = await import("npm:langchain/embeddings");
const { createClient } = await import(
splittedDocs,
new OpenAIEmbeddings({
openAIApiKey: process.env.OPEN_API_KEY,
memorySampleSummary
@toowired
// Initialize the database
Script
// Generate embedding for a given text
async function generateEmbedding(text: string): Promise<number[]> {
const response = await openai.embeddings.create({
model: "text-embedding-ada-002",
input: text,
getOpenapiEmbedding
@wilt
Call the OpenAI Embeddings API to vectorize a query string. Returns an array of 1536 numbers.
Script
import { fetchJSON } from "https://esm.town/v/stevekrouse/fetchJSON?v=41";
* Call the OpenAI Embeddings API to vectorize a query string
* Returns an array of 1536 numbers
export const getOpenapiEmbedding = async ({ openapiToken, query }: {
query: string;
}): Promise<number[]> =>
fetchJSON("https://api.openai.com/v1/embeddings", {
method: "POST",
headers: {
indexValsNeon
@janpaul123
Part of Val Town Semantic Search. Generates OpenAI embeddings for all public vals, and stores them in Neon, using the pg_vector extension.
- Create the vals_embeddings table in Neon if it doesn't already exist.
- Get all val names from the database of public vals, made by Achille Lacoin.
- Get all val names from the vals_embeddings table and compute the difference (which ones are missing).
- Iterate through all missing vals, get their code, get embeddings from OpenAI, and store the result in Neon.
- Can now be searched using janpaul123/semanticSearchNeon.
Cron
Generates OpenAI embeddings for all public vals, and stores them in [Neon](https://neon.tech/), using the [pg_vector](https:/
- Create the `vals_embeddings` table in Neon if it doesn't already exist.
- Get all val names from the `vals_embeddings` table and compute the difference (which ones are missing).
// CREATE TABLE vals_embeddings (id TEXT PRIMARY KEY, embedding VECTOR(1536));
const client = new Client(Deno.env.get("NEON_URL_VALSEMBEDDINGS"));
const existingEmbeddingsIds = new Set(
(await client.queryObject`SELECT id FROM vals_embeddings`).rows.map(row => row.id),
if (!existingEmbeddingsIds.has(id)) {
const embedding = await openai.embeddings.create({
.queryObject`INSERT INTO vals_embeddings (id, embedding) VALUES (${id}, ${embeddedBinaryString})`;
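The indexing steps above hinge on two small operations: working out which val ids still lack an embedding, and turning an embedding into something Postgres can store. The excerpt does not show how `embeddedBinaryString` is built, so the sketch below is an assumption: it uses the text form `[x1,x2,...]` that pgvector accepts as vector input. Both helper names are hypothetical.

```typescript
// Return the ids present in `allIds` that are not yet in `existingIds`.
function missingIds(allIds: string[], existingIds: Set<string>): string[] {
  return allIds.filter((id) => !existingIds.has(id));
}

// Format an embedding as pgvector's text representation, e.g. "[0.1,0.2,0.3]".
// Assumption: the val's own serialization is not shown in the excerpt.
function toPgVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}
```

Passing the resulting string as a query parameter, rather than splicing it into the SQL text, keeps the insert safe from injection.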
semanticSearch
@yawnxyz
In-memory semantic search; load it up with valtown KV. This is a "dumb" version of vector search, for prototyping RAG responses and UIs, with both regular search (w/ Lunr) and vector search (with OpenAI embeddings + cosine similarity).
Usage:
    import { semanticSearch } from "https://esm.town/v/yawnxyz/semanticSearch";

    const documents = [
      { id: 1, content: 'cats dogs' },
      { id: 2, content: 'elephants giraffes lions tigers' },
      { id: 3, content: 'edam camembert cheddar' }
    ];

    async function runExample() {
      // Add documents to the semantic search instance
      await semanticSearch.addDocuments(documents);
      const results = await semanticSearch.search('animals', 0, 3);
      console.log('Top 3 search results for "animals":');
      console.log(results);
    }

    runExample();
Script
const documentsWithoutEmbeddings = documents?.filter(doc => !doc.embedding) || [];
const documentsWithEmbeddings = documents?.filter(doc => doc.embedding) || [];
if (documentsWithoutEmbeddings.length > 0) {
const contents = documentsWithoutEmbeddings.map(doc => doc.content);
const embeddings = await this.getEmbeddings(contents, modelName);
documentsWithoutEmbeddings.forEach((doc, index) => {
doc.embedding = embeddings[index];
documentsWithEmbeddings.forEach(doc => {
async getEmbeddings(texts, modelName) {
const { embeddings } = await embedMany({
jadeMacaw
@janpaul123
An interactive, runnable TypeScript val by janpaul123
Script
const allValsBlobEmbeddingsMeta = (await blob.getJSON("allValsBlobEmbeddingsMeta")) ?? {};
const allBatchDataIndexes = _.uniq(Object.values(allValsBlobEmbeddingsMeta).map(item => item.batchDataIndex));
const embeddingsBatches = [];
const embeddingsBatchBlobName = `allValsBlobEmbeddingsData_${batchDataIndex}`;
const promise = blob.get(embeddingsBatchBlobName).then((response) => response.arrayBuffer());
embeddingsBatches[batchDataIndex as any] = data;
console.log(`Loaded ${embeddingsBatchBlobName} (${data.byteLength} bytes)`);
for (const id in allValsBlobEmbeddingsMeta) {
const meta = allValsBlobEmbeddingsMeta[id];
(new Float32Array(embeddingsBatches[meta.batchDataIndex], 256 * 4 * meta.valIndex, 256 * 4)) as any;
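The excerpt above reads one val's embedding out of a shared batch buffer by offset. As a hedged sketch of that lookup (the `readEmbedding` helper and its default of 256 dimensions are assumptions inferred from the `256 * 4` arithmetic in the snippet), note that the `Float32Array` constructor takes its second argument as a byte offset and its third as a count of elements, not bytes.

```typescript
// Assumed dimensionality, inferred from the `256 * 4` byte arithmetic in the excerpt.
const DIMENSIONS = 256;

// Return a view (no copy) over the embedding stored at position `valIndex` in a batch buffer.
function readEmbedding(
  batch: ArrayBufferLike,
  valIndex: number,
  dimensions: number = DIMENSIONS,
): Float32Array {
  // Byte offset: each float32 occupies 4 bytes.
  const byteOffset = dimensions * Float32Array.BYTES_PER_ELEMENT * valIndex;
  // Third argument is the number of float32 elements, not the number of bytes.
  return new Float32Array(batch, byteOffset, dimensions);
}
```

Because the result is a view onto the original buffer, reading many embeddings this way does not duplicate the batch data in memory.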
indexValsBlobs
@janpaul123
Part of Val Town Semantic Search. Generates OpenAI embeddings for all public vals, and stores them in Val Town's blob storage.
- Create a new metadata object. Also has support for getting the previous metadata and only adding new vals, but that's currently disabled.
- Get all val names from the database of public vals, made by Achille Lacoin.
- Put val names in batches. Vals in the same batch will have their embeddings stored in the same blob, at different offsets.
- Iterate through each batch, get code for all the vals, get embeddings from OpenAI, and store the result in a blob.
- When finished, save the metadata JSON to its own blob.
- Can now be searched using janpaul123/semanticSearchBlobs.
Script
Generates OpenAI embeddings for all public vals, and stores them in Val Town's [blob storage](https://docs.val.town/std/blob
- Put val names in batches. Vals in the same batch will have their embeddings stored in the same blob, at different offsets.
- Iterate through each batch, get code for all the vals, get embeddings from OpenAI, and store the result in a blob.
// const allValsBlobEmbeddingsMeta = (await blob.getJSON(`allValsBlob${dimensions}EmbeddingsMeta`)) ?? {};
const allValsBlobEmbeddingsMeta = {};
const existingEmbeddingsIds = new Set(Object.keys(allValsBlobEmbeddingsMeta));
if (!existingEmbeddingsIds.has(id)) {
...Object.values(allValsBlobEmbeddingsMeta).map((item: any) => item.batchDataIndex + 1),
const embeddingsBatch = new Float32Array(dimensions * newValsBatch.length);
const embedding = await openai.embeddings.create({
embeddingsBatch.set(embeddingBinary, dimensions * valIndex);
allValsBlobEmbeddingsMeta[id] = { batchDataIndex, valIndex };
const embeddingsBatchBlobName = `allValsBlob${dimensions}EmbeddingsData_${batchDataIndex}`;
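The batching described above stores several embeddings back to back in one `Float32Array`, with each val's position recorded as `valIndex`. The following is a minimal sketch of that packing step under the assumption that every embedding in a batch has the same length; `packEmbeddings` is a hypothetical helper, not a function from the val.

```typescript
// Pack equally sized embeddings into one contiguous Float32Array.
// The embedding at position i starts at element offset `dimensions * i`.
function packEmbeddings(embeddings: number[][], dimensions: number): Float32Array {
  const batch = new Float32Array(dimensions * embeddings.length);
  embeddings.forEach((embedding, valIndex) => {
    if (embedding.length !== dimensions) {
      throw new Error(`Embedding ${valIndex} has ${embedding.length} values, expected ${dimensions}`);
    }
    // TypedArray.prototype.set copies the values in at the given element offset.
    batch.set(embedding, dimensions * valIndex);
  });
  return batch;
}
```

Storing `batch.buffer` as a single blob then costs 4 bytes per value, which is considerably more compact than the equivalent JSON.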
getModelBuilder
@webup
An interactive, runnable TypeScript val by webup
Script
async () => {
const { OpenAIEmbeddings } = await import(
"npm:langchain/embeddings/openai"
return new OpenAIEmbeddings(args);
matches({ type: "llm", provider: "huggingface" }),
await import("npm:@huggingface/inference");
const { HuggingFaceInferenceEmbeddings } = await import(
"npm:langchain/embeddings/hf"
return new HuggingFaceInferenceEmbeddings(args);
// Return function to prevent "Serialization Error"
umap
@ejfox
UMAP Dimensionality Reduction API
This is a high-performance dimensionality reduction microservice using UMAP (Uniform Manifold Approximation and Projection). It provides an efficient way to reduce high-dimensional data to 2D or 3D representations, making it easier to visualize and analyze complex datasets.
When to Use This Service
- Visualizing high-dimensional data in 2D or 3D space
- Reducing dimensionality of large datasets for machine learning tasks
- Exploring relationships and clusters in complex data
- Preprocessing step for other machine learning algorithms
Common Use Cases
- Visualizing word embeddings in a scatterplot
- Exploring customer segmentation in marketing analytics
- Visualizing image embeddings in computer vision tasks
HTTP
- Visualizing word embeddings in a scatterplot
- Visualizing image embeddings in computer vision tasks
* The service accepts POST requests with JSON payloads containing high-dimensional embeddings and configuration parameters.
* - Visualizing word embeddings or document vectors in NLP tasks
* - Visualizing image embeddings in computer vision tasks
const { embeddings, config } = await request.json();
if (!Array.isArray(embeddings) || embeddings.length === 0) {
return new Response("Invalid input: embeddings must be a non-empty array", { status: 400 });
if (embeddings.length > MAX_POINTS) {
if (embeddings[0].length > MAX_DIMENSIONS) {
const data = encoder.encode(JSON.stringify({ embeddings, config }));
umap.fit(embeddings),
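The excerpt above validates the request payload before running UMAP. A sketch of that validation as a standalone function might look like the following; the limits are passed in as parameters because the val's actual `MAX_POINTS` and `MAX_DIMENSIONS` values are not shown, and every error message except the first is a hypothetical wording.

```typescript
// Return an error message if the payload is unusable, or null if it passes.
// `maxPoints` and `maxDimensions` stand in for the val's undisclosed limits.
function validateEmbeddings(
  embeddings: unknown,
  maxPoints: number,
  maxDimensions: number,
): string | null {
  if (!Array.isArray(embeddings) || embeddings.length === 0) {
    return "Invalid input: embeddings must be a non-empty array";
  }
  if (embeddings.length > maxPoints) {
    return `Too many points: ${embeddings.length} exceeds ${maxPoints}`; // hypothetical message
  }
  const first = embeddings[0];
  if (!Array.isArray(first) || first.length === 0) {
    return "Invalid input: each embedding must be a non-empty array"; // hypothetical message
  }
  if (first.length > maxDimensions) {
    return `Too many dimensions: ${first.length} exceeds ${maxDimensions}`; // hypothetical message
  }
  // Extra check, not in the excerpt: all rows should share the same length.
  const sameLength = embeddings.every(
    (row) => Array.isArray(row) && row.length === first.length,
  );
  return sameLength ? null : "Invalid input: embeddings must all have the same length";
}
```

A handler could return the message in a `400` response when the result is non-null and proceed to the UMAP fit otherwise.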
semanticSearchNeon
@janpaul123
Part of Val Town Semantic Search. Uses Neon to search embeddings of all vals, using the pg_vector extension.
- Call OpenAI to generate an embedding for the search query.
- Query the vals_embeddings table in Neon using the cosine similarity operator.
- The vals_embeddings table gets refreshed every 10 minutes by janpaul123/indexValsNeon.
Script
*Part of [Val Town Semantic Search](https://www.val.town/v/janpaul123/valtownsemanticsearch).*
Uses [Neon](https://neon.tech/) to search embeddings of all vals, using the [pg_vector](https://neon.tech/docs/extensions/pgv
- Call OpenAI to generate an embedding for the search query.
- Query the `vals_embeddings` table in Neon using the cosine similarity operator.
- The `vals_embeddings` table gets refreshed every 10 minutes by [janpaul123/indexValsNeon](https://www.val.town/v/janpaul1
export default async function semanticSearchPublicVals(query) {
const client = new Client(Deno.env.get("NEON_URL_VALSEMBEDDINGS"));
await client.connect();
const openai = new OpenAI();
const queryEmbedding = (await openai.embeddings.create({
model: "text-embedding-3-small",
const result = await client
ct`SELECT id, embedding <=> ${embeddedBinaryString} AS distance FROM vals_embeddings ORDER BY embedding <=> ${embeddedBinaryS
return result.rows.map((row) => {
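In pgvector, the `<=>` operator used in the query above returns cosine distance, which equals 1 minus cosine similarity, so smaller values mean closer matches. The excerpt does not show what the final `map` produces; the sketch below assumes rows shaped like the `SELECT` list (`id`, `distance`) and converts them to similarity scores. `DistanceRow` and `toSimilarityResults` are hypothetical names.

```typescript
// Assumed row shape, matching the `id` and `distance` columns selected in the query.
interface DistanceRow {
  id: string;
  distance: number;
}

// Convert cosine distance into cosine similarity (similarity = 1 - distance).
// Input order is preserved, so rows sorted by ascending distance stay best-first.
function toSimilarityResults(rows: DistanceRow[]): { id: string; similarity: number }[] {
  return rows.map((row) => ({ id: row.id, similarity: 1 - row.distance }));
}
```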
indexValsTurso
@janpaul123
Part of Val Town Semantic Search. Generates OpenAI embeddings for all public vals, and stores them in Turso, using the sqlite-vss extension.
- Create the vals_embeddings and vss_vals_embeddings tables in Turso if they don't already exist.
- Get all val names from the database of public vals, made by Achille Lacoin.
- Get all val names from the vals_embeddings table and compute the difference (which ones are missing).
- Iterate through all missing vals, get their code, get embeddings from OpenAI, and store the result in Turso.
- When finished, update the vss_vals_embeddings table so we can efficiently query them with the sqlite-vss extension. This is blocked by a bug in Turso that doesn't allow VSS indexes past a certain size.
- Can now be searched using janpaul123/semanticSearchTurso.
Cron
Generates OpenAI embeddings for all public vals, and stores them in [Turso](https://turso.tech/), using the [sqlite-vss](http
- Create the `vals_embeddings` and `vss_vals_embeddings` tables in Turso if they don't already exist.
- Get all val names from the `vals_embeddings` table and compute the difference (which ones are missing).
url: "libsql://valsembeddings-jpvaltown.turso.io",
authToken: Deno.env.get("TURSO_AUTH_TOKEN_VALSEMBEDDINGS"),
sqlite.execute("CREATE TABLE IF NOT EXISTS vals_embeddings (id TEXT NOT NULL, embedding BLOB NOT NULL)");
sqlite.execute("CREATE VIRTUAL TABLE IF NOT EXISTS vss_vals_embeddings USING vss0(embedding(256))");
const existingEmbeddingsIds = new Set(
(await sqlite.execute("SELECT id FROM vals_embeddings")).rows.map((row) => row[0]),
if (!existingEmbeddingsIds.has(id)) {
const embedding = await openai.embeddings.create({
sql: "INSERT INTO vals_embeddings (id, embedding) VALUES (:id, :embeddingBinary)",
"INSERT INTO vss_vals_embeddings (rowid, embedding) SELECT rowid, embedding FROM vals_embeddings WHERE rowid NOT IN (SELE