Readme
Use this for summarizers.
Combines https://r.jina.ai/URL and markdown.download's YouTube transcription getter to do its best to retrieve content from URLs.
https://arstechnica.com/space/2024/06/nasa-indefinitely-delays-return-of-starliner-to-review-propulsion-data
https://journals.asm.org/doi/10.1128/iai.00065-23
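The r.jina.ai prefixing described above can be sketched roughly as follows. This is a minimal illustration, not the val's actual implementation: `buildReaderRequest` and `ReaderOptions` are hypothetical names, the header names are the ones that appear in the val's source, and the sketch only builds the request so it runs without network access.

```typescript
// Hypothetical option names, mirroring a subset of those visible in the val's source.
interface ReaderOptions {
  noCache?: boolean;
  targetSelector?: string;
  timeout?: number;
  returnFormat?: string;
}

// Build the Reader URL and headers without performing any network I/O.
function buildReaderRequest(
  url: string,
  opts: ReaderOptions = {},
): { url: string; headers: Record<string, string> } {
  const baseUrl = "https://r.jina.ai/";
  // Prefix the target URL unless it already points at the Reader.
  const readerUrl = url.startsWith(baseUrl) ? url : baseUrl + url;

  // Only set headers for options the caller actually supplied.
  const headers: Record<string, string> = {};
  if (opts.noCache) headers["X-No-Cache"] = "true";
  if (opts.targetSelector) headers["X-Target-Selector"] = opts.targetSelector;
  if (opts.timeout !== undefined) headers["X-Timeout"] = String(opts.timeout);
  if (opts.returnFormat) headers["X-Return-Format"] = opts.returnFormat;

  return { url: readerUrl, headers };
}

const request = buildReaderRequest("https://example.com/article", {
  noCache: true,
  timeout: 30,
});
console.log(request.url); // https://r.jina.ai/https://example.com/article
console.log(request.headers);
```

To actually retrieve the content, the result could be passed to `fetch(request.url, { headers: request.headers })` and read with `.text()`. Keeping the request construction separate from the fetch makes the header logic easy to check in isolation.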
Usage:
Code
/** @jsx jsx */
import { jsx } from "https://deno.land/x/hono@v3.11.7/middleware.ts";
import { Hono } from "https://deno.land/x/hono@v3.11.7/mod.ts";
import { ai } from "https://esm.town/v/yawnxyz/ai";
import { getUrlMetadata } from "https://esm.town/v/yawnxyz/urlMetadata";
import { getHtmlMetadata } from "https://esm.town/v/yawnxyz/htmlMetadata";
import { getCitation } from "https://esm.town/v/yawnxyz/citation";
import { blobby } from "https://esm.town/v/yawnxyz/blobby";
import { transcribeAudio } from "https://esm.town/v/yawnxyz/stt";
import stringHash from 'npm:string-hash';
const app = new Hono();
// https://www.crossref.org/blog/dois-and-matching-regular-expressions/
const DOI_REGEX = /\b(10\.\d{4,9}\/[-._;()\/:\w]+)\b/i;
export const getJinaContent = async (url, opts = {}) => {
  const baseUrl = 'https://r.jina.ai/';
  if (!url.includes('r.jina.ai')) {
    url = baseUrl + url;
  }
  const fullUrl = new URL(url);
  const headers = {
    ...(opts.withImagesSummary && { 'X-With-Images-Summary': 'true' }),
    ...(opts.withGeneratedAlt && { 'X-With-Generated-Alt': 'true' }),
    ...(opts.withLinksSummary && { 'X-With-Links-Summary': 'true' }),
    ...(opts.noCache && { 'X-No-Cache': 'true' }),
    ...(opts.accept && { 'Accept': opts.accept }),
    ...(opts.targetSelector && { 'X-Target-Selector': opts.targetSelector }),
    ...(opts.timeout && { 'X-Timeout': opts.timeout.toString() }),
    ...(opts.waitForSelector && { 'X-Wait-For-Selector': opts.waitForSelector }),
    ...(opts.returnFormat && { 'X-Return-Format': opts.returnFormat }),
  };
  console.log('[getJinaContent] Fetching:', fullUrl.toString(), headers);
yawnxyz-getcontentfromurl.web.val.run
Updated: August 24, 2024