Jina

jina-ai
Supports web content extraction, academic literature search, and image retrieval for efficient information acquisition and processing.

Jina 19

show_api_key

Return the bearer token from the Authorization header in the MCP settings. Intended for debugging.

No parameters required

primer

Get up-to-date contextual information about the current session to provide localized, time-aware responses. Use this when you need to know the current time, the user's location, or the network environment to give more relevant and personalized information.

No parameters required

guess_datetime_url

Guess the last updated or published datetime of a web page. This tool examines HTTP headers, HTML metadata, Schema.org data, visible dates, JavaScript timestamps, HTML comments, Git information, RSS/Atom feeds, sitemaps, and international date formats to provide the most accurate update time with confidence scores. Returns the best guess timestamp and confidence level.

Parameters (1)
url string Required

The complete HTTP/HTTPS URL of the webpage whose last-updated or published datetime should be guessed

capture_screenshot_url

Capture high-quality screenshots of web pages in base64 encoded JPEG format. Use this tool when you need to visually inspect a website, take a snapshot for analysis, or show users what a webpage looks like.

Parameters (3)
firstScreenOnly boolean Optional

Set to true for a single screen capture (faster), false for full page capture including content below the fold

return_url boolean Optional

Set to true to return screenshot URLs instead of downloading images as base64

url string Required

The complete HTTP/HTTPS URL of the webpage to capture (e.g., 'https://example.com')
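A minimal sketch of handling a base64 JPEG screenshot on the client side. It assumes the result arrives as a standard MCP image content item (a dict with "type", "mimeType", and base64 "data"); the helper names and the sample bytes are hypothetical and only illustrate the decoding step.

```python
import base64


def decode_image_item(item):
    """Decode the base64 payload of an MCP-style image content item."""
    if item.get("type") != "image":
        raise ValueError("not an image content item")
    return base64.b64decode(item["data"])


def looks_like_jpeg(raw):
    """JPEG files start with the bytes FF D8 FF."""
    return raw[:3] == b"\xff\xd8\xff"


# Hypothetical stand-in for a tool result: a JPEG header plus padding,
# not a real screenshot.
sample_item = {
    "type": "image",
    "mimeType": "image/jpeg",
    "data": base64.b64encode(b"\xff\xd8\xff\xe0" + b"\x00" * 4).decode("ascii"),
}

raw = decode_image_item(sample_item)
print(len(raw), looks_like_jpeg(raw))
```

When return_url is set to true, the tool returns screenshot URLs instead, so no decoding step of this kind applies.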

read_url

Extract and convert web page content to clean, readable markdown format. Perfect for reading articles, documentation, blog posts, or any web content. Use this when you need to analyze text content from websites, bypass paywalls, or get structured data.

Parameters (3)
url string Required

The complete URL of the webpage or PDF file to read and convert (e.g., 'https://example.com/article'). Can be a single URL string or an array of URLs for parallel reading.

withAllImages boolean Optional

Set to true to extract and return all images found on the page as structured data

withAllLinks boolean Optional

Set to true to extract and return all hyperlinks found on the page as structured data
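As an illustration, a read_url invocation could be built as a JSON-RPC 2.0 `tools/call` request, which is how MCP clients invoke tools. The function name below is hypothetical, and transport and authentication are left out; this is a sketch of the payload only.

```python
import json


def build_read_url_call(url, with_all_links=False, with_all_images=False, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the read_url tool."""
    arguments = {"url": url}
    # Include optional flags only when set, so server defaults apply otherwise.
    if with_all_links:
        arguments["withAllLinks"] = True
    if with_all_images:
        arguments["withAllImages"] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "read_url", "arguments": arguments},
    }


request = build_read_url_call("https://example.com/article", with_all_links=True)
print(json.dumps(request, indent=2))
```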

search_web

Search the entire web for current information, news, articles, and websites. Use this when you need up-to-date information, want to find specific websites, research topics, or get the latest news. Ideal for answering questions about recent events, finding resources, or discovering relevant content.

Parameters (6)
gl string Optional

Country code, e.g., 'dz' for Algeria

hl string Optional

Language code, e.g., 'zh-cn' for Simplified Chinese

location string Optional

Location for search results, e.g., 'London', 'New York', 'Tokyo'

num number Optional

Maximum number of search results to return, between 1 and 100

query string Required

Search terms or keywords to find relevant web content (e.g., 'climate change news 2024', 'best pizza recipe'). Can be a single query string or an array of queries for parallel search.

tbs string Optional

Time-based search filter: 'qdr:h' (past hour), 'qdr:d' (past day), 'qdr:w' (past week), 'qdr:m' (past month), or 'qdr:y' (past year)
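The parameter constraints above can be checked on the client before sending a request. The sketch below assumes that tbs accepts only the five listed values and that optional fields may simply be omitted; the function name is hypothetical.

```python
ALLOWED_TBS = {"qdr:h", "qdr:d", "qdr:w", "qdr:m", "qdr:y"}


def search_web_arguments(query, num=None, tbs=None, gl=None, hl=None, location=None):
    """Validate and assemble the arguments object for a search_web call."""
    if not query:
        raise ValueError("query is required")
    if num is not None and not 1 <= num <= 100:
        raise ValueError("num must be between 1 and 100")
    if tbs is not None and tbs not in ALLOWED_TBS:
        raise ValueError(f"tbs must be one of {sorted(ALLOWED_TBS)}")
    optional = {"num": num, "tbs": tbs, "gl": gl, "hl": hl, "location": location}
    # Drop unset options so only explicit choices are sent.
    return {"query": query, **{k: v for k, v in optional.items() if v is not None}}


args = search_web_arguments("climate change news 2024", num=5, tbs="qdr:w")
print(args)
```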

expand_query

Expand and rewrite search queries based on an up-to-date query expansion model. This tool takes an initial query and returns multiple expanded queries that can be used for more diverse and deeper searches. Useful for improving deep research results by searching more broadly and deeply.

Parameters (1)
query string Required

The search query to expand (e.g., 'machine learning', 'climate change')

search_arxiv

Search academic papers and preprints on arXiv repository. Perfect for finding research papers, scientific studies, technical papers, and academic literature. Use this when researching scientific topics, looking for papers by specific authors, or finding the latest research in fields like AI, physics, mathematics, computer science, etc.

Parameters (3)
num number Optional

Maximum number of academic papers to return, between 1 and 100

query string Required

Academic search terms, author names, or research topics (e.g., 'transformer neural networks', 'Einstein relativity', 'machine learning optimization'). Can be a single query string or an array of queries for parallel search.

tbs string Optional

Time-based search filter: 'qdr:h' (past hour), 'qdr:d' (past day), 'qdr:w' (past week), 'qdr:m' (past month), or 'qdr:y' (past year)

search_ssrn

Search academic papers and preprints on SSRN (Social Science Research Network). Perfect for finding research papers in social sciences, economics, law, finance, accounting, management, and humanities. Use this when researching social science topics, looking for working papers, or finding the latest research in business and economics fields.

Parameters (3)
num number Optional

Maximum number of academic papers to return, between 1 and 100

query string Required

Academic search terms, author names, or research topics (e.g., 'corporate governance', 'behavioral finance', 'contract law'). Can be a single query string or an array of queries for parallel search.

tbs string Optional

Time-based search filter: 'qdr:h' (past hour), 'qdr:d' (past day), 'qdr:w' (past week), 'qdr:m' (past month), or 'qdr:y' (past year)

search_jina_blog

Search Jina AI news and blog posts at jina.ai/news for articles about AI, machine learning, neural search, embeddings, and Jina products. Use this to find official Jina documentation, tutorials, product announcements, and technical deep-dives.

Parameters (3)
num number Optional

Maximum number of blog posts to return, between 1 and 100

query string Required

Search terms to find relevant Jina blog posts (e.g., 'embeddings', 'reranker', 'ColBERT'). Can be a single query string or an array of queries for parallel search.

tbs string Optional

Time-based search filter: 'qdr:h' (past hour), 'qdr:d' (past day), 'qdr:w' (past week), 'qdr:m' (past month), or 'qdr:y' (past year)

search_images

Search for images across the web, similar to Google Images. Use this when you need to find photos, illustrations, diagrams, charts, logos, or any visual content. Perfect for finding images to illustrate concepts, locating specific pictures, or discovering visual resources. Images are returned by default as small base64-encoded JPEG images.

Parameters (6)
gl string Optional

Country code, e.g., 'dz' for Algeria

hl string Optional

Language code, e.g., 'zh-cn' for Simplified Chinese

location string Optional

Location for search results, e.g., 'London', 'New York', 'Tokyo'

query string Required

Image search terms describing what you want to find (e.g., 'sunset over mountains', 'vintage car illustration', 'data visualization chart')

return_url boolean Optional

Set to true to return image URLs, title, shapes, and other metadata. By default, images are downloaded as base64 and returned as rendered images.

tbs string Optional

Time-based search filter: 'qdr:h' (past hour), 'qdr:d' (past day), 'qdr:w' (past week), 'qdr:m' (past month), or 'qdr:y' (past year)

parallel_search_web

Run multiple web searches in parallel for comprehensive topic coverage and diverse perspectives. For best results, provide multiple search queries that explore different aspects of your topic. You can use expand_query to help generate diverse queries, or create them yourself.

Parameters (2)
searches array Required

Array of search configurations to execute in parallel (maximum 5 searches for optimal performance)

timeout number Optional

Timeout in milliseconds for all searches
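Because each call takes at most 5 searches, a longer list of queries has to be split into several calls. The sketch below assumes each search configuration is an object with at least a "query" field (the exact configuration schema is not stated here), and the function name is hypothetical.

```python
MAX_PARALLEL = 5  # documented maximum number of searches per call


def batch_searches(queries, timeout_ms=None):
    """Split queries into arguments objects of at most MAX_PARALLEL searches each."""
    batches = []
    for start in range(0, len(queries), MAX_PARALLEL):
        chunk = queries[start:start + MAX_PARALLEL]
        # Assumed shape: one config per query, carrying only the query text.
        arguments = {"searches": [{"query": q} for q in chunk]}
        if timeout_ms is not None:
            arguments["timeout"] = timeout_ms
        batches.append(arguments)
    return batches


batches = batch_searches([f"topic angle {i}" for i in range(7)], timeout_ms=30000)
for arguments in batches:
    print(len(arguments["searches"]), arguments["timeout"])
```

Each resulting object would be sent as the arguments of one parallel_search_web call.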

parallel_search_arxiv

Run multiple arXiv searches in parallel for comprehensive research coverage and diverse academic angles. For best results, provide multiple search queries that explore different research angles and methodologies. You can use expand_query to help generate diverse queries, or create them yourself.

Parameters (2)
searches array Required

Array of arXiv search configurations to execute in parallel (maximum 5 searches for optimal performance)

timeout number Optional

Timeout in milliseconds for all searches

parallel_search_ssrn

Run multiple SSRN searches in parallel for comprehensive social science research coverage and diverse academic angles. For best results, provide multiple search queries that explore different research angles and methodologies. You can use expand_query to help generate diverse queries, or create them yourself.

Parameters (2)
searches array Required

Array of SSRN search configurations to execute in parallel (maximum 5 searches for optimal performance)

timeout number Optional

Timeout in milliseconds for all searches

parallel_read_url

Read multiple web pages in parallel to extract clean content efficiently. For best results, provide multiple URLs that you need to extract simultaneously. This is useful for comparing content across multiple sources or gathering information from multiple pages at once.

Parameters (2)
timeout number Optional

Timeout in milliseconds for all URL reads

urls array Required

Array of URL configurations to read in parallel (maximum 5 URLs for optimal performance)

sort_by_relevance

Rerank a list of documents by relevance to a query using Jina Reranker API. Use this when you have multiple documents and want to sort them by how well they match a specific query or topic. Perfect for document retrieval, content filtering, or finding the most relevant information from a collection.

Parameters (3)
documents array Required

Array of document texts to rerank by relevance

query string Required

The query or topic to rank documents against (e.g., 'machine learning algorithms', 'climate change solutions')

top_n number Optional

Maximum number of top results to return

deduplicate_strings

Get top-k semantically unique strings from a list using Jina embeddings and submodular optimization. Use this when you have many similar strings and want to select the most diverse subset that covers the semantic space. Perfect for removing duplicates, selecting representative samples, or finding diverse content.

Parameters (2)
k number Optional

Number of unique strings to return. If not provided, the optimal k is found automatically by looking at diminishing returns

strings array Required

Array of strings to deduplicate
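To make the idea of submodular selection concrete, here is a toy sketch of greedy facility-location selection over embedding vectors. It is not the tool's actual implementation: the two-dimensional vectors are hand-made stand-ins for real embeddings, and the `min_gain_ratio` stopping rule is a hypothetical way to mimic "diminishing returns" when k is omitted.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_diverse_subset(vectors, k=None, min_gain_ratio=0.1):
    """Greedily pick indices that best cover all items (facility location)."""
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    selected, first_gain = [], None
    best_cover = [0.0] * n  # best similarity of each item to any selected item
    limit = n if k is None else min(k, n)
    while len(selected) < limit:
        # Marginal gain: how much each candidate improves total coverage.
        gains = {
            c: sum(max(sim[c][j] - best_cover[j], 0.0) for j in range(n))
            for c in range(n) if c not in selected
        }
        best_idx = max(gains, key=gains.get)
        if first_gain is None:
            first_gain = gains[best_idx]
        elif k is None and gains[best_idx] < min_gain_ratio * first_gain:
            break  # diminishing returns: further picks add little coverage
        selected.append(best_idx)
        best_cover = [max(best_cover[j], sim[best_idx][j]) for j in range(n)]
    return selected


strings = ["reset password", "how to reset my password", "pricing plans", "plan pricing"]
vectors = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.05, 0.99]]  # toy embeddings
picked = greedy_diverse_subset(vectors)
print([strings[i] for i in picked])
```

With these toy vectors the selection keeps one string from each near-duplicate pair and then stops, because a third pick would add almost no coverage.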

deduplicate_images

Get top-k semantically unique images (URLs or base64-encoded) using Jina CLIP v2 embeddings and submodular optimization. Use this when you have many visually similar images and want the most diverse subset.

Parameters (2)
images array Required

Array of image inputs to deduplicate. Each item can be either an HTTP(S) URL or a raw base64-encoded image string (without data URI prefix).

k number Optional

Number of unique images to return. If not provided, the optimal k is found automatically by looking at diminishing returns

extract_pdf

Extract figures, tables, and equations from PDF documents using layout detection. Perfect for extracting visual elements from academic papers on arXiv or any PDF URL. Returns base64-encoded images of detected elements with metadata.

Parameters (4)
id string Optional

arXiv paper ID (e.g., '2301.12345' or 'hep-th/9901001'). Either id or url is required.

max_edge number Optional

Maximum edge size for extracted images in pixels (default: 1024)

type string Optional

Filter by element (float) type, given as a comma-separated list of: figure, table, equation. If not specified, all types are returned.

url string Optional

Direct PDF URL. Either id or url is required.
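The "either id or url" rule and the comma-separated type filter can be sketched as a small argument builder. The function and keyword names are hypothetical, and whether the server accepts both id and url together is not stated, so the sketch only enforces that at least one is present.

```python
FLOAT_TYPES = ("figure", "table", "equation")


def extract_pdf_arguments(arxiv_id=None, url=None, types=None, max_edge=None):
    """Assemble the arguments object for an extract_pdf call."""
    if not arxiv_id and not url:
        raise ValueError("either id or url is required")
    arguments = {}
    if arxiv_id:
        arguments["id"] = arxiv_id
    if url:
        arguments["url"] = url
    if types:
        unknown = [t for t in types if t not in FLOAT_TYPES]
        if unknown:
            raise ValueError(f"unknown types: {unknown}")
        # The tool expects a single comma-separated string, not a list.
        arguments["type"] = ",".join(types)
    if max_edge is not None:
        arguments["max_edge"] = max_edge
    return arguments


pdf_args = extract_pdf_arguments(arxiv_id="2301.12345", types=["figure", "table"])
print(pdf_args)
```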