agent

Image Description Service

01KFFC3GD2N54ZRYWMDT8XWB0M

Properties

actions_required

entity:view
entity:update
file:view
file:update
file:create
file:download
collection:update
relationship:view

description

Generates contextual descriptions and labels for images based on their source documents

endpoint

https://image-description-service.arke.institute

endpoint_verified_at

2026-01-21T04:13:28.116Z

input_schema

properties

entity_id

description: Image entity to describe
type: string

options

description

Agent-specific options

properties

custom_prompt

description: Custom instructions to guide image description generation (appended to the default prompt)
type: string

type

object

required

entity_id

type

object
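An input object that satisfies this schema could be assembled as in the sketch below. The helper name `build_input` and the sample entity ID are hypothetical. Only the field names `entity_id`, `options`, and `custom_prompt` come from the schema. The record does not say how the payload is submitted to the endpoint, so the sketch stops at building and type-checking the dictionary.

```python
from typing import Any, Optional


def build_input(entity_id: str, custom_prompt: Optional[str] = None) -> dict[str, Any]:
    """Build an input object for the agent (hypothetical helper).

    entity_id is the only required field; options.custom_prompt is optional
    and, per the schema, is appended to the default prompt by the service.
    """
    if not isinstance(entity_id, str):
        raise TypeError("entity_id must be a string")
    payload: dict[str, Any] = {"entity_id": entity_id}
    if custom_prompt is not None:
        if not isinstance(custom_prompt, str):
            raise TypeError("custom_prompt must be a string")
        payload["options"] = {"custom_prompt": custom_prompt}
    return payload


# "IMAGE_ENTITY_ID" is a placeholder, not a real entity identifier.
example = build_input("IMAGE_ENTITY_ID", custom_prompt="Mention any visible dates.")
```

Leaving `options` out entirely when no custom prompt is given keeps the payload minimal, since the schema marks only `entity_id` as required.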

output_description

Adds a generated description and label to the target image entity.
The service first follows 'derived_from' or 'extracted_from' relationships from the image to locate a source document, then reads contextual properties from that source (label, title, description, text, ocr_text, filename, content_type) to inform the vision model.
If a medium-resolution derivative exists (found via 'has_derivative'), it is sent instead of the full-size original to balance quality against payload size.
A vision LLM (Mistral-Small-3.2 via DeepInfra) analyzes the image together with the source context and produces a one- to two-sentence description and a short label of two to five words.
Four properties are then written to the image entity: 'description' (the generated prose), 'label' (replacing the previous label, which is often a generic filename), 'description_generated_at' (ISO 8601 timestamp), and 'description_model' (the identifier of the model used).
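The context-gathering steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the service's actual code: the relationship record shape (`predicate`, `target`, `variant`) and the way a medium-resolution derivative is identified are hypothetical, since the record names only the relationship types and the source property keys.

```python
from typing import Any, Optional

# Relationship types and property keys named in the output description.
SOURCE_PREDICATES = ("derived_from", "extracted_from")
CONTEXT_KEYS = ("label", "title", "description", "text",
                "ocr_text", "filename", "content_type")


def find_source_id(relationships: list[dict[str, str]]) -> Optional[str]:
    """Return the target of the first source-document relationship, if any."""
    for rel in relationships:
        if rel.get("predicate") in SOURCE_PREDICATES:
            return rel.get("target")
    return None


def gather_context(source_properties: dict[str, Any]) -> dict[str, Any]:
    """Keep only the non-empty contextual properties from the source document."""
    return {key: source_properties[key]
            for key in CONTEXT_KEYS if source_properties.get(key)}


def pick_image_to_send(image_id: str, relationships: list[dict[str, str]]) -> str:
    """Prefer a medium-resolution derivative; fall back to the original image.

    Assumption: a derivative is marked with variant == "medium".
    """
    for rel in relationships:
        if rel.get("predicate") == "has_derivative" and rel.get("variant") == "medium":
            return rel["target"]
    return image_id
```

All three functions only read their inputs, which matches the read-only relationship traversal the record describes.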

output_relationships

This service does not create or modify any relationships.
It reads 'derived_from' and 'extracted_from' relationships from the image to locate a source document for context.
It reads 'has_derivative' relationships from the image to find a medium-resolution variant to send to the vision model.
All relationship traversal is read-only; the only mutation is updating properties on the target image entity.

output_tree_example

Image entity BEFORE:
  label: 'page_042.jpg'
  filename: 'page_042.jpg'
  content_type: 'image/jpeg'
Image entity AFTER:
  label: 'Map of Colonial Trade Routes'
  filename: 'page_042.jpg'
  content_type: 'image/jpeg'
  description: 'A hand-drawn map showing colonial-era trade routes between Europe, West Africa, and the Caribbean, with arrows indicating the direction of goods and enslaved people.'
  description_generated_at: '2025-05-14T18:32:09.000Z'
  description_model: 'mistralai/Mistral-Small-3.2-24B-Instruct-2506'
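The before/after example above amounts to merging four properties into the existing entity while leaving the others untouched. A minimal sketch, assuming the entity is represented as a plain dictionary; the helper name `apply_description` is hypothetical, and the timestamp formatting is chosen to match the millisecond `Z`-suffixed form shown in the example.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def apply_description(entity: dict[str, Any], description: str, label: str,
                      model: str, now: Optional[datetime] = None) -> dict[str, Any]:
    """Return a copy of the entity with the four generated properties set."""
    now = now or datetime.now(timezone.utc)
    # ISO 8601 with millisecond precision and a trailing "Z" for UTC.
    stamp = now.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    stamp = stamp.replace("+00:00", "Z")
    updated = dict(entity)  # existing properties such as filename are preserved
    updated.update({
        "description": description,
        "label": label,  # replaces the previous, often filename-like, label
        "description_generated_at": stamp,
        "description_model": model,
    })
    return updated


before = {"label": "page_042.jpg", "filename": "page_042.jpg",
          "content_type": "image/jpeg"}
after = apply_description(
    before,
    description="A hand-drawn map showing colonial-era trade routes.",
    label="Map of Colonial Trade Routes",
    model="mistralai/Mistral-Small-3.2-24B-Instruct-2506",
    now=datetime(2025, 5, 14, 18, 32, 9, tzinfo=timezone.utc),
)
```

Returning a copy rather than mutating `before` makes it easy to compare the two states, as the example tree does.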

status

active

Metadata

Version: 7
Created: 1/21/2026
Updated: 1/30/2026
Edited by: ARCHON