agent

Image Description Service

01KFFC3GD2N54ZRYWMDT8XWB0M

Properties

actions_required
  • entity:view
  • entity:update
  • file:view
  • file:update
  • file:create
  • file:download
  • collection:update
  • relationship:view
description
Generates contextual descriptions and labels for images based on their source documents
endpoint
https://image-description-service.arke.institute
endpoint_verified_at
2026-01-21T04:13:28.116Z
input_schema
properties
entity_id
description
Image entity to describe
type
string
options
description
Agent-specific options
properties
custom_prompt
description
Custom instructions to guide image description generation (appended to the default prompt)
type
string
type
object
required
  • entity_id
type
object
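As a minimal sketch of the input schema above, the following Python builds a request body and checks it against the declared constraints. The helper name `build_input` and the placeholder entity ID are hypothetical, and rejecting an empty `entity_id` is an extra assumption beyond the schema. The record does not say how the body is sent to the endpoint, so no HTTP call is shown.

```python
import json
from typing import Optional


def build_input(entity_id: str, custom_prompt: Optional[str] = None) -> dict:
    """Build a body matching input_schema (hypothetical helper, not part of the service)."""
    # entity_id is the only required field and must be a string.
    # Rejecting the empty string is an assumption; the schema only says "string".
    if not isinstance(entity_id, str) or not entity_id:
        raise ValueError("entity_id must be a non-empty string")
    body: dict = {"entity_id": entity_id}
    if custom_prompt is not None:
        if not isinstance(custom_prompt, str):
            raise ValueError("options.custom_prompt must be a string")
        # options is optional; per the schema, custom_prompt is appended
        # to the service's default prompt.
        body["options"] = {"custom_prompt": custom_prompt}
    return body


if __name__ == "__main__":
    # Placeholder ID for illustration only.
    example = build_input("EXAMPLE_IMAGE_ENTITY_ID", "Focus on handwritten annotations.")
    print(json.dumps(example, indent=2))
```

Omitting `options` entirely when no custom prompt is given keeps the body as small as the schema allows, since only `entity_id` is required.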
output_description
Adds a generated description and label to the target image entity. The service first follows 'derived_from' or 'extracted_from' relationships from the image to locate a source document, then pulls contextual properties from that source (label, title, description, text, ocr_text, filename, content_type) to inform the vision model. If a medium-resolution derivative exists (found via 'has_derivative'), it is sent to the model instead of the full-size original to balance quality and payload size. A vision LLM (Mistral-Small-3.2 via DeepInfra) analyzes the image together with the source context and produces a 1-2 sentence description and a short 2-5 word label. Four properties are then written to the image entity: 'description' (the generated prose), 'label' (replacing the previous label, which is often a generic filename), 'description_generated_at' (ISO 8601 timestamp), and 'description_model' (the identifier of the model used).
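The context-gathering step described above might look roughly like the following sketch. The in-memory data shape (a `relationships` list with `predicate` and `target` keys, and a `properties` dict per entity) is an assumption made for illustration and is not taken from the service. Skipping empty values is also an assumption.

```python
from typing import Optional

# Relationship names and property names come from the description above.
SOURCE_RELATIONSHIPS = ("derived_from", "extracted_from")
CONTEXT_PROPERTIES = (
    "label", "title", "description", "text",
    "ocr_text", "filename", "content_type",
)


def find_source_id(image: dict) -> Optional[str]:
    """Return the target of the first source relationship, if any (hypothetical data shape)."""
    for rel in image.get("relationships", []):
        if rel.get("predicate") in SOURCE_RELATIONSHIPS:
            return rel.get("target")
    return None


def gather_context(image: dict, entities: dict) -> dict:
    """Collect the contextual properties of the source document, or {} if none is found."""
    source_id = find_source_id(image)
    source = entities.get(source_id) if source_id else None
    if source is None:
        return {}
    props = source.get("properties", {})
    # Keep only the listed properties that are present and non-empty (assumption).
    return {key: props[key] for key in CONTEXT_PROPERTIES if props.get(key)}
```

An image with no source relationship yields an empty context here; how the real service behaves in that case is not stated in this record.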
output_relationships
  • This service does not create or modify any relationships.
  • It reads 'derived_from' and 'extracted_from' relationships from the image to locate a source document for context.
  • It reads 'has_derivative' relationships from the image to find a medium-resolution variant to send to the vision model.
  • All relationship traversal is read-only; the only mutation is updating properties on the target image entity.
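To illustrate the read-only traversal in the list above, here is a hedged sketch of choosing which image to send to the vision model. The record does not say how a derivative is marked as medium-resolution, so the `variant` property and its `"medium"` value are hypothetical, as is the in-memory data shape.

```python
def pick_image_for_model(image: dict, entities: dict) -> dict:
    """Return a medium-resolution derivative if one exists, else the original image.

    Hypothetical helper: the `variant` property is an assumed marker and is
    not confirmed by the service description.
    """
    for rel in image.get("relationships", []):
        if rel.get("predicate") != "has_derivative":
            continue
        candidate = entities.get(rel.get("target"))
        if candidate and candidate.get("properties", {}).get("variant") == "medium":
            return candidate
    # No suitable derivative found: fall back to the full-size original.
    return image
```

The function only reads from `image` and `entities` and never assigns into them, mirroring the statement that relationship traversal does not modify anything.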
output_tree_example
Image entity BEFORE:
  label: 'page_042.jpg'
  filename: 'page_042.jpg'
  content_type: 'image/jpeg'

Image entity AFTER:
  label: 'Map of Colonial Trade Routes'
  filename: 'page_042.jpg'
  content_type: 'image/jpeg'
  description: 'A hand-drawn map showing colonial-era trade routes between Europe, West Africa, and the Caribbean, with arrows indicating the direction of goods and enslaved people.'
  description_generated_at: '2025-05-14T18:32:09.000Z'
  description_model: 'mistralai/Mistral-Small-3.2-24B-Instruct-2506'
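The before/after change in the example can be sketched as a pure function over the entity's properties. The name `apply_description` is hypothetical and this is not the service's actual update code; it only shows which four properties are written and that the others are left untouched.

```python
from datetime import datetime, timezone
from typing import Optional


def apply_description(properties: dict, description: str, label: str,
                      model: str, now: Optional[datetime] = None) -> dict:
    """Return a copy of `properties` with the four generated fields set (hypothetical helper)."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # ISO 8601 with millisecond precision and a 'Z' suffix, matching the example.
    stamp = moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    updated = dict(properties)  # copy so the caller's dict is not mutated
    updated.update({
        "description": description,
        "label": label,  # replaces the previous label, e.g. a generic filename
        "description_generated_at": stamp,
        "description_model": model,
    })
    return updated
```

Passing `now` explicitly makes the result reproducible, which is convenient when checking output against a fixed example such as the one above.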
status
active