Overview

The Image Annotation Tool generates annotations for static image datasets using open-source Vision Language Models (VLMs). We have used this tool to create an annotation bank for the NSD Shared 1000 images, the subset of images viewed by all 8 subjects in the Natural Scenes Dataset study.

The tool runs VLMs locally via OLLAMA. The current annotation bank includes outputs from six models: Qwen2.5-VL (7B, 32B), Gemma3 (4B, 12B, 27B), and Mistral-Small3.2 (24B). Quality assessment across models is in progress.
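
As a minimal example, a single annotation request through the OLLAMA Python client looks roughly like the sketch below; the model tag and image path are placeholders, and any of the models above can be substituted:

import ollama

# "qwen2.5vl:7b" is an assumed tag; substitute any pulled vision model.
response = ollama.chat(
    model="qwen2.5vl:7b",
    messages=[{"role": "user",
               "content": "Describe this image in detail.",
               "images": ["shared1000/image_0001.png"]}],  # hypothetical path
)
print(response["message"]["content"])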


Key Features

Models Used

All annotations are generated locally via OLLAMA:


  • Qwen2.5-VL: 7B and 32B parameter versions
  • Gemma3: 4B, 12B, and 27B parameter versions
  • Mistral-Small3.2: 24B parameters

Multi-Prompt Annotation

Each image is annotated using multiple prompts (general description, foreground/background, entities and interactions, mood and emotions) across all models to capture different aspects of the scene.
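
A minimal sketch of this loop, assuming hypothetical prompt wordings and OLLAMA model tags (verify the exact tags against the OLLAMA library):

import ollama

# Hypothetical prompt set; the shipped prompts may differ in wording.
PROMPTS = {
    "general_description": "Describe the image in detail.",
    "foreground_background": "Describe the foreground and the background separately.",
    "entities_interactions": "List the entities in the scene and how they interact.",
    "mood_emotions": "Describe the mood and emotional tone of the scene.",
}

# Assumed OLLAMA tags for the six models.
MODELS = ["qwen2.5vl:7b", "qwen2.5vl:32b", "gemma3:4b",
          "gemma3:12b", "gemma3:27b", "mistral-small3.2:24b"]

def annotate_image(image_path: str) -> dict:
    """Return one annotation per (model, prompt) pair for a single image."""
    return {
        model: {
            name: ollama.chat(
                model=model,
                messages=[{"role": "user", "content": prompt,
                           "images": [image_path]}],
            )["message"]["content"]
            for name, prompt in PROMPTS.items()
        }
        for model in MODELS
    }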

HED Integration (Planned)

Integration with Hierarchical Event Descriptors (HED) is the next development priority:

  • Mapping VLM annotations to HED tags
  • Validation against HED schema
  • Export in BIDS-compliant events.tsv format
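
Because this integration is still planned, the following is only a speculative sketch of the tag-mapping step; the keyword table and HED tag names are illustrative placeholders and have not been validated against the HED schema:

# Speculative keyword-to-HED mapping; tag names are placeholders and
# would need validation against the official HED schema before use.
KEYWORD_TO_HED = {
    "dog": "Animal-agent",
    "beach": "Outdoors",
    "running": "Move",
}

def annotation_to_hed(description: str) -> str:
    """Build a comma-separated HED string from keywords found in free text."""
    found = [tag for keyword, tag in KEYWORD_TO_HED.items()
             if keyword in description.lower()]
    return ", ".join(found)

print(annotation_to_hed("A dog running on a beach."))  # Animal-agent, Outdoors, Move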

BIDS Compliance

Annotations follow stimuli-BIDS specifications:

  • Standardized events.tsv format
  • JSON sidecars with annotation schema
  • Compatible with neuroimaging datasets
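
As a concrete, hypothetical example of the export target, the sketch below writes one events.tsv row and a matching JSON sidecar; onset and duration are the BIDS-required columns, while the stimulus and annotation column names (and the output filenames) are assumptions:

import csv
import json

# "onset" and "duration" are required by BIDS; the other columns
# are hypothetical names.
rows = [{"onset": 0.0, "duration": 3.0, "stim_file": "image_0001.png",
         "general_description": "A dog running on a beach."}]

with open("task-nsd_events.tsv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys(), delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)

# JSON sidecar documenting the annotation column.
sidecar = {"general_description":
           {"Description": "VLM-generated description of the stimulus."}}
with open("task-nsd_events.json", "w") as f:
    json.dump(sidecar, f, indent=2)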

Web Dashboard

Interactive visualization with AGI branding, real-time annotation preview, and easy navigation through large datasets.


Annotation Types

The tool supports multiple annotation types for comprehensive image description:

General Description

Detailed natural language descriptions of image content, setting, main elements, colors, lighting, and overall composition.

Object Detection

Identification and localization of objects within images, compatible with COCO categories.

Scene Categorization

Classification of scenes into semantic categories for cross-dataset analysis.

Emotional Ratings

Valence and arousal ratings for affective neuroscience applications.
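
Taken together, a single image's record might look like the hypothetical structure below; field names and rating scales are illustrative, not the tool's exact schema:

# Hypothetical combined record for one image.
annotation_record = {
    "image_id": "shared1000_0001",
    "general_description": "A dog running on a sunlit beach at low tide.",
    "objects": [{"label": "dog", "bbox": [120, 80, 340, 260]}],  # COCO-style box
    "scene_category": "beach",
    "emotion": {"valence": 7.2, "arousal": 5.1},  # e.g., on 1-9 scales
}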


Natural Scenes Dataset (NSD)

The tool is optimized for the NSD Shared 1000, the set of 1,000 images viewed by all 8 subjects in the NSD study. This shared subset enables:

  • Cross-subject analysis of neural representations
  • Benchmark annotations for model comparison
  • Foundation for the broader 73,000-image collection

Architecture

Backend

  • FastAPI: Python API for VLM orchestration
  • OLLAMA: Local model serving on GPU
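
A minimal sketch of how such an endpoint might forward an image to OLLAMA; the route, parameters, and default model tag are illustrative, not the tool's actual API:

import ollama
from fastapi import FastAPI, UploadFile

app = FastAPI()

@app.post("/annotate")
async def annotate(file: UploadFile, model: str = "qwen2.5vl:7b"):
    # Forward the uploaded image to the locally served VLM and
    # return its free-text description.
    image_bytes = await file.read()
    response = ollama.chat(
        model=model,
        messages=[{"role": "user",
                   "content": "Describe this image in detail.",
                   "images": [image_bytes]}],
    )
    return {"model": model, "annotation": response["message"]["content"]}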

Frontend

  • Next.js: Modern React framework
  • AGI Branding: Consistent design with Annotation Garden
  • Real-time Updates: Live annotation preview

Storage

  • JSON files with comprehensive metrics
  • Database support for large datasets
  • Version control through Git
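
A plausible on-disk layout under this scheme, assuming one JSON file per image (actual file naming may differ):

annotations/
  shared1000_0001.json   # all models' annotations plus quality metrics
  shared1000_0002.json
  ...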

Performance Benchmarks

All performance metrics were generated using an NVIDIA GeForce RTX 4090 GPU with OLLAMA for local model inference.

Annotation Tools

Python utilities for post-processing annotations:

from image_annotation.utils import reorder_annotations, remove_model, export_to_csv

# Reorder model annotations by quality
reorder_annotations("annotations/", ["best_model", "second_best"])

# Remove underperforming models
remove_model("annotations/", "poor_model")

# Export for analysis
export_to_csv("annotations/", "results.csv", include_metrics=True)

Quick Start

Prerequisites

  • Python 3.11+
  • Node.js 18+
  • OLLAMA
  • GPU with sufficient VRAM for target models

Installation

# Clone from Annotation Garden
git clone https://github.com/Annotation-Garden/image-annotation.git
cd image-annotation

# Python environment
conda create -n torch-312 python=3.12
conda activate torch-312
pip install -e .

# Frontend
cd frontend && npm install

Usage

# Start OLLAMA (for local models)
ollama serve

# Run frontend dashboard
cd frontend && npm run dev
# Visit http://localhost:3000
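
Each model must also be pulled once before first use. With the OLLAMA server running, this can be done from Python; the tags below are assumptions based on OLLAMA library naming and should be verified:

import ollama

# Assumed OLLAMA tags for the six models; verify before pulling.
for tag in ["qwen2.5vl:7b", "qwen2.5vl:32b", "gemma3:4b",
            "gemma3:12b", "gemma3:27b", "mistral-small3.2:24b"]:
    ollama.pull(tag)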

Access Information

License

  • License: CC-BY-NC-SA 4.0


© 2025 Seyed Yahya Shirazi