DAM LLM · Independent research · AI × DAM

Statistic · AI Tagging · From the AI Tagging Provider Index

9 of 10

Vision APIs publishing per-unit pricing.

Nine of the 10 leading image-tagging APIs publish a per-unit price (per image, per 1,000 requests, or per 1M tokens) directly on their public pricing page. One provider publishes plans but gates its per-unit rate behind a sales conversation. For the three frontier multimodal LLMs, the unit is tokens, not images; mapping that back to a per-image cost takes work that most operators don't anticipate at the planning stage.

As of: May 26, 2026
Sample: n=10 providers
Source: AI Tagging Index v1.0
Updated: Monthly
Methodology: Read →
Topic: AI Tagging

Public per-unit pricing · by provider

v1.0 · Snapshot 2026-05-26 · re-verified monthly

| Provider | Pricing published | Unit | Notes |
| --- | --- | --- | --- |
| Google Cloud Vision | Yes | per 1,000 requests | Per-feature pricing tiers public. |
| AWS Rekognition | Yes | per 1,000 images | Tiered by volume; per-feature rates listed. |
| Azure AI Vision | Yes | per 1,000 transactions | Tiered F0/S1 pricing public. |
| Clarifai | Yes | per operation | Operation-based pricing on public site. |
| Imagga | Yes | per month / per image | Plan tiers with per-image fall-through. |
| Cloudinary AI | Yes | per credit | Credit-based model maps to AI operations. |
| Anthropic Claude (vision) | Yes | per 1M tokens | Per-token pricing public; images priced as input tokens. |
| OpenAI GPT-4o (vision) | Yes | per 1M tokens | Per-token pricing public; image-token pricing documented. |
| Google Gemini (vision) | Yes | per 1M tokens | Per-token pricing public. |
| Hive AI | Partial | | Pricing page references plans but no per-unit number; sales contact required for production rates. |

"Yes" requires that an outside developer can find a per-unit price (per image, per request, per 1,000 ops, per 1M tokens) without filling in a contact form. Token-based pricing for multimodal LLMs is counted as "Yes" because the rate is public, even if mapping to per-image cost takes a calculator. Cells re-verified monthly. Methodology →

The token-cost gotcha

For Anthropic, OpenAI, and Google Gemini, the listed token rate looks competitive next to classical per-image pricing until you do the conversion. A single high-resolution image sent to GPT-4o consumes roughly 1,500 to 3,000 input tokens; sent to Claude, several thousand. At list rates, that puts a single image inference at roughly 5 to 15× the per-image cost of Google Cloud Vision or AWS Rekognition for a comparable label-extraction task. Teams pay the premium for what they get in return: open-ended reasoning, instruction following, and multi-image context. But the cost surprise is real, and we see operators caught off guard by it more often than by any other line item in the index.
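The conversion from a per-token rate to a per-image cost can be sketched in a few lines of Python. This is a minimal illustration, not a pricing tool: the function names are hypothetical, and every rate and token count below is a placeholder assumption rather than any vendor's published price. Substitute the figures from the provider's current pricing page and your own measured token usage.

```python
def llm_cost_per_image(image_tokens: int, prompt_tokens: int, output_tokens: int,
                       input_usd_per_1m: float, output_usd_per_1m: float) -> float:
    """Per-image cost for a token-priced multimodal LLM.

    The image and the text prompt are both billed as input tokens;
    the returned labels are billed as output tokens.
    """
    input_cost = (image_tokens + prompt_tokens) * input_usd_per_1m / 1_000_000
    output_cost = output_tokens * output_usd_per_1m / 1_000_000
    return input_cost + output_cost


def classical_cost_per_image(usd_per_1k_images: float) -> float:
    """Per-image cost for an API priced per 1,000 images or requests."""
    return usd_per_1k_images / 1_000


# Placeholder assumptions only, NOT real vendor rates.
ASSUMED_INPUT_RATE = 2.50      # USD per 1M input tokens
ASSUMED_OUTPUT_RATE = 10.00    # USD per 1M output tokens
ASSUMED_CLASSICAL_RATE = 1.50  # USD per 1,000 images

llm = llm_cost_per_image(
    image_tokens=2_000,   # assumed tokens for one high-resolution image
    prompt_tokens=100,    # assumed tagging instructions
    output_tokens=150,    # assumed length of the returned label list
    input_usd_per_1m=ASSUMED_INPUT_RATE,
    output_usd_per_1m=ASSUMED_OUTPUT_RATE,
)
classical = classical_cost_per_image(ASSUMED_CLASSICAL_RATE)

print(f"LLM: ${llm:.5f} per image")
print(f"Classical: ${classical:.5f} per image")
print(f"Multiple: {llm / classical:.1f}x")
```

Output tokens are included deliberately: a comparison that counts only image input tokens understates the multiple, because output rates are typically listed higher than input rates.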

What counts

  • Yes — per-unit price is published on the vendor's public pricing page.
  • Partial — pricing page exists but per-unit rate is not disclosed; sales contact required.
  • No — pricing is entirely behind a sales call.
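The three-way rubric above can be expressed as a small classification function. The sketch below is illustrative only: the `Publication` enum, the `classify` function, and the two boolean flags are hypothetical names, and the flag values are transcribed from the provider table in this snapshot rather than fetched from any vendor.

```python
from collections import Counter
from enum import Enum


class Publication(Enum):
    YES = "Yes"
    PARTIAL = "Partial"
    NO = "No"


def classify(has_public_pricing_page: bool, per_unit_rate_listed: bool) -> Publication:
    """Apply the rubric: a rate only counts if it sits on a public pricing page."""
    if has_public_pricing_page and per_unit_rate_listed:
        return Publication.YES
    if has_public_pricing_page:
        return Publication.PARTIAL
    return Publication.NO


# (has_public_pricing_page, per_unit_rate_listed), per the table in this snapshot.
PROVIDERS = {
    "Google Cloud Vision": (True, True),
    "AWS Rekognition": (True, True),
    "Azure AI Vision": (True, True),
    "Clarifai": (True, True),
    "Imagga": (True, True),
    "Cloudinary AI": (True, True),
    "Anthropic Claude (vision)": (True, True),
    "OpenAI GPT-4o (vision)": (True, True),
    "Google Gemini (vision)": (True, True),
    "Hive AI": (True, False),
}

tally = Counter(classify(*flags) for flags in PROVIDERS.values())
print(f"{tally[Publication.YES]} of {len(PROVIDERS)} publish a per-unit price")
# prints: 9 of 10 publish a per-unit price
```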

Cite this statistic

DAM LLM Research. "Vision APIs publishing per-unit pricing, May 2026." damllm.ai, 2026. https://damllm.ai/statistics/vision-apis-with-public-pricing/

See also