Multimodal Capabilities
All Claude 4 models support vision input. Send images, screenshots, charts, diagrams, and PDF pages for analysis alongside text.
Image Analysis
Send any image (JPEG, PNG, GIF, or WebP) up to 5 MB. Claude can describe the content, transcribe visible text (OCR-style extraction), identify objects, and answer questions about the image.
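The format and size constraints above can be checked before a request is sent. The following is a minimal sketch, assuming the image is read from disk and sent as a base64 content block; the helper name `build_image_block`, the extension table, and the reading of "5 MB" as 5 * 1024 * 1024 bytes are illustrative assumptions, not part of any SDK.

```python
import base64
from pathlib import Path

# Formats listed in this section, keyed by file extension (assumed mapping).
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}
# Assumption: the 5 MB limit is interpreted as 5 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 5 * 1024 * 1024


def build_image_block(image_path: str) -> dict:
    """Validate an image file and return a base64 image content block.

    Hypothetical helper: raises ValueError for an unlisted format or an
    oversized file instead of letting the request fail later.
    """
    path = Path(image_path)
    media_type = MEDIA_TYPES.get(path.suffix.lower())
    if media_type is None:
        raise ValueError(f"Unsupported image format: {path.suffix or '(none)'}")
    raw = path.read_bytes()
    if len(raw) > MAX_IMAGE_BYTES:
        raise ValueError(f"{path.name} is {len(raw)} bytes; limit is {MAX_IMAGE_BYTES}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(raw).decode("ascii"),
        },
    }
```

The returned dictionary can be placed in the `content` list of a user message, next to a text block that carries the question about the image.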
Document Understanding
Pass PDF pages as images for extraction of tables, charts, and structured data. Ideal for invoices, forms, research papers, and technical diagrams.
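As a rough sketch of the page-as-image approach, the function below assembles one user-message `content` list from pages that have already been rendered to PNG bytes. Rendering the PDF is left to an external tool and is not shown; the function name `pages_to_content` and the "Page N:" labels are assumptions made for illustration.

```python
import base64


def pages_to_content(page_pngs: list[bytes], instruction: str) -> list[dict]:
    """Build a message content list from pre-rendered PNG pages.

    Hypothetical helper: each page is preceded by a short text label so the
    model can refer to pages by number, and the instruction comes last.
    """
    content: list[dict] = []
    for number, png in enumerate(page_pngs, start=1):
        content.append({"type": "text", "text": f"Page {number}:"})
        content.append({
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/png",
                "data": base64.b64encode(png).decode("ascii"),
            },
        })
    content.append({"type": "text", "text": instruction})
    return content
```

The result is intended to be passed as the `content` of a single user message, for example with an instruction such as "Extract the line items from this invoice as a table."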
Chart and Data Extraction
Send screenshots of dashboards or charts and ask Claude to extract underlying data, identify trends, or generate equivalent code to reproduce the visualisation.
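One way to make extracted chart data usable downstream is to ask for a fixed JSON shape and parse the reply. This is only a sketch: the prompt wording, the `label`/`value` schema, and the function name `parse_chart_points` are assumptions, and values read off a chart image are estimates that should be spot-checked.

```python
import json

# Assumed prompt: asks for a JSON array so the reply can be parsed mechanically.
EXTRACTION_PROMPT = (
    "Extract the data points from this chart. Reply with only a JSON array "
    'of objects shaped like {"label": "<category>", "value": <number>}.'
)


def parse_chart_points(reply_text: str) -> list[tuple[str, float]]:
    """Parse a model reply into (label, value) pairs.

    Hypothetical helper: tolerates extra prose around the array by slicing
    from the first '[' to the last ']' before decoding.
    """
    start = reply_text.find("[")
    end = reply_text.rfind("]")
    if start == -1 or end < start:
        raise ValueError("No JSON array found in reply")
    rows = json.loads(reply_text[start:end + 1])
    return [(str(row["label"]), float(row["value"])) for row in rows]
```

`EXTRACTION_PROMPT` would be sent as the text block alongside the chart screenshot, and the text of the reply passed to `parse_chart_points`.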
UI Screenshot Analysis
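Share screenshots of UIs for debugging, accessibility review, or automated testing. Claude can identify visible elements, suggest improvements, and propose candidate test selectors; because a screenshot carries no DOM, check those selectors against the real markup.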
ShopMate -- Photo-to-Description
# shopmate/vision/photo_description.py
# Maya uploads a product photo -> ShopMate writes the description
import base64
from pathlib import Path

import anthropic

client = anthropic.Anthropic()

SYSTEM = """You are a ThreadCo copywriter. When given a product photo:
1. Identify the garment type, colours, print/design, and apparent material
2. Write a 2-sentence product description in ThreadCo's voice:
   - Friendly, direct, never corporate
   - Always mention sustainability
   - No exclamation marks
   - Forbidden words: vibrant, perfect, stylish, must-have"""


def describe_from_photo(image_path: str, extra_info: str = "") -> str:
    """Generate a product description directly from a product photo."""
    data = base64.b64encode(Path(image_path).read_bytes()).decode()
    mime = "image/png" if image_path.endswith(".png") else "image/jpeg"

    # Append the optional extra info to the base instruction
    prompt = "Write the ThreadCo product description."
    if extra_info:
        prompt += f" Extra info: {extra_info}"

    resp = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=150,
        system=SYSTEM,
        messages=[{"role": "user", "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": mime, "data": data}},
            {"type": "text", "text": prompt},
        ]}],
    )
    return resp.content[0].text


def batch_describe_photos(photo_dir: str) -> list[dict]:
    """Process a folder of product photos in one go."""
    results = []
    for path in Path(photo_dir).glob("*.jpg"):
        desc = describe_from_photo(str(path))
        results.append({"file": path.name, "description": desc})
        print(f"{path.name}: {desc[:80]}...")
    return results


# Usage: Maya drags her product photos into a folder, runs one command
# results = batch_describe_photos("new_season_photos/")