aradotso/deepseek-ocr
Expert skill for using DeepSeek-OCR, a vision-language model for optical character recognition with context optical compression supporting documents, PDFs, and images.
Expert skill for using DeepSeek-OCR, a vision-language model for optical character recognition with context optical compression supporting documents, PDFs, and images.
npx skills add https://github.com/aradotso/trending-skills/tree/main/skills/deepseek-ocrExpert skill for using DeepSeek-OCR, a vision-language model for optical character recognition with context optical compression supporting documents, PDFs, and images.
This repo contains 20 individual skills — each has its own dedicated page.
Self-evolving AI agent system with 26 tools, three-layer memory, MCP plugins, and 24/7 self-repair in pure Python.
A collection of specialized AI agent personalities for Claude Code, Cursor, Aider, Windsurf, and other AI coding tools — covering engineering, design, marketing, sales, and more.
Headless browser automation CLI for AI agents using native Rust binary with Chrome DevTools Protocol
AI coding agent skill for Antigravity Manager — a Tauri v2 + Rust desktop app and Docker service that manages multiple Google/Anthropic accounts and proxies them as standard OpenAI/Anthropic/Gemini API endpoints with intelligent account rotation.
Guide to deploying and managing OpenClaw-compatible AI agent systems across cloud, bare metal, and hybrid infrastructure.
Fully autonomous research pipeline that turns a topic idea into a complete academic paper with real citations, experiments, and conference-ready LaTeX.
Structured prompts, vault templates, and autonomous research workflows for AI-assisted genealogy using Claude Code.
Search and retrieve Norwegian company data from Brønnøysundregistrene (the Norwegian Business Registry). Access all ~1.2 million registered companies in Norway.
Give AI agents access to your live Chrome session via CDP — interact with open tabs, logged-in accounts, and current page state
A Claude Code plugin that displays a real-time HUD showing context usage, active tools, running agents, and todo progress in your terminal statusline.
Enable multiple Claude Code instances to discover each other and exchange messages in real-time via a local broker daemon and MCP server.
Command Line User Interface for Claude Code — a floating macOS desktop overlay with multi-tab sessions, permission approval UI, voice input, and skills marketplace.
AI-native terminal multiplexer with programmable socket API, full Playwright-equivalent browser automation, and agent team coordination — built for Claude Code and autonomous agent workflows
Build a persistent knowledge graph of your codebase so Claude reads only what matters — up to 49x fewer tokens on coding tasks.
Self-directed iterative research skill for Codex that continuously cycles through modify, verify, retain or discard, and repeat until a measurable goal is reached.
Personal AI assistant framework supporting multiple chat channels (DingTalk, Feishu, QQ, Discord, etc.) with extensible skills, local/cloud deployment, and cron scheduling.
AI-powered green screen keyer that unmixes foreground colors and generates clean linear alpha channels using neural networks
Personal intelligence agent that aggregates 27 OSINT data sources into a self-hosted Jarvis-style dashboard with Telegram/Discord bots, LLM analysis, and real-time alerts.
LLM-powered A/H/US stock intelligent analysis system with multi-source data, real-time news, AI decision dashboards, and multi-channel push notifications via GitHub Actions.
Install and use the Edict (三省六部) multi-agent orchestration system with 12 specialized AI agents, real-time kanban dashboard, and audit trails
Comprehensive GLSL shader techniques for creating stunning visual effects — ray marching, SDF modeling, fluid simulation, particle systems, procedural generation, lighting, post-processing, and more.
Add documentation for a new AI provider — usage docs, env vars, Docker config, image resources.
AI-powered image generation for Salesforce visuals via Nano Banana Pro. TRIGGER when: user asks for PNG/SVG output, UI mockups, wireframes, visual ERDs, or says "generate image" / "create mockup". DO NOT TRIGGER when: text-based Mermaid diagrams (use sf-diagram-mermaid), or non-visual documentation tasks.
Fetch Twitter/X post content including long-form Articles with full images and metadata. Use when Claude needs to retrieve tweet/article content, author info, engagement metrics, and embedded media. Supports individual posts and X Articles (long-form content). Automatically downloads all images to local attachments folder and generates complete Markdown with proper image references. Preferred over Jina for X Articles with images.
Run a structured design critique against the brief and codebase. Checks visual hierarchy, consistency, responsiveness, accessibility, and aesthetic fidelity. Use when user wants a design review, critique, QA pass, polish pass, or mentions "review" after building.
Edit images with Google Nano Banana 2 (image-to-image edit endpoint) on RunComfy. Documents Nano Banana Edit's strengths (preserve subject identity, swap background, localize edits with spatial language, multi-image batch edits up to 20 inputs), the schema, and when to route to GPT Image 2 edit / Flux Kontext / Nano Banana 2 t2i instead. Calls `runcomfy run google/nano-banana-2/edit` through the local RunComfy CLI. Triggers on "nano banana edit", "edit with nano banana", "image edit nano banana", or any explicit ask to edit with this model.