Community · Research and Data Analysis · github.com

kaushikpaul90/ai-driven-incident-management

An AI-powered system for autonomous incident detection, diagnosis, and remediation in high-performance computing (HPC) environments. Leverages agentic workflows, RAG-based knowledge retrieval, and ML on log data to reduce downtime and enhance reliability. Features runbooks for HPC issues like disk failures, memory errors, and network timeouts.

Compatible platforms: Claude Code, Codex CLI, Cursor, Antigravity, Gemini CLI
npx add-skill kaushikpaul90/ai-driven-incident-management

Source: https://github.com/kaushikpaul90/ai-driven-incident-management

Discovered from GitHub repositories pushed in the last 24 hours for agent skills, Claude/Codex/Gemini workflows, MCP tooling, and adjacent AI-agent automation.