
dotnet-slopwatch

Use Slopwatch to detect LLM reward hacking in .NET code changes. Run after every code modification to catch disabled tests, suppressed warnings, empty catch blocks, and other shortcuts that mask real problems.

Compatible platforms: Claude Code, Codex CLI, Cursor
npx add-skill https://github.com/Aaronontheweb/dotnet-skills/tree/main/skills/slopwatch

name: dotnet-slopwatch
description: Use Slopwatch to detect LLM reward hacking in .NET code changes. Run after every code modification to catch disabled tests, suppressed warnings, empty catch blocks, and other shortcuts that mask real problems.
invocable: true

Slopwatch: LLM Anti-Cheat for .NET

When to Use This Skill

Use this skill constantly. Run slopwatch every time an LLM (including Claude) changes any of the following:

  • C# source files (.cs)
  • Project files (.csproj)
  • Props files (Directory.Build.props, Directory.Packages.props)
  • Test files

Each run validates that the changes don't introduce "slop."

What is Slop?

"Slop" refers to shortcuts LLMs take that make tests pass or builds succeed without actually solving the underlying problem. These are reward-hacking behaviors: the LLM optimizes for the appearance of success rather than a real fix.

Common Slop Patterns

| Pattern | Example | Why It's Bad |
|---|---|---|
| Disabled tests | [Fact(Skip="flaky")] | Hides failures instead of fixing them |
| Warning suppression | #pragma warning disable CS8618 | Silences compiler without fixing issue |
| Empty catch blocks | catch (Exception) { } | Swallows errors, hides bugs |
| Arbitrary delays | await Task.Delay(1000); | Masks race conditions, makes tests slow |
| Project-level suppression | <NoWarn>CS1591</NoWarn> | Disables warnings project-wide |
| CPM bypass | Version="1.0.0" inline | Undermines central package management |

Never accept these patterns. If an LLM introduces slop, reject the change and require a proper fix.
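To make the patterns above concrete, here is a minimal Python sketch that flags most of them with regular expressions. It is only an illustration of the idea and not how Slopwatch itself detects slop: the regexes, the `SLOP_PATTERNS` table, and the `find_slop` helper are all hypothetical, and real detection needs syntax-aware analysis that a line-based regex cannot provide.

```python
import re

# Hypothetical, simplified regexes for the patterns in the table above.
# They are illustrative only and are not Slopwatch's actual rules.
SLOP_PATTERNS = {
    "disabled test": re.compile(r"\[(?:Fact|Theory)\s*\(\s*Skip\s*="),
    "warning suppression": re.compile(r"#pragma\s+warning\s+disable"),
    "empty catch block": re.compile(r"catch\s*(?:\([^)]*\))?\s*\{\s*\}"),
    "arbitrary delay": re.compile(r"\b(?:Task\.Delay|Thread\.Sleep)\s*\("),
    "project-level suppression": re.compile(r"<NoWarn>"),
}


def find_slop(source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) pairs for every matching line."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SLOP_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


# A small C# fragment containing three of the shortcuts.
sample = '''
[Fact(Skip="flaky")]
public async Task Order_is_saved()
{
    try { await Save(); } catch (Exception) { }
    await Task.Delay(1000);
}
'''

for line_number, name in find_slop(sample):
    print(f"line {line_number}: {name}")
```

The sample trips the disabled-test, empty-catch, and arbitrary-delay checks. A line-based scan like this misses multi-line catch blocks and cannot tell test code from production code, which is why a dedicated analyzer is the better gate.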


Installation

As a Local Tool (Recommended)

Add to .config/dotnet-tools.json:

{
  "version": 1,
  "isRoot": true,
  "tools": {
    "slopwatch.cmd": {
      "version": "0.2.0",
      "commands": ["slopwatch"],
      "rollForward": false
    }
  }
}

Then restore:

dotnet tool restore

As a Global Tool

dotnet tool install --global Slopwatch.Cmd

First-Time Setup: Establish a Baseline

Before using slopwatch on an existing project, create a baseline of current issues:

# Initialize baseline from existing code
slopwatch init

# This creates .slopwatch/baseline.json
git add .slopwatch/baseline.json
git commit -m "Add slopwatch baseline"

Why baseline? Legacy code may have existing issues. The baseline ensures slopwatch only catches new slop being introduced, not pre-existing technical debt.
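The baseline idea can be sketched in a few lines of Python: anything already recorded is ignored, and only findings absent from the baseline are reported. This is a conceptual sketch under stated assumptions. The `Issue` fields and the `new_issues` helper are hypothetical, and the actual schema of .slopwatch/baseline.json may differ.

```python
from typing import NamedTuple


class Issue(NamedTuple):
    """A single finding. The field names are assumptions for illustration."""
    rule: str     # e.g. "SW001"
    file: str     # path relative to the repo root
    pattern: str  # the offending source text


def new_issues(current: list[Issue], baseline: list[Issue]) -> list[Issue]:
    """Keep only the issues that are not already recorded in the baseline."""
    known = set(baseline)
    return [issue for issue in current if issue not in known]


# One legacy finding is in the baseline; one new finding was just introduced.
baseline = [
    Issue("SW002", "src/Legacy/Parser.cs", "#pragma warning disable CS8618"),
]
current = baseline + [
    Issue("SW001", "tests/MyApp.Tests/OrderTests.cs", '[Fact(Skip="Test is flaky")]'),
]

for issue in new_issues(current, baseline):
    print(f"{issue.rule} {issue.file}: {issue.pattern}")
```

The sketch matches on rule, file, and source text rather than line number, so a legacy finding would not resurface just because unrelated edits shifted it down the file. Whether Slopwatch matches the same way is not stated here.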


Usage During LLM Sessions

After Every Code Change

Run slopwatch after any LLM-generated code modification:

# Analyze for new issues (uses baseline)
slopwatch analyze

# Use strict mode - fail on warnings too
slopwatch analyze --fail-on warning

When Slopwatch Flags an Issue

Do not ignore it. Instead:

  1. Understand why the LLM took the shortcut
  2. Request a proper fix - be specific about what's wrong
  3. Verify the fix doesn't introduce different slop

# Example: LLM disabled a test
❌ SW001 [Error]: Disabled test detected
   File: tests/MyApp.Tests/OrderTests.cs:45
   Pattern: [Fact(Skip="Test is flaky")]

# Correct response: Ask for actual fix
"This test was disabled instead of fixed. Please investigate why
it's flaky and fix the underlying timing/race condition issue."

Updating the Baseline (Rare)

Only update the baseline when slop is truly justified and documented:

# Add current detections to baseline (use sparingly!)
slopwatch analyze --update-baseline

Justification examples:

  • Third-party library forces a pattern (e.g., must suppress specific warning)
  • Intentional delay for rate limiting (not test flakiness)
  • Generated code that can't be modified

Whenever you update the baseline, document the reason in a code comment.


Claude Code Hook Integration

Add slopwatch as a hook to automatically validate every edit. Create or update .claude/settings.json:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit|MultiEdit",
        "hooks": [
          {
            "type": "command",
            "command": "slopwatch analyze -d . --hook",
            "timeout": 60
          }
        ]
      }
    ]
  }
}

The --hook flag:

  • Only analyzes git dirty files (fast, even on large repos)
  • Outputs errors to stderr in readable format
  • Blocks the edit on warnings/errors (exit code 2)
  • Claude sees the error and can fix it immediately
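The "git dirty files" idea in the first bullet can be illustrated with a short Python sketch that narrows the output of `git status --porcelain` to the file types this skill cares about. This is not Slopwatch's implementation: the `dirty_dotnet_files` helper is hypothetical, and it assumes the default (v1) porcelain format with unquoted paths.

```python
def dirty_dotnet_files(porcelain: str) -> list[str]:
    """Extract changed .cs/.csproj/.props paths from `git status --porcelain` output."""
    suffixes = (".cs", ".csproj", ".props")
    paths = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue  # not a "XY path" status line
        status, path = line[:2], line[3:]
        if "D" in status:
            continue  # deleted files have nothing left to analyze
        if " -> " in path:
            # Renames are reported as "old -> new"; keep the new path.
            path = path.split(" -> ", 1)[1]
        if path.endswith(suffixes):
            paths.append(path)
    return paths


# Sample porcelain output, written out so the leading status columns stay intact.
sample_status = "\n".join([
    " M src/App/Order.cs",
    "?? tests/App.Tests/OrderTests.cs",
    " D src/App/Old.cs",
    "R  a.props -> Directory.Build.props",
    " M README.md",
])

for path in dirty_dotnet_files(sample_status):
    print(path)
```

In a real script the input would come from something like `subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout`. Restricting analysis to this short list is what keeps a per-edit hook fast on large repositories.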

CI/CD Integration

Add slopwatch to your CI pipeline as a quality gate:

GitHub Actions

jobs:
  slopwatch:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '9.0.x'

      - name: Install Slopwatch
        run: dotnet tool install --global Slopwatch.Cmd

      - name: Run Slopwatch
        run: slopwatch analyze -d . --fail-on warning

Azure Pipelines

- task: DotNetCoreCLI@2
  displayName: 'Install Slopwatch'
  inputs:
    command: 'custom'
    custom: 'tool'
    arguments: 'install --global Slopwatch.Cmd'

- script: slopwatch analyze -d . --fail-on warning
  displayName: 'Slopwatch Analysis'

Detection Rules

| Rule | Severity | What It Catches |
|---|---|---|
| SW001 | Error | Disabled tests (Skip=, Ignore, #if false) |
| SW002 | Warning | Warning suppression (#pragma warning disable, SuppressMessage) |
| SW003 | Error | Empty catch blocks that swallow exceptions |
| SW004 | Warning | Arbitrary delays in tests (Task.Delay, Thread.Sleep) |
| SW005 | Warning | Project file slop (NoWarn, TreatWarningsAsErrors=false) |
| SW006 | Warning | CPM bypass (VersionOverride, inline Version attributes) |

Configuration

Create .slopwatch/slopwatch.json to customize:

{
  "minSeverity": "warning",
  "rules": {
    "SW001": { "enabled": true, "severity": "error" },
    "SW002": { "enabled": true, "severity": "warning" },
    "SW003": { "enabled": true, "severity": "error" },
    "SW004": { "enabled": true, "severity": "warning" },
    "SW005": { "enabled": true, "severity": "warning" },
    "SW006": { "enabled": true, "severity": "warning" }
  },
  "exclude": [
    "**/Generated/**",
    "**/obj/**",
    "**/bin/**"
  ]
}

Strict Mode (Recommended for LLM Sessions)

For maximum protection during LLM coding sessions, elevate all rules to errors:

{
  "minSeverity": "warning",
  "rules": {
    "SW001": { "enabled": true, "severity": "error" },
    "SW002": { "enabled": true, "severity": "error" },
    "SW003": { "enabled": true, "severity": "error" },
    "SW004": { "enabled": true, "severity": "error" },
    "SW005": { "enabled": true, "severity": "error" },
    "SW006": { "enabled": true, "severity": "error" }
  }
}
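Rather than maintaining the strict configuration by hand, it could be derived from the default one. The Python sketch below raises every enabled rule to "error" and leaves everything else untouched. The `make_strict` helper is hypothetical, it relies only on the keys shown in the configuration examples above, and it does not validate the result against whatever schema Slopwatch actually enforces.

```python
import copy
import json


def make_strict(config: dict) -> dict:
    """Return a copy of the config with every enabled rule raised to "error"."""
    strict = copy.deepcopy(config)  # leave the caller's config unmodified
    for rule in strict.get("rules", {}).values():
        if rule.get("enabled", True):
            rule["severity"] = "error"
    return strict


# The default configuration from the Configuration section.
default_config = {
    "minSeverity": "warning",
    "rules": {
        "SW001": {"enabled": True, "severity": "error"},
        "SW002": {"enabled": True, "severity": "warning"},
        "SW003": {"enabled": True, "severity": "error"},
        "SW004": {"enabled": True, "severity": "warning"},
        "SW005": {"enabled": True, "severity": "warning"},
        "SW006": {"enabled": True, "severity": "warning"},
    },
    "exclude": ["**/Generated/**", "**/obj/**", "**/bin/**"],
}

strict_config = make_strict(default_config)
print(json.dumps(strict_config, indent=2))
```

Unlike the hand-written strict example, this version carries the "exclude" list over, so generated code stays excluded in strict mode. Rules that are explicitly disabled keep their original severity.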

The Philosophy: Zero Tolerance for New Slop

  1. Baseline captures legacy - Existing issues are acknowledged but isolated
  2. New slop is blocked - Any new shortcut fails the build/edit
  3. Exceptions require justification - If you must update baseline, document why
  4. LLMs are not special - The same rules apply to human and AI-generated code

The goal is to prevent the gradual accumulation of technical debt that occurs when LLMs optimize for "make the test pass" rather than "fix the actual problem."


Quick Reference

# First time setup
slopwatch init
git add .slopwatch/baseline.json

# After every LLM code change
slopwatch analyze

# Strict mode (recommended)
slopwatch analyze --fail-on warning

# With stats (performance debugging)
slopwatch analyze --stats

# Update baseline (rare, document why)
slopwatch analyze --update-baseline

# JSON output for tooling
slopwatch analyze --output json

When to Override (Almost Never)

The only valid reasons to update baseline or disable a rule:

| Scenario | Action | Required |
|---|---|---|
| Third-party forces pattern | Update baseline | Code comment explaining why |
| Generated code (not editable) | Add to exclude list | Document in config |
| Intentional rate limiting delay | Update baseline | Code comment, not in test |
| Legacy code cleanup | One-time baseline update | PR description |

Invalid reasons:

  • "The test is flaky" → Fix the flakiness
  • "The warning is annoying" → Fix the code
  • "It works on my machine" → Fix the race condition
  • "We'll fix it later" → Fix it now

Individual skills in this repo

This repo contains 20 individual skills — each has its own dedicated page.

akka-hosting-actor-patterns

Patterns for building entity actors with Akka.Hosting - GenericChildPerEntityParent, message extractors, cluster sharding abstraction, akka-reminders, and ITimeProvider. Supports both local testing and clustered production modes.

akka-net-aspire-configuration

Configure Akka.NET with .NET Aspire for local development and production deployments. Covers actor system setup, clustering, persistence, Akka.Management integration, and Aspire orchestration patterns.

akka-net-best-practices

Critical Akka.NET best practices including EventStream vs DistributedPubSub, supervision strategies, error handling, Props vs DependencyResolver, work distribution patterns, and cluster/local mode abstractions for testability.

akka-net-management

Akka.Management for cluster bootstrapping, service discovery (Kubernetes, Azure, Config), health checks, and dynamic cluster formation without static seed nodes.

akka-net-testing-patterns

Write unit and integration tests for Akka.NET actors using modern Akka.Hosting.TestKit patterns. Covers dependency injection, TestProbes, persistence testing, and actor interaction verification. Includes guidance on when to use traditional TestKit.

api-design

Design stable, compatible public APIs using extend-only design principles. Manage API compatibility, wire compatibility, and versioning for NuGet packages and distributed systems.

aspire-configuration

Configure Aspire AppHost to emit explicit app config via environment variables; keep app code free of Aspire clients and service discovery.

aspire-integration-testing

Write integration tests using .NET Aspire

aspire-service-defaults

Create a shared ServiceDefaults project for Aspire applications. Centralizes OpenTelemetry, health checks, resilience, and service discovery configuration across all services.

crap-analysis

Analyze code coverage and CRAP (Change Risk Anti-Patterns) scores to identify high-risk code. Use OpenCover format with ReportGenerator for Risk Hotspots showing cyclomatic complexity and untested code paths.

csharp-concurrency-patterns

Choosing the right concurrency abstraction in .NET - from async/await for I/O to Channels for producer/consumer to Akka.NET for stateful entity management. Avoid locks and manual synchronization unless absolutely necessary.

database-performance

Database access patterns for performance. Separate read/write models, avoid N+1 queries, use AsNoTracking, apply row limits, and never do application-side joins. Works with EF Core and Dapper.

dependency-injection-patterns

Organize DI registrations using IServiceCollection extension methods. Group related services into composable Add* methods for clean Program.cs and reusable configuration in tests.

dotnet-devcert-trust

Diagnose and fix .NET HTTPS dev certificate trust issues on Linux. Covers the full certificate lifecycle from generation to system CA bundle inclusion, with distro-specific guidance for Ubuntu, Fedora, Arch, and WSL2.

dotnet-local-tools

Managing local .NET tools with dotnet-tools.json for consistent tooling across development environments and CI/CD pipelines.

dotnet-project-structure

Modern .NET project structure including .slnx solution format, Directory.Build.props, central package management, SourceLink, version management with RELEASE_NOTES.md, and SDK pinning with global.json.

efcore-patterns

Entity Framework Core best practices including NoTracking by default, query splitting for navigation collections, migration management, dedicated migration services, and common pitfalls to avoid.

ilspy-decompile

Understand implementation details of .NET code by decompiling assemblies. Use when you want to see how a .NET API works internally, inspect NuGet package source, view framework implementation, or understand compiled .NET binaries.

mailpit-integration

Test email sending locally using Mailpit with .NET Aspire. Captures all outgoing emails without sending them. View rendered HTML, inspect headers, and verify delivery in integration tests.

marketplace-publishing

Workflow for publishing skills and agents to the dotnet-skills Claude Code marketplace. Covers adding new content, updating plugin.json, validation, and release tagging.
