# Skill Creation Best Practices

*Based on the Agent Skills Best Practices Guide*

## Core Principles

### Start from Real Expertise
Effective skills are grounded in real domain knowledge, not generic LLM knowledge. Feed project-specific context into the creation process.
Sources of expertise:
- Hands-on task completion with agent assistance
- Project artifacts (runbooks, API specs, schemas)
- Code review comments and issue trackers
- Version control history and patches
- Real-world failure cases and resolutions
### Refine with Real Execution
Test skills against real tasks and refine based on execution traces. Look for:
- False positives (skill activating when it shouldn't)
- Missed steps or edge cases
- Unproductive paths (agent trying multiple approaches)
- Context overload (too much irrelevant information)
## Context Management

### Spend Context Wisely
Every token in your skill competes for attention in the agent's context window.
Add what the agent lacks:
- Project-specific conventions
- Domain-specific procedures
- Non-obvious edge cases
- Specific tools/APIs to use
Omit what the agent knows:
- Basic concepts (what is a PDF, HTTP, database)
- Generic best practices
- Obvious implementation details
### Design Coherent Units

Skills should encapsulate a coherent unit of work that composes well with others:

- **Too narrow:** Multiple skills needed for one task
- **Too broad:** Hard to activate precisely
- **Just right:** One skill handles one class of problems
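
For example, three ways a hypothetical PDF skill could be scoped (the names and descriptions are illustrative):

```markdown
<!-- Too narrow: three skills needed for one extraction task -->
pdf-header-extraction, pdf-table-extraction, pdf-footnote-extraction

<!-- Too broad: hard to activate precisely -->
document-processing: handles PDFs, spreadsheets, OCR, translation, and summarization

<!-- Just right: one skill, one class of problems -->
pdf-text-extraction: extracts text and tables from PDF files, including scanned pages
```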
### Aim for Moderate Detail

Concise, stepwise guidance with working examples outperforms exhaustive documentation.

## Instruction Patterns

### Gotchas Sections
List environment-specific facts that defy reasonable assumptions:
```markdown
## Gotchas

- The `users` table uses soft deletes (add `WHERE deleted_at IS NULL`)
- User ID is `user_id` in DB, `uid` in auth service, `accountId` in billing API
- `/health` returns 200 even when DB is down (use `/ready` for full health check)
```
### Templates for Output Format
Provide concrete format examples rather than prose descriptions:
```markdown
## Report Structure

# [Analysis Title]

## Executive Summary
[One-paragraph overview]

## Key Findings
- Finding 1 with data
- Finding 2 with data

## Recommendations
1. Actionable recommendation
2. Actionable recommendation
```
### Checklists for Multi-Step Workflows
```markdown
## Deployment Workflow
Progress:
- [ ] Step 1: Run tests
- [ ] Step 2: Build artifacts
- [ ] Step 3: Validate configuration
- [ ] Step 4: Deploy to staging
- [ ] Step 5: Run smoke tests
```
### Validation Loops

```markdown
## Code Review Process

1. Make changes
2. Run linter: `npm run lint`
3. If errors: fix and re-lint
4. Run tests: `npm test`
5. If failures: fix and re-test
6. Only commit when all checks pass
```
### Plan-Validate-Execute Pattern

```markdown
## Database Migration

1. Generate migration plan: `migrate plan`
2. Review plan against schema
3. Validate: `migrate validate`
4. If invalid: revise and re-validate
5. Execute: `migrate apply`
```
## Control Calibration

### Match Specificity to Fragility
Give freedom when multiple approaches are valid:
```markdown
## Code Review

Check for:
- SQL injection vulnerabilities
- Proper authentication
- Race conditions
- PII in error messages
```
Be prescriptive when operations are fragile:
````markdown
## Database Migration

Run exactly:
```bash
python scripts/migrate.py --verify --backup
```
Do not modify this command.
````
### Provide Defaults, Not Menus
````markdown
<!-- Avoid -->
You can use pypdf, pdfplumber, PyMuPDF, or pdf2image...

<!-- Better -->
Use pdfplumber for text extraction:
```python
import pdfplumber
```
For scanned PDFs, use pdf2image with pytesseract.
````
### Favor Procedures Over Declarations
Teach *how to approach* problems, not *what to produce*:
```markdown
<!-- Avoid -->
Join orders to customers on customer_id, filter region='EMEA', sum amount.
<!-- Better -->
1. Read schema from references/schema.yaml
2. Join tables using _id foreign keys
3. Apply user filters as WHERE clauses
4. Aggregate and format as markdown table
```
## Progressive Disclosure

### Keep SKILL.md Under 500 Lines

Move detailed reference material to separate files in `references/`:
```
skill-name/
├── SKILL.md             # Core instructions (<500 lines)
├── references/
│   ├── api-spec.md      # Detailed API documentation
│   ├── error-codes.md   # Error code reference
│   └── schemas/         # Data schemas
└── scripts/
    └── validate.sh      # Validation script
```
### Load Context on Demand
Tell the agent when to load reference files:
```markdown
Read references/api-errors.md if the API returns non-200 status.
```
## Skill Structure Checklist

### Required Elements

- `SKILL.md` with valid YAML frontmatter
- `name` field (lowercase alphanumeric + hyphens, 1-64 chars)
- `description` field (1-1024 chars, specific about what/when)
- Clear instructions in Markdown body

### Recommended Elements

- `license` field
- `metadata` with author/version
- `scripts/` directory for reusable code
- `references/` directory for detailed docs
- `assets/` directory for templates/resources
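
A minimal SKILL.md skeleton combining these elements (the field values are illustrative, and the nested `metadata` keys are an assumed layout):

```markdown
---
name: pdf-text-extraction
description: Extract text and tables from PDF files. Use when the user provides a PDF and asks for its contents.
license: MIT
metadata:
  author: platform-team
  version: "1.0.0"
---

# PDF Text Extraction

Use pdfplumber for text extraction. For scanned PDFs, use pdf2image with pytesseract.
```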
### Validation Checklist
- Skill name matches directory name (underscores → hyphens)
- Description is specific and actionable
- Instructions focus on what agent wouldn't know
- Gotchas section for non-obvious issues
- Examples provided for key workflows
- Progressive disclosure used for large skills
- Validation loops for critical operations
## Iteration Process

1. Create initial draft from real expertise
2. Test against real tasks
3. Review execution traces for inefficiencies
4. Refine instructions based on observations
5. Add gotchas from corrections made
6. Validate with `skill_creator`
7. Repeat until performance is satisfactory

## Common Anti-Patterns

### ❌ Generic Advice

```markdown
Handle errors appropriately and follow best practices.
```
### ✅ Specific Guidance
Check for HTTP 429 errors and implement exponential backoff:
```python
import time
import requests

for attempt in range(5):
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # needed: requests does not raise HTTPError on its own
        break
    except requests.HTTPError as e:
        if e.response.status_code == 429:
            time.sleep(2 ** attempt)  # exponential backoff before retrying
        else:
            raise
```
### ❌ Overly Broad Scope
```markdown
This skill handles all database operations including queries, migrations, backups, and administration.
```
### ✅ Focused Scope

```markdown
This skill handles database query optimization for read-heavy workloads on PostgreSQL 14+.
```
### ❌ Menu of Options

```markdown
You can use any of these libraries: pandas, numpy, polars, or vaex.
```
### ✅ Clear Default
Use polars for DataFrame operations:
```python
import polars as pl
df = pl.read_csv("data.csv")
```

For pandas compatibility, use the `.to_pandas()` method.