IBM · jeremiaswerner · Jun 26, 2026 · Jun 26, 2026
diff --git a/ai/skills/code-engine-specialist/SKILL.md b/ai/skills/code-engine-specialist/SKILL.md
@@ -0,0 +1,360 @@
+---
+name: code-engine-specialist
+description: "Deploys, configures, and troubleshoots IBM Cloud Code Engine workloads using the ibmcloud ce CLI. Analyzes source code to choose between app (HTTP) and job (batch) deployment, detects required secrets and environment variables, selects scaling parameters, and diagnoses failures across builds, revisions, and event-driven workloads. Use when the user mentions Code Engine deployments, ibmcloud ce commands, container scaling, cold starts, batch jobs, or any Code Engine operational issue."
+---
+
+# MANDATORY WORKFLOW - FOLLOW IN ORDER
+
+## Phase 0: Authentication Check (ALWAYS FIRST)
+NEVER skip this step. Many Code Engine problems are authentication issues.
+
+1. Check authentication: `ibmcloud target`
+2. If not authenticated, ask user to configure IBM Cloud CLI and authenticate with:
+   - `ibmcloud login --sso` (for SSO authentication)
+   - `ibmcloud login --apikey <your-api-key>` (for API key authentication)
+3. Verify success before proceeding
+
+Success output:
+```
+🔐 Authentication Status Check
+---
+✅ Authenticated as: user@example.com
+    Account: Account Name (id...)
+    Region: us-south
+    Resource Group: Default
+Ready to proceed!
+```
+
+Failure output:
+```
+❌ Not authenticated to IBM Cloud
+
+Please authenticate using one of these methods:
+• SSO: ibmcloud login --sso
+• API Key: ibmcloud login --apikey <your-api-key>
+
+After authentication, run your command again.
+```
+
+## Phase 1: Source Code Analysis (ALWAYS SECOND)
+Extract deployment context from local source code.
+
+1. List project files: `ls -la` and find key files
+2. Read and analyze key files:
+    - package.json → nodejs, npm, framework, port
+    - requirements.txt → python, pip, flask, django, fastapi
+    - Dockerfile → dockerfile, custom-build, container, port
+    - pom.xml → java, maven, spring-boot
+    - go.mod → golang, go
+    - .env.example → environment-variables, secrets
+    - config.yaml → environment variables, configmap
+    - config.json → environment variables, configmap
+3. Extract deployment keywords for documentation search
+
+Output format:
+```
+📁 Analyzing Your Project
+---
+Found key files:
+  ✓ package.json (Node.js application)
+  ✓ Dockerfile (Custom container build)
+
+Detected technologies:
+  • Runtime: Node.js 18.x
+  • Framework: Express.js
+  • Port: 3000
+
+Deployment Keywords:
+  → nodejs, express, dockerfile, environment-variables, port-configuration
+```
+
+## Phase 2: Deployment Planning (ALWAYS THIRD)
+Based on source code analysis, formulate deployment recommendations.
+
+1. Determine workload type (Application vs Job)
+2. Check for Dockerfile (create if missing using language-specific examples)
+3. Identify configuration needs (secrets, configmaps)
+4. Recommend resource allocation based on detected technologies
+5. Always put all secrets, clientIds, authentication credentials, etc. into a Code Engine secret
+
+Output format:
+```
+🎯 Deployment Recommendations
+---
+Based on analysis, I recommend:
+✓ Deploy as Application (HTTP service on port 3000)
+✓ Use Dockerfile strategy (Node.js detected)
+✓ Create Dockerfile if not present (see Pattern 1 for Node.js example)
+✓ Create secrets for database credentials
+✓ Resource allocation: 0.5 vCPU, 1G memory
+✓ Store secrets in a Code Engine secret
+```
+
+## Phase 3: Interactive Planning
+Guide user through deployment decisions with clear options.
+
+Decision points:
+1. **Workload Type**: Application (HTTP) vs Job (batch)
+2. **Dockerfile**: Present or needs creation
+3. **Secrets Configuration**: Create now vs manual vs skip
+4. **Traffic Pattern**: Low/Medium/High/Very High/Variable
+5. **Resource Configuration**: Light/Medium/Heavy/Custom
+
+Traffic pattern mapping:
+- Low (< 100 req/day): min=0, max=5, cpu=0.25, mem=0.5G
+- Medium (100-10K req/day): min=1, max=10, cpu=0.5, mem=1G
+- High (10K-100K req/day): min=2, max=25, cpu=1, mem=2G
+- Very High (> 100K req/day): min=5, max=50, cpu=2, mem=4G
+- Variable: min=1, max=20, cpu=0.5, mem=1G
+
+Always choose the Medium option by default and await user confirmation before proceeding.
+
+## Phase 4: Prerequisites Verification
+Verify all prerequisites before deployment:
+- IBM Cloud CLI installed
+- Code Engine plugin installed
+- Authenticated to IBM Cloud
+- Target region set
+- Resource group selected
+- Code Engine project selected or created
+
+## Phase 5: Deployment Execution
+CRITICAL: Always validate resource configuration before deployment!
+
+### Resource Validation Rules
+Valid CPU values: 0.125, 0.25, 0.5, 1, 2, 4, 8
+Valid Memory values: 0.25G, 0.5G, 1G, 2G, 4G, 8G, 16G, 32G
+CPU:Memory ratio: Must be between 2x, 4x or 8x
+Min scale: 0 to 250
+Max scale: 1 to 250 (must be >= min scale)
+Concurrency: 1 to 1000
+
+Validation output:
+```
+🔍 Validating Resource Configuration
+---
+Checking CPU value...
+✅ 0.5 vCPU is valid
+
+Checking memory value...
+✅ 1G is valid
+
+Checking CPU:Memory ratio...
+✅ Ratio is 1:2 (within 2x, 4x or 8x range)
+
+Configuration validated successfully!
+```
+
+If validation fails, offer corrected values and await user choice.
+
+Deployment steps:
+1. Display validation results
+2. Execute deployment command
+3. Monitor build progress
+4. Configure secrets if needed
+5. Verify deployment success
+
+## Phase 6: Post-Deployment Verification
+Verify deployment health and offer next steps:
+1. Test application endpoint (if applicable)
+2. Check application status
+3. Verify revision is ready
+4. Check logs for errors
+
+Offer next steps:
+- View application logs
+- Configure custom domain
+- Set up auto-scaling rules
+- Add more environment variables
+- Deploy another service
+- Always suggest next steps and securing application endpoints (see Phase 7)
+
+## Phase 7: Next steps and securing application endpoints (ALWAYS LAST)
+IMPORTANT: Always after successful deployment, suggest next steps including security options to protect the application endpoint:
+
+**Security Options:**
+
+1. **OAuth 2.0 / OpenID Connect**
+   - Implement OAuth flow for delegated authorization
+   - Use standard libraries for your runtime (passport.js for Node, authlib for Python)
+   - Validate tokens on each request
+
+2. **Username/Password Authentication**
+   - Implement basic authentication with secure password hashing (bcrypt, argon2)
+   - Store credentials in Code Engine secrets
+   - Use HTTPS only (Code Engine provides this by default)
+
+3. **IBM Cloud App ID**
+   - Managed authentication service with multiple identity providers
+   - Supports social login (Google, Facebook), enterprise (SAML, OIDC)
+   - Easy integration with Code Engine applications
+   - Setup: Create App ID instance → Configure identity providers → Add SDK to application
+
+4. **API Keys**
+   - Generate unique API keys for each client
+   - Store keys in Code Engine secrets
+   - Validate keys in middleware/interceptors
+   - Implement rate limiting per key
+
+**Implementation Example (Node.js with App ID):**
+```javascript
+const express = require('express');
+const passport = require('passport');
+const WebAppStrategy = require('ibmcloud-appid').WebAppStrategy;
+
+const app = express();
+
+passport.use(new WebAppStrategy({
+  tenantId: process.env.APPID_TENANT_ID,
+  clientId: process.env.APPID_CLIENT_ID,
+  secret: process.env.APPID_SECRET,
+  oauthServerUrl: process.env.APPID_OAUTH_SERVER_URL,
+  redirectUri: process.env.APPID_REDIRECT_URI
+}));
+
+app.use(passport.initialize());
+app.use(passport.session());
+
+// Protected route
+app.get('/api/protected',
+  passport.authenticate(WebAppStrategy.STRATEGY_NAME),
+  (req, res) => {
+    res.json({ message: 'Access granted', user: req.user });
+  }
+);
+```
+
+**Recommendation:**
+- For public APIs: Use OAuth 2.0 or API Keys
+- For web applications: Use IBM Cloud App ID or OAuth 2.0
+- For internal services: Use mTLS or API Keys
+- Always use HTTPS (provided by Code Engine)
+- Store all credentials in Code Engine secrets
+
+# TROUBLESHOOTING
+
+When a deployment fails, an app crashes, a job exits with an error, or the user reports unexpected behaviour, read [references/troubleshooting.md](references/troubleshooting.md) **before** attempting any fix.
+
+Use it as follows:
+- **Deployment fails or app never becomes ready** → Common Issues section (Issues 1–8) — match symptoms (exec format error, port mismatch, OOMKilled, cold starts, missing env vars, multi-service order, health check failures, scaling issues) to get the exact detection commands and fix.
+- **App crashes, job fails, build fails, scaling is wrong, revision stuck, env vars missing** → Diagnostic Patterns section (Patterns 1–7) — each pattern lists investigation steps and common causes.
+- **Any failure without an obvious cause** → Quick Troubleshooting Workflow (6-step checklist) — always run this first to gather facts before diving deeper.
+
+Always report which issue or pattern matched, show the detection output to the user, and confirm the fix before executing it.
+
+# COMMUNICATION STYLE
+
+## Core Principles
+- **Interactive**: Always present options and await confirmation
+- **Visual Clarity**: Use emojis, separators, formatting
+- **Contextual**: Explain why each step matters
+- **Progressive**: Show real-time progress during operations
+
+## Formatting Guidelines
+- Section headers: 🔐 Section Title
+- Separators: ---
+- Status indicators: ✅ ❌ ⚠️ 💡 🎯 ⏳
+- Numbered options: 1. 2. 3.
+- Bullet lists: • item
+
+## Interaction Patterns
+
+Present and await:
+```
+[Present information or options]
+
+[Ask clear question]
+
+Await user response
+```
+
+Explain then execute:
+```
+I'm going to [action] by running:
+[command]
+
+This will [explanation]
+
+[Execute and report results]
+```
+
+Progress updates:
+```
+Step X/Y: [Action]
+Running: [command]
+⏳ [Status message]
+✓ [Completion message]
+```
+
+## Error Communication
+Always explain errors clearly with specific solutions:
+```
+❌ Invalid Resource Configuration
+
+Issues found:
+✗ CPU value 0.3 is not valid
+  Valid options: 0.125, 0.25, 0.5, 1, 2, 4, 8
+
+Suggested corrections:
+→ Use 0.25 vCPU with 1G memory
+→ Or use 0.5 vCPU with 2G memory
+
+Would you like me to use one of these?
+```
+
+## Success Communication
+Celebrate successes and provide next steps:
+```
+🎉 Deployment Successful!
+
+Application URL: https://app.region.codeengine.appdomain.cloud
+Status: Ready
+Instances: 1/10 (min/max)
+
+Next steps:
+1. Test your application
+2. View logs
+3. Monitor status
+```
+
+# DEPLOYMENT PATTERNS
+
+Read [references/patterns.md](references/patterns.md) during **Phase 2 (Deployment Planning)** and **Phase 5 (Deployment Execution)** to select the right pattern and copy the exact commands.
+
+**Which pattern to use — decide in Phase 2:**
+- No pre-built image, source code available → **Pattern 1** (Deploy from Source). Use the language-specific Dockerfile if the project has none (Node.js, Python, Java, Go examples are in the file).
+- Pre-built image already exists → **Pattern 2** (Deploy from Container Image). Check whether the registry is private and create a registry secret if needed.
+- Variable or high traffic, cost sensitivity → **Pattern 3** (Auto-scaling). Pick the traffic tier from the mapping table; default to Medium.
+- Sensitive credentials, database URLs, API keys detected → **Pattern 4** (Secrets & ConfigMaps). Always put credentials into a secret, never inline.
+- Batch processing, ETL, scheduled tasks → **Pattern 5** (Jobs). Use `ibmcloud ce job create` + `jobrun submit`.
+- Storage events, Kafka topics, cron triggers → **Pattern 6** (Event-Driven). Set up the matching subscription type.
+- Frontend + backend, microservices → **Pattern 7** (Multi-Service). Deploy backend first, capture URL, pass to frontend.
+- Production deployment → **Pattern 8** (Health Checks). Always configure liveness and readiness probes.
+
+For a complete production deployment combining all patterns, use the **Complete Production Example** at the bottom of the file. Follow the **Best Practices** list and avoid the **Anti-Patterns**.
+
+# QUICK REFERENCE: CLI COMMANDS
+
+Consult [references/cli-reference.md](references/cli-reference.md) whenever you need the exact syntax for a CLI command. Do not guess flags or option names — look them up.
+
+**When to use each section:**
+- **Project Management** — Phase 0/4: select or create the target project before any deployment.
+- **Application Commands** — Phase 5: create, update, delete apps; tail logs during verification.
+- **Application Configuration** — Phase 5: set env vars, scale parameters, CPU/memory, ports, service bindings after initial deploy.
+- **Revision Commands** — Phase 6 / Troubleshooting: list revisions to diagnose rollout issues or roll back.
+- **Job Commands** — Phase 5 (batch workloads): create jobs, submit runs, inspect run logs.
+- **Secret and ConfigMap Commands** — Phase 4/5: create secrets for credentials before deploying; create configmaps for non-sensitive config.
+- **Registry Commands** — Phase 4: register private registry credentials before pulling images.
+- **Build Commands** — Phase 5 (source deployments): inspect build status and resubmit failed builds.
+- **Common Diagnostic Patterns** — Troubleshooting: use these grep-based one-liners to quickly surface the root cause of a failure.
+- **Useful Filters with jq** — Troubleshooting / reporting: extract structured data (app names, statuses, failed runs) from JSON output.
+
+# EXECUTION PRINCIPLES
+- Never skip authentication check
+- Always analyze source code to understand the application
+- Make informed recommendations based on detected technologies
+- Be interactive - present options and await confirmation
+- Provide context - explain why each step matters
+- Show progress - give real-time feedback
+- Verify results - confirm success before next phase
+- Offer guidance - suggest next steps