MCP and Skills - Giving AI Hands, Feet, and a Playbook

Summary
MCP connects AI to the outside world. Skills make sure it does things right. One solves "can it act," the other solves "will it act correctly."

A Very Specific Frustration

In the previous piece, I covered Prompt and Context Engineering, which boils down to one question: how do you get the right information in front of AI? But seeing information and acting on it are completely different things.

Here's something that happened to me recently. I asked AI to check my calendar for next week, find scheduling conflicts, and email the summary to a colleague. Three simple tasks. It couldn't do any of them. No access to Google Calendar, no ability to check real-time data, no way to send email. It was like a consultant locked inside a soundproof glass booth: brilliant hearing, razor-sharp analysis, but no hands and no eyes.

This isn't a cognitive limitation. It's a physical one. AI has three innate blind spots: its knowledge has a cutoff date, it can't execute actions, and it can't access private data. You can craft the perfect Prompt, set up the most thorough Context, and those three walls are still there.

The two concepts I want to unpack here address different layers of this problem. MCP handles "letting AI reach the outside world." Skills handle "making sure AI does the job right."

MCP: USB for the AI World

Cast your mind back to computers in the mid-1990s. Keyboards used PS/2 ports, printers used parallel ports, mice used serial ports, scanners used SCSI. A different connector for every device. Then USB arrived and one port handled everything.

MCP, the Model Context Protocol, does essentially what USB did: it defines a standard protocol so AI applications can connect to external services in a uniform way.

Without MCP, if you want AI to read your Google Calendar, you write custom integration code: authentication flow, API calls, response parsing. Want to add GitHub? Write another integration. Gmail? Another one. N AI apps talking to M services means N times M integrations. Anyone who's done enterprise systems integration knows this pain.

With MCP, Gmail provides a single MCP Server, and any MCP-compatible AI client plugs right in. N times M collapses to N plus M.
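The arithmetic is easy to check. A minimal sketch, with the client and service counts chosen purely for illustration:

```python
def point_to_point(clients: int, services: int) -> int:
    """Every client needs its own integration with every service."""
    return clients * services


def via_shared_protocol(clients: int, services: int) -> int:
    """Each client implements the protocol once; each service ships one server."""
    return clients + services


if __name__ == "__main__":
    for n, m in [(3, 5), (10, 50)]:
        print(
            f"{n} clients, {m} services: "
            f"{point_to_point(n, m)} custom integrations vs "
            f"{via_shared_protocol(n, m)} protocol implementations"
        )
```

The gap widens as either side grows, which is why the saving matters more for a large ecosystem than for a single team with two or three tools.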

MCP Decoupling: From N×M to N+M

Engineers will recognize the pattern. MCP is essentially an RPC specification where the caller happens to be a language model instead of a program. If you've used ODBC or JDBC, the mental model transfers almost directly: the application doesn't need to know the database's specific dialect because the middleware translates.

The protocol defines three primitives: Tools (actions AI can perform, like "send an email"), Resources (data sources AI can read, like "recent meeting notes"), and Prompts (predefined interaction templates). Communication runs over JSON-RPC 2.0, so the message format will look familiar to anyone with web development experience.
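To make the wire format concrete, here is a rough sketch of what two such requests could look like. The method names `tools/call` and `resources/read` follow the MCP specification as I understand it; the tool name `send_email`, its arguments, and the `notes://recent` URI are made up for illustration, since a real server advertises its own.

```python
import json
from itertools import count

_ids = count(1)  # each request carries an id so the response can be matched to it


def jsonrpc_request(method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": method,
        "params": params,
    })


# Hypothetical tool: a real server lists its tools and their input schemas.
call = jsonrpc_request("tools/call", {
    "name": "send_email",
    "arguments": {
        "to": "colleague@example.com",
        "subject": "Next week's scheduling conflicts",
    },
})

# Hypothetical resource URI.
read = jsonrpc_request("resources/read", {"uri": "notes://recent"})

if __name__ == "__main__":
    print(call)
    print(read)
```

Nothing here is specific to a language model. That is the point of the RPC framing: the model decides which request to make, and the client sends it like any other program would.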

During our roundtable discussion, my colleague Xiaolin offered an analogy I thought was spot on: MCP is like a unified system for restaurant servers. Before, checking kitchen inventory meant calling the back line, looking up a customer's loyalty status meant flipping through a binder, and ordering delivery meant opening a separate app. Now one system, one interface, handles it all.

A Bucket of Cold Water

The technical design of MCP is elegant, but whether a standard actually unifies anything has never been about the technology itself.

USB succeeded because Intel, Microsoft, and Apple all lined up behind it. If Intel had pushed USB while Microsoft pushed FireWire and Apple went its own way, your desk might still be a tangle of different cables today. MCP has momentum. Anthropic led the initiative and other AI companies are following. But the ecosystem politics are far from settled.

There's a security dimension too. MCP gives AI the ability to perform real actions: send emails, modify files, execute code. If a prompt injection attack triggers a Tool through MCP, the consequence isn't just a wrong answer. It could be an email that should never have been sent. Greater capability means a larger attack surface. That's not a problem technology solves on its own.

Skills: A Standard Operating Procedures Manual for AI

AI now has hands and feet. But having hands and feet doesn't mean knowing how to work.

In the operations world, there's a document called a Runbook: when a service goes down, step one is check this, step two is restart that, step three is notify this person. Every step has preconditions and verification checkpoints. A Runbook doesn't teach you ops fundamentals. It assumes you already have the baseline capability and locks down "the correct way to handle this specific situation."

A Skill does the same thing, except the reader is AI instead of a human engineer.

A concrete example. I have a recurring task: publish a Markdown article to a WeChat Official Account. This involves converting Markdown to HTML, uploading images to a CDN and rewriting links, calling the WeChat API to create a draft, and dealing with various formatting compatibility issues. Spelling all that out in a Prompt every single time is both inefficient and error-prone. Details I remembered last time might slip through the cracks this time.

Package those steps into a Skill, and AI automatically loads it whenever it encounters this type of task. The Skill specifies: the trigger condition ("when the user asks to publish a WeChat article"), the steps involved, which tools to use, constraint rules ("images must be uploaded to R2 before links are rewritten"), and representative examples.
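One way to picture those five parts is as a plain data structure. This is only a sketch of the idea, not the format any particular tool uses; the class, its field names, and the tool names are my own illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    trigger: str       # when the Skill should be loaded
    steps: list[str]   # the ordered procedure
    tools: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Flatten the Skill into the text the model would actually read."""
        lines = [f"Skill: {self.name}", f"Trigger: {self.trigger}", "", "Steps:"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        if self.tools:
            lines += ["", "Tools: " + ", ".join(self.tools)]
        if self.constraints:
            lines += ["", "Constraints:"] + [f"- {c}" for c in self.constraints]
        if self.examples:
            lines += ["", "Examples:"] + [f"- {e}" for e in self.examples]
        return "\n".join(lines)


wechat_publish = Skill(
    name="wechat-publish",
    trigger="When the user asks to publish a WeChat article",
    steps=[
        "Convert the Markdown to HTML",
        "Upload images to the CDN and rewrite links",
        "Call the WeChat API to create a draft",
    ],
    tools=["markdown_to_html", "upload_image", "create_draft"],  # hypothetical names
    constraints=["Images must be uploaded to R2 before links are rewritten"],
)

if __name__ == "__main__":
    print(wechat_publish.render())
```

In the end a Skill is still text that lands in the model's context. The structure matters because it forces you to write down the trigger and the constraints instead of leaving them implicit.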

The relationship between Skills and Prompts fits on a single axis:

Prompt vs Skill Positioning Spectrum

A Prompt is a one-time instruction that disappears after the conversation. A Skill is persistent, version-controlled, and shareable. Fix a bug by adding a rule to the Skill, and everyone using it benefits. This mirrors the logic of a software library: a packaged unit of capability that you import when needed.

But a Skill isn't quite a code library either. It's closer to a Runbook plus Configuration. The Runbook defines the procedure. Configuration defines parameters and constraints. A Skill contains both.

Here's the key insight: a Skill doesn't teach AI new abilities. AI already "sort of knows" how to do most things. Ask it to convert Markdown to HTML, it can. Ask it to call an API, it can. What a Skill solves is turning "sort of knows" into "reliably gets it right." The difference lives in the details. WeChat's HTML renderer doesn't support certain CSS properties, image links must use HTTPS, code blocks in the body need special treatment. AI won't know these things spontaneously, but once they're written into a Skill, it gets them right every time.
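Constraints like these are often mechanical enough to check in code, and a Skill can tell AI to run such a check before publishing. As an illustration, here is a small sketch of the HTTPS rule using only the standard library; the function and class names are mine.

```python
from html.parser import HTMLParser


class ImageSourceCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def non_https_images(html: str) -> list[str]:
    """Return image URLs that would violate the HTTPS-only rule."""
    collector = ImageSourceCollector()
    collector.feed(html)
    return [s for s in collector.sources if not s.lower().startswith("https://")]


if __name__ == "__main__":
    sample = (
        '<p><img src="https://cdn.example.com/a.png">'
        '<img src="http://example.com/b.png"/></p>'
    )
    print(non_https_images(sample))
```

The CSS and code-block rules are harder to reduce to a check like this, which is exactly why they belong in the Skill as written constraints.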

A formula: Skill = general capability × specific constraints × repeatable execution.

Getting Your Hands Dirty

Enough concepts. Here's what it looks like in practice.

Installing an MCP Server, using Claude Code as an example. Suppose you want AI to query up-to-date technical documentation, say a framework's latest API. You add a context7 MCP Server in your project's .mcp.json:

{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    }
  }
}

Once configured, when you say "look up how Server Actions work in Next.js 15," the AI fetches the latest docs through the MCP Server rather than relying on potentially stale training data.
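It helps to see how little the config file actually says. For a server configured this way, the client starts the given command as a child process and exchanges messages with it over standard input and output. The sketch below only reads a config and prints the command line that would be started, without launching anything; the server name and package in the sample are placeholders.

```python
import json
import shlex


def launch_commands(config_text: str) -> dict[str, str]:
    """Map each configured server name to the command line that would start it."""
    config = json.loads(config_text)
    commands = {}
    for name, spec in config.get("mcpServers", {}).items():
        argv = [spec["command"], *spec.get("args", [])]
        commands[name] = shlex.join(argv)
    return commands


if __name__ == "__main__":
    # Placeholder entry; substitute the contents of your own .mcp.json.
    sample = """
    {
      "mcpServers": {
        "example": {"command": "npx", "args": ["-y", "example-mcp-server"]}
      }
    }
    """
    for name, command in launch_commands(sample).items():
        print(f"{name}: {command}")
```

Because the entry is just a command to execute, it deserves the same scrutiny as any other script you would run on your machine.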

Writing a simple Skill is equally straightforward. Suppose your team frequently asks AI to do code reviews, and you have specific standards. Create a Skill file:

---
name: team-code-review
trigger: When the user requests a code review
---

## Steps

1. Check for uncommitted changes and list modified files
2. For each changed file, check:
   - No function exceeds 50 lines
   - All public functions have docstrings
   - No use of `any` type (TypeScript projects)
   - No empty catch blocks
3. Summarize issues, sorted by severity
4. Provide specific fix suggestions for each issue

## Constraints

- Only inspect files changed in this diff, not the entire project
- Separate style issues from logic issues
- If nothing is found, explicitly state "No issues found in this review"

That's a minimal Skill. It's short, but it pins down the steps and standards for a code review. Next time anyone on the team triggers it, AI follows the same criteria.
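If you are curious how a client might read such a file, the sketch below splits it into its header fields and its body. It is an assumption-laden toy: it handles only flat `key: value` lines, whereas real frontmatter is YAML and calls for a proper parser, and actual tools may load Skills quite differently.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a Skill file into (frontmatter fields, body)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text.strip()  # no frontmatter at all

    # Find the line that closes the frontmatter block.
    end = None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end = i
            break
    if end is None:
        raise ValueError("frontmatter is missing its closing '---'")

    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


if __name__ == "__main__":
    sample = (
        "---\n"
        "name: team-code-review\n"
        "trigger: When the user requests a code review\n"
        "---\n\n"
        "## Steps\n\n"
        "1. Check for uncommitted changes and list modified files\n"
    )
    meta, body = parse_skill(sample)
    print(meta)
    print(body)
```

The split mirrors how the two halves are used: the short header is enough to decide whether the Skill applies, and the body only needs to be read once it does.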

What I Haven't Figured Out

MCP and Skills solve problems at two different layers, but new questions are already surfacing.

MCP keeps expanding what AI can reach. Skills keep tightening how AI should act. Stack these two together and AI autonomy becomes a real design decision: how wide an action radius do you grant it? Which operations require human confirmation, and which can run automatically? This is no longer a technical choice. It's a question of trust architecture.
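One possible starting point is to sort tools into risk tiers and require confirmation above a threshold. The sketch below is just that, a starting point under my own assumptions: the tiers, the tool names, and the rule that only read-only calls run unattended are illustrative, not a recommendation from any specification.

```python
from enum import Enum
from typing import Callable


class Risk(Enum):
    READ_ONLY = "read_only"        # e.g. reading a calendar
    REVERSIBLE = "reversible"      # e.g. creating a draft
    IRREVERSIBLE = "irreversible"  # e.g. sending an email


# Hypothetical tool names and tiers.
TOOL_RISK = {
    "read_calendar": Risk.READ_ONLY,
    "create_draft": Risk.REVERSIBLE,
    "send_email": Risk.IRREVERSIBLE,
}


def may_run(tool: str, confirm: Callable[[str], bool]) -> bool:
    """Decide whether a tool call proceeds.

    Unknown tools are treated as irreversible, so the default is to ask.
    """
    risk = TOOL_RISK.get(tool, Risk.IRREVERSIBLE)
    if risk is Risk.READ_ONLY:
        return True
    return confirm(f"Allow the AI to run '{tool}' ({risk.value})?")


if __name__ == "__main__":
    deny_all = lambda question: False
    for tool in ["read_calendar", "create_draft", "send_email", "unknown_tool"]:
        print(tool, "->", may_run(tool, deny_all))
```

Where the threshold sits is the trust decision. The code only makes that decision explicit and keeps it in one place.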

I'm also uncertain about Skill granularity. Write it too coarsely and AI still stumbles on details. Write it too finely and maintenance costs climb, and you might actually be constraining AI from doing something better than your script anticipated. "Specific beats generic" feels like the right principle from my experience so far, but I haven't found the optimum.

In the first piece I left an open question: can AI decide for itself what information it needs to see? I can extend that now. When AI can independently fetch information through MCP and follow standardized procedures via Skills, does it already have the prerequisites for autonomous task completion?

That's exactly what the next piece will cover: Agents. An AI that can perceive, decide, and act is a fundamentally different creature from one that only answers questions. But before we get to Agents, MCP and Skills need to be understood first. They are an Agent's infrastructure. An Agent without hands is empty talk. An Agent without a playbook is a liability.

Pick the most repetitive AI task in your workflow and write a Skill for it. It doesn't have to be perfect. Just get it running. That's the fastest way into this whole system.
