A Feasible Path for AI Implementation in Enterprises
Yesterday I thought I had a genius idea.
Wrap an Agent inside Docker and turn its CLI interactions into Web APIs. Dynamically render those y/N prompts and multi-select menus in a web UI, then pass the user's choices back to the Agent in the container. The benefits seemed obvious: an isolated environment solves permission issues, custom base images can ship with built-in skills, containers scale horizontally, and docker commit gives instant snapshots.
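To make that concrete, here's a minimal sketch of the wrapper I had in mind (not OpenHands code; the agent command, endpoint path, and response shape are all hypothetical): one HTTP endpoint that runs a CLI agent inside a pseudo-terminal and returns whatever it printed.

```python
# Minimal sketch of "CLI agent behind a Web API" -- illustration only.
# AGENT_CMD and the /run endpoint are hypothetical placeholders.
import os
import pty
import subprocess

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
AGENT_CMD = ["my-agent", "--task"]  # hypothetical CLI agent


class RunRequest(BaseModel):
    prompt: str


@app.post("/run")
def run(req: RunRequest):
    # Allocate a pseudo-terminal so the CLI behaves as if a human were attached.
    master, slave = pty.openpty()
    proc = subprocess.Popen(
        AGENT_CMD + [req.prompt],
        stdin=slave, stdout=slave, stderr=slave,
    )
    os.close(slave)

    # Capture the raw terminal output (ANSI escape codes and all).
    chunks = []
    while True:
        try:
            data = os.read(master, 4096)
        except OSError:  # the PTY closes when the process exits
            break
        if not data:
            break
        chunks.append(data)
    proc.wait()
    os.close(master)

    raw = b"".join(chunks).decode(errors="replace")
    return {"exit_code": proc.returncode, "raw_output": raw}
```

The hard part, of course, is everything this sketch skips: turning that raw_output, full of escape sequences and interactive prompts, into something structured.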
Then Claude threw cold water on it: translating raw ANSI terminal output into a clean structured API would break constantly, and worse, an open-source implementation already exists.
I clicked the link. OpenHands. 65K stars.
What Got Validated Wasn't Just Technical Feasibility
Docker sandboxing, WebSocket real-time interaction, PTY handling, configurable runtime, enterprise-grade access control, Kubernetes deployment. Every feature I'd imagined was there, polished to production quality.
But what actually excited me wasn't "I reinvented the wheel." It was seeing a 65K-star project arrive at the same architectural choices. That validation isn't at the technical level; technically, Docker + Agent obviously works. What really got validated is the direction: it solves the core problem of enterprise AI deployment.
What's the current state of AI tools in enterprises? A few technically strong engineers install a bunch of CLI tools and use them brilliantly. But that's it. No way to scale, no unified management, and definitely no way to productize for non-technical teams. These tools exist only as productivity multipliers for a few individuals, not as organizational infrastructure.
What did OpenHands get right? Instead of directly manipulating the host OS, it runs a lightweight HTTP/WebSocket server (action_server or runtime_client) inside the container at startup. The host sends commands to this server via API.
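The pattern, stripped down to a sketch (not OpenHands' actual protocol; the endpoint name, payload shape, and port are assumptions for illustration): the host never touches the container's shell directly, it posts an action to the server inside the container and reads back a structured observation.

```python
# Host-side sketch: talk to a small action server running inside the
# sandbox container. Endpoint and payload shape are illustrative only.
import requests

RUNTIME_URL = "http://localhost:30369"  # whatever port the container publishes


def execute(command: str, timeout: int = 60) -> dict:
    resp = requests.post(
        f"{RUNTIME_URL}/execute_action",
        json={"action": "run", "args": {"command": command}},
        timeout=timeout,
    )
    resp.raise_for_status()
    return resp.json()  # e.g. {"exit_code": 0, "output": "..."}


if __name__ == "__main__":
    print(execute("ls /workspace"))
```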
This design looks simple, but it solves a fundamental problem: it turns uncontrollable CLI interactions into orchestratable API calls. On top of that you can build permission systems, audit logs, resource quotas, and multi-tenant isolation: all the things enterprises actually need.
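Once every agent action is an ordinary API call, those controls become plain middleware. A toy sketch, reusing the hypothetical execute() helper above; the policy rules and the log format are invented for illustration.

```python
# Toy policy check + audit log wrapped around the sandbox API call.
import json
import time

BLOCKED_PREFIXES = ("rm -rf /", "curl ", "wget ")  # hypothetical policy


def guarded_execute(tenant: str, command: str) -> dict:
    if command.startswith(BLOCKED_PREFIXES):
        raise PermissionError(f"policy blocks this command for {tenant}: {command}")

    started = time.time()
    obs = execute(command)  # the host-side helper from the previous sketch

    entry = {
        "tenant": tenant,
        "command": command,
        "exit_code": obs.get("exit_code"),
        "duration_s": round(time.time() - started, 3),
    }
    with open("audit.log", "a") as f:  # a real system would ship this to a log store
        f.write(json.dumps(entry) + "\n")
    return obs
```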
But Why Isn't Every Company Using It?
Someone will ask: if OpenHands is so good, why isn't our company using it yet?
Because most companies aren't ready.
Enterprise AI deployment isn't about installing a tool. You need to ask yourself several questions first:
Do your people know what they're doing? Not whether they can use AI, but whether they understand AI's position in the workflow. Do they know what to delegate to AI and what not to?
Do you have the capability to manage at scale? Are you running a single-machine application or a cluster deployment? Do you have fully automated CI/CD? Is your data collection pipeline complete? These questions sound unrelated to AI, but without this foundational infrastructure, AI Agents remain toys.
What's your workflow paradigm? AI won't define workflows for you — it only amplifies the efficiency or chaos of existing processes. If your team collaboration is already a mess, adding AI will just accelerate the mess.
OpenHands has 65K stars, but the companies that can actually use it probably account for less than 1%. Not because of technical barriers, but because of a lack of organizational readiness.
This Architecture Isn't Universal Either
Docker + Agent has an annoying aspect: getting code and artifacts back out of the container can be cumbersome. If your workflow requires frequent switching between the inside and outside of the container, that friction might offset the benefits of isolation.
There's a more subtle problem: excessive isolation might prevent the Agent from "seeing" the context it needs. Your local development environment configuration, IDE linter rules, team code style conventions — if these can't be passed to the Agent inside the container, the code it generates will clash with your project.
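One way to narrow that gap, sketched with the Docker SDK for Python (the image name and paths are hypothetical): bind-mount the project together with the configuration the agent needs to see, and publish the in-container action server's port.

```python
# Start the sandbox with the project and team conventions mounted in.
# Image name, paths, and port mapping are hypothetical.
import docker

client = docker.from_env()
container = client.containers.run(
    "my-agent-sandbox:latest",
    detach=True,
    volumes={
        "/home/me/project": {"bind": "/workspace", "mode": "rw"},
        "/home/me/.config/team-style": {"bind": "/etc/team-style", "mode": "ro"},
    },
    ports={"8000/tcp": 30369},  # the action server inside the container
)
print(container.short_id)
```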
So the conclusion isn't "all AI should run inside Docker." This architecture only makes sense when you need scalable deployment, multi-tenant isolation, and auditable execution environments. For personal use, running locally might be more convenient.
Where the Real Problem Lies
I tested OpenHands locally, and it's genuinely impressive. But after testing, one thing kept bothering me:
Why do 90% of enterprise AI deployment discussions focus on prompt engineering and model selection, with less than 10% on infrastructure?
This is strange. If you ask a company how they do software development, nobody would talk only about "what programming language we use" without mentioning CI/CD, monitoring, and deployment processes. But when it comes to AI, everyone focuses only on GPT-4 vs Claude, whether to use RAG, and how to write prompts.
The infrastructure layer determines whether you can transform AI from a few people's toy into organizational capability. The Docker + Agent architecture matters not because it's technically advanced, but because it treats AI as a productivity tool requiring engineering management, not as a magic black box.
You can version-control the Agent's runtime environment (via Docker images), audit its every operation (via API logs), control its resource consumption (via container quotas), and horizontally scale to hundreds or thousands of concurrent tasks (via Kubernetes). This is what "AI deployment" should look like.
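The same primitives, expressed concretely as a sketch (hypothetical image tag and limits, using the Docker SDK for Python): the runtime is a versioned image, the quotas are container limits, and each task is just another container, so fanning out on one host is a loop, and in production the same pattern becomes a Kubernetes Job per task.

```python
# Each task gets its own versioned, resource-limited container.
# Registry path, tag, and limits are hypothetical.
import docker

client = docker.from_env()


def launch_task(task_id: str) -> str:
    container = client.containers.run(
        "registry.internal/agent-runtime:2024.06",  # version-controlled runtime image
        detach=True,
        name=f"agent-task-{task_id}",
        mem_limit="2g",                 # resource quota
        nano_cpus=1_000_000_000,        # 1 CPU
        labels={"task": task_id},       # makes auditing and cleanup easy
    )
    return container.id


ids = [launch_task(str(i)) for i in range(10)]  # ten isolated tasks on one host
```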
But this requires your organization to already have modern software engineering foundations. If you haven't even got CI/CD running smoothly, don't think about deploying AI yet.
How I Think About This Now
65K stars validated not how smart my idea was, but that this direction is genuinely right. But it also shows that most people are still struggling at an earlier stage — they haven't even figured out why they need this infrastructure.
If you're thinking about deploying AI in your company, don't start with prompts. First check whether your infrastructure layer is ready. That doesn't mean you need to perfect all the infrastructure first, but you need to know what you're missing and where that gap will block you.
Sometimes, having your technical intuition validated is a good thing. At least it means you're looking in the right direction.