Building dotbox-mcp: My Journey Creating a .NET Sandbox MCP Server

A developer’s experience building a Model Context Protocol server using Claude Skills

TL;DR

I built dotbox-mcp, an MCP server that lets Claude Desktop execute .NET code in isolated Docker containers. This post shares my journey from idea to v1.0, the tools that helped me succeed, and what I learned about building MCP servers.

Why dotbox-mcp?

Anthropic’s official API includes sandbox environments for Python and JavaScript - you can ask Claude to run code and get real results. For .NET developers, nothing equivalent existed. If you ask Claude Desktop to execute .NET code in such a sandbox, it attempts to install the dotnet SDK, but that doesn’t work.

As an AI architect, I also wanted to understand how these sandboxing systems actually work, because they play a role in building powerful agents that can generate code on demand. For example, there is the sandbox system in the Anthropic API, Azure recently launched Container Apps Dynamic Sessions, and there are lots of other systems like this. So I wanted to learn by building something similar for .NET, in a way that can be used in an LLM context.

So there were two goals for this project:

1. Enable the workflow: Experiment with .NET packages, test code snippets, and demonstrate concepts with working examples - inside a chat environment (Claude Desktop). There are of course alternatives, such as coding agents like Claude Code or Cursor, but I wanted a lightweight chat-based workflow.

2. Learn sandbox architecture: Understand security models, isolation strategies, resource management, and API design for LLM-driven execution environments. Docker-based sandboxing comes first; cloud-based approaches like Azure Container Apps may follow later.

dotbox-mcp manages Docker containers for .NET workloads. Ask Claude to execute C# code, and it spins up an isolated Alpine Linux container with the appropriate .NET SDK, builds & runs your code, captures output, and cleans up. It can also manage the lifecycle of the container so you can run Web Apps or APIs and access them locally.

What dotbox-mcp is (and isn’t)

It’s important to set expectations clearly. dotbox-mcp is designed for rapid experimentation and prototyping, not as a comprehensive development environment.

What it does well:

Testing .NET snippets without creating local projects
Quick validation of API designs or small webapps (spin up an endpoint, get a URL, test it with curl)
Trying out .NET version-specific features (8, 9, 10)
Experimenting with NuGet packages in complete isolation

What it’s not built for:

Full codebase navigation and multi-file editing (use Claude Code or Cursor for that)
Production deployments or CI/CD pipelines
Long-running applications or services
Replacing your local development environment

Think of dotbox-mcp as a scratchpad with superpowers - you get the immediacy of a REPL combined with the isolation of containers and the convenience of staying in your Claude conversation.

Example: Building a minimal API with dotbox-mcp

Starting Point: Learning from the MCP Builder Skill

So for me this was also an experiment in building an MCP server. Fortunately, Anthropic recently released the mcp-builder skill - essentially a guided tutorial and architectural consultant rolled into one - that came in quite handy.

The skill works like having an experienced architect sitting next to you, answering questions about design patterns, framework choices, and best practices when building MCP servers. Early on, it backed my choice to build the server with FastMCP (a lightweight framework for building MCP servers).

Architecture Guidance: Agent-Centric Design

The most valuable insight from the mcp-builder skill was the concept of agent-centric tools. Instead of exposing lots of low-level primitives (create container, copy file, run command, get output, stop container) that the LLM then has to orchestrate, you build tools that handle complete workflows in a single call.

For example, the dotbox-mcp tool dotnet_execute_snippet(code) handles the whole lifecycle:

Create container with appropriate .NET SDK
Write code to workspace
Build and execute
Capture output (stdout, stderr, exit code)
Format results
Clean up container

The complexity moves from Claude’s coordination logic into my server’s implementation - where I can test it, optimize it, and handle edge cases properly.
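To make the idea concrete, here is a minimal sketch of one call that covers the whole lifecycle. It is not the actual dotbox-mcp implementation: `FakeSandbox`, its methods, and `execute_snippet` are hypothetical names, and the fake backend only records steps so the example runs without Docker.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSandbox:
    """Hypothetical stand-in for a container backend; it only records each step."""
    steps: list = field(default_factory=list)

    def create(self, sdk_version: str) -> str:
        self.steps.append(f"create:{sdk_version}")
        return "sandbox-1"

    def write_file(self, container: str, path: str, content: str) -> None:
        self.steps.append(f"write:{path}")

    def build_and_run(self, container: str) -> tuple:
        self.steps.append("build_and_run")
        return ("Hello from the sandbox\n", "", 0)  # stdout, stderr, exit code

    def remove(self, container: str) -> None:
        self.steps.append("remove")


def execute_snippet(sandbox: FakeSandbox, code: str, sdk_version: str = "8") -> str:
    """One agent-facing call: create, write, build and run, format, clean up."""
    container = sandbox.create(sdk_version)
    try:
        sandbox.write_file(container, "Program.cs", code)
        stdout, stderr, exit_code = sandbox.build_and_run(container)
        status = "succeeded" if exit_code == 0 else f"failed (exit code {exit_code})"
        parts = [f"Execution {status}", "", "Output:", stdout.rstrip()]
        if stderr.strip():
            parts += ["", "Errors:", stderr.rstrip()]
        return "\n".join(parts)
    finally:
        # Cleanup runs whether the build succeeded, failed, or raised.
        sandbox.remove(container)
```

The model sees a single tool and a single formatted result; the ordering of steps and the cleanup guarantee live in ordinary code that can be unit tested.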

Other Design Patterns That Made a Difference

The skill emphasized several other patterns for an MCP architecture:

1. Context Optimization: MCP responses compete for precious context window space. The skill recommended defaulting to concise output (first 50 lines of build output, summary of results) with an optional detail_level parameter for full details. This keeps common cases snappy while allowing deep dives when needed.

2. Actionable Error Messages: Don’t just report “CS0246: type or namespace not found” - parse it and say “Add NuGet package or add using directive.” Claude can act on specific suggestions; generic compiler errors are harder to work with.

3. Human-Readable Identifiers: Container IDs like 3a7f92bc are meaningless in conversation. Naming them dotnet8-webapi-abc123 makes logs and error messages immediately understandable. This seems trivial but pays dividends when debugging.
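The three patterns above could look roughly like this in Python. This is a sketch under assumptions, not code from dotbox-mcp: the function names, the `detail_level` values ("concise" and "full"), the hint table, and the naming scheme are all made up for illustration.

```python
import re
import secrets

# Hypothetical hint table covering two well-known C# compiler error codes.
HINTS = {
    "CS0246": "Add the missing NuGet package or a matching using directive.",
    "CS1002": "A semicolon is missing; check the end of the reported line.",
}
ERROR_PATTERN = re.compile(r"error (CS\d{4}): (.+)")


def summarize_output(text, detail_level="concise", max_lines=50):
    """Pattern 1: return at most max_lines lines unless full detail is requested."""
    lines = text.splitlines()
    if detail_level == "full" or len(lines) <= max_lines:
        return text
    omitted = len(lines) - max_lines
    note = f"... ({omitted} more lines omitted; pass detail_level='full' to see them)"
    return "\n".join(lines[:max_lines] + [note])


def explain_errors(build_output):
    """Pattern 2: pair each compiler error with a suggestion the model can act on."""
    results = []
    for line in build_output.splitlines():
        match = ERROR_PATTERN.search(line)
        if not match:
            continue
        code, message = match.groups()
        hint = HINTS.get(code, "No specific suggestion; inspect the compiler message.")
        results.append(f"{code}: {message.strip()} -> Suggestion: {hint}")
    return results


def container_name(dotnet_version, project):
    """Pattern 3: build a readable name such as dotnet8-webapi-abc123."""
    slug = re.sub(r"[^a-z0-9]+", "-", project.lower()).strip("-") or "snippet"
    major = dotnet_version.split(".")[0]
    suffix = secrets.token_hex(3)  # 6 random hex characters to avoid collisions
    return f"dotnet{major}-{slug}-{suffix}"
```

Keeping each pattern in its own small, pure function makes it easy to test without a running container.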

AI-Assisted Development Journey: Using the Right AI Tool for Each Job

One of the fun aspects of building dotbox-mcp was experiencing how different Claude interfaces excel at different development tasks. I ended up using four different tools throughout the project, each at the stage where it made the most sense.

Claude Code: The TDD Engine

The bulk of development happened in Claude Code (the CLI agent), and for good reason. Building an MCP server requires systematic thinking: defining data models, implementing tools, writing tests, handling edge cases, managing git workflow. This is where Claude Code shines.

The TDD cycle was particularly smooth. Claude Code wrote failing tests that described the desired behavior, then implemented just enough code to make them pass. Then we’d refactor together, improving structure while keeping tests green. It was not always perfect: a human in the loop is still needed to regularly nudge the code assistant back on track.

Claude Desktop or ChatGPT: Design and Exploration

For larger features that needed more exploration and design discussion, I switched to Claude Desktop or ChatGPT to discuss, iterate on, and research architectural or technical choices. I then summarized the documented decision in a clear prompt and pasted it back into Claude Code.

For example, Windows Docker integration required exploring several approaches: WSL2 socket mounting (complex), Docker Desktop TCP port 2375 (simple but needs security notice), or CLI wrappers (compatibility issues). After discussing tradeoffs with Claude, I settled on the TCP approach and used that design decision to guide the PowerShell installer implementation in Claude Code.

Claude Code on the Web (Tried for a larger feature)

At one point, I used Claude Code on the web to implement a larger feature in the background (switching to a dual JSON/Markdown response format), based on a detailed plan made together with Claude Code, while I was doing something else. It implemented the feature only partly, and I had to seriously refactor it and add tests and features, so this wasn’t too helpful. My guess is that the Claude Code on the web workflow is still best suited to small features.

The Workflow

So basically my workflow is:

Research and design → ChatGPT or Claude Desktop (conversational, exploratory)
Implementation → Claude Code (systematic, test-driven)
Documentation and polish → Claude Desktop (re-writing, refinement)
Bug investigation → Claude Code (reading logs, fixing issues)

What I Learned About Sandbox Architecture

Building dotbox-mcp taught me several lessons about designing execution sandboxes for LLM workflows:

Security requires layers. Docker isolation is just the start - add resource limits, filesystem restrictions, network controls, and timeouts. Each layer catches what others might miss.
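As an illustration of layering, the sketch below assembles a `docker run` command line with several independent restrictions. The flags are standard Docker CLI options, but the helper name `sandbox_run_args` and the specific limit values are assumptions, not the settings dotbox-mcp ships with.

```python
def sandbox_run_args(image, command, *, memory="512m", cpus="1.0", allow_network=False):
    """Build a docker run argument list with layered restrictions (illustrative values)."""
    args = [
        "docker", "run", "--rm",   # remove the container when it exits
        "--memory", memory,        # cap RAM
        "--cpus", cpus,            # cap CPU share
        "--pids-limit", "128",     # limit process count (guards against fork bombs)
        "--read-only",             # root filesystem is read-only
        "--tmpfs", "/tmp",         # scratch space; a real build also needs a writable workspace
        "--cap-drop", "ALL",       # drop Linux capabilities
    ]
    if not allow_network:
        args += ["--network", "none"]  # no network unless explicitly requested
    return args + [image] + list(command)
```

A timeout is not a `docker run` flag, so it would have to be enforced by whatever launches this command, which is one more layer on top of the ones shown here.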

Cleanup is harder than creation. Every failure path needs cleanup logic. I implemented multiple safety nets: explicit cleanup, timeout-based garbage collection, and session tracking.
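A rough sketch of how session tracking and timeout-based garbage collection can work together follows. `ContainerRegistry` and its methods are hypothetical names; the `remove` callback stands in for whatever actually deletes a container, and the clock is injectable so the logic can be tested without waiting.

```python
import time


class ContainerRegistry:
    """Tracks when each container was last used and removes idle ones."""

    def __init__(self, idle_timeout, remove, clock=time.monotonic):
        self._idle_timeout = idle_timeout  # seconds of inactivity before collection
        self._remove = remove              # callback that deletes a container by name
        self._clock = clock
        self._last_used = {}

    def touch(self, name):
        """Register a container or mark it as recently used."""
        self._last_used[name] = self._clock()

    def release(self, name):
        """Explicit cleanup; safe to call more than once for the same name."""
        if self._last_used.pop(name, None) is not None:
            self._remove(name)

    def collect_idle(self):
        """Safety net: remove every container idle for at least idle_timeout."""
        now = self._clock()
        stale = [n for n, t in self._last_used.items() if now - t >= self._idle_timeout]
        for name in stale:
            self.release(name)
        return stale
```

Explicit `release` covers the normal path, while a periodic `collect_idle` call catches containers left behind by crashed or abandoned sessions.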

Startup time matters more than execution time. Containers take 3-5 seconds to create; C# runs in milliseconds. This is why agent-centric tools that handle complete workflows outperform chatty APIs.

What’s Next for dotbox-mcp?

Version 1.0.0 just shipped, marking dotbox-mcp as experimentation-ready. All core tools work reliably, the API is stable, and the Windows + macOS installers handle the complexity of setup. But there’s always room for improvement.

Near-Term Roadmap

Container Pooling (Performance): The biggest performance win would come from pre-warmed container pools. Instead of creating a container on-demand (3-5 second startup), maintain 1-2 ready containers per .NET version. Requests could complete in under 500ms instead of 3+ seconds. The challenge is managing container state safely - cleaning workspace, resetting environment variables, handling failures gracefully.
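One possible shape for such a pool is sketched below. Since this feature is only on the roadmap, everything here is an assumption: `WarmPool`, the `create` and `reset` callbacks, and the policy of discarding a container whose reset fails are illustrative choices, with one pool instance assumed per .NET version.

```python
from collections import deque


class WarmPool:
    """Keeps a few pre-created containers ready; falls back to cold start when empty."""

    def __init__(self, create, reset, size=2):
        self._create = create  # callback that starts a fresh container
        self._reset = reset    # callback that cleans workspace and environment
        self._size = size
        self._ready = deque(create() for _ in range(size))  # pre-warm up front

    def acquire(self):
        if self._ready:
            return self._ready.popleft()  # fast path: container already running
        return self._create()             # pool exhausted: pay the cold-start cost

    def give_back(self, container):
        """Return True if the container was reset and pooled, False if it should be removed."""
        if len(self._ready) >= self._size:
            return False
        try:
            self._reset(container)
        except Exception:
            return False  # never reuse a container that could not be cleaned
        self._ready.append(container)
        return True
```

When `give_back` returns False, the caller is expected to remove the container, which keeps the state-safety concern in one place.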

Long-Term Ideas

MCP Protocol Extensions (Task Management): One limitation I noticed: MCP has no concept of multi-step workflows or progress tracking. When executing a complex task (create project, add packages, build, run tests, host API), Claude doesn’t have a good way to communicate progress to the user. Extensions for task tracking, progress indicators, and step-by-step guidance would improve the experience significantly.

Try It Yourself

If you work with .NET and use Claude Desktop, I’d encourage you to give dotbox-mcp a try. Installation is easy, and the automatic installers handle the complexity.

Prerequisites:

Windows or macOS
Claude Desktop
Docker Desktop (running)

Installation (macOS):

curl -fsSL https://raw.githubusercontent.com/domibies/dotbox-mcp/main/scripts/install-claude-desktop.sh | bash

Installation (Windows):

irm https://raw.githubusercontent.com/domibies/dotbox-mcp/main/scripts/install-claude-desktop.ps1 | iex

After installation, restart Claude Desktop and try: “Execute this C# code: Console.WriteLine(DateTime.Now);”

“Create a minimal API with a single endpoint that returns the current time in JSON format. Host it in the background so I can test it.”

Source code and full documentation: https://github.com/domibies/dotbox-mcp

Takeaways for Aspiring (MCP) Builders

If you’re thinking about building your own MCP server, here’s what I’d recommend based on this experience:

1. Start with FastMCP (Python) for prototyping: FastMCP gets out of your way and lets you focus on tool logic. You can always port to the official MCP SDK (TypeScript) later if needed, but for MVPs, FastMCP’s simplicity is unbeatable.

2. Use Claude’s mcp-builder skill: Treat it like a senior architect who’s built dozens of MCP servers. Ask it about framework choices, design patterns, error handling strategies. The guidance it provides will save you from common pitfalls.

3. Design agent-centric tools, not primitives: Build tools that handle complete workflows in a single call. Push complexity into your server implementation where it can be tested and optimized, not into Claude’s coordination logic.

4. Default to Markdown output: It’s more context-efficient than JSON and easier for humans to read. JSON has its place, but Markdown should be your default response format.

5. Embrace test-driven development: Write tests first, make them pass, refactor. For infrastructure projects (anything involving Docker, filesystems, networks), this discipline is essential. It catches subtle bugs that would be nightmares to debug in production. Don’t forget full e2e tests; for me they were essential for verifying complete workflows.

6. Optimize for context efficiency: MCP responses compete for precious context window space. Default to concise output (summaries, first N lines, key metrics) with an option for full details. Users can request more if needed.

7. Make errors actionable: Don’t just report errors - suggest fixes. Parse error messages, add context, propose solutions. Claude should be able to act on your error responses without asking clarifying questions.

8. Security by default: Don’t rely on users to configure safety settings. Bake resource limits, timeouts, and cleanup into your architecture so the safe path is the default path.

9. Use release-please for automation: Conventional commits + automated releases will save hours over the project lifetime. No manual CHANGELOG updates, no version bump errors, no inconsistent release notes.

10. Use the right AI tool for each task: Claude Code for systematic implementation and TDD. Claude Desktop / ChatGPT for design discussions and documentation. Each tool has its sweet spot.

Connect

I’d love to hear about your experiences with dotbox-mcp or MCP server development in general.

GitHub: domibies
Project issues: dotbox-mcp/issues

If you do something cool with dotbox-mcp, please share - I’d enjoy seeing what people use it for.

Tags: #MCP #ModelContextProtocol #Claude #DotNet #CSharp #AI #DevTools #Docker #Python