Building an AI-Powered CLI for File Operations

I come to this topic from frontend and product-minded engineering, so I usually look at systems through developer experience first. Over time that pulled me toward AI tools, especially the kind that can do actual work rather than only generate text.

This research project sits exactly in that space.

The starting point was a question: how could an AI agent automate file management across both local devices and remote cloud systems from the command line? I wanted something more practical than a demo: a system that reduces context switching, removes repetitive work, and makes distributed resources less awkward to use.

Why this problem matters

AI is already a practical part of modern software development. Developers use it for code generation, debugging, optimization, and documentation, and that trend is still accelerating. The value is clear enough: repetitive tasks get faster, problem solving becomes more interactive, and the feedback loop gets shorter.

But there is still a gap between what AI can say and what it can actually do, and for me that is where the interesting work starts.

Many current tools are good at conversation, but much less effective when they need to operate across real systems in a reliable way. As soon as file operations, local environments, or cloud storage enter the picture, the workflow often becomes fragmented. You end up switching between terminals, dashboards, apps, and helper scripts. That friction adds up faster than it seems, and I think it is still under-discussed compared with model quality itself.

I wanted to explore a more direct model: an agent that works in the CLI, understands natural-language instructions, and can perform file-related operations across both local and remote systems.

Research focus

The core of the project was an agent-based application that runs directly in the command line and supports automated file management on:

local devices
remote cloud storage systems

The idea was to give the user a single, faster, and simpler way to access distributed resources. Instead of manually navigating multiple systems, the user can issue a request in natural language and let the agent interpret and execute the required steps.

In practice that looks like this. The host is started with the directories the agent is allowed to touch, either as an open chat loop or as a one-off prompt:

# interactive chat mode, scoped to two local folders and the CDN root
bun run host.ts --local assets downloads --remote /

# single-run mode: execute one prompt, then exit
bun run host.ts --prompt "summarise README.md"
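
To make the two modes concrete, here is a minimal sketch of how flags like these could be parsed. It is not the actual host.ts: the `HostOptions` shape and the `parseHostArgs` function are names I made up for illustration, and the only assumptions taken from the commands above are the three flags and the fact that `--local` accepts several space-separated directories.

```typescript
// Hypothetical sketch: the real host.ts may parse its flags differently.
interface HostOptions {
  local: string[];  // local directories the agent may touch
  remote: string[]; // remote (CDN) roots the agent may touch
  prompt?: string;  // if set, run one prompt and exit instead of opening a chat loop
}

function parseHostArgs(argv: string[]): HostOptions {
  const options: HostOptions = { local: [], remote: [] };
  let current: "local" | "remote" | "prompt" | undefined;

  for (const arg of argv) {
    if (arg === "--local" || arg === "--remote" || arg === "--prompt") {
      // A known flag starts a new group of values.
      current = arg.slice(2) as "local" | "remote" | "prompt";
      continue;
    }
    if (arg.startsWith("--")) {
      throw new Error(`Unknown flag: ${arg}`);
    }
    if (current === undefined) {
      throw new Error(`Unexpected value before any flag: ${arg}`);
    }
    if (current === "prompt") {
      // Join several words in case the prompt was passed without quotes.
      options.prompt = options.prompt === undefined ? arg : `${options.prompt} ${arg}`;
    } else {
      options[current].push(arg);
    }
  }
  return options;
}

const chat = parseHostArgs(["--local", "assets", "downloads", "--remote", "/"]);
const single = parseHostArgs(["--prompt", "summarise README.md"]);
console.log(chat.prompt === undefined ? "chat mode" : "single-run mode", chat);
console.log(single.prompt === undefined ? "chat mode" : "single-run mode", single);
```

The presence of a prompt is what separates the two modes in this sketch: no prompt means an open chat loop, a prompt means one run and exit.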

A typical session chains several tool calls from one instruction:

User: Upload 3 facts about life.
Assistant: Step 1 – Check the allowed local directories.
Assistant: Step 2 – Write a new file (`downloads/life_facts.txt`) with the three facts.
Assistant: Step 3 – Upload that local file to the cloud storage as `life_facts.txt`.

That makes the CLI feel less like a collection of commands and more like a working surface.
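
The three-step session above can be sketched as a small tool registry plus a loop that runs calls in order. This is a simplified illustration, not the project's code: the tool names (`list_allowed_directories`, `write_file`, `upload_file`) are hypothetical, the local disk and the cloud storage are replaced by in-memory maps, and the plan is hard-coded where a real agent would let the model choose each next call.

```typescript
// Hypothetical tool names and in-memory stand-ins; no real file system or CDN is touched.
type ToolArgs = Record<string, string>;
type Tool = (args: ToolArgs) => string;

const localFiles = new Map<string, string>();
const remoteFiles = new Map<string, string>();
const allowedDirs = ["assets", "downloads"];

const tools: Record<string, Tool> = {
  list_allowed_directories: () => allowedDirs.join(", "),
  write_file: ({ path, content }) => {
    localFiles.set(path, content);
    return `wrote ${path}`;
  },
  upload_file: ({ localPath, remoteName }) => {
    const content = localFiles.get(localPath);
    if (content === undefined) throw new Error(`No such local file: ${localPath}`);
    remoteFiles.set(remoteName, content);
    return `uploaded ${localPath} as ${remoteName}`;
  },
};

interface ToolCall {
  tool: string;
  args: ToolArgs;
}

// In a real agent the model would pick each call; here the plan is fixed.
const plan: ToolCall[] = [
  { tool: "list_allowed_directories", args: {} },
  { tool: "write_file", args: { path: "downloads/life_facts.txt", content: "Fact 1\nFact 2\nFact 3" } },
  { tool: "upload_file", args: { localPath: "downloads/life_facts.txt", remoteName: "life_facts.txt" } },
];

function runPlan(calls: ToolCall[]): string[] {
  return calls.map((call, index) => {
    const tool = tools[call.tool];
    if (!tool) throw new Error(`Unknown tool: ${call.tool}`);
    return `Step ${index + 1}: ${tool(call.args)}`;
  });
}

const transcript = runPlan(plan);
console.log(transcript.join("\n"));
```

Each step only sees the result of the state left behind by the previous one, which is why the upload fails loudly if the write never happened.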

Key concepts behind the project

To make the implementation easier to follow, I first defined the main concepts used throughout the project.

Large Language Models

Large Language Models, or LLMs, are one of the core building blocks of modern AI systems. They process and generate language by breaking text into tokens and predicting the next token step by step based on patterns learned from large datasets.

In practice, this makes them strong interfaces for reasoning, instruction following, and natural-language interaction.

One important limitation is the context window. An LLM can process only a limited amount of information at once, which becomes a challenge when working with larger codebases, file trees, or distributed system state.
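
A rough way to picture that limit is a budget check before a file listing is sent to the model. The sketch below assumes about four characters per token, which is only a rule of thumb; real tokenizers vary by model, and the function names here are my own.

```typescript
// Rough assumption: about 4 characters per token. Real tokenizers differ by model.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep entries in order until the estimated budget would be exceeded.
function fitToBudget(entries: string[], maxTokens: number): { kept: string[]; dropped: number } {
  const kept: string[] = [];
  let used = 0;
  for (const entry of entries) {
    const cost = estimateTokens(entry);
    if (used + cost > maxTokens) break;
    kept.push(entry);
    used += cost;
  }
  return { kept, dropped: entries.length - kept.length };
}

const listing = [
  "assets/logo.png",
  "assets/banner.jpg",
  "downloads/life_facts.txt",
  "downloads/report.pdf",
];
const fitted = fitToBudget(listing, 10);
console.log(fitted);
```

With a budget of 10 estimated tokens, only the first two paths fit and the remaining two are dropped, which is exactly the kind of silent truncation an agent has to account for.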

To help with this, systems often use embeddings, which are numerical representations of text or code in a high-dimensional space. These make it possible to retrieve semantically relevant information efficiently even when the full context cannot fit into a single model input.
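
As a small illustration of that retrieval step, the sketch below ranks a few files by cosine similarity to a query vector. The three-dimensional vectors are invented by hand so the arithmetic stays readable; real embeddings come from a model and have hundreds or thousands of dimensions.

```typescript
// Toy vectors chosen by hand; real embeddings are produced by an embedding model.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface IndexedChunk {
  path: string;
  vector: number[];
}

// Return the k chunks whose vectors point most nearly in the query's direction.
function topMatches(query: number[], chunks: IndexedChunk[], k: number): IndexedChunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, k);
}

const chunkIndex: IndexedChunk[] = [
  { path: "README.md", vector: [0.9, 0.1, 0.0] },
  { path: "assets/logo.png", vector: [0.0, 0.2, 0.9] },
  { path: "docs/setup.md", vector: [0.8, 0.3, 0.1] },
];
const best = topMatches([1, 0, 0], chunkIndex, 2).map((chunk) => chunk.path);
console.log(best);
```

Only the top matches would then be placed into the model input, instead of the whole index.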

Agentic systems

Agentic systems are software systems that can plan, make decisions, and execute tasks with some level of autonomy. Instead of only responding with text, they can act through tools and adapt to the environment they operate in.

That distinction is important.

A useful agent is not just a chat interface. It needs access to capabilities outside the model itself: file systems, APIs, services, or execution environments. Recent progress in LLMs has made these systems much more practical, but they still need careful orchestration to work reliably.

In this project, the focus was specifically on agent-driven file management: using an LLM-based agent to understand requests and coordinate operations on both local and cloud resources. I picked this scope on purpose because it is small enough to test properly, but still messy enough to reveal where current agents break down.
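
One place where such an agent can break down is path handling, since the model may produce a path that climbs out of the directories it was given. Below is a minimal sketch of an allow-list check, assuming the host keeps a list of permitted root directories. The function name and the `/work` base directory are hypothetical, and the check does not resolve symlinks, which a production version would need to consider.

```typescript
import * as path from "node:path";

// Returns true when `target` resolves to a location inside one of the allowed roots.
// Note: symlinks are not resolved here, so this is a lexical check only.
function isPathAllowed(target: string, allowedRoots: string[], cwd: string): boolean {
  const resolved = path.resolve(cwd, target);
  return allowedRoots.some((root) => {
    const resolvedRoot = path.resolve(cwd, root);
    const relative = path.relative(resolvedRoot, resolved);
    if (relative === "") return true; // the root itself
    const climbsOut = relative === ".." || relative.startsWith(".." + path.sep);
    return !climbsOut && !path.isAbsolute(relative);
  });
}

const roots = ["assets", "downloads"];
console.log(isPathAllowed("downloads/life_facts.txt", roots, "/work"));
console.log(isPathAllowed("downloads/../secrets.txt", roots, "/work"));
console.log(isPathAllowed("/etc/passwd", roots, "/work"));
```

Running a check like this before every file tool call keeps the decision in ordinary code rather than in the model's judgment.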

Content Delivery Networks

The project also touched on cloud infrastructure concepts, especially Content Delivery Networks, or CDNs.

CDNs are distributed server systems designed to improve availability and performance by caching content closer to end users. In modern platforms, they often go beyond static delivery and support edge computing, which runs application logic on servers close to the user instead of in a single central data center.

Because CDN-backed platforms are a common way to manage cloud-hosted files in real-world business environments, this model was relevant for the cloud side of the project. It also matched the focus on practical developer experience.

Model Context Protocol

Another important concept in the research was the Model Context Protocol, or MCP.

MCP is a protocol and architectural model designed to extend AI systems through external tools and structured context exchange. It follows a host-client-server setup:

the host application coordinates the system
MCP clients each maintain the connection to one MCP server
MCP servers expose tools, resources, and capabilities

Communication can happen over standard input/output or HTTP, depending on the setup. What matters is that MCP creates a standardized way for AI systems to discover tools, access external resources, and exchange context in a predictable way.
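
To show what tool discovery and invocation look like on the wire, here is a simplified in-memory sketch. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are the protocol's methods for discovering and invoking tools. Everything else is an assumption for illustration: the `read_file` tool, the demo file contents, and the `handleRequest` function are made up, and the sketch skips the initialization handshake and the transport layer entirely.

```typescript
// Simplified message shapes; the real protocol has an initialization handshake
// and more fields. The `read_file` tool and its demo data are hypothetical.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

const demoFiles = new Map<string, string>([["README.md", "# Demo project"]]);

function handleRequest(request: JsonRpcRequest): JsonRpcResponse {
  const base = { jsonrpc: "2.0" as const, id: request.id };

  if (request.method === "tools/list") {
    const inputSchema = {
      type: "object",
      properties: { path: { type: "string" } },
      required: ["path"],
    };
    return { ...base, result: { tools: [{ name: "read_file", description: "Read a text file", inputSchema }] } };
  }

  if (request.method === "tools/call") {
    const toolName = request.params?.name;
    const toolArgs = (request.params?.arguments ?? {}) as { path?: unknown };
    if (toolName !== "read_file") {
      return { ...base, error: { code: -32602, message: "Unknown tool" } };
    }
    const text = typeof toolArgs.path === "string" ? demoFiles.get(toolArgs.path) : undefined;
    if (text === undefined) {
      // Tool-level failures are reported inside the result, flagged with isError.
      return { ...base, result: { content: [{ type: "text", text: "File not found" }], isError: true } };
    }
    return { ...base, result: { content: [{ type: "text", text }] } };
  }

  return { ...base, error: { code: -32601, message: "Method not found" } };
}

const listed = handleRequest({ jsonrpc: "2.0", id: 1, method: "tools/list" });
const called = handleRequest({
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: { name: "read_file", arguments: { path: "README.md" } },
});
console.log(JSON.stringify(listed));
console.log(JSON.stringify(called));
```

In a real setup the same messages would travel over standard input/output or HTTP between a client and a server process instead of through a direct function call.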

In this project, I built a system that brings several of these roles together at once: host application, client, server, and agent.

What I found interesting about this work

What made this project meaningful to me is how naturally it connects several areas I care about:

developer experience
interface design
distributed systems
AI-assisted workflows
reducing cognitive overhead in complex environments

Although the implementation lived in the command line, I still saw it as an interface design problem as much as a systems problem. The interesting question for me was not only how to connect an LLM to tools, but how to shape that interaction so it becomes useful, trustworthy, and low-friction for real users.

That perspective comes directly from my frontend background.

Good interfaces are not just visual. They are about clarity, feedback, mental models, and helping users move through complexity with less effort. Agentic systems bring a new interaction layer into software, and I think they need stronger product and engineering thinking than many current AI products are getting.

Why I’m continuing in this direction

This research was a practical step toward the kind of systems I want to keep building: tools that combine AI reasoning, structured workflows, and well-designed interaction.

I’m especially interested in the space between raw model capability and usable product experience. That includes:

agentic interfaces
AI-powered developer tooling
multi-system orchestration
workflows that reduce operational and cognitive friction

As AI tools mature, I think the biggest opportunities will come from making them feel less like isolated assistants and more like integrated collaborators inside real software environments. Right now, too many products still stop at the assistant layer.

For me, this was one of the first serious steps in that direction.

Final note

With these foundations in place, the next step in the research was to look at the surrounding ecosystem.

If you are working on agentic UI, AI-assisted developer tools, or workflow automation, I would be glad to compare notes. This is a space I am actively exploring from both a systems perspective and a product experience point of view.