What if you put an AI assistant directly inside the black screen of your computer? (feat. Context Window)

AI Summary

Beyond simply using AI as a chatbot, this introduces the latest technology trends that allow AI to read all files and the latest documents on your computer at once and take direct commands.

Imagine this. You’re given the heavy task of summarizing dozens of massive quarterly performance reports at work. Normally, you’d probably go through this process. First, you grab your mouse and open the Excel files and Word documents one by one. You drag to select the contents and copy them (Ctrl+C). Next, you switch to the artificial intelligence chatbot (ChatGPT, Claude, etc.) window open in your web browser, paste (Ctrl+V), and carefully type, “Please summarize this.” If it were just one or two files, it would be bearable, but what if there are dozens of documents running into hundreds of pages? Just the simple manual labor of copying and pasting will waste a precious day. Moreover, if the chatbot spits out a red error message saying, “The number of characters entered is too large to process at once,” everything will go dark before your eyes, and leaving work on time will be out of the question.

However, your genius developer colleague sitting next to you works a little differently. They don’t even touch the mouse. They simply open a ‘Terminal’ window with white text blinking on a black background, like something out of a hacker movie. Then, they type a few lines of seemingly incomprehensible English words and casually press the Enter key. In just a few minutes, dozens of documents are perfectly summarized, and a single report file with only the core points extracted magically appears on the desktop. What on earth happened on this colleague’s computer? The colleague never visited a chatbot website, nor did they copy and paste even once.

This is the vivid reality of the latest artificial intelligence technology we are going to dig into today. Developers and data scientists in Silicon Valley no longer use artificial intelligence simply at the level of a ‘chatbot’ that converses in an internet window. They summon artificial intelligence directly into the deepest parts of their computer’s operating system, making it manually touch and manipulate all the files and data on their hard drive. How has such magic become possible? It is thanks to the dazzling advancement of ‘Command-Line Interface (CLI)’ tools and the evolution of the ‘Context Window’, which has explosively expanded the brain capacity of AI to read and remember everything at once. Today, at MindTickleBytes, we will explain this latest technology trend, which might seem a bit difficult in everyday terms, as easily and engagingly as a close friend explaining it over a cup of coffee.

Why It Matters

Dropping the Mouse and Grabbing the Keyboard: The Crucial Difference Between GUI and CLI

To understand why this technology is important enough to fundamentally change the way we work, we must first clearly understand the crucial difference between GUI, our everyday way of using computers, and CLI, favored by experts.

We are usually accustomed to screens where colorful icons and folder pictures appear when we turn on the computer, and we click around by moving the mouse pointer. This is called a ‘GUI (Graphical User Interface)’. Because it consists of pictures and buttons, it is intuitive and easy to learn. However, it has a fatal flaw: when performing complex and repetitive tasks, such as processing dozens of files at once, it requires a lot of manual intervention, significantly slowing down the work speed. On the other hand, the ‘CLI (Command-Line Interface)’ loved by experts is a method of communicating with the computer using only text-based commands. There are no flashy pictures on the screen, just a single blinking cursor where you can type text.

Why stick to this difficult and bleak-looking method? Let’s compare it to a high-end restaurant to make it easier to understand. If GUI is the process of a customer looking at a menu full of pretty food pictures and calling a waiter to order a dish, CLI is like the customer barging straight into the kitchen and directly instructing the head chef, “Take the salmon and asparagus from the corner of the second shelf in the fridge and bake them in a 200-degree Celsius oven for exactly 15 minutes and 30 seconds.” Because there is no need to go through a waiter or flip through a menu, the speed is incomparably faster. Furthermore, you can perfectly control and create bizarre and complex dishes that are not on the set menu exactly as you want.

With the rapid development of Large Language Model (LLM) technology recently, an era has opened up where a ‘genius AI head chef’ is hired and stationed 24 hours a day in this secret kitchen called CLI. Users no longer need to go through the trouble of opening a web browser to copy and paste text. Looking at the case of LLM: A CLI utility and Python library for interacting with Large Language Models, users can simply command, “Explain this code in detail,” by streaming the contents of their Python code file through a pipe in the terminal window of their computer.

For example, a single line of text command, cat myfile.py | llm -s "Explain this code", completes everything. Put simply, here cat means to open the lid of the file and show its contents, and the | (pipe symbol) means to pour the spilled contents directly into the mouth of the AI (llm) without dropping a single drop, just like connecting a water pipe. Without even needing to access a website or log in, your computer’s files, databases, and system settings are connected directly to the AI’s brain. As a result, the time wasted on unnecessary mouse clicks completely disappears, and human work efficiency increases beyond imagination.

The Explainer

AI’s Limitless Short-Term Memory, The Magic of the Context Window

Then, how can AI read all the numerous files and code scattered in the folders of my computer at once and perfectly understand the context? Here, two core concepts emerge that you must grasp to understand the future AI era: ‘Token’ and ‘Context Window’.

First of all, artificial intelligence does not understand the words and entire sentences we use as whole chunks like taking a picture, as a person would. It breaks down words into tiny pieces called ‘tokens’ so that the computer can calculate them quickly and mathematically. As detailed in the document [What is a context window?

IBM](https://www.ibm.com/think/topics/context-window), if you use tools provided by the Hugging Face platform, you can visually see how various AI models tokenize (fragment) text inputs.

To use an analogy, tokens are the ‘Lego blocks of language’. The word “apple” might be 1 Lego block (token), while a complex loanword like “Transformer” might be shattered into 3 to 4 tokens. Generally, for English, you can roughly think that one word translates to approximately 1.2 tokens.

And the AI’s ‘short-term memory’ or ‘workspace’, where it can place these numerous finely chopped Lego blocks (tokens) on a desk all at once and think by connecting the context before and after, is what we call the Context Window.

To help you understand, let’s compare it to a complex criminal investigation. Imagine you are a veteran detective tasked with investigating a highly intertwined serial case. To solve the case, you need to put vast and fragmented evidence materials—such as hundreds of photos of the crime scene, a month’s worth of call records for ten suspects, and statements from dozens of witnesses—into your head, meticulously compare them, and find contradictions. Here, the ‘context window’ means the size of the massive ‘whiteboard’ in the investigation squad where you can spread out all those evidence materials without overlapping and compare and analyze them at a glance while drawing red lines.

Unfortunately, the early artificial intelligence in the past had a whiteboard that was too small. At best, it was a cramped size that could barely hold three or four sheets of A4 paper. Therefore, if you attached the first suspect’s statement to the whiteboard and read it carefully, but then had to pull out a new document to verify the second suspect’s alibi, you inevitably had to take down and throw away the first document that was already attached. Naturally, the AI would quickly forget the important information it had just read and exhibit hallucinations, making up lies and giving irrelevant answers to questions it wasn’t asked.

But now, the situation has completely reversed. Due to dazzling advancements in hardware and innovations in AI algorithms, the size of the whiteboard used by AI has become as vast as a massive World Cup stadium. According to the technical report LLMs with largest context windows, today’s leading LLMs in the industry basically support an ultra-massive context window that can process a staggering 400,000 to a maximum of 1,000,000 input tokens at once without error.

Just how massive is this 1 million tokens in real life? It is an overwhelming amount that allows you to instantly spread out hundreds of thousands of lines of the entire source code (codebase) of a running computer program, bundles of hundreds of dense legal contracts that only lawyers would read, the full transcript text of a multi-hour documentary video, and even the complete chat history of daily conversations between a specific user and an AI over several months on this massive whiteboard all at once without missing a single thing, with just a single question. Now, the smarter AI doesn’t need to nervously fumble through its memory or stutter, asking, “What did the very first document say?” Instead, it can simultaneously see through thousands of files you throw at it at once without a single second of error and derive perfect contextual analysis results.

Where We Stand

This amazing technological leap is not a story of the distant future trapped in thick academic papers. Even at this moment, it is a vivid and dynamic reality happening every day inside the black terminal windows of countless data scientists and programmers around the world.

Over 100 Artificial Intelligence Brains to Choose From

In the past, users had to connect to a single chatbot website monopolized by a specific global corporation and passively use only the artificial intelligence models they permitted. But now, smart users easily swap out the artificial intelligence’s brain to fit their work situation and budget, just like changing a smartphone case. As specified in llm · PyPI, by simply installing a Python-based CLI tool utility named ‘llm’ once, you can freely switch and control between top-tier commercial AI models developed with astronomical amounts of money by giant global Big Tech companies like OpenAI, Anthropic, and Gemini, and ‘local artificial intelligence models’ downloaded directly to your computer’s hard drive and run offline without the internet, all with a single line of command in the terminal window.

Currently, artificial intelligences with diverse personalities beyond our imagination are pouring out into the world every day. Looking at the statistics compiled on the LLM Leaderboard - Comparison of over 100 AI models from OpenAI… website, well over 100 different AI models are boasting their respective strengths. They are fiercely competing for survival, overtaking one another in various key metrics such as the logical accuracy level of intelligence, price per token, output speed of typing text, wait time (latency), and the maximum supported size of the context window discussed in depth earlier.

Managing these numerous models is also now done perfectly inside the terminal. By utilizing management tools like A CLI tool to list available LLM models from various providers, you can securely encrypt and set API (Application Programming Interface) keys from various companies in the terminal environment, and neatly query and command a list of all artificial intelligences that can be called up on your computer right now. A skilled worker has become able to perfectly conduct like an orchestra maestro, whether to use the smartest and most expensive model to solve a difficult mathematical algorithm problem, or to use a lightning-fast, 100% free small local model on their computer to simply correct text typos.

Local Models Dancing in Closed Networks and Ultimate Security

Another revolutionary trend exploding recently in the expert and developer communities is the popularization of local AI models that work perfectly even in offline environments. Transmitting confidential source code containing a company’s core technology or patient data containing millions of social security numbers entirely over the internet network to the cloud servers of other global Big Tech companies is an incredibly dangerous act that is absolutely unacceptable from a security standpoint.

As kindly introduced in the [Run LLMs Locally: 7 Simple Methods

DataCamp](https://www.datacamp.com/tutorial/run-llms-locally-tutorial) tutorial document, highly intuitive and easy-to-use free open-source frameworks such as GPT4All, LM Studio, Ollama, and llama.cpp have recently emerged. Using these tools, anyone can download and install a high-performance AI entirely on their Windows laptop, MacBook, or Linux PC without complex network settings. Because of this, even in a deep underground bunker where the internet LAN cable is completely unplugged or inside an airplane at an altitude of 10,000 meters where Wi-Fi is disconnected, the AI assistant in your laptop still perfectly summarizes your private documents and helps with coding.

Furthermore, looking at the case in the blog post Using a local LLM in OpenCode with llama.cpp – Aayush Garg, using these local LLMs has evolved beyond simply using them for questions and answers to the stage where they are directly connected like an API to dedicated coding assistant platforms (OpenCode, etc.) to automatically complete code as soon as you type. The article The 6 Best LLM Tools To Run Models Locally also emphasizes that these tools no longer remain at the level of terminal commands as in the past, but excellently perform the role of an independent API server within your computer environment, providing a seamless experience offline that is completely identical to using OpenAI’s paid services.

“If I Don’t Know, I’ll Search and Bring It” - A Perfect Cure for Hallucinations

However, no matter how smart a genius artificial intelligence is, there is still a fatal Achilles heel it must overcome. It is the phenomenon of ‘Hallucination’, where it does not honestly admit it doesn’t know about facts it is unaware of, but instead makes up plausible lies with an overly confident and logical attitude. When developers ask an AI to “build a website skeleton with the latest React framework syntax,” it happens very frequently that the AI proudly writes useless, nonsensical code that doesn’t even work right now, based on outdated, expired knowledge it learned last year.

To completely block such fatal errors in advance, groundbreaking and clever tools like the GitHub - upstash/context7: Context7 Platform – Up-to-date code… platform have appeared. To compare how this system works to our reality, it is like completely changing the rules from a ‘memory test’ taken in a tightly closed exam room to an ‘open-book exam’ where you can look up books at any time.

The AI is not left to struggle to fumble its memory, relying solely on its blurry training data from the past to force out an answer. The moment the user throws a prompt (command), an auxiliary tool called Context7 connects to the internet and the latest official documentation repositories like lightning. Then, it scrapes the most up-to-date official manual documents tailored to the specific software version and fresh, latest code examples that work without errors in practice immediately, which are related to the question. It then neatly sets up that information on the massive context window (work desk) equivalent to 1 million tokens explained earlier, merges it with the user’s question, and injects it into the AI’s brain.

The AI now has absolutely no need to unreasonably dredge up old, outdated knowledge. It just needs to slowly read through the perfect, latest manual that was delivered to its desk 1 minute ago, and then generate the correct answer exactly as written in the manual. As a result, the probability of hallucinations, where it churns out irrelevant code, is dramatically reduced to near zero, and the developer gets perfect code that runs the service immediately just by copying and pasting.

Furthermore, when there is a need to search through vast internal company documents, it is also combined with vector embedding technology (FastEmbed) that maximizes hardware support, as in the latest case of Retrieval with Qdrant - Docling. Through this, a powerful technological synergy is currently exploding, where it rummages through vast text data amounting to tens of millions of cases at the speed of light within the CLI window, sharply plucks out only the most relevant information like tweezers, and pushes it into the AI’s window.

Also, evolution is repeating at a terrifying speed in terms of the format of the data. Not only smooth prose that is good for human eyes to read, but also mechanical data formats essential for communication between computer programs are being considered. The tool Linearis, A Linear CLI Tool Built for Humans (and LLM Agents), while handling data from Linear, a widely used project management tool in the field, cleanly outputs results in a perfectly structured JSON format, which is very easy for both AI agents (robots) and human programmers to process secondarily, rather than simply spitting out text. A seamless chain reaction where the CLI spits out data, the AI consumes it, and other software flawlessly processes the answer the AI spits out, has become possible in earnest.

What’s Next

Artificial Intelligence That Has Moved Beyond the Mouth Inside the Monitor and Begun ‘Actions’ Touching the Digital World

If artificial intelligence so far has remained at the level of an excellent ‘secretary’ or an ‘advisor’ sparing no advice, eagerly reading the vast documents thrown by the user with a magnifying glass and kindly displaying the answers in text on the screen, the artificial intelligence of the future we will face is completely different. It will perfectly transform into a reliable and ‘independent practical worker’ who walks on its own two feet directly into our computer system, rolls up its sleeves, physically sweats, and works proactively. The ultimate technology that makes this a reality, not a fantasy, is the ‘Tool Use’ or ‘Function Calling’ capability.

Surprisingly, this massive paradigm shift is happening right before our eyes right now, not in the distant future. According to the fascinating article Large Language Models can run tools in your terminal with LLM 0.26, as the ‘llm’ CLI tool loved by numerous experts worldwide underwent a massive update to major version 0.26, the most shocking and disruptive new feature since the start of the project was equipped. That is, the large language model crouching in the terminal has finally been granted the immense power to logically judge and directly execute various third-party ‘tools’ installed on the user’s computer on its own, without having to get permission or approval from a human every single time.

What on earth does this mean for ordinary users? Let’s imagine a bitter situation from just a little while ago. If a user asked, “Extract the letters from this scanned paper document picture on the desktop and make a Word file,” the AI would only kindly give a vague textual methodology, “Please install Python’s Tesseract library, write such and such script code yourself, enter it into the terminal, and run it.” The person receiving the instruction had to leave the AI’s advice floating in the corner of the monitor, hit the keyboard themselves, fix any errors that occurred, and bear the physical labor entirely. To use an analogy, it was merely an annoying site supervisor sitting back in an air-conditioned office with their hands behind their back, only giving verbal advice.

But now, the evolved AI since version 0.26 can be equipped not just with a speaking ‘mouth’, but with ‘hands and feet’ capable of exercising powerful physical force in the digital world, that is, real hammers and Phillips screwdrivers (digital tools). Looking at the latest experiment cases pouring out endlessly from the GitHub - markomanninen/llm-experiments: Large Language Models… open-source repository, it goes beyond wonder and is almost chilling. The artificial intelligence deeply embedded in the black CLI environment of the terminal has now far surpassed the level of simply exchanging text chats. It has completely armed itself with the active function calling ability to use system control tools that convert audio files to other formats or play them directly, data management tools that cleanly classify and organize complex numerical data that you are bored of and hate looking at, a code runner that instantly compiles and runs the code it just wrote in a virtual sandbox environment, and even to flawlessly play classic games like Tic-Tac-Toe or Chess following the rules perfectly within the terminal window with a human user to cool off its head during work.

Imagine coming to work tomorrow morning, rubbing your tired eyes, and casually typing this into the black terminal window in front of your computer: “Analyze all 30 branch Excel files downloaded to the company’s public folder early this morning, and extract only the rows where the ‘net profit’ item is marked in red deficit. Gather only that data, convert it into a clean PDF report file containing a nice pie chart, and send it immediately to our team leader’s email address with the title ‘Urgent: Deficit Branch Report’.”

Just 1-2 years ago, this would be an absurd command from a sci-fi novel, something you would only give to Jarvis in the Iron Man movie. However, in front of the latest AI agent perfectly combining a massive context window capable of digesting millions of tokens and autonomous CLI tool execution rights, it is nothing but a very ordinary and boring morning routine. Receiving this complex command, the AI rapidly spins up its own logic engine and begins to act sequentially as follows:

It pulls out the ‘local file search tool’ on its own, finds all 30 Excel files in the folder without missing any, and reads them into memory.
It spreads out all the hundreds of thousands of cells of data in the middle of a massive 1-million-token context window, meticulously compares the numbers, and smartly and sharply extracts only the data of the branches with a deficit.
It writes the visualization code itself, directly runs the ‘data to image conversion tool’, and draws a nice pie chart in a single stroke.
It runs the ‘PDF format generator’ tool to combine the extracted text and the drawn picture, quietly creating 1 plausible document file on the desktop.
Finally, it controls the ‘Email SMTP sending tool’ linked to the system to specify the exact recipient and fire off the email.

Even before the user goes to the breakroom and gets a cup of hot Americano from the coffee machine, this entire complex, labor-intensive, multi-step process is completed lightning-fast, quietly, and perfectly within the black terminal window. The user just needs to return to their seat and happily check the single line of text displayed on the screen after the AI completes the transmission: “The requested task has been successfully completed.”

Furthermore, this magical realm of unmanned automation is explosively expanding beyond replacing individuals’ simple repetitive tasks to a massive server infrastructure scale that moves the entire enterprise. As seen in the expert post [LLM Benchmarking with worktree-compose

Mostafa Ali…

LinkedIn](https://www.linkedin.com/posts/mostafasudo_im-guilty-of-jumping-on-a-new-llm-and-benchmarking-activity-7431765787385679873-GUDD), the latest open-source CLI automation tools emerging recently have already reached a shocking level where the AI can entirely grasp and control the entire skeleton of a company’s complex server infrastructure—automatically opening and allocating firewall system ports whenever a new work environment is needed, building backend databases, spinning up temporary memory caches, and automatically configuring dozens of containers with Docker commands on its own.

Artificial intelligence is no longer a passive and boring entity locked in the narrow prison of a square web browser window on a monitor, hopelessly typing away while waiting for the user to ask a question. It is taking giant strides deep into our daily lives and work as a powerful, omnipotent administrator who directly controls and manipulates the complex laws of physics of the digital world, namely the computer’s file system and network.

AI’s Take

From the perspective of MindTickleBytes’ meticulous AI specialized reporter, this latest technology trend can be summarized as follows:

“Artificial intelligence is now perfectly evolving beyond a ‘talker’ who is simply eloquent at giving answers, into a ‘worker’ who puts down deep roots in the operating system, the heart of our computer, and physically sweats. The overwhelming context window brain capacity of 1 million tokens that perfectly remembers all past conversations and vast corporate history at once without a single second of delay. And the awe-inspiring thought of how AI, which has simultaneously acquired the invincible physical hands and feet of ‘Tool Use’ capable of operating various software equipment on its own, will fundamentally shatter and then magnificently reassemble the limits of humanity’s working methods and productivity in the future.

The almighty head chef exclusively yours, waiting behind the blinking white cursor in the black terminal screen you stare at every day, has finished all preparations to complete extraordinary dishes for you today as well. The moment you put down your mouse and place your hands on the keyboard, your old and frustrating work environment will turn into a space of infinite possibilities breathing together with AI. Now, what amazing miracle would you like to order from this chef?”

References

Linearis, A Linear CLI Tool Built for Humans (and LLM Agents)
GitHub - markomanninen/llm-experiments: Large Language Models…
Using a local LLM in OpenCode with llama.cpp – Aayush Garg
[Run LLMs Locally: 7 Simple Methods DataCamp](https://www.datacamp.com/tutorial/run-llms-locally-tutorial)
LLMs with largest context windows
A CLI tool to list available LLM models from various providers
GitHub - upstash/context7: Context7 Platform – Up-to-date code…
[What is a context window? IBM](https://www.ibm.com/think/topics/context-window)
Retrieval with Qdrant - Docling
LLM Leaderboard - Comparison of over 100 AI models from OpenAI…

[LLM Benchmarking with worktree-compose

Mostafa Ali…

LinkedIn](https://www.linkedin.com/posts/mostafasudo_im-guilty-of-jumping-on-a-new-llm-and-benchmarking-activity-7431765787385679873-GUDD)

Large Language Models can run tools in your terminal with LLM 0.26
llm · PyPI
LLM: A CLI utility and Python library for interacting with Large Language Models
The 6 Best LLM Tools To Run Models Locally

Share this article:

Test Your Understanding

Q1. Which of the following is the best analogy for the 'Context Window' explained in the article?

The size of a chef's cutting board (workspace)
The maximum speed of a car
The resolution of a computer monitor

Since the context window refers to the amount of information an AI can read and remember at one time, it can be compared to a cutting board where a chef can place ingredients or the size of an investigator's whiteboard.

Q2. Up to how many tokens can recent AI models process at once?

10,000
100,000
1,000,000

According to the article, recent LLMs support massive context windows that can process from 400,000 up to 1,000,000 input tokens at once.

Q3. Which of the following is NOT an advantage of using AI in a CLI (Command-Line Interface) environment?

It can directly read the computer's files or code.
It is fun to click colorful buttons with a mouse.
You can make the AI directly execute the computer's tools.

CLI is an environment where you communicate with the computer using only text, so it is far from a flashy graphical environment (GUI) that uses a mouse. Its purpose is to maximize work speed and efficiency rather than intuitive fun.