Ideas for free
Prompts already won
Thursday 6 February 2025 at 09:00 CET
Sometimes, I have ideas, and while it’s unlikely I’ll ever pursue them, I can’t stop thinking about them. So I write them down. I hope to get them out of my head, and ideally, into the head of someone else who might benefit from them. If you want to make this, or even just take one small idea from the pile, please do. It’s yours.
The siren song of the command line is strong.
Pretty much every programmer, and many people who would not consider themselves programmers, eventually get comfortable with their terminal emulator, and the shell within. (Liking them is another thing entirely.) We memorise wild incantations, where a tiny slip, a misplaced ampersand, can have catastrophic consequences.
Me, personally, I love it. Composing an arcane shell command and watching as files are brought into being, processes are spun up, Things are Happening… it’s the closest I’ll get to ever being a wizard. (Probably. I won’t rule out finding a spell book one day.)
Command-line tools, the shell, and the terminal emulator are so embedded in most programming and administrative disciplines (but not all; for example, plenty of Windows-focused computer people don’t bother) that they’re often considered the difference between an “expert” user and a “novice”.
Here’s how it’s often seen: experts type a lot, leaving the keyboard rarely, prefer text input and output, and have no need for graphical interfaces; novices, by contrast, hate typing, mostly use the mouse, want to see pretty pictures, and therefore need a GUI to get anywhere.
The implication, of course, is that the novice is doing something inferior, and if only they’d take the time to learn a better way, they’d be much more productive and efficient.
The only problem is that the assumptions are completely false.
Prompts already won
Prompts won decades ago. We just weren’t paying attention.
Here’s a prompt:
That one, you’re probably familiar with, assuming you’re an average reader of this blog.
Here’s another one:
Yup. Most popular user interface in the world (and it hasn’t changed much in the last 25 years, except to pile on the adverts in the results), and it’s not too far off your terminal emulator of choice. It’s a box, you type in it, and you get a page of textual output.
You may be old enough to remember what came before Google:
Sure, there was a prompt, but there was all this other… guff. Google’s big idea was that if they could just make the search box work well, they could drop all that other nonsense.
And it worked. People flocked to it. First the experts, but then the novices. Even people like me, who had barely figured out this whole “internet” thing at the time.
Nowadays it looks more like this:
The Omnibox, a powerful concept: just type, and we’ll figure out what you mean.
Here’s another prompt. It’s called “ChatGPT”.
No other product has seen such explosive growth. It’s insanely popular. (And only possible because the powers that be have decided that laws don’t apply here, and they’re able to lose billions of dollars per year and still keep going.)
And its interface is… a text prompt.
Non-programmers don’t shy away from text input. They love it.
Affordances
Now, as for the real difference between command-line and graphical interfaces, it’s this: GUIs have affordances. This means they give you clues as to what you can do with them. They provide buttons, prompts, handles, and other mechanisms to draw your eye and help you figure out what’s possible (at least, if your eyesight is functional). They help you out, correcting small mistakes on the fly.
By contrast, most CLIs provide almost no affordances in their default state (with a shout-out to the fish shell for defying this trend). You are expected to know your tools and read the manuals. Your shell might give you some information via tab-completion, but that assumes you already know which tool you’re looking for. A lot of effort goes into making the terminal emulator and shell usable, and everyone, including experts, still finds them frustrating.
We all need affordances to help us learn or remember how to accomplish a task, or perhaps even learn better ways of achieving a goal. We’re all novices at first, and while experts may not need them for a given set of tools, there are always other tools they’re unfamiliar with, or have just plain forgotten about. I’ve been using the Linux shell for about 20 years and I still forget that common utilities even exist. (Don’t ask me which ones; I have forgotten them.)
Think about how you, an esteemed keyboard warrior, navigate an unfamiliar source code repository. Do you clone it, open it in Midnight Commander, and poke around there? Or do you use GitHub’s web interface, which lets you use your mouse, and gives you a bunch of useful info on whatever you click?
(This is not an endorsement for GitHub’s UI, which I find atrocious in many respects, just an observation that we will use it despite its flaws because it makes other things easier.)
There is no shame in using what’s easy and comfortable. The shame is in purposefully making something harder or more dangerous in order to create exclusivity.
Because of course, there is power in the more complex, harder-to-use tools, but it doesn’t come from them being harder to use.
Composition
There is power in the shell, but it’s not from the lack of handholds.
It’s how easy it makes it to wire tools together.
We pipe to sed, then grep, then jq, then curl. It’s natural. It’s the whole point. We reconfigure our tools to make new ones every 5 minutes.
If you don’t have a shell, then there’s only one way to do this: Ctrl+C, and Ctrl+V.
Power tools compose. They know they can’t possibly be enough for an expert user, and so they make it easy for that expert to modify them, enhance them, rebuild them.
Novice tools come in a lovely plastic (or aluminium) package designed to be impenetrable without 3 different kinds of esoteric screwdriver, a spudger, and a chisel.
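To make the composition point a little more concrete, here is a rough Python sketch of what wiring tools together looks like outside a shell. It assumes a Unix-like system with grep, sed, and sort on the PATH; the pipe helper and the sample log text are made up for illustration. Unlike a real shell pipeline, it runs the commands one after another instead of streaming between them.

```python
import subprocess


def pipe(data: str, *commands: list[str]) -> str:
    """Run each command in turn, feeding it the previous command's output.

    Hypothetical helper: a shell does this with the | operator.
    """
    for argv in commands:
        result = subprocess.run(
            argv, input=data, capture_output=True, text=True, check=True
        )
        data = result.stdout
    return data


# Made-up sample input, standing in for a real log file.
log = "error: disk full\ninfo: all good\nerror: net down\n"

# Shell equivalent: grep '^error' | sed 's/^error: //' | sort
errors = pipe(log, ["grep", "^error"], ["sed", "s/^error: //"], ["sort"])
print(errors, end="")
```

The shell version fits in a comment. The Python version needs a helper function before it can even start, which is rather the point.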
Reasoning
Perhaps the widest gulf in user expectations is this: a novice wants the computer to do as they mean.
If they mistyped a word, they want their search engine to detect this, and correct them automatically. A camera should post-process to make photos look good. Skipping through video should snap to transitions or the start of dialogue where appropriate.
We’re all in “novice” mode most of the time.
An expert, or more accurately, someone performing a task in which they believe they are expert, wants the computer to do as they say. Code is not subject to interpretation and whimsy; spreadsheet formulas do not have “wiggle room” (except when rendering the number 65,535); photo editing software allows you to be incredibly exact when making adjustments.
An expert still makes mistakes, of course, but typically solves them with safety measures, such as an “undo” button or a manual verification procedure. They might even write automated tests.
It’s worth noting here that there’s a word for someone who is not an expert expecting the tool to do as they say: “hubris”, as eloquently explained by Ian Smith in their essay, humans interacting with computers. And there’s a word for the opposite, someone who understands the system well but chooses to allow it to behave as it sees fit: “sophrosyne”. Because they have explained it better than I could, I’m simply going to steal their diagram and place it here:
(Regardless, read the article, it goes into way more detail than I can.)
A tool that serves anyone, regardless of their level of understanding or level of intention, is a tool that needs to be aware of its user. It needs to provide help where appropriate, make automatic changes if required, and back off when asked.
Modern search has fallen into the trap of giving up on the expert user, instead treating everyone as a novice. Advanced search expressions have generally been dropped in favour of LLM-powered inference. Sometimes useful, often actively harmful. DuckDuckGo, on the other hand, provides “bangs”: shortcuts to other search engines. They realise they can’t be everything to everyone, and so they don’t try, instead making it easy to use a different tool when appropriate. A tool for an expert that doesn’t detract from the default experience of a novice.
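The idea behind bangs is simple enough to sketch in a few lines of Python. This is only an illustration of the routing idea, not DuckDuckGo’s implementation: the BANGS table, the DEFAULT template, and the route function are all hypothetical, and the real bang list is far longer.

```python
from urllib.parse import quote_plus

# Illustrative table only; not DuckDuckGo's real bang list or implementation.
BANGS = {
    "w": "https://en.wikipedia.org/w/index.php?search={}",
    "gh": "https://github.com/search?q={}",
}
DEFAULT = "https://duckduckgo.com/?q={}"


def route(query: str) -> str:
    """Return the search URL for a query, honouring the first known bang."""
    words = query.split()
    for i, word in enumerate(words):
        if word.startswith("!") and word[1:] in BANGS:
            # Drop the bang itself and search for everything else.
            rest = " ".join(words[:i] + words[i + 1:])
            return BANGS[word[1:]].format(quote_plus(rest))
    # No recognised bang: fall back to the default engine, untouched.
    return DEFAULT.format(quote_plus(" ".join(words)))


print(route("!w VT100"))
print(route("fish shell !gh"))
print(route("plain old search"))
```

A query without a bang behaves exactly as it always did, so the novice never has to know the feature exists.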
We see the same fight play out in social media. Mainstream products (Facebook, TikTok, etc.) provide The Algorithm™; they decide what you see, and your preferences merely influence the feed. You don’t have control. Excellent for someone who doesn’t want to curate their own feed, and appalling to those who do. The indie players such as Mastodon, on the other hand, decry this, instead opting for a purely chronological timeline, which feels confusing and boring to anyone who doesn’t want to invest the time, including those with sophrosyne, who understand the situation but wish that the computer would help them. Neither solution is appropriate for everyone.
Our terminal-based tools typically don’t make assumptions or corrections on behalf of the user. They’re intended for experts who read the manual, and so they don’t offer improvements or suggestions.
But they could.
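As a minimal sketch of what “could” might look like, here is a “did you mean?” suggestion built on Python’s standard difflib module. The suggest function and the list of known commands are assumptions for illustration; a real shell would draw its candidates from the PATH, and might rank them very differently.

```python
import difflib


def suggest(typed: str, known: list[str]) -> list[str]:
    """Return up to three known commands that look like the one typed."""
    return difflib.get_close_matches(typed, known, n=3, cutoff=0.6)


# Hypothetical list of installed commands.
known_commands = ["grep", "sed", "jq", "curl"]

print(suggest("gerp", known_commands))
print(suggest("xyz", known_commands))
```

The tool still does as it is told. It only offers a suggestion when the command isn’t recognised, and the expert is free to ignore it.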
The best of both
We don’t have to live in two separate worlds.
Expert tools can have affordances. Take a look at Zellij, which doesn’t need docs because it tells you exactly how to use it at the bottom of the screen. An expert can turn this off, but many don’t.
My shell highlights the program name in green when it’s recognised (on the PATH), and red when it’s not. A simple affordance that’s saved me from several grey hairs.
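That highlighting trick is small enough to sketch. The following is not how any particular shell implements it; it is a rough Python approximation using shutil.which and ANSI colour codes, and the highlight function and its lookup parameter are hypothetical names introduced here so the lookup can be swapped out.

```python
import shutil

# Standard ANSI escape codes for green, red, and "back to normal".
GREEN, RED, RESET = "\033[32m", "\033[31m", "\033[0m"


def highlight(command_line: str, lookup=shutil.which) -> str:
    """Colour the program name green if lookup finds it, red otherwise."""
    program, sep, rest = command_line.partition(" ")
    colour = GREEN if lookup(program) else RED
    return f"{colour}{program}{RESET}{sep}{rest}"


# The colour you see depends on what is installed on your machine.
print(highlight("grep -r TODO ."))
print(highlight("gerp -r TODO ."))
```

Only the first word is checked, so this sketch ignores pipelines, aliases, and shell built-ins, all of which a real shell would need to handle.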
Perhaps expert tools can even have graphics. Why can’t my terminal emulator render a picture? (Spoiler: it can, but no one makes use of it.) Why can’t it render a web page, or play a song, complete with graphical audio controls?
Why is it limited to fixed-width text?
Our command-line applications run in layers: the program is invoked by, and managed by, a shell. The shell runs in a terminal emulator (though it might be nested in another shell). And that terminal emulator, well, emulates a terminal. One of those green-on-black screens that are familiar to anyone who watched sci-fi in the 1990s. Specifically, it emulates a DEC VT100 terminal, developed and released in the late 1970s.
Let’s imagine that instead of emulating a terminal, we might, instead, display anything we want, unencumbered by needing to support a graphics environment literally designed for a text-only, 80x24 screen.
What if it was, for example, a web browser, with an Omnibox just like the one above? The only difference would be that in addition to rendering web pages, it would be able to invoke shell commands too, faithfully emulating a terminal when required.
And if the shell command were to output an interactive (but self-contained) web page, well, we could render it.
I guess what I’m saying is, what if the Omnibox was a little more Omni? In addition to browsing and search, maybe it could run other programs too: command-line programs, a calculator… perhaps even an on-device LLM, assuming we could make one that’s ethically trained.
One could extend this to interact with the current page as well; why shouldn’t I be able to pipe the result of a sed invocation directly into my to-do list?
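A “more Omni” Omnibox would need some way to decide what the typed text is. Here is a deliberately naive Python sketch of that first step. Everything in it is an assumption: the classify function, the category names, and the order of the checks are invented for illustration, and no browser works exactly this way.

```python
import re
import shutil
from urllib.parse import urlparse


def classify(text: str, lookup=shutil.which) -> str:
    """Guess whether the input is a URL, a sum, a command, or a search."""
    text = text.strip()
    if not text:
        return "empty"
    if urlparse(text).scheme in ("http", "https"):
        return "url"
    # Only digits, whitespace, and arithmetic symbols, with at least one digit.
    if re.fullmatch(r"[\d\s+\-*/().]+", text) and re.search(r"\d", text):
        return "calculate"
    # The first word names a program that lookup can find.
    if lookup(text.split()[0]):
        return "command"
    return "search"


print(classify("https://example.com"))
print(classify("2 * (3 + 4)"))
print(classify("how do pipes work"))
```

The hard part is the ambiguity: plenty of ordinary words are also program names, so a real version would want to show its guess and let the user override it. That is an affordance, too.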
There and back again
In 1968, Doug Engelbart publicly demonstrated the mouse, and later, Apple pushed it front and centre, to the exclusion of a keyboard-driven workflow.
Perhaps this idea needs flipping on its head, at least a little.
The mouse (and its friends: the trackpad, trackball, etc.) is a wonderful device and I couldn’t work without one. But sometimes, typing suits me best.
Imagine the desktop environment had an Omnibox; one which let you engage with a command-line shell, or a calculator, or a website, spawning or scrolling windows as necessary.
(Of course, you’d be able to drive it with your voice, if you wanted to.)
In fact, it already has one, doesn’t it?
macOS has had Spotlight since it was called “Mac OS X”. It predates the Intel chips. And there are alternatives for those who want more power: Alfred, Raycast… even the Windows Start Menu has a search bar now.
It’s always been second-class though. A power feature, for power users, hidden away until called out through an incantation. Why shouldn’t your entire computer be text-driven, rather than mouse-driven? Or, more fairly, why can’t we have both, at the same time?
The average computer user already spends most of their time interacting with their computer through the keyboard; it’s just that they open an application or website with their mouse first, and then use the search box.
I’d like to invert this. Text first, and then graphics as required.
By flipping this relationship, we also flip one more thing. Nowadays, a user is application-minded: they first decide which application to use, and then use it to accomplish a task. By moving to text-first input, in tandem with affordances to guide the user to the correct tool, we may be able to start with the task to accomplish, and then, as we type, decide the application (or set of applications) best suited to solve the problem.
And while we’re living in dreamland, let me go a little further. My web browser has tabs. It has bookmarks. It makes me very sad that I am effectively running a second, superior operating system inside my first, with features that are missing from the outer OS. Let’s collapse them! The OS needs bookmarks, it needs tabs, regardless of the situation. If I’m talking to a friend about meeting up at a pub later, let me put the chat in a tab, right next to the map that gets me to the pub. Let me put the bookmark for my email client (currently in my Dock) right next to the bookmark for my bank account (in my Firefox bookmarks bar).
I don’t know how to fully express my dream for a text-first, task-first desktop; I’m a writer, not a graphic designer, and ironically, words seem lacking here. But perhaps some part of what I’ve written will resonate with someone, and we’ll be able to live in a future where the gulf between command-line nerds and the average computer user isn’t so wide.
This idea is free as in birds
If you like this idea, it’s yours. While I’d be happy to discuss it with you (and please get in touch!), and there’s even a non-zero chance I might get on board, it’s more likely I’ll wish you the best of luck, help a little when I can, and tell everyone I know how wonderful you are.
You can read more ridiculous ideas by browsing the series:
- Starting from scratch
- Structured archival, and the web as it once was
- Search is broken
- Prompts already won
A big thanks to Ian Smith, Ross Wintle, and Sara Joy for reviewing this, giving me some incredible ideas, and generally being a lot smarter than me.
This article is licensed under the Creative Commons Attribution 4.0 International Public License (CC-BY-4.0).