Windows Are Evil

What are windows to the user? Nothing. Windows are nothing. They're not business objects, objects of interest to the user. Nor are they actions, operations of interest to the user. Windows are interface artifacts and of no interest whatsoever to the user.

Windows are overhead, and since overhead is bad, it follows that windows are bad. Or rather, they wouldn't be so bad if:

windows required less than massive overhead on behalf of the user
windows were difficult to eliminate
windows didn't mangle the UI's system metaphor until it was lying dead and bloody
the means of eliminating windows weren't so very well documented

First things first. What exactly is the overhead associated with having windows? That one's easy: it's having to manage them. Managing windows is so widely recognized as an intolerable strain that numerous applications come with special features and concepts for the sole purpose of managing windows.

I refer to tabbed browsers like OperaBrowser and text editors like NoteTab. At this time, my computer has exactly 13 top-level windows, 8 of which contain subwindows (tabbed pages), for a total of 86 windows. Setting up this galaxy of windows was a serious pain, and modifying them in any significant way is too painful to contemplate. And that's WITH tabbed windows to make managing windows "easier"!

But in fact, creating interface artifacts so the user can manage other interface artifacts is just plain idiotic. It is exactly like trying to dig deeper to get oneself out of a hole. The correct solution is to eliminate the interface artifacts in the first place, to get rid of windows.

Windows are actually very easy to eliminate. You start by switching to 3 dimensions in order to have enough room for all the user's BusinessObjects. Then you get rid of manual placement since it is utterly vile (see AutomaticVsManualPlacement). Once that's done, you clean up by getting rid of applications, which are yet another interface artifact.

You get rid of applications by turning all the nouns (e.g., "blank document") into genuine BusinessObjects in the system, turning all the verbs into context menu actions, and getting rid of anything that's neither a noun nor a verb acting on a noun. As one example, "opening" or "saving" a file needs to be eliminated since "files" aren't business objects and "saving" a movie or song is intrinsically meaningless. In contrast, publishing a new object version (e.g., after editing a wiki page) ought either to become automatic or to be handled by turning individual versions into full-fledged objects directly manipulable by the user.

(The only legitimate actions that don't act on business objects are translations and rotations in the 3D space. And those aren't the purview of the "applications" programmer. Every other verb either acts on an explicit or implicit business object of some kind, or needs to be eliminated. For example, "italic mode" operates on the cursor object.)
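To make the noun/verb split above concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the names BusinessObject, WikiPage, ACTIONS, verb and context_menu are hypothetical and do not come from any existing system. Nouns are plain objects, verbs are registered per noun type, and the "menu" is whatever verbs apply to the object under the pointer. Editing publishes the previous text as a version automatically, so there is no "save" verb at all.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BusinessObject:
    """A noun the user cares about (hypothetical base class)."""
    name: str


@dataclass
class WikiPage(BusinessObject):
    text: str = ""
    versions: List[str] = field(default_factory=list)


# Verb registry: noun type -> {verb label -> function acting on that noun}.
ACTIONS: Dict[type, Dict[str, Callable]] = {}


def verb(noun_type: type, label: str):
    """Register a function as a context-menu action for one noun type."""
    def register(func: Callable) -> Callable:
        ACTIONS.setdefault(noun_type, {})[label] = func
        return func
    return register


@verb(WikiPage, "edit")
def edit(page: WikiPage, new_text: str) -> None:
    # No explicit "save": every edit publishes the old text as a version.
    page.versions.append(page.text)
    page.text = new_text


def context_menu(obj: BusinessObject) -> List[str]:
    """Verbs offered when the user right-clicks on obj."""
    labels: List[str] = []
    for klass in type(obj).__mro__:
        labels.extend(ACTIONS.get(klass, {}))
    return sorted(set(labels))
```

A verb that is registered for no noun simply never shows up in any menu, which is one way to read "needs to be eliminated".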

Having shown how easy it is to eliminate windows, let's go on to discuss the effect of windows on the UI system metaphor. The only reasonable system metaphor for a general-purpose UI that handles large numbers of objects is some kind of DirectManipulation AlternateReality. An alternate reality (or AlternateRealityUserInterface) is a notional space which obeys the key rules we are familiar with from our reality. These include:

spatial unity
solidity or spatial exclusion
locality
unity of identity
persistence
spatial connectedness or visual locality
completeness

Windows destroy the alternate reality by violating most of its key principles. In order:

the taskbar provides a secondary space for windows completely dissociated from the primary desktop space; it isn't even the same kind of space!
windows in the taskbar merge together into groups
you can have multiple windows open to machines that have no logical connection to each other
you can have more than one window open on the same object (the worst part is it's inconsistent; you can't create a window on an object over which you've got a window open)
windows outright disappear when they are "minimized"
you never see a connection between windows even when the business objects they're at are connected to each other
the windows' interface cannot be a part of the alternate reality so every time you handle a window, you're stepping out of the alternate reality

Finally, the literature has plenty of studies and examples of ZoomableUserInterfaces. All of the key elements have been studied extensively. Any competent engineer, taking account of known research, will know that you can eliminate windows and will have at least some idea how to go about it.

This would be sufficient cause for condemnation without the fact that window-less interfaces can be easily found in mainstream systems. See NoApplication for examples.

--RichardKulisz

Much of the above feels weirdly resonant with my own desire to apply what we have learned from game UI design to business applications, but it is expressed in a very different (and probably more useful) way than I have managed up to now. Perhaps the origin of the thought dictates its expression. --CraigEverett

"saving" a file needs to be eliminated

I have seen an application (MoonEdit) that effectively saves continuously, and provides a scrollbar-esque widget (note: may be evil, but that's a side issue) that allows you to run backwards and forwards through the complete document history and undo/redo at the granularity of individual keystrokes. Is that the general sort of thing you mean? -- EarleMartin

: Short of TransparentPersistence, yes.
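The continuous-save idea discussed above can be sketched in a few lines of Python. This is not a description of how MoonEdit works internally; the class ContinuousDocument and its methods are hypothetical, and storing a full copy of the text per keystroke is an assumption made for brevity. The point is only that a document with a complete history needs no "save" operation, just a position on a slider.

```python
from typing import List


class ContinuousDocument:
    """Keeps every keystroke-level state; there is no save operation."""

    def __init__(self) -> None:
        self._history: List[str] = [""]   # state after each edit, oldest first
        self._position = 0                # index of the state being shown

    @property
    def text(self) -> str:
        return self._history[self._position]

    def type_char(self, ch: str) -> None:
        # Editing from an earlier point discards the states after it,
        # as a conventional undo/redo stack would (an assumption).
        del self._history[self._position + 1:]
        self._history.append(self.text + ch)
        self._position += 1

    def scrub(self, position: int) -> str:
        """Move the history slider; position is clamped to the valid range."""
        self._position = max(0, min(position, len(self._history) - 1))
        return self.text
```

Keeping the discarded branch instead of deleting it would turn individual versions into objects in their own right, which is closer to the versions-as-objects idea raised earlier on this page.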

"... getting rid of anything that's neither a noun nor a verb acting on a noun." What do you do about verbs that act on more than one noun simultaneously? Why should the UI suffer from the same (truly stupid) limitations of typical programming object models? (Current UI conventions don't have very good answers to this question, either.) -- DanMuller

Please give concrete examples. In the case of a selection stack, where there are multiple selections, the standard "pick up object" operation actually means "(top selection) pick up: (object or selection)" and so has two arguments, the first implicit. Perhaps if I knew enough such examples, some general rules would emerge. -- RK

Say I have two different editors, or two different compilers, whatever, and a file that can be processed by either of the two programs. (E.g. two C compiler implementations; or a WYSIWYG editor and a text editor for a runoff-like language.) Now I want to Edit (or Compile) the File with EditorA (resp. CompilerA). That's a verb and two nouns.

And what about verbs without nouns? Aren't these the essence of the handy shortcut? If I do X with Y often, I want to define a new verb that encapsulates this notion. Just like in verbal language -- if I tell someone at work that I'm rebooting, they're going to assume that I mean I'm rebooting my computer, without having to be explicit.

I just don't buy into the absolute primacy of nouns over verbs.

-- DanMuller

If you want to have verbs without nouns, then you define them as actions performed on the void. So instead of selecting an object and right-clicking on it, you select nothing and right-click on the nothing. This is the equivalent of methods defined for 'nil'. And all without the ludicrous nouning of verbs so common in WIMP (turning actions into icons).
If you want to have transitive verbs (keyword methods) then you can provide the arguments on a stack. Or you can curry them. In this case, you'd provide "compile with A" and "compile with B" as predefined verbs. If the argument actually varied significantly, which it does not in this case, then the right way to do it is to leave the argument dangling.
So to do '(canvas paint: brush) with: colour' you'd select the canvas or region, then you'd right click to select paint from a menu. Once you had a brush, you would right-click on the brush (access the reflective actions in the right click menu) in order to change the brush style. Then you'd right click on the brush strokes to change the colour they're painted with.
In the case of editing, you're actually talking about two different representations of objects, and not actions that can be performed on objects at all. Every object has a representation and a uniform way to switch between representations which for uniformity's sake may be an action, but this is only a convention. -- RK
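For readers who want to see the "actions on the void" and "curried verbs" ideas side by side, here is a small Python sketch. Everything in it is hypothetical (MENUS, right_click, compile_with, the compiler names); functools.partial stands in for currying, and None stands in for "nothing selected", much like methods defined on nil.

```python
from functools import partial
from typing import Any, Callable, Dict, Optional


def compile_with(compiler: str, source: str) -> str:
    """Two-noun verb: compile a source file with a named compiler."""
    return f"{compiler} compiled {source}"


def shutdown() -> str:
    """A verb with no noun; it lives on the void's menu."""
    return "shutting down"


# Menus keyed by the kind of selection; None stands for "nothing selected".
MENUS: Dict[Optional[str], Dict[str, Callable[..., Any]]] = {
    None: {"shutdown": lambda _selection: shutdown()},
    "c_file": {
        # Partial application: fix the compiler now, leave the file
        # to be supplied by whatever is selected.
        "compile with A": partial(compile_with, "CompilerA"),
        "compile with B": partial(compile_with, "CompilerB"),
    },
}


def right_click(selection: Optional[str], kind: Optional[str], verb: str) -> Any:
    """Perform verb on the selection; kind None means the void."""
    return MENUS[kind][verb](selection)
```

For example, right_click("fileA.c", "c_file", "compile with A") applies the predefined verb to the selected file, while right_click(None, None, "shutdown") is an action performed on nothing.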

I think (though I'm not sure) that the preference for "noun verb", versus "verb noun", order reflects the statistical prevalence of nouns over verbs -- there are many more things to be acted upon than actions to be performed. The problem, then, is that the verb-noun order requires each verb to handle all possible nouns. The information that a particular noun has been chosen limits the set of meaningful verbs far more than the converse. The noun-verb order doesn't preclude supplying an additional argument. Thus, for your example above, you could readily say (using Smalltalk syntax):

fileA editWith: plainTextEditor
fileA editWith: wysiwygTextEditor
fileA compileWith: compilerA
fileA compileWith: compilerB

This could just as easily be expressed as:

plainTextEditor edit: fileA
wysiwygTextEditor edit: fileA
compilerA compile: fileA
compilerB compile: fileA

Now, note that in the latter example the need to discriminate between "edit:" and "compile:" has disappeared -- and thus the expressions can be reduced to:

plainTextEditor openOn: fileA
wysiwygTextEditor openOn: fileA
compilerA openOn: fileA
compilerB openOn: fileA

These expressions are readily expressed as gestures in a graphical user interface -- for example, dragging an icon representing fileA onto icons representing the four tools. Do you have some problem with the user interface this implies? -- TomStambaugh

Yes. In fact, I have many, many problems with it.

For one thing, 'openOn:' is an intrinsically meaningless term. You might as well rename the operation 'doSomething:', it is that meaningless.

When you decompose your HCI sequence, it becomes obvious that it's impossible to have alternatives to 'doSomething:' unless you accept equally meaningless sequences such as drag and shift-drop, drag and alt-drop, drag and ctrl-drop, and so on. Of course, nobody would ever remember what these sequences do, which is why they're not widely implemented. So your 'doSomething:' (drag and drop) operation is meaningless (it can't not be there, nor can it be replaced). It's gratuitous syntax that carries no informational value and serves no purpose higher than separating the noun and the verb.

For another thing, it's impossible to create a toolset that includes all useful tools. You end up with multiple palettes of toolsets floating around and having to be managed. This management is overhead.

For a third thing, it's impossible to figure out what tools can handle a particular object. You're forcing the user to consciously manage object types. And if you help them along by greying out those tools that don't respond to the object the user picked up, then this only creates non-locality in the UI.

For a fourth thing, the HCI for drag and dropping an object on top of another object is simply more involved than right clicking on an object and having a choice of actions you can perform on it (edit or compile). You're creating unnecessary overhead, more overhead than there is in WIMP.

In your scheme, the toolsets exist whether or not they are needed and they require manual placement onscreen before you can drag an object onto one of them. In my scheme, a menu doesn't exist until it's needed, at which time it's automatically placed onscreen for the precise duration that it is needed, and nothing needs to be dragged onto it for it to work.

For a fifth thing, it's impossible to use a second or third argument with the drag and drop scheme you propose, so it's not like it's got any intrinsic advantages to it. The only way to add further arguments is to do it my way, with a stack of objects so you can push the second and third arguments on the stack before performing the action. Or best of all, to curry the actions so that 'paintbrush paintOn: canvas with: colour' becomes '(paintbrush paintOn: canvas) with: colour' or '(paintbrush with: colour) paintOn: canvas'. WIMP uses the latter, but the former is infinitely more useful. And most useful of all is '(canvas paintUsing: brush) with: colour'.

Finally, the reason why '(canvas paintUsing: brush) with: colour' is the right order of operands is because in the user's mind canvas is least variable and colour is most variable. So when the user explores different options, they're able to change most easily exactly those operands they want to change most.

The same applies to 'fileA compileWith: compiler'. If you select fileA first, you can choose the compiler very easily. And if you want to compile a bunch of files using the same compiler then you select them all as a set. What choosing the compiler first implies is that the user will change their mind about what file to compile. Something along the lines of "Hmmm, I think I'll compile fileA ... no, I'll compile fileB first ... no, wait, maybe fileC ... no, I think fileA was best after all". This is ludicrous.

Order of operations matters and only interaction designers can determine the right order of operations because it's an interaction design problem.

-- RichardKulisz
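The operand-order argument above, '(canvas paintUsing: brush) with: colour', can be illustrated with nested closures in Python. This is a sketch only; paint_using and the string results are made up for the illustration. The least variable operand is fixed first, so trying out the most variable one costs a single step each time.

```python
from typing import Callable


def paint_using(canvas: str) -> Callable[[str], Callable[[str], str]]:
    """Curried form of '(canvas paintUsing: brush) with: colour'."""
    def with_brush(brush: str) -> Callable[[str], str]:
        def with_colour(colour: str) -> str:
            return f"{canvas}: {brush} stroke in {colour}"
        return with_colour
    return with_brush


# Fix the least variable operands once...
stroke = paint_using("canvas1")("round brush")
# ...then explore the most variable one as often as needed.
attempts = [stroke(colour) for colour in ("black", "red", "blue")]
```

Reversing the nesting would force the user to re-supply the canvas and brush for every colour they wanted to try.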

Sorry, but that 'action on void' thing seems to me like the sort of non-intuitive artifice which is also common to programming in object-centric languages. Some people collect a lot of short-cuts (which is what the currying amounts to); the actions available on 'void' could become quite cluttered over time. It also seems reminiscent of one of the weirder things I often do in the WinXP interface: right-clicking on an empty area of the task bar to get to the Task Manager menu item. That just always seems wrong, especially at those times when I have trouble finding an empty spot on the task bar.

I don't know what you mean by 'leaving an argument dangling'.

BTW, I didn't make a 'proposal to use verb-nouns'. (And I wonder what you thought I proposed!) I just asked how you would handle some situations that seem common to me.

You're right that in the compilation example, it makes more sense to pick the file or files first, since there are few compilers; this is exactly the sort of thing that you can set up in Windows Explorer by defining actions on file types.

But it also sometimes makes sense to go straight for an action, often a user-defined one (a la currying, as you characterize it). In such cases, I really don't see the conceptual problem with objectifying actions. People do it whenever they write a verb in a sentence; it's not difficult to understand that the word represents an action, and this notion wasn't invented merely to support programming or user interfaces. It all just seems like trying to solve a problem that doesn't exist. In a very real sense, your action menus consist of objectified verbs, and all this talk is really just about the navigational/compositional paths among nouns and verbs.

(So now you could say that I've made a proposal, I suppose.)

I think that this sort of oversimplification of the base vocabulary will lead to more complexity in the long run. By trying to enforce just one way to do things, users will eventually have to work around the one way when they have tasks that don't fit what you anticipated, or what you thought was exclusively important. But then, I would think so, since I also think that multi-paradigm programming is natural and convenient. :)

-- DanMuller

No, in a very real sense, my action menus are non-objectified verbed verbs. As opposed to the nouned verbs of icons.

Users always have to work around a paradigm. The important thing is that the paradigm handles 99% of the common cases. WIMP does not. Neither does verb-noun or object-verb-subject.

By leaving an argument dangling, I mean that you make up some kind of default and let the user change it after the fact. So if you have a paint program, you get the user to paint using black brushstrokes (or whatever paint they used last) and let them change the colour of the brushstrokes after they've been made. This is also exactly what File Dialogs do: they make up a directory to browse in and let the user change the directory to browse after the fact. This directory is either the desktop or whatever directory was last used.
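A minimal Python sketch of a dangling argument, under the assumption that "whatever paint they used last" means the most recently chosen colour. The names Stroke, PaintSession, paint and recolour are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stroke:
    points: List[Tuple[int, int]]
    colour: str


class PaintSession:
    def __init__(self, default_colour: str = "black") -> None:
        self.last_colour = default_colour

    def paint(self, points: List[Tuple[int, int]]) -> Stroke:
        # The colour argument is left dangling: the session supplies it.
        return Stroke(points, self.last_colour)

    def recolour(self, stroke: Stroke, colour: str) -> None:
        # Changing the argument after the fact also updates the default.
        stroke.colour = colour
        self.last_colour = colour
```

The user never has to name a colour before painting; the stroke exists first and its colour is an ordinary property that can be changed afterwards.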

The WinXP taskbar behaviour actually makes sense given the constraint that they don't have zooming and so can't differentiate between selecting an object in the taskbar and selecting the taskbar itself. It's the least bad option and not even a bad one at that.

You're saying that people collect shortcuts. This is true. 99.99% of shortcuts people collect are to objects or places. In other words, to nouns.

I don't know about you but I've got zero scripts or argument-less programs in either Windows or Linux. The only thing that even approaches an argument-less script is 'shutdown' and that's a script that shouldn't exist since the power button on my computer should be sufficient for that purpose. And if it's not, then an action like zooming out of what's onscreen should be sufficient to save everything so that I can just turn off the power on my computer.

All those shortcuts on the desktop to programs? They'd be infinitely more useful if they were inverted to point to nouns. It would be more useful to click on 'savegame 1 of Quake' than the usual 'Quake' then 'savegame 1'. VisualWorks doesn't allow you to "load" an image after the fact; you run an image and that's it. The only reason applications provide save/load is to work around the fact that operating systems have incredibly lousy sharing and caching of software code.

You're optimizing for functional-oriented people when 99% of the population is object-oriented. You claim that people collect all of these argument-less scripts, but you've yet to produce any evidence of it. If you want to provide evidence to support your point of view, all you have to do is:

produce a domain where the user changes functions less frequently than any of the arguments to those functions so that VSO order makes sense
produce a large set of functions where the Subject is implicitly known without the user having to specify it somehow so that VO order makes sense

Until you do this, it makes sense to heed the fact that out of six possible orders, most natural languages are SVO and most sentences uttered in those languages are SVO.

And actually, I question the second set of possible evidence you might bring. It makes a lot of sense to me to have 'shutdown' be available only by right-clicking over an object that represents the entire system. The advantage of this is obvious: it demystifies 'shutdown', metaphorically cracking open the black box, and aggregates it together with other actions that operate on the same objects. And if there aren't any such actions, it compels the programmer to think long and hard about providing some.

-- RK

Why not let the user make the decision? Start by supporting both noun-centric and verb-centric interactions. Track usage over time and allow unused interactions to wither and possibly die. -- EricHodges

You can't support both without either massive redundancy or massive non-uniformity. The cost of supporting both is too high, and at least part of this cost would be paid by unwilling users. 99% of interaction is noun-centric, 99% of people think in a noun-centric manner. Remember, choice is a bad thing when you can't justify it. -- RK

How is offering a choice equivalent to imposing a paradigm?

Your point about many shortcuts being objects is true enough. But when an object has a default action associated with it, and a user's interaction is to double click on it in order to do something, then how is the user thinking about it? I think it can vary. I know that when I want to do the household bookkeeping, I click on a shortcut that starts Quicken using the household database. I always use the same database; the only time I think about multiple databases is when I'm making a backup copy. So the conversation in my head, to the extent I can identify one, is more like "computer, let's do bookkeeping". The subject is defaulted and not involved in the conversation at all.

So that shortcut to a file, with a defaulted action -- does the user double-clicking on it think in terms of "computer, do something", or "computer, take this file and do something"? I say it depends on the task. And I don't think people have a problem with that; but maybe designers do. :)

BTW, the Quake example is a bad one, because usually you're thinking "save my game", and only afterwards do you think, "oh yeah, I need a name for it". And usually the name is a new one, so the object doesn't even exist in advance.

Also BTW, I am sceptical of the truth of your statements about SVO order and natural languages, or its relevance. Consider that most of the user's conversation with a computer consists of imperatives; the user wants the computer to do something for him, and tells it so.

And furthermore, please stop mischaracterizing what I'm saying. I'm not optimizing anything for 'functional-oriented people'. I'm claiming that people's conversations with a computer aren't as one-dimensional as you're making out.

-- DanMuller

One dimensional?? Now who's mischaracterizing things?

You'd be wrong about the subject being "defaulted" whatever that's supposed to mean. You'd also be wrong about characterizing things in terms of "the conversation in the user's head" which is an oxymoronic absurdity. A conversation has two parties, in this case the user and the computer. That's why how you think of things simply doesn't matter.

The only sensible way to characterize the HCI problem is to ask oneself what's the most expressive way to carry on a conversation between the human and the machine. And that's natural language SVO. Which is why SVO is deeply relevant and why your 'imperative' mischaracterization of the GUI conversation is irrelevant.

What you call "the default action" associated with most objects isn't an action at all. You might have guessed that given my derisive attitude towards the entire concept of "opening" a file. The notion that a user should need to explicitly request an object be represented is a holdover from CLIs which has no place whatsoever in a modern GUI.

And while I'm on this topic, I'll note that CLIs are very imperative but guess what? GUIs are supposed to be better. Yes, better. Not "different" but better.

Dan, one does not "offer" choices, one imposes choices on users. What one offers is options. And when those options are useless, then imposing a choice is unconscionable.

As for Quake, Thief has a better handling of savegames. When you save a game, it creates the title for you. And BlackAndWhite is better than that since it automatically saves the game when you quit. And this is hardly the best way to go about things.

The best way to go about things is to pause and snapshot the game when the user zooms out, and let them access game-wide functions when they can deal with the game as a single object. At that point, they can copy the savegame like they can any other object, ensuring they don't clobber it when they reenter. So why don't games do that?

-- RK

I prefer buttons over context menus since it takes one click instead of 2, and can use a keyboard shortcut (Alt-C to close). Thus my navigation speed is trebled and my carpal tunnel is halved.

Context Menus also use keyboard shortcuts. Also, the buttons which will typically be one click away are also typically toolbar buttons. The increased precision required to use them accurately is not necessarily a win from an ergonomics view.

CategoryInteractionDesign, CategoryUserInterface, CategoryRant (in a major way)