Object Browser

An object browser is a non-lobotomized version of a file or web browser. It's a framework that lets users view and manipulate a general object graph and calls special code (variously called inspectors, viewers or type handlers) to handle the presentation of different classes of objects. For the sake of power and extensibility, object types form a hierarchy and type handlers can be freely registered with the object browser at runtime.

Clearly, this is the way web browsers are evolving with plugins, mime-types and XML. However, the initial choice of HTML has irredeemably crippled them, saddling them with extremely coarse granularity, no security whatsoever, unidirectional links, and many other architectural flaws. Additionally, no significant innovation has been made in web browsers since Mosaic.

An object browser is to GUIs what a shell is to CLIs. A well-designed object browser would be strictly more powerful than the most powerful CLI. This just goes to show that nothing significant has even been attempted in the field in the last decade.

(is SelfLanguage's UI an object browser? I highly doubt it)

Automatic Placement

One of the key decisions in the design of an object browser is whether object placement is automatic (enabling free and smooth navigation) or whether the user is forced to place objects manually. Like garbage collection, placement of objects is not a key user concern and detracts from the user's key concerns. (The user's key concrens are pathfinding and locomotion through the graph, and the interpretation & manipulation of objects & the graph). And just like garbage collection, an object browser should place objects automatically. See also AutomaticVsManualPlacement to read about the numerous and well-documented problems with manual placement. Most systems use manual placement, games frequently use automatic placement.

Given that an object browser automatically places objects, we can now discuss what an object browser does more precisely. An object browser automatically represents the topology of the graph of objects using a certain geometry. An object browser also maps any geometric manipulations of the representation onto a set of topological transformations of the underlying graph.

What that means is that an object browser shows you what the graph looks like and then lets you drag and drop objects on your screen to make changes to the underlying graph. So when the object browser displays an object over here and another over there, then that's because it means something in terms of the underlying graph. And when you move an object from over here to over there, the object browser automatically translates this into an operation over the underlying graph. And when the object browser can't make such a translation for whatever reason, then it simply won't let you move the object; as soon as you let go, it will just move it back.

(The three reasons why an object browser can't make such a translation are because there is no difference between over here and over there, because the translation is ambiguous, or because the operation is simply not permitted to the user for whatever reason.)

To summarize, a 3D model of an object graph does not qualify as an object browser. An object browser must automatically place objects on the screen, and not allow users to freely move objects around willy-nilly. If the UI does not automatically place objects, if the repositioning of objects doesn't mean anything in terms of the underlying graph, then the UI is a space or viewer but does not facilitate browsing sufficiently to be called an object browser. The ways and means by which people interact with a viewer and a browser are so different that the two interfaces deserve different names.

See also ObjectBrowserInterface?

I know that there's fairly widespread interest in any such thing that would be even as powerful as good CLIs, let alone better, but they've had a problem with designing such a thing. Do you have an idea for the design that would achieve that effect?

A GraphicalProgrammingLanguage for composition (scripting) and good olde text for the deeper programming. The key is to fully integrate the text into the GUI, much the same way that it was done in Self objects, instead of keeping the two modes independent like Unix's standalone xterms. Put another way, you have to get rid of the xterm app at the same time and in the same way as you get rid of all other apps.

What would make an object browser more powerful than a CLI are two things:

ability to easily represent and manipulate non-linear flows of control / information, showing all branches in a uniform manner and without any misleading precedence
ability to easily represent and manipulate live objects, allowing a programmer to stick a live object in the source code without any special literal representation

Note that the first isn't a problem in Smalltalk precisely because the language uses an object (class) browser.

There's more of course. In an object browser, the mouse gestures replace the shell commands. Therefore, mouse gestures must be fully programmable. It must be possible to right click somewhere, pin the context menu in place, and rearrange it however you want; deleting the junk commands that were put there by dumb applications and adding your own custom commands.

And then there's all the other things that any good GUI should have, such as 3D, ZoomableUserInterfaces, and in general the second half of NewOsFeatures.

This is what I wrote on the subject a just a month back:

Now, as for representing actions. The best way to go about it is as Gestures, again as in BlackAndWhite. So after selecting the two objects you want to compare, you would trace a glyph to trigger the action. The current right-clicking of the mouse to bring up a menu is just a special-case of this. This is the right way to do commands and it is used by bittorrent and daemon-tools under the rubrik of "shell extensions". JefRaskin goes this route in TheHumaneInterface.

Obviously, any command/gesture system must be reflexive, so that you are able to create, delete or reassociate commands at your whim.

Not so obviously, commands should be reified as objects, as scripts. But these scripts should NOT be represented as they currently are in CLIs, as monolithic objects isolated from what they run on. They should not, under any circumstance, be unbound and disconnected as they are in Unix. Neither should their bindings be hidden or disguised as they are in Smalltalk and Self (thisContext and self respectively). Rather, scripts must be bound to concrete and visible objects. The script's input and environment must be concrete objects that can be accessed from the script, and vice versa.

While gestures (including mouse gestures) are nice 'n all, I doubt the assertion that they are the best way to represent actions. Ultimately, all that matters is providing a powerful human-to-computer interface. And, while gestures are one means to this end, they are neither the only means nor provably the best. Also, your argument that they must be bound directly to underlying objects is indefensible. To do such would hinder the creation of such things as mobile agents, and of context-based command languages provided by some sort of UserAgent?. I do agree that a particular, running instance of a script would ideally be represented as a local object that one may observe (in addition to having a place in something like a task manager, which is probably just another object browser).

: The distinction between 'commands' and 'scripts' is clarified below, and results in an understanding very close to your running instance of a script: an instance, but not necessarily running.

Since commands operate on standard input (a selection of objects represented by the top node of a graph), the commands must all be bound to the same predefined location where the object browser will then create that input; "user interface >> input >> selected objects" should do nicely. Commands are then just blocks of code taking one argument, the top node of a graph, performing some arbitrary operation on the graph and returning a tooltip value. Sort blocks are special in that they don't operate on selections but on the graph itself and return an ordered collection for the object browser to display that node of the graph.

One way of representing scripts within the object graph is for the object browser to run them when the user zooms into them. This would work since "time delay", "pause for user input", and myriad other things would be concrete objects in the system. However, it assumes that scripts are coded entirely in a GraphicalProgrammingLanguage and that there is no textual source to see. A more general solution is to treat scripts as blocks and provide them with a value command.

So how do you visually represent a script coded in a GraphicalProgrammingLanguage? The default representation of a container linked to sub-objects works. If users want a more sophisticated representation, they'll be able to change it.

One interesting question when you're dealing with pipes / scripts / queues, is whether to represent them as arrays or as linked lists. And I don't mean this at a low-level, but rather at the user interface level. Because you could just as easily do either or both. If you do the former, then a pipe is a container within which are contained sub-objects 1, 2, 3 and so on. If you do the latter then a pipe is a sequence of links between multiple objects linked with each other using "input.pipe" and "output.pipe".

'Objects References

Just to clarify the lexicon, this scheme presupposes:

primitive objects
primitive references

The object browser provides the object type abstraction.

Messages are those requests sent to primitive objects and references by the object browser (ie, exist at the base layer of abstraction). Scripts are an object type the object browser knows about. Commands are scripts bound to standard input and output (ie, exist at the higher layer of abstraction). Meanwhile, methods are what's used to implement the primitive objects and references (ie, exist beneath the lowest layer of abstraction).

While the graph itself must be prototypical, it seems necessary to separate classes (class handlers) from instances (object types in the graph). Or perhaps that's just an illusion since the only separation necessary is between objects and their representation. If Self provides this separation then it must be consistent with prototypes. But I don't know the details of how NakedObjects works.

Visits Actions

To be at all useful, an object browser must provide facilities to create, manage and organize the following:

visits
actions

Note that most web browsers provide a visits stack but not an actions stack. Meanwhile, the RefactoringBrowser provides an actions (undo / changesets) stack but no visits stack. Further, nobody implements them as CactusStacks nor provides facilities for managing multiple such stacks.

Other necessary facilities include bookmarks and searches.

ALL of these facilities must be reified as first-class objects in the object graph so that users can store, manage, organize and publish them in the same way as they would any other object. Currently, only internet explorer reifies bookmarks as first-class objects, RB reifies actions in a limited way, and nobody reifies either searches or visits.

Yep, good stuff. Motherhood and apple pie. One wonders why these lacks are so prevalent. -- dm

I don't. Programmers are retards. That's my answer to everything. And the only evidence I need is that C++ exists.

(Literally, people lack imagination. They're satisfied with the present circumstances because they can't imagine anything different, they can't imagine what the system is lacking unless someone tells them what to imagine. The vast majority of people aren't original or even independent.)

(Take a look at http://www.chris-lott.org/misc/kaiser.html, maybe you'll find the answers you're looking for.)

Do you know of references to this kind of stuff? Cause except for the terminology, everything here was quite old for me. -- rk

Now come on. Lots of things exist that are worse than C++; though few if any of 'em are as popular. And C++ pays the bills for me... :) As far as why the aforementioned motherhood-and-apple-pie isn't as prevalent; similar systems (not as advanced as RK proposes, but more along those lines certainly than InternetExploder is) have been developed - and haven't generally fared well. Part of it is undoubtedly economics; another big part is programmer ego; another big part is most such systems didn't make it out of the prototype stage. And a final big part is that many users probably didn't care. ObjectBrowers?s are great for programmer types, but if not carefully done can burden Grandma with too much detail.

Here's an idea, though, for an end-user-friendly object browser. As part of the metadata for the methods for each object, have an "expertise" level. Things marked "beginner" or such are on the top-level menu of the browser; things mark "advanced" go in a submenu. In the default configuration, of course; advanced users can get a flat listing (and apply filters based on the metadata) if they want. But have a mode which Grandma can sensibly use.

If that's your proposal then you really don't understand what an Object Browser is meant to be.

To understand the complete generality and power of an object browser, imagine a system with just a few object types:

container
movie
music
document

That takes care of 90% of everybody's day to day computer use right there. And you know what, there is no such thing as a "method" in an object browser, only commands. And do you know why? Because methods exist at a low level of abstraction while commands exist at a vastly higher level of abstraction so you need different names for pretty much the exact same thing in order to keep yourself from getting hopelessly confused. In particular, in order to not get the mistaken impression that there would be any commands available to the average user that would do freaky stuff which the user has no hope of understanding.

Of course, you could integrate the two levels to some degree. If you bring up a context menu on an object, you could right click on a command to bring up a context menu on the command. And zooming into the command should give you access to its details, which means the source code for it.

(Organizing context menus by difficulty is interesting but I'm certain there's gotta be a much better solution. Probably something along the lines that commands become all but invisible after the user has used the associate glyph, or key combination, a couple dozen times. The more advanced a user is, the LESS stuff they should see.)

If you wanted to browse code then that's slightly different because it's not sufficient for your component (ie, part of the object graph) to provide generic read and write. Rather, the component has to provide a special read operation called execution (in a VM of course). So you've got your choice of implementation. You can do it the Self way, where execution is the default read operation and you use mirrors to access meta-operations (eg, passive read) on objects. In that case, displayOn: frame would just be another method continually read by the Object Browser. Or you could have a more traditional arrangement where execution is a specialized form of read. The former is the right way to go since it allows you to use any arbitrary editor to read, run and modify objects.

So anyways, when you create a .method object, the graph it attaches to has to be somewhere. And that somewhere is either with that object (so you'd be creating an .image subgraph) or they're in the .method type handler (a VM + shared libraries set). And without extremely good security, you couldn't have shared libs at all. Though that wouldn't stop you from automatically duplicating the libs.

References to programmers being retards? :-) Well, there's ThePsychologyOfComputerProgramming, an oldie but a goodie which I can't resist mentioning, even though presumably that's not the sort of thing you meant.

I'm not sure what you actually meant; references on what? "Make everything first class unless you've got a damn good reason" applies to all areas of programming, of course.

The thing is that software that we all see is dominated by what is easy to implement. Neither companies nor hobbyists tend to want to do a gem-like design, because it inevitably turns out to be hard to implement, so they punt.

Also it takes an unreasonable amount of effort to create user interfaces, so people tend to sigh with relief once they've gotten one working, and go into denial about its problems.

Another aspect of this is that many problems are glaringly obvious to generalists, but invisible to specialists. Yet most practitioners are specialists.

Hybrid Menus

You're not seriously telling me that this cobbled together crap I came up with hasn't been thought up a dozen times already, are you?

Oh, is that what you meant. No, I don't mean to tell you that. I know I've seen some mostly-old very sharp research papers that talk about this sort of thing, but I can't point to them this second.

Ah, wait...I remember one, on the topic mentioned above about expert vs beginners. Background: a long time ago there were various CHI studies studying the tradeoffs of expert input/navigation versus beginner (CLI commands versus menus is the classic, although not only, example of that tradeoff). One set of choices is more efficient for experts, the other for beginners, according to the various measures studied. (Whether this is all absolutely true is irrelevant, I'm just setting up for the next study; the point is that it was the conclusion of many studies that tried to actually measure these things.)

Then a new study came out, which has been entirely ignored as far as I can see, that tried a rather novel hybrid. They allowed terse fast abbreviated expert commands, but also the prompted/menu-ed/etc beginner stuff, in the exact same points in the interface, and if the user used the beginner aspect of the interface (let's say, menu select "explore files"), it would show what sort of expert command that could expand into ("ls -l" or something).

The net effect was that all users were better off with this approach. Beginners understood the dumbed-down interface better by seeing it translated into the very precise expert equivalent in real time, and simultaneously became experts much faster, and once they were experts, they made fewer mistakes because they always had a memory aid right at their fingertips.

I presume this was ignored because CLI fans thought it said something good about menus, and menu fans thought it said something good about CLIs, so neither camp liked it. I think the actual application area was database search for bibliographic references, so "CLI" and "menu" is here used for illustrative purposes.

Some of the other topics you touch on I'll have to think about; quoting things I read long ago isn't always my strong point.

How can anyone ignore that? Coupling context menus and key combinations is such an obvious and self-evident principle of GUI design! I plan to have three-way coupling; menus, mouse gestures and key combinations.

I don't know how. But do note that it wasn't just coupling, it was also translation, to educate the user to do things more efficiently, that they claimed was important. I suppose Microsoft thinks that putting keyboard shortcuts on pulldown menu entries satisfies that - which is true in the least ambitious sense possible.

Jef would probably approve of the directions you're talking about, I think, considering how he usually talks. Yesterday he was asked for references to good CHI/UI books beyond his own, and he said he couldn't come up with any - which is my conclusion, too, after looking around for some months recently. I thought maybe I was just out of touch, but no...Come to think of it, it'd be great if someone just came out with a collection of all of the classic papers in CHI. There've been a lot of really good ones; I just never see them synthesized into a book.

Oh...and IvanSutherland's SketchPad really was as cool as claimed. His ancient paper about it is much smarter than 99% of what has been published since, and his software was more modern than a lot of what is available today. I seem to recall that he did "mouse gesture" thingies with the lightpen.

[Microsoft doesn't just show you how to do keyboard shortcuts within their menus, but also Alt-key-sequences. So when you press Alt-F-S for example in sequence, I'd call that a keyboard sequence. Sometimes, I find key sequences far more useful than shortcuts. I don't want to have to stretch my fingers in the weirdest directions to hold down 5 keys at once, so I just use a keyboard sequence (swapping caps-lock with the Alt key is also a good idea).

You said "least ambitious sense".. What do you mean by that? Personally, I feel that if they were any more ambitious about it, it would require too much overhead in the software. What would you suggest that they have? An abrupt pop up window telling you what to do every time you click something? Kind of like those annoying tooltips that start up at the beginning of applications?]

pop up window? Are you a fan of animated paper clips? No one else is.

He means that Windows doesn't show you the code that's run when you pick a menu item. In fact, you haven't the slightest bloody clue what it does when you pick a menu item, it's just black box magic, so there's no way to manually reproduce it. Showing alternative ways to invoke the same magic is the least ambitious thing you can do towards educating users.

Correct. -- dm
[Does 99 percent of the world actually give a brown flying fudge (FakeCussWord) about the code that runs their pop up menu? I can see how some open source obsessive compulsive lunatic might want an OS to display a tool tip popping up a message "here is the source code to the click you just did!" on demand. I can see how some people would prefer to peak into the sources of OberonSystem and its modules.. but the reality is 99 percent of people don't care. Most users would be greatly annoyed if they had to diddle with the source code to modify the operating system. That is what an API and abstraction is for. One could most likely extend the pop up menu using MSDN documentation anyway. In fact it is trivial to implement a new pop up menu with any developer tool - delphi, visual basic, you name it. Do the users care? No. Do 0.01 percent of the world care to see the source code of the menu item that just popped up in a particular application? Possibly. That is why Microsoft is so successful - because they don't target 0.01% of the market (the obsessive compulsive BrainDamaged people who think that we need to see the source code of the pop up menu immediately). When the Microsoft Windows source code was released accidentally to the public - no one cared. It was slashdotted and funny for about 3 hours. Then the joke wore off. Guess why? Because no one cares about the source code that runs the pop up menu. Of course the phrase no one is generalization.]

Oh, and your naivete would be sweet if you weren't so hostile. For instance, tooltips are a great idea when used correctly.

Well, well, now. Who is calling who naive and hostile? That is an act of hostility itself. No one claimed that tool tips were a bad idea. Specifically, the ones that come up in a pop up window that distort application are annoying - and people turn them off for this very reason. There are differnt kinds of tool tips - there isn't just one good kind. This wasn't ever about naivety or hostility - although the person who is calling those names, definitely is hostile and naive.
Indeed. They are much less intrusive than going into help mode to find what something means, along with the difficulty of looking up glyphs rather than words.
Even better would be a tooltip that didn't pop up in the center of the screen. Maybe a panel in the application at the bottom, or somewhere on the side.
That would be separating cause from effect, which is a big no-no.
- Apparently you are missing the point. Some tool tips are shoved in your face as a window when you start the application. Most people close it off and never let that window come up again since it is so annoying. If it was implemented as a graceful panel, or even a status bar tool tip notification system, or even graceful non windowed small pop ups - it would be much more useful than the tool tips of the time when this page was originally written on C2. At the time, tool tips were usually thrown in a users face at the start of an application in the center of the screen with an obtrusive window. This is a case of ViolentAgreement.
On the other hand, displaying them in-place occludes what you're looking at which is another big no-no. As long as theres a consistent, obvious place for "tooltip" information to be placed, I don't see a big objective difference. Like a lot of UI/interaction decisions, being consistent and obvious is at least as, if not more, important than exactly how a concept is implemented.

Objects States Capabilities

What is an object? Or perhaps, objects for beginners?

An object is a thing which possesses, notionally at least, some internal state. In an object-oriented system, such as Java is not, objects are contrasted with references (aka pointers) or capabilities which typically do not have state.

Even in those systems where capabilities do have state, this state isn't of interest to the user in the same way that the integer 455's "state" of being 455 is of interest. That's because capabilities are never, ever, business objects but only support objects. But it's more than that too.

Even in those systems where capabilities have state, the state of capabilities is still thought of by users as belonging to the objects which the capabilities point to, and not to the capability itself. The only time when the user realizes that a capability's state is separate from the object it gives access to is when they're see multiple capabilities that point to the same object simultaneously.

Since mainstream languages don't have capabilities and mainstream filesystems don't allow multiple capabilities to point to the same object (I include Unix) most users never even realize that capabilities are objects. So they're not.

Most (many?) modern filesystems (as of early 2005) do have this capability, but the OS support systems built on top of them do not leverage/expose it. Minor quibble.

Unix specifically excludes/suppresses it despite the fact that it's part of the filesystem. So it really doesn't matter that it's part of the filesystem.

CategoryInteractionDesign, CategorySourceManagement