March 13, 2013
How do modern GUI toolkits work?

I’ve been programming for several years now (Java primarily, with lots of Processing recently, as it is the language we use at the university where I work). I earned an undergraduate degree in computer science. Nowhere in my education did I receive any instruction on how toolkits like GTK+ or Qt work. As an HCI researcher I commonly use GUI toolkits, but if asked I couldn’t tell you how they work. Recently, I have found that to create the interfaces I design, I need to better understand the fundamental concepts of GUIs and window managers.

How do modern GUI toolkits like GTK+, Qt, or Swing work? Can you point me to papers or websites where I can learn? I understand this is a huge topic and there are different approaches, so links to architectural descriptions are fine. I’m willing to read a few different descriptions to figure out how they work. I’d like to avoid having to work through a library like GTK+ line by line.

Some of the specific questions I’d like to answer for myself are:

  • How do components detect interaction like mouse clicks, touches, etc.?
  • How does data flow from the hardware through the OS to the application?
  • How is the geometry of a component rendered? How can I tell if it is rendered on the CPU or GPU?
  • How are UI animations achieved? What is the render cycle for an animation?

I have looked around, but can’t find a good summary. There is lots of library- or toolkit-specific material out there. I am primarily interested in the core concepts that all toolkits have in common. Can anyone point me in the right direction?

July 16, 2012
When the GUI might kill the commandline

Wired magazine recently ran a piece discussing the role of the commandline and why, as computers develop and change, the commandline remains an important part of the interaction toolkit.

As an interaction designer and researcher, I’d like to offer a different viewpoint. To compare a GUI with a commandline interface (CLI), you have to evaluate the two interaction styles based on their fundamental design concepts. The modern GUI consists of buttons and icons that allow you to do a number of different things. In a sense, a GUI is an advanced control panel. You can click buttons and drag icons. Doing so has the effect of triggering a set of functions the developer has written. This is why we have the model-view-controller architecture. GUIs are simply a graphical layer that causes the execution of controller code. We click a button and the computer does something. Click the buttons in the right order and you are able to do more complicated things.

A CLI is a different beast altogether. Every shell or terminal has its own variant of a shell scripting language. As a person works, she types out the commands she wants executed. She can write complicated commands by using programming constructs like loops and functions. She can also take advantage of the OS’ standard input and standard output channels to string together a pipeline of commands. A CLI enables expressivity on the level of languages. While a CLI executes system functions too, the difference is the degree of expressivity the interface allows.
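
To make the contrast concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the "system functions" (list_files, only_text, sort_names, count), the button labels, and the pipeline helper are all hypothetical names I made up, and no real GUI toolkit or shell is involved. The point is that the button table offers a fixed set of actions the developer chose in advance, while the pipeline lets the user combine the same functions in any order.

```python
from functools import reduce

# Hypothetical "system functions" that both interfaces can reach.
def list_files():
    return ["notes.txt", "photo.png", "todo.txt", "song.mp3"]

def only_text(names):
    return [n for n in names if n.endswith(".txt")]

def sort_names(names):
    return sorted(names)

def count(names):
    return len(names)

# GUI as control panel: each button is bound to one handler
# that the developer wrote ahead of time.
BUTTONS = {
    "Show files": lambda: list_files(),
    "Show text files": lambda: only_text(list_files()),
}

def click(label):
    """Simulate a button click by running the handler bound to the label."""
    return BUTTONS[label]()

# CLI as language: the user strings together any chain of commands,
# each stage reading the previous stage's output, like a shell pipeline.
def pipeline(source, *stages):
    return reduce(lambda data, stage: stage(data), stages, source())

if __name__ == "__main__":
    print(click("Show text files"))
    print(pipeline(list_files, only_text, sort_names, count))
```

With the button table, counting the text files is impossible until a developer adds a button for it. With the pipeline, the user simply appends another stage.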

I think the most important factor when comparing a CLI with a GUI is expressivity. It is possible that one day a GUI will provide the same expressive power as a CLI. For that to happen, interaction designers need to begin to think about designing GUIs that allow people to express things rather than cause system-level functions to run. We have to begin to think of the GUI as a means of graphical human expression. We have to shift our paradigm of interaction from one of use to one of expression. I think if we are able to do that, we will find ways to make the GUI as expressive as the CLI.

I would hypothesize that an expressive GUI will probably make use of a touch interface to allow direct manipulation of graphical objects in the same way a keyboard allows direct manipulation of linguistic elements - the letters beneath your fingers.

Filed under: GUI CLI