Rotating 3D objects on a 2D screen is a fundamental building block of human-computer interaction. Being able to reach through a pane of glass and touch virtual objects is absolutely critical for CAD & industrial design, 3D modeling & animation, medical visualization, and scientific data interaction, not to mention the still-young fields of virtual & augmented reality.
Click and drag to rotate!
Unfortunately not much thought is given to this interaction. There have been no new methods developed since 1994, and no new theory around those methods since 2004. Those methods were quite ingenious, and built strong mathematical foundations for the “virtual trackball problem”. But they have shortcomings. And there are several options to choose from – which should someone making a 3D interface pick? I can find no discussion comparing the experience of using different virtual trackballs, or a taxonomy of their tradeoffs.
So, let’s look at the virtual trackball problem from the perspective of user experience rather than mathematical formalisms. What exactly do users want to do when they rotate a view? What properties of different rotation methods help or hurt this goal? This investigation will help clarify the pros and cons of existing methods, and in the end will help us develop a new, better virtual trackball.
User Experience
What do users want to do when they rotate objects on screen? Here’s a typical workflow:
- You want to look at the object from a particular direction. You click on the screen and wiggle the mouse around to try and figure out how to rotate the thing you want to look at, so that it faces you straight-on.
- You’re looking at the thing now, but it’s upside-down. Maybe that’s fine, or maybe you click and try to twist the screen to rotate things right-side up.
Your challenge throughout this post will be to try and rotate the teapot to the following views as quickly as possible. Which methods make this easy? Which make this a major pain, especially during final fine adjustments?
Virtual Trackball Control Methods
Basic Azimuth / Elevation Control
The idea behind this control method is the simplest: moving the cursor right and left controls the azimuth angle, and moving the cursor up and down controls the elevation, where azimuth and elevation map to yaw and pitch in the starting position. This works really well for anything with a natural orientation, where “up” is a meaningful and useful direction. For example, spinning a virtual globe, looking at maps or objects that are sitting on a flat ground plane, or camera controls in a first-person video game.
You will notice that it’s impossible to reach View B, and this is due to the lack of roll control. This can be an asset in the scenarios where “up” should stay “up”, and there is no way to get lost in an off-vertical view.
Oftentimes elevation will be locked to the range [-90, 90] degrees, because if you flip the scene upside down then azimuth will spin the opposite way of the mouse movement, which can be rather confusing. There is also gimbal lock at the poles which prevents yaw motion of the view, and tracking motion smoothly along a line passing near the poles requires very fast and unintuitive mouse movement – this is known as the keyhole problem.
Azimuth / elevation control also has the first nice formal property we will identify, which is that the motion only depends on how you move your mouse and not where on the screen you click. I call this property position independence. It can be thought of as an application of Fitts’s Law, which says that larger targets are quicker to acquire, so good UX should offer a large target area for interaction.
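As a minimal sketch of this control method (the function name, the degrees-per-pixel gain, and the sign conventions are illustrative assumptions, not taken from the widget source), horizontal mouse motion feeds azimuth and vertical motion feeds elevation, with an optional clamp at the poles. The click position never appears as an input, which is position independence in code form:

```python
def azel_drag(azim, elev, dx, dy, deg_per_px=0.5, clamp=True):
    """Return the updated (azim, elev) in degrees after a mouse move.

    dx, dy are the mouse displacement in pixels (x right, y up).
    deg_per_px is an assumed gain; tune it to taste.
    """
    # Horizontal motion spins the scene about the vertical axis.
    azim = (azim + dx * deg_per_px) % 360.0
    # Vertical motion tilts the view up or down.
    elev = elev + dy * deg_per_px
    if clamp:
        # Stop at the poles so the scene never flips upside down.
        elev = max(-90.0, min(90.0, elev))
    return azim, elev
```

For example, `azel_drag(0.0, 80.0, 0.0, 100.0)` stops at an elevation of 90 degrees rather than flipping over the pole.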
Trackball Control
How do we get rid of gimbal lock? Trackball control allows us to do this. At every moment in time, we redefine the pitch and yaw axes to be relative to the current orientation of the scene. So a mouse movement right will always rotate the scene with a rightward yaw motion, even if you are looking down the poles. This replicates the behavior of physical trackballs.
The biggest downside of this control method is that roll control is implicit, meaning that there is no way to directly control roll without chaining together multiple rotations. And because the control axes are updated with each new mouse position, it is easy to get “lost” in orientation space. This is easiest to see if you click and drag the mouse in a bunch of clockwise circles around the center – you will see the view slowly roll counterclockwise. This undesirable behavior I call precession.
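Precession can be reproduced numerically. The sketch below is one possible reading of classic trackball control, under stated assumptions: orientation is a unit quaternion `(w, x, y, z)`, screen axes are x right, y up, z toward the viewer, each small mouse step rotates about the in-screen axis perpendicular to the motion, and the gain `rad_per_px` is a made-up value. All function names are hypothetical.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def drag_step(q, dx, dy, rad_per_px=0.01):
    """Apply one incremental mouse step about the current screen axes."""
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return q
    half = 0.5 * rad_per_px * dist
    s = math.sin(half) / dist
    # Rotation axis (-dy, dx, 0) lies in the screen, perpendicular to the motion.
    dq = (math.cos(half), -dy * s, dx * s, 0.0)
    # Left-multiplying applies the step in the view (screen) frame.
    return quat_mul(dq, q)

def drag_circle(radius_px=30.0, steps=360):
    """Drag once around a clockwise circle and return the final orientation."""
    q = (1.0, 0.0, 0.0, 0.0)
    px, py = radius_px, 0.0
    for i in range(1, steps + 1):
        t = -2.0 * math.pi * i / steps  # negative angle: clockwise with y up
        x, y = radius_px * math.cos(t), radius_px * math.sin(t)
        q = drag_step(q, x - px, y - py)
        px, py = x, y
    return q
```

With these assumed values, `drag_circle()` ends with a net rotation of roughly 0.28 rad (about 16°), mostly about the view axis, even though the cursor finished exactly where it started.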
Trackball Control without Precession
Precession is an annoying behavior when controlling 3D views with a mouse. Because the local pitch and yaw axes are being constantly updated, this prohibits “undoing” an errant click-and-drag by moving the mouse back to where you initially clicked. This violates the UX principle of forgiveness. Being able to easily undo rotations within a single click requires that each position on the screen maps onto a single view angle, no matter what mouse movements got to that position. This property I call path independence, and I strongly feel that it is necessary for any good virtual trackball control method.
Path independence is easy to add to the classic trackball control method. Instead of updating the local pitch and yaw rotation axes at each new mouse position, only update these axes on each new click. Within each drag, moving the mouse back to the original position will recover the original view, and you no longer see the view precess if you drag in little circles about the center.
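One way to sketch this change, assuming the same quaternion conventions as a typical trackball (unit quaternion `(w, x, y, z)`, screen x right and y up) and a hypothetical class name and gain: remember the orientation at the click, and on every mouse move recompute the view from the total displacement since that click rather than accumulating small steps.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def drag_rotation(dx, dy, rad_per_px=0.01):
    """Rotation for a total displacement (dx, dy) about the click-time screen axes."""
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (1.0, 0.0, 0.0, 0.0)
    half = 0.5 * rad_per_px * dist
    s = math.sin(half) / dist
    return (math.cos(half), -dy * s, dx * s, 0.0)

class NoPrecessionTrackball:
    """Hypothetical controller: axes are frozen at each click."""

    def __init__(self):
        self.q = (1.0, 0.0, 0.0, 0.0)
        self._start = (0.0, 0.0)
        self._q_at_click = self.q

    def press(self, x, y):
        # Freeze the reference orientation and click position.
        self._start = (x, y)
        self._q_at_click = self.q

    def drag(self, x, y):
        # The view depends only on where the cursor is now, not how it got there.
        dx, dy = x - self._start[0], y - self._start[1]
        self.q = quat_mul(drag_rotation(dx, dy), self._q_at_click)
        return self.q
```

Because `drag` is a pure function of the current cursor position and the click-time state, returning the cursor to the click point always restores the click-time view.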
However, roll control is still implicit. To control roll, we will need to move beyond local pitch/yaw axes.
Shoemake’s Arcball Control
Arcball control was first proposed by Ken Shoemake back in 1992, in his paper ARCBALL: A user interface for specifying three-dimensional orientation using a mouse. The idea is to imagine that the points on the screen are projected down onto a half-sphere that lies below the screen. Clicking and moving the mouse from one point to the next causes a rotation along the great-circle arc that connects those two points. See these images from the excellent article by Robert Eisele, Trackball Rotation using Quaternions:
- Radial mouse movement has a great circle that passes through the top of the sphere, resulting in a pure pitch/yaw movement.
- The area outside the circle where the half-sphere sits on the plane maps to its equator. So mouse movements that start and end in this area will result in pure roll.
- Other mouse movements result in some combination of pitch/yaw and roll movement.
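The projection and great-circle rotation described above can be sketched as follows. This assumes screen coordinates that are already centered on the ball and scaled so the ball has radius 1 (x right, y up, z toward the viewer); the function names are hypothetical. The quaternion `(a·b, a×b)` built from the two projected points is the form used in Shoemake's arcball.

```python
import math

def project_to_arcball(x, y):
    """Map a centered, normalized screen point onto the unit half-sphere."""
    d2 = x * x + y * y
    if d2 <= 1.0:
        # Inside the circle: lift the point onto the sphere.
        return (x, y, math.sqrt(1.0 - d2))
    # Outside the circle: snap to the equator, which gives pure roll.
    d = math.sqrt(d2)
    return (x / d, y / d, 0.0)

def arcball_quat(p0, p1):
    """Unit quaternion (w, x, y, z) for a drag from screen point p0 to p1."""
    ax, ay, az = project_to_arcball(*p0)
    bx, by, bz = project_to_arcball(*p1)
    w = ax * bx + ay * by + az * bz              # dot product
    return (w,
            ay * bz - az * by,                    # cross product a x b
            az * bx - ax * bz,
            ax * by - ay * bx)
```

A drag from the center `(0, 0)` to the edge `(1, 0)` gives `(0, 0, 1, 0)`, a half turn about the vertical screen axis, while a drag between two points outside the circle gives a quaternion whose axis is the view direction.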
This allows us for the first time to solve both parts of the user workflow. First, click near the center and wiggle the mouse around to get the view looking down along the angle you want to see. Then click and drag along the edge to roll the view to how you’d like it. But this does have some disadvantageous properties:
- Rotation behavior is not the same everywhere. It depends on where on the screen you click, not just how you move the mouse after you click.
- Without a visual indicator of where the sphere’s edge is (like in the demo above), it can be hard to know if you are clicking in the pure roll control zone.
- Near the sphere’s edge, the slope approaches vertical and there is a discontinuity in control for that angular range.
Note that rotation from one side of the half-sphere to the other is a 180° rotation. But in Shoemake’s implementation, the angle along the great circle arc is doubled so that the same motion results in a 360° rotation. In the 2004 paper Virtual Trackballs Revisited by Henriksen, Sporring, & Hornbæk, the authors express confusion as to whether this “2θ” rotation was an accident or not. I am nearly sure that it was intentional, because the double angle rotation is necessary for path independence!
An example to show why this is the case: consider a rotation starting with a mouse click at the top of the screen that moves to the bottom of the screen. You could get there through a pure roll motion along the outside, or you could get there through a pure pitch motion top-to-bottom. If each of these were only a 180° rotation, the view at the bottom would be different along the two paths. For path independence both of these must end up being the same view, and so doubling the angle to create a 360° rotation is necessary.
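The same argument can be written out with quaternions. As a sketch, treat the projected sphere points a, b, c as pure unit quaternions (an assumption about notation, not something spelled out in the text above), and write the arcball rotation for a drag from a to b as the product of b with the conjugate of a:

```latex
\begin{aligned}
q_{a \to b} &= b\,\bar{a} = \bigl(a \cdot b,\; a \times b\bigr)
             = \bigl(\cos\theta,\; \sin\theta\,\hat{n}\bigr),
             \qquad \hat{n} = \frac{a \times b}{\lVert a \times b \rVert} \\[4pt]
q_{b \to c}\, q_{a \to b} &= c\,\bar{b}\,b\,\bar{a} = c\,\bar{a} = q_{a \to c}
             \qquad \text{since } \bar{b}\,b = 1
\end{aligned}
```

Here θ is the angle between a and b on the sphere. A unit quaternion (cos(φ/2), sin(φ/2) n̂) rotates by φ, so the first line is a rotation by 2θ. The second line shows that the intermediate point b cancels out, so the final view depends only on the start and end points. A single-angle rotation would use the square root of this quaternion, and the cancellation no longer holds.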
Another nice property of the 360° double-angle rotation is that you can get from any view vector to any other in a single click.
Sphere Control
GitHub user MischaMegens2 raises the point that controlling roll when dragging along the outside edge is more intuitive without the double-angle rotation of Shoemake’s method, since the scene is then rotating 1:1 with the movement of the mouse. The tradeoff for this is the loss of the path independence property, and the reemergence of precession, but this could be useful in some situations. He dubs this “sphere” control.
Bell’s Trackball Control
Gavin Bell looked at Shoemake’s Arcball control back in 1994, and in the OpenGL sample source file trackball.c decided to fix the discontinuity around the sphere’s edge by making the control surface smooth. Smoothness is our third desirable property, and ensures that the user can wiggle their mouse around to figure out a local gradient in control response. This local gradient gives immediate feedback to the user that shows them how the view will change with further movements, and smoothness allows them to “course correct” as they rotate the view around.
Bell’s method to smooth the arcball half sphere was to morph it into a hyperbolic sheet further away from the center, which he chose “after trying out several variations.” Unfortunately, this removes the ability to control pure roll by dragging along the outside edge of the screen.
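A minimal sketch of that blended surface, following the sphere-to-hyperbola projection in trackball.c as I understand it: inside a distance of r/√2 from the center the height is that of a sphere, and beyond it the height follows the hyperbola z = r²/(2d). The Python function name and the default radius are illustrative assumptions.

```python
import math

def bell_height(x, y, r=1.0):
    """Height of Bell's control surface above the screen at point (x, y)."""
    d = math.hypot(x, y)
    if d < r / math.sqrt(2.0):
        # Inner region: ordinary sphere of radius r.
        return math.sqrt(r * r - d * d)
    # Outer region: hyperbolic sheet that never reaches zero height.
    return r * r / (2.0 * d)
```

The two pieces meet at d = r/√2 with the same height (r/√2) and the same slope (−1), which is what makes the surface smooth. Because the height only approaches zero far from the center, no region of the screen maps onto the equator, so there is no pure roll zone.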
Rounded Arcball Control
The logical endpoint to this progression is a virtual trackball that smooths out Shoemake’s arcball without removing the roll control area around the edges. I propose adding a circular fillet / taper around the edge of the half-sphere. I call this “rounded arcball” control.
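One way such a fillet could be constructed, as a geometric sketch rather than the exact code behind the widgets: in cross-section, place a circle of radius `fillet` tangent to both the unit sphere and the screen plane. It touches the plane at radius √(1 + 2·fillet) and the sphere at that radius divided by (1 + fillet). The function name and the default fillet size are hypothetical.

```python
import math

def rounded_arcball_point(x, y, fillet=0.2):
    """Project a normalized screen point onto a half-sphere with a filleted edge.

    Returns a unit vector that can be used in the usual arcball
    quaternion (dot product, cross product) construction.
    """
    r = math.hypot(x, y)
    r_outer = math.sqrt(1.0 + 2.0 * fillet)  # where the fillet meets the plane
    r_inner = r_outer / (1.0 + fillet)       # where the fillet meets the sphere
    if r < r_inner:
        # Central region: plain half-sphere.
        return (x, y, math.sqrt(1.0 - r * r))
    if r < r_outer:
        # Fillet region: lower arc of the tangent circle, then renormalize.
        z = fillet - math.sqrt(fillet * fillet - (r_outer - r) ** 2)
        n = math.sqrt(r * r + z * z)
        return (x / n, y / n, z / n)
    # Outer region: equator, giving pure roll.
    return (x / r, y / r, 0.0)
```

With this construction the height falls smoothly to zero at `r_outer`, so the pure roll zone around the edge is kept while the vertical wall at the sphere's silhouette is removed.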
Comparison Table
Each of these seven virtual trackballs has different behavior for controlling the pitch / yaw, and roll viewing angles, and each has different combinations of the three nice properties we identified: position independence, path independence, and smoothness.
I find some of these much easier to use than others, and have graded them on their subjective usability. For cases where roll control is not desired, simple azimuth / elevation control does great. When you do want to control roll, I find the new rounded arcball method the all-around best (and strictly superior to Bell’s trackball and Shoemake’s arcball). The sphere and trackball methods (with trackball classic being strictly worse than no precession) offer different tradeoffs which may be appropriate for some situations, but I’m not a fan. Which do you prefer? I’m genuinely interested in the comments here.
Source and Implementation Details
The source for the virtual trackball controller used by the widgets in this post is on GitHub to inspect: https://github.com/scottshambaugh/trackball
Making these widgets is the first time I have ever touched JavaScript, so shoutout to Cursor’s AI code editor for enabling me to model some somewhat complex behavior in a totally new language. This does mean that there might be odd coding choices there and I don’t know how to package it for outside use.
If you implement a virtual trackball that uses the arcball-derived methods that project the screen coordinates onto a half-sphere, the size of that ball relative to the screen will be an important parameter to tune. In the examples here its diameter spans 90% of the width – much bigger and you lose the pure roll control area.
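As a small sketch of that tuning parameter, here is one way to convert pixel coordinates into ball-relative coordinates, assuming a top-left pixel origin with y pointing down, a ball centered in the view, and a hypothetical `ball_fraction` argument giving the ball diameter as a fraction of the view width:

```python
def to_ball_coords(px, py, width, height, ball_fraction=0.9):
    """Convert pixel coordinates to centered coordinates in units of ball radius."""
    radius = 0.5 * ball_fraction * width
    x = (px - 0.5 * width) / radius
    # Flip y so that "up" on screen is positive.
    y = (0.5 * height - py) / radius
    return x, y
```

Points with x² + y² below 1 land on the ball, and anything beyond that falls in the roll zone. In an 800-pixel-wide view with the default fraction, that leaves a 40-pixel strip on each side outside the ball.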
Virtual Trackballs in Practice
This deep dive into virtual trackballs emerged from implementing new control methods for Matplotlib’s 3D plots. Thanks to @MischaMegens2’s contributions, Matplotlib 3.10 will ship with several new options: ‘arcball’ (the rounded arcball method) as the default, along with ‘azel’ (the previous default), ‘trackball’, and ‘sphere’.
Beyond Matplotlib, virtual trackballs are invisible yet everywhere – from CAD software, to video games, to maps, to phone apps. Yet as we’ve seen, not all implementations are created equal. Understanding the tradeoffs between different methods helps us make better choices as developers and more informed users. The next time you’re implementing 3D controls, consider:
- Does your use case need roll control?
- If not and there is a natural vertical axis, should you restrict the elevation angle?
- How big is your interface? Would position independence give your users a larger control area that is forgiving of where they click?
- Do you want to overlay a circle in your UI that shows a pure roll control region, or let users discover this behavior?
- Does your virtual trackball library use a smooth control method with path independence?
And yes, I apologize in advance – after reading this, you’ll probably notice precession in an annoyingly large fraction of 3D interfaces. But perhaps that’s a good thing. The more we understand these subtle aspects of 3D interaction, the better we can make our interfaces for everyone.