This article explores how to create a custom gesture API for use in both desktop and mobile applications.
Users interact with desktop applications using a mouse and keyboard; in contrast, they interact with mobile apps using a touchscreen. When software developers write for both desktop and mobile devices using a single codebase, they need to handle both mouse and touch events. This isn't always simple.
Treating mouse and touchscreen events as gestures, instead of raw events, provides uniformity to an application's underlying logic. To make this happen, developers listen for the low-level events sent by the browser and transform them into their own higher-level events, using the DOM's built-in dispatchEvent function together with a CustomEvent object. The result is a message-based gesture API.
In the mobile phone era, users have become accustomed to several new ways of interacting with touchscreen apps. Gestures such as these have become part of the user-interface vernacular: slide to scroll up/down; swipe to page next/previous; flick to jump a long distance; pinch to collapse/zoom out; spread to expand/zoom in; and long-press to select/choose.
Coming to terms with mouse and touch events
Developers looking for an easy way to listen for gestures will find no support from the browser. Gestures must be built from the underlying pointer events and mouse events APIs. Further complicating matters, those APIs are not symmetrical. Two limitations must be overcome.
First, mouse events arrive with information about the state of the special keys (Ctrl, Alt, Shift, Cmd), plus the state of the mouse buttons (left button, right button, wheel), while touch events carry no such information.
Second, mouse events always assume there is one mouse, while pointer events may occur with two or more fingers simultaneously interacting with the touchscreen.
To get started, here's a quick overview of the DOM events that we'll need to handle:
Mouse events
- mousedown/mouseup events arrive when the left or right mouse button is depressed or released.
- mousemove events arrive as a continuous stream while dragging or hovering over an element.
Each mouse event contains information about which mouse buttons are depressed (left button / right button / middle wheel). Each event also contains information about which keyboard special keys are depressed (Ctrl, Alt, Shift, Cmd).
Touch events
- pointerdown/pointerup events are the touchscreen equivalent to mousedown/mouseup.
- pointermove events are the touchscreen equivalent to mousemove.
- pointercancel events are generated when any of these occur:
  - the device orientation switches between landscape and portrait,
  - the browser detects an accidental palm touch,
  - the Home button is pressed,
  - the element's CSS touch-action allows the browser to take direct control of a pan or zoom operation,
  - the element's CSS user-select allows the browser to initiate a "COPY|SHARE|SELECT ALL" operation,
  - the element's HTML draggable attribute allows the browser to initiate a drag'n'drop operation.
Pointer events do not contain any information about mouse buttons or keyboard special keys. Also, in contrast to mouse events, two or more finger movements may be monitored and handled simultaneously. Each new finger touching the screen is assigned a pointerId that is used to distinguish one from the other.
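To illustrate, tracking simultaneous fingers might be sketched with a Map keyed by pointerId (a sketch only; the architecture described in this article uses an array of FingerPointer objects instead):

```javascript
// One entry per finger currently touching the screen, keyed by pointerId.
const activeFingers = new Map();

function onPointerDown(event) {
  activeFingers.set(event.pointerId, { x: event.offsetX, y: event.offsetY });
}

function onPointerUp(event) {
  activeFingers.delete(event.pointerId);
}
```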
Gesture API architecture
The custom gesture API can be architected using an assembly comprising FingerPointer objects, a Gestures delegate, and an InteractionHandler.
A FingerPointer object is a data structure that holds everything known about a single finger/mouse interaction: its initial position; latest position; elapsed time; distance moved vertically, horizontally, and diagonally; direction of its movement left/right and up/down; which mouse buttons are depressed; and which special keys are depressed.
The Gestures delegate monitors the life-cycle of each finger pointer from its creation to its removal. When a finger moves or is lifted, it determines which fingers have been stationary and which have traveled. Depending on how their geometric relationship has changed, it determines which high-level gesture message to broadcast.
The InteractionHandler sets up listeners for incoming raw mouse and touch events: pointerdown events create new FingerPointer instances; pointermove events update those instances with new position, time, distance, direction, and key state; and pointerup events remove those instances.
These are the steps a developer will need to take to recognize gestures:
- Capture the starting and ending position of each finger or mouse pointer.
- Compute the distance and direction of each pointer's movement.
- Calculate the geometric relationship between multiple pointers.
- Determine a pointer's speed using the system clock.
- Check whether any special touch zones should be applied.
- Suppress any automatic browser-generated actions.
- Discard any unwanted raw events.
1. Capture the starting and ending position of each finger or mouse pointer
First, the InteractionHandler is a DOM element proxy. For demonstration purposes, consider the case where the element is a full-screen <canvas>. The handler sets up raw event listeners in its constructor.
class InteractionHandler {
    constructor(canvas) {
        const gestures = new Gestures(canvas);
        canvas.addEventListener('pointerdown', (event) => {
            gestures.addFinger(event);
            gestures.sendInitialGesture();
        });
        canvas.addEventListener('pointermove', (event) => {
            gestures.updateFinger(event);
            gestures.sendIntermediateGesture();
        });
        canvas.addEventListener('pointerup', (event) => {
            gestures.updateFinger(event);
            gestures.sendFinalGesture();
            gestures.removeFinger(event);
        });
        canvas.addEventListener('pointercancel', (event) => {
            gestures.cancelFingers();
        });
        canvas.addEventListener('mousemove', (event) => {
            gestures.sendMouseHoverGesture(event);
        });
    }
}
Next, the Gestures delegate acts as the intermediary between the canvas element and the finger pointers as they appear and disappear. It contains the logic for determining which gesture message to broadcast at the initial, intermediate, and final stage of each pointer's life. The gesture logic depends heavily on which pointers have been stationary and which have traveled. Helper functions scan the array of finger pointers to make this identification.
class Gestures {
    constructor(canvas) {
        this.canvas = canvas;
        this.fingerPointers = [];
    }
    // finger pointers appearing and disappearing
    addFinger(event) { ... }
    updateFinger(event) { ... }
    removeFinger(event) { ... }
    cancelFingers(event) { ... }
    // determining which gesture message to broadcast
    sendInitialGesture() { ... }
    sendIntermediateGesture() { ... }
    sendFinalGesture() { ... }
    // helpers to identify stationary versus travelers
    fingerCount() { ... }
    stationaryCount() { ... }
    travelerCount() { ... }
    getStationary() { ... }
    getSecondStationary() { ... }
    getTraveler() { ... }
    getSecondTraveler() { ... }
}
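As a sketch of how those helpers might scan the array, assuming each finger pointer exposes a hasTraveled() predicate (an assumed method name, not shown elsewhere in this article):

```javascript
// A sketch of the stationary-versus-traveler helpers. The hasTraveled()
// predicate on each finger pointer is an assumed method.
class GesturesSketch {
  constructor() {
    this.fingerPointers = [];
  }
  stationaryCount() {
    return this.fingerPointers.filter(fp => !fp.hasTraveled()).length;
  }
  travelerCount() {
    return this.fingerPointers.filter(fp => fp.hasTraveled()).length;
  }
  getTraveler() {
    return this.fingerPointers.find(fp => fp.hasTraveled()) ?? null;
  }
}
```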
Finally, a FingerPointer object is created whenever a pointerdown event is received. Its constructor stores the initial position of the finger or mouse. For mouse events, it also captures the state of the buttons and special keys. For finger events, it captures the dimensions of the finger (width/height) for use as a measure of tolerance when determining whether a finger has moved.
class FingerPointer {
    constructor(event) {
        this.pointerId = event.pointerId;
        this.pointerType = event.pointerType;
        this.epsilonX = event.width;
        this.epsilonY = event.height;
        this.initial = {
            x: event.offsetX,
            y: event.offsetY,
            t: Date.now()
        };
        this.latest = {
            x: event.offsetX,
            y: event.offsetY,
            t: Date.now()
        };
        this.deltaT = 0;
        this.deltaX = 0;
        this.deltaY = 0;
        this.deltaXY = 0;
        this.directionX = '';
        this.directionY = '';
        if (event.buttons === undefined) {
            this.leftButtonDown = false;
            this.wheelButtonDown = false;
            this.rightButtonDown = false;
        }
        else {
            this.leftButtonDown = !!(event.buttons & 0b0001);
            this.rightButtonDown = !!(event.buttons & 0b0010);
            this.wheelButtonDown = !!(event.buttons & 0b0100);
        }
        this.ctrlKey = !!event.ctrlKey;
        this.altKey = !!event.altKey;
        this.shiftKey = !!event.shiftKey;
    }
}
2. Compute the distance and direction of each pointer's movement
The FingerPointer object is updated with each pointermove event. This is where pointer deltas are calculated and the direction of movement is determined.
calculateLatest(event) {
    this.latest.t = Date.now();
    this.latest.x = event.offsetX;
    this.latest.y = event.offsetY;
    // time difference
    this.deltaT = this.latest.t - this.initial.t;
    // distance traveled
    this.deltaX = Math.abs(this.latest.x - this.initial.x);
    this.deltaY = Math.abs(this.latest.y - this.initial.y);
    this.deltaXY = Math.hypot(this.deltaX, this.deltaY);
    // direction of movement
    this.directionX = (this.latest.x - this.initial.x >= 0) ?
        'right' : 'left';
    this.directionY = (this.latest.y - this.initial.y >= 0) ?
        'down' : 'up';
}
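The finger dimensions captured in the constructor (epsilonX/epsilonY) can serve as the movement tolerance when deciding whether a pointer is stationary or a traveler. A possible predicate, with an assumed name:

```javascript
// A finger counts as having traveled only when its displacement exceeds
// the finger's own contact width/height, captured as epsilonX/epsilonY.
// The function name and its standalone form are illustrative assumptions.
function hasTraveled(fp) {
  return fp.deltaX > fp.epsilonX || fp.deltaY > fp.epsilonY;
}
```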
3. Calculate the geometric relationship between multiple pointers
Many gestures use two fingers and thus need to determine how the relationship between the two has changed from initial state to latest state. The deltaSweep computation is the change in the angle between two points over time, and is used to rotate objects on the canvas. It can be calculated like this:
// initial metrics
const initialRise = finger0.initial.y - finger1.initial.y;
const initialRun = finger0.initial.x - finger1.initial.x;
const initialTheta = Math.atan2(initialRise, initialRun);
let initialAngle = 180 - (initialTheta * 180 / Math.PI);
if (initialAngle < 0)
    initialAngle += 180;
// latest metrics
const latestRise = finger0.latest.y - finger1.latest.y;
const latestRun = finger0.latest.x - finger1.latest.x;
const latestTheta = Math.atan2(latestRise, latestRun);
let latestAngle = 180 - (latestTheta * 180 / Math.PI);
if (latestAngle < 0)
    latestAngle += 180;
// change in the angle between the two points over time
const deltaSweep = Math.abs(initialAngle - latestAngle);
When the deltaSweep is above a threshold value, the user's intention is interpreted to be a rotation:
- If the initial angle is less than the latest angle a counterclockwise gesture is broadcast.
- If the initial angle is greater than the latest angle a clockwise gesture is broadcast.
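That decision might be sketched as follows; the 5-degree threshold and the function shape are illustrative assumptions:

```javascript
// Classify a two-finger rotation from the initial and latest angles
// (in degrees). Returns null when the sweep is too small to count.
function classifyRotation(initialAngle, latestAngle, threshold = 5) {
  const deltaSweep = Math.abs(initialAngle - latestAngle);
  if (deltaSweep < threshold)
    return null; // not a rotation
  return (initialAngle < latestAngle) ? 'counterclockwise' : 'clockwise';
}
```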
The panning functions reuse the just-computed rise and run values to observe whether two fingers have moved across the touchscreen in tandem.
const initialDistance = Math.hypot(initialRise, initialRun);
const latestDistance = Math.hypot(latestRise, latestRun);
const deltaDistance = Math.abs(latestDistance - initialDistance);
When the computed deltaDistance (the change in the distance between the two points over time) is below a threshold value, it will trigger a two-finger panning gesture:
- When the fingers have traveled mostly in the x-direction, a horizontalpan gesture is broadcast.
- When the fingers have traveled mostly in the y-direction, a verticalpan gesture is broadcast.
The deltaDistance can also be used to detect when fingers are moving apart, triggering a spread gesture, or moving closer together, triggering a pinch gesture.
Finally, if the deltaDistance is close to zero and the fingers' absolute positions haven't changed, a twofingertap gesture is triggered.
4. Determine a pointer's speed using the system clock
When a single finger touches the screen and is released without any secondary finger events intervening, the time difference between the initial touch and final release can be used to differentiate gestures. This is the deltaT value.
When the finger has not traveled across the touch surface, it is considered stationary and the deltaT can be used to distinguish between doubletap, tap, and press gestures.
When the finger has traveled a long distance across the touch surface in a short period of time, the horizontalflick and verticalflick gestures can be determined and broadcast.
Finally, when deltaT is greater than the flick threshold, simpler gestures can be implied:
- When the finger has traveled mostly in the x-direction, a swipeleft or swiperight gesture is broadcast.
- When the finger has traveled mostly in the y-direction, a scrollup or scrolldown gesture is broadcast.
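A sketch of this single-finger decision tree; all millisecond and pixel thresholds are illustrative assumptions, and doubletap detection (which needs state from a previous tap) is omitted:

```javascript
// Classify a single-finger gesture from its elapsed time, displacement,
// and whether it traveled. Threshold values are illustrative.
function classifySingleFingerGesture(deltaT, deltaX, deltaY, traveled) {
  const PRESS_TIME = 500;     // ms: stationary longer than this => press
  const FLICK_TIME = 300;     // ms: faster than this over a long distance => flick
  const FLICK_DISTANCE = 200; // px
  if (!traveled)
    return (deltaT > PRESS_TIME) ? 'press' : 'tap';
  if (deltaT < FLICK_TIME && Math.hypot(deltaX, deltaY) > FLICK_DISTANCE)
    return (deltaX > deltaY) ? 'horizontalflick' : 'verticalflick';
  // slower movement: swipeleft/swiperight or scrollup/scrolldown,
  // refined by the pointer's directionX/directionY
  return (deltaX > deltaY) ? 'swipe' : 'scroll';
}
```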
5. Check whether any special touch zones should be applied
Some applications may employ a user interface with hidden panels that are designed to be pulled out by the user. The user does this by touching the outer edge of the device and sliding the finger towards the center.
Applications with features like this can configure the Gestures delegate to evaluate whether or not the finger's initial position is inside one of these special zones, and if so broadcast a slidein gesture.
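A sketch of such a zone test; the 20-pixel band width and the function name are illustrative assumptions:

```javascript
// Does the finger's initial position fall inside a narrow band along
// the left edge of the element? The band width is illustrative.
function inLeftSlideZone(initialX, zoneWidth = 20) {
  return initialX <= zoneWidth;
}
```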
6. Suppress any automatic browser-generated actions
Android and iOS devices have built-in gestures that might conflict with the custom gesture API being assembled by the developer. For example, a long-press may trigger a "COPY|SHARE|SELECT ALL" popup on Android devices. When these interfere with an application's needs, they can be disabled with CSS by setting the user-select property.
canvas.style.userSelect = 'none';
Sometimes the browser may interfere with the developer's intention by automatically panning or zooming. This can be disabled with CSS by setting the touch-action property.
canvas.style.touchAction = 'none';
Links and images are draggable by default. To prevent this from interfering with custom gestures, the element's draggable property can be set to false. Note that draggable is a boolean property, so it must be assigned the value false, not the string 'false'.
canvas.draggable = false;
7. Discard any unwanted raw events
Historically, the mouse event API was the only way for developers to interact with pointing devices. In order to automatically support legacy applications on newer mobile devices, the pointer events API emulates the mouse API. So each pointerdown message sent by the browser is followed immediately by an emulated mousedown message. The same goes for pointermove/mousemove and pointerup/mouseup. These emulated mouse events are redundant and can trip up developers who are not aware of them.
The way to suppress this behavior is simply to prevent the event from performing its default behavior. For example, the pointerdown event listener shown above should be:
canvas.addEventListener('pointerdown', (event) => {
    gestures.addFinger(event);
    gestures.sendInitialGesture();
    event.preventDefault();
});
In addition to the raw events already discussed, there are several related events dealing with user-interface devices. These are for hovering, hit-detection, and drag'n'drop features; they play no part in gestures. For completeness, they are identified here.
A digital tablet's pen generates these pointer events:
- pointerover/pointerout events arrive when the pen enters and leaves the rectangular area of an element, even when one element is an ancestor of the other.
- pointerenter/pointerleave events arrive when the pen position enters and leaves the rectangular area of an element, but not when crossing from an outer parent element to an inner child element.
The mouse API has analogous events. All of these can safely be ignored by the custom gesture API and disabled like so:
canvas.addEventListener('pointerover', (e) => {e.preventDefault()});
canvas.addEventListener('pointerout', (e) => {e.preventDefault()});
canvas.addEventListener('pointerenter',(e) => {e.preventDefault()});
canvas.addEventListener('pointerleave',(e) => {e.preventDefault()});
canvas.addEventListener('mouseover', (e) => {e.preventDefault()});
canvas.addEventListener('mouseout', (e) => {e.preventDefault()});
canvas.addEventListener('mouseenter', (e) => {e.preventDefault()});
canvas.addEventListener('mouseleave', (e) => {e.preventDefault()});
Review
Handling the raw mouse and touch events is the key to creating a gesture API.
Key points:
- Simple gestures like tap, press, and doubletap can be recognized from a single stationary pointer.
- Gestures like horizontalflick and verticalflick can be distinguished from swipeleft/swiperight and scrollup/scrolldown by monitoring the system clock.
- Two-finger gestures can recognize a change in the fingers' relative distance as a pinch or spread.
- Two fingers moving in tandem can be recognized as horizontalpan, verticalpan, or a twofingertap.
- Two fingers with a change in the sweep angle can be recognized as a clockwise or counterclockwise gesture.
For demonstration purposes, many of these have been implemented in the gesture API used by the Simply Earth website. When viewed on the desktop, the mouse plus Ctrl, Alt, and Shift combinations are used to initiate gestures. When viewed on mobile devices, two fingers are used to initiate all of the same gestures.