Web components

Take Control of Mouse and Touch Events to Create Your Own Gestures

by Joe Honton

This article explores how to create a custom gesture API for use in both desktop and mobile applications.

Users interact with desktop applications using a mouse and keyboard; in contrast, they interact with mobile apps using a touchscreen. When software developers write for both desktop and mobile devices using a single codebase, they need to handle both mouse and touch events. This isn't always simple.

Treating mouse and touchscreen events as gestures, instead of raw events, is an approach that provides uniformity to an application's underlying logic. To make this happen, developers listen for low-level events sent by the browser and transform them into their own higher-level events. This is done using the DOM's built-in dispatchEvent function together with a CustomEvent object. The result is a message-based gesture API.
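
For example, a hypothetical swipeleft gesture might be broadcast and consumed like this (the event name, the detail payload, and the canvas target shown here are illustrative, not part of any standard):

const detail = { deltaX: 240, duration: 180 };
canvas.dispatchEvent(new CustomEvent('swipeleft', { detail }));

// elsewhere, the application listens for the
// high-level gesture rather than the raw events
canvas.addEventListener('swipeleft', (event) => {
    console.log(`swiped left ${event.detail.deltaX}px`);
});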

In the mobile era, users have become accustomed to several new ways of interacting with touchscreen apps. Gestures such as these have become part of the user interface vernacular: slide to scroll up/down; swipe to page next/previous; flick to jump a long distance; pinch to collapse/zoom out; spread to expand/zoom in; and long-press to select/choose.


Coming to terms with mouse and touch events

Developers looking for an easy way to listen for gestures will find no support from the browser. Gestures must be built from the underlying pointer events and mouse events APIs. Further complicating matters, those APIs are not symmetrical. Two limitations must be overcome.

First, mouse events arrive with information about the state of the special keys (Ctrl, Alt, Shift, Cmd), plus the state of the mouse buttons (left button, right button, wheel), while touch events carry no such information.

Second, mouse events always assume there is one mouse, while pointer events may occur with two or more fingers simultaneously interacting with the touchscreen.

To get started, here's a quick overview of the DOM events that we'll need to handle:

Mouse events

  • mousedown/mouseup events arrive when the left or right mouse button is depressed or released.
  • mousemove events arrive as a continuous stream while dragging or hovering over an element.

Each mouse event contains information about which mouse buttons are depressed (left button / right button / middle wheel). Each event also contains information about which keyboard special keys are depressed (Ctrl, Alt, Shift, Cmd).

Touch events

  • pointerdown/pointerup events are the touchscreen equivalent to mousedown/mouseup.
  • pointermove events are the touchscreen equivalent to mousemove.
  • pointercancel events are generated when any of these occur:
    • the device orientation switches between landscape and portrait,
    • the browser detects an accidental palm touch,
    • the Home button is pressed,
    • the element's CSS touch-action allows the browser to take direct control of a pan or zoom operation,
    • the element's CSS user-select allows the browser to initiate a "COPY|SHARE|SELECT ALL" operation,
    • the element's HTML draggable attribute allows the browser to initiate a drag'n'drop operation.

Pointer events do not contain any information about mouse buttons or keyboard special keys. Also, in contrast to mouse events, two or more finger movements may be monitored and handled simultaneously. Each new finger touching the screen is assigned a pointerId that is used to distinguish one from the other.
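
As a minimal sketch of what this entails, two or more simultaneous fingers can be tracked by keying a collection off of pointerId (the element and Map names here are illustrative; the gesture API built below uses an array of FingerPointer objects instead):

const activePointers = new Map();   // pointerId → most recent event

element.addEventListener('pointerdown', (event) => {
    activePointers.set(event.pointerId, event);
});
element.addEventListener('pointermove', (event) => {
    if (activePointers.has(event.pointerId))
        activePointers.set(event.pointerId, event);
});
element.addEventListener('pointerup', (event) => {
    activePointers.delete(event.pointerId);
});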


Gesture API architecture

The custom gesture API can be architected as an assembly of three parts: FingerPointer objects, a Gestures delegate, and an InteractionHandler.

A FingerPointer object is a data structure that holds everything known about a single finger/mouse interaction: its initial position; latest position; elapsed time; distance moved vertically, horizontally, and diagonally; direction of its movement left/right and up/down; which mouse buttons are depressed; and which special keys are depressed.

The Gestures delegate monitors the life-cycle of each finger pointer from its creation to its removal. When a finger moves or is lifted, it determines which fingers have been stationary, and which have traveled. Depending on how their geometric relationship has changed, it determines which high-level gesture message to broadcast.

The InteractionHandler sets up listeners for incoming raw mouse and touch events: pointerdown events create new FingerPointer instances; pointermove events update those instances with new position, time, distance, direction, and key state; and pointerup events remove those instances.


These are the steps a developer will need to take to recognize gestures:

  1. Capture the starting and ending position of each finger or mouse pointer.
  2. Compute the distance and direction of each pointer's movement.
  3. Calculate the geometric relationship between multiple pointers.
  4. Determine a pointer's speed using the system clock.
  5. Check whether any special touch zones should be applied.
  6. Suppress any automatic browser-generated actions.
  7. Discard any unwanted raw events.

1. Capture the starting and ending position of each finger or mouse pointer

First, the InteractionHandler is a DOM element proxy. For demonstration purposes, consider the case where the element is a full-screen <canvas>. The handler sets up raw event listeners in its constructor.

class InteractionHandler {
    constructor(canvas) {
        const gestures = new Gestures(canvas);

        canvas.addEventListener('pointerdown', (event) => {
            gestures.addFinger(event);
            gestures.sendInitialGesture();
        });
        canvas.addEventListener('pointermove', (event) => {
            gestures.updateFinger(event);
            gestures.sendIntermediateGesture();
        });
        canvas.addEventListener('pointerup', (event) => {
            gestures.updateFinger(event);
            gestures.sendFinalGesture();
            gestures.removeFinger(event);
        });
        canvas.addEventListener('pointercancel', (event) => {
            gestures.cancelFingers();
        });
        canvas.addEventListener('mousemove', (event) => {
            gestures.sendMouseHoverGesture(event);
        });
    }
}

Next, the Gestures delegate acts as the intermediary between the canvas element and the finger pointers as they appear and disappear. It contains the logic for determining which gesture message to broadcast at the initial, intermediate, and final stages of each pointer's life. The gesture logic depends heavily on which pointers have been stationary and which have traveled. Helper functions scan the array of finger pointers to make this identification.

class Gestures {
    constructor(canvas) {
        this.canvas = canvas;
        this.fingerPointers = [];
    }

    // finger pointers appearing and disappearing
    addFinger(event) { ... }
    updateFinger(event) { ... }
    removeFinger(event) { ... }
    cancelFingers() { ... }

    // determining which gesture message to broadcast
    sendInitialGesture() { ... }
    sendIntermediateGesture() { ... }
    sendFinalGesture() { ... }

    // helpers to identify stationary versus travelers
    fingerCount() { ... }
    stationaryCount() { ... }
    travelerCount() { ... }
    getStationary() { ... }
    getSecondStationary() { ... }
    getTraveler() { ... }
    getSecondTraveler() { ... }
}
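
The method bodies are elided here. As one possible sketch of the stationary-versus-traveler helpers (isStationary is a hypothetical addition; the epsilon tolerances and deltaXY distance belong to the FingerPointer object described next):

// a finger is stationary when it hasn't moved
// farther than its own width/height tolerance
isStationary(fp) {
    return fp.deltaXY <= Math.max(fp.epsilonX, fp.epsilonY);
}

stationaryCount() {
    return this.fingerPointers.filter(fp => this.isStationary(fp)).length;
}

travelerCount() {
    return this.fingerPointers.length - this.stationaryCount();
}

getTraveler() {
    return this.fingerPointers.find(fp => !this.isStationary(fp)) || null;
}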

Finally, a FingerPointer object is created whenever a pointerdown event is received. Its constructor stores the initial position of the finger or mouse. For mouse events, it also captures the state of the buttons and special keys. For finger events, it captures the dimensions of the finger (width/height) for use as a measure of tolerance when determining whether a finger has moved.

class FingerPointer {
    constructor(event) {
        this.pointerId = event.pointerId;
        this.pointerType = event.pointerType;

        // finger dimensions, used as a movement tolerance
        this.epsilonX = event.width;
        this.epsilonY = event.height;

        this.initial = {
            x: event.offsetX,
            y: event.offsetY,
            t: Date.now()
        };
        this.latest = {
            x: event.offsetX,
            y: event.offsetY,
            t: Date.now()
        };

        this.deltaT = 0;
        this.deltaX = 0;
        this.deltaY = 0;
        this.deltaXY = 0;
        this.directionX = '';
        this.directionY = '';

        // touch events carry no button information
        if (event.buttons === undefined) {
            this.leftButtonDown = false;
            this.wheelButtonDown = false;
            this.rightButtonDown = false;
        }
        else {
            this.leftButtonDown =
                (event.buttons & 0b0001) ? true : false;
            this.rightButtonDown =
                (event.buttons & 0b0010) ? true : false;
            this.wheelButtonDown =
                (event.buttons & 0b0100) ? true : false;
        }

        this.ctrlKey = !!event.ctrlKey;
        this.altKey = !!event.altKey;
        this.shiftKey = !!event.shiftKey;
    }
}

2. Compute the distance and direction of each pointer's movement

The FingerPointer object is updated with each pointermove event. This is where pointer deltas are calculated and direction of movement is determined.

calculateLatest(event) {
    this.latest.t = Date.now();
    this.latest.x = event.offsetX;
    this.latest.y = event.offsetY;

    // time difference
    this.deltaT = this.latest.t - this.initial.t;

    // distance traveled
    this.deltaX = Math.abs(this.latest.x - this.initial.x);
    this.deltaY = Math.abs(this.latest.y - this.initial.y);
    this.deltaXY = Math.hypot(this.deltaX, this.deltaY);

    // direction of movement
    this.directionX = (this.latest.x - this.initial.x >= 0) ?
        'right' : 'left';
    this.directionY = (this.latest.y - this.initial.y >= 0) ?
        'down' : 'up';
}

3. Calculate the geometric relationship between multiple pointers

Many gestures use two fingers and thus need to determine how the relationship between the two has changed from initial state to latest state. The deltaSweep computation is the change in the angle between two points over time, and is used to rotate objects on the canvas. It can be calculated like this:

// initial metrics
var initialRise = finger0.initial.y - finger1.initial.y;
var initialRun = finger0.initial.x - finger1.initial.x;
var initialTheta = Math.atan2(initialRise, initialRun);
var initialAngle = 180 - (initialTheta * 180 / Math.PI);
if (initialAngle < 0)
    initialAngle += 180;

// latest metrics
var latestRise = finger0.latest.y - finger1.latest.y;
var latestRun = finger0.latest.x - finger1.latest.x;
var latestTheta = Math.atan2(latestRise, latestRun);
var latestAngle = 180 - (latestTheta * 180 / Math.PI);
if (latestAngle < 0)
    latestAngle += 180;

// change in the angle between the two points over time
var deltaSweep = Math.abs(initialAngle - latestAngle);

When deltaSweep is above a threshold value, the user's intention is interpreted as a rotation:

  • If the initial angle is less than the latest angle a counterclockwise gesture is broadcast.
  • If the initial angle is greater than the latest angle a clockwise gesture is broadcast.
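
A sketch of that decision, running inside the Gestures delegate with an illustrative 10-degree threshold, might look like this:

const SWEEP_THRESHOLD = 10;   // degrees; an illustrative value

if (deltaSweep > SWEEP_THRESHOLD) {
    const name = (initialAngle < latestAngle) ?
        'counterclockwise' : 'clockwise';
    this.canvas.dispatchEvent(new CustomEvent(name, {
        detail: { degrees: deltaSweep }
    }));
}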

The panning functions reuse the just-computed rise and run values to observe whether two fingers have moved across the touchscreen in tandem.

var initialDistance = Math.hypot(initialRise, initialRun);
var latestDistance = Math.hypot(latestRise, latestRun);
var deltaDistance = Math.abs(latestDistance - initialDistance);

When the computed deltaDistance (the change in the distance between the two points over time) is below a threshold value, it will trigger a two-finger panning gesture:

  • When the fingers have traveled mostly in the x-direction, a horizontalpan gesture is broadcast.
  • When the fingers have traveled mostly in the y-direction, a verticalpan gesture is broadcast.

The deltaDistance can also be used to detect when fingers are moving apart — triggering a spread gesture, or moving closer together — triggering a pinch gesture.

Finally, if the deltaDistance is close to zero and their absolute positions haven't changed, a twofingertap gesture is triggered.
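
A sketch that folds these outcomes together inside the Gestures delegate (the 20-pixel threshold is illustrative):

const DISTANCE_THRESHOLD = 20;   // pixels; an illustrative value

let name = null;
if (deltaDistance > DISTANCE_THRESHOLD) {
    // fingers moving apart or together
    name = (latestDistance > initialDistance) ? 'spread' : 'pinch';
}
else if (this.travelerCount() == 2) {
    // fingers moving in tandem
    const traveler = this.getTraveler();
    name = (traveler.deltaX > traveler.deltaY) ?
        'horizontalpan' : 'verticalpan';
}
else if (this.stationaryCount() == 2) {
    // neither finger moved
    name = 'twofingertap';
}

if (name != null)
    this.canvas.dispatchEvent(new CustomEvent(name));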


4. Determine a pointer's speed using the system clock

When a single finger touches the screen and is released without any secondary finger events intervening, the time difference between the initial touch and final release can be used to differentiate gestures. This is the deltaT value.

When the finger has not traveled across the touch surface it is considered stationary and the deltaT can be used to distinguish between doubletap, tap, and press gestures.

When the finger has traveled a long distance across the touch surface in a short period of time, the horizontalflick and verticalflick gestures can be determined and broadcast.

Finally, when deltaT is greater than the flick threshold, simpler gestures can be implied:

  • When the finger has traveled mostly in the x-direction, a swipeleft or swiperight gesture is broadcast.
  • When the finger has traveled mostly in the y-direction, a scrollup or scrolldown gesture is broadcast.
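
Putting these cases together for a single finger fp, the final gesture might be chosen like this (all thresholds are illustrative, and a doubletap additionally requires remembering the timestamp of the previous tap):

// illustrative thresholds
const PRESS_MS = 500;    // stationary longer than this is a press
const FLICK_MS = 300;    // traveling faster than this is a flick
const EPSILON = 10;      // pixels of movement tolerance

let name;
if (fp.deltaXY <= EPSILON)
    name = (fp.deltaT >= PRESS_MS) ? 'press' : 'tap';
else if (fp.deltaT < FLICK_MS)
    name = (fp.deltaX > fp.deltaY) ?
        'horizontalflick' : 'verticalflick';
else if (fp.deltaX > fp.deltaY)
    name = (fp.directionX == 'left') ? 'swipeleft' : 'swiperight';
else
    name = (fp.directionY == 'up') ? 'scrollup' : 'scrolldown';

this.canvas.dispatchEvent(new CustomEvent(name));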

5. Check whether any special touch zones should be applied

Some applications may employ a user interface with hidden panels that are designed to be pulled out by the user. The user does this by touching the outer edge of the device and sliding the finger towards the center.

Applications with features like this can configure the Gestures delegate to evaluate whether or not the finger's initial position is inside one of these special zones, and if so broadcast a slidein gesture.
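
A hypothetical Gestures method for a panel docked along the left edge might look like this (the 16-pixel zone width is illustrative):

checkSlideInZone(fp) {
    const EDGE_ZONE = 16;   // pixels from the element's left edge
    if (fp.initial.x <= EDGE_ZONE && fp.directionX == 'right') {
        this.canvas.dispatchEvent(new CustomEvent('slidein', {
            detail: { edge: 'left' }
        }));
    }
}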


6. Suppress any automatic browser-generated actions

Android and iOS devices have built-in gestures that might conflict with the custom gesture API being assembled by the developer. For example, a long-press may trigger a "COPY|SHARE|SELECT ALL" popup on Android devices. When these interfere with an application's needs, they can be disabled with CSS by setting the user-select property.

canvas.style.userSelect = 'none';

Sometimes the browser may interfere with the developer's intention by automatically panning or zooming. This can be disabled with CSS by setting the touch-action property.

canvas.style.touchAction = 'none';

Links and images are draggable by default. To prevent this from interfering with custom gestures, the element's draggable attribute can be set.

canvas.draggable = false;

7. Discard any unwanted raw events

Historically, the mouse events API was the only way for developers to interact with pointing devices. To automatically support legacy applications on newer mobile devices, the pointer events API emulates the mouse API: each pointerdown message sent by the browser is followed immediately by an emulated mousedown message. The same goes for pointermove/mousemove and pointerup/mouseup. This redundant stream of emulated mouse events can trip up developers who aren't aware of it.

The way to suppress this behavior is simply to prevent the event from performing its default behavior. For example, the pointerdown event listener shown above should be:

canvas.addEventListener('pointerdown', (event) => {
    gestures.addFinger(event);
    gestures.sendInitialGesture();
    event.preventDefault();
});

In addition to the raw events already discussed, there are several related events dealing with user-interface devices. These are for hovering, hit-detection, and drag'n'drop features; they play no part in gestures. For completeness, they are identified here.

A digital tablet's pen generates these pointer events:

  • pointerover/pointerout events arrive when the pen enters and leaves the rectangular area of an element, even when one element is an ancestor of the other.
  • pointerenter/pointerleave events arrive when the pen position enters and leaves the rectangular area of an element, but not when crossing from an outer parent element to an inner child element.

The mouse API has analogous events. All of these can safely be ignored by the custom gesture API and disabled like so:

canvas.addEventListener('pointerover', (e) => e.preventDefault());
canvas.addEventListener('pointerout', (e) => e.preventDefault());
canvas.addEventListener('pointerenter', (e) => e.preventDefault());
canvas.addEventListener('pointerleave', (e) => e.preventDefault());
canvas.addEventListener('mouseover', (e) => e.preventDefault());
canvas.addEventListener('mouseout', (e) => e.preventDefault());
canvas.addEventListener('mouseenter', (e) => e.preventDefault());
canvas.addEventListener('mouseleave', (e) => e.preventDefault());

Review

Handling the raw mouse and touch events is the key to creating a gesture API.

Key points:

  • Simple gestures like tap, press, and doubletap can be recognized from a single stationary pointer.
  • Gestures like horizontalflick and verticalflick can be distinguished from swipeleft/swiperight and scrollup/scrolldown by monitoring the system clock.
  • Two-finger gestures can recognize a change in the fingers' relative distance as a pinch or spread.
  • Two fingers moving in tandem can be recognized as horizontalpan, verticalpan, or a twofingertap.
  • Two fingers with a change in the sweep angle can be recognized as a clockwise or counterclockwise gesture.

For demonstration purposes, many of these have been implemented in the gesture API used by the Simply Earth website. When viewed on the desktop, mouse plus Ctrl, Alt, and Shift combinations are used to initiate gestures. When viewed on mobile devices, two fingers are used to initiate all of the same gestures.
