Timeline
1989 — NeXTSTEP Generation: Display PostScript
2001 — Apple’s Renaissance: Quartz & OpenGL
2007 — The Modern Era: Core Animation
2014 — The Performance Age: Metal
2019 — The Declarative Revolution: SwiftUI
2014 — The Performance Age: Metal
Fast-forward another seven years.
Steve Jobs is dead. iPhone has eaten the world.
Core Animation — and the UI framework built on top of it, UIKit — has spawned app giants like WhatsApp, Uber, and Candy Crush. For the engineers crafting cutting-edge mobile games, however, a problem had been creeping in.
What’s wrong with Core Animation?
The high-level abstractions provided by Core Animation made it trivial to get started and build something great, but abstraction has a cost. Inherently, when you aren’t micromanaging every clock cycle, you’re trading efficiency for ergonomics.
Core Animation is phenomenal for cookie-cutter tasks like animating the opacity of UIButtons or dragging UITableView cells about. But what if you’re rendering a space shooter, a breakneck racing game, or a fully immersive farming simulator?
OpenGL ES (Open Graphics Library for Embedded Systems) was the standard for high-performance graphics on iOS: a low-level 3D graphics API that gave fine-grained control over exactly what you were rendering to the screen.
OpenGL ES sounds great. What’s the problem?
OpenGL ES was a third-party library. This meant the software interfaced with the graphics hardware — the GPU — through a driver: software that exposes an interface to hardware via low-level instructions. These graphics drivers themselves had to run on the CPU.
In the early 2010s, graphics chips were improving at a breathtaking clip, and their performance was finally starting to overtake that of CPUs. This led to a situation where the OpenGL ES driver, running on comparatively sluggish CPU hardware, bottlenecked the performance of graphics code running on the GPU.
Graphics engineers — the boffins building graphics engines like Unreal Engine or Unity — needed to get closer to the hardware. Closer to the Metal.
Key concepts of Metal
Before we start coding, let’s ensure we’re all on the same page with a brief conceptual overview of how the Metal framework actually works — it’ll be a little overwhelming if you’ve never worked with graphics pipelines before, and that’s okay.
Shaders are small functions that run on the GPU to handle rendering. The name is historical — back in prehistoric OpenGL days, they just controlled the shading of vector shapes.
Vertex shaders transform 3D coordinates in vector space to 2D coordinates, which can map nicely to a plane — such as your screen.
Fragment shaders, a.k.a. pixel shaders, color the individual pixels on a screen when a shape is rasterised — that is, transformed from a 2D vector image to dots on a screen.
Metal Shading Language is the language we use to write these shaders. It’s based on C++11 — with several language features trimmed down and some graphics-focused features mixed in.
- MTLDevice is an abstraction in the Metal framework representing the hallowed hardware of the GPU itself.
- MTLCommandBuffer is the place where drawing instructions are stored before they’re piped over to the hardware for massively parallelised execution.
- MTLCommandQueue manages these command buffers while maintaining order of execution.
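To see how these three fit together, here’s a minimal sketch (separate from the sample project we build below) of the journey from device to committed command buffer:

import Metal

// The GPU itself.
guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("Metal is not supported on this device")
}

// Created once and kept around; the queue guarantees its
// command buffers execute in the order they're committed.
let commandQueue = device.makeCommandQueue()!

// Created fresh for each batch of work (typically once per frame).
let commandBuffer = commandQueue.makeCommandBuffer()!

// ... encode drawing instructions into the command buffer here ...

// Send the batched instructions off to the GPU for execution.
commandBuffer.commit()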
Does your brain hurt?
Great, that means you’re learning!
Let’s write some code
Since we’re finally in ✨ the future ✨, we can create an iOS app written in Swift. Code along with me, or check out the sample project yourself.
While I really wanted to write this section in Swift 1.0, which was the state of the art in 2014, it isn’t possible to get a build of Xcode that both (1) contains the Swift 1.0 runtime and (2) runs on macOS Ventura.
I could argue that I want to let you run the projects for yourself, but frankly, I’m just a bit lazy.
As you’ll know the drill by now, let’s start by adding the Metal and MetalKit frameworks to the Xcode project:
Now we come to the main event — adding a Shaders.metal file to your project. This is the C++-based Metal Shading Language I mentioned before. It’s pretty terse, packing a lot of semantics into a few lines.
#include <metal_stdlib>
using namespace metal;

// Vertex shader: look up each vertex by its ID and pass it through untransformed.
vertex float4 vertex_main(constant float4* vertices [[buffer(0)]], uint vid [[vertex_id]]) {
    return vertices[vid];
}

// Fragment shader: color every pixel opaque red (RGBA).
fragment float4 fragment_main() {
    return float4(1, 0, 0, 1);
}
- metal_stdlib contains the helper functions and types that turn out useful for writing shaders.
- using namespace metal saves us from needing to prefix everything from the framework with the metal:: namespace.
- The vertex_main function is prefixed vertex float4, denoting it as the vertex shader, which maps 3D coordinates to a screen and returns a 4D vector — that is, four numbers. The [[buffer(0)]] attribute tells the GPU where to look for the vertex data it wants to draw.
- The fragment_main function is the fragment shader that colors the pixels on-screen — here, it simply returns a red color in RGBA format.
The Swift APIs for Metal
You can relax a little. We’re coming back to more familiar-looking Swift APIs for now. First, we set up MetalView.swift:
import MetalKit

final class MetalView: MTKView {
    var commandQueue: MTLCommandQueue!
    var pipelineState: MTLRenderPipelineState!

    override init(frame frameRect: CGRect, device: MTLDevice?) {
        super.init(frame: frameRect, device: device)
        // The command queue schedules and orders work for the GPU.
        commandQueue = device?.makeCommandQueue()
        createRenderPipelineState()
        // A light grey backdrop behind anything we render.
        clearColor = MTLClearColor(red: 0.8, green: 0.8, blue: 0.8, alpha: 1.0)
    }

    required init(coder: NSCoder) {
        fatalError()
    }

    // ...
An MTKView is initialised with a simple rectangular frame and an MTLDevice, which, as I mentioned earlier, is an object that represents the GPU.
Instantiating this view is dead easy now that we’re working with UIKit — you can add this view into the project template’s ViewController.swift:
import UIKit
import MetalKit

final class ViewController: UIViewController {
    var metalView: MetalView!

    override func viewDidLoad() {
        super.viewDidLoad()
        guard let metalDevice = MTLCreateSystemDefaultDevice() else {
            fatalError("Metal is not supported on this device")
        }
        metalView = MetalView(frame: view.bounds, device: metalDevice)
        view.addSubview(metalView)
    }
}
The MTLCreateSystemDefaultDevice function in the Metal framework locates the GPU hardware and returns a metalDevice object representing it. With it, we initialise our MetalView and set up our rendering pipeline.
To get our Metal rendering kicked off, we need to fill out two more non-trivial components in our MetalView: createRenderPipelineState() and draw(_ rect: CGRect).
First, we’ll set up the render pipeline:
func createRenderPipelineState() {
    // The default library contains every .metal file compiled into the app bundle.
    let library = device?.makeDefaultLibrary()
    let vertexFunction = library?.makeFunction(name: "vertex_main")
    let fragmentFunction = library?.makeFunction(name: "fragment_main")

    let pipelineDescriptor = MTLRenderPipelineDescriptor()
    pipelineDescriptor.vertexFunction = vertexFunction
    pipelineDescriptor.fragmentFunction = fragmentFunction
    // Match the pixel format of the view's drawable.
    pipelineDescriptor.colorAttachments[0].pixelFormat = .bgra8Unorm
    pipelineState = try? device?.makeRenderPipelineState(descriptor: pipelineDescriptor)
}
Here, we tell the GPU which functions from Shaders.metal to use for the vertex and fragment shaders. We then set up a rendering pipeline that takes vertices and colors and processes them into on-screen pixels.
While there’s a lot to cover here, this is a pretty simple example — high-end Metal game engines will also process textures, lighting, tessellation, morphing, visual effects, anti-aliasing, and more in their rendering pipelines.
Finally, we override draw(_ rect: CGRect) to actually set up our vector drawing.
override func draw(_ rect: CGRect) {
    // Bail out if the view has nothing to render into yet.
    guard let drawable = currentDrawable,
          let renderPassDescriptor = currentRenderPassDescriptor else { return }

    let commandBuffer = commandQueue.makeCommandBuffer()!
    let renderEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: renderPassDescriptor)!
    renderEncoder.setRenderPipelineState(pipelineState)

    // Three corners of a triangle, in normalised device coordinates.
    let vertices: [SIMD4<Float>] = [
        [-0.8, -0.4, 0.0, 1.0],
        [ 0.8, -0.4, 0.0, 1.0],
        [ 0.0,  0.4, 0.0, 1.0]
    ]
    renderEncoder.setVertexBytes(vertices, length: vertices.count * MemoryLayout<SIMD4<Float>>.size, index: 0)
    renderEncoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 3)
    renderEncoder.endEncoding()

    commandBuffer.present(drawable)
    commandBuffer.commit()
}
This does a few things, but primarily the following:
1. Ensures there is a drawable surface to render on (usually the screen).
2. Creates a new command buffer on which to place commands for the GPU.
3. Sets up an array of 4D vectors to represent the vertices of a triangle. This matches our vertex shader’s argument in Shaders.metal.
4. Tells the render encoder to draw a .triangle based on the vertices given — if you paid attention earlier, this works pretty similarly to our OpenGL rendering process with glBegin(GL_TRIANGLES).
5. Finally, tells the buffer we’re done and sends our instructions to the GPU for execution with commit.
This tutorial might have been a little too in-depth, but hey, after 4,500 words, we’re committed now. And look at this incredible result!
That’s right, folks. A red triangle on a grey background.
If you get the impression that we’re taking a step back in terms of abstraction, you’re correct — since Metal is a lower-level API than Core Animation, we have to spell out far more explicitly what we want the hardware to do, one basic, imperative operation at a time.
By contrast, the declarative syntax of Core Animation states what we want to do and trusts the system to render the state changes in the way it sees fit.
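To make the contrast concrete, here’s roughly what a Core Animation-backed fade looks like through UIKit (someView stands in for any UIView you like): we declare the end state, and the system works out the frames.

// Declarative: state the destination and let the system
// handle interpolation, frame pacing, and rendering.
UIView.animate(withDuration: 0.5) {
    someView.alpha = 0.0
}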
Let’s put the pedal to the Metal
…and introduce some animation.
First, we update the shader functions in Shaders.metal to handle time in addition to the 4D vertex. This spruced-up version will modulate its color along a series of synchronised sine waves.
#include <metal_stdlib>
using namespace metal;

// Carries data from the vertex shader through to the fragment shader.
struct VertexOut {
    float4 position [[position]];
    float time;
};

vertex VertexOut vertex_main(constant float4* vertices [[buffer(0)]], uint vid [[vertex_id]], constant float &time [[buffer(1)]]) {
    VertexOut out;
    out.position = vertices[vid];
    out.time = time; // forward the time for the fragment shader to use
    return out;
}

fragment float4 fragment_main(VertexOut in [[stage_in]]) {
    // Offset sine waves, remapped from [-1, 1] to [0, 1], drive each channel.
    float r = sin(in.time) * 0.5 + 0.5;
    float g = sin(in.time + 2.0) * 0.5 + 0.5;
    float b = sin(in.time + 4.0) * 0.5 + 0.5;
    return float4(r, g, b, 1.0);
}
Let’s also update our MetalView.swift to send the time argument to these shaders. Update the properties and the initializer:
final class MetalView: MTKView {
    var commandQueue: MTLCommandQueue!
    var pipelineState: MTLRenderPipelineState!
    var displayLink: CADisplayLink!
    var startTime: CFTimeInterval?
    var time: Float = 0.0

    override init(frame frameRect: CGRect, device: MTLDevice?) {
        super.init(frame: frameRect, device: device)
        commandQueue = device?.makeCommandQueue()
        createRenderPipelineState()
        clearColor = MTLClearColor(red: 0.8, green: 0.8, blue: 0.8, alpha: 1.0)
        // Call update(displayLink:) every time the screen refreshes.
        displayLink = CADisplayLink(target: self, selector: #selector(update))
        displayLink.add(to: .current, forMode: .default)
    }

    // ...
Here, we’re revisiting our old friend from 2001, CADisplayLink, with a slightly more cuddly API this time. The new update method advances our time value and flags the view for redraw each time the screen wants to refresh:
@objc func update(displayLink: CADisplayLink) {
    // Anchor the animation to the timestamp of the first frame.
    if startTime == nil {
        startTime = displayLink.timestamp
    }
    let elapsed = displayLink.timestamp - startTime!
    time = Float(elapsed)
    self.setNeedsDisplay()
}
The boilerplate we wrote to set up our rendering pipeline in the createRenderPipelineState method is entirely unchanged — and the draw logic doesn’t need much wrangling:
override func draw(_ rect: CGRect) {
    guard let drawable = currentDrawable,
          let renderPassDescriptor = currentRenderPassDescriptor else { return }

    let commandBuffer = commandQueue.makeCommandBuffer()!
    let renderEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: renderPassDescriptor)!
    renderEncoder.setRenderPipelineState(pipelineState)

    let vertices: [SIMD4<Float>] = [
        [-0.8, -0.4, 0.0, 1.0],
        [ 0.8, -0.4, 0.0, 1.0],
        [ 0.0,  0.4, 0.0, 1.0]
    ]
    renderEncoder.setVertexBytes(vertices, length: vertices.count * MemoryLayout<SIMD4<Float>>.size, index: 0)
    // New: hand the current time to the shaders via buffer slot 1.
    renderEncoder.setVertexBytes(&time, length: MemoryLayout<Float>.size, index: 1)
    renderEncoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 3)
    renderEncoder.endEncoding()

    commandBuffer.present(drawable)
    commandBuffer.commit()
}
The only real difference here is the new renderEncoder.setVertexBytes call with the &time argument, which hands the current time to the vertex shader and, via VertexOut, on to the fragment shader that modulates the color.
The result? A bona-fide radioactive Dorito. The GPU doesn’t break a sweat.
Going further
Now that we have our CADisplayLink logic configured in our MetalView, and send a time reference to the shader via renderEncoder.setVertexBytes(&time, length: MemoryLayout<Float>.size, index: 1), we can implement more animation simply by updating the code for the vertex shader:
// Builds a matrix that rotates by the given angle around the given axis
// (the classic axis-angle formulation).
float4x4 rotationMatrix(float angle, float3 axis) {
    float c = cos(angle);
    float s = sin(angle);
    float3 normalized = normalize(axis);
    float3 temp = (1.0 - c) * normalized;
    float4x4 rotation = {
        {c + temp.x * normalized.x, temp.x * normalized.y + s * normalized.z, temp.x * normalized.z - s * normalized.y, 0.0},
        {temp.y * normalized.x - s * normalized.z, c + temp.y * normalized.y, temp.y * normalized.z + s * normalized.x, 0.0},
        {temp.z * normalized.x + s * normalized.y, temp.z * normalized.y - s * normalized.x, c + temp.z * normalized.z, 0.0},
        {0.0, 0.0, 0.0, 1.0}
    };
    return rotation;
}

vertex VertexOut vertex_main(constant float4* vertices [[buffer(0)]], uint vid [[vertex_id]], constant float &time [[buffer(1)]]) {
    VertexOut out;
    // Rotate around the z-axis by an angle that grows with time.
    float4x4 rotation = rotationMatrix(time, float3(0.0, 0.0, 1.0));
    out.position = rotation * vertices[vid];
    out.time = time;
    return out;
}
Here, we’re creating a rotation matrix that varies with time to rotate the triangle around the z-axis. Since the vertex coordinates live in normalised device space, which is stretched to fit the screen’s aspect ratio, the triangle warps as it spins, resulting in a surprisingly cool acid trip.
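If you’d prefer the Dorito to spin rigidly, one hypothetical fix (not part of the sample project) is to hand the view’s aspect ratio to the shader in a third buffer slot, via renderEncoder.setVertexBytes(&aspect, length: MemoryLayout<Float>.size, index: 2) on the Swift side, and undo the horizontal stretch after rotating:

vertex VertexOut vertex_main(constant float4* vertices [[buffer(0)]],
                             uint vid [[vertex_id]],
                             constant float &time [[buffer(1)]],
                             constant float &aspect [[buffer(2)]]) {
    VertexOut out;
    float4x4 rotation = rotationMatrix(time, float3(0.0, 0.0, 1.0));
    float4 position = rotation * vertices[vid];
    // aspect = drawable width / height (an assumed new uniform). Dividing x
    // compensates for normalised device coordinates stretching to fill the screen.
    position.x /= aspect;
    out.position = position;
    out.time = time;
    return out;
}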
Remember, readers — being an expert at 3D graphics frameworks such as Metal, OpenGL ES, or Vulkan is generally not required to build great iOS apps. Most 3D games don’t even use these APIs directly. The sacred knowledge of Metal is most often the preserve of game engine creators (and, more recently, machine learning engineers).
If you weren’t familiar with Metal before, the esoteric nature of this segment is a testament to the achievements of Core Animation (and, to a lesser extent, Quartz): the separation of low-level graphics expertise from app development. You don’t need this knowledge today to create amazing software, but you might be building on top of these tools without knowing it.
After our odyssey through the history of Apple’s graphics and animation APIs, it’s only fitting that we look to the future. I speak, of course…