Implement your Data Access Layer with Combine
Write robust and maintainable software using modern language features
We're in the home stretch now. In the penultimate chapter in our async testing odyssey, we're doing something a bit different to prepare for the grand finale.
Part II Mocking like a Pro
Part III Unit Testing with async/await
Part V (interlude) - Implement your Data Access Layer with Combine
Part VI Combine, async/await, and Unit Testing
Part V: (interlude) - Implement your Data Access Layer with Combine
Apple released Combine in 2019. The framework underpins much of the reactive nature of SwiftUI, but is also a powerful tool in its own right. We're going to take a break from testing in this chapter, and I'm going to get us all up to the same level of understanding in preparation for Part VI - Combine, async/await, and Unit Testing.
Today, we're going to:
Explore the benefits of a data access layer.
Learn all about the Repository pattern.
Gain an introduction to the basics of the Combine framework.
Finally, we'll pull these concepts together to show you how you can implement your own data access layer with Combine.
The Data Access layer
Architecture Overview
Let's revisit our nifty architecture diagram for Bev from Part I - Dependency Injection Demystified. In our modular architecture, the key layers are:
UI / Presentation Layer - to display UI and handle events
Data Access Layer - to define what data we want to get
Network Layer - to deal with how we get this data
High level architecture diagram for Bev, showing the direction of data flow from events (i.e. user actions), down through the data access layer, to a network request and the wider internet, back up the layers, to update the UI with model data
Data access as an abstraction
The Data Access Layer is an abstraction to make life easy for the UI layer above it.
The interface of the data access layer could be as simple as this:
public protocol DataAccessLayer {
    func getDataModels() async throws -> [Models]
}
The data access layer is an interface that promises, among other things, to give you some data. The key is that it doesn't tell you how it is getting the data you want - it's on a need-to-know basis, and the consumer of the API doesn't need to know.
The innards of the Data Access Layer are non-public; the nuts and bolts are an implementation detail, which the UI layer (and, perhaps, your front-end developer) is freed from worrying about.
Implementation details
Inside the Data Access Layer, however, we do care about the implementation details for fetching the data. This is where it might get interesting - there are lots of ways we might want to retrieve data in an app:
Fetch from network - makes an HTTP request to the internet; very slow (0.01 to 10 seconds)
Fetch from local persistence - reads from disk; medium speed (0.1 to 1 milliseconds)
Fetch from local cache - reads from RAM; very fast (10 to 100 nanoseconds)
Now we can realise the beauty of a Data Access Layer - we can use any and all of these approaches to fetching data, and synthesise a result which gives the consumer of our API the most up-to-date data, as quickly as possible.
We might first check our local cache for the data we want, to see if we can deliver it instantly. If it's not already here, we check our local persistence. Finally, as a last resort, we can fetch data from the network. While it's by far the slowest method, an HTTP call ensures the data returned to the user is up-to-date. We can then persist and cache the data for fast retrieval later.
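The cache-first lookup described above can be sketched roughly as follows. This is a minimal illustration rather than Bev's actual implementation: Model, ModelSource, StubSource and TieredDataAccess are hypothetical names, and writing the network result back to the cache and disk is left out for brevity.

```swift
// Hypothetical model type, for illustration only.
struct Model: Equatable {
    let id: Int
    let name: String
}

// Hypothetical abstraction over one place the data can live.
protocol ModelSource {
    /// Returns nil when this source has nothing stored.
    func fetch() async throws -> [Model]?
}

// A stand-in source that returns whatever it was created with.
struct StubSource: ModelSource {
    let models: [Model]?
    func fetch() async throws -> [Model]? { models }
}

// Tries each source from fastest to slowest and returns the first hit.
struct TieredDataAccess {
    let cache: ModelSource        // RAM
    let persistence: ModelSource  // disk
    let network: ModelSource      // HTTP

    func getDataModels() async throws -> [Model] {
        if let cached = try await cache.fetch() { return cached }
        if let stored = try await persistence.fetch() { return stored }
        // Last resort: the slowest source, but the most up to date.
        return try await network.fetch() ?? []
    }
}
```

The caller only ever sees getDataModels(); which tier answered is invisible to it, which is the point of the abstraction.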
Retrieval strategies
We can even perform all these fetches simultaneously and generate a single result to return to the UI layer. We can update our interface to allow consumers to define a retrieval strategy such as:
Returning any data as quickly as possible
Returning only the most up-to-date data from the network, but using local storage as a fallback
Returning data as fast as possible, but potentially returning more than once (e.g. returning data from the cache, then from the network once it returns)
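Here's a basic approach for adding these strategies to the interface. (A single async return value delivers only one result, so returnMultipleTimes would need a stream-like return type in practice.)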
public enum DataAccessStrategy {
    case fastestAvailable
    case upToDateWithFallback
    case returnMultipleTimes
}

public protocol DataAccessLayer {
    func getDataModels(strategy: DataAccessStrategy) async throws -> [Models]
}
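One way the returnMultipleTimes case might be expressed is with AsyncThrowingStream from the Swift standard library, which lets a caller receive the cached value first and the network value later. This is only a sketch: modelStream, its parameters and the Model type are hypothetical and not part of the interface above.

```swift
// Hypothetical model type, for illustration only.
struct Model: Equatable, Sendable {
    let id: Int
}

/// Yields the cached value first (if there is one), then the network value.
/// `cached` and `fetchFromNetwork` are assumed inputs for this sketch.
func modelStream(
    cached: [Model]?,
    fetchFromNetwork: @escaping @Sendable () async throws -> [Model]
) -> AsyncThrowingStream<[Model], Error> {
    AsyncThrowingStream { continuation in
        let task = Task {
            // Deliver whatever we already have straight away.
            if let cached = cached {
                continuation.yield(cached)
            }
            do {
                // Then deliver the fresh value once the network responds.
                continuation.yield(try await fetchFromNetwork())
                continuation.finish()
            } catch {
                continuation.finish(throwing: error)
            }
        }
        // Stop the work if the consumer stops listening.
        continuation.onTermination = { _ in task.cancel() }
    }
}
```

A consumer would iterate the stream with for try await, updating the UI each time a new array arrives.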
The Repository pattern
The Repository pattern is an approach for implementing a Data Access Layer in your app. The core idea is to manage data access logic in a centralised location. This means a Repository can collect together multiple data access layers with different underlying implementations - for instance, we might fetch from an in-memory cache, a persistence layer, or over the network.
The resulting abstraction allows our Repository interface to return an approximation to in-memory objects, which your UI layer can handle with no trouble at all.
This is only an approximation because while we would love to instantly return in-memory objects every time, sometimes things take longer than expected or go wrong.
Our interface - the API contract we are promising consumers - is hence marked async throws. async ensures the worst-case scenario for retrieval speed over the network is accounted for, allowing the Swift runtime to suspend execution at the call site, and throws ensures that consumers know they should plan to handle potential errors.
In this example we will implement both an in-memory cache and a network store, but perhaps I'll upgrade this with a persistence layer and retrieval strategies once I start digging into SwiftData.
Implementing our Repository with Combine
Combine - what you need to know
Combine is a functional reactive programming framework released by Apple in 2019, which provides a declarative API for processing values asynchronously. Much of SwiftUI uses the Combine framework as an implementation detail, most notably the @Published properties you put in your view models.
To avoid massively increasing the scope of this already quite unwieldy 6-parter, read this SwiftLee article for an overview of the basics.
The primary concept we're going to use is CurrentValueSubject, which does two things:
1. Broadcasts a notification to all its subscribers when its value is updated (just like its sibling, the PassthroughSubject)
2. Stores the most recently broadcast value.
You might already start to see how we could utilise this - #2 gives us an in-memory cache out-of-the-box!
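To make the difference between the two subjects concrete, here is a small sketch. compareSubjects is a hypothetical helper written for this illustration; the Combine types and calls themselves are standard.

```swift
import Combine

// Compares what a late subscriber sees from each kind of subject.
func compareSubjects() -> (current: [Int], passthrough: [Int]) {
    let current = CurrentValueSubject<Int, Never>(0)
    let passthrough = PassthroughSubject<Int, Never>()
    var cancellables = Set<AnyCancellable>()
    var fromCurrent: [Int] = []
    var fromPassthrough: [Int] = []

    // Sent before anyone subscribes.
    current.send(1)       // stored as the current value
    passthrough.send(1)   // dropped: no subscribers and no storage

    // CurrentValueSubject replays its stored value to each new subscriber.
    current.sink { fromCurrent.append($0) }.store(in: &cancellables)
    passthrough.sink { fromPassthrough.append($0) }.store(in: &cancellables)

    // Sent after subscribing: both subjects broadcast this.
    current.send(2)
    passthrough.send(2)

    cancellables.removeAll()  // cancel both subscriptions
    return (fromCurrent, fromPassthrough)
}
```

The CurrentValueSubject subscriber receives 1 and then 2, while the PassthroughSubject subscriber only receives 2, because the earlier value was never stored.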
Repository Interface
Let's go back to Bev. We've used Combine in the Repository module to implement our data access layer via the BeerRepository interface:
import Combine

public protocol BeerRepository {
    var beersPublisher: CurrentValueSubject<LoadingState<[Beer]>, Never> { get }
    func loadBeers() async
}
Our BeerRepository has the async loadBeers() method with which, after our journey through Part IV, we should all be very much acquainted. It also has a beersPublisher property which exposes a public getter for the CurrentValueSubject, allowing API consumers to read its current value and subscribe to its broadcast values via the sink subscriber.
One thing to appreciate here - this interface is extremely minimal. We can wrap all kinds of complex data access business logic underneath and the UI layer is none the wiser!
If we wanted to add writes so we can update Beers, we would only need to add something like write(beers: [Beer]) async throws to the interface and trust the implementation to re-publish the updated values via the CurrentValueSubject after writing.
LoadingState here is a simple wrapper enum that works like a souped-up Result type - the idle and loading cases let the API users know whether we are waiting for a value or not, as well as returning the success state with the desired values or any errors:
public enum LoadingState<T> {
    case idle
    case loading
    case success(T)
    case failure(Error)
}
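A consumer would typically switch over the four cases. The sketch below shows one possible mapping to user-facing text; statusText and its wording are hypothetical, and the enum is repeated (without the public modifier) only so the snippet compiles on its own.

```swift
import Foundation

// Repeated from the article so this snippet is self-contained.
enum LoadingState<T> {
    case idle
    case loading
    case success(T)
    case failure(Error)
}

// Hypothetical helper: turns a state into a message the UI could display.
func statusText(for state: LoadingState<[String]>) -> String {
    switch state {
    case .idle:
        return "Nothing loaded yet"
    case .loading:
        return "Loading"
    case .success(let names):
        return "Loaded \(names.count) beers"
    case .failure(let error):
        return "Failed: \(error.localizedDescription)"
    }
}
```

Because the switch is exhaustive, the compiler will flag every consumer that needs updating if a new case is ever added to LoadingState.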
Repository Implementation
Since we aren't implementing persistence or any clever retrieval strategies here, the full implementation of our BeerRepository is pretty brief:
import Combine

public final class BeerRepositoryImpl: BeerRepository {
    // 1
    public private(set) var beersPublisher = CurrentValueSubject<LoadingState<[Beer]>, Never>(.idle)

    // 2
    private let api: BeerAPI

    // 3
    public init(api: BeerAPI = BeerAPIImpl()) {
        self.api = api
    }

    // 4
    public func loadBeers() async {
        // 5
        beersPublisher.send(.loading)
        do {
            // 6
            let beers = try await api.getBeers()
            beersPublisher.send(.success(beers))
        } catch {
            // 7
            beersPublisher.send(.failure(error))
        }
    }
}
Let's step through each piece of this in turn:
1. To start with, we add the beersPublisher property to ensure protocol conformance. We mark this as public private(set) since we need it available to any API consumers outside this module, but don't want anything outside this class to replace the instance of the CurrentValueSubject we might subscribe to elsewhere. Since it holds a LoadingState, we initialize it with .idle so consumers know there's nothing to see here to start with.
2. We have an API dependency which is kept private - it's an implementation detail which consumers of our interface don't need to know about.
3. Our initialiser is public so consumers in other modules can create instances, and offers both a BeerAPIImpl() instance to use by default and the ability to override it and inject a mock version of the API dependency.
4. We complete our protocol conformance with the async loadBeers() method.
5. When loading begins, we send a .loading state to the beersPublisher. This value is broadcast to all its subscribers, enabling our UI layer to show a loading indicator while the user waits.
6. We ask our API dependency to fetch some beer data over the network. If this works, we can then send the array of Beers to our publisher, which broadcasts the values wrapped in a .success state.
7. Finally, we handle the unhappy path. Here, we're simply passing the error to the consumer of our interface to handle, again wrapped in a .failure state. The UI layer checks for these errors and handles them in a way that's helpful to the user.
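The injectable initialiser from step 3 can be exercised roughly like this. The shapes of Beer and BeerAPI are assumptions (the real Bev types are not shown in this chapter), MockBeerAPI and loadWithMock are hypothetical, and the repository is a condensed copy of the one above so that the snippet compiles on its own.

```swift
import Combine

// Assumed shapes; the real Bev types are not shown in this chapter.
struct Beer: Equatable { let name: String }
protocol BeerAPI { func getBeers() async throws -> [Beer] }
enum LoadingState<T> { case idle, loading, success(T), failure(Error) }

// Hypothetical mock: returns a canned result instead of hitting the network.
struct MockBeerAPI: BeerAPI {
    let result: Result<[Beer], Error>
    func getBeers() async throws -> [Beer] { try result.get() }
}

// Condensed copy of the repository above.
final class BeerRepositoryImpl {
    private(set) var beersPublisher = CurrentValueSubject<LoadingState<[Beer]>, Never>(.idle)
    private let api: BeerAPI

    init(api: BeerAPI) { self.api = api }

    func loadBeers() async {
        beersPublisher.send(.loading)
        do {
            let beers = try await api.getBeers()
            beersPublisher.send(.success(beers))
        } catch {
            beersPublisher.send(.failure(error))
        }
    }
}

// Injects the mock, loads, then reads the cached value from the subject.
func loadWithMock() async -> [Beer] {
    let mock = MockBeerAPI(result: .success([Beer(name: "Punk IPA")]))
    let repository = BeerRepositoryImpl(api: mock)
    await repository.loadBeers()
    if case .success(let beers) = repository.beersPublisher.value {
        return beers
    }
    return []
}
```

Swapping the canned result for a .failure value would drive the unhappy path in step 7 in the same way.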
Consuming the values
In our BeerViewModel, we have a simple subscription set up to read the values broadcast from the publisher.
import Combine
import Foundation

@MainActor
final class BeerViewModel: ObservableObject {
    // ...

    // 1
    private var cancelBag = Set<AnyCancellable>()

    // 2
    private let repository: BeerRepository

    // 3
    init(repository: BeerRepository = BeerRepositoryImpl()) {
        self.repository = repository
        setupBeerListener(on: repository)
    }

    // 4
    private func setupBeerListener(on repo: BeerRepository) {
        repo.beersPublisher
            .receive(on: RunLoop.main)
            .sink(receiveValue: { [weak self] in
                self?.handleBeer(loadingState: $0)
            })
            .store(in: &cancelBag)
    }

    // ...
}
Let's briefly go over the main moving parts here:
1. Our cancelBag collects all the subscriptions.
2. Our repository is a private property here.
3. With the same approach used in our Repository, the initializer here takes a repository as an argument to allow easy DI during our tests, and instantiates our standard BeerRepositoryImpl in the default argument. The initializer calls setupBeerListener(on: repository).
4. This private method takes the beersPublisher property on the repo and sets up a subscription to it. Values are received on the main RunLoop so that UI state is only ever updated on the main thread, then each value is passed to sink and we handle the LoadingState in another method. Finally, the subscription is stored in the cancelBag.
Advantages of this approach
The Combine-based approach is beneficial because when sharing the Repository among multiple potential consumers - that is, modules and view models in your app - the subscriptions you create allow the most up-to-date value to be broadcast everywhere it is needed automatically, every time it loads. It also allows for a logical separation of loading the data and of handling the result.
Almost as a side-effect, you get caching for free via CurrentValueSubject - but because sending the loading state overwrites the stored value, you may want to keep a separate cache that doesn't get wiped when the state changes to loading, and use a PassthroughSubject for broadcasting instead.
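One possible shape for that separate cache is sketched below. CachingBroadcaster and lastSuccess are hypothetical names, not part of Bev; the enum is repeated only so the snippet compiles on its own.

```swift
import Combine

// Repeated from the article so this snippet is self-contained.
enum LoadingState<T> {
    case idle
    case loading
    case success(T)
    case failure(Error)
}

// Hypothetical wrapper: broadcasts every state, but remembers only successes.
final class CachingBroadcaster<Value> {
    // PassthroughSubject stores nothing, so the cache lives in lastSuccess.
    let publisher = PassthroughSubject<LoadingState<Value>, Never>()
    private(set) var lastSuccess: Value?

    func send(_ state: LoadingState<Value>) {
        // Only a success replaces the cached value; loading and failure leave it alone.
        if case .success(let value) = state {
            lastSuccess = value
        }
        publisher.send(state)
    }
}
```

With this arrangement a reload can show a loading indicator while the UI keeps displaying the previously loaded values from lastSuccess.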
Conclusion
In this article I hope I've successfully evangelised the benefits of using a Data Access Layer in your own applications to separate two concerns: what data you need to get, and how to get it. You might have a better idea of how you could utilise this to make your own APIs easier to consume, as well as how to implement retrieval strategies to allow your UI layer to optimise for speed, correctness, or both. Finally, you gained a brief overview of Combine and how you might use it to implement your own Data Access Layer using the Repository pattern.
If you're reading everything sequentially, I hope this article was a nice change of pace away from testing. It was a very purposeful segue however, because the next chapter is all about how to unit test your project when using Combine and async/await together in the same codebase.
We're finally ready to bring all our learning together in Part VI - Combine, async/await, and Unit Testing.