I have been using Core Data for several years now. Although I cannot claim to have mastered it completely, I am proficient in using it and rarely make fundamental errors. Currently, my main challenge and research direction is how to integrate Core Data into popular application architectures, and make it work more smoothly in environments such as SwiftUI, TCA, Unit Tests, and Preview. In the next few articles, I will share some of my ideas, insights, experiences, and practices in this area over the past six months. I also hope to have more discussions with friends who have similar concerns.
Up to the Challenge?
Core Data is a framework with a long history. Counting from its first appearance in Mac OS X Tiger, which Apple released in 2005, Core Data itself is less than two decades old. However, much of its design was inherited from EOF (Enterprise Objects Framework), which NeXT introduced in 1994, so its core design philosophy has been around for nearly thirty years. The historical ‘NS’ prefix is still used throughout the Core Data framework.
Unlike the way Core Data is mostly used today, EOF was integrated into the application server WebObjects. In the early days of e-commerce, it attracted many large companies, such as the BBC, Dell, Disney, GE, and Merrill Lynch. Until recent years, WebObjects still powered Apple’s Apple Store and iTunes Store. It is therefore easy to understand why, unlike other popular mobile persistence solutions, Core Data does not chase data access speed at all costs and instead treats stability as its top priority. This has long been a consensus among many developers.
Perhaps because the design was far ahead of its time and the implementation was already mature, or perhaps because Apple has invested less in it in recent years, Core Data has gained the following features over the last five or six years with very few adjustments to its core code:
- NSPersistentContainer
The official implementation that wraps the coordinator, persistent store, and managed object context. Almost no need to adjust any core code.
- Persistent history tracking
This is the biggest change recently. More trigger operations have been added to the persistent store, and an API that responds to changes has been provided on the coordinator.
- Batch operations on data
Allows developers to bypass the context and perform batch operations directly on the persistent store from the coordinator.
- Core Data with CloudKit
Almost no changes to the core code were needed: NSPersistentCloudKitContainer was added, and a network synchronization module was attached to the coordinator.
- async/await support
Provides a new implementation of the perform method.
Although there have been rumors (or fantasies) among developers recently that Apple will introduce a brand-new framework to replace Core Data, a careful examination of the history and code of Core Data reveals that the possibility of a new framework appearing is very low. On the one hand, its excellent architectural design can still meet the needs of adding new features in the future; on the other hand, replacing a framework with such a long history and stable reputation requires great courage. Therefore, developers may continue to use this framework for a long time in the future.
Strictly speaking, excluding the disadvantage of being difficult to learn and master, in an ideal environment, Core Data is quite excellent in terms of stability, development efficiency, scalability, etc. (unstable network synchronization is not a problem of Core Data). It is still the best choice for managing object graphs, object lifecycles, and data persistence in the Apple ecosystem.
However, this does not mean that Core Data can fully adapt to today’s development environment. Although it still has a forward-looking mind and a robust core, its appearance is too outdated and it is difficult to match with new frameworks and new development processes. If we can create a new appearance for it, perhaps it can rejuvenate and fight again for the next decade.
Your Glory, My Annoyance
Interestingly, most of the factors that cause Core Data to not integrate well with new frameworks and development processes are some of the features or advantages that Core Data prides itself on.
Who is in charge of the data structure in Core Data?
The core of Core Data is object graph management, and persistence is just one of its accompanying functions. Compared to other frameworks, Core Data’s ability to describe and handle relationships is its core competitive advantage. Perhaps to make complex relationship logic easier to describe, developers usually create entity descriptions in Xcode’s data model editor before creating any data structures (defining the model directly in code is supported, but less common), and then generate the corresponding NSManagedObject definitions automatically or by hand. This leads to the following problems:
- To maintain compatibility with Objective-C (Core Data’s internals are still implemented in Objective-C), developers can only use a limited set of data types to describe attributes in the data model editor. This makes it hard to think about and describe a new data structure (corresponding to a Core Data entity) in the most natural Swift style from the outset, and developers unconsciously limit themselves to what the model editor can express.
- When network synchronization is used (Core Data with CloudKit), entities and attributes can only be added after the product has launched, never removed or renamed. However poorly the original entities, attributes, and relationships were named, developers have to live with them. As versions continue to iterate, these inappropriate names spread to every part of the code, making people want to cry.
- It is hard to start working on the business logic right away. When managed objects are used as the data types, the first code developers write is usually related to the Core Data stack. During development, any adjustment to a data definition has to pass through several layers (the model editor, the corresponding NSManagedObject definition, and the relevant code in the stack), which seriously hurts development efficiency.
In summary, once Core Data is used in the application, it is difficult for developers to get rid of its shadow at the initial stage of development. From the moment Core Data is imported, it has a negative impact on developers’ creativity, intuition, and enthusiasm.
The Viral-Like Managed Framework
Core Data’s managed mechanism has existed since the EOF era. This mechanism allows Core Data to expose data from the underlying data source as a managed graph of persistent objects (memory data objects), and modify and track the object graph through managed contexts. The lazy loading capability provided by the managed mechanism can help developers balance between reading efficiency and memory usage. It can be said that having a managed mechanism is a long-standing proud feature of Core Data.
However, the managed mechanism means that developers need to build a compliant managed environment before performing any operation. Operating on managed objects requires a managed object context, and a context only works once a persistent store coordinator and a persistent store have been created.
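The chain of prerequisites can be sketched as follows. This is a minimal illustration rather than code from the original project: the `Item` entity, its `name` attribute, and the helper functions are hypothetical, and an in-memory store is assumed so that nothing is written to disk.

```swift
import CoreData

// Hypothetical entity "Item" with one string attribute, defined in code
// so the sketch does not depend on an .xcdatamodeld file.
func makeModel() -> NSManagedObjectModel {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = false

    let item = NSEntityDescription()
    item.name = "Item"
    item.properties = [name]

    let model = NSManagedObjectModel()
    model.entities = [item]
    return model
}

// Model -> coordinator -> persistent store -> context:
// every step has to exist before a single managed object can be created.
func makeInMemoryContext() throws -> NSManagedObjectContext {
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: makeModel())
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil,
                                           at: nil,
                                           options: nil)
    let context = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    context.persistentStoreCoordinator = coordinator
    return context
}

// Only now can a managed object be inserted and saved.
func insertItem(named name: String, in context: NSManagedObjectContext) throws {
    var failure: Error?
    context.performAndWait {
        let item = NSEntityDescription.insertNewObject(forEntityName: "Item", into: context)
        item.setValue(name, forKey: "name")
        do { try context.save() } catch { failure = error }
    }
    if let failure = failure { throw failure }
}
```

Even this stripped-down version needs three cooperating objects before the first `Item` exists, which is the setup cost the paragraph above describes.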
Besides being complex to create, the managed environment is not always stable. In fact, the Core Data managed environment is currently one of the main reasons SwiftUI previews fail. Preparing and resetting the managed environment also slows down unit tests, which reduces developers’ willingness to write them. As a result, it seriously undermines developers’ enthusiasm for adopting modular (SPM) development in their applications.
If the R0 value of Omicron BA.4/5 is 18.6, then the basic reproduction number of code that touches managed objects is ∞: once the managed mechanism is involved, it cannot be shaken off.
Thread Binding and Sendable
Although Core Data’s managed objects are not thread-safe, multi-threaded development with Core Data is safe as long as you strictly follow the usage conventions (operating on managed objects only on the queue of the managed context they belong to). Some developers find multi-threaded development with Core Data tedious, but it is undeniable that, compared to similar frameworks, it provides a high level of stability.
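A minimal sketch of that convention, assuming a hypothetical `Note` entity defined in code and an in-memory store (none of these names come from the original project): each context touches its objects only inside `performAndWait`, and only the `NSManagedObjectID` travels between contexts.

```swift
import CoreData

// Builds a container with a hypothetical "Note" entity and an in-memory store.
func makeContainer() -> NSPersistentContainer {
    let title = NSAttributeDescription()
    title.name = "title"
    title.attributeType = .stringAttributeType
    title.isOptional = true

    let note = NSEntityDescription()
    note.name = "Note"
    note.properties = [title]

    let model = NSManagedObjectModel()
    model.entities = [note]

    let container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
    let description = NSPersistentStoreDescription()
    description.type = NSInMemoryStoreType
    container.persistentStoreDescriptions = [description]
    // Stores load synchronously unless shouldAddStoreAsynchronously is set.
    container.loadPersistentStores { _, error in
        if let error = error { fatalError("Store failed to load: \(error)") }
    }
    return container
}

// Writes on one context, reads on another, and returns the title that was read.
func roundTripTitle() -> String? {
    let container = makeContainer()

    // The writer creates and saves the object on its own queue.
    let writer = container.newBackgroundContext()
    var noteID: NSManagedObjectID?
    writer.performAndWait {
        let note = NSEntityDescription.insertNewObject(forEntityName: "Note", into: writer)
        note.setValue("Hello", forKey: "title")
        try? writer.save()
        noteID = note.objectID  // Only the ID leaves this closure, never the object.
    }

    // The reader materializes its own instance from the ID, again on its own queue.
    let reader = container.newBackgroundContext()
    var title: String?
    reader.performAndWait {
        if let id = noteID, let note = try? reader.existingObject(with: id) {
            title = note.value(forKey: "title") as? String
        }
    }
    return title
}
```

The managed object itself never crosses a queue boundary. That is what keeps the approach safe, and also what makes it feel tedious.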
With the asynchronous and concurrency features introduced in Swift 5.5, developers will inevitably use the new mechanisms in their code. For example, TCA’s Reducer is currently moving toward global actors (that is, the Reducer will no longer necessarily run on the main thread). To avoid thread safety issues, making data conform to the Sendable protocol is an effective approach.
Obviously, managed objects are in no position to conform to the Sendable protocol. How to make Core Data work with frameworks that use the new concurrency mechanisms is another new challenge facing developers.
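To make the contrast concrete, here is a small sketch of what Sendable-friendly data looks like. The `NoteSnapshot` and `NoteCache` types and the `requireSendable` helper are hypothetical illustrations, not part of Core Data, TCA, or the approach developed later in this series.

```swift
import Foundation

// A plain value type: every stored property is itself Sendable,
// so the compiler can verify the conformance.
struct NoteSnapshot: Sendable, Equatable {
    let id: UUID
    var title: String
}

// Accepts only values that are safe to pass across concurrency domains.
// An NSManagedObject would not qualify: it is a mutable reference type
// bound to the queue of its context.
func requireSendable<T: Sendable>(_ value: T) -> T {
    value
}

// An actor can receive and hand out snapshots freely, because they are copied values.
actor NoteCache {
    private var storage: [UUID: NoteSnapshot] = [:]

    func store(_ note: NoteSnapshot) {
        storage[note.id] = note
    }

    func title(for id: UUID) -> String? {
        storage[id]?.title
    }
}
```

A snapshot like this gives up lazy loading and change tracking, which is why simply copying everything into structs is not a complete answer either.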
My Longed-for Usage
Although it is a bit greedy, I still hope to have both the fish and the bear’s paw, that is, to have my cake and eat it too. Over the next several articles, we will explore how to achieve the following goals:
- Minimize the impact of Core Data on the data definition process (especially in the early development stage)
- After switching the data source to Core Data, no need to modify the existing code
- In preview and unit testing phases, no longer be disturbed by the managed environment, so we can easily modularize the code management
- Retain Core Data’s lazy loading mechanism to avoid excessive memory occupation
- Be compatible with the new concurrency mechanisms, finding the greatest common ground with Sendable
- Achieve the above goals with as little code as possible and avoid making the system less stable
A Glimpse Into the Next Article
In the next article, we will start with the definition of data (corresponding to entities and properties in Core Data) and try to remove the managed environment from the definition through generics, type erasure and other methods.