Ask Apple 2022: Q&A Related to Core Data (Part 1)

Ask Apple gives developers an opportunity to communicate directly with Apple engineers outside of WWDC. This article compiles some of the Core Data Q&A from the event, along with some of my own insights. This is Part 1.

Q&A

Can photos be stored in Core Data?

Q: Hi, I saw some websites suggesting that Core Data should not be used to save photos, but maybe they didn’t notice the “use external storage” option? I’m developing an application where users might take photos about once a week. Which is more appropriate: saving to Core Data or saving to a directory? I don’t want to save to the photo library because users might not want others to easily view these photos.

A: Using external storage in Core Data is possible. You can also store a URL in Core Data and manage the file yourself. If you plan to pass the URL to other frameworks, such as a media player, then you should use the latter method.

With “Allows External Storage” enabled, binary data can still be read efficiently. Core Data saves data larger than a certain size (about 100 KB) as a file in the file system and stores only the file name in the BLOB field. The files live in a hidden directory (_EXTERNAL_DATA) at the same level as the SQLite database. Unfortunately, Core Data provides no API that returns the URLs of these files (or a way to access a BLOB through a URL), so when data has to be passed along as a URL, it must first be written to a temporary directory. Whether to save the data in Core Data therefore depends on your use case. For applications that need synchronization, if you store only the URL in Core Data and keep the data in a directory, you have to implement synchronization of the external data yourself.
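As a minimal sketch of the temporary-file step described above: the helper below writes a binary attribute value to a temporary file and returns its URL. The function name and the fileExtension parameter are hypothetical and not part of Core Data.

```swift
import Foundation

/// Writes binary data (for example, a photo read from a Core Data attribute)
/// to a uniquely named file in the temporary directory and returns its URL.
/// `fileExtension` is a hypothetical parameter; pick what matches your data.
func temporaryFileURL(for data: Data, fileExtension: String) throws -> URL {
    let url = FileManager.default.temporaryDirectory
        .appendingPathComponent(UUID().uuidString)
        .appendingPathExtension(fileExtension)
    try data.write(to: url, options: .atomic)
    return url
}
```

The caller is responsible for removing the file once the other framework no longer needs it.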

Will switching iCloud clear local data?

Q: In the case of using Core Data with CloudKit, when a user logs out of their iCloud account on their device, NSPersistentCloudKitContainer will receive instructions to delete local data. Is this intentional?

A: Yes. NSPersistentCloudKitContainer enforces strict binding between iCloud account and stored data.

In the article “Switching Core Data Cloud Sync Status in Real-Time”, I introduced an experimental method that can retain this data in some cases. Still, it is best to stick to Core Data's intended design. Given the strong binding between the account and the data, and to save users’ backup space, consider setting the isExcludedFromBackup resource value of the Core Data SQLite file to true (which excludes the file from backups), so that the same data is not backed up twice.
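Excluding a file from backup means setting its isExcludedFromBackup resource value to true. A minimal sketch, assuming you already know the URL of the store file (the function name is hypothetical):

```swift
import Foundation

/// Marks the file at `fileURL` as excluded from device backups.
func excludeFromBackup(_ fileURL: URL) throws {
    var url = fileURL                    // setResourceValues(_:) is mutating
    var values = URLResourceValues()
    values.isExcludedFromBackup = true   // true means "do not back up"
    try url.setResourceValues(values)
}
```

A SQLite store is usually accompanied by -wal and -shm files. Whether they need the same treatment is something to verify for your own setup.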

How to Disable/Enable Network Sync

Q: Is there a recommended way for applications to allow users to disable/enable CloudKit storage?

A: No, there isn’t. Users can modify an app’s iCloud sync options from the app’s settings/system settings. You can create an NSPersistentCloudKitContainer without NSPersistentCloudKitContainerOptions, and it won’t sync. However, NSPersistentCloudKitContainer binds data from iCloud to persistent storage files. There’s no way to tell NSPersistentCloudKitContainer to keep local data after the account disappears (which happens when users disable iCloud sync for that app).

When using a single container, developers can store a flag in UserDefaults and, based on it, decide whether to set cloudKitContainerOptions when the container is created. This controls whether network synchronization is enabled at the next cold start.
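The idea can be sketched roughly as follows. The UserDefaults key "cloudSyncEnabled" and the function name are assumptions made for illustration. The sketch only configures the container; the caller still has to call loadPersistentStores.

```swift
import CoreData

/// Builds a container whose sync behavior depends on a UserDefaults flag.
/// "cloudSyncEnabled" is a hypothetical key chosen for this example.
func makeContainer(name: String,
                   model: NSManagedObjectModel,
                   defaults: UserDefaults = .standard) -> NSPersistentCloudKitContainer {
    let container = NSPersistentCloudKitContainer(name: name, managedObjectModel: model)
    if let description = container.persistentStoreDescriptions.first {
        // Keep history tracking on in both modes so the store is opened
        // with the same option whether or not sync is enabled.
        description.setOption(true as NSNumber, forKey: NSPersistentHistoryTrackingKey)
        if !defaults.bool(forKey: "cloudSyncEnabled") {
            // Without CloudKit options the container does not sync.
            description.cloudKitContainerOptions = nil
        }
    }
    return container
}
```

Because the flag is read only when the container is created, a change made by the user takes effect at the next cold start.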

How to handle Container creation failure

Q: What is the elegant way to handle errors in the container.loadPersistentStores closure? The Apple template (Core Data template provided by Xcode) has a fatalError and suggests that it should not be used in production, but if my Core Data Stack is not properly instantiated, my users cannot do anything with my application.

A: Typically, these errors are caused by untested schema migrations, incorrect file protection levels, insufficient disk space, and similar reasons. In these cases, recovery steps should be taken to make the application usable again. Another approach is to present UI that tells the user there is a problem and that a reset is required. Our application templates cannot create good UI for your application, and presenting that UI is pretty much what needs to be done in this closure.

In SwiftUI, we typically use the environment to inject the view context into the view tree. Once loadPersistentStores encounters an error that prevents the container from being created properly, injecting the context fails and the app cannot enter its UI. To handle this, check the state of the container in the main view (or in the root view that uses Core Data features) and switch to a failure prompt when loading failed (usually by updating a state value in the loadPersistentStores closure).
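One possible shape for this, as a sketch under stated assumptions: CoreDataStack, its State enum, and RootView are hypothetical names, and the sketch assumes a container with a single store (with several stores, the closure runs once per store and a later success would overwrite an earlier failure).

```swift
import CoreData
import SwiftUI

final class CoreDataStack: ObservableObject {
    enum State {
        case loading
        case ready
        case failed(Error)
    }

    @Published private(set) var state: State = .loading
    let container: NSPersistentContainer

    init(container: NSPersistentContainer) {
        self.container = container
        container.loadPersistentStores { [weak self] _, error in
            // Publish on the main queue because views observe `state`.
            DispatchQueue.main.async {
                if let error = error {
                    self?.state = .failed(error)
                } else {
                    self?.state = .ready
                }
            }
        }
    }
}

struct RootView<Content: View>: View {
    @ObservedObject var stack: CoreDataStack
    let content: () -> Content

    var body: some View {
        switch stack.state {
        case .loading:
            ProgressView()
        case .ready:
            // Inject the context only after the store has loaded.
            content().environment(\.managedObjectContext, stack.container.viewContext)
        case .failed(let error):
            Text("The database could not be opened: \(error.localizedDescription)")
        }
    }
}
```

The failure branch is where a real app would offer recovery steps, such as a reset.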

Error with Shared Data

Q: My question is about Core Data with CloudKit. I have successfully used NSPersistentCloudKitContainer to synchronize data across the user’s devices, but I’m having much less luck with sharing data. I’ve looked at two related sample projects and have gotten as far as creating a new share, but any attempt to manage an existing share (i.e. adding people) seems to always fail. I’m seeing some mysterious messages in the console, such as “Error creating CFMessagePort needed to communicate with PPT”. Sharing works as long as the CKShare doesn’t exist yet (hurray!), but if I try to share again once the CKShare already exists, it just shows the spinning wheel forever. This happens both with UICloudSharingController and with the newer ShareLink/CKShareTransferRepresentation versions. I’ve seen similar issues in the sample code. My question is: are there known issues with this usage? Anything special to keep in mind?

A: Please use sysdiagnose to submit feedback reports and the storage files of the affected devices.

You’re not alone. We’ve had a lot of trouble with CKShare and NSPersistentCloudKitContainer as well. For example, sharing URL instances from Transferable structures doesn’t work at all. ShareLink just shows an empty popup window (another developer’s complaint).

Unfortunately, Apple has not improved data sharing much since adding it to Core Data with CloudKit, and the current experience is not satisfactory. To learn how to share data and understand its current limitations, please read the article “Core Data with CloudKit: Sharing Data in the iCloud.”

Is there a maximum sync size or quantity limit?

Q: Is there a maximum sync size limit for Core Data with CloudKit? I tried it in an application that has over 30,000 records, but they couldn’t sync from Mac (development mode) to iPhone (development mode).

A: It’s hard to determine without more details. NSPersistentCloudKitContainer and CloudKit can support data sets a couple of orders of magnitude larger than that, subject to some limits (like device storage).

In theory, the number and size that can be synced depend only on the user’s available iCloud capacity. In some cases, developers may need to manually enable the app’s iCloud sync option on macOS (especially during development), otherwise it won’t sync with other devices.

How to reset local data

Q: Imagine that Core Data is using NSPersistentCloudKitContainer to sync my application data across all devices. If one of the devices experiences some kind of failure and needs to reset its data from the cloud (and there is data for that device), is there any way in my application to reset the local cached copy of the data to pretend it is a new device and have CoreData retrieve all data from the cloud again?

A: Use the destroyPersistentStore(at:type:options:) method of NSPersistentStoreCoordinator to completely destroy the local database.

After destroying the database, you also need to create a new database locally. This method is safer than developers using file management to delete SQLite data. In addition, database migration can also be implemented through the migratePersistentStore(_:to:options:type:) method of NSPersistentStoreCoordinator.
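A minimal sketch of the destroy-then-recreate sequence, assuming the container loads its stores synchronously (the default). The function name is hypothetical, and the sketch uses the older destroyPersistentStore(at:ofType:options:) spelling of the API, which takes the store type as a string.

```swift
import CoreData

/// Destroys every loaded store of `container` and loads fresh ones
/// from the container's existing store descriptions.
func resetLocalStore(of container: NSPersistentContainer) throws {
    let coordinator = container.persistentStoreCoordinator
    for store in coordinator.persistentStores {
        guard let url = store.url else { continue }
        try coordinator.remove(store)
        try coordinator.destroyPersistentStore(at: url, ofType: store.type, options: nil)
    }
    // Recreate empty stores; with synchronous loading the closure
    // has already run when loadPersistentStores returns.
    var loadError: Error?
    container.loadPersistentStores { _, error in
        if let error = error { loadError = error }
    }
    if let error = loadError { throw error }
}
```

Any managed objects and contexts that still reference the old store should be discarded before calling this.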

How to save enumeration types

Q: What is the recommended method for storing Swift enumerations (with or without associated values) in Core Data?

A: One possible solution is to store the enumeration as a Transformable attribute, which handles cases with associated values. An enumeration without associated values can be converted through rawValue to any attribute type supported by Core Data.

There are some limitations to using Transformable to handle enumerations with associated values, including: 1. Some performance loss; 2. Cannot be queried through predicates in Core Data. If you have specific query requirements, you can break down the associated data in the enumeration type, define all associated values as properties in the entity, and add a corresponding type property corresponding to the enumeration. Then define a computed property of the enumeration type in the managed object, through which data can be converted. Although this approach wastes some storage space, it has the advantages of high conversion efficiency and queryability.
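The decomposition approach could look roughly like this. The Payment enum, the Order entity, and its attribute names are all hypothetical, and the model is assumed to define paymentKind (Integer 16) and cardLastFourDigits (optional String).

```swift
import CoreData

enum Payment: Equatable {
    case cash
    case card(lastFourDigits: String)
}

final class Order: NSManagedObject {
    // Stored attributes; both can be used in predicates.
    @NSManaged var paymentKind: Int16          // 0 = cash, 1 = card
    @NSManaged var cardLastFourDigits: String?

    /// Computed property that converts between the stored attributes and the enum.
    var payment: Payment {
        get {
            switch paymentKind {
            case 1: return .card(lastFourDigits: cardLastFourDigits ?? "")
            default: return .cash
            }
        }
        set {
            switch newValue {
            case .cash:
                paymentKind = 0
                cardLastFourDigits = nil
            case .card(let digits):
                paymentKind = 1
                cardLastFourDigits = digits
            }
        }
    }
}
```

With this layout, a predicate such as paymentKind == 1 finds all card payments without decoding anything.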

Can the synchronization progress be displayed and manually triggered?

Q: When using NSPersistentCloudKitContainer, can the current synchronization status be determined or the synchronization be manually triggered? I would like to display a progress view in the UI so that users who start the application for the first time can see that their data is being downloaded from the cloud.

A: NSPersistentCloudKitContainerEvent fills this role. You can bind notification listeners to events as needed to update and display status. It is not possible to trigger synchronization actively.

NSPersistentCloudKitContainer provides the eventChangedNotification notification, which is posted when an event of one of three types (setup, import, export) starts or ends. Strictly speaking, it is difficult to determine the actual state of synchronization from these notifications alone. For more information, please refer to “Core Data with CloudKit: Troubleshooting”.
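A minimal sketch of listening for these events (the function name and the printed strings are made up for illustration). It only reports when individual events start and end, which is not the same as overall sync progress.

```swift
import CoreData

/// Logs the start and end of each CloudKit event of `container`.
/// Keep the returned token and remove the observer when done.
func observeSyncEvents(of container: NSPersistentCloudKitContainer) -> NSObjectProtocol {
    NotificationCenter.default.addObserver(
        forName: NSPersistentCloudKitContainer.eventChangedNotification,
        object: container,
        queue: .main
    ) { notification in
        guard let event = notification.userInfo?[NSPersistentCloudKitContainer.eventNotificationUserInfoKey]
                as? NSPersistentCloudKitContainer.Event else { return }
        let kind: String
        switch event.type {
        case .setup: kind = "setup"
        case .import: kind = "import"
        case .export: kind = "export"
        @unknown default: kind = "unknown"
        }
        // An event without an end date has started but not finished yet.
        let status = event.endDate == nil ? "started" : (event.succeeded ? "finished" : "failed")
        print("CloudKit \(kind) \(status)")
    }
}
```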

Is it necessary to add a new version of the Model

Q: When do we need to add a new version of the CoreData model? I see conflicting advice about lightweight migration, is it safer to add a new version for each version?

A: Adding a new managed object model for each version will be safer, but if changes from one version to another have been thoroughly tested to indicate suitability for lightweight migration, then a single managed object model is sufficient.

For apps that have already been launched, it is best to manually add a new version of the model. In addition to being safer, it also makes it easier to track changes to the old version model.

How to use FetchedResultsController in SwiftUI

Q: Are there any practices or recommendations for using Core Data in SwiftUI applications? If Core Data is widely used, should UIKit still be used? For example, is there a SwiftUI version of FetchedResultsController?

A: There is no problem using CoreData in SwiftUI. You can retrieve results from storage through @FetchRequest.

@FetchRequest is a love-hate thing. It’s very easy to use and almost the preferred way to get data in views. But for users of Redux-like frameworks, it’s more like a disruptor, leaving a lot of data outside of the single state of the app. It’s still a challenge to better integrate a single-state framework with @FetchRequest. I wrote a series of articles discussing some of my thoughts and ideas.

Timing for running the initializeCloudKitSchema method

Q: When using Core Data with CloudKit, if I edit the persistent store in the Core Data Stack (for example, adding new persistent stores for shared objects) without touching the entities and their attributes, should I run initializeCloudKitSchema?

A: initializeCloudKitSchema is only necessary when making changes to the managed object model. Once it is run against a CKContainer, all databases within that container will have the same schema (public/private/shared).

initializeCloudKitSchema is typically a method used in the development phase and only needs to be used once after creating or modifying the data model. Once the corresponding schema has been created for the CKContainer, this line of code should be removed or commented out in your code. Additionally, initializeCloudKitSchema also provides a dryRun option for checking if the data model meets CloudKit’s requirements (comparison only, no upload) in unit tests.
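As a sketch of the dry-run check mentioned above: the function name is hypothetical, and the use of a DEBUG compilation condition is just one assumed way to keep the call out of release builds. The stores must already be loaded when it is called.

```swift
import CoreData

/// Checks the data model against CloudKit's requirements without uploading a schema.
func verifyCloudKitSchema(of container: NSPersistentCloudKitContainer) throws {
    #if DEBUG
    // .dryRun validates the model and generates the records without uploading them.
    try container.initializeCloudKitSchema(options: [.dryRun])
    #endif
}
```

Passing an empty option set instead of .dryRun performs the real initialization and creates the schema in the development environment.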

Debugging Techniques for Multithreading

Q: What is the best way to debug access errors/crashes in Core Data when using multithreading? I’ve been using the -com.apple.CoreData.Logging.stderr 1 and -com.apple.CoreData.ConcurrencyDebug 1 parameters for assistance. Any other suggestions?

A: ASAN (Address Sanitizer) will also help capture memory errors caused by concurrency issues.

See Several Tips on Core Data Concurrency Programming for more details.

How to immediately react to changes in App Group

Q: What is the best way to ensure that changes submitted to a store via application extensions (such as SiriKit/AppIntents) are immediately reflected in the main application that may already be running (and vice versa)? Is it safe/recommended to use the viewContext of NSPersistentContainer in both the application and extension, or should I use a background context? In my setup, the store is saved to an application group directory to allow access from both the application and extension, so I assume each process will use its own container to access it.

A: This can be achieved by setting the remote change options in your NSPersistentStoreDescription.

Persistent History Tracking is a solution designed for exactly this kind of need. Refer to the article Using Persistent History Tracking in CoreData for more implementation details.
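The remote change option from the answer is set on the store description before the store is loaded, in both the app and the extension. A minimal sketch (both function names are hypothetical):

```swift
import CoreData

/// Enables persistent history tracking and remote change notifications.
/// Call this before loadPersistentStores.
func enableRemoteChangeNotifications(on description: NSPersistentStoreDescription) {
    description.setOption(true as NSNumber, forKey: NSPersistentHistoryTrackingKey)
    description.setOption(true as NSNumber,
                          forKey: NSPersistentStoreRemoteChangeNotificationPostOptionKey)
}

/// Calls `handler` whenever another process (or another coordinator) writes to the store.
func observeRemoteChanges(of coordinator: NSPersistentStoreCoordinator,
                          handler: @escaping () -> Void) -> NSObjectProtocol {
    NotificationCenter.default.addObserver(
        forName: .NSPersistentStoreRemoteChange,
        object: coordinator,
        queue: nil
    ) { _ in handler() }
}
```

In the handler, the app would typically fetch the new history transactions and merge them into its view context.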

Avoid performing complex tasks in widgets

Q: We have encountered a series of crashes because we launched the same CoreData stack in a widget process and an application process. Usually this works fine, but once the store needs to be migrated (we perform lightweight migration), there seems to be some race condition that causes either the application or the widget process to crash. After one crash, the migration seems to work fine and there have been no further crashes. Is there a good solution to address these crashes? We are unsure if CoreData is handling this correctly or if we need to detect the migration and address these crash issues.

A: The ability to perform lightweight/inferred migration should not be granted to widgets. Only the application should do so. If a widget encounters a CoreData store that needs to be migrated, the widget should redirect to launch the application. In fact, widgets never get enough resources from the operating system to complete a migration.

Widgets have limited running resources, and operations such as clearing the history of persistent transactions should not be handled in widgets.

Deletion Timing of Persisted Historical Transactions

Q: In “Clearing History” of Consuming Relevant Store Changes, it is mentioned that “because persisted history tracking transactions consume disk space, it is important to establish a cleanup strategy to remove them when they are no longer needed.” However, there is no clear indication of how to safely clear history without affecting CloudKit correctness. The given example is to delete all transactions older than 7 days. But why 7 days? Why not 14 days? It would be very helpful to have a reliable and specific example that explains how to safely clear historical data to prevent wasted disk space.

A: Clearing history is up to the customer to decide. Typically, applications clear history once a year or every six months. The write rate for your particular application may require a different time window, but when using NSPersistentCloudKitContainer to clear history, it may force the storage file data to be fully synchronized to CloudKit, so it is not recommended to do this frequently.

Regardless of the time interval for clearing, I do not recommend that developers clear the historical transactions created by CloudKit for automatic synchronization (in most cases, NSPersistentCloudKitContainer deletes them automatically once it has confirmed that synchronization is complete). When deleting, the NSPersistentHistoryChangeRequest should skip transactions generated by the system and remove only those generated by the application or its application group. For details, please refer to the article “Using Persistent History Tracking in CoreData”. Additionally, you can directly use the third-party library I wrote, Persistent History Tracking Kit.
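One way to restrict the deletion to your own transactions is to filter by transaction author. This is a sketch that assumes every context in the app and its extensions sets transactionAuthor to a known name. The function name and the authors parameter are hypothetical.

```swift
import CoreData

/// Deletes history older than `date`, but only transactions whose author
/// is in `authors` (the names your own contexts set as transactionAuthor).
func purgeHistory(before date: Date,
                  authors: [String],
                  in context: NSManagedObjectContext) throws {
    let request = NSPersistentHistoryChangeRequest.deleteHistory(before: date)
    if let fetchRequest = NSPersistentHistoryTransaction.fetchRequest {
        // Transactions written by the system have other authors and are left alone.
        fetchRequest.predicate = NSPredicate(
            format: "%K IN %@",
            #keyPath(NSPersistentHistoryTransaction.author),
            authors
        )
        request.fetchRequest = fetchRequest
    }
    var executionError: Error?
    context.performAndWait {
        do {
            try context.execute(request)
        } catch {
            executionError = error
        }
    }
    if let error = executionError { throw error }
}
```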

How to create a model for NSDictionary

Q: I have an NSDictionary value that needs to be stored in Core Data. Should I use Transformable property or Binary Data property to store it? Which one is better? Binary Data can be stored externally and I don’t trust Transformable. When retrieving data from storage, will both options be loaded into memory? Or do they support lazy loading (fault)? Not sure which one is better.

A: Both will have the same memory situation. Ideally, the answer is “neither is a good choice”. If possible, you should model the dictionary (create two entities using Core Data, and map the dictionary through relationships).

In many cases, traditional data organization should not be copied directly into the Core Data model. Try to design data structures that suit the Core Data architecture. Although this may cost some performance and storage, the overall result is usually better. For example, in the above situation, using a relational approach has the following advantages: 1. Queries are supported; 2. With synchronization enabled, only the modified part needs to be synchronized each time; 3. There is no need to worry about conversion performance.

Is it necessary to set up inverse relationships?

Q: How important is it to set up the inverse relationships in the data model (usually done when creating relationships)? Are there any examples where inverse relationships can be omitted?

A: Defining inverse relationships makes managing your graph easier (for example, setting a “parent” automatically adds an object as a “child”), and also allows you to delegate graph cleaning to Core Data (for example, you want to delete an “invoice” while also deleting all its “items”). If you don’t need these semantics, you don’t need the inverse. However, bidirectional traversal is useful in most cases. It is worth noting that if you want to use CloudKit syncing, you need to explicitly define inverse relationships. I strongly recommend setting up inverse relationships for all relationships until it has a significant impact on performance.

Core Data with CloudKit uses bidirectional association to overcome the limit on the number of relationships in the CloudKit API (CKRecord.Reference cannot exceed 750). Therefore, only with explicit inverse relationships, can Core Data with CloudKit create the correct schema in the cloud.

Metadata of NSPersistentStore

Q: Is the metadata of NSPersistentStore saved on the disk? Can it be used to determine if a device has performed cloud migration or other activities?

A: Core Data stores the metadata within the storage file itself. This metadata is owned by Core Data and it is not recommended to modify it. If you wish, you can store your own metadata within the storage file, but be careful not to overlap with the keys owned by Core Data. The metadata is protected with the same data protection as the rest of the storage file.

For some time (especially for document-based applications), developers liked to save some options for cross-device use through custom metadata. See How Core Data Saves Data in SQLite for more information on Core Data metadata.

Is it necessary to synchronize intermediate data?

Q: What is the best way to quickly save thousands of GPS locations when using Core Data with CloudKit? When there is a lot of data, it will reach the server limit.

Lengthy discussion. The asker is developing an exercise application and needs to store all locations (coordinates, speed, route, timestamp) during the user’s exercise so that a line can be drawn. However, it is not necessary to keep this GPS information on all devices (only summary information is needed). Apple engineers suggest creating another Configuration to save this data in local storage (without synchronization) and only saving the summarized information in synchronized storage. Read Syncing Local Database to iCloud Private Database article to learn how to selectively synchronize data by creating multiple Configurations.

How to Encrypt a Database

Q: If I use NSPersistentStoreFileProtectionKey: FileProtectionType.complete to encrypt my database, will it be stored in encrypted format when the user backs up their phone data to iCloud? Or is it only encrypted on the device?

A: NSFileProtection only affects the encryption status of data on the device.

Starting from iOS 15, you can enable the encryption option for properties in the Model Editor (not supported for upgrading old versions of the model). When using Core Data with CloudKit, the value of this property will be saved in iCloud in encrypted form. Currently, Core Data does not support encrypting SQLite.

Bug in NSExpression

Q: How should I view the CAST function in NSExpression? Is this a feature I should actively use? For example, if I write CAST(now(), 'NSNumber') with the intention of doing mathematical operations on the current time, I receive an error message saying “Don’t know how to cast to NSNumber”.

A: This is a good question. We suggest posting it on the developer forum where Apple engineers will monitor it throughout the week and may be able to provide further assistance. This seems worthy of a bug report.

Using NSExpressionDescription, you can perform calculations on records in SQLite and return the calculation result via NSFetchRequestResult. Read the article “Count Queries in Core Data: The Master Guide” for usage examples.

Merge Strategy or Selective Update

Q: Currently, our Core Data Stack uses the NSMergeByPropertyStoreTrumpMergePolicy merge strategy, which essentially replaces an object stored in our store and identified by a unique constraint when pulled from the API. Another approach is to determine whether an object already exists by executing a fetch request, then update the existing record if it exists or create a new record if it does not. According to Apple, which approach is the preferred way to handle record creation and updates?

A: Each approach has its advantages and disadvantages. Generally, checking for records first (by checking if data exists in the store via Core Data) can be very expensive. If you must do this, you must do it in batches. Fetching records one at a time in this process will be very slow.

If the built-in merge strategy in Core Data does not meet your needs, creating a custom merge strategy may be a good option.

Creating predicates in many-to-many relationships

Q: My video entity has a many-to-many relationship with tags, and I have an array of tag IDs. I want to retrieve all videos that have at least one tag in this array. How can I create an NSPredicate to represent this?

A: You can try using ANY tag.name IN %@.

%@ corresponds to the tag array. Organize your data and build predicates in Core Data's own terms (entities and relationships), because Core Data converts the predicate into the corresponding SQL statement.
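As a small sketch of the predicate from the answer: it assumes the video entity has a to-many relationship named tags and that the tag entity has a name attribute. Both names are assumptions, so adjust the key path to your model.

```swift
import Foundation

/// Matches videos that have at least one tag whose name is in `tagNames`.
/// "tags" and "name" are assumed relationship and attribute names.
func videosPredicate(matchingAnyOf tagNames: [String]) -> NSPredicate {
    NSPredicate(format: "ANY tags.name IN %@", tagNames)
}
```

The result can be assigned to the predicate of a fetch request for the video entity.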

Dynamically modifying the configuration of @FetchRequest

Q: In a SwiftUI application, how can I create a @FetchRequest based on an @AppStorage value? The use case is: when I open the Focus filter, I change the @AppStorage value to a list of tags that the user wants to see in my app. If I can create a @FetchRequest with a predicate associated with the value of this @AppStorage, the predicate will update automatically and update my view. Currently, I cannot achieve this, what is the solution to get similar results?

A: The predicate property of @FetchRequest is a Binding that will redraw the view upon change.

Starting from SwiftUI 3.0 (iOS 15), FetchRequest supports dynamically modifying its predicate and sort descriptors in a view. For example, for the above question, the configuration of the request can be changed within task(id:).
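A rough sketch of the task(id:) approach, under several assumptions: a Video entity with a title attribute and a tags relationship, and an @AppStorage key "focusTags" that holds a comma-separated list of tag names. None of these names come from the question.

```swift
import CoreData
import SwiftUI

final class Video: NSManagedObject {
    @NSManaged var title: String?
}

struct VideoList: View {
    // "focusTags" is a hypothetical key holding a comma-separated tag list.
    @AppStorage("focusTags") private var focusTags = ""

    @FetchRequest(sortDescriptors: [NSSortDescriptor(key: "title", ascending: true)])
    private var videos: FetchedResults<Video>

    var body: some View {
        List(videos, id: \.objectID) { video in
            Text(video.title ?? "")
        }
        // Runs when the view appears and again whenever focusTags changes.
        .task(id: focusTags) {
            let names = focusTags.split(separator: ",").map(String.init)
            videos.nsPredicate = names.isEmpty
                ? nil
                : NSPredicate(format: "ANY tags.name IN %@", names)
        }
    }
}
```

Assigning nsPredicate updates the fetch request in place, and the view refreshes with the new results.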

uriRepresentation

Q: I am currently implementing a URL scheme for my application and I want to provide a URL to open a specific Core Data object. Is there a better way than using NSManagedObject.objectID.uriRepresentation().absoluteString as an identifier in my URL scheme?

A: I think this is also what I would do.

Using the managedObjectID(forURIRepresentation:) method of NSPersistentStoreCoordinator, you can convert the URL back to the corresponding NSManagedObjectID. Read “Showcasing Core Data in Applications with Spotlight” for more details.
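The round trip could be sketched like this (the function names are hypothetical). It assumes the object has already been saved, since a temporary object ID changes when the object is first saved.

```swift
import CoreData

/// Returns a string that identifies a saved managed object.
func deepLink(for object: NSManagedObject) -> String {
    object.objectID.uriRepresentation().absoluteString
}

/// Resolves a string produced by `deepLink(for:)` back to a managed object.
func object(forDeepLink string: String,
            in context: NSManagedObjectContext) -> NSManagedObject? {
    guard let url = URL(string: string),
          url.scheme == "x-coredata",   // scheme used by object ID URIs
          let coordinator = context.persistentStoreCoordinator,
          let objectID = coordinator.managedObjectID(forURIRepresentation: url)
    else { return nil }
    // existingObject(with:) throws if the object no longer exists.
    return try? context.existingObject(with: objectID)
}
```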

How to perform major version migration in sync mode

Q: Hi, I have a question about migration when using the Core Data and CloudKit stack. If we no longer care about local data, can we remove unused entities from the data model that syncs with CloudKit? In our case, we first remove all data from the entity (i.e. migrate the data to a new entity), and then remove the entity from the project, as we can ensure that all users have upgraded.

A: Yes, but what about the old version of the application? From the user’s perspective, the old version will write data that the new version has never seen before, and the new version will write data that the old version has never seen before. How will you explain this difference to your users?

When using Core Data with CloudKit, it is best to follow the principle of only adding to the data model, never changing or removing from it. If you do need to make destructive modifications to the data model, it is best to create two Containers (using different Models) and convert the old data to the new Container only after confirming that the user’s original data has been fully synced to the local device.

Can individual CKRecordZones be created for shared data?

Q: I have a document-based app where each document is a package that contains a unique Core Data store. I want to sync each document separately using Core Data’s built-in CloudKit sync API. How do I create a unique CKRecordZone for each document?

A: The current NSPersistentCloudKitContainer does not support this usage.

Perhaps consider using the pure CloudKit API to achieve this requirement.

Can @unchecked Sendable be used to annotate NSManagedObjectID?

Q: Is it possible to use @unchecked Sendable to annotate NSManagedObjectID when it is certain that the NSManagedObjectID is not in a temporary state?

A: It should be possible. Please submit a bug report.

In Core Data, NSManagedObjectID is thread-safe. By passing the ID to other contexts and retrieving managed objects with that ID in different thread contexts, it ensures that the application does not crash.

Summary

There were not that many questions about Core Data in Ask Apple, and the few questions I asked received satisfactory answers. I hope that Apple will hold similar events regularly in the future, and that everyone will take part more actively.