Seeding an application with data can be helpful for a wide range of reasons, including unit and performance testing. For a developer, seeding an application is also simply convenient during development. You want to see and experience what the application feels like with data in it, without having to enter that data by hand. How does the application perform with hundreds or thousands of records? What is the user experience like?
It isn't feasible or practical to manually enter dozens of records, let alone hundreds or thousands. You're a developer. You want to automate this as much as possible. A good developer is a lazy developer.
Strategies
There are many strategies to populate a Core Data persistent store with data. In this series, I discuss three of them.
- hard-coding seed data
- loading seed data
- generating seed data
Before we implement each of these strategies, it's important to highlight what it is that sets these strategies apart. When should you choose one over the other? What are the benefits and drawbacks of each of these strategies?
Hard-Coding Seed Data
While the first strategy is the fastest and easiest, fast and easy often comes at a price. Hard-coding seed data is useful if you want to get up to speed quickly. Speed and convenience are often essential if you're building a prototype or working on a proof of concept.
There are several aspects that I don't like about this approach, though. I'm always skeptical when I hear the words "hard-coded". The most important drawback of this approach is that the seed data is mixed into the project. You could put the seed data in a separate file, but the seed data is still part of the project's source code.
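To make this concrete, here is a minimal sketch of what hard-coded seed data could look like when it lives in a separate file. The `NoteSeed` type, the `SeedData` namespace, and the `seed(_:insert:)` function are hypothetical names I made up for illustration. The closure stands in for the place where a real project would create a managed object in a managed object context.

```swift
import Foundation

// Hypothetical value type standing in for a Core Data entity.
struct NoteSeed: Equatable {
    let title: String
    let contents: String
}

// Hard-coded seed data, grouped in one namespace so it is easy to find and remove.
enum SeedData {
    static let notes: [NoteSeed] = [
        NoteSeed(title: "Groceries", contents: "Milk, eggs, bread"),
        NoteSeed(title: "Ideas", contents: "Seed the store on first launch"),
        NoteSeed(title: "Travel", contents: "Book train tickets")
    ]
}

// Hypothetical seeding routine. `insert` is called once per record.
// In a Core Data project, this is where a managed object would be created.
func seed(_ notes: [NoteSeed], insert: (NoteSeed) -> Void) {
    notes.forEach(insert)
}
```

Even in a separate file, the records above are compiled into the application, which is the drawback discussed in this section.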
Loading Seed Data
Loading seed data, for example from a file, is what I most often use in my projects. There are several benefits to this strategy.
You can choose the format of the seed data. Even though I almost always default to JSON, it's possible to use CSV or YAML. You can even use XML if that's your thing. The format of the seed data usually depends on your preferences and those of your team. I recommend sticking with something you're comfortable with.
Another important benefit is that the seed data doesn't need to be included in your application's bundle. It can come from anywhere. It can even come from a remote backend.
This strategy also introduces flexibility. You can create as many files as you need. Do you need a different data set to unit test a specific scenario? No problem. Add a file with seed data to the test bundle and you're set.
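As a rough sketch of this strategy, assuming the seed data is a JSON array of objects with `title` and `contents` keys, the file could be decoded with `JSONDecoder`. The `NoteSeed` type, the `SeedError` type, and both `loadSeedData` functions are hypothetical. Mapping the decoded values to managed objects is left out.

```swift
import Foundation

// Hypothetical value type that mirrors one record in the seed file.
struct NoteSeed: Codable, Equatable {
    let title: String
    let contents: String
}

// Hypothetical error type for a seed file that cannot be found.
enum SeedError: Error {
    case fileNotFound(String)
}

// Decodes seed data from raw JSON. The data can come from a file,
// a test bundle, or a remote backend.
func loadSeedData(from data: Data) throws -> [NoteSeed] {
    try JSONDecoder().decode([NoteSeed].self, from: data)
}

// Convenience for a JSON file in a bundle, for example the test bundle.
func loadSeedData(named name: String, in bundle: Bundle) throws -> [NoteSeed] {
    guard let url = bundle.url(forResource: name, withExtension: "json") else {
        throw SeedError.fileNotFound(name)
    }
    return try loadSeedData(from: Data(contentsOf: url))
}
```

Because the decoding function accepts `Data`, it doesn't care where the seed data comes from. Only the convenience function is tied to a bundle.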
Generating Seed Data
Generating seed data is the most versatile and the most advanced strategy. The idea is simple but powerful. Based on a set of predefined rules, the application generates seed data whenever it's needed.
This strategy has several advantages. The most obvious one is flexibility. The data you seed the persistent store with isn't set in stone. You define the requirements the seed data needs to satisfy in a configuration file or object. It's easy to modify the seed data by tweaking the rules you define.
Another important benefit is scalability. There are times you want to put your application through its paces and analyze how it performs when the user has hundreds or thousands of records stored in your application. You can accomplish this with the previous approaches, but I'm sure you can imagine that it takes time and effort to create large data sets by hand. By generating the seed data of your application, you only need to tweak a few parameters to increase the size of the data set.
There's one more key benefit to this approach that relates to flexibility and adaptability. If the data model of the project changes, and it will change at some point if your application is under active development, you also need to update the seed data. This is easy to do if you choose to generate the data set you're using to seed the persistent store. It takes more time if the seed data is hard-coded in the project or loaded from a file.
But there's an obvious drawback to this approach. It takes more time to implement this strategy. If you're clever, though, you can factor out most of its implementation into reusable components you can use in other projects that are powered by Core Data.
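The idea of generating seed data from a set of rules can be sketched as follows. Everything in this example is an assumption made for illustration: the `SeedConfiguration` type and its properties, the `NoteSeed` type, and the `generateNotes(using:)` function. The seeded random number generator is a small SplitMix64 implementation. I use it here so that the same configuration always produces the same data set, which is convenient for tests.

```swift
import Foundation

// Hypothetical value type standing in for a Core Data entity.
struct NoteSeed: Equatable {
    let title: String
    let wordCount: Int
}

// Hypothetical configuration: the rules the generated data needs to satisfy.
struct SeedConfiguration {
    var numberOfNotes: Int
    var wordCountRange: ClosedRange<Int>
    var randomSeed: UInt64
}

// SplitMix64, a small deterministic generator. The same seed yields the same sequence.
struct SeededGenerator: RandomNumberGenerator {
    private var state: UInt64

    init(seed: UInt64) {
        state = seed
    }

    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Generates as many records as the configuration asks for.
func generateNotes(using configuration: SeedConfiguration) -> [NoteSeed] {
    var generator = SeededGenerator(seed: configuration.randomSeed)
    return (0..<max(0, configuration.numberOfNotes)).map { index in
        NoteSeed(
            title: "Note \(index + 1)",
            wordCount: Int.random(in: configuration.wordCountRange, using: &generator)
        )
    }
}
```

Going from ten records to ten thousand is a matter of changing `numberOfNotes`. If the data model changes, only `NoteSeed` and the body of the `map` closure need to be updated.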
Keep It Simple (Stupid)
You're probably familiar with the KISS principle. KISS stands for Keep It Simple, Stupid. In the context of software development, KISS means that you should keep your implementation as simple as possible. That doesn't mean you need to take shortcuts or be naive. It means that you should start with the first strategy, hard-coding the seed data, if you don't see an immediate benefit of loading seed data from a file or generating it on the fly.
While it's important to keep technical debt to a minimum whenever possible, choose wisely which strategy is most appropriate for your current situation. In the next episode of this series, I show you how to seed a persistent store by hard-coding the seed data into the project.