Converting data to a string is a common task in the context of networking or file input/output. In this episode, you learn how to convert a Data object to a String object in Swift.

What Are Data and String?

Before we write code, I want you to understand what Data and String (both structs) are. Understanding how these types are designed makes it easier to follow along. A Data object represents a collection of bytes and is commonly used to work with binary data. A String object represents a collection of characters, that is, Character objects.

Creating a Data Object

You are interested in converting a Data object to a String object, but let me start by showing you how you can convert a String object to a Data object. Fire up Xcode and create a playground. Add an import statement for the Foundation framework.

import Foundation

Declare a string.

import Foundation

let string = "Hello from Cocoacasts!"

You can convert the String object to a Data object by invoking the data(using:) method on the String object. The data(using:) method defines one parameter of type String.Encoding. We pass in .utf8, but note that there are plenty more options.

import Foundation

let string = "Hello from Cocoacasts!"
let data = string.data(using: .utf8)

The data(using:) method returns an optional Data object. If the conversion was successful, then the Data object contains a representation of the String object encoded using the encoding you defined, utf8 in this example.
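To see why the return value is optional, it helps to look at a stricter encoding. A string containing characters outside the ASCII range cannot be encoded with ascii, so data(using:) returns nil. This example is my addition, not part of the episode's code:

```swift
import Foundation

let emoji = "Hello 👋"

// ASCII cannot represent the waving hand emoji, so the result is nil.
let asciiData = emoji.data(using: .ascii)
print(asciiData == nil) // true

// UTF-8 can represent any Unicode character, so this succeeds.
let utf8Data = emoji.data(using: .utf8)
print(utf8Data != nil) // true
```

With the permissive UTF-8 encoding a conversion from String to Data always succeeds, but the return type is still optional because other encodings can fail.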

How to Convert a Data to a String In Swift

Let's now focus on converting the Data object back to a String. We create a String object from a Data object by invoking one of the initializers of the String struct, init(data:encoding:). It is important that we use the same encoding we used to encode the string.

import Foundation

let string = "Hello from Cocoacasts!"
let data = string.data(using: .utf8)

if let data, let string = String(data: data, encoding: .utf8) {
    print(string)
}

There are a few things I want to point out. We use optional binding to access the Data object because the data(using:) method returns an optional Data object. We also need to safely unwrap the result of the initializer because the initializer is failable.
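If you prefer early exits over nested optional binding, the same two unwraps can be written with a guard statement inside a function. The helper name decodeUTF8 is mine, not from the episode:

```swift
import Foundation

// Hypothetical helper that decodes optional UTF-8 data into a string.
func decodeUTF8(_ data: Data?) -> String? {
    guard let data, let string = String(data: data, encoding: .utf8) else {
        return nil
    }
    return string
}

let data = "Hello from Cocoacasts!".data(using: .utf8)
print(decodeUTF8(data) ?? "Decoding failed")
```

The guard handles both failure points, the optional Data object and the failable initializer, in a single condition.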

Why UTF-8?

You may be wondering why we opted for the UTF-8 encoding. There are several reasons for choosing UTF-8.

Backward Compatibility with ASCII: UTF-8 is backward-compatible with ASCII. In the past, ASCII was widely used to represent English characters. Any file that is valid ASCII is also a valid UTF-8 file.
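You can verify this backward compatibility in a playground. For pure ASCII text, the ASCII and UTF-8 byte sequences are identical. This check is my addition:

```swift
import Foundation

let ascii = "Swift"

// For pure ASCII text, both encodings produce the same bytes.
let asciiBytes = ascii.data(using: .ascii)
let utf8Bytes = ascii.data(using: .utf8)
print(asciiBytes == utf8Bytes) // true
```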

Unicode Support: UTF-8 can represent any character in the Unicode standard. This is important if you consider the wide range of characters used in different languages around the globe.

Efficiency for English Text: English is still the most common language used on the web. For English text, UTF-8 uses the same amount of space as ASCII, one byte per character. One byte per character is more space-efficient compared to other Unicode encodings, such as UTF-16 or UTF-32.
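You can confirm the one-byte-per-ASCII-character claim by comparing a string's UTF-8 and UTF-16 views. A UTF-16 code unit is two bytes, so the same English text takes twice the space. This comparison is my addition:

```swift
import Foundation

let english = "Hello"

// UTF-8 uses one byte per ASCII character.
print(english.utf8.count)   // 5 bytes

// UTF-16 uses one two-byte code unit per character here.
print(english.utf16.count)  // 5 code units, ten bytes in total
```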

Self-Synchronization: UTF-8 has a self-synchronization property. Euh ... what? This simply means that if bytes are lost due to errors or corruption, it is easier to find the start of the next character compared to fixed-size encodings. Self-synchronization makes UTF-8 more resilient in environments where errors can occur.

Widespread Adoption: The widespread adoption of UTF-8 in various systems and technologies (e.g., operating systems, programming languages) has strengthened its position as the most common encoding. For example, it is the default charset for XML and HTML 5.

Variable Width Character Encoding: UTF-8 uses a variable width character encoding that can represent every character in the Unicode standard, yet is backward-compatible with ASCII. The benefit is that it can represent characters from a wide range of scripts and languages without wasting space for scripts with fewer characters.

Standardization: Many, many standards and protocols have adopted UTF-8 as the default (or recommended) encoding. As you may have guessed, that further promotes or encourages its use, making it a de facto standard in many areas of computing and technologies.

For the Curious ...

UTF-8 stands for Unicode Transformation Format - 8-bit. Unicode refers to the Unicode standard, a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems.

Transformation Format indicates that the encoding is a transformation format. What does that mean? It simply means that it is a way of transforming a set of Unicode code points into a series of bytes.

The 8 in UTF-8 means that it uses 8-bit blocks to represent a character. UTF-8 is a variable-width character encoding that can represent every character in the Unicode standard. It does this by using one to four 8-bit bytes to encode characters.
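A quick way to see the one-to-four-byte range in action is to count the UTF-8 bytes of characters from different parts of the Unicode standard. This example is mine, not from the episode:

```swift
import Foundation

// UTF-8 byte counts per character:
print("a".utf8.count)  // 1 byte  (ASCII)
print("é".utf8.count)  // 2 bytes (Latin-1 Supplement)
print("€".utf8.count)  // 3 bytes (euro sign)
print("🚀".utf8.count) // 4 bytes (emoji, outside the Basic Multilingual Plane)
```

Frequently used characters get short encodings while rarer characters take more bytes, which is exactly the space-efficiency trade-off described above.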