Solving the Strings Problem in Swift
In this post I try to port Tom Moertel’s type-based solution to the strings problem to Swift.
I started this some days ago as an exercise to keep exploring type safety in Swift and to expand my thoughts on how the type system can help us solve domain-specific problems, a topic that I already explored in Type Systems and Domain Driven Development.
Before continuing, let me give some disclaimers.
I recommend reading the linked post, which discusses the strings problem in depth and proposes the solution that I tried to port to Swift.
The escaping functions that I’ve used are almost certainly not correct, as I was not trying to construct production-ready XML or URL types. They are used just as a way to see the types being constructed differently.
The resulting solution is not complete at all, and I’m writing this post with that in mind. It serves as an experiment to show how we can leverage the type system even further, but Christmas is already here, Apple has open sourced Swift, and I want to ship one project before the end of the year. So sadly I don’t have more time to improve it, for now. But I would be really happy to receive feedback and possible solutions to the problems exposed here.
The Strings Problem
The original post describes it as:
we just plain suck at keeping a bazillion different strings straight in our heads, let alone consistently and reliably rendering their interactions safe whenever they cross paths in a modern web application. It’s easy to say, “just escape the darn things,” but it’s hard to get it right, every single time.
In my words, String is one of those types that can represent absolutely anything. That causes a variety of problems and usually means that you have to program over-defensively. The proposed solution relies on accepting that different strings have different meanings, so we should treat them differently.
Go and read the original post as it describes why other solutions are not as good as using the type system.
A SafeString
A SafeString is a type that contains a string representing a specific Language.
public enum SafeString<T: Language> {
    case Empty
    case Fragment(fragment: T)
    indirect case Concat(left: SafeString, right: SafeString)
}
The protocol Language is where plain and unsafe strings are converted to a specific type of string that has a specific meaning.
public protocol Language {
    // litfrag :: String -> l -- String is a literal language fragment
    init(fragment: String)
    // littext :: String -> l -- String is literal text
    init(plainText: String)
    // natrep :: l -> String -- Gets the native-language representation
    func toString() -> String
    // language :: l -> String -- Gets the name of the language
    var name: String { get }
}
The comments in the code reference the original implementation.
As you can see, the important part is the two inits. It is at that point, where the developer converts a String to a Language, that the risk disappears. The developer decides once whether the string is unsafe plain text or is already in the target language.
Once that decision is taken the framework leverages the type system to ensure that strings with different meaning are not used in an unsafe manner.
These two types are the central part of the safety framework. With this kernel implemented, one can start creating types for any specific language.
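To see how the two pieces fit together, here is a minimal, self-contained sketch of the kernel in current Swift syntax. It is only an approximation of what the playground contains: the convenience initializers, the render() helper, the + operator and the toy Bracketed language are assumptions made for illustration, and the name requirement is left out for brevity.

```swift
// Trimmed version of the Language protocol (the `name` requirement is omitted).
protocol Language {
    init(fragment: String)   // the string is already in the language
    init(plainText: String)  // the string is plain text and must be escaped
    func toString() -> String
}

indirect enum SafeString<T: Language> {
    case empty
    case fragment(T)
    case concat(SafeString, SafeString)

    // Assumed convenience initializers: the single point where a String is classified.
    init(fragment: String) { self = .fragment(T(fragment: fragment)) }
    init(text: String) { self = .fragment(T(plainText: text)) }

    // Assumed helper: flattens the tree into the native representation.
    func render() -> String {
        switch self {
        case .empty:
            return ""
        case .fragment(let value):
            return value.toString()
        case .concat(let left, let right):
            return left.render() + right.render()
        }
    }
}

// Assumed operator: only two SafeStrings of the same language can be joined.
func + <T: Language>(lhs: SafeString<T>, rhs: SafeString<T>) -> SafeString<T> {
    return .concat(lhs, rhs)
}

// Toy language for demonstration only: "escaping" wraps the text in brackets.
struct Bracketed: Language {
    let value: String
    init(fragment: String) { value = fragment }
    init(plainText: String) { value = "[" + plainText + "]" }
    func toString() -> String { return value }
}

let greeting = SafeString<Bracketed>(fragment: "hello ") + SafeString<Bracketed>(text: "world")
print(greeting.render()) // hello [world]
```

The toy language makes it easy to see which of the two initializers each piece went through, since only plain text ends up wrapped in brackets.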
Specific languages
In the example the language represented is XML (or XHTML) to give a safe way of generating the markup of a website.
For that we just have to define the XMLString type that conforms to the Language protocol.
public struct XMLString: Language {
    let xml: String
    public init(fragment: String) {
        xml = fragment
    }
    public init(plainText: String) {
        xml = escapeXML(plainText)
    }
    public func toString() -> String {
        return xml
    }
    public let name = "XML"
}
The implementations of the languages are pretty straightforward, and the only interesting part for a proper solution is correctly implementing the escape method for each language.
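The post does not show escapeXML, so the following is only a hypothetical stand-in. It replaces the five characters that have predefined XML entities and leaves everything else untouched, which is enough to watch the types behave differently but is not a vetted, production-ready escaper.

```swift
// Hypothetical escapeXML: replaces the five characters that have predefined XML entities.
func escapeXML(_ text: String) -> String {
    var escaped = ""
    for character in text {
        switch character {
        case "&": escaped += "&amp;"
        case "<": escaped += "&lt;"
        case ">": escaped += "&gt;"
        case "\"": escaped += "&quot;"
        case "'": escaped += "&apos;"
        default: escaped.append(character)
        }
    }
    return escaped
}

print(escapeXML("Safety & <XML>")) // Safety &amp; &lt;XML&gt;
```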
We could use Swift protocol extensions to make creating languages easier, as the only difference between them is the escape function.
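One possible shape for that idea is sketched below. The EscapingLanguage protocol, its raw and escape requirements, and the naive QueryString language are all hypothetical names introduced for this example; the Language protocol is repeated in trimmed form so the snippet stands alone.

```swift
// Trimmed copy of the Language protocol so the sketch is self-contained.
protocol Language {
    init(fragment: String)
    init(plainText: String)
    func toString() -> String
}

// Hypothetical refinement: a language only supplies storage and an escape function.
protocol EscapingLanguage: Language {
    var raw: String { get }
    init(raw: String)
    static func escape(_ text: String) -> String
}

// Default implementations shared by every conforming language.
extension EscapingLanguage {
    init(fragment: String) { self.init(raw: fragment) }
    init(plainText: String) { self.init(raw: Self.escape(plainText)) }
    func toString() -> String { return raw }
}

// A deliberately naive language: spaces become "+", nothing else is escaped.
struct QueryString: EscapingLanguage {
    let raw: String
    init(raw: String) { self.raw = raw }
    static func escape(_ text: String) -> String {
        return String(text.map { (c: Character) -> Character in c == " " ? "+" : c })
    }
}

print(QueryString(plainText: "type safe strings").toString()) // type+safe+strings
print(QueryString(fragment: "a+b").toString())                // a+b
```

With this arrangement a new language is just a struct with one stored property and one static function, and the two classifying initializers come for free.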
Apart from that, we can create a typealias for an XML type, together with a function that creates XML from safe string literals.
public typealias XML = SafeString<XMLString>
public func xml(fragment: String) -> XML {
    return XML(fragment: fragment)
}
With this in place, one can start using and creating safe XML strings.
xml("<em>wow!</em>") // <em>wow!</em>
XML(text: "Safety & XML") // Safety &amp; XML
As you can see, the user of the library doesn’t have to interact with XMLString (or any other Language-conforming type); they only construct and use SafeString instances. And thanks to the typealias they can even ignore that.
But now, if they try to combine an XML value with a plain String, the compiler is there to save us.
someXmlInstance + "safety & more"
error: binary operator '+' cannot be applied to operands of type 'XML' (aka 'SafeString<XMLString>') and 'String'
To do that we will have to explicitly convert the unsafe string into XML.
someXmlInstance + XML(text: "Safety & More")
// <em>wow!</em>Safety &amp; More
Real world example
In the playground you can see a real example that converts an Article type to an XML containing a list of links to share websites.
The only piece that I will show here is the compose function, as it will serve to illustrate the last point of the post.
func compose(share: Share) -> XML {
    let url = XML(text: share.url.render())
    let siteTitle = XML(text: share.siteTitle)
    // Split into parts to keep the compiler happy and avoid "expression is too complex"
    let a = xml("<a href=\"") + url + xml("\"")
    let b = xml("title=\"") + siteTitle + xml(": \u{201C};") + title + xml("\u{201D}")
    let c = xml(">") + share.imageTag + XML(text: "Image here") + xml("</a>")
    let link: XML = a + b + c
    return link
}
This function converts the information needed to share an article on one website into an XML value containing the link, title and logo.
The missing pieces
In this real-world example you can see one of the tedious parts of this system (ignoring that the compiler freaks out with a too complex expression). Generating a long safe XML value by concatenating other types of information requires creating an XML instance for each part of the string and concatenating them all in a long chain.
This is exactly the point of being safe, though: the fact that a URL or an unsafe string such as the title cannot be attached directly to an XML fragment is exactly what we want.
But in this case, any Swift user will try to use String literals and interpolation, and almost every developer will let laziness win over safety.
To have a complete implementation that solves the strings problem, we should have a way to create a SafeString with the native Swift interpolation system.
You can check the Interpolation page in the playground to see some code about this.
The first impression is that it would be nice to have SafeString conform to StringLiteralConvertible so we could do:
let xml: XML = "<p>blabla<p>"
But doing this would break the safety of SafeString, because this now becomes possible:
someXmlInstance + "totally not a safe string"
To mitigate this we could assume that any string literal is not safe. But at that point it may no longer be worth it.
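As a rough illustration of that mitigation, the sketch below treats every string literal as untrusted plain text. SafeXML is a hypothetical, simplified stand-in for SafeString<XMLString>, escapeMarkup is a minimal assumed escaper, and the conformance uses the current protocol name ExpressibleByStringLiteral (StringLiteralConvertible was later renamed to this).

```swift
// Minimal assumed escaper for the sketch; not a complete XML escaper.
func escapeMarkup(_ text: String) -> String {
    var out = ""
    for c in text {
        switch c {
        case "&": out += "&amp;"
        case "<": out += "&lt;"
        case ">": out += "&gt;"
        default: out.append(c)
        }
    }
    return out
}

// Hypothetical simplified stand-in for SafeString<XMLString>.
struct SafeXML: ExpressibleByStringLiteral {
    let markup: String

    init(fragment: String) { markup = fragment }
    init(text: String) { markup = escapeMarkup(text) }

    // Literals are treated as untrusted plain text, so they are always escaped.
    init(stringLiteral value: String) { self.init(text: value) }

    static func + (lhs: SafeXML, rhs: SafeXML) -> SafeXML {
        return SafeXML(fragment: lhs.markup + rhs.markup)
    }
}

let emphasized = SafeXML(fragment: "<em>wow!</em>")
let combined = emphasized + " <b>not markup</b>"
print(combined.markup) // <em>wow!</em> &lt;b&gt;not markup&lt;/b&gt;
```

The result stays safe, but the convenience is mostly gone: markup can no longer be written as a literal, which is why the trade-off may not be worth it.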
The StringInterpolationConvertible feature in Swift looks really powerful, especially as we can overload init(stringInterpolationSegment:) with different supported types.
public init<T>(stringInterpolationSegment expr: T) {
    self.init(text: String(expr))
}
public init(stringInterpolationSegment expr: String) {
    self.init(text: expr)
}
public init<T where T: StringLiteralConvertible>(stringInterpolationSegment expr: T) {
    self.init(fragment: String(expr))
}
public init(stringInterpolationSegment expr: SafeString) {
    self = expr
}
The problem is that the literal string (which we should assume is already safe) is handled in the same way as any other interpolated string (which we cannot assume is safe), so we cannot distinguish between the two. This also breaks the safety when using string literal interpolation.
In the original post this is possible because the Haskell template system is more flexible and allows the user to specify which kind of string is being used when interpolating.
Conclusion
I would highly recommend using a framework like this in any system that has to deal with different kinds of data hidden behind String. The only real missing feature is working better with the literal syntax of the language.
I’m sure this could be possible in Swift, so if someone has any idea of how to improve StringInterpolationConvertible I would be happy to listen.
Update 16/01/2018: Ole Begemann nicely pointed out on Twitter that he wrote a post with a solution for string interpolation. Check out Fun with String Interpolation.
There is also a swift-evolution proposal to improve it: Fix ExpressibleByStringInterpolation.
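For readers on Swift 5 or later, where the redesigned ExpressibleByStringInterpolation protocol is available, the sketch below shows roughly how literal segments and interpolated values can now be told apart. SafeXML and escapeMarkup are hypothetical simplified stand-ins invented for this example, not the types from the playground.

```swift
// Minimal assumed escaper for the sketch; not a complete XML escaper.
func escapeMarkup(_ text: String) -> String {
    var out = ""
    for c in text {
        switch c {
        case "&": out += "&amp;"
        case "<": out += "&lt;"
        case ">": out += "&gt;"
        default: out.append(c)
        }
    }
    return out
}

// Hypothetical simplified stand-in for SafeString<XMLString>.
struct SafeXML: ExpressibleByStringInterpolation {
    let markup: String

    init(fragment: String) { markup = fragment }
    init(text: String) { markup = escapeMarkup(text) }

    struct StringInterpolation: StringInterpolationProtocol {
        var markup = ""

        init(literalCapacity: Int, interpolationCount: Int) {
            markup.reserveCapacity(literalCapacity)
        }

        // Literal segments are written by the developer, so they are kept verbatim.
        mutating func appendLiteral(_ literal: String) { markup += literal }

        // Interpolated plain strings are untrusted and get escaped.
        mutating func appendInterpolation(_ text: String) { markup += escapeMarkup(text) }

        // Interpolated safe values are already in the language.
        mutating func appendInterpolation(_ xml: SafeXML) { markup += xml.markup }
    }

    init(stringLiteral value: String) { self.init(fragment: value) }

    init(stringInterpolation: StringInterpolation) {
        self.init(fragment: stringInterpolation.markup)
    }
}

let title = "Safety & More"
let heading: SafeXML = "<h1>\(title)</h1>"
let page: SafeXML = "<body>\(heading)</body>"
print(page.markup) // <body><h1>Safety &amp; More</h1></body>
```

Because there is no appendInterpolation overload for arbitrary types, interpolating anything other than a String or a SafeXML fails to compile. Note that this sketch trusts bare literals as fragments, which is a design choice rather than a requirement.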
On the other hand, the same techniques could be applied to other types that rely on numeric data, which is often just as unsafe. Think about Money with different currencies or any other unit like kilometers and miles.
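A small sketch of the Money idea, using the currency as a type parameter in the same way SafeString uses the language. The Currency protocol, the EUR and USD markers and the Money type are hypothetical names made up for this example.

```swift
// Hypothetical currency markers; they carry no data and exist only as type tags.
protocol Currency {
    static var code: String { get }
}
enum EUR: Currency { static let code = "EUR" }
enum USD: Currency { static let code = "USD" }

// Amounts are stored in minor units (cents) to avoid floating-point rounding.
struct Money<C: Currency> {
    let minorUnits: Int

    // Only amounts in the same currency can be added.
    static func + (lhs: Money, rhs: Money) -> Money {
        return Money(minorUnits: lhs.minorUnits + rhs.minorUnits)
    }

    var description: String { return "\(minorUnits) \(C.code) cents" }
}

let price = Money<EUR>(minorUnits: 1250)
let shipping = Money<EUR>(minorUnits: 499)
let total = price + shipping
print(total.description) // 1749 EUR cents

// let wrong = price + Money<USD>(minorUnits: 100)
// does not compile: the operands have different currency types
```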
Remember to check the Playground; any feedback is welcome.