Syntax Highlighting in Swift with OysterKit

One of the most common scenarios for using a tokenizer is to syntax highlight text. In the OysterKit workspace we include a Mac Cocoa project called Tokenizer which lets you validate your OK Script, and we thought it would be nice to add syntax highlighting to it. Apart from working around a beta bug, it's very straightforward.

Boilerplate

The first block of code in the AppDelegate is largely boilerplate:

    @IBOutlet var window: NSWindow
    @IBOutlet var tokenView: NSTokenField
    @IBOutlet var scrollView: NSScrollView
    var textString:NSString?
    var inputTextView : NSTextView {
        get {
            return scrollView.contentView.documentView as NSTextView
        }
    }
    var tokens = Array<AnyObject>()
    func applicationDidFinishLaunching(aNotification: NSNotification?) {
        //For some reason IB settings are not making it through
        inputTextView.automaticQuoteSubstitutionEnabled = false
        inputTextView.automaticSpellingCorrectionEnabled = false
        inputTextView.automaticDashSubstitutionEnabled = false
        //Change the font, set myself as a delegate, and set a default string
        inputTextView.textStorage.font = NSFont(name: "Courier", size: 14.0)
        inputTextView.textStorage.delegate = self
        inputTextView.string = "{\n\t\"O\".\"K\"->oysterKit\n}"
    }

A couple of things. You'll note I have a rather strange computed variable to get the NSTextView. At the moment (Beta 2) Xcode isn't binding outlets correctly for NSTextView, so this workaround, which reaches the text view through the scroll view, keeps us moving.

I also have an NSTokenField to show the tokens that the OK Script tokenizer generates. This takes an array of strings as its value, and the tokens array is bound to it. You'll see how we populate that later.

Once the application has finished launching, I have to do a little bit of "should work in Interface Builder" setup before we can get to configuring ourselves as the delegate for the NSTextView's NSTextStorage property.

NSTextStorage is the backing for the text displayed in the text view. Unlike the string you can get from the text view, NSTextStorage manages an attributed string, allowing us to add formatting and coloring without changing the actual value of the string. 
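To make that last point concrete, here is a minimal sketch using NSMutableAttributedString, the class NSTextStorage inherits from. It is written in current Swift syntax rather than the beta syntax used in the rest of this post, and the attribute key `sketchHighlight` and its string value are assumptions made up for illustration so that the example needs only Foundation.

```swift
import Foundation

// Hypothetical attribute key, chosen so the sketch needs only Foundation.
// The post itself uses NSForegroundColorAttributeName from AppKit.
let highlightKey = NSAttributedString.Key("sketchHighlight")

let storage = NSMutableAttributedString(string: "\"O\".\"K\"->oysterKit")
let before = storage.string

// Attach an attribute to the first three UTF-16 units: "O" with its quotes.
storage.addAttribute(highlightKey, value: "red",
                     range: NSRange(location: 0, length: 3))

// The attribute is stored alongside the text; the characters are unchanged.
let unchanged = (storage.string == before)

// Inside the attributed range the value is found, outside it is not.
let attributeAtStart =
    storage.attribute(highlightKey, at: 0, effectiveRange: nil) as? String
let attributeLater =
    storage.attribute(highlightKey, at: 5, effectiveRange: nil) as? String
```

Here `unchanged` stays true after the attribute is added, which is exactly the property that makes the text storage a good place to hang syntax coloring.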

As a delegate there are just two optional functions to implement, textStorageWillProcessEditing and textStorageDidProcessEditing. These are called just before, and just after, any changes to the text are processed (whether made by the user or by code). They provide a great opportunity to update things like syntax coloring!

Colors

Of course, we will need to match our tokens to some colors. In general I can get away with using predefined NSColors, but I do have to work around the absence of class variables with a computed class property (and I'd rather this was a constant, but we simply can't do that yet in Swift). Once we have that defined we can just define a dictionary which maps token names to an NSColor. If a token name doesn't map to a color, we will just ignore it.

    class var stateDefinitionColor:NSColor {
        return NSColor(calibratedRed: 0, green: 0.6, blue: 0, alpha: 1.0)
    }
    let tokenColorMap = [
        "not" : NSColor.redColor(),
        "quote" : NSColor.redColor(),
        "Char" : NSColor.redColor(),
        "single-quote" : NSColor.redColor(),
        "delimiter" : NSColor.redColor(),
        "token" : NSColor.purpleColor(),
        "variable" : NSColor.blueColor(),
        "start-branch" : AppDelegate.stateDefinitionColor,
        "start-repeat" : AppDelegate.stateDefinitionColor,
        "start-delimited" : AppDelegate.stateDefinitionColor,
        "end-branch" : AppDelegate.stateDefinitionColor,
        "end-repeat" : AppDelegate.stateDefinitionColor,
        "end-delimited" : AppDelegate.stateDefinitionColor,
    ]
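The "ignore it if it doesn't map" rule falls out of how Swift dictionaries work: a lookup returns an optional, so an unmapped name simply yields nil. The sketch below shows the idea in current Swift syntax. The `Shade` enum is a hypothetical stand-in for NSColor so that it runs without AppKit, and `unmapped-name` is a made-up token name, not one from OK Script.

```swift
// Hypothetical stand-in for NSColor so the sketch runs without AppKit.
enum Shade { case red, purple, blue, green }

// A cut-down version of the map above, using the stand-in type.
let sketchColorMap: [String: Shade] = [
    "quote": .red,
    "token": .purple,
    "variable": .blue,
    "start-branch": .green,
]

// Returns the shade for a token name, or nil when the name is unmapped.
func shade(forTokenNamed name: String) -> Shade? {
    return sketchColorMap[name]
}

for name in ["variable", "unmapped-name"] {
    if let mapped = shade(forTokenNamed: name) {
        print("\(name): color with \(mapped)")
    } else {
        // No entry in the map, so the text keeps its default color.
        print("\(name): left uncolored")
    }
}
```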

All we need to do now is actually tokenize the string as the user edits it.

Applying Colors to Tokens

Luckily OysterKit can do most of the heavy lifting here. We first clear any old coloring from the whole text (this approach is quite brutal; we will follow up with a guide on incremental coloring, which you would need for large files).

Once that's done we can create a tokenizer. We'll use the OK Script tokenizer (TokenizerFile). We also prepare an array in which we will store the token descriptions to display in the NSTokenField.

Then we call the tokenizer.tokenize method giving it the string value of the NSTextView. Each time a token is generated it calls the supplied closure with it (you can probably spot this would be easy to push into a background thread) and it's in there that we do our coloring. 

    func textStorageDidProcessEditing(aNotification: NSNotification!){
        let old = NSMakeRange(0, inputTextView.textStorage.length)
        inputTextView.textStorage.removeAttribute(
          NSForegroundColorAttributeName, range: old)
        var okFileTokenizer = TokenizerFile()
        var allTokens = Array<String>()
        okFileTokenizer.tokenize(inputTextView.string){
          (token:Token)->Bool in
            let tokenRange = 
              NSMakeRange(token.originalStringIndex!,
                          countElements(token.characters))
            allTokens.append(token.description)
            self.tokens = allTokens
            if let mappedColor = self.tokenColorMap[token.name]? {
                self.inputTextView.textStorage.addAttribute(
                    NSForegroundColorAttributeName, 
                    value: mappedColor, 
                    range: tokenRange)
            }
            return true
        }
    }

We can use the information in every token to determine the range of the text in the original string that the token came from, and we store that in tokenRange. Now we just need to determine which color to use (if any) and, provided there is one, add a foreground color attribute to the text storage.
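One thing to keep in mind about ranges like tokenRange: NSRange measures locations and lengths in UTF-16 code units, while counting a Swift string's characters counts user-visible characters. For plain ASCII input such as the default string the two agree, but they can differ for characters such as emoji. The sketch below, in current Swift syntax, shows one way to get a range that is safe in both cases. The helper `nsRange(of:in:)` is hypothetical and simply searches for the token text; it does not use OysterKit's token properties.

```swift
import Foundation

// Hypothetical helper: finds the first occurrence of tokenText in source
// and converts its String.Index range into a UTF-16 based NSRange.
func nsRange(of tokenText: String, in source: String) -> NSRange? {
    guard let found = source.range(of: tokenText) else { return nil }
    return NSRange(found, in: source)
}

// The default string from applicationDidFinishLaunching: all ASCII,
// so character offsets and UTF-16 offsets are the same.
let asciiSource = "{\n\t\"O\".\"K\"->oysterKit\n}"
let asciiRange = nsRange(of: "oysterKit", in: asciiSource)

// An emoji is one Character but two UTF-16 units, so the offsets diverge.
let emojiSource = "😀 ok"
let emojiRange = nsRange(of: "ok", in: emojiSource)
let characterOffset = emojiSource.distance(
    from: emojiSource.startIndex,
    to: emojiSource.range(of: "ok")!.lowerBound)
```

In the emoji case the token starts at character offset 2 but at UTF-16 location 3, so a range built from character counts would color the wrong text.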

One final job: we add the token's description to allTokens and assign that to the tokens array, so that the token field in the bottom half of the display is updated. Here's the result.

Not bad for just 10 lines of code to do the coloring (excluding updating the NSTokenField). OysterKit is open source and released under the MIT license, so you are free to use it in your own projects. You can learn more here