This evening a friend sent me a link to iA Writer’s video showcasing the latest update to the app.
I was blown away by the syntax highlighter in edit mode, which highlights words based on the parts of speech you select.
At first, I thought they were doing some sophisticated natural language processing. I’m no expert, but supporting multiple languages sounds like a tedious, if not impossible, job to do in a year, unless there is an easy way.
Apple provides the NSLinguisticTagger class as part of the iOS 5 SDK. It lets us split up natural language text and tag it with information. As per the documentation, it can identify languages, scripts and stem forms of words!
NSLinguisticTagger is quite easy to use.
We start by creating an instance of NSLinguisticTagger with the tag schemes for a language and a bunch of NSLinguisticTaggerOptions. The options tell the tagger to omit words, whitespace, punctuation, etc.
var options = NSLinguisticTaggerOptions.JoinNames
var tagger = new NSLinguisticTagger (
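The options are bit flags, so several of them can be combined into a single value with the bitwise OR operator. Here is a minimal sketch of that idea. It assumes a stand-in enum, TaggerOptions, so that it runs without the iOS SDK; the enum, the OptionsDemo class and the Build method are hypothetical names used only for illustration, and only the member names are meant to mirror the real options.

```csharp
using System;

// Hypothetical stand-in for NSLinguisticTaggerOptions, so this sketch
// compiles without the iOS SDK. Values are illustrative bit flags.
[Flags]
public enum TaggerOptions
{
    None = 0,
    OmitWords = 1 << 0,
    OmitPunctuation = 1 << 1,
    OmitWhitespace = 1 << 2,
    OmitOther = 1 << 3,
    JoinNames = 1 << 4
}

public static class OptionsDemo
{
    // Combines three options into one value with bitwise OR.
    public static TaggerOptions Build ()
    {
        return TaggerOptions.OmitWhitespace
             | TaggerOptions.OmitPunctuation
             | TaggerOptions.JoinNames;
    }
}
```

With the real enum the same pattern should apply: OR together the options you want and pass the result to the tagger.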
Once we have the instance, we assign it a string to analyse:
tagger.AnalysisString = "A quick movement of the enemy will jeopardise six gunboats";
Finally, we fetch the tags with the EnumerateTagsInRange method and print them out:
tagger.EnumerateTagsInRange (new NSRange (0, statement.Length), NSLinguisticTag.SchemeLexicalClass, options, TaggerEnumerator);
void TaggerEnumerator (NSString tag, NSRange tokenRange, NSRange sentenceRange, ref bool stop)
var word = statement.Substring (tokenRange.Location,
words.Add (new Tuple<string, string> (tag, word));
The result after processing "A quick movement of the enemy will jeopardise six gunboats" looks something like this:
Noun: movement, enemy, gunboats
Verb: will, jeopardise
It even identifies the number, which is pretty amazing!
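To get from the (tag, word) tuples collected in the callback to a grouped listing like the one above, one option is to group the tuples by tag with LINQ. This is only a sketch and not necessarily how the gist does it: the TagReport class and its Format and Sample methods are hypothetical names, and the sample data is hard-coded from the nouns and verbs listed above rather than produced by the tagger.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagReport
{
    // Groups (tag, word) pairs by tag and builds one line per tag,
    // in the form "Tag: word1, word2".
    public static List<string> Format (IEnumerable<Tuple<string, string>> words)
    {
        return words
            .GroupBy (w => w.Item1)
            .Select (g => g.Key + ": " + string.Join (", ", g.Select (w => w.Item2)))
            .ToList ();
    }

    // Hard-coded sample in sentence order, limited to the nouns and
    // verbs from the result shown above (not real tagger output).
    public static List<Tuple<string, string>> Sample ()
    {
        return new List<Tuple<string, string>> {
            Tuple.Create ("Noun", "movement"),
            Tuple.Create ("Noun", "enemy"),
            Tuple.Create ("Verb", "will"),
            Tuple.Create ("Verb", "jeopardise"),
            Tuple.Create ("Noun", "gunboats")
        };
    }
}
```

Calling TagReport.Format (TagReport.Sample ()) gives one string per tag, which can then be written out line by line. GroupBy keeps the tags in the order they first appear, so the listing follows the sentence.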
PS: You can find the complete code here – https://gist.github.com/prashantvc/8039121