Natural Language Processing in iOS Swift

Vicky Kumar

19 September 2023

Natural Language Processing (NLP) is the process of taking unstructured text and discerning characteristics about it. On iOS, natural language text can be analyzed with the NSLinguisticTagger class from Foundation. NLP is also widely used in machine learning.

It helps to understand text using the following techniques:

Tokenization
Lemmatization (identification of the root form of a word)
Parts of speech identification
Named entity recognition (proper names of people, places, and organizations)

To begin, create a new Xcode project and follow the steps below.

First, declare a constant that holds the text we want to analyze.

let quote = "Here's to the crazy ones. The misfits. The rebels. The troublemakers. The round pegs in the square holes. The ones who see things differently. They're not fond of rules. And they have no respect for the status quo. You can quote them, disagree with them, glorify or vilify them. About the only thing you can't do is ignore them. Because they change things. They push the human race forward. And while some may see them as the crazy ones, we see genius. Because the people who are crazy enough to think they can change the world, are the ones who do. - Steve Jobs (Founder of Apple Inc.)"

In Natural Language Processing, a tagger is a piece of software that reads text and “tags” it with information such as parts of speech, names, and languages, and that can also perform tasks such as lemmatization. On iOS, this is done with the NSLinguisticTagger class.

The next step is to declare the tagger and its options, which every method below relies on.
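The declarations for this step are not shown in the text, so the following is only a minimal sketch of what they might look like. The specific tag schemes and options are assumptions: the schemes cover every scheme used by the methods in this article, and the options skip punctuation and whitespace and join multi-word names into a single token. Only the names `tagger` and `options` come from the article's code.

```swift
import Foundation

// Assumed setup: one tagger configured with every scheme used in this article.
let tagger = NSLinguisticTagger(
    tagSchemes: [.tokenType, .lexicalClass, .nameType, .lemma],
    options: 0
)

// Assumed options: skip punctuation and whitespace tokens, and treat
// multi-word names such as "Steve Jobs" as a single token.
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
```

Creating the tagger once and reusing it across methods avoids rebuilding it on every call; each method then only needs to assign `tagger.string` before enumerating.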

The first technique is tokenization: the process of splitting text into smaller units such as paragraphs, sentences, or words. Here we split the quote above into words.
Let us create a method.

func tokenizeText(for text: String) {
    tagger.string = text
    let range = NSRange(location: 0, length: text.utf16.count)
    tagger.enumerateTags(in: range, unit: .word, scheme: .tokenType, options: options) { tag, tokenRange, stop in
        let word = (text as NSString).substring(with: tokenRange)
        print(word)
    }
}

Calling tokenizeText(for: quote) prints one word per line. The output begins:

Here
's
to
the
crazy
ones
The
misfits
The
rebels
The
troublemakers
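When the tokens are needed for further processing rather than for printing, the same enumeration can collect them into an array. The sketch below is self-contained and illustrative only: the `wordTokens(in:)` helper is a hypothetical name, not part of the article, and it creates its own tagger so that it can run on its own.

```swift
import Foundation

// Hypothetical helper: returns the word tokens of `text` instead of printing them.
func wordTokens(in text: String) -> [String] {
    let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
    tagger.string = text
    let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]
    let range = NSRange(location: 0, length: text.utf16.count)
    var tokens: [String] = []
    tagger.enumerateTags(in: range, unit: .word, scheme: .tokenType, options: options) { _, tokenRange, _ in
        // Convert the UTF-16 based NSRange into a native Swift String range.
        if let swiftRange = Range(tokenRange, in: text) {
            tokens.append(String(text[swiftRange]))
        }
    }
    return tokens
}

print(wordTokens(in: "The round pegs in the square holes."))
```

Converting with `Range(_:in:)` is one alternative to bridging through NSString; both approaches read the same token ranges reported by the tagger.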

Next is lemmatization, which reduces each word to its base (dictionary) form. For example, "running" is reduced to "run".

func lemmatization(for text: String) {
    tagger.string = text
    let range = NSRange(location: 0, length: text.utf16.count)
    tagger.enumerateTags(in: range, unit: .word, scheme: .lemma, options: options) { tag, tokenRange, stop in
        if let lemma = tag?.rawValue {
            print(lemma)
        }
    }
}
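Because the method above prints only the lemmas, it can be hard to see which word each lemma came from. The following self-contained sketch pairs every word with its lemma. The `lemmaPairs(in:)` helper is a hypothetical name introduced for illustration, and the lemma is kept optional because the tagger may not return one for every token, particularly on very short text.

```swift
import Foundation

// Hypothetical helper: pairs each word with its lemma, when the tagger finds one.
func lemmaPairs(in text: String) -> [(word: String, lemma: String?)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.lemma], options: 0)
    tagger.string = text
    let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]
    let range = NSRange(location: 0, length: text.utf16.count)
    var pairs: [(word: String, lemma: String?)] = []
    tagger.enumerateTags(in: range, unit: .word, scheme: .lemma, options: options) { tag, tokenRange, _ in
        let word = (text as NSString).substring(with: tokenRange)
        // `tag` is nil when no lemma is available for this token.
        pairs.append((word: word, lemma: tag?.rawValue))
    }
    return pairs
}

for pair in lemmaPairs(in: "They push the human race forward.") {
    print("\(pair.word) -> \(pair.lemma ?? "(no lemma)")")
}
```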

Next is part-of-speech identification, which labels each word in the text with its lexical class, such as noun, verb, or adjective.

func partsOfSpeech(for text: String) {
    tagger.string = text
    let range = NSRange(location: 0, length: text.utf16.count)
    tagger.enumerateTags(in: range, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange, _ in
        if let tag = tag {
            let word = (text as NSString).substring(with: tokenRange)
            print("\(word): \(tag.rawValue)")
        }
    }
}

An excerpt of the output of the above method:

The: Determiner
troublemakers: Noun
The: Determiner
round: Noun
pegs: Noun
in: Preposition
the: Determiner
square: Adjective
holes: Noun
The: Determiner
ones: Noun
who: Pronoun
see: Verb

Finally, we look at named entity recognition, which identifies the names of people, places, and organizations in the text.

func namedEntityRecognition(for text: String) {
    tagger.string = text
    let range = NSRange(location: 0, length: text.utf16.count)
    let tags: [NSLinguisticTag] = [.personalName, .placeName, .organizationName]
    tagger.enumerateTags(in: range, unit: .word, scheme: .nameType, options: options) { tag, tokenRange, stop in
        if let tag = tag, tags.contains(tag) {
            let name = (text as NSString).substring(with: tokenRange)
            print("\(name): \(tag.rawValue)")
        }
    }
}
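To use the recognized entities in an app instead of printing them, the results can be collected and returned. The sketch below is a self-contained variation on the method above; the `namedEntities(in:)` helper and the sample sentence are hypothetical, and which entities are actually found depends on the tagger's language model, so no particular output is assumed.

```swift
import Foundation

// Hypothetical helper: collects (name, entity type) pairs found in `text`.
func namedEntities(in text: String) -> [(name: String, type: String)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.nameType], options: 0)
    tagger.string = text
    // .joinNames keeps multi-word names together as a single token.
    let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
    let wanted: Set<NSLinguisticTag> = [.personalName, .placeName, .organizationName]
    let range = NSRange(location: 0, length: text.utf16.count)
    var entities: [(name: String, type: String)] = []
    tagger.enumerateTags(in: range, unit: .word, scheme: .nameType, options: options) { tag, tokenRange, _ in
        // Skip tokens that are not tagged as a person, place, or organization.
        guard let tag = tag, wanted.contains(tag) else { return }
        let name = (text as NSString).substring(with: tokenRange)
        entities.append((name: name, type: tag.rawValue))
    }
    return entities
}

for entity in namedEntities(in: "Steve Jobs founded Apple Inc. in California.") {
    print("\(entity.name): \(entity.type)")
}
```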

Conclusion

Follow the steps above to integrate Natural Language Processing into your app. If you have any issues or suggestions, leave them in the comment section.