Aiming for smarter outcomes in NLP


Natural Language Processing (NLP) is a key element of what drives our technology, but it presents its own challenges in ensuring that the outcomes are as ‘smart’ as you would anticipate.

In previously describing what NLP is – a collection of techniques that use linguistics and computer science to process and analyze natural language in a way that computers can ‘understand’ – I missed out an important component at the end.

It should say “in a way that computers can understand the same way as a person does”.

Artificial Intelligence begins with human input, and it takes a lot of work to get that right. When we were building our ingestion pipelines for data coming in from structured and unstructured sources, we spent a long time training the system to understand the structure and nuance of language.

That involved running thousands of sentences past a group of volunteers to categorize texts into their individual Who, What, Where, and When elements and the Actions that link them.
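To make that labeling exercise concrete, here is a minimal Python sketch of what one annotated sentence might look like. The `Annotation` class, its field names, the `unlinked_action_ends` helper, and the example sentence are all hypothetical, chosen for illustration only; they are not the schema or tooling we actually used.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Annotation:
    """One volunteer's labels for a single sentence (hypothetical schema)."""
    text: str
    who: List[str] = field(default_factory=list)
    what: List[str] = field(default_factory=list)
    where: List[str] = field(default_factory=list)
    when: List[str] = field(default_factory=list)
    # Each action links two labeled elements: (subject, action, object).
    actions: List[Tuple[str, str, str]] = field(default_factory=list)


def unlinked_action_ends(annotation: Annotation) -> List[str]:
    """Return action subjects/objects that were not labeled as any element."""
    labeled = (
        set(annotation.who)
        | set(annotation.what)
        | set(annotation.where)
        | set(annotation.when)
    )
    return [
        end
        for subject, _, obj in annotation.actions
        for end in (subject, obj)
        if end not in labeled
    ]


# Illustrative example sentence (invented for this sketch).
example = Annotation(
    text="Maria signed the contract in Lisbon on Tuesday.",
    who=["Maria"],
    what=["the contract"],
    where=["Lisbon"],
    when=["Tuesday"],
    actions=[("Maria", "signed", "the contract")],
)
```

A check like `unlinked_action_ends(example)` is one simple way to flag annotations where an Action points at something nobody labeled as a Who, What, Where, or When.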

With a background in linguistics, I found the whole process of combining the words with the science fascinating, particularly since our volunteers came from a variety of geographies and cultures, many with English as a second language.

The system will keep learning. And where we need to modify the lexicon for particular use cases, we can.

We’ll be looking into NLP and AI in more detail in a series of webinars we’re planning. If you’re interested in joining us, connect with me or send me a message.
