Natural Language Processing 101

Let’s say you want to set an alarm on your iPhone. You command Siri, and tell it to set an alarm for tomorrow. Siri responds ‘For what time?’, and you specify 9 am. The alarm is set.

In this short interaction, you activated a device, which heard your speech, processed it, executed an action, and responded with a sentence. This entire exchange was made possible by Natural Language Processing, or NLP. NLP is what allows a machine or program to process human language. It powers well-known AI assistants like Siri and Alexa, as well as chatbots.

Understanding terms

Natural Language Processing is the umbrella term for any machine’s ability to recognize what is said to it, understand its meaning, determine the appropriate action, and respond in language the user will understand.

A subset of NLP is Natural Language Understanding, or NLU. NLU goes beyond basic sentence structure and attempts to understand the intended meaning of language. Human speech is peppered with nuances, subtleties, mispronunciations, and colloquialisms, and NLU is designed to tackle these complexities. One of the main areas of research in language processing is the transition from NLP to NLU. As Anush Fernandes at Medium writes:

NLU deals with the much narrower facet of how to best handle unstructured inputs and convert them into a structured form that a machine can understand and act upon.

Finally, Natural Language Generation, or NLG, is the text a machine produces itself. In the example above, Siri’s response ‘For what time?’ would fall under NLG.

Process

Let’s use the above example of Siri and the alarm. At a very basic level:

  1. You ask Siri to set an alarm. Siri converts your speech (audio file) to text.
  2. Using Natural Language Processing (turning text into structured data), Siri converts this plain text request into commands for itself.
  3. Siri then processes this data in a decision engine.
  4. Using Natural Language Generation (turning structured data into text), Siri ‘writes back’, in this case with ‘For what time?’
  5. You specify ‘9 am’, which is again processed through NLP and passed to the decision engine.
  6. Siri sets the alarm for you.
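
The steps above can be sketched in a few lines of Python. This is only a toy illustration, not how Siri actually works: the `Intent` schema, the function names, and the regular expressions are all assumptions made for this example, and the speech-to-text step is skipped by starting from text.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Intent:
    """Structured form of a request (hypothetical schema for this sketch)."""
    action: str
    day: Optional[str] = None
    time: Optional[str] = None


def understand(text: str, pending: Optional[Intent] = None) -> Intent:
    """Steps 2 and 5: turn plain text into structured data."""
    lowered = text.lower()
    # Reuse the pending request so a follow-up answer can fill in missing details.
    intent = pending or Intent(action="unknown")
    if "alarm" in lowered:
        intent.action = "set_alarm"
    day = re.search(r"\b(today|tomorrow)\b", lowered)
    if day:
        intent.day = day.group(1)
    time = re.search(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", lowered)
    if time:
        hour, minute, meridiem = time.groups()
        intent.time = f"{int(hour)}:{minute or '00'} {meridiem}"
    return intent


def decide(intent: Intent) -> str:
    """Step 3: a tiny decision engine that picks the next move."""
    if intent.action != "set_alarm":
        return "unsupported"
    if intent.time is None:
        return "ask_time"
    return "confirm"


def generate(decision: str, intent: Intent) -> str:
    """Step 4: turn the structured decision back into text."""
    if decision == "ask_time":
        return "For what time?"
    if decision == "confirm":
        day = f" {intent.day}" if intent.day else ""
        return f"Alarm set for {intent.time}{day}."
    return "Sorry, I can't help with that."


def respond(text: str, pending: Optional[Intent] = None):
    """Run one turn of the conversation and return (reply, updated intent)."""
    intent = understand(text, pending)
    return generate(decide(intent), intent), intent


reply, intent = respond("Set an alarm for tomorrow")
print(reply)  # For what time?
reply, intent = respond("9 am", intent)
print(reply)  # Alarm set for 9:00 am tomorrow.
```

Keeping the three stages as separate functions mirrors the list: text becomes structured data, the decision engine works only on that structured data, and generation turns the result back into a sentence.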

Data Annotation

How do you train a program to perform NLP? The following are a few ways to break down and organize data so that you can train your program and improve its NLP:

Entity Annotation

This refers to extracting ‘units of information’ from sentences or other unstructured data and making them structured. These units can include proper nouns, such as the names of people, organizations, and locations. They can also include numeric expressions such as times, dates, monetary amounts, and percentages.
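
As a rough illustration of what structured output can look like, the sketch below pulls a few numeric expressions out of a sentence with regular expressions. The label names and patterns are assumptions chosen for this example and are deliberately simple. Recognizing names of people, organizations, and locations usually requires a model trained on annotated data, so it is left out here.

```python
import re

# Hypothetical label names with deliberately simple patterns.
PATTERNS = {
    "TIME": r"\b\d{1,2}(?::\d{2})?\s?(?:am|pm)\b",
    "MONEY": r"\$\d+(?:,\d{3})*(?:\.\d{2})?",
    "PERCENT": r"\b\d+(?:\.\d+)?%",
    "DATE": r"\b\d{4}-\d{2}-\d{2}\b",
}


def annotate(text):
    """Return (start, end, label, surface_text) tuples sorted by position."""
    entities = []
    for label, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            entities.append((match.start(), match.end(), label, match.group()))
    return sorted(entities)


sentence = "Meet at 9 am on 2024-03-01; the ticket costs $1,250.00, a 15% discount."
for start, end, label, surface in annotate(sentence):
    print(label, repr(surface))
# TIME '9 am'
# DATE '2024-03-01'
# MONEY '$1,250.00'
# PERCENT '15%'
```

Each tuple records where the entity sits in the original text, which is the kind of span-plus-label record that annotated training data typically stores.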

Semantic Annotation

This helps assess and improve search results. Essentially, companies are looking for ways to improve their search relevance so that customers can actually find their products in search engines. The problem is that product descriptions vary greatly depending on the source, and are often inaccurate. Semantic annotation helps improve search results by tagging product titles and search queries. At Gengo, we can build datasets that help you predict which categories best fit a given product, making ecommerce processes and product classification easier, faster and less error-prone.
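
To make the idea of tagging product titles with categories concrete, here is a minimal sketch that scores a title against keyword lists. The categories and keywords are invented for this example; in practice they would come from an annotated dataset, and a trained classifier would normally replace the simple overlap count.

```python
import re

# Hypothetical category vocabulary; a real one would be built from annotated data.
CATEGORY_KEYWORDS = {
    "footwear": {"shoe", "shoes", "sneaker", "sneakers", "boot", "boots"},
    "electronics": {"phone", "laptop", "charger", "headphones", "usb"},
    "kitchen": {"pan", "kettle", "knife", "blender", "mug"},
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def predict_category(title):
    """Return the category sharing the most keywords with the title, or None."""
    tokens = tokenize(title)
    scores = {cat: len(tokens & words) for cat, words in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(predict_category("Men's Running Shoes, Lightweight Sneakers"))  # footwear
print(predict_category("USB-C Phone Charger 20W"))                    # electronics
print(predict_category("Garden hose"))                                # None
```

The same function can be applied to search queries, so a query and a product title that land in the same category can be matched even when they share no exact words.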

Linguistic Annotation

This refers to assessing the subject of any given sentence. It’s a broad category, but essentially it covers anything to do with the analysis of text, whether that be sentiment analysis of social media data or using NLP to answer routine questions. Linguists and translators at Gengo provide a wide variety of services, including part-of-speech tagging and audio speech analysis. Our team includes over 21,000 qualified workers fluent in English and 36 other languages.

Use Cases

NLP can be used in a variety of cases, such as:

  1. Personal Assistants: as described above, services like Siri or Alexa.
  2. Chatbots: since chatbots mimic real conversations, they rely heavily on natural language processing.
  3. Customer Service: many companies transcribe and analyze customer call recordings. NLP helps in analyzing such data, and responding to customer needs faster.
  4. Sentiment Analysis: NLP is used to determine the tone of a given piece of writing. This is especially useful for companies looking to understand how their product is received on social media.
  5. Healthcare: NLP has huge implications for healthcare. Applications range from healthcare assistants (similar to Siri but trained on medical terminology) to image classification (‘understanding’ a medical scan).
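
As one small example from the list above, sentiment analysis can be approximated by counting positive and negative words. The word lists and the one-word negation rule below are assumptions made for this sketch; production systems typically use models trained on annotated text instead.

```python
import re

# Tiny hypothetical lexicons; real ones contain thousands of scored words.
POSITIVE = {"love", "great", "excellent", "good", "amazing", "fast"}
NEGATIVE = {"hate", "bad", "terrible", "slow", "broken", "awful"}
NEGATIONS = {"not", "never", "no"}


def sentiment(text):
    """Classify text as 'positive', 'negative' or 'neutral' by word counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        # Flip the polarity when the previous word is a negation ("not good").
        if polarity and i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("I love this phone, the battery is great!"))  # positive
print(sentiment("Not good. The screen arrived broken."))      # negative
print(sentiment("It arrived on Tuesday."))                    # neutral
```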
