
Natural Language and Natural Language Processing (NLP) in Azure AI Solutions

Source: My personal notes from Course AI-102T00-A: Develop AI solutions in Azure (Microsoft Learn), with labs from the Exercises for Azure AI Language.

Analyze Text and Natural Language Processing (NLP)

Use cases for NLP → service output in Azure

  • Language detection: given documents, determines the language(s) of the text → detected language(s), confidence scores, identifiers
  • Key phrase extraction: keywords → key phrase array
  • Sentiment analysis: determines whether text is positive, neutral, or negative → sentiment breakdown by sentence, confidence scores
  • Named entity recognition: finds and identifies specific things in text, such as locations, people, addresses, food, or actions → entity category, location in text, confidence scores
    • In Azure, can include custom named entities such as an organization’s own terms
  • Entity linking: disambiguates a word (entity) in text and its meaning → links a keyword in text to its Wikipedia article
  • Summarizing text and content: outline → extractive summarization selects key sentences directly from the text; abstractive summarization rewrites key ideas using new phrasing
  • Personally identifiable information (PII) detection: detects and/or removes sensitive information such as names, phone numbers, and credit card numbers → labeled sensitive fields, redacted text array, type of PII, confidence scores

Customizable models are available for each specific case. Generally, Azure AI services return JSON responses.

Deployment options include Azure AI Foundry, an Azure AI Services resource, or a Language service resource.

Send data for NLP to Azure AI services and get a response back.
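A minimal sketch of sending text to the Language service with the Python SDK (azure-ai-textanalytics); the endpoint and key below are placeholders for your own Language resource.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

# Placeholder endpoint and key for a Language resource.
client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

documents = ["The rooms were spotless and the staff were friendly."]

# Each call returns one result object per input document.
languages = client.detect_language(documents)
print(languages[0].primary_language.name,
      languages[0].primary_language.confidence_score)

phrases = client.extract_key_phrases(documents)
print(phrases[0].key_phrases)
```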

Conversational Understanding And Question Answering

The goal is to have conversations between people and machines in which the machine completes an action. The application gets user input, determines the user’s intent (semantic meaning), and takes the appropriate action, such as calling an API or running a function.

For example, a hotel chat app could help guests with requests like room service. The conversation would need to determine the guest, room number, and food order details, and then place the kitchen order.

Interpreting user input is called natural language understanding (NLU).

In Azure, conversational language understanding (CLU) enables building an NLU component in a conversational application.

The NLU model interprets utterances from the natural language input and maps them to intents that assign semantic meaning. It can also recognize entities such as quantities, dates and times, email addresses, and identifiers.

Examples:

  • Utterance: what is the time in Toronto
  • Intent: Get Time (function)
  • Entities: Toronto (location)

Specific intents, entities, and synonyms, together with example utterances, are used to train a model to understand them. The trained model can then be published to an endpoint.
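A minimal sketch of querying a published CLU project with the azure-ai-language-conversations Python SDK; the endpoint, key, project name, and deployment name are placeholders, and the intent and entity names mirror the example above.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.language.conversations import ConversationAnalysisClient

client = ConversationAnalysisClient(
    "https://<your-language-resource>.cognitiveservices.azure.com/",
    AzureKeyCredential("<your-key>"),
)

# analyze_conversation takes the raw JSON task payload.
result = client.analyze_conversation(
    task={
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {
                "id": "1",
                "participantId": "user",
                "text": "what is the time in Toronto",
            }
        },
        "parameters": {
            "projectName": "<your-clu-project>",
            "deploymentName": "production",
        },
    }
)

prediction = result["result"]["prediction"]
print(prediction["topIntent"])                 # e.g. GetTime
for entity in prediction["entities"]:
    print(entity["category"], entity["text"])  # e.g. Location  Toronto
```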

Question answering: a knowledge base of question and answer pairs combined with NLP. Content, such as an FAQ, is indexed and the service is published as a REST endpoint.

Question Answering (QA) vs Language Understanding

QA is focused on static answers from known content. For LLMs, it is similar to Retrieval Augmented Generation (RAG).

NLU focuses on understanding intents and performing actions.

Custom named entity recognition: examples include recognizing organization or industry terms and custom locations.

Define entity labels and label documents. The custom named entities are then used to train models.

Minor changes, such as adding new named entities, can be made in production since the solution architecture stays the same.
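A hedged sketch of calling a trained custom named entity recognition project from Python with azure-ai-textanalytics (5.2.0 or later); the project name, deployment name, endpoint, key, and sample sentence are placeholders.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

# Long-running operation against a trained custom NER project.
poller = client.begin_recognize_custom_entities(
    ["Contract 4521 was signed at the Fabrikam Lakeside branch."],
    project_name="<your-custom-ner-project>",
    deployment_name="production",
)

for doc in poller.result():
    if not doc.is_error:
        for entity in doc.entities:
            print(entity.text, entity.category, entity.confidence_score)
```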

Speech use cases include:

  • Speech ↔ text (speech to text and text to speech)
  • Translation
  • Speaker recognition - who is speaking
  • Pronunciation assessment
  • Intent recognition

Azure Speech services are available through the UI, CLI, SDKs, and REST APIs. See AI Speech in Azure for a description of capabilities.
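A minimal sketch of speech to text and text to speech with the azure-cognitiveservices-speech Python SDK; the key and region are placeholders for your Speech resource.

```python
import azure.cognitiveservices.speech as speechsdk

# Placeholder key and region for a Speech resource.
speech_config = speechsdk.SpeechConfig(subscription="<your-key>", region="<your-region>")
speech_config.speech_recognition_language = "en-US"

# Recognize a single utterance from the default microphone.
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
result = recognizer.recognize_once_async().get()

if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print(result.text)

# Text to speech with the same resource.
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
synthesizer.speak_text_async("Your room service order is on its way.").get()
```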

Upload speech data to train a custom model (a base model plus changes). The model is tested and updated, and the accepted model gets an endpoint for use.

A custom voice can be combined with the speech and NLP services above.

Speech Synthesis Markup Language (SSML) is an XML-based language with customization options (a sketch follows this list):

  • Speaking styles
  • Pauses and silence
  • Phonetic pronunciation
  • Prosody (pitch, range, rate)
  • Say-as (number, date, time, address)
  • Insert recorded speech, background
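A short sketch of sending SSML through the Speech SDK’s speak_ssml_async call, showing voice selection, a speaking style, a pause, prosody, and say-as; the voice name, resource details, and wording are placeholders.

```python
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<your-key>", region="<your-region>")
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

# SSML controlling the voice, speaking style, a pause, prosody, and say-as interpretation.
ssml = """
<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis'
       xmlns:mstts='https://www.w3.org/2001/mstts' xml:lang='en-US'>
  <voice name='en-US-JennyNeural'>
    <mstts:express-as style='cheerful'>
      Your table is booked for <say-as interpret-as='time' format='hms12'>7:30 PM</say-as>.
    </mstts:express-as>
    <break time='500ms'/>
    <prosody rate='-10%' pitch='+5%'>We look forward to seeing you.</prosody>
  </voice>
</speak>
"""
synthesizer.speak_ssml_async(ssml).get()
```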

Translation use cases:

  • Detect the language and translate, which can be from one language to many languages.
  • Transliteration from one language’s character set to the target language’s character set, such as Japanese characters to English phonetic words.
  • Translation can be text to text, speech to speech, or between text and speech (text to speech and speech to text), in batch or to multiple target languages.

Translation can be applied to document translation, custom translation for a domain or industry, and single or batch translation. Options include grammatical accuracy and a profanity filter.
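A minimal sketch of one-to-many text translation with the azure-ai-translation-text Python SDK; the key and region are placeholders, and parameter names follow the 1.0.0 release (they changed across preview versions).

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.text import TextTranslationClient

# Placeholder key and region for a Translator resource.
client = TextTranslationClient(
    credential=AzureKeyCredential("<your-key>"),
    region="<your-region>",
)

# One source text translated into several target languages in a single call.
response = client.translate(
    body=["Where is the nearest train station?"],
    to_language=["fr", "ja"],
)

for item in response:
    if item.detected_language:
        print("Detected:", item.detected_language.language, item.detected_language.score)
    for translation in item.translations:
        print(translation.to, translation.text)
```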

Audio can be added to apps: prompts can be given as audio and text, and responses returned as audio. API calls will include audio media.

Create a Language service and send hotel review data using the SDK for processing, returning sentiment, entity and key phrase analysis, and entity linking.
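A hedged sketch of the kind of SDK calls involved, using azure-ai-textanalytics for sentiment, entity recognition, and entity linking; the resource details and review text are placeholders.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

reviews = ["The Margie's Travel hotel in Seattle was excellent, but breakfast was slow."]

# Document-level and sentence-level sentiment with confidence scores.
sentiment = client.analyze_sentiment(reviews)[0]
print(sentiment.sentiment, sentiment.confidence_scores)
for sentence in sentiment.sentences:
    print(sentence.text, sentence.sentiment)

# Named entities with categories and confidence scores.
entities = client.recognize_entities(reviews)[0]
for entity in entities.entities:
    print(entity.text, entity.category, entity.confidence_score)

# Entity linking, e.g. a Wikipedia link for "Seattle".
linked = client.recognize_linked_entities(reviews)[0]
for entity in linked.entities:
    print(entity.name, entity.url)
```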

Exercise: Create a question answering solution

Create a Language service that uses Azure AI Search. Configure the knowledge base with additional questions and follow-up answers. Connect to the service and ask questions using the SDK.
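A minimal sketch of querying the deployed knowledge base with the azure-ai-language-questionanswering Python SDK; the endpoint, key, project name, deployment name, and question are placeholders.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.language.questionanswering import QuestionAnsweringClient

client = QuestionAnsweringClient(
    "https://<your-language-resource>.cognitiveservices.azure.com/",
    AzureKeyCredential("<your-key>"),
)

# Project and deployment names refer to a deployed question answering project.
response = client.get_answers(
    question="What time is check-out?",
    project_name="<your-qna-project>",
    deployment_name="production",
)

for answer in response.answers:
    print(answer.answer, answer.confidence)
```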

Create a Language service for custom text analysis and a storage account for the custom text documents. Label training and testing data and train a new model. Using the SDK, submit new text documents for classification and review the categories and confidence scores.
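A hedged sketch of submitting a document to a trained custom single-label classification project with azure-ai-textanalytics; the project and deployment names, resource details, and sample text are placeholders.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

# Long-running classification against a trained custom project.
poller = client.begin_single_label_classify(
    ["A thrilling courtroom drama with an unexpected twist."],
    project_name="<your-classification-project>",
    deployment_name="production",
)

for doc in poller.result():
    if not doc.is_error:
        for classification in doc.classifications:
            print(classification.category, classification.confidence_score)
```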

Exercise: Develop an audio-enabled chat app

Deploy an Azure AI Foundry model with Phi-4-multimodal-instruct. Use Azure AI Foundry and the Azure AI Inference SDK to create a client application that uses a multimodal model to generate responses to audio. The audio is sent to the model, which provides an acknowledgement of the requests.
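A rough sketch of the kind of call involved, using the azure-ai-inference ChatCompletionsClient with audio passed as a base64 data URL in the message content; the endpoint, key, model deployment name, audio file, and the exact audio content schema are assumptions and may differ from the lab's code.

```python
import base64
from azure.core.credentials import AzureKeyCredential
from azure.ai.inference import ChatCompletionsClient

# Placeholder endpoint and key for an Azure AI Foundry model deployment.
client = ChatCompletionsClient(
    endpoint="https://<your-foundry-resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential("<your-key>"),
)

# Encode a local audio file as a data URL (hypothetical file name).
with open("order.mp3", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()

response = client.complete(
    model="Phi-4-multimodal-instruct",
    messages=[
        {"role": "system", "content": "You are a hotel assistant that summarizes guest voice messages."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the request in this recording."},
                {"type": "audio_url", "audio_url": {"url": f"data:audio/mp3;base64,{audio_b64}"}},
            ],
        },
    ],
)

print(response.choices[0].message.content)
```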