Azure ML Prompt Flow overview

Azure ML has integrated a new tool called Prompt Flow.

You can build workflows here for different kinds of robotic process automation, as well as advanced LLM-based flow testing and automation.

From the Authoring page, click on Prompt flow and you will see the page below.

If you click Create new, you will see three types of flows:

  1. Standard flow
  2. Chat flow
  3. Evaluation flow

A standard flow combines Python code and LLM nodes to test a process.

A chat flow is for testing and debugging the conversational experience of your process, and an evaluation flow is for output testing and checking, content moderation testing, and so on.

There are also some samples available there. In this example we have selected the Web Classification flow and cloned it.

Once you click Clone, it will take you to the page below.

You will see that the entire flow is ready. Before you run this flow, you need to configure two things:

  1. LLM connection: configure and connect your LLM models, from OpenAI or Azure OpenAI, to Azure ML Studio.
  2. Runtime: configure and connect a runtime environment.

The graph above shows the steps of this individual flow: it takes a string as input (a web URL), fetches content from that URL, parses it, summarizes it, and matches it against examples for the classification use case. Finally, it uses a built-in function to convert the result into a dictionary for the final output.
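The stages described above can be sketched as plain Python functions. This is an illustrative sketch only: the function names, keyword-based classifier, and sample inputs are hypothetical stand-ins for the flow's actual LLM nodes.

```python
import re

def parse_content(html: str) -> str:
    # The flow's parsing step strips markup to get plain text.
    return re.sub(r"<[^>]+>", " ", html).strip()

def summarize(text: str, max_words: int = 20) -> str:
    # The real flow calls an LLM here; truncation stands in for it.
    return " ".join(text.split()[:max_words])

def classify_with_examples(summary: str) -> str:
    # The real flow prompts an LLM with few-shot examples; keyword
    # matching stands in for that call.
    examples = {
        "Movie": ["film", "movie"],
        "Academic": ["university", "research"],
        "App": ["download", "install"],
    }
    for label, words in examples.items():
        if any(w in summary.lower() for w in words):
            return label
    return "None"

def convert_to_dict(url: str, category: str) -> dict:
    # Mirrors the flow's final Python node that shapes the output.
    return {"url": url, "category": category}

url = "https://en.wikipedia.org/wiki/Example_film"  # placeholder input URL
page_html = "<p>A 2020 film about machine learning.</p>"  # stands in for fetched content
result = convert_to_dict(url, classify_with_examples(summarize(parse_content(page_html))))
print(result)
```

Each function mirrors one node in the graph, so the chained call at the end follows the same fetch, parse, summarize, classify, convert order as the flow.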

Here we have provided the link of a Wikipedia page of a movie as the input string. The flow fetched the content, parsed it, summarized it, and then used it for classification.

Once we run the entire flow all the steps are executed in seconds.

This is the end result we got: the page type for this movie page came out as Academic.

It was able to classify the app page, sports page, and None-type page successfully. This is just one example of an LLM flow built with Prompt Flow in Azure ML.

LLM Key Issues and Applications

Key LLM Issues

•Misinformation – wrong or biased answers; this can result from insufficient prompt engineering.

•Intrinsic Hallucination – fabricated content.

•Extrinsic Hallucination – unverified content.

Hallucination can be reduced by increasing training data, adding contextual references, and using RAG.

LLMs are non-deterministic: the same prompt at different times can give different answers.

Please read this paper to learn more: https://arxiv.org/pdf/2311.05232.pdf
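The non-determinism comes from temperature-based sampling over the model's next-token distribution. A minimal sketch, with made-up logits, shows why a low temperature collapses the output to one choice while a high temperature keeps it varied:

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    # Temperature rescales logits before softmax: low values sharpen the
    # distribution (near-deterministic output), high values flatten it
    # (more varied output), which is why repeated prompts can differ.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    r, acc = rng.random(), 0.0
    for i, e in enumerate(exps):
        acc += e / total
        if r <= acc:
            return i
    return len(exps) - 1

rng = random.Random(0)
logits = [2.0, 1.0, 0.5]  # made-up next-token scores
low_t = {sample_with_temperature(logits, 0.05, rng) for _ in range(50)}
high_t = {sample_with_temperature(logits, 5.0, rng) for _ in range(50)}
print(low_t, high_t)  # low temperature picks one token; high temperature varies
```

This is why the API examples later in this post set `temperature` explicitly: it directly controls how much the sampled output can vary between runs.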

Things GenAI Can Do

•It can summarize a lot of information.

•Can perform better with an augmented context – RAG (Retrieval Augmented Generation)

•Can be a great coding assistant.

•Can be supported by other software processes (Backend API communication and services for better performance)

•Can be a great way to expose and augment your own knowledge base for users.

Things GenAI Can’t Do

•Can't be certain about anything – the hallucination problem.

•Can't figure out new things on its own (RAG helps, but it doesn't alter the underlying model itself).

•Can't be certain whether another piece of content was created by GenAI or not.

•Can't properly cite its own sources of information (a cited source can itself be hallucinated; RAG can provide a citable source).

•GenAI can't take your job yet. It has no way to proactively learn for itself.

LLM Applications

•Writing Assistance – technical, creative writing, general editing, documentation, programming.

•Information Retrieval – search engine support, conversational recommendation, document summarization, text interpretation.

•Commercial – customer support, machine translation, automation of workflow / knowledge tests, business management, medical diagnosis.

•Individual Use – productivity support, Q&A, brainstorming, education, problem solving.

Don’t forget!
GenAI is not the only AI!

LLMOps & MLOps

LLMOps:

Large Language Model Ops (LLMOps) encompasses the processes, techniques, and tools used for the operational management of large language models in production environments.

In terms of MLOps, there are four key steps:

  • data prep
  • build and train
  • deploy
  • monitor

In terms of LLMOps, the key steps are the same as in MLOps:

  • data prep
  • build and train
  • deploy
  • monitor

But the key difference is in the build-and-train part.

Under MLOps, it is:

  • Feature engineering
  • Model & algorithm selection
  • Train model
  • Test model
  • Hyperparameter tuning

For LLMOps, it is:

  • Base model selection
  • Prompt engineering
  • Fine-tuning
  • RAG
  • Hyperparameter tuning
  • Repeatable pipelines
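The LLMOps build steps above can be sketched as a repeatable pipeline of composable stages. This is a hypothetical illustration: every step name, config key, and value below is made up for the sketch, not a real LLMOps framework API.

```python
from functools import reduce

# Each step maps a config dict to a new config dict, so the same sequence
# can be re-run whenever the base model, prompts, or data change.
def select_base_model(cfg):
    return {**cfg, "model": "text-davinci-003"}

def engineer_prompt(cfg):
    return {**cfg, "prompt_template": "summarize this passage: {text}"}

def attach_rag(cfg):
    return {**cfg, "retriever": "keyword-index"}  # placeholder retriever name

def tune_hyperparameters(cfg):
    return {**cfg, "temperature": 0.9, "max_tokens": 25}

BUILD_PIPELINE = [select_base_model, engineer_prompt, attach_rag, tune_hyperparameters]

def run_pipeline(cfg):
    # Apply the steps in order; rerunning yields the same result (repeatable).
    return reduce(lambda c, step: step(c), BUILD_PIPELINE, cfg)

pipeline_cfg = run_pipeline({})
print(pipeline_cfg)
```

Because each stage is a pure function of the config, the whole build stage can be versioned and re-run deterministically, which is the point of the "repeatable pipelines" step.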

Why do we need LLMOps?

•LLM Development lifecycle

•Efficiency: faster model and pipeline development

•Scalability: manage large numbers of models at scale.

•Risk Reduction: transparency, regulatory compliance, monitoring, and drift detection.

LLMOps Components

•Exploratory data analysis

•Data preparation and prompt engineering

•Model fine-tuning

•Introducing RAG

•Model review and governance

•Model inference and serving

•Model monitoring with human in the loop

Generative AI for Solution Architects Part 1

What are the use cases of generative AI for a solution architect?

Here are the use cases of generative AI for a solution architect,

•Integrating GenAI models in workflow automation.

•Understanding GenAI life cycle.

•Understanding use cases, LLMOps, limitations, and ways to improve GenAI tools.

•Using GenAI in day-to-day activities.

•Preparing your own version of a GPT model for documentation support.

Enterprise LLM Life Cycle:

Source: https://azure.microsoft.com/en-us/blog/building-for-the-future-the-enterprise-generative-ai-application-lifecycle-with-azure-ai/


1. Ideating and exploring loop

•Understanding business requirements

•Using use case specific foundational models

•Preparing data, prompts and identifying limitations.

2. Building and augmenting loop

•Guiding and enhancing the LLM to meet specific needs.

•Base model trained on enterprise local and real-time data.

•RAG-based SQL and non-SQL data injection into the prompt for better context.

•LLM fine-tuning to adapt to specific response scenarios.

•Combine prompting, RAG, and fine-tuning for optimal results.

3. Operation loop

•LLM transition from development to production.

•Deployment, monitoring, incorporating content safety, and integrating the CI/CD process.

•Azure AI prompt flow for deployment in AML.

•AML collects production data for monitoring and helps with cost planning and cost understanding.

•Drift detection metrics for quality and safety monitoring.

4. Managing Loop

•Enterprise life cycle for governance, management and security.

•Responsible AI principles.

Now the question is: how can you make your LLM smarter?

There are three key ways:

•Prompting

•Fine-tuning

•RAG (Retrieval-Augmented Generation)

[Quadrant chart – left axis: outside knowledge; right axis: domain knowledge. Top row: RAG | Fine-tune + RAG. Bottom row: Prompt | Fine-tune.]

•Access to current info (from RAG + fine-tuning)

•Use of specific info (from RAG)

•Reduction of hallucination (more specific context)

•Content traceability (known content source)
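A minimal RAG sketch shows the idea behind these benefits: retrieve the most relevant document and prepend it to the prompt so the model answers from a known, citable source. The keyword-overlap retriever and sample documents below are illustrative stand-ins for a real vector index.

```python
def retrieve(question, documents):
    # Pick the document sharing the most words with the question.
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question, documents):
    # Prepend the retrieved context so the answer is grounded and traceable.
    context = retrieve(question, documents)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "The winter solstice is the day with the shortest daylight.",
    "The Mumbai Indians play in the Indian Premier League.",
]
prompt = build_prompt("Which day has the shortest daylight?", docs)
print(prompt)
```

Since the retrieved document is known, it can be cited alongside the answer, which is exactly the traceability point in the list above.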

Email Summarization using OpenAI Text-davinci-003 GPT model

First, we create a cloud flow with the trigger "When a new email arrives".

We have set the subject string and the To address, and connected it to our email ID.

Whenever we get an email with that defined subject, this action will trigger.

Once this action triggers, we convert the HTML to text so that we can process it.

We want to take the email body and summarize it.

From the text body, we will pass it to GPT model through prompt for summarization.

Our HTTP API call will look like this,

Method: POST

URI: https://openaiapitest.openai.azure.com/openai/deployments/test_vinci1/completions?api-version=2023-03-15-preview

Headers: api-key **********************

Body:

{
  "model": "text-davinci-003",
  "prompt": "summarize this passage,'@{body('Html_to_text')}'",
  "temperature": 0.9,
  "max_tokens": 25
}

As we want a summary, we used 0.9 as the temperature and a maximum of 25 tokens.

I passed the raw text of the email body in the prompt, along with the "summarize this passage" instruction.
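The request body can also be built programmatically. This sketch reuses the endpoint, deployment name, and api-version shown above; the api-key is a placeholder you must supply, and the actual network call is left commented out.

```python
import json

ENDPOINT = ("https://openaiapitest.openai.azure.com/openai/deployments/"
            "test_vinci1/completions?api-version=2023-03-15-preview")

def build_summarize_request(email_text, temperature=0.9, max_tokens=25):
    # Mirrors the body used in the Power Automate HTTP action above.
    return {
        "model": "text-davinci-003",
        "prompt": f"summarize this passage,'{email_text}'",
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

body = build_summarize_request("Winter is the coldest season of the year.")
print(json.dumps(body, indent=2))

# To actually send it (requires a valid key):
# import requests
# resp = requests.post(ENDPOINT, headers={"api-key": "<your-key>"}, json=body)
```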

The overall process looks like this:

When this flow runs,

We receive this HTML from the email; after converting it to text, it looks like this:

Winter is the coldest season of the year in polar and temperate climates. It occurs after autumn and before spring. The tilt of Earth’s axis causes seasons; winter occurs when a hemisphere is oriented away from the Sun. Different cultures define different dates as the start of winter, and some use a definition based on weather.

When it is winter in the Northern Hemisphere, it is summer in the Southern Hemisphere, and vice versa. In many regions, winter brings snow and freezing temperatures. The moment of winter solstice is when the Sun’s elevation with respect to the North or South Pole is at its most negative value; that is, the Sun is at its farthest below the horizon as measured from the pole. The day on which this occurs has the shortest day and the longest night, with day length increasing and night length decreasing as the season progresses after the solstice.

The earliest sunset and latest sunrise dates outside the polar regions differ from the date of the winter solstice and depend on latitude. They differ due to the variation in the solar day throughout the year caused by the Earth’s elliptical orbit (see: earliest and latest sunrise and sunset).

The HTTP request looks like this:

{
  "model": "text-davinci-003",
  "prompt": "summarize this passage,'Winter is the coldest season of the year in polar and temperate climates. It\noccurs after autumn and before spring. The tilt of Earth's axis causes seasons;\nwinter occurs when a hemisphere is oriented away from the Sun. Different\ncultures define different dates as the start of winter, and some use a\ndefinition based on weather.\nWhen it is winter in the Northern Hemisphere, it is summer in the Southern\nHemisphere, and vice versa. In many regions, winter brings snow and freezing\ntemperatures. The moment of winter solstice is when the Sun's elevation with\nrespect to the North or South Pole is at its most negative value; that is, the\nSun is at its farthest below the horizon as measured from the pole. The day on\nwhich this occurs has the shortest day and the longest night, with day length\nincreasing and night length decreasing as the season progresses after the\nsolstice.\nThe earliest sunset and latest sunrise dates outside the polar regions differ\nfrom the date of the winter solstice and depend on latitude. They differ due to\nthe variation in the solar day throughout the year caused by the Earth's\nelliptical orbit (see: earliest and latest sunrise and sunset).\n'",
  "temperature": 0.9,
  "max_tokens": 100
}

And the HTTP response looks like this:

{
  "id": "cmpl-70XMO5MDE6dWAvIDUNBFT4BwQrz93",
  "object": "text_completion",
  "created": 1680362592,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": "\n\nWinter is the coldest season of the year in polar and temperate climates, occurring after autumn and before spring. It is caused by the tilt of Earth's axis, meaning that when it is winter in the Northern Hemisphere, it is summer in the Southern Hemisphere. Winter brings low temperatures, snow in many regions, and the shortest day and longest night at the winter solstice. The earliest and latest sunrise and sunset dates vary depending on latitude and are determined by the Earth's elliptical orbit",
      "index": 0,
      "finish_reason": "length",
      "logprobs": null
    }
  ],
  "usage": {
    "completion_tokens": 100,
    "prompt_tokens": 262,
    "total_tokens": 362
  }
}

Azure OpenAI model usage via the API

In our previous blog, we saw how to create a resource and deploy a model for OpenAI.

Now copy the key and the REST endpoint.

You can build the API endpoint from the REST endpoint; the final endpoint will look like this:

https://openaiapitest.openai.azure.com/openai/deployments/test_vinci1/completions?api-version=2023-03-15-preview

where test_vinci1 is the model deployment name I got during the deployment process, and 2023-03-15-preview is the API version I am using.
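The endpoint can be assembled from its parts; the resource name, deployment name, and api-version below are the values used in this post.

```python
def completions_endpoint(resource: str, deployment: str, api_version: str) -> str:
    # Azure OpenAI completions endpoint pattern:
    # https://<resource>.openai.azure.com/openai/deployments/<deployment>/completions?api-version=<version>
    return (f"https://{resource}.openai.azure.com/openai/deployments/"
            f"{deployment}/completions?api-version={api_version}")

url = completions_endpoint("openaiapitest", "test_vinci1", "2023-03-15-preview")
print(url)
```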

You need to prepare a POST request in Postman to test the API.

Copy the API key that you get during the resource creation process.

In the body, I am adding this part:

{
  "model": "text-davinci-003",
  "prompt": "Tell me about football world cup 2022",
  "temperature": 0.1,
  "max_tokens": 50
}

Here, I have asked the text-davinci-003 model a question, with temperature 0.1 and a maximum of 50 tokens.

A temperature of 0.1 makes the output nearly deterministic; note that a low temperature reduces randomness but does not guarantee the absence of hallucination.

This is the response we will receive,

{
  "id": "cmpl-7BVjovuXRZBC7K19VJuYWlr4MebDl",
  "object": "text_completion",
  "created": 1682977964,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": "\n\nThe 2022 FIFA World Cup is scheduled to be the 22nd edition of the FIFA World Cup, the quadrennial international men's football championship contested by the national teams of the member associations of FIFA. It is scheduled to take place in",
      "index": 0,
      "finish_reason": "length",
      "logprobs": null
    }
  ],
  "usage": {
    "completion_tokens": 50,
    "prompt_tokens": 7,
    "total_tokens": 57
  }
}

Azure OpenAI Resource creation and model deployment

The first step is to visit the Azure portal.

Once you log in to the Azure portal, click the Create option.

From there, search for OpenAI.

You will see option for Azure OpenAI.

You can apply for OpenAI access using this link: https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUOFA5Qk1UWDRBMjg0WFhPMkIzTzhKQ1dWNyQlQCN0PWcu&culture=en-us&country=us

You will only be able to create an OpenAI resource if your application is approved.

Then select the pricing tier (S0 Standard), a name, a region, and a resource group.

You can either create a new resource group or use an existing one.

Once you create the resource, you will see the window above with OpenAI access.

Go to Keys and Endpoint, where you will find details of the REST API and the api-key.

Then go to the model deployment section.

You can deploy your model by choosing from the available models, a version, and a proper deployment name.

OpenAI API Test from Power Automate

Method: POST

URI: https://openaiapitest.openai.azure.com/openai/deployments/test_vinci1/completions?api-version=2023-03-15-preview

Header: api-key **************************

Body:

{
  "model": "text-davinci-003",
  "prompt": "Which team was the winner of Indian Premier League 2021?",
  "temperature": 0.9,
  "max_tokens": 100
}

Response:

{
  "id": "cmpl-70XHtL9Ec1iJ97VMqcaTxgcIRs3TR",
  "object": "text_completion",
  "created": 1680362313,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": "\n\nThe Mumbai Indians were the winners of the Indian Premier League 2021.",
      "index": 0,
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "completion_tokens": 15,
    "prompt_tokens": 11,
    "total_tokens": 26
  }
}

Once we test it, we receive the response shown above.