Azure OpenAI LLM Connector

This node establishes a connection with an Azure OpenAI Large Language Model (LLM). After successfully authenticating with the Azure OpenAI Authenticator node, enter the deployment name of the model you want to use. You can find your deployed models in Azure AI Studio under 'Management - Deployments'. Note that only models compatible with Azure OpenAI's Completions API will work with this node.

If you are looking for gpt-3.5-turbo (the model behind ChatGPT) or gpt-4, check out the Azure OpenAI Chat Model Connector node.


Azure Deployment

Deployment name

The name of the deployed model to use. You can find your deployed models in Azure AI Studio.

Model Parameters

Maximum Response Length (token)

The maximum number of tokens to generate.

The token count of your prompt plus max_tokens cannot exceed the model's context length.
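The constraint above is simple arithmetic, and a minimal sketch can make it concrete. The helper name `max_response_tokens` and the 4096-token context window are assumptions chosen for illustration; check the actual context length of your deployed model.

```python
def max_response_tokens(prompt_tokens: int, context_length: int, requested: int) -> int:
    """Clamp the requested response length so prompt + response fits the context window."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt alone fills or exceeds the context window")
    # Whatever the prompt does not use is the room left for the response.
    return min(requested, context_length - prompt_tokens)


# Hypothetical numbers: a 3900-token prompt in a 4096-token context window.
print(max_response_tokens(prompt_tokens=3900, context_length=4096, requested=500))  # 196
```

A long prompt therefore reduces the response length you can actually get, regardless of the value entered for this setting.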


Temperature

Sampling temperature to use, between 0.0 and 2.0. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 for ones with a well-defined answer. It is generally recommended to alter this or top_p, but not both.
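To illustrate why higher temperatures lead to riskier output, here is a rough sketch of temperature scaling applied to a softmax over token scores. This is the textbook formulation, not a description of how Azure OpenAI implements sampling internally. The scores are made up, and the sketch rejects a temperature of 0, which services typically treat as always picking the most likely token.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)   # sharply peaked on the first token
high = softmax_with_temperature(logits, 2.0)  # flatter, so unlikely tokens get sampled more often
print(low)
print(high)
```

With a low temperature almost all probability lands on the top-scoring token; with a high temperature the distribution flattens and less likely tokens are chosen more often.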

Completions generation

How many completion choices to generate for each input prompt. Each additional completion generates its own output tokens, so higher values can quickly consume your token quota.


Seed

Set the seed parameter to any integer of your choice to have (mostly) deterministic outputs. The default value of 0 means that no seed is specified.

If the seed and other model parameters are the same for each request, then responses will be mostly identical. There is a chance that responses will differ, due to the inherent non-determinism of OpenAI models.

Please note that this feature is in beta and is currently supported only for gpt-4-1106-preview and gpt-3.5-turbo-1106 [1].

[1] OpenAI Cookbook
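The settings in this section could plausibly be translated into request parameters as sketched below. The helper `build_completion_params` and its default values are hypothetical; the keys `max_tokens`, `temperature`, `top_p`, `n` and `seed` follow the parameter names of the OpenAI Completions API, but how this node maps its settings internally is an assumption. The sketch shows the "0 means no seed" convention by omitting the key entirely.

```python
def build_completion_params(max_tokens=200, temperature=0.2, top_p=0.15, n=1, seed=0):
    """Collect model parameters into a dict; a seed of 0 is treated as 'not specified'."""
    params = {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "n": n,
    }
    if seed != 0:
        # Only send a seed when the user picked one (assumed behavior).
        params["seed"] = seed
    return params


print(build_completion_params())         # no "seed" key
print(build_completion_params(seed=42))  # includes "seed": 42
```

Keeping the seed and every other parameter identical across requests is what makes repeated responses mostly identical.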

Number of concurrent requests

Maximum number of concurrent requests to LLMs that can be made, whether through API calls or to an inference server. Exceeding this limit may result in temporary restrictions on your access.

It is important to plan your usage according to the model provider's rate limits, and keep in mind that both software and hardware constraints can impact performance.

For OpenAI, please refer to the Limits page for the rate limits available to you.
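A cap on concurrent requests is commonly implemented with a semaphore. The following is a minimal sketch using Python's `asyncio`; `run_with_limit` and `fake_send` are hypothetical names, and `fake_send` only stands in for a real API call. It is not the node's actual implementation.

```python
import asyncio


async def run_with_limit(prompts, max_concurrent, send):
    """Call send(prompt) for every prompt, with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(prompt):
        async with semaphore:  # waits here while max_concurrent calls are running
            return await send(prompt)

    # gather returns results in the same order as the prompts.
    return await asyncio.gather(*(guarded(p) for p in prompts))


async def fake_send(prompt):
    # Stand-in for a real request; it only waits briefly and echoes the prompt.
    await asyncio.sleep(0.01)
    return f"response to {prompt}"


results = asyncio.run(run_with_limit(["a", "b", "c", "d"], max_concurrent=2, send=fake_send))
print(results)
```

A lower limit makes a large batch slower but keeps it further from the provider's rate limits.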

Top-p sampling

An alternative to sampling with temperature, where the model considers only the tokens (words) that make up the top_p probability mass. For example, 0.1 means only the tokens comprising the top 10% probability mass are considered.
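The idea can be sketched as a filter over a token probability table: keep the most likely tokens until their cumulative probability reaches top_p, then renormalize. The function `top_p_filter` and the probabilities are made up for illustration, and the service's actual implementation may differ in detail.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose total probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalize so the surviving probabilities sum to 1.
    return {token: p / total for token, p in kept}


probs = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}  # made-up distribution
print(top_p_filter(probs, 0.1))  # only "the" survives
print(top_p_filter(probs, 0.9))  # "the", "a" and "an" survive
```

The model then samples from the reduced table, so a small top_p restricts the output to the most likely tokens.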

Input Ports


Validated authentication for Azure OpenAI.

Output Ports


Configured Azure OpenAI LLM connection.



This node has no views

