
Azure OpenAI Service API Essential Guide

Aug 7, 2024 · 6 minute read

What type of API does Azure OpenAI Service provide?

Azure OpenAI Service provides a REST API. The key points about the API type and related details are:

API Type

  • Azure OpenAI Service provides a REST API.

Key Points

  • The API is divided into three primary surfaces:

    1. Control plane API - for resource management tasks
    2. Data plane authoring API - for fine-tuning, file uploads, etc.
    3. Data plane inference API - for inference capabilities like completions, embeddings, etc.
  • Each API surface has its own preview and stable/GA releases.

  • The inference REST API endpoints are used for interacting with the Azure OpenAI models.

Authentication

Azure OpenAI provides two authentication methods:

  1. API Key authentication

    • Requires including the API key in the api-key HTTP header
  2. Microsoft Entra ID authentication

    • Uses a bearer token in the Authorization header
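
To make the two methods concrete, here is a minimal sketch using the openai Python package (v1.x) and the azure-identity library; the endpoint and environment variable names are placeholders, not values from the official docs.

```python
import os
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Option 1: API key authentication -- the key is sent in the api-key header.
key_client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # placeholder variable name
    api_key=os.environ["AZURE_OPENAI_API_KEY"],          # placeholder variable name
    api_version="2024-06-01",
)

# Option 2: Microsoft Entra ID authentication -- a bearer token is placed in
# the Authorization header and refreshed automatically by the token provider.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
entra_client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    azure_ad_token_provider=token_provider,
    api_version="2024-06-01",
)
```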

Other Details

  • The API uses URI parameters like endpoint, deployment-id, and api-version.

  • There are OpenAPI (Swagger) specifications available for the Azure OpenAI REST API.

  • The API can be imported into Azure API Management for management and monitoring.

  • Caching policies are available in API Management to optimize performance of Azure OpenAI API calls.
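
To illustrate how the URI parameters mentioned above fit together, here is a hedged sketch of a raw inference call using the requests library; the resource name, deployment name, and prompt are made-up placeholders.

```python
import requests

endpoint = "https://my-resource.openai.azure.com"  # your resource endpoint (placeholder)
deployment_id = "gpt-4o-deployment"                # your deployment name (placeholder)
api_version = "2024-06-01"

# The endpoint, deployment-id, and api-version URI parameters all appear in the URL.
url = (
    f"{endpoint}/openai/deployments/{deployment_id}/chat/completions"
    f"?api-version={api_version}"
)

response = requests.post(
    url,
    headers={"api-key": "<your-api-key>", "Content-Type": "application/json"},
    json={"messages": [{"role": "user", "content": "Hello!"}]},
)
print(response.status_code, response.json())
```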

Does the Azure OpenAI Service API have webhooks?

Based on the available documentation, there is no explicit mention of webhooks or event subscriptions for the official Azure OpenAI Service API. The documentation primarily focuses on general information about Azure OpenAI Service, its pricing, authentication methods, and available models.

Key Points:

  1. Authentication Methods:

    • Azure OpenAI provides two methods for authentication:
      a. API Key authentication
      b. Microsoft Entra ID authentication
  2. API Surfaces:

    • Azure OpenAI has three primary API surfaces:
      a. Control plane
      b. Data plane - authoring
      c. Data plane - inference
  3. Available Models:

    • Azure OpenAI Service provides access to various models, including GPT-3.5-Turbo, GPT-4, DALL-E, Whisper, Babbage, and Davinci
  4. Responsible AI:

    • Microsoft implements several measures to ensure responsible use of the service, including tools for content moderation and limited access to the service
  5. Service Level Agreement (SLA):

    • Microsoft guarantees that Azure OpenAI Service will be available at least 99.9 percent of the time

Conclusion:

While the available documentation does not mention webhooks or event subscriptions for the Azure OpenAI Service API, this doesn't necessarily mean they don't exist. To get a definitive answer, you may need to:

  1. Check the official Azure OpenAI Service documentation for any mentions of webhooks or event subscriptions.
  2. Contact Azure support or consult with an Azure representative for the most up-to-date information on available features.
  3. Look for any recent announcements or updates regarding the Azure OpenAI Service API that might introduce new features like webhooks.

If webhooks or event subscriptions are crucial for your use case, you may need to explore alternative solutions or integration methods with Azure OpenAI Service.

Rate Limits and other limitations

The Azure OpenAI Service API has several rate limits and quotas in place to manage usage and ensure fair access for all users. Here are the key points regarding API rate limits:

Tokens Per Minute (TPM) Limits

  • Azure OpenAI assigns quota on a per-region, per-model basis in units of Tokens-per-Minute (TPM) [1][2].
  • When you onboard a subscription, you receive default quota for most available models [2].
  • You assign TPM to each deployment as it's created, reducing the available quota for that model [2].
  • The TPM limit is based on the maximum number of tokens estimated to be processed by a request at the time it's received [2].

Requests Per Minute (RPM) Limits

  • A Requests-Per-Minute (RPM) rate limit is also enforced [2][3].
  • The RPM limit is set proportionally to the TPM assignment using a ratio of 6 RPM per 1000 TPM [2][3].
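
For example, under that ratio a deployment assigned 120,000 TPM would be limited to roughly 720 requests per minute (120 × 6).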

How Rate Limits Work

  • As requests come in, an estimated max-processed-token count is added to a running token count that resets each minute [2].
  • If the TPM rate limit is reached within a minute, further requests will receive a 429 response code until the counter resets [2].
  • For RPM limits, Azure OpenAI evaluates the rate of incoming requests over short periods (typically 1 or 10 seconds) [2][3].

Other Limits

  • There's a limit of 30 Azure OpenAI resource instances per region [3].
  • Different models have different maximum prompt token limits [1].
  • There are limits on the number of fine-tuned model deployments, training jobs, and file sizes for fine-tuning [1].

Best Practices to Avoid Rate Limiting

  1. Use minimum feasible values for max_tokens and best_of parameters [3].
  2. Manage quota to allocate more TPM to high-traffic deployments [3].
  3. Avoid sharp changes in workload and increase gradually [3].
  4. Implement retrying with exponential backoff for real-time requests [3].
  5. Consider batching requests to increase throughput [3].
  6. For batch processing, add delays between requests to operate near the rate limit without exceeding it [3].
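
As an illustration of practices 4 and 6, the following is a minimal sketch of retrying a raw REST call with exponential backoff when the service returns 429; the function name and defaults are illustrative, not an official pattern.

```python
import time
import requests

def post_with_backoff(url, headers, payload, max_retries=5):
    """POST with exponential backoff on 429 (rate limit) responses."""
    delay = 1.0  # initial backoff in seconds
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code != 429:
            return response
        # Prefer the service-provided Retry-After value when present.
        retry_after = response.headers.get("Retry-After")
        wait = float(retry_after) if retry_after else delay
        time.sleep(wait)
        delay *= 2  # double the wait before the next attempt
    return response
```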

Monitoring and Alerts

  • Azure OpenAI provides metrics and logs through Azure Monitor [3].
  • You can set up alerts based on these metrics to be notified when approaching limits [3].

Provisioned Throughput Units (PTU)

  • For high-throughput workloads, Azure OpenAI offers Provisioned Throughput Units (PTU) [3].
  • PTUs provide more predictable performance and stable max latency compared to the standard quota [3].

Understanding and managing these rate limits is crucial for optimizing Azure OpenAI usage and ensuring reliable performance for your applications.

Latest API Version

The most recent version of the Azure OpenAI Service API is 2024-06-01. Here are the key points to consider:

Latest GA API Version

  • The latest generally available (GA) API version for Azure OpenAI Service is 2024-06-01 [1].

  • This version replaces the previous GA release 2024-02-01 [1].

Features Supported

  • The 2024-06-01 version supports the latest GA features including:
    • Whisper
    • DALL-E 3
    • Fine-tuning
    • On your data feature [1]

Preview API Versions

  • There are also preview API versions available that support additional features still in preview, such as:
    • Assistants API
    • Text to speech
    • Certain "on your data" datasources [1]

Versioning Format

  • The API versions follow a YYYY-MM-DD date structure [2].

Updating to Latest Version

  • It's recommended to test upgrading to new API versions before making changes globally [1].

  • For the REST API or OpenAI Python client, you need to update the API version directly in your code [1].

  • For Azure OpenAI SDKs (C#, Go, Java, JavaScript), you need to update to the latest SDK version, as each release is hardcoded to work with specific API versions [1].
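
As a small sketch of where the version string lives in each case (the endpoint and key below are placeholders):

```python
from openai import AzureOpenAI

# REST: the version is the api-version query parameter, e.g.
#   .../openai/deployments/<deployment>/chat/completions?api-version=2024-06-01
# Python client: the version is the api_version constructor argument.
client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_key="<your-api-key>",
    api_version="2024-06-01",  # bump this single value to upgrade, then retest
)
```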

Best Practices

  • Consult the API version lifecycle documentation to track how long your current API version will be supported [1].

  • Test upgrades in a non-production environment first to check for any impacts [1].

  • Keep your SDK versions up-to-date if using the official Azure OpenAI SDKs [1].

How to get an Azure OpenAI Service developer account and API keys?

To get a developer account for Azure OpenAI Service and create an API integration, you need to follow these steps:

1. Get an Azure Subscription

  • Sign up for an Azure subscription if you don't already have one. You can either choose a paid plan or start with a free account.

2. Request Access to Azure OpenAI Service

  • Access to Azure OpenAI Service is currently granted only by applying for access.
  • Complete the form at aka.ms/OAIapply to request access to Azure OpenAI Service for your subscription.
  • You must use a company email address. Applications submitted with personal email addresses (e.g., gmail.com, outlook.com) will be denied.
  • You must apply on behalf of your own company, not your client's company.
  • Agree to the Azure Data Policy and Code of Conduct for the Azure OpenAI Service.

3. Get Necessary Permissions

  • Ensure you have permissions on your account to create Azure OpenAI resources and deploy models.

4. Create Azure OpenAI Service Resource and Deploy a Model

  • Once approved, go to the Azure portal and create an Azure OpenAI Service resource.
  • Use Azure AI Studio to deploy a model.

5. Retrieve Key and Endpoint

  • After creating the resource and deploying a model, you'll need to retrieve the following information:
    • ENDPOINT: Found in the Keys & Endpoint section of your resource in the Azure portal.
    • API-KEY: Found in the Keys & Endpoint section of your resource in the Azure portal.
    • DEPLOYMENT-NAME: The custom name you chose for your deployment.
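
With those three values in hand, a first request can look like the following sketch using the openai Python package (v1.x); the environment variable names are illustrative.

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["ENDPOINT"],  # from Keys & Endpoint
    api_key=os.environ["API_KEY"],          # from Keys & Endpoint
    api_version="2024-06-01",
)

# The deployment name you chose is passed as the "model" parameter.
completion = client.chat.completions.create(
    model=os.environ["DEPLOYMENT_NAME"],
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(completion.choices[0].message.content)
```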

By following these steps, you should be able to get a developer account for Azure OpenAI Service and create an API integration. Remember to comply with all usage policies and best practices when using the service.

What can you do with the Azure OpenAI Service API?

The Azure OpenAI Service API gives you access to the following model families and capabilities:

GPT-4 and GPT-4 Turbo

  • Latest and most capable Azure OpenAI models
  • Can accept both text and images as input (multimodal versions)
  • Understand and generate natural language and code
  • Support for JSON mode and function calling (except for image input requests in Azure's version)
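
As a hedged sketch of JSON mode with the Python client (the deployment name is a placeholder, and note that the prompt itself should mention JSON):

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_key="<your-api-key>",
    api_version="2024-06-01",
)

# JSON mode constrains the model to emit valid JSON.
response = client.chat.completions.create(
    model="gpt-4o-deployment",  # placeholder deployment name
    response_format={"type": "json_object"},
    messages=[{"role": "user", "content": "List three EU capitals as JSON."}],
)
print(response.choices[0].message.content)
```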

GPT-3.5

  • Improved version of GPT-3
  • Can understand and generate natural language and code
  • Suitable for a wide range of language tasks

Embeddings

  • Convert text into numerical vector form
  • Facilitate text similarity comparisons
  • Useful for search, clustering, and recommendation systems
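
As a small sketch of what that looks like in practice, here is an assumed embeddings deployment being used to compare two texts by cosine similarity:

```python
import numpy as np
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_key="<your-api-key>",
    api_version="2024-06-01",
)

def embed(text: str) -> np.ndarray:
    # "embedding-deployment" is a placeholder for your embeddings deployment name.
    result = client.embeddings.create(model="embedding-deployment", input=text)
    return np.array(result.data[0].embedding)

a = embed("How do I reset my password?")
b = embed("password reset steps")
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {cosine:.3f}")
```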

DALL-E

  • Generate original images from natural language descriptions
  • Create unique visual content based on text prompts

Whisper (Preview)

  • Transcribe speech to text
  • Translate speech to text
  • Useful for audio processing and language translation tasks

Text to Speech (Preview)

  • Synthesize text to speech
  • Generate human-like voice output from written text

Key Points to Consider:

  1. Model availability varies by region and cloud
  2. Different models have different capabilities and price points
  3. Some models are in preview and may have limited availability or features
  4. Azure OpenAI Service provides tools for responsible AI use, including content moderation
  5. The service offers both Pay-As-You-Go and Provisioned Throughput Units (PTUs) pricing models

Best Practices:

  1. Choose the appropriate model based on your specific use case and requirements
  2. Consider using Azure OpenAI On Your Data to run models on your enterprise data for greater accuracy and insights
  3. Implement proper authentication and security measures when using the API
  4. Use the provided data preparation scripts for optimizing document structure and handling long text
  5. Leverage the different search types (keyword, semantic, vector) based on your data and query needs

By understanding these data models and their capabilities, you can effectively utilize the Azure OpenAI Service API for various AI-powered applications and tasks.