Amazon Redshift API Essential Guide

Aug 7, 2024 · 6 minute read

What type of API does Amazon Redshift provide?

The Amazon Redshift Data API is a REST API. It provides a secure HTTP endpoint that can be accessed using AWS SDKs to run SQL statements without managing database connections directly.

Does the Amazon Redshift API have webhooks?

Event Notifications in Amazon Redshift

  1. Amazon Redshift does not have webhooks in its official API. Instead, it uses Amazon Simple Notification Service (Amazon SNS) for event notifications.

  2. You can create event notification subscriptions to be notified when specific events occur for Amazon Redshift resources.

Event Subscription and Types

  1. You can subscribe to events for the following source types:

    • Clusters
    • Snapshots
    • Parameter groups
    • Security groups
  2. Event categories you can subscribe to include:

    • Configuration
    • Management
    • Monitoring
    • Security
    • Pending
  3. Event severities can be specified as INFO or ERROR.

Creating Event Subscriptions

  1. To create an event subscription, you can use the Amazon Redshift console, AWS CLI, or API.

  2. When creating a subscription, you specify:

    • Source type
    • Source ID (optional)
    • Event category
    • Event severity
    • Amazon SNS topic for notifications
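The fields listed above map onto the parameters of the CreateEventSubscription action. The following is a minimal sketch, not a definitive implementation: the subscription name, cluster ID, and topic ARN are hypothetical placeholders, and the helper function is an illustration rather than part of any AWS library. It only assembles the request so it can be inspected before anything is sent.

```python
from typing import Any, Dict, List, Optional


def build_subscription_request(
    name: str,
    sns_topic_arn: str,
    source_type: str = "cluster",
    source_ids: Optional[List[str]] = None,
    categories: Optional[List[str]] = None,
    severity: str = "ERROR",
) -> Dict[str, Any]:
    """Assemble keyword arguments for CreateEventSubscription."""
    if severity not in ("INFO", "ERROR"):
        raise ValueError("severity must be INFO or ERROR")
    request: Dict[str, Any] = {
        "SubscriptionName": name,
        "SnsTopicArn": sns_topic_arn,
        "SourceType": source_type,
        "Severity": severity,
        "Enabled": True,
    }
    # Source IDs and categories act as filters; omitting them matches everything.
    if source_ids:
        request["SourceIds"] = list(source_ids)
    if categories:
        request["EventCategories"] = list(categories)
    return request


request = build_subscription_request(
    name="example-cluster-errors",  # hypothetical subscription name
    sns_topic_arn="arn:aws:sns:us-east-1:123456789012:example-redshift-events",  # hypothetical ARN
    source_ids=["example-cluster"],  # hypothetical cluster ID
    categories=["monitoring", "security"],
)

# With boto3 installed and credentials configured, the request could be sent with:
#   import boto3
#   boto3.client("redshift").create_event_subscription(**request)
```

Keeping the request builder separate from the network call makes it easy to review which filters a subscription will use before it is created.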

Notification Delivery

  1. Amazon Redshift publishes event notifications to an Amazon SNS topic.

  2. Notifications can be delivered in various forms supported by Amazon SNS, such as email, text message, or HTTP endpoint calls.
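When the SNS topic delivers to an HTTP endpoint, the endpoint receives a JSON envelope and has to tell subscription confirmations apart from real notifications. The sketch below shows one way to classify that envelope using only the standard library. The function name and return shape are illustrative assumptions, the Redshift event text inside Message is treated as an opaque string, and SNS signature verification is deliberately left out.

```python
import json
from typing import Any, Dict


def handle_sns_post(body: str) -> Dict[str, Any]:
    """Classify the JSON body that Amazon SNS POSTs to an HTTP(S) subscriber."""
    envelope = json.loads(body)
    message_type = envelope.get("Type")
    if message_type == "SubscriptionConfirmation":
        # The subscriber confirms by requesting SubscribeURL once.
        return {"action": "confirm", "url": envelope["SubscribeURL"]}
    if message_type == "Notification":
        # Message carries the event text; it is passed through unparsed here.
        return {
            "action": "process",
            "topic": envelope.get("TopicArn"),
            "subject": envelope.get("Subject"),
            "message": envelope.get("Message"),
        }
    return {"action": "ignore"}


# Hypothetical notification body, for illustration only.
sample = json.dumps({
    "Type": "Notification",
    "TopicArn": "arn:aws:sns:us-east-1:123456789012:example-redshift-events",
    "Subject": "example subject",
    "Message": "example event text",
})
result = handle_sns_post(sample)
```

A production endpoint would also verify the message signature before acting on either message type.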

Managing Event Subscriptions

  1. You can manage event subscriptions using AWS CLI operations or Amazon Redshift API actions.

  2. Some key API actions for managing event subscriptions include:

    • CreateEventSubscription
    • DeleteEventSubscription
    • DescribeEventSubscriptions
    • ModifyEventSubscription
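As a rough illustration of how these actions combine, the sketch below lists all subscriptions with DescribeEventSubscriptions and disables those that publish to a given topic with ModifyEventSubscription. It assumes `client` is a boto3 Redshift client (`boto3.client("redshift")`); the two helper functions are hypothetical names introduced for this example.

```python
from typing import Any, Dict, Iterator, List


def iter_event_subscriptions(client: Any) -> Iterator[Dict[str, Any]]:
    """Yield every event subscription, following the Marker pagination token."""
    kwargs: Dict[str, Any] = {}
    while True:
        page = client.describe_event_subscriptions(**kwargs)
        yield from page.get("EventSubscriptionsList", [])
        marker = page.get("Marker")
        if not marker:
            break
        kwargs["Marker"] = marker


def disable_subscriptions_for_topic(client: Any, sns_topic_arn: str) -> List[str]:
    """Disable (rather than delete) each enabled subscription on one SNS topic."""
    disabled: List[str] = []
    # Materialize the listing first so modifications do not interleave with paging.
    for sub in list(iter_event_subscriptions(client)):
        if sub.get("SnsTopicArn") == sns_topic_arn and sub.get("Enabled"):
            client.modify_event_subscription(
                SubscriptionName=sub["CustSubscriptionId"], Enabled=False
            )
            disabled.append(sub["CustSubscriptionId"])
    return disabled
```

Disabling keeps the subscription definition around, which can be preferable to DeleteEventSubscription when notifications only need to be paused.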

Best Practices

  • Use event notifications to stay informed about important changes or issues in your Amazon Redshift resources.
  • Carefully choose the event criteria to avoid receiving too many or too few notifications.
  • Ensure that you have the necessary permissions to publish to the specified Amazon SNS topic.

In summary, while Amazon Redshift doesn't offer webhooks directly, it provides a robust event notification system through Amazon SNS, allowing you to subscribe to a wide range of events and receive notifications through various channels.

Rate Limits and other limitations

The key points regarding rate limits for the Amazon Redshift Data API are:

API Rate Limits

  1. Each API in the Redshift Data API has a transactions per second (TPS) quota before throttling requests [1][2].

  2. If the rate of requests exceeds the quota, a ThrottlingException with HTTP Status Code 400 is returned [1].

  3. Specific TPS limits for different Redshift Data API operations [2]:

    • BatchExecuteStatement API: 20 TPS
    • CancelStatement API: 3 TPS
    • DescribeStatement API: 100 TPS
    • DescribeTable API: 3 TPS
    • ExecuteStatement API: 30 TPS
    • GetStatementResult API: 20 TPS
    • ListDatabases API: 3 TPS
    • ListSchemas API: 3 TPS
    • ListStatements API: 3 TPS
    • ListTables API: 3 TPS
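One way to stay under these quotas from a single process is to space calls out on the client side. The sketch below copies the TPS figures from the list above into a table and enforces a minimum interval between calls to one operation. The `Pacer` class is a hypothetical helper, not part of any AWS SDK, and it cannot account for other processes or users sharing the same quota.

```python
import time
from typing import Callable, Dict

# Quotas copied from the list above (transactions per second).
TPS_QUOTAS: Dict[str, int] = {
    "BatchExecuteStatement": 20,
    "CancelStatement": 3,
    "DescribeStatement": 100,
    "DescribeTable": 3,
    "ExecuteStatement": 30,
    "GetStatementResult": 20,
    "ListDatabases": 3,
    "ListSchemas": 3,
    "ListStatements": 3,
    "ListTables": 3,
}


class Pacer:
    """Space out calls to one operation so they stay at or below its TPS quota."""

    def __init__(
        self,
        operation: str,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 1.0 / TPS_QUOTAS[operation]
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Reserve the next slot one interval after this call's slot.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Calling `pacer.wait()` immediately before each API call keeps a tight polling loop from triggering throttling on the low-quota operations such as ListTables.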

Other Relevant Limits

While not strictly API rate limits, there are other important limitations to consider:

  1. Maximum duration of a query: 24 hours [1].
  2. Maximum number of active queries (STARTED and SUBMITTED) per Amazon Redshift cluster: 200 [1].
  3. Maximum query result size: 100 MB (after gzip compression) [1].
  4. Maximum query statement size: 100 KB [1].
  5. Maximum retention time for query results: 24 hours [1].
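Some of these limits can be checked before a statement is submitted. The following sketch rejects oversized statements and refuses to submit when the caller already knows the cluster is at its active-query limit. It assumes "100 KB" means 100 * 1024 bytes of UTF-8 text, which the cited source does not spell out, and `check_statement` is a hypothetical helper.

```python
MAX_STATEMENT_BYTES = 100 * 1024  # assumption: "100 KB" read as 100 * 1024 bytes
MAX_ACTIVE_QUERIES = 200          # STARTED and SUBMITTED queries per cluster


def check_statement(sql: str, active_queries: int = 0) -> None:
    """Raise before submission if a documented Data API limit would be exceeded."""
    size = len(sql.encode("utf-8"))
    if size > MAX_STATEMENT_BYTES:
        raise ValueError(
            f"statement is {size} bytes; the limit is {MAX_STATEMENT_BYTES}"
        )
    if active_queries >= MAX_ACTIVE_QUERIES:
        raise RuntimeError(
            f"{active_queries} queries already active; the limit is {MAX_ACTIVE_QUERIES}"
        )
```

The active-query count would have to come from the caller's own bookkeeping or from ListStatements; the sketch does not fetch it.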

JDBC Driver Limitations

The sources cited here do not describe limitations specific to interactions via the Amazon Redshift JDBC driver. However, it's important to note that:

  1. The JDBC driver typically doesn't have the same kind of API rate limits as the Redshift Data API, as it operates differently [5].
  2. JDBC connections may be subject to other types of limitations, such as connection pool sizes, query timeouts, or resource constraints on the Redshift cluster itself.

Best Practices

  1. To respond to throttling, use a retry strategy as described in the AWS SDKs and Tools Reference Guide [1].
  2. When using AWS Step Functions with Redshift Data API, include the ClientToken idempotency parameter in your API call to handle retries [1].
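The two practices above can be sketched together: retry only on ThrottlingException with exponential backoff and jitter, and attach one ClientToken that is reused across retries of the same request. This is an illustrative sketch rather than the SDK's own retry logic (the AWS SDKs ship built-in retry behaviour that may already be sufficient). The helper names are hypothetical, and the error check assumes the exception exposes a botocore-style `response` dictionary.

```python
import random
import time
import uuid
from typing import Any, Callable, Dict


def is_throttled(exc: Exception) -> bool:
    """True when exc carries a botocore-style error code of ThrottlingException."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(
    call: Callable[[], Any],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run call(), retrying throttled attempts with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            if not is_throttled(exc) or attempt == max_attempts - 1:
                raise
            # Full jitter: sleep a random time up to base_delay * 2**attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))


def idempotent_request(sql: str, **target: Any) -> Dict[str, Any]:
    """Build ExecuteStatement arguments with a ClientToken to reuse on every retry."""
    return {"Sql": sql, "ClientToken": str(uuid.uuid4()), **target}
```

A caller would build the request once and then retry the same dictionary, for example `call_with_backoff(lambda: client.execute_statement(**request))`, so that every attempt carries the same ClientToken.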

Summary

While the Amazon Redshift Data API has well-documented rate limits, the cited sources do not give the same level of detail for the JDBC driver. The JDBC driver likely operates under different constraints, tied more closely to the Redshift cluster's capabilities and configuration than to API-specific limits. For the most up-to-date and specific information about JDBC driver limitations, consult the official Amazon Redshift JDBC driver documentation or contact AWS support directly.

Latest API Version

The cited sources do not state a version number for the Amazon Redshift API itself. The most recent version they list is a cluster patch version:

1.0.72031 - Current track version - Released on August 1, 2024

This figure comes from the list of Amazon Redshift patch versions, where 1.0.72031 appears as the "Current track version" with a release date of August 1, 2024 [2]. It identifies the software running on a cluster rather than a version of the API.

Key points to consider:

  1. Amazon Redshift releases new versions periodically to update clusters with new features and improvements.

  2. There are separate versions for Amazon Redshift provisioned clusters and Amazon Redshift Serverless.

  3. The API version is not the same thing as the cluster version, and the sources cited here do not give a specific API version number.

  4. The Amazon Redshift API documentation (Source 3) does not specify an exact API version number.

Best practice:

When working with the Amazon Redshift API, it's recommended to use the latest available version to ensure access to the most recent features and improvements. Always refer to the official AWS documentation for the most up-to-date information on API versions and compatibility.

How to get an Amazon Redshift developer account and API Keys?

To get a developer account for Amazon Redshift and create an API integration, you need to follow these steps:

Create an AWS Account

  1. If you don't already have an AWS account, you'll need to create one. Go to the AWS homepage and click "Create an AWS Account".

  2. Follow the prompts to set up your account with your email, password, and payment information.

Set Up Amazon Redshift Access

  1. Once you have an AWS account, you'll need to set up access to Amazon Redshift:

    • Sign in to the AWS Management Console
    • Navigate to the Amazon Redshift service
    • Create a cluster if you haven't already
  2. To use the Amazon Redshift Data API, you need to authorize access. This is typically done by adding an IAM policy to your user or role.

Set Up IAM Permissions

  1. The easiest way to get started is to use the AWS-managed policy AmazonRedshiftDataFullAccess. This provides full access to the Data API operations.

  2. Alternatively, you can create a custom IAM policy based on your specific needs. Consider the following requirements:

    • If using AWS Secrets Manager for authentication, allow the secretsmanager:GetSecretValue action
    • If using temporary credentials for cluster authentication, allow the redshift:GetClusterCredentials action
    • If using temporary credentials for serverless workgroup authentication, allow the redshift-serverless:GetCredentials action
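A custom policy along these lines can be generated programmatically. The sketch below builds a policy document that always allows three core Data API actions and adds the authentication-related actions from the list above only when requested. The choice of core actions and the `"Resource": "*"` value are assumptions made for illustration; a real policy would normally be scoped to specific clusters, workgroups, and secrets.

```python
import json
from typing import Any, Dict, List


def build_data_api_policy(
    use_secrets_manager: bool = False,
    use_cluster_temp_credentials: bool = False,
    use_serverless_temp_credentials: bool = False,
) -> Dict[str, Any]:
    """Return an IAM policy document for a chosen Data API authentication method."""
    actions: List[str] = [
        "redshift-data:ExecuteStatement",
        "redshift-data:DescribeStatement",
        "redshift-data:GetStatementResult",
    ]
    if use_secrets_manager:
        actions.append("secretsmanager:GetSecretValue")
    if use_cluster_temp_credentials:
        actions.append("redshift:GetClusterCredentials")
    if use_serverless_temp_credentials:
        actions.append("redshift-serverless:GetCredentials")
    return {
        "Version": "2012-10-17",
        "Statement": [
            # "*" is a placeholder; narrow it to specific ARNs in practice.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }


policy_json = json.dumps(build_data_api_policy(use_secrets_manager=True), indent=2)
```

The resulting JSON string can be pasted into the IAM console's policy editor or attached with the AWS CLI.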

Use the Data API

  1. Once you have the necessary permissions, you can start using the Amazon Redshift Data API. You can access it through:

    • AWS SDK in various programming languages (e.g., Python, Java, JavaScript)
    • AWS CLI
    • REST API calls
  2. Here's a simple example using Python and boto3:

import boto3

# Create a client for the Redshift Data API
client = boto3.client('redshift-data')

# Submit the statement; the call returns a statement Id rather than rows
response = client.execute_statement(
    ClusterIdentifier='your-cluster-identifier',
    Database='your-database-name',
    DbUser='your-db-user',
    Sql='SELECT * FROM your_table LIMIT 10'
)

# Check the response and handle accordingly
print(response)
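Because the Data API runs statements asynchronously, the response to execute_statement contains a statement Id rather than rows, and the caller has to poll DescribeStatement until the statement finishes. A minimal polling sketch follows; it assumes `client` is a boto3 `redshift-data` client, and the function name, poll interval, and timeout are illustrative choices.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATES = {"FINISHED", "FAILED", "ABORTED"}


def wait_for_statement(
    client: Any,
    statement_id: str,
    poll_seconds: float = 1.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll DescribeStatement until the statement finishes, fails, or times out."""
    waited = 0.0
    while True:
        description = client.describe_statement(Id=statement_id)
        status = description["Status"]
        if status == "FINISHED":
            return description
        if status in TERMINAL_STATES:
            detail = description.get("Error", "no error message")
            raise RuntimeError(f"statement {statement_id} ended as {status}: {detail}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"statement {statement_id} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

With the example above, this would be called as `wait_for_statement(client, response['Id'])` before fetching results. DescribeStatement has a comparatively high TPS quota, but a poll interval of a second or more is still a reasonable default.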

What can you do with the Amazon Redshift API?

Here's a list of the data models you can interact with using the Amazon Redshift API, along with what is possible for each:

Databases

  • List databases in a workgroup [1][3]
  • Run SQL statements (SELECT, DML, DDL, COPY, or UNLOAD) on databases [3]
  • Run multiple SQL statements in a batch as part of a single transaction [3]

Schemas

  • List schemas in a database [3]
  • Filter schemas by a matching schema pattern [3]

Tables

  • List tables in a database [3]
  • Filter tables by schema name pattern, table name pattern, or both [3]
  • Describe detailed information about a table, including column metadata [3]
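Listing tables with pattern filters might look like the sketch below. It assumes `client` is a boto3 `redshift-data` client and that the cluster or workgroup identifier and credentials are passed through as extra keyword arguments; `list_table_names` is a hypothetical helper introduced for this example.

```python
from typing import Any, Dict, List


def list_table_names(
    client: Any,
    database: str,
    schema_pattern: str = "public",
    table_pattern: str = "%",
    **target: Any,
) -> List[str]:
    """Return schema-qualified table names matching the given patterns."""
    kwargs: Dict[str, Any] = {
        "Database": database,
        "SchemaPattern": schema_pattern,
        "TablePattern": table_pattern,
        **target,  # e.g. ClusterIdentifier and DbUser, or WorkgroupName
    }
    names: List[str] = []
    while True:
        page = client.list_tables(**kwargs)
        names.extend(f"{t['schema']}.{t['name']}" for t in page.get("Tables", []))
        token = page.get("NextToken")
        if not token:
            return names
        kwargs["NextToken"] = token
```

A call such as `list_table_names(client, "dev", table_pattern="sales%", ClusterIdentifier="your-cluster-identifier", DbUser="your-db-user")` would return only tables whose names start with "sales". Since ListTables is limited to 3 TPS, large catalogs with many pages may need pacing between requests.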

SQL Statements

  • Execute individual SQL statements [1][3]
  • Batch execute multiple SQL statements [3]
  • Cancel running queries (if not in FINISHED or FAILED state) [3]
  • Describe details of specific SQL statements run [3]
  • List SQL statements executed in the last 24 hours [3]
  • Retrieve temporarily cached query results [3]

Query Results

  • Fetch and format query results [3]
  • Retrieve query results asynchronously [1][2]
  • Paginate through result sets to retrieve entire results as needed [3]
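Fetching a full result set means following NextToken across pages and unwrapping the typed field objects that GetStatementResult returns. The sketch below shows one way to do both. It assumes `client` is a boto3 `redshift-data` client and that the statement has already finished; the helper names are hypothetical.

```python
from typing import Any, Dict, List


def field_value(field: Dict[str, Any]) -> Any:
    """Unwrap one result field such as {"stringValue": "a"} or {"isNull": True}."""
    if field.get("isNull"):
        return None
    # A non-null field holds one typed key (stringValue, longValue, and so on).
    for key, value in field.items():
        if key != "isNull":
            return value
    return None


def fetch_rows(client: Any, statement_id: str) -> List[Dict[str, Any]]:
    """Collect every page of a finished statement's result as a list of dicts."""
    rows: List[Dict[str, Any]] = []
    columns: List[str] = []
    kwargs: Dict[str, Any] = {"Id": statement_id}
    while True:
        page = client.get_statement_result(**kwargs)
        if page.get("ColumnMetadata"):
            columns = [column["name"] for column in page["ColumnMetadata"]]
        for record in page.get("Records", []):
            rows.append(dict(zip(columns, (field_value(f) for f in record))))
        token = page.get("NextToken")
        if not token:
            return rows
        kwargs["NextToken"] = token
```

Collecting everything into one list is convenient for small results; for results approaching the 100 MB limit, yielding rows page by page would use less memory.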

Machine Learning Models

  • Create, train, and deploy Amazon SageMaker models using SQL statements [2]
  • Use models for predictions such as churn detection, financial forecasting, personalization, and risk scoring directly in queries and reports [2]

Data Ingestion and Processing

  • Simplify data access, ingest, and egress from various programming languages and platforms [3]
  • Build serverless data processing workflows [3]
  • Design asynchronous web dashboards [3]
  • Build ETL pipelines with AWS Step Functions, Lambda, and stored procedures [3]
  • Schedule SQL scripts for data load, unload, and refresh of materialized views [3]

Integration with Other AWS Services

  • Interact with Amazon Redshift from AWS Lambda, AWS Cloud9, AWS AppSync, and Amazon EventBridge [4]
  • Access data from Amazon SageMaker and Jupyter notebooks [3][5]
  • Build event-driven applications with Amazon EventBridge and Lambda [3]

This list covers the main data models and interactions possible with the Amazon Redshift API. The API offers a wide range of capabilities for managing and querying data in Amazon Redshift, as well as integrating with other AWS services and building machine learning models.