AWS HTTP API and Python based Lambda Integration¶
I have been using AWS API Gateway to AWS Lambda integration for many years, but only recently tried out the deployment with SAM, using the newer HTTP API.
It is surprisingly a lot easier than what it used to be, and in this blog post I hope to give a quick intro into the SAM method of defining and deploying your API. I will also go into a little more detail in the Lambda side as to how to extract POST data as there are several ways in how your Lambda function may receive data.
Starting with a Simple Example¶
API Gateways can quickly become rather complex, so to keep things simple I will explain an example at the hand of handling a HTTP POST request, coming in via the API Gateway and with that request then being routed by means of proxy integration to AWS Lambda.
There are a couple of very important points to note with this example:
- The exposed end-point is HTTPS, but it is an AWS domain. It will work perfectly for demonstration purposes, but in most cases I imagine someone would rather use a custom domain.
- In this example we explicitly trust all incoming requests, so there are no security features other than HTTPS and rate limiting. There is no authorization, no CORS and no WAF.
- Keeping things simple means that there are also no path variables or query strings in this example. You will see an example of a stage variable, but it is not really used at all.
So, at this point you may wonder - why even bother? Well, the problem is there are so many different use cases that it makes it very hard to all these concepts in an example and still call it "a simple example". However, I believe the template and example code does provide at least a good starting point for anyone that wants to get their feet wet and have a really quick example to deploy and play around with. In this context, the example is perfect. It can be deployed in less than 2 minutes and within 5 minutes you should be able to have tested the example and viewed the generated logs.
I have enabled a fair amount of logging so that you would be able to see how a typical request looks like.
Keep in mind that this is an HTTP API that will make a proxy
request. There are various ways in which requests and data can be routed, but I believe this type of integration exposes the lambda function to all the available information that is available at the time of making the request. The function has access to the request headers, information about the client, the query string (if present), the HTTP method as well as any data submitted with the request. As mentioned, stage variables are also available.
SAM Template¶
I used SAM in a previous blog post, so that should provide some additional background if required. Also consult the AWS Documentation if needed.
For this example, you can use the following SAM template:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Example Template for AWS HTTP API to Python Lambda Function
Parameters:
StageNameParameter:
Type: String
Description: The API Gateway Stage Name
Default: sandbox
Resources:
HttpApi:
Type: AWS::Serverless::HttpApi
Properties:
StageName: !Ref StageNameParameter
Tags:
Tag: Value
AccessLogSettings:
DestinationArn: !GetAtt HttpApiAccessLogs.Arn
Format: $context.stage $context.integrationErrorMessage $context.identity.sourceIp $context.identity.caller $context.identity.user [$context.requestTime] "$context.httpMethod $context.resourcePath $context.protocol" $context.status $context.responseLength $context.requestId $context.extendedRequestId
DefaultRouteSettings:
ThrottlingBurstLimit: 200
RouteSettings:
"POST /example":
ThrottlingBurstLimit: 500 # overridden in HttpApi Event
StageVariables:
StageVar: Value
FailOnWarnings: true
HttpApiAccessLogs:
Type: AWS::Logs::LogGroup
Properties:
RetentionInDays: 90
ApiFunctionLogs:
Type: AWS::Logs::LogGroup
Properties:
LogGroupName: !Sub /aws/lambda/${ApiFunction}
RetentionInDays: 7
ApiFunction: # Adds a GET api endpoint at "/" to the ApiGatewayApi via an Api event
Type: AWS::Serverless::Function
Properties:
Events:
ExplicitApi: # warning: creates a public endpoint
Type: HttpApi
Properties:
ApiId: !Ref HttpApi
Method: POST
Path: /example
TimeoutInMillis: 30000
PayloadFormatVersion: "2.0"
RouteSettings:
ThrottlingBurstLimit: 600
Runtime: python3.8
Handler: function.handler
CodeUri: path/to/src
A note about the Access Log Format: you can find all the fields available for use in the AWS documentation. I have loosely based the configuration on the Apache Common Log Format, but with some additional fields included.
You will have to adapt the CodeUri
to the path of your Lambda function.
Lambda Template (Python)¶
The following Python code is rather long, but provides a good baseline for a template you can adapt to suite your needs:
import json
import logging
from datetime import datetime
import sys
import base64
from urllib.parse import parse_qs
def extract_post_data(event)->str:
if 'requestContext' in event:
if 'http' in event['requestContext']:
if 'method' in event['requestContext']['http']:
if event['requestContext']['http']['method'].upper() in ('POST', 'PUT', 'DELETE'): # see https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods
if 'isBase64Encoded' in event and 'body' in event:
if event['isBase64Encoded'] is True:
body = base64.b64decode(event['body'])
if isinstance(body, bytes):
body = body.decode('utf-8')
return body
if 'body' in event:
body = event['body']
if isinstance(body, bytes):
body = body.decode('utf-8')
else:
body = '{}'.format(body)
return body
return ""
def decode_data(event, body: str):
if 'headers' in event:
if 'content-type' in event['headers']:
if 'json' in event['headers']['content-type'].lower():
return json.loads(body)
if 'x-www-form-urlencoded' in event['headers']['content-type'].lower():
return parse_qs(body)
return body
def get_logger(level=logging.INFO):
logger = logging.getLogger()
for h in logger.handlers:
logger.removeHandler(h)
formatter = logging.Formatter('%(funcName)s:%(lineno)d - %(levelname)s - %(message)s')
ch = logging.StreamHandler(sys.stdout)
ch.setLevel(level)
ch.setFormatter(formatter)
logger.addHandler(ch)
logger.setLevel(level)
return logger
def handler(
event,
context
):
logger = get_logger(level=logging.DEBUG)
result = dict()
return_object = {
'statusCode': 200,
'headers': {
'x-custom-header' : 'my custom header value',
'content-type': 'application/json',
},
'body': result,
'isBase64Encoded': False,
}
logger.info('HANDLER CALLED')
logger.debug('DEBUG ENABLED')
logger.info('event={}'.format(event))
body = extract_post_data(event=event)
logger.info('body={}'.format(body))
data = decode_data(event=event, body=body)
logger.info('data={}'.format(data))
result['message'] = 'ok'
return_object['body'] = json.dumps(result)
logger.info('HANDLER DONE')
logger.info('result={}'.format(result))
logger.info('return_object={}'.format(return_object))
return return_object
There is a couple of steps to walk through...
When the API Gateway send the proxy request to the Lambda function, the data will all be included in the event
dictionary.
A basic return object is then set-up and the structure is what the API Gateway expects. Since this is a JSON API example, the function will also return JSON back.
Next, the initial body data is extracted. It is a little tricky at times, and the extract_post_data()
functions checks a number of things to try it's best to extract any body data that may be available.
Once the body data is available, the decode_data()
function will convert it into a dict. The example shows how to support both JSON and web forms submitted data.
No further processing is done. The data objects are logged so that you can see what the results were. I am pretty sure it is possible to break the function as there is very little in terms of error checking. However, if you do get an error when testing, the API Gateway Access logs should show the error message.
Deployment and Testing¶
Deployment is done with the following commands, but you can adjust to suite your needs:
# You need to set this
export AWS_PROFILE="..."
# You need to set this, for example eu-central-1
export AWS_REGION="..."
# You need to set this - example: sandbox
export DEPLOYMENT_ENV="..."
sam build
sam deploy --profile $AWS_PROFILE \
--region $AWS_REGION \
--stack-name myApiTest \
--capabilities CAPABILITY_IAM \
--parameter-overrides "ParameterKey=StageNameParameter,ParameterValue=$DEPLOYMENT_ENV" \
--disable-rollback \
--no-confirm-changeset \
--config-env $DEPLOYMENT_ENV
Since the Lambda function can parse both JSON and form data, you can call your endpoint using the following example - just use your own API Gateway ID in place of the nnnnnnnn
:
curl -d "param1=value1¶m2=value2" -X POST https://nnnnnnnn.execute-api.eu-central-1.amazonaws.com/sandbox/example
{"message": "ok"}
curl -d '{"Message": "Test123"}' -H "Content-Type: application/json" -X POST https://nnnnnnnn.execute-api.eu-central-1.amazonaws.com/sandbox/example
{"message": "ok"}%
The access log for the requests should look something like this:
sandbox - NNN.NNN.NNN.NNN - - [11/Aug/2022:04:34:24 +0000] "POST - HTTP/1.1" 200 17 AAAAAAAAAAAAAAA= AAAAAAAAAAAAAAA=
The lambda function for the first requests can look something like this:
START RequestId: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa Version: $LATEST
handler:133 - INFO - HANDLER CALLED
handler:135 - INFO - event={'version': '2.0', 'routeKey': 'POST /example', 'rawPath': '/sandbox/example', 'rawQueryString': '', 'headers': {'accept': '*/*', 'content-length': '27', 'content-type': 'application/x-www-form-urlencoded', 'host': 'nnnnnnnn.execute-api.eu-central-1.amazonaws.com', 'user-agent': 'curl/7.68.0', 'x-amzn-trace-id': 'Root=xxx', 'x-forwarded-for': 'NNN.NNN.NNN.NNN', 'x-forwarded-port': '443', 'x-forwarded-proto': 'https'}, 'requestContext': {'accountId': '000000000000', 'apiId': 'nnnnnnnn', 'domainName': 'nnnnnnnn.execute-api.eu-central-1.amazonaws.com', 'domainPrefix': 'nnnnnnnn', 'http': {'method': 'POST', 'path': '/sandbox/example', 'protocol': 'HTTP/1.1', 'sourceIp': 'NNN.NNN.NNN.NNN', 'userAgent': 'curl/7.68.0'}, 'requestId': 'AAAAAAAAAAAAAAA=', 'routeKey': 'POST /example', 'stage': 'sandbox', 'time': '11/Aug/2022:04:33:57 +0000', 'timeEpoch': 1660192437357}, 'stageVariables': {'StageVar': 'Value'}, 'body': 'cGFyYW0xPXZhbHVlMSZwYXJhbTI9dmFsdWUy', 'isBase64Encoded': True}
handler:138 - INFO - body=param1=value1¶m2=value2
handler:140 - INFO - data={'param1': ['value1'], 'param2': ['value2']}
handler:144 - INFO - HANDLER DONE
handler:145 - INFO - result={'message': 'ok'}
handler:146 - INFO - return_object={'statusCode': 200, 'headers': {'x-custom-header': 'my custom header value', 'content-type': 'application/json'}, 'body': '{"message": "ok"}', 'isBase64Encoded': False}
END RequestId: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa
REPORT RequestId: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa Duration: 19.34 ms Billed Duration: 20 ms Memory Size: 128 MB Max Memory Used: 52 MB Init Duration: 254.68 ms
The second request log entry looks like this:
START RequestId: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa Version: $LATEST
handler:133 - INFO - HANDLER CALLED
handler:135 - INFO - event={'version': '2.0', 'routeKey': 'POST /example', 'rawPath': '/sandbox/example', 'rawQueryString': '', 'headers': {'accept': '*/*', 'content-length': '22', 'content-type': 'application/json', 'host': 'nnnnnnnn.execute-api.eu-central-1.amazonaws.com', 'user-agent': 'curl/7.68.0', 'x-amzn-trace-id': 'Root=xxx', 'x-forwarded-for': 'NNN.NNN.NNN.NNN', 'x-forwarded-port': '443', 'x-forwarded-proto': 'https'}, 'requestContext': {'accountId': '000000000000', 'apiId': 'nnnnnnnn', 'domainName': 'nnnnnnnn.execute-api.eu-central-1.amazonaws.com', 'domainPrefix': 'nnnnnnnn', 'http': {'method': 'POST', 'path': '/sandbox/example', 'protocol': 'HTTP/1.1', 'sourceIp': 'NNN.NNN.NNN.NNN', 'userAgent': 'curl/7.68.0'}, 'requestId': 'AAAAAAAAAAAAAAA=', 'routeKey': 'POST /example', 'stage': 'sandbox', 'time': '11/Aug/2022:04:34:24 +0000', 'timeEpoch': 1660192464915}, 'stageVariables': {'StageVar': 'Value'}, 'body': '{"Message": "Test123"}', 'isBase64Encoded': False}
handler:138 - INFO - body=
{
"Message": "Test123"
}
handler:140 - INFO - data={'Message': 'Test123'}
handler:144 - INFO - HANDLER DONE
handler:145 - INFO - result={'message': 'ok'}
handler:146 - INFO - return_object={'statusCode': 200, 'headers': {'x-custom-header': 'my custom header value', 'content-type': 'application/json'}, 'body': '{"message": "ok"}', 'isBase64Encoded': False}
END RequestId: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa
REPORT RequestId: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa Duration: 10.44 ms Billed Duration: 11 ms Memory Size: 128 MB Max Memory Used: 52 MB
Where to from here...¶
This was literally just scratching the surface. API's have a lot of options and some of the other topics you may want (or need to) consider include:
- Authorization
- Using a custom domain
- Stage variables
- Path based variables (and greedy path variables)
- Handling default routes
- Working with OpenAPI 3 definitions (import and export)
- Versioning of your API's
- Managing new deployments, canary deployments and other strategies
I hope this quick introduction gave you a point of reference to start exploring on your own!
Tags¶
aws, lambda, python, api, proxy, integration