AWS AppSync with Custom Domain GraphQL Subscription issue and workaround

AWS AppSync has supported real-time GraphQL subscriptions over WebSocket since April 2020. Because AppSync does not support Custom Domains out of the box, the usual suggestion is to put CloudFront (with Route 53) in front of it. GraphQL Query and Mutation operations work through this setup without issue, but the Subscription operation is a different story.

TL;DR

AWS AppSync GraphQL Subscription operations do not work through a Custom Domain. The AppSync domain must be supplied in the host attribute of the WebSocket payload for subscriptions to work correctly.

Why do we need Custom Domain?

A URL that looks like https://XXX.appsync-api.XXX-XXXXXX-X.amazonaws.com/graphql is assigned to your shiny new AppSync API, and your applications or clients can use it to communicate with AppSync. However, when you want to switch your GraphQL traffic over to a new AppSync instance, deploy to multiple regions, or just fail over to another AppSync instance, this random URL becomes a real pain. Hiding the AppSync URL behind a Custom Domain helps you achieve this goal, because you can switch or route traffic to a different AppSync endpoint without modifying the application source code.

Configure Custom domain for AWS AppSync

AppSync uses different domain names for Query/Mutation operations and for real-time subscription operations, e.g. appsync-api vs appsync-realtime-api. That alone makes the AWS CloudFront configuration a problem.

AWS AppSync with CloudFront – Multiple Origins

My colleague had an initial discussion with AWS Support. The suggestion was to configure multiple origins in CloudFront. For example, traffic to /realtime can be routed to the AppSync real-time endpoint.

However, it does not work. The WebSocket connection was initiated, but the handshake could not be completed. It returned the following error:

{
  "payload": {
    "errors": [{
      "errorType": "com.amazon.coral.service.http#HttpNotFoundException",
      "errorCode": 400
    }]
  },
  "type": "connection_error"
}

HTTP headers are used for authentication in Query and Mutation operations, but not in Subscription operations, which are based on WebSocket communication. Instead, the authentication details (e.g. X-API-KEY or Authorization) and the hostname are encoded as a Base64 string and sent in the query string of the initial WebSocket handshake [1].
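As a rough sketch of that handshake, following the format described in [1], a client could build the WebSocket URL as shown below. The function name buildRealtimeUrl and its parameters are placeholders for illustration, and API key authentication is assumed; other authorisation modes put different fields into the header object.

```typescript
// Sketch only: buildRealtimeUrl is an illustrative helper, not an SDK function.
function buildRealtimeUrl(
  realtimeHost: string, // e.g. XXX.appsync-realtime-api.<region>.amazonaws.com
  apiHost: string,      // e.g. XXX.appsync-api.<region>.amazonaws.com
  apiKey: string,
): string {
  // The host and the credentials travel inside the query string, not as HTTP headers.
  const header = Buffer.from(
    JSON.stringify({ host: apiHost, 'x-api-key': apiKey }),
  ).toString('base64');
  // The handshake payload is an empty JSON object, also Base64-encoded.
  const payload = Buffer.from('{}').toString('base64');
  return (
    `wss://${realtimeHost}/graphql` +
    `?header=${encodeURIComponent(header)}` +
    `&payload=${encodeURIComponent(payload)}`
  );
}
```

Note that the host value inside the header is the non-real-time (appsync-api) hostname, even though the socket itself connects to the real-time one. A proxy that only rewrites the HTTP Host header never touches this value.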

Without the correct host, the AppSync service does not know how to “map” the request to the correct AWS customer (similar to the Host header in shared web hosting). CloudFront only rewrites the HTTP Host header before forwarding the HTTP request to the corresponding service. It does not modify the query string, so the handshake still carries the Custom Domain as its host and the HttpNotFoundException error occurs.

Lambda@Edge

After I reported this issue to the AWS Support team, one of the support engineers suggested that we pass the AppSync host URL in the payload. However, there is no way to modify the payload, because it is not configurable in the AWS SDK. AWS Support then suggested that I try Lambda@Edge to intercept the query string and modify it (a man-in-the-middle attack?).

After studying Lambda@Edge [2], I was able to use a single origin in CloudFront to route requests to the real-time AppSync endpoint and, when a request carries the WebSocket Upgrade header, rewrite the host parameter in the payload query string. (Note that the hostname is the AppSync non-real-time endpoint hostname, not the real-time one.) The WebSocket handshake then completes and the connection is established, but a new issue appears: the HttpNotFoundException error comes back over the WebSocket during the Subscription registration (start) phase.
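The rewrite described above could look roughly like the following. This is a minimal sketch, not the function I deployed: APPSYNC_API_HOST and the helper names are placeholders, the event type is a hand-written subset of the CloudFront request event, and it assumes the function is attached as a request trigger and that the Upgrade header is forwarded to it.

```typescript
// Placeholder: the non-real-time AppSync hostname that sits behind the Custom Domain.
const APPSYNC_API_HOST = 'example1234567890000.appsync-api.us-east-1.amazonaws.com';

// Replace the "host" field inside the Base64-encoded "header" query parameter.
function rewriteHandshakeHost(querystring: string, apiHost: string): string {
  const params = new URLSearchParams(querystring);
  const encoded = params.get('header');
  if (encoded === null) return querystring; // not an AppSync handshake, leave untouched
  // URLSearchParams decodes an unescaped "+" as a space, so restore it before decoding.
  const base64 = encoded.replace(/ /g, '+');
  const header = JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));
  header.host = apiHost;
  params.set('header', Buffer.from(JSON.stringify(header)).toString('base64'));
  return params.toString();
}

// Hand-written subset of the CloudFront request shape used by this sketch.
interface EdgeRequest {
  headers: Record<string, { key?: string; value: string }[]>;
  querystring: string;
}

// Request trigger: only requests asking for a WebSocket upgrade are modified.
export const handler = async (event: { Records: { cf: { request: EdgeRequest } }[] }) => {
  const request = event.Records[0].cf.request;
  const upgrade = request.headers['upgrade']?.[0]?.value ?? '';
  if (upgrade.toLowerCase() === 'websocket') {
    request.querystring = rewriteHandshakeHost(request.querystring, APPSYNC_API_HOST);
  }
  return request;
};
```

This only covers the HTTP handshake request. Once the connection has been upgraded, the individual WebSocket messages no longer pass through the function.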

Subscription Operation Details

To initiate, establish, register and process a subscription request, the GraphQL client needs to step through the following process, according to the AWS documentation [1]:

  1. Initialize connection (connection_init)
  2. Connection acknowledgement (connection_ack)
  3. Subscription registration (start)
  4. Subscription acknowledgement (start_ack)
  5. Processing subscription (data)
  6. Subscription unregistration (stop)

The HttpNotFoundException error was reported in the start phase. WebSocket messages use the same authorisation mechanism as the handshake. Let’s review what a WebSocket message looks like; the following example is from the AWS documentation [1]:

{
  "id": "ee849ef0-cf23-4cb8-9fcb-152ae4fd1e69",
  "payload": {
    "data": "{\"query\":\"subscription onCreateMessage {\\n onCreateMessage {\\n __typename\\n message\\n }\\n }\",\"variables\":{}}",
      "extensions": {
        "authorization": {
          "Authorization": "eyEXAMPLEiJjbG5xb3A5eW5MK09QYXIrMTJEXAMPLEBieU5WNHhsQjhPVW9YMnM2WldvPSIsImFsZyI6IlEXAMPLEn0.eyJzdWIiOiJhNmNmMjcwNy0xNjgxLTQ1NDItEXAMPLENjY0MTg2NjlkMzYiLCJldmVudF9pZCI6ImU3YWVmMzEyLWUEXAMPLEY0Zi04YjlhLTRjMWY5M2Q5ZTQ2OCIsInRva2VuX3VzZSI6ImFjY2VzcyIsIEXAMPLEIjoiYXdzLmNvZ25pdG8uc2lnbmluLnVzZXIuYWRtaW4iLCJhdXRoX3RpbWUiOjE1Njk2MTgzMzgsImlzcyI6Imh0dEXAMPLEXC9jb2duaXRvLWlkcC5hcC1zb3V0aGVhc3QtMi5hbWF6b25hd3MuY29tXC9hcC1zbEXAMPLEc3QtMl83OHY0SVZibVAiLCJleHAiOjE1NzAyNTQ3NTUsImlhdCI6MTU3MDI1MTE1NSwianRpIjoiMmIEXAMPLEktZTVkMi00ZDhkLWJiYjItNjA0YWI4MDEwOTg3IiwiY2xpZW50X2lkIjoiM3FlajVlMXZmMzd1EXAMPLE0dG91dDJkMWwiLCJ1c2VybmFtZSI6ImVsb3J6YWZlIn0.CT-qTCtrYeboUJ4luRSTPXaNewNeEXAMPLE14C6sfg05tO0fOMpiUwj9k19gtNCCMqoSsjtQoUweFnH4JYa5EXAMPLEVxOyQEQ4G7jQrt5Ks6STn53vuseR3zRW9snWgwz7t3ZmQU-RWvW7yQU3sNQRLEXAMPLEcd0yufBiCYs3dfQxTTdvR1B6Wz6CD78lfNeKqfzzUn2beMoup2h6EXAMPLE4ow8cUPUPvG0DzRtHNMbWskjPanu7OuoZ8iFO_Eot9kTtAlVKYoNbWkZhkD8dxutyoU4RSH5JoLAnrGF5c8iKgv0B2dfEXAMPLEIihxaZVJ9w9w48S4EXAMPLEcA",
          "host": "example1234567890000.appsync-api.us-east-1.amazonaws.com"
         }
      }
  },
  "type": "start"
}
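To make the structure of that message easier to see, here is a small sketch that assembles a start message of the same shape. The helper buildStartMessage and the StartMessage type are illustrative names of my own, not part of the AWS SDK.

```typescript
// Shape of the registration message, mirroring the example from the AWS documentation.
interface StartMessage {
  id: string;
  type: 'start';
  payload: {
    data: string;
    extensions: { authorization: Record<string, string> };
  };
}

// Sketch only: buildStartMessage is an illustrative helper, not an SDK function.
function buildStartMessage(
  id: string,
  query: string,
  variables: Record<string, unknown>,
  apiHost: string,
  credentials: Record<string, string>, // e.g. { Authorization: token } or { 'x-api-key': key }
): StartMessage {
  return {
    id,
    type: 'start',
    payload: {
      // The GraphQL document is sent as a JSON string, not as a nested object.
      data: JSON.stringify({ query, variables }),
      extensions: {
        // The AppSync hostname is repeated in every registration, next to the credentials.
        authorization: { ...credentials, host: apiHost },
      },
    },
  };
}
```

The important detail is the host field: it sits inside the message body, so nothing at the HTTP layer can correct it after the connection has been established.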

The reason for the HttpNotFoundException is the same as discussed before: each WebSocket message needs to contain the authentication and host information. Unless you stop using the AWS SDK, it is impossible to modify the WebSocket messages, because Lambda@Edge only works with HTTP requests and the AWS SDK does not support changing the host.

In addition, even if you had a way to modify the payload, how would the client know which AppSync hostname sits behind the Custom Domain?

Workaround

After considering the issues above, I built a new Query operation in AppSync, backed by the following Lambda function.

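import { Handler } from 'aws-lambda';

// Returns the AppSync domain that actually served the request, so the client
// can use it for subscriptions.
export const appSyncDomainUrlHandler: Handler = async (event: any) => {
  // CloudFront has already replaced the Custom Domain with the AppSync domain here.
  const host = event.request.headers.host as string;
  return {
    apiUrl: host,
    // XXX.appsync-api.<region>.amazonaws.com: the region is the third label.
    region: host.split('.')[2],
  };
};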

For every Query and Mutation, AWS CloudFront replaces the HTTP Host header, changing it from the Custom Domain to the corresponding AppSync domain. This Lambda function returns the Host header of the HTTP request back to the client. Depending on your AWS Route 53 and CloudFront configuration, it lets the client know which AppSync domain it should subscribe to.

The application should first query the AppSync domain and then use this information to configure the SDK for the AppSync Subscription.
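On the client side, the result of that query can be turned into the endpoints needed for the subscription. The sketch below is only an illustration: the DomainInfo fields match what the Lambda function returns, while SubscriptionConfig and toSubscriptionConfig are hypothetical names, and how the values are passed to your SDK depends on the client library you use.

```typescript
// Result of the workaround query (same fields as returned by the Lambda function).
interface DomainInfo {
  apiUrl: string;
  region: string;
}

// Hypothetical container for the values the subscription client needs.
interface SubscriptionConfig {
  graphqlEndpoint: string;
  realtimeEndpoint: string;
  region: string;
}

// Sketch only: derive both endpoints from the AppSync domain reported by the API.
function toSubscriptionConfig(info: DomainInfo): SubscriptionConfig {
  // Subscriptions use the appsync-realtime-api hostname instead of appsync-api.
  const realtimeHost = info.apiUrl.replace('.appsync-api.', '.appsync-realtime-api.');
  return {
    graphqlEndpoint: `https://${info.apiUrl}/graphql`,
    realtimeEndpoint: `wss://${realtimeHost}/graphql`,
    region: info.region,
  };
}
```

Queries and Mutations can keep going through the Custom Domain; only the subscription client needs the real AppSync domain.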

I have discussed my solution with AWS Support, and it seems that this is a limitation of AWS AppSync at the moment.

Reference

  1. Building a Real-time WebSocket Client
  2. Using AWS Lambda with CloudFront Lambda@Edge

A Polyglot Software Engineer and Technical Consultant who is interested in technology, programming, sports and reading. He lives in Melbourne, Australia and is originally from Hong Kong.
