Using the Sync API in Kontent.ai

Jan 9, 2024
Kontent.ai, Integration

The Sync API from Kontent.ai optimises content synchronisation by providing content deltas from a specific point in time, alleviating API workload. It complements webhooks, enabling multiple applications to monitor content changes at flexible intervals. It allows us to track changes such as new language versions and content items, content model adjustments, workflow updates and basic content updates.

What is the Sync API?

The Sync API enables content synchronisation between our applications and Kontent.ai, optimising data retrieval and minimising the strain on API resources.

Before the Sync API came along, we relied on webhooks to notify our application that a change had occurred; we would then pull the changed items and react accordingly on demand. Depending on how often content items were updated, this could be expensive in terms of API use.

The Sync API takes a different view and will create deltas, from a specific point in time, of items that have changed since we last checked. This puts all of the control with the calling application. It also means that we can have multiple applications checking for changes at different times.

What events can cause a change to be logged? In the production environment, this is an easy question to answer as only three things can impact your content:

  1. A new language version has been published
  2. The content model changes
  3. The content item (not the variant; that is locked) is changed

For the preview environment, we can add that any change that causes a language variant to be updated will log a change. This includes changing the workflow step of the language variant.

There are two main parts to the Sync API: initialisation and synchronisation.

How to Initialise the Sync API

Initialisation is how you tell the Sync API what content items you're interested in. We do this by providing a set of filters to the Sync API. This set of filters is not as wide-ranging as the Delivery API filtering, but in my experience it typically does enough.

The available filters are system.type, system.collection, system.language, and language. As with the Delivery API, to ignore language fallbacks, we need to specify both system.language and language to be the language that we're looking for.
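The filters above are just query parameters on the init request. As a minimal sketch, they could be assembled like this; the filter names come from the Sync API documentation, while the type and language values are illustrative placeholders:

```typescript
// Sketch: assembling Sync API init filters as query parameters.
// Filter names (system.type, system.language, language) are from the
// Sync API docs; the values and environment ID are placeholders.
const params = new URLSearchParams({
  "system.type": "article",
  "system.language": "en-GB",
  // Specifying both language filters ignores language fallbacks,
  // as with the Delivery API.
  "language": "en-GB",
});

const environmentId = "<YOUR_ENVIRONMENT_ID>";
const initUrl = `https://deliver.kontent.ai/${environmentId}/sync/init?${params}`;

console.log(initUrl);
```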

We pass the filter params and authentication tokens to the Sync API at the URL https://deliver.kontent.ai/<YOUR_ENVIRONMENT_ID>/sync/init, as described in the Sync API documentation. The only thing you're looking for after the initialisation call is the continuation token; you don't get any items back from this call. What we're doing, in effect, is drawing a line in the sand and telling the Sync API to notify us of changes when items that match our filter change.

The token that you receive from initialisation is important, and you need to persist it somewhere. You'll be using this token going forward to check if anything has changed.
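The init-and-persist step could be sketched as below. The fetch function is injected so the sketch runs without network access; in a real application you would pass the global fetch and write the token to durable storage. The endpoint and X-Continuation header come from the Sync API docs; the API-key handling is an assumption:

```typescript
// Sketch: calling the Sync API init endpoint and keeping the continuation
// token. `fetchFn` is injected so this runs offline; pass real fetch in an app.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<{ headers: { get(name: string): string | null } }>;

async function initialiseSync(
  environmentId: string,
  apiKey: string,
  fetchFn: FetchLike,
): Promise<string> {
  const res = await fetchFn(
    `https://deliver.kontent.ai/${environmentId}/sync/init`,
    { method: "POST", headers: { Authorization: `Bearer ${apiKey}` } },
  );
  // The init call returns no items; the only thing we need is the
  // X-Continuation token, which must be persisted for later sync calls.
  const token = res.headers.get("X-Continuation");
  if (token === null) throw new Error("Init response had no X-Continuation header");
  return token;
}

// Stand-in fetch returning a canned token, so the sketch is runnable offline.
const fakeFetch: FetchLike = async () => ({
  headers: { get: (name: string) => (name === "X-Continuation" ? "token-0" : null) },
});

initialiseSync("<ENVIRONMENT_ID>", "<API_KEY>", fakeFetch).then((token) => {
  console.log("persist this token:", token); // e.g. write to a database or file
});
```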

Retrieving Sync API Updates

Retrieving updates is achieved by calling the sync endpoint and providing not only your bearer auth token but also a new HTTP header on the request. This header contains your current continuation token.

Your request in cURL will look something like this:

curl --request GET \
  --url https://deliver.kontent.ai/<ENVIRONMENT_ID>/sync \
  --header 'Authorization: Bearer <API_KEY>' \
  --header 'X-Continuation: <CONTINUATION_TOKEN>'

If successful, the response to this call will be one of two things:

  • An empty items collection and the same continuation token in the X-Continuation header, or
  • A populated items collection and a new continuation token in the X-Continuation header.

If the token is the same, you can most likely ignore the response. When it is different, however, we can act upon the changed items and store the new token for our next call.

There can be up to 500 items in a single response. This may seem like a lot, but keep in mind that this is not a full object graph.

Take note that there is no pagination information other than that the continuation token returned in the header will be different when items are returned. You can do two things with this. Either you store this and use the updated value next time you call the Sync API, or you call the Sync API immediately and continue to do so until there are no items in the response and the continuation token no longer changes. Typically, I use the second option to ensure that I don't end up with an ever-increasing number of items to process.
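That second option, draining the changes until nothing new comes back, could be sketched like this. The HTTP call is replaced by an injected stand-in so the loop is runnable offline, and the type and function names here are illustrative, not from the API:

```typescript
// Sketch: keep calling the sync endpoint until a response has no items
// and an unchanged continuation token, meaning we are up to date.
type SyncItem = { codename: string };
type SyncPage = { items: SyncItem[]; continuationToken: string };
type SyncOnce = (token: string) => Promise<SyncPage>;

async function drainChanges(
  startToken: string,
  syncOnce: SyncOnce,
): Promise<{ items: SyncItem[]; token: string }> {
  const collected: SyncItem[] = [];
  let token = startToken;
  for (;;) {
    const page = await syncOnce(token);
    // An empty page with the same token means nothing has changed.
    if (page.items.length === 0 && page.continuationToken === token) break;
    collected.push(...page.items);
    token = page.continuationToken; // persist this for the next call
  }
  return { items: collected, token };
}

// Fake endpoint: two pages of changes, then "nothing new".
const pages: SyncPage[] = [
  { items: [{ codename: "a" }, { codename: "b" }], continuationToken: "t1" },
  { items: [{ codename: "c" }], continuationToken: "t2" },
];
const fakeSync: SyncOnce = async (token) =>
  pages.shift() ?? { items: [], continuationToken: token };

drainChanges("t0", fakeSync).then(({ items, token }) => {
  console.log(items.length, token); // 3 items drained, final token "t2"
});
```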

Another interesting point is that you can replay continuation tokens to a certain degree. Even if you have already used the next continuation token, resending an earlier token will take you back to that point in time and retrieve all changes from then up to the current moment.

What is an item?

According to the documentation, an item returned by the Sync API can be described as follows:

Metadata specifying a content item in a specific language and the time of the item's last change.

There's actually quite a lot packed into this metadata. You'll get a flat view of the content item, including elements and system data, for each item. With the elements, some of the returned data is restricted, and you need to be aware of that. Because this is only a single layer of data (i.e. depth = 0), things like modular content and linked items are not retrieved. You'll know their IDs, but not the actual content itself.

In some cases, this is enough to work with. You may for example just be flushing cache for the changed content item. The item is one level only though, so if you need to work with linked items or perhaps modular content, then you will need to make another request to the delivery API to get the data that you want to work with.
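That follow-up request is an ordinary Delivery API items call, where the depth parameter controls how many levels of linked items come back. A small sketch of building such a URL, with a hypothetical codename:

```typescript
// Sketch: sync items are depth 0, so fetching the full graph for a changed
// item means a follow-up Delivery API call. `depth` controls how many
// levels of linked/modular content are included in the response.
function deliveryItemUrl(
  environmentId: string,
  codename: string,
  depth = 1,
): string {
  return `https://deliver.kontent.ai/${environmentId}/items/${codename}?depth=${depth}`;
}

console.log(deliveryItemUrl("<ENVIRONMENT_ID>", "my_article", 2));
```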

Be careful with model updates

As with webhooks in Kontent.ai, you need to be cautious when changing content models. If you change the content model, then all items that match your webhook and Sync API subscriptions will trigger an update. For webhooks, this means you'll get a lot of individual notifications to process; for the Sync API, it means you'll get a lot of pages of updates. Depending on how you process these updates, you could end up using a lot of API calls.

What I've done in the past for larger model changes is to delete my Sync API tokens before the model update to force a new token to be created. To do this, you need to be sure that you can have a content freeze, or be in a position to run something that can manually synchronise the content back into your application(s).

Example Use Cases

So, you may be asking where you might use the Sync API. There are some good bullet-point examples in the documentation, but here are a couple of additional ways that I've used it:

Translation Integration

When working on projects that required integration with third-party translation services, creating workflow steps that reflected the translation process meant that we could move an item to 'Ready for Translation'.

The Sync API will pick up any changes to workflow, so the consumer of the Sync API just needs to check the workflow and workflow_step properties in the system data to know whether to pass the item for translation or not. In an ideal world, we'd be able to filter on all of the system data.
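Since that filtering has to happen on the consumer side, it amounts to a simple check over each synced item's system data. A minimal sketch, where the step codename "ready_for_translation" is a hypothetical example:

```typescript
// Sketch: the Sync API cannot filter on workflow step, so the consumer
// inspects each item's system data. Property names follow the article;
// the sample items and step codename are made up for illustration.
type SyncedItem = {
  system: { codename: string; workflow: string; workflow_step: string };
};

function needsTranslation(items: SyncedItem[]): SyncedItem[] {
  return items.filter(
    (i) => i.system.workflow_step === "ready_for_translation",
  );
}

const changed: SyncedItem[] = [
  { system: { codename: "about_us", workflow: "default", workflow_step: "ready_for_translation" } },
  { system: { codename: "homepage", workflow: "default", workflow_step: "published" } },
];

console.log(needsTranslation(changed).map((i) => i.system.codename)); // [ 'about_us' ]
```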

Web Application Routing

On a web implementation project with deep routing, we used the Sync API to drive a process that maintained our routing tables. Taking the URL slug from new and updated items, we could recalculate the URLs in the application and ensure that all canonical and language-variant URLs were up to date. Once updated, we flushed the routing cache and let the application carry on.

Search Indexes

Working with a static site, we had a web channel that would rebuild itself as content changed. What we also needed was a way to repopulate our search indexes. Using a cron job to poll the Sync API, we could get changes to the pages that we were interested in when they were published (or removed!) and update the index accordingly.

Summary

The Sync API gives more control to the calling applications with regards to content updates. We receive more information from the Sync API than we do from webhooks, and in some cases this can save time when processing the updates themselves. We still need to use caution with content model updates, as these can register a lot of updates.

In preview, it does seem like we have to do some extra work, as we're not able to filter on the workflow step. So be prepared to do some extra checking if you are only interested in content items at certain steps in the workflow.

It's a great tool for synchronising content across multiple channels/applications and definitely one to investigate for integration work too. 

This is complementary to webhooks, not a replacement. You need to think of the pros and cons for your particular use cases when deciding how to react to changes in the content.