Sebastian Sieczkowski, 14 min read

Building a Contentful-driven Audio Streaming App

Engineering, Product | Aug 19, 2022




Sebastian Sieczkowski, Backend Engineer


Creating an iOS app that serves audio content to users is simple enough. But managing the content without an engineer? Different story. Luckily, we found ways to leverage Contentful.

STRV was tasked with delivering such an iOS app. The client wanted to be able to manage the content without the need of an engineer and without having to push a build every time something was changed on the homepage or tracks — therefore the content, app settings and layouts had to be loaded at runtime.

The initial estimate involved creating an admin panel for managing the content and app settings. However, we found ways to leverage Contentful for this task and removed the need for additional frontend work.

The project can be boiled down to “a heavily customizable, CMS-driven, subscription-based audio streaming app for iOS.” This article focuses on the “CMS-driven” and “heavily customizable” aspects. We’ll outline the decisions around the content model and the lessons learned along the way, as well as provide some tips and potential future improvements.

Heavily Customizable and CMS-driven?

This means that we wanted the content team to have full control over the layout and the content of the app without having to involve engineers in the process.

We needed to enable the content team to achieve the following tasks:

  • Add/edit/remove tracks
  • Add/edit/remove sections
  • Order tracks and sections
  • Customize the look of a section: big or small card sizes, hidden section titles, a hidden “see more” button, etc.

The homepage designs had a featured track at the top and several sections with titles and lists of tracks below, which could be scrolled through to view more tracks — with a maximum of 10 tracks on the homepage and a button to see more tracks on another screen. The general layout looked a little like this:

If we break that down into the clear rectangles we see above, we end up with a page, a section and a track. And so we created 3 content types for this purpose:

  • Page
  • PageSection
  • AudioTrack

Here we have a page with several sections. The page name is a unique required field, so only one page can exist with a particular name and we can fetch it by name rather than by ID. This allows us to expand to support multiple pages, which we eventually did for both free and paid users so that their pages can be customized separately.

We also have the first section with a featured audio track. However, we don’t only have the content for the section here — to make it truly CMS-driven, we have options to set the size of the audio track's preview card and even the icon of the page section. There are also some other options on a section that we’ll cover a bit later.

As seen above, the page sections and the tracks within a section have a drag handle on the left and therefore can be placed in any order that the content team wants, thus curating the content rather than just adding to an endless list of audio tracks.

With this, we've successfully allowed the content team to not only create content but also curate and display it in any desired configuration.
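
To make the model above more concrete, here is a minimal sketch of how the three content types could look once parsed in the API. The interface and field names (`PageModel`, `cardSize`, `hideSeeMore` and so on) are assumptions for illustration and not the project's actual schema. The rule that the "see more" button only appears when a section holds more than 10 tracks is also an assumption.

```typescript
// Hypothetical shapes for the three content types; all names are assumptions.
type CardSize = "big" | "small";

interface AudioTrackModel {
  id: string;
  title: string;
}

interface PageSectionModel {
  title: string;
  icon?: string;
  cardSize: CardSize;
  hideTitle: boolean;
  hideSeeMore: boolean;
  tracks: AudioTrackModel[]; // order is curated by the content team
}

interface PageModel {
  name: string; // unique, used to fetch the page instead of an ID
  sections: PageSectionModel[];
}

const HOMEPAGE_TRACK_LIMIT = 10;

// Returns the tracks to show on the homepage and whether a "see more" button applies.
function previewSection(section: PageSectionModel): { tracks: AudioTrackModel[]; showSeeMore: boolean } {
  const tracks = section.tracks.slice(0, HOMEPAGE_TRACK_LIMIT);
  const showSeeMore = !section.hideSeeMore && section.tracks.length > HOMEPAGE_TRACK_LIMIT;
  return { tracks, showSeeMore };
}
```

Because the order of `sections` and `tracks` comes straight from the CMS, the app can render them as received without any sorting logic of its own.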

Documenting the Content Model

Documentation is key for the engineers and the content team, allowing them to easily understand the content model and how certain changes will affect the app. Comments in the code are a great way to do this, but it's not always easy to find the right place for them — and the content team doesn't have access to the code.

An alternative is to use a tool like Confluence to create a wiki. But we found an even better option: having the documentation in the context of the task that is carried out. If someone is adding content or otherwise curating it, being able to see tooltips or other helpful information in the same interfaces is a great way to help the content team, and it helps onboard any future engineers on the project.

Content Type Descriptions

It's good practice to add a description to the content type, explaining what it's for and why. This is similar to having comments in code — anyone can read the code and see what it does, but knowing why is far more important, especially when working with a team. Providing a description is like having a comment or documentation in the context of the content type.

A few examples of the above content types:

  • Page: Represents the layout of a screen in the mobile application and contains sections
  • PageSection: A section for a page containing title, icon and a list of tracks
  • AudioTrack: Contains the audio file, the title of the track and other settings like color, subscription requirement, queue visibility, etc.

Content Type Names

The content type name is only used in the Contentful UI. It must be unique and descriptive, and it should make sense to both the content team and the engineers. Adding extra information to the name is a good idea, especially since special characters are allowed.

The API identifier can be more concise, since the code comments provide additional context. Usually a slug is used, ideally one that resembles the name: the name and the identifier shouldn't be too different, to avoid confusion.

Field Names

Just like the content type names, the field names inside the content types can be more descriptive and use special characters. So, as in the example below, we can change what would be a boolean variable of isTemporary on a section to something resembling a question.

Field Help Text

We can further assist the content team by adding some help text which will be displayed in the UI, explaining what a temporary section does in the app or what “hide/see more” means. This is far more valuable than having the documentation outside of the context of the content type.


On a project this dynamic, where the content team has full control over the content and even some of the app settings, it’s important to have great error handling in the API and to parse only the expected content, as described in the Content Parsers section.

Contentful has a number of field validations that can help with this.

The validations vary by field. The above screenshot is for an array of references to other entries on a page under the sections field. The page should always have at least one section; otherwise, the homepage would be empty.

We can set the field to be required (or set a minimum number of entries). Always set the accepted entry types to minimize human error and the need to remember which fields or content types can take certain other types — thereby creating a consistent self-documenting model. Setting up the validations early on, especially before any content is created, is the best way to ensure that the content is valid and doesn't have any errors from the get-go. While it is possible to add validations later, any published entries will still be visible to the user before they're edited and republished — so the validations will only trigger when the content is about to be published again.

For example, if a validation is to fix a display issue in the app, such as limiting title lengths, the old entries will have to be manually updated before that can take effect.
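
As a rough illustration of the validations described above, the sketch below shows a field definition in the shape Contentful uses for content type JSON, followed by a small helper that lists already-published entries that would break a newly added title length limit. The field ID, the 40-character limit and the helper name `findTitlesOverLimit` are assumptions for illustration.

```typescript
// A "sections" field that is required, needs at least one entry
// and only accepts links to pageSection entries.
const sectionsField = {
  id: "sections",
  name: "Sections",
  type: "Array",
  required: true,
  validations: [{ size: { min: 1 } }],
  items: {
    type: "Link",
    linkType: "Entry",
    validations: [{ linkContentType: ["pageSection"] }],
  },
};

// Assumed limit for illustration only.
const TITLE_MAX_LENGTH = 40;

interface PublishedTitle {
  id: string;
  title: string;
}

// Lists the IDs of entries whose titles exceed the limit,
// i.e. the entries that need a manual edit and republish.
function findTitlesOverLimit(entries: PublishedTitle[], max: number): string[] {
  return entries.filter((entry) => entry.title.length > max).map((entry) => entry.id);
}
```

Running a check like this before adding a validation gives the content team a concrete list of entries to fix, instead of discovering them one by one in the app.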

Limit the Number of Content Types

This one ties into the Removing Duplication in API Responses section below. Contentful has a hard limit on the number of content types. And while resolving links between entries is easy thanks to the SDK's helper functions, link resolution has a nesting limit of 10 levels. It's good to be mindful of Contentful’s technical limitations and generalize the content types where possible, as we did with the page content type, rather than having separate mobile page and home page content types.

Use a Title and a Name

As the number of entries and content types grows in a Contentful space — especially when some content is built from a couple of types — it's good to have both a name and a title in the content model. This way, the entries can be easily identified even with minimal context by adding very descriptive names, while also having a separate field that the user would see.

For example, when we added tags, each tag would have a tag category linked to it. We found that it's easier for us to use [Tag Category]:[Tag Name] e.g. Activity:Create or Content Length:Short in the name of the tag, rather than just Create or Short.

While there is now a way in Contentful to show a column that contains linked entries by name, this only works when a specific content type is selected next to the search input. While looking at a list of all content types is possible, seeing the linked entries in another column at the same time is not — so prefixing the name with the tag category is easier to find and more readable than just adding the tag name, as shown in the screenshot below.

Extending Contentful to Your Needs

Even though it is a ready-made CMS that isn’t as customizable as Strapi or other options, Contentful still allows us to add custom code to the editor to make certain things easier to work with. This is what we’ve taken advantage of on the project.

Preview Mode / Sidebar Extensions

We have development, staging and production environments, and the editors have access to the staging app build. Still, preview mode is a useful tool when the environments have different features deployed, or when the editors want to check a quick change while the environment content is out of sync: it allows seeing content in a changed or draft state. The content team can see changes in production without them going out to the users. This is especially important since we’ve added caching to the API, which means a published content change can no longer be reverted as quickly.

And since we also have deep linking in the app, we implemented a QR code in the side panel of the editor for each entry, generated with the Google Chart API. By scanning it, the content team can easily preview the content before it’s published. The image below shows how that looks in the editor.

The code snippet below is the entire UI extension; it simply puts an image in the sidebar from Google Chart API based on the chl parameter. And when the API receives a content request with a source of Contentful and a user is set as an admin, the API fetches preview data from Contentful. This can work in different environments, too.

 This is a Contentful UI extension. The steps to re-create it are below.

 1. Go to Settings -> Extensions
 2. Add a new extension and paste the HTML below into the `code` section
 3. Uncheck all field types
 4. Set the extension as legacy by checking `Yes, this is a legacy sidebar extension`
 5. Set it to be hosted by Contentful
 6. Save the extension

 7. Go to Content model -> Page (or other content types that support preview) -> Sidebar
 8. Add the Preview QR extension
 9. Save the content type

<!DOCTYPE html>
<script src=""></script>
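<script>
 window.contentfulExtension.init(function (api) {
	// Base deep-link URL per environment (the URL values are omitted here).
	let url = "";
	if (api.ids.environment == "staging") {
		url = "";
	} else if (api.ids.environment == "development") {
		url = "";
	}

	const contentId = api.ids.entry;
	const img = document.getElementById("qr-code") ?? {};
	// Two style assignments follow here (a value of 0 and "center"); their targets are not recoverable.
	// The image source is a Google Chart API QR code whose chl parameter
	// carries the deep link built from url and contentId (request URL omitted here).
	img.src = "";
 });
</script>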
<img id=qr-code src="" width=145 height=145 />

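On the API side, the preview decision described above could look roughly like the sketch below. Contentful serves published content from `cdn.contentful.com` and draft or changed content from `preview.contentful.com`. The request shape (`source`, `isAdmin`) and the function name are assumptions for illustration, not the project's actual code.

```typescript
// Contentful hosts for published content and for preview (draft) content.
const DELIVERY_HOST = "cdn.contentful.com";
const PREVIEW_HOST = "preview.contentful.com";

// Hypothetical request context; `source` and `isAdmin` are assumed names.
interface ContentRequestContext {
  source?: string;
  isAdmin: boolean;
}

// Preview data is only served when the request comes from the Contentful
// QR code and the user is an admin. Everyone else gets published content.
function selectContentfulHost(ctx: ContentRequestContext): string {
  const fromContentful = (ctx.source ?? "").toLowerCase() === "contentful";
  return fromContentful && ctx.isAdmin ? PREVIEW_HOST : DELIVERY_HOST;
}
```

Requiring both conditions keeps draft content away from regular users even if they get hold of a preview deep link.
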
Field Extensions

One of the fields controlling the look of the app on some content types is for color. Unfortunately, Contentful doesn’t have a field type with a color picker. For engineers and designers, it’s easy to work with hex codes for color values. But someone who manages content in the form of copy or audio doesn’t necessarily know what hex values mean without context, and even for those who do, it’s usually hard to visualize the color a hex value represents.

So, we set the color fields as normal short text fields and also added an extension that colors the field. And when clicking on the field, it also shows a color picker — making it easier to work with.
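
Since the color is still stored as a short text field, the API can receive anything an editor manages to type or paste. A defensive check along these lines is one option. It is only a sketch, and the function name and the fallback behavior are assumptions.

```typescript
// Accepts #RGB and #RRGGBB hex colors.
const HEX_COLOR = /^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/;

// Returns the trimmed, lowercased color when it is a valid hex value,
// otherwise the provided fallback.
function normalizeHexColor(value: string, fallback: string): string {
  const trimmed = value.trim();
  return HEX_COLOR.test(trimmed) ? trimmed.toLowerCase() : fallback;
}
```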

Another use case for field extensions that we’ve used on other projects (especially on the web) is when a slug is needed for easier navigation or SEO reasons. An extension automatically generates a slug in another field based on the input of the name or title field.
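
The slug generation itself is a small function. The sketch below shows one common way to derive a slug from a title. It illustrates the idea and is not the code of the extension we used.

```typescript
// Turns a title into a URL-friendly slug:
// strips diacritics, lowercases, and joins words with single dashes.
function slugify(input: string): string {
  return input
    .normalize("NFKD") // split accented characters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // any run of other characters becomes one dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}
```

In an extension, a function like this would run whenever the name or title field changes and write its result into the slug field.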

Content Parsers

While we initially added a new endpoint to the API per content type, we found that if the app consuming the data knows which content type it needs to build its screens, we can serve all content from a single endpoint by content ID.

This was also done to allow for deep links opening any kind of content by ID, such as opening an audio track, a page section, etc. So, with any response from the API — regardless if the response came from a specific or more general content endpoint — we always return the content type.

The Contentful API also allows us to fetch by ID without specifying the content type. However, our API wouldn't know which parser to use. So, for the content endpoint, we first get the raw response from Contentful, then select the correct parser by content type and return the parsed response.

Below is an example of a parser and the main function that would select the parser. It's not a complete example and the interfaces are omitted, but it illustrates how the parsers work.

// Parses a tag into a flattened object without all the unnecessary Contentful fields.
export async function parseTag(rawTag: IContentfulEntry): Promise<ITag> {
  // On a Contentful entry, the content type ID lives at sys.contentType.sys.id.
  assert(rawTag.sys.contentType.sys.id === "tag");

  const parsedTag: ITag = {
    // Line breaks are entered as a literal "\n", so convert every occurrence.
    title: (rawTag.fields?.title ?? "").replace(/\\n/g, "\n"),
    isSearchable: rawTag.fields?.isSearchable ?? false,
    image: sanitizeUrl(rawTag.fields?.image?.fields?.file?.url ?? defaultImage("tag")),
    category: await parseTagCategory(rawTag.fields?.category),
  };

  return parsedTag;
}

// Takes the raw response from Contentful and returns the parsed response for any content type
// by first selecting the correct parser and then calling it.
export async function parseContent(content: any): Promise<unknown> {
  const parsers: {
    [key: string]: (entry: IContentfulEntry) => Promise<unknown>;
  } = {
    audioTrack: parseAudioTrack,
    page: parsePage,
    pageSection: parsePageSection,
    pageSectionSimple: parsePageSection,
    tag: parseTag,
    tagCategory: parseTagCategory,
  };

  const contentType: string = content?.sys?.contentType?.sys?.id ?? "";
  const parser = parsers[contentType];

  if (!parser) {
    throw new Error(`Content parser not found for type: ${contentType}`);
  }

  return parser(content);
}

Management API

The Contentful Management API is also a very useful tool that we’ve utilized multiple times on this project.

When we started implementing the tags and tag categories, we had a number of defined ones in a spreadsheet and it was much faster for us to insert them programmatically by simply looping over a couple of lists and using the management API to create and publish the entries, rather than the content team manually adding each tag and tag category in the UI.
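
A loop like the one described above can be sketched as follows. To keep the example self-contained, the actual Management API calls are hidden behind an injected `createAndPublish` function. That function, the field IDs and the `en-US` locale are assumptions for illustration.

```typescript
interface TagRow {
  category: string;
  name: string;
}

type Localized<T> = { [locale: string]: T };

interface EntryLink {
  sys: { type: "Link"; linkType: "Entry"; id: string };
}

interface TagPayload {
  fields: {
    name: Localized<string>;
    title: Localized<string>;
    category: Localized<EntryLink>;
  };
}

// Contentful stores field values per locale; "en-US" is an assumed default locale.
const LOCALE = "en-US";

// Builds the entry payload for one spreadsheet row.
function buildTagPayload(row: TagRow, categoryEntryId: string): TagPayload {
  return {
    fields: {
      name: { [LOCALE]: `${row.category}:${row.name}` },
      title: { [LOCALE]: row.name },
      category: { [LOCALE]: { sys: { type: "Link", linkType: "Entry", id: categoryEntryId } } },
    },
  };
}

// Stand-in for the Management API calls that create and publish an entry.
type CreateAndPublish = (contentTypeId: string, payload: TagPayload) => Promise<void>;

// Creates and publishes one tag per row and returns how many were created.
async function importTags(
  rows: TagRow[],
  categoryIds: { [category: string]: string },
  createAndPublish: CreateAndPublish
): Promise<number> {
  let created = 0;
  for (const row of rows) {
    const categoryId = categoryIds[row.category];
    if (!categoryId) {
      throw new Error(`Unknown tag category: ${row.category}`);
    }
    await createAndPublish("tag", buildTagPayload(row, categoryId));
    created += 1;
  }
  return created;
}
```

Creating the entries one at a time, rather than all in parallel, is a deliberate choice in this sketch: it is slower but less likely to run into API rate limits.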

When we started working on the search endpoint for the audio tracks, we ran into an issue with the search returning inaccurate results. The cause went back to how we had added line breaks to the titles and subtitles of tracks and page sections: we used \n without surrounding spaces, because we didn’t want spaces at the start or end of a line to be rendered in the app. A title like “Calm Stories” therefore became “Calm\nStories”. This was a problem because the search would match words only, hence the inaccurate results.

To fix this, we needed to add spaces between the line breaks; but at this point, we already had hundreds of audio tracks in Contentful and it would have taken hours for the content team to manually go into each track and edit the titles. But with the content management API, it only took a few lines of code and several minutes to write it.

We could fetch all the audio tracks, regex replace the titles to contain the spaces around line breaks and then update and publish the entries. After that, it took just a few seconds to update all the content.
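
The replacement itself could look roughly like this. It is a sketch under the assumption that line breaks are stored as a literal backslash followed by `n`, and the function name is hypothetical.

```typescript
// Matches a literal backslash followed by "n", plus any whitespace around it.
const LITERAL_LINE_BREAK = /\s*\\n\s*/g;

// Puts exactly one space on each side of every literal "\n" in a title.
function padLineBreaks(title: string): string {
  return title.replace(LITERAL_LINE_BREAK, " \\n ");
}
```

Because existing whitespace around the line break is absorbed by the pattern, running the script twice leaves already fixed titles unchanged.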

Potential Improvements

Removing Duplication in API Responses

We started by sending all the data parsed to the iOS devices. However, that quickly turned out to be a bad idea, especially once we added tags that could be attached to multiple tracks — meaning the same objects would be sent multiple times under different tracks. When the homepage contains hundreds of tracks and each contains a few tags, it quickly adds up.

To limit duplication, we can use the entry ID (sys.id in Contentful) to identify each object, send its data only once, and replace every other occurrence with a reference link, with the list of referenced objects returned under a separate key in the response.
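
One possible shape for such a response is sketched below. The `included` key, the `tagIds` field and the type names are assumptions for illustration.

```typescript
interface TagDto {
  id: string;
  title: string;
}

interface TrackDto {
  id: string;
  title: string;
  tags: TagDto[];
}

interface TrackRef {
  id: string;
  title: string;
  tagIds: string[]; // references instead of embedded tag objects
}

interface NormalizedResponse {
  tracks: TrackRef[];
  included: { tags: { [id: string]: TagDto } };
}

// Moves every tag into a shared lookup keyed by ID and leaves only IDs on the tracks.
function normalizeTracks(tracks: TrackDto[]): NormalizedResponse {
  const tags: { [id: string]: TagDto } = {};
  const refs = tracks.map((track) => {
    for (const tag of track.tags) {
      tags[tag.id] = tag; // stored once, however many tracks use it
    }
    return { id: track.id, title: track.title, tagIds: track.tags.map((tag) => tag.id) };
  });
  return { tracks: refs, included: { tags } };
}
```

The client then resolves `tagIds` against `included.tags`, which is a cheap lookup compared to sending the same tag object hundreds of times.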

Updating Cache via Webhooks

There is a limit of 2 million API requests per month in Contentful, which we were approaching quite fast once the app was up and running in production — because the client is already established and has many customers, all of whom started using the app once it was available.

To slow down the rate of requests, rather than fetching fresh data each time, we added caching to the API by content ID (so only the content that is fetched makes it to the cache, instead of holding a copy of Contentful in memory). But this was just a minimal in-memory cache, so each replica of the API had its own copy of the content. And the content in the app was no longer updated as soon as it was published in Contentful, but only once the cache expired and the data was re-fetched, every 10 minutes.
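
A minimal in-memory cache of this kind can be sketched as below. The class name and the injectable clock are assumptions for illustration. Only the per-ID keying and the 10-minute expiry come from the setup described above.

```typescript
// A minimal per-process cache keyed by content ID with a fixed time to live.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so that expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) {
      return undefined;
    }
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired, so the caller re-fetches from Contentful
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  delete(key: string): boolean {
    return this.entries.delete(key);
  }
}

// Ten minutes, matching the expiry mentioned above.
const CONTENT_TTL_MS = 10 * 60 * 1000;
```

Since each API replica would hold its own instance, two replicas can briefly serve different versions of the same entry, which is exactly the limitation described above.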

Nonetheless, there are webhooks in Contentful that we've already been using to notify a Slack channel when an entry is updated. We’ve been discussing the possibility of using the webhooks to update the cache so that we always have the freshest data.

To combat the issues above, one of our ideas (which hasn't made it to production yet) is to use a distributed cache with a microservice in front of it. The microservice would listen to the webhooks and update the cache, so each replica of the main API could use the distributed cache and serve the same data. The cached content would then be refreshed each time a published entry is updated, rather than after an arbitrary cache expiration time.
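
The webhook side of that idea could be as small as the handler sketched here. Contentful sends the event type in the `X-Contentful-Topic` header, for example `ContentManagement.Entry.publish`, and the payload carries the entry's `sys.id`. The handler name and the cache interface are assumptions, and since this idea hasn't reached production, so is the whole approach.

```typescript
// The part of a Contentful webhook payload this sketch relies on.
interface WebhookPayload {
  sys: { id: string };
}

// Anything with a delete-by-key method works, e.g. a Map or a distributed cache client wrapper.
interface InvalidatableCache {
  delete(key: string): unknown;
}

// Topics after which a cached entry should no longer be served.
const INVALIDATING_TOPICS = [
  "ContentManagement.Entry.publish",
  "ContentManagement.Entry.unpublish",
  "ContentManagement.Entry.delete",
];

// Drops the cached entry when a relevant webhook arrives.
// Returns true when the topic led to an invalidation.
function handleContentfulWebhook(topic: string, payload: WebhookPayload, cache: InvalidatableCache): boolean {
  if (INVALIDATING_TOPICS.indexOf(topic) === -1) {
    return false;
  }
  cache.delete(payload.sys.id);
  return true;
}
```

One caveat with this sketch: a cached page embeds its sections and tracks, so invalidating only the changed entry's own ID would leave stale copies inside its parents. A real implementation would also need to invalidate or rebuild the entries that link to it.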

Not Everything Needs to Be a Reference

The app had two different modes for the audio content: the standard and the relax mode.

When relax mode was enabled in the app, some tracks would be filtered out and only the calm and relaxing content was left in the app. Originally, we used tags for this; however, we would then iterate over the tags in the API and expose a boolean property of isRelaxMode on the audio track. This additional iteration over the tags, along with the content team needing to remember that this particular tag controlled a specific mode in the app, turned out to be quite a hassle and sometimes confusing.

Instead, a boolean in the content type would have sufficed and a bunch of code could be removed. Any important features or modes should have their own specific boolean field, just like we did with whether a track is free or not, whether it’s hidden from the track queue, etc.
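
The difference is easy to see side by side. In the sketch below, the tag category and tag name used for relax mode (`Mode` and `Relax`) are made up, and the field shapes are assumptions.

```typescript
// Hypothetical parsed fields of an audio track; all names are assumptions.
interface TrackFields {
  title: string;
  isRelaxMode?: boolean;
  tags?: { category: string; name: string }[];
}

// Before: the mode is inferred by scanning the tags on every track.
function isRelaxFromTags(fields: TrackFields): boolean {
  return (fields.tags ?? []).some((tag) => tag.category === "Mode" && tag.name === "Relax");
}

// After: the mode is read straight from a dedicated boolean field.
function isRelaxFromField(fields: TrackFields): boolean {
  return fields.isRelaxMode ?? false;
}

// Keeps only relax-mode tracks when the mode is enabled, otherwise keeps everything.
function visibleTracks(tracks: TrackFields[], relaxEnabled: boolean): TrackFields[] {
  return relaxEnabled ? tracks.filter(isRelaxFromField) : tracks;
}
```

With the boolean field, the editor sees an explicit toggle in the Contentful UI, and the API no longer depends on a tag being spelled exactly right.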

A similar case of using tags in the name of customizability, even when they shouldn’t have been used, was credits. All we needed were the names and jobs of the people involved in creating a given track. For example, a tag category of Narrator and a tag with the name of a staff member were used. But to expose it on the API, we still iterated over the tags and exposed a new key of credits on the audio track response.

Eventually, we replaced the tags for credits with a credits field on a track that would take an array of staff content types that contained the role and the name of the person. We found that while there is a limit on the number of content types that can be created, trying to minimize the number of content types by any means necessary can make things a lot more confusing for the content management team, while also making the code more complicated.

The lesson here is: Don’t limit the number of content types by any means necessary, and don’t over-engineer. Sometimes, the basic types are just fine.
