Optimisation

Summary

It is critically important that client application developers communicate with the OpenTV Platform in the most effective way possible. Each unnecessary query or piece of data that is retrieved slows down the performance of the application and puts unnecessary additional load on the platform. Processing cycles are used up by the application when it processes data, which can have a negative impact on perceived performance if the user is trying to navigate around the application at the same time.

This page describes a number of optimisation techniques for developing a client application that uses the OpenTV Platform.

Metadata retrieval

This section describes some best practices regarding metadata retrieval.

Paging

Several APIs support paging data. Use this functionality to retrieve only the number of items you can display on the screen at once. You can fetch additional items if the user requests them, or use pre-fetching to dynamically retrieve additional items as the user navigates so that they don't perceive any data gaps. Do not retrieve everything at once just because it's easier. Each item retrieved takes up server processing time, bandwidth, client processing time, and memory to store.
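The paging and pre-fetching approach described above can be sketched as follows. This is a minimal illustration, not the platform's client library: fetch_page is a hypothetical stand-in for whatever function issues the HTTP request, and the idea of passing an offset alongside the limit is an assumption, so check the API reference for the paging parameters each API actually supports.

```python
from typing import Callable, Dict, List

# Hypothetical fetcher: takes (offset, limit) and returns that slice of items.
# A real client would issue an HTTP request here; the offset parameter is an
# assumption for illustration, not a confirmed part of the API.
FetchPage = Callable[[int, int], List[dict]]


class PagedList:
    """Fetches one screenful at a time and pre-fetches the next page."""

    def __init__(self, fetch_page: FetchPage, page_size: int) -> None:
        self._fetch_page = fetch_page
        self._page_size = page_size
        self._pages: Dict[int, List[dict]] = {}

    def _load(self, page: int) -> List[dict]:
        # Only go to the server for pages that are not already in memory.
        if page not in self._pages:
            offset = page * self._page_size
            self._pages[page] = self._fetch_page(offset, self._page_size)
        return self._pages[page]

    def show(self, page: int) -> List[dict]:
        items = self._load(page)
        # Pre-fetch the following page only if this one was full, so the
        # user does not see a gap when paging forward.
        if len(items) == self._page_size:
            self._load(page + 1)
        return items
```

In this sketch the pre-fetch happens synchronously to keep the example short. A real application would more likely run it in the background so that it never delays rendering of the current page.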

Data fields

Several APIs allow you to specify the fields you want returned. Use this functionality to only return the data you use in the application, either to display or perform logic on. A field does not have to be returned even if it is involved in the query.

For example, to get the first 5 root catalogue nodes, but only the id and Title fields:

CODE

http://server:port/metadata/delivery/vod/nodes?filter={"locale":"en_GB","isRoot":"true"}&limit=5&fields=["id","Title"]&sort=[["cmsOrder",1]]&version=20130603134426

Mandatory fields: Some fields will always be returned irrespective of what fields are requested, such as "version" and "total_records".
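As a hedged sketch, the example request above could be built in client code like this. The function name build_nodes_url is hypothetical, and the sketch assumes the server decodes percent-encoded query parameters in the usual way; only the path and the parameter names (filter, fields, sort, limit, version) come from the example itself.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode


def build_nodes_url(
    base_url: str,
    filter_: dict,
    fields: List[str],
    sort: list,
    limit: int,
    version: Optional[str] = None,
) -> str:
    """Build a /vod/nodes request that asks only for the fields the UI uses."""
    params = {
        # The JSON-valued parameters are serialised compactly, then
        # percent-encoded by urlencode().
        "filter": json.dumps(filter_, separators=(",", ":")),
        "fields": json.dumps(fields, separators=(",", ":")),
        "sort": json.dumps(sort, separators=(",", ":")),
        "limit": str(limit),
    }
    if version is not None:
        params["version"] = version
    return f"{base_url}/metadata/delivery/vod/nodes?{urlencode(params)}"
```

Keeping the field list in one place like this makes it easier to see which fields the application depends on, and to trim the list when a field is no longer used.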

Limit your date range queries when using multiple values

When running a date range query for events, it seems obvious to simply specify start date < Y and end date > X, where X is the start of your viewing window and Y is the end. The MDS applies these filters in order, so filtering for events with a start date < Y can still leave a large number of events to filter again using end date > X. To improve this, also specify start date > Y - n, where n is an interval long enough that you can guarantee no event in the window starts before Y - n, for example, 12 or 24 hours. The end date filter is then applied to a much smaller number of events.
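The effect of the lower bound can be illustrated with a small simulation. This is only a sketch over an in-memory list with made-up data; it does not model how MDS or its index actually works. Note that for the bounded query to return the same events, n must be at least the length of the viewing window plus the longest possible event duration.

```python
from typing import List, Tuple

Event = Tuple[int, int]  # (start, end) in epoch seconds

HOUR = 3600


def window_query(events: List[Event], x: int, y: int, n: int = 0) -> Tuple[int, List[Event]]:
    """Return (number of events the end filter examines, matching events).

    With n == 0 the start filter is only `start < y`.
    With n > 0 the start filter becomes `y - n < start < y`.
    """
    if n > 0:
        candidates = [e for e in events if y - n < e[0] < y]
    else:
        candidates = [e for e in events if e[0] < y]
    # The end filter only runs over what the start filter left behind.
    matches = [e for e in candidates if e[1] > x]
    return len(candidates), matches


# Illustrative data: 30 days of back-to-back 30-minute events on one channel.
events = [(s, s + HOUR // 2) for s in range(0, 30 * 24 * HOUR, HOUR // 2)]

# A two-hour viewing window starting on day 15.
x = 15 * 24 * HOUR
y = x + 2 * HOUR

unbounded_count, unbounded_matches = window_query(events, x, y)
bounded_count, bounded_matches = window_query(events, x, y, n=24 * HOUR)
```

With this data, the end filter examines 724 events without the lower bound and 47 with it, and both queries return the same 4 events.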

Let's look at a visual example. The following diagram demonstrates how a "start < Y" query with no lower bound will require MDS to consider a large subset of the index for the "end > X" part of the query.

As you can see, the part of the index considered for the second part of the query that does not actually satisfy it is large (and as such has a performance impact).

If we apply the following restriction on start instead, for example, "Y-n < start < Y", the subset of the index considered is much smaller:

Back sorts with an index and apply a restrictive filter to sorted queries

Sort the data returned only when it is required by the UI. Using a sort function (without any index) means that the MDS must first load all relevant data after a filter into memory to sort it. It is vital that you communicate your queries (as mentioned below) with your system operator so that they can index the sorting functions. This will prevent MDS loading a lot of data into memory, and causing performance issues on the platform. Also, to restrict the amount of sorting that must be done, apply the strictest filter you can to minimise the sort size.

Use caching and versions / if modified since

Several APIs support versions in the requests and responses. A version represents the state of the data, and will change as the data changes.

MDS supports two different methods of interaction with "versions" of the metadata catalogue (If Modified Since and Versions).

If Modified Since

Metadata server natively supports If-Modified-Since semantics, which is ideal for clients or caching proxies to revalidate their caches of metadata.

MDS can check whether data has changed without hitting Mongo, so a 304 response is a very quick operation.

The following diagram shows the general use of the If-Modified-Since flow:

Although the majority of MDS APIs support this workflow, there are some APIs that are dynamically changed at times other than ingest.
The following APIs do not currently respond with 304 at any time: /vod/editorials
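A client-side cache that revalidates with If-Modified-Since might look like the following sketch. It assumes that responses carry a standard Last-Modified header to echo back, which this page does not confirm. The transport function is a hypothetical stand-in for the HTTP library in use, and header names are matched exactly for brevity.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical transport: (url, request_headers) -> (status, response_headers, body).
# A real client would wrap its HTTP library of choice here.
Transport = Callable[[str, Dict[str, str]], Tuple[int, Dict[str, str], Optional[bytes]]]


class RevalidatingCache:
    """Caches response bodies and revalidates them with If-Modified-Since."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        # url -> (Last-Modified value, body)
        self._entries: Dict[str, Tuple[str, bytes]] = {}

    def get(self, url: str) -> bytes:
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached is not None:
            # Ask the server to send a body only if the data has changed.
            headers["If-Modified-Since"] = cached[0]

        status, response_headers, body = self._transport(url, headers)

        if status == 304 and cached is not None:
            # Not modified: reuse the cached body.
            return cached[1]
        if status == 200 and body is not None:
            last_modified = response_headers.get("Last-Modified")
            if last_modified is not None:
                self._entries[url] = (last_modified, body)
            else:
                # Without a Last-Modified value (assumed header) there is
                # nothing to revalidate against, so do not cache.
                self._entries.pop(url, None)
            return body
        raise RuntimeError(f"unexpected status {status} for {url}")
```

For APIs that never respond with 304, such as /vod/editorials, a cache like this still returns correct data but always downloads the full response.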

Versions

An alternative to "If-Modified-Since" is the built-in version number.

Each response from the appropriate MDS APIs will include the version number. Therefore, the version number can be used by client applications to detect changes in data that they should act on, or otherwise safely cache data.

For example, as a client application downloads parts of a VOD catalogue tree, it could cache the tree in memory. Without versioning, it would either have to not cache, or periodically update the tree to ensure it had the latest version. With versioning, the app knows when something has changed, and therefore can persist the same tree until then, without making expensive calls to get the tree again.

For example, to retrieve the version number for the VOD catalogue in MDS:

CODE

http://server:port/metadata/delivery/vod/version
  
{ "version" : "20130603134426" }

Any responses will also contain the version number with the JSON response:

CODE

http://server:port/metadata/delivery/vod/nodes?limit=1&pretty=true
  
{
  "nodes" : [
   {
    "BarkerChannelRef" : "TEST",
    ...
    "title" : "Destacados"
   }
  ],
  "total_records" : 136,
  "version" : "20130603134426"
}

It is also important to use versioning for server-side caching to work properly. Without a version number specified in a request, a server-side cache might not update the response to the query for some time after the data has changed. If a version number is supplied, a data change will result in a cache miss that forces the server-side cache to update its data.

For example, if you query the above nodes API again with a version number:

CODE

http://server:port/metadata/delivery/vod/nodes?limit=1&pretty=true&version=20130603134426

If the data had not changed, you'd get the same response (presumably returned by the cache). But if it had changed, you would get:

CODE

{
  "latest": "20130603135154",
  "msg": "Invalid version of data requested, please use latest",
  "requested": "20130603134426"
}
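One way a client could combine an in-memory cache with this behaviour is sketched below. It is an illustration under assumptions rather than a prescribed design: the fetch function is a hypothetical stand-in for the HTTP call, and the sketch treats any response containing a "latest" key as the invalid-version message shown above.

```python
from typing import Callable, Dict, Optional

# Hypothetical fetcher: (path, version or None) -> decoded JSON response.
Fetch = Callable[[str, Optional[str]], dict]


class VersionedCache:
    """Keeps responses in memory until the catalogue version changes."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._version: Optional[str] = None
        self._responses: Dict[str, dict] = {}

    def get(self, path: str) -> dict:
        if path in self._responses:
            return self._responses[path]

        response = self._fetch(path, self._version)
        if "latest" in response:
            # The requested version is stale: everything cached under the
            # old version is dropped and the request is retried once.
            self._responses.clear()
            self._version = response["latest"]
            response = self._fetch(path, self._version)
            if "latest" in response:
                raise RuntimeError("catalogue version changed again during retry")

        # Remember the version reported by the server for later requests.
        self._version = response.get("version", self._version)
        self._responses[path] = response
        return response
```

In this sketch a change is only noticed when a request misses the local cache. A client that needs to notice changes sooner could also poll the version API shown earlier and clear its cache when the value differs.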

Be aware that if you're using the /vod/editorials API with the dynamic flags isValid, isPurchaseable, isVisible, or isPreviewScreening, the version might not accurately reflect the state of the API, because these flags might have been flipped on some records between changes in catalogue version.

Optimise queries for cache-hits

Each query to the OpenTV Platform should be optimised to maximise cache hits. This means avoiding sending filter values that frequently change between clients.

Time-based queries

Time values are usually specified as UNIX epoch times or timestamps. Using a time based on a specific second can cause queries to vary greatly between clients, as each client generates the time from the current moment. Instead, round the time up or down to the coarsest value that makes sense for the use case.

For example, if you are querying for events to populate an EPG and need to provide start and end times for the window, round the times to the closest 15-minute interval. Each client will then send the same query for a given interval of time, which maximises server-side cache hits.
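A minimal sketch of this rounding is shown below. The function name round_window is hypothetical. As a design choice made for this example, it rounds the start down and the end up, rather than rounding each to the closest boundary, so that the rounded window always covers the requested one.

```python
from typing import Tuple


def round_window(start_epoch: int, end_epoch: int, step_seconds: int = 15 * 60) -> Tuple[int, int]:
    """Widen [start, end] (epoch seconds) to the enclosing step boundaries."""
    # Round the start down to the previous boundary.
    rounded_start = start_epoch - (start_epoch % step_seconds)
    # Round the end up to the next boundary (ceiling division).
    rounded_end = -(-end_epoch // step_seconds) * step_seconds
    return rounded_start, rounded_end
```

Two clients that build their window a few minutes apart within the same 15-minute interval produce identical start and end values, so their queries can be served from the same cache entry.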

Structured pre-fetch

When pre-fetching data, try to keep the pattern of fetching additional data the same across devices. For example, if the user enters the EPG grid at the same position each time, and the navigation is via paging rather than scrolling, you can fetch the events for that channel group, which should be the same for each channel. If the user can scroll down one channel at a time, however, you should fetch the events on a per channel basis, to ensure that only the needed data is returned, and that each query can be effectively cached by the server.

Discuss your queries with the system operator

The services that implement the APIs can often be tuned for specific use cases. A good example is a database index, where adding an index for a key field in a query can dramatically improve the search time for specific records. Client application developers should provide the operator responsible for the platform with a list of the APIs their application uses, including all of the filters, fields, sorts, limits, and so on, so that the system can be tuned as much as possible on the server side. This list should be actively maintained for as long as the application is being developed.