Why did we write an in-house SRT implementation?

SRT is now used regularly as a protocol for transporting video over the internet. Virtually all vendors use the libsrt library as it is easy to integrate, but we took the decision to write our own as part of our low-latency encoders and decoders in order to guarantee the highest reliability and performance for the critical transmissions our customers make. We heard from initial customers who used libsrt that they had occasional issues that they couldn’t explain. But for us, occasional issues that require a reboot are not good enough.

Upon inspecting the libsrt code, it was clear that it was based on legacy programming practices that would make it difficult for us to maintain and, more importantly, to provide prompt support for. So over the past year we wrote an in-house implementation built around modern software practices. Let’s find out more about what makes our implementation superior to libsrt:

High-Performance Single Threaded Event Loop

Implementation	RX Threads (lower = better)	TX Threads (lower = better)
Libsrt 1in/1out	5	3
Libsrt 10in/10out	50	30
OBE SRT (upipe) 1in/1out	1	1
OBE SRT (upipe) 10in/10out	1	1

The most important technical difference between libsrt and our SRT implementation (as part of the Open Source Upipe project) is the use of a single threaded event loop. This is in contrast to the old-fashioned threaded design of libsrt, which spawns numerous threads. Event loops were made famous by the nginx web server, in contrast to the thread-based design of Apache. Nginx was able to perform substantially better than Apache and is now the dominant web server. Likewise, as shown in the table above, our implementation spawns vastly fewer threads, and the thread count stays the same as the number of inputs and outputs grows.
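As a rough illustration of the event-loop idea (not Upipe's actual code, which is written in C), the sketch below services several UDP sockets from a single thread using Python's standard `selectors` module. The function names `open_inputs`, `serve` and `demo`, and the use of loopback sockets, are assumptions made purely for the example.

```python
import selectors
import socket
import threading


def open_inputs(count):
    """Open `count` non-blocking UDP sockets on loopback, one per input."""
    socks = []
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        s.setblocking(False)
        socks.append(s)
    return socks


def serve(socks, expected, timeout=2.0):
    """Read datagrams from every socket in the calling thread only.

    Returns a dict mapping local port -> list of payloads. Stops once
    `expected` datagrams have arrived or nothing arrives for `timeout` s.
    """
    sel = selectors.DefaultSelector()
    received = {s.getsockname()[1]: [] for s in socks}
    for s in socks:
        sel.register(s, selectors.EVENT_READ)
    remaining = expected
    try:
        while remaining > 0:
            events = sel.select(timeout)
            if not events:
                break  # idle for too long, give up
            for key, _mask in events:
                payload, _addr = key.fileobj.recvfrom(2048)
                received[key.fileobj.getsockname()[1]].append(payload)
                remaining -= 1
    finally:
        sel.close()
    return received


def demo(count=10):
    """Send one datagram to each input and serve them all from one thread."""
    socks = open_inputs(count)
    threads_before = threading.active_count()
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for i, s in enumerate(socks):
        sender.sendto(b"packet-%d" % i, s.getsockname())
    received = serve(socks, expected=count)
    threads_after = threading.active_count()
    sender.close()
    for s in socks:
        s.close()
    return received, threads_before, threads_after
```

Calling `demo(10)` handles ten inputs without creating any additional threads, which is the property the table above is measuring.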

Whilst spawning numerous threads is a fast way to market as it hides complexity, this causes problems in sophisticated applications like our products, where we need to control the thread priorities of certain components. For example, we need to be sure that the SRT thread(s) can’t override a realtime priority thread such as our ST-2110 transmitter, which has exceptionally precise timing requirements. Nor can the SRT thread be too low priority, otherwise packets will be output late, leading to jitter. When thread priorities are hidden in a library they are out of our control. In contrast, a single threaded implementation makes this substantially easier: we set one thread priority and move on. If multiple threads are needed (e.g. on devices with slow cores), upipe allows for multiple threads and for setting the priority of those threads.

Thread scheduling issues

In addition to the performance issues created by multiple threads, it’s clear that libsrt has challenges with synchronising the numerous threads it creates. This can be seen clearly in Bug 1822.

The author of this bug explains the problem eloquently: in libsrt the socket (a destination IP address and port like 121.122.123.124:5000) is closed in a background thread and takes an indeterminate amount of time to close. This means that in our case, anyone restarting an encoder is effectively “locked out” of that destination without rebooting the machine or asking the remote end to change port. Neither of those is practical, as there can be dozens of encoders onboard a machine or the remote end may not be able to change their port. This poses an operational challenge today in many products using libsrt, and care is needed to avoid “SRT Port Lock” as it’s often called. The author of the bug raises the valid point that the operator could also change the output format to something other than SRT, such as RTP or Zixi, and these protocol implementations rightly expect the socket to be freed.

Our implementation is single threaded, so it does not suffer from the same thread synchronisation issues. The pipeline stops immediately after being requested to do so and the socket is free for reuse. Our use of timers for SRT-related processes such as keepalive and key-exchange makes the implementation substantially simpler too.
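To make the two points above concrete, here is a minimal sketch, not taken from Upipe, of how periodic timers can be serviced from one thread and how a synchronous close leaves a port free for immediate reuse. `TimerQueue` and `close_and_rebind` are hypothetical names, and time is passed in explicitly so the behaviour is deterministic.

```python
import heapq
import socket


class TimerQueue:
    """Periodic timers serviced from a single thread, so no locks are needed."""

    def __init__(self):
        self._heap = []  # entries: (due_time, sequence, interval, callback)
        self._seq = 0    # unique per timer, so callbacks are never compared

    def add_periodic(self, now, interval, callback):
        if interval <= 0:
            raise ValueError("interval must be positive")
        heapq.heappush(self._heap, (now + interval, self._seq, interval, callback))
        self._seq += 1

    def next_timeout(self, now):
        """Seconds until the next timer is due, or None if there are no timers."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)

    def run_due(self, now):
        """Fire every timer whose due time has passed, then re-arm it."""
        while self._heap and self._heap[0][0] <= now:
            due, seq, interval, callback = heapq.heappop(self._heap)
            callback(due)
            heapq.heappush(self._heap, (due + interval, seq, interval, callback))


def close_and_rebind():
    """Close a UDP socket synchronously, then bind the same port again at once."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    first.bind(("127.0.0.1", 0))
    port = first.getsockname()[1]
    first.close()  # the descriptor is released before close() returns
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second.bind(("127.0.0.1", port))  # raises OSError if the port were still held
    same_port = second.getsockname()[1] == port
    second.close()
    return same_port
```

In an event loop, the value returned by `next_timeout` would typically be used as the timeout for the blocking wait on sockets, with `run_due` called each time the loop wakes up. Keepalive and rekeying then become two entries in the same queue rather than two threads that have to be synchronised.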

It’s also quite easy to cause libsrt to deadlock using a simple fuzzer. We’ll talk more about this below.

Modular design

One of the most important parts of upipe is the notion of “pipes” being modular. The author of the code took great care to separate the handshake codepath from the sending and receiving of data/retransmissions. This modularity is valuable (albeit at a negligible performance cost) as it allows programmers to understand the code and individually test components.

Bug-for-bug compatibility

Haivision, to their credit, has written a largely-understandable specification for the SRT protocol, albeit one which from time to time talks about libsrt-specific features, and arguably some parts should be left to a networking textbook. However, it’s clear that certain parts of this specification have never been independently implemented. For example, it’s clear that key rotation has only been tested libsrt -> libsrt, as sending a key rotation sequence allowed in the specification would be rejected by libsrt. Similarly, sending values allowed by the specification would still cause libsrt to lock up.

Therefore, if libsrt has a bug, we must also match that bug in our implementation, as there are a substantially larger number of libsrt-capable devices in the field and we need to be able to interoperate with them. Maintaining bug-for-bug compatibility gives us confidence that our implementation can be used against the numerous different products out there. As demonstrated, we have been submitting fixes in order to improve libsrt, but we can’t fix the thousands of incorrect implementations already deployed.

Old version compatibility

Whilst we haven’t seen anything ourselves, there are reports from the field of issues between different versions of libsrt, with a few vendors actually shipping multiple versions of libsrt. This is something we can test easily with pcaps and automated tests, to make sure we can communicate with all versions.
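As an illustration of how captures can feed such tests, the sketch below extracts UDP payloads from a classic pcap file held in memory, so that each recorded packet can be handed to the code under test. It is an assumption about how such a test could be structured rather than a description of our test suite; `iter_udp_payloads` is a hypothetical name, and the sketch only handles Ethernet/IPv4/UDP frames without VLAN tags or fragmentation.

```python
import struct

PCAP_MAGIC_USEC = 0xA1B2C3D4  # classic pcap with microsecond timestamps
LINKTYPE_ETHERNET = 1


def _udp_payload(frame):
    """Return (dst_port, payload) for an Ethernet/IPv4/UDP frame, else None."""
    if len(frame) < 14 + 20 + 8:
        return None
    if struct.unpack_from("!H", frame, 12)[0] != 0x0800:  # EtherType: IPv4
        return None
    ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or frame[14 + 9] != 17:  # IP protocol 17 = UDP
        return None
    udp = 14 + ihl
    if len(frame) < udp + 8:
        return None
    _src, dst_port, length, _csum = struct.unpack_from("!HHHH", frame, udp)
    if length < 8:  # UDP length includes its own 8-byte header
        return None
    return dst_port, frame[udp + 8:udp + length]


def iter_udp_payloads(pcap_bytes):
    """Yield (dst_port, payload) for each UDP packet found in a classic pcap."""
    if len(pcap_bytes) < 24:
        raise ValueError("truncated pcap global header")
    if struct.unpack_from("<I", pcap_bytes, 0)[0] == PCAP_MAGIC_USEC:
        endian = "<"
    elif struct.unpack_from(">I", pcap_bytes, 0)[0] == PCAP_MAGIC_USEC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    linktype = struct.unpack_from(endian + "I", pcap_bytes, 20)[0]
    if linktype != LINKTYPE_ETHERNET:
        raise ValueError("only Ethernet captures are handled in this sketch")
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-record header: ts_sec, ts_usec, captured length, original length
        _sec, _usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset)
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        result = _udp_payload(frame)
        if result is not None:
            yield result
```

A regression test could keep one capture per libsrt version and check that every recorded handshake is accepted, or rejected, exactly as expected.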

Stability/Security analysis

Security/stability analysis on a protocol is a complex topic, with many talented researchers spending their careers trying to solve this problem. As mentioned above, it is quite easy to deadlock libsrt (requiring a reboot). So we decided to create a specific fuzzer for our implementation of the SRT handshake and run it tens of billions of times. We hope to publish more about this work in substantially more detail in the future.
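The sketch below shows the general shape of a handshake fuzzer. It is not our fuzzer and the parser is a toy: `parse_handshake`, `valid_handshake` and `fuzz` are hypothetical names, and the field layout (a 16-byte header followed by a 48-byte handshake body) follows our reading of the SRT specification and should be treated as an assumption. The property being checked is that malformed input is only ever rejected cleanly, never answered with an unexpected exception.

```python
import random
import struct

HEADER_LEN = 16     # SRT packet header
HANDSHAKE_LEN = 48  # handshake body without extensions


def parse_handshake(data):
    """Parse a handshake packet; raise ValueError on anything malformed."""
    if len(data) < HEADER_LEN + HANDSHAKE_LEN:
        raise ValueError("packet too short")
    type_field, _subtype, _info, timestamp, dst_id = struct.unpack_from(
        "!HHIII", data, 0)
    if not (type_field & 0x8000):
        raise ValueError("not a control packet")
    if (type_field & 0x7FFF) != 0:
        raise ValueError("control packet is not a handshake")
    (version, encryption, extension, seq, mtu, flow_window,
     hs_type, socket_id, cookie, peer_ip) = struct.unpack_from(
        "!IHH6I16s", data, HEADER_LEN)
    if version not in (4, 5):
        raise ValueError("unsupported handshake version")
    return {"version": version, "mtu": mtu, "hs_type": hs_type,
            "socket_id": socket_id, "cookie": cookie, "dst_id": dst_id}


def valid_handshake():
    """Build a well-formed packet to use as a mutation seed (illustrative values)."""
    header = struct.pack("!HHIII", 0x8000, 0, 0, 0, 0)
    body = struct.pack("!IHH6I16s", 5, 0, 0, 0, 1500, 8192, 1, 1234, 0, bytes(16))
    return header + body


def fuzz(iterations, seed=0):
    """Feed random and mutated packets to the parser and record any crash."""
    rng = random.Random(seed)
    base = valid_handshake()
    accepted = rejected = 0
    findings = []
    for _i in range(iterations):
        if rng.random() < 0.5:
            # Entirely random bytes of random length
            data = bytes(rng.getrandbits(8) for _j in range(rng.randrange(97)))
        else:
            # Corrupt a few bytes of a valid packet, sometimes truncating it
            mutated = bytearray(base)
            for _j in range(rng.randrange(1, 5)):
                mutated[rng.randrange(len(mutated))] = rng.getrandbits(8)
            if rng.random() < 0.2:
                mutated = mutated[:rng.randrange(len(mutated) + 1)]
            data = bytes(mutated)
        try:
            parse_handshake(data)
            accepted += 1
        except ValueError:
            rejected += 1  # a clean rejection is the expected outcome
        except Exception as exc:  # anything else is a bug worth recording
            findings.append((data, repr(exc)))
    return accepted, rejected, findings
```

A real fuzzer would be coverage-guided and would also watch for hangs and memory errors, but even a loop like this one is enough to catch a parser that trusts a length field it should not.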

It should be noted that a basic unsophisticated fuzzer caused valgrind warnings in libsrt, suggesting deeper underlying issues.

Legacy features

We also decided not to implement features such as rendezvous mode and file transfer mode. These would have added substantial complexity to our implementation for limited gain. Leaving them out also helped us reduce the technical debt from the old UDT protocol that SRT was based on. Rendezvous mode could potentially be added at a later date.

Conclusion

SRT is an important transmission protocol used increasingly for high-end transmissions like sport. But it’s important to note the limitations of existing implementations for critical transmissions with large numbers of viewers. It’s also important to note that SRT processing doesn’t live alone: we have a whole product doing substantially more complex things like encoding and ST-2110 processing as well. It’s for this reason that an SRT implementation needs to be integrated cleanly into our encoders and decoders, and writing a full in-house implementation was the best way to do this.
