
Quantum computers are inherently noisy. Various quantum algorithms that offer an advantage over classical algorithms are going to require hundreds of qubits and billions of operations – all of which are noisy. To tackle the noise in the system and unlock such quantum algorithms, we are going to need quantum error correction (QEC). In QEC, we use many physical qubits to encode fewer logical qubits, introducing redundancy and thus protecting the information. You can read more about QEC in this Riverlane post: Why do quantum computers need QEC?.
A key component of QEC is decoding. During QEC, measurement data (which we call a syndrome) is acquired, which gives clues as to where errors happened during computation. The task of a classical algorithm known as a decoder is then to take this syndrome data and calculate which errors most likely occurred.
In this post, I will discuss the need for the decoding process to happen in real time. At Riverlane, we believe that three aspects are needed to fully demonstrate real-time decoding. These are:
- High-throughput streaming decoding,
- Low-latency decoding,
- Fast feedback.
First, let me explain why we need real-time decoding.
Need for real-time decoding
Let us first consider fault-tolerant quantum computation with Clifford gates (e.g. CNOT, H, S and Pauli gates). With Clifford circuits, if we know the errors that happened during computation, we can move these errors through Clifford operations to the end of the circuit and simply reinterpret the measurement outcomes. This means that, for fault-tolerant computation with Clifford circuits, it is sufficient to store the syndrome information and decode in post-processing, after the circuit has been executed. However, Clifford circuits alone are not enough – these can be easily simulated on classical computers and do not give us quantum advantage. Therefore, we need non-Clifford gates to give us a universal gate set.
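To make this concrete, here is a small, self-contained Python sketch of the idea (my own illustration, not a production decoder): a known Pauli error is tracked classically through Clifford gates, and the recorded measurement bits are simply flipped at the end.

```python
# Minimal sketch of Pauli-frame tracking through Clifford gates. Because
# Clifford gates map Pauli errors to Pauli errors, a known error can be
# pushed to the end of the circuit classically, and the measurement
# outcomes simply reinterpreted; no action on the quantum hardware needed.

def propagate(frame, gate, qubits):
    """Push a Pauli error frame through one Clifford gate.

    `frame` is a pair of sets (xs, zs): qubits carrying an X / Z component.
    Signs are ignored, since they do not affect measurement reinterpretation.
    """
    xs, zs = set(frame[0]), set(frame[1])
    if gate == "H":      # H swaps the X and Z components
        q, = qubits
        if (q in xs) != (q in zs):
            xs ^= {q}
            zs ^= {q}
    elif gate == "S":    # S maps X to Y, i.e. adds a Z component
        q, = qubits
        if q in xs:
            zs ^= {q}
    elif gate == "CNOT":  # X spreads control -> target, Z spreads target -> control
        c, t = qubits
        if c in xs:
            xs ^= {t}
        if t in zs:
            zs ^= {c}
    return xs, zs

# Example: a known X error on qubit 0 before CNOT(0 -> 1) ends up flipping
# the Z-basis measurement outcomes of both qubits, so we just flip the
# recorded bits in post-processing.
xs, zs = propagate(({0}, set()), "CNOT", (0, 1))
raw = {0: 0, 1: 0}                                  # outcomes as measured
corrected = {q: bit ^ (q in xs) for q, bit in raw.items()}
print(corrected)                                    # {0: 1, 1: 1}
```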
Leading QEC proposals for implementing non-Clifford gates require logical branching. By logical branching, we mean a logical operation conditional on a corrected logical measurement. For example, below I have shown a commonly referenced proposal for implementing a logical T gate on the surface code. This involves a logical S gate on a logical qubit, conditional on the measurement outcome of a second logical qubit (indicated in green). In order to correctly decide whether to apply the S gate, we need the error-corrected measurement outcome of the second logical qubit. This means that, at the green point in the circuit, we need the decoder outcome, and so decoding needs to happen during circuit execution (in contrast to Clifford gates, for which it was sufficient to decode after the circuit has been executed). This raises the need for live decoding feedback and therefore real-time decoding.
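To illustrate the control flow (not any real control-system API; the function and method names below are placeholders), here is roughly what that branch looks like in Python:

```python
# A hypothetical sketch of the logical branching inside a teleported T gate.
# The key point: the branch depends on the *error-corrected* logical
# measurement, so decoding must finish while the circuit is still running.

def teleported_t_gate(control_system, decoder):
    # Measure out the second (ancilla) logical qubit and collect its syndrome.
    syndrome, raw_logical_bit = control_system.measure_logical_qubit(index=1)

    # Real-time decoding: compute the correction to the raw logical outcome
    # now, before the circuit can proceed past the green point.
    correction_bit = decoder.decode(syndrome)
    corrected_bit = raw_logical_bit ^ correction_bit

    # Logical branching: apply the logical S gate only when the corrected
    # measurement outcome requires it.
    if corrected_bit == 1:
        control_system.apply_logical_s(index=0)
```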

The fact that decoding needs to happen in real time, during circuit execution, places constraints on the decoding process - in particular, on how fast the decoding must be, and on the control and communication latencies in the system (for example, the time to send the data to the decoder and to send the decoder result back through the control system). This motivates the three conditions that the decoding system must satisfy to fully demonstrate real-time decoding. I will discuss these in the rest of this post.
High-throughput streaming decoding
One of the decoding speed constraints is due to the backlog problem. The backlog problem arises when the rate at which the decoder processes data is slower than the rate at which data to be decoded is generated. On superconducting devices, a new round of syndrome data can be generated as often as every 1μs.
Imagine a factory line with multiple stations to build a toy. The toy has to pass through each station and each station adds a feature to the toy. If one station is slower than the ones before it, you can imagine toys accumulating at the slower station and creating a backlog of toys for this station to deal with.
In quantum circuit terms, consider the figure below. At the first point indicated in green, as discussed above, we will need the decoder result to decide whether to apply the conditional S gate. If the decoder has been too slow to process the incoming data, then the first logical qubit is going to have to sit there, idling and acquiring a backlog of data. This means that, in the next decoder iteration, shown with the second green box, the decoder now not only needs to decode the data acquired during computation, but also that additional backlog of data from the previous decoding iteration. Now that there is more data to decode, the decoding process will take even longer. As a result, during this decoding iteration, the second logical qubit will accumulate an even larger backlog of data while the decoder finishes. This even larger backlog of data would then need to be dealt with in the subsequent decoding iteration. This way, if the decoder is not fast enough, we essentially end up with an ever-growing backlog of data to be decoded on each subsequent decoding iteration, which exponentially slows down quantum computation.
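Here is a toy numerical model of this effect (the numbers are illustrative, not from a real device): if each round of data takes longer to decode than to generate, the backlog grows geometrically with every decoding iteration.

```python
# Toy model of the backlog problem. Each decode call must also process the
# rounds generated while the previous decode was running; if the decoder is
# slower than the data rate, the backlog grows without bound.

round_time_us = 1.0              # syndrome data generated every ~1 us
decode_time_per_round_us = 1.5   # decoder slower than the data rate

backlog_rounds = 1.0
for iteration in range(1, 6):
    decode_time = backlog_rounds * decode_time_per_round_us
    # Rounds that accumulate while this decode is running:
    backlog_rounds = decode_time / round_time_us
    print(f"iteration {iteration}: next backlog = {backlog_rounds:.1f} rounds")

# The backlog grows by a factor of 1.5 every iteration: an exponential
# slowdown. With decode_time_per_round_us <= 1.0, it stays bounded.
```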


To show that the backlog problem is avoided, it is going to be crucial to demonstrate that the rate at which the decoder can process data (the throughput) is at least as fast as the rate at which data is generated. This will require streaming decoding, where data is decoded in blocks as it is generated (in contrast to decoding all the data at once), and showing that the decoding time per QEC round remains constant as the number of QEC cycles is increased. To achieve this, we will need either a very fast decoder, or a slower decoder parallelized over multiple decoder threads, so that a large amount of data is still processed in a short enough time. Motivated by this, Riverlane's QEC Stack, Deltaflow, includes a fast, parallelized streaming decoder.
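As a rough illustration of what streaming decoding means in practice (a minimal producer/consumer sketch, not Deltaflow's actual architecture), syndrome blocks are handed to the decoder as they arrive rather than collected and decoded at the end:

```python
# Minimal streaming-decoding sketch: the decoder consumes syndrome blocks
# as they are produced. As long as the average decode time per block stays
# below the block generation time, no backlog builds up.

import queue
import threading
import time

def produce_syndromes(q, n_blocks, round_time_s):
    """Stand-in for the QPU and control system generating syndrome data."""
    for i in range(n_blocks):
        time.sleep(round_time_s)       # data arrives at a fixed rate
        q.put(f"syndrome block {i}")
    q.put(None)                        # sentinel: experiment finished

def decode_block(block):
    """Stand-in for decoding one block of syndrome data."""
    time.sleep(0.0005)                 # must stay below the data rate
    return f"correction for {block}"

def streaming_decode(q, corrections):
    """Decode blocks as they arrive instead of waiting for all the data."""
    while (block := q.get()) is not None:
        corrections.append(decode_block(block))

q, corrections = queue.Queue(), []
producer = threading.Thread(target=produce_syndromes, args=(q, 10, 0.001))
consumer = threading.Thread(target=streaming_decode, args=(q, corrections))
producer.start(); consumer.start()
producer.join(); consumer.join()
print(len(corrections), "blocks decoded")
```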
Low-latency decoding
Another key parameter to optimize is the full decoding response time, by which we mean the time between the final data extraction and the application of a conditional logical gate. The full decoding response time corresponds to the entire cycle shown below - sending syndrome data to the decoder, decoding and sending the decoder result back through the control stack. Using the factory analogy, this is the full time it takes for a single toy to pass through all the stations in a factory line.

The full decoding response time impacts how fast gates that include logical branching can be executed (for example, the logical T gate on a surface code shown above). This means that the full decoding response time also affects the logical clock rate, which determines how fast fault-tolerant quantum algorithms can be executed.
Craig Gidney and Martin Ekerå showed that 2048-bit RSA integers could be factored in 8 hours, assuming a 10μs decoding response time [Quantum 5, 433 (2021)]. If the response time were instead 100μs, this would slow the algorithm down by more than a factor of six, and likely more, because longer response times would also require larger code distances (note that the response time is only one aspect of the total runtime, so the two are not proportional).
Therefore, in order to maintain a fast logical clock rate, we need low decoding response times. Crucially, although high throughput can be achieved by parallelizing a slower decoder, achieving a low full decoding response time still requires a low-latency (or fast) decoder, as the overall latency depends on the time taken by a single decoder thread to finish its computation. Going back to the toy factory analogy – if a station is slow, we can have multiple copies of this slower station work in parallel. This way we can avoid accumulating toys at the station (backlog). But the toy factory will still be slow at making a single toy. Making stations themselves fast means that each toy spends less time in total being made. Similarly, in QEC, to achieve low full decoding response times, it is important to implement fast decoders. Researching fast decoding algorithms and efficiently implementing them in hardware is therefore one of the key focuses at Riverlane.
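A small back-of-the-envelope example (with made-up numbers) shows why parallelism fixes throughput but not latency:

```python
# Illustrative numbers only: parallelising a slow decoder keeps up with the
# data rate (no backlog), but the full decoding response time is still
# limited by how long a single thread takes to decode its block.

single_thread_decode_time_us = 10.0   # one thread, one block of data
block_generation_time_us = 2.0        # a new block arrives every 2 us
n_threads = 8

# Throughput: with 8 threads, a block completes every 10/8 = 1.25 us on
# average, which keeps up with the 2 us generation rate.
avg_time_between_completions_us = single_thread_decode_time_us / n_threads
print(avg_time_between_completions_us <= block_generation_time_us)   # True

# Latency: the block whose result gates the conditional S gate still takes
# the full 10 us (plus communication) before the logical branch can proceed.
response_time_us = single_thread_decode_time_us   # unchanged by parallelism
print(response_time_us)                           # 10.0
```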
Fast feedback
To achieve fast full decoding response times, you not only need to optimize the decoding latency, but also need to minimize any additional control and communication latencies in the system. Therefore, the final piece of the real-time decoding puzzle is to demonstrate a logical operation conditional on an error-corrected logical measurement. This allows us to account for all the latencies in the decoding process and truly assess the full decoding response time – i.e., how long it takes to repackage measurements in the control system, convert the measurements to the syndrome, send the data to the decoder, decode, and then send the error-corrected observable back through the control stack to conditionally implement a logical operation.
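As a purely hypothetical illustration, the full decoding response time can be thought of as a latency budget over exactly these steps (the component names and numbers below are made up):

```python
# Hypothetical latency budget for the full decoding response time; every
# number here is illustrative, not a measured figure.
budget_us = {
    "repackage measurements in the control system": 1.0,
    "convert measurements to syndrome data":        0.5,
    "send syndrome data to the decoder":            1.0,
    "decode":                                       5.0,
    "send the corrected result back, apply gate":   2.0,
}
print(f"full decoding response time: {sum(budget_us.values()):.1f} us")
```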
Together with Rigetti, Riverlane performed an experiment that implements a conditional physical gate based on the decoder outcome. We found that, by integrating a fast FPGA decoder into the control system of the QPU, close to the physical qubits, we could minimize the control and communication latencies in the system and achieve a full decoding response time of 9.6μs for a QEC experiment involving 8 physical qubits and 9 measurement rounds. But this is just one of the many experiments pushing us along the road to fault-tolerant quantum computing. It’s a long road ahead – and to travel it, we will need real-time QEC.