6 December 2023
Can comparison-based sorting be sped up by using nondeterminism? The answer turns out to be yes, at least when you are sorting large integers and taking their bit-lengths into account when counting operations.
First of all, recall that the merge sort algorithm makes $\Theta(n \lg n)$ comparisons between elements of the input array, which is optimal for a comparison-based sort. Let $m$ be the size in bits of the largest integer of the array. Then, the runtime of merge sort is actually worst-case $\Theta(m n \lg n)$ if we assume that comparisons between integers take $O(m)$ time. The worst case happens when all comparisons always look at all or almost all the $m$ bits, for instance when all integers are identical.^{1}
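To make the comparison count concrete, here is a minimal merge sort instrumented to count element comparisons. This is my own illustration, not one of the post's algorithms; the function name and structure are mine.

```python
def merge_sort(A):
    """Sort A, returning (sorted_list, number_of_element_comparisons)."""
    if len(A) <= 1:
        return list(A), 0
    mid = len(A) // 2
    left, cl = merge_sort(A[:mid])
    right, cr = merge_sort(A[mid:])
    merged, cm = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        cm += 1  # one element comparison per loop iteration
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]
    return merged, cl + cr + cm

result, comps = merge_sort(list(range(1024, 0, -1)))
# comps is Θ(n lg n): at most n lg n = 1024 · 10 = 10240 here
```

Each of these counted comparisons costs up to $O(m)$ time on $m$-bit integers, which is where the extra factor of $m$ in the runtime comes from.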
Here is a simple nondeterministic sorting algorithm in Python^{2}:
```python
from nondeterminism import *

@nondeterministic
def ndsort(A):
    n = len(A)
    # Guess a permutation by swapping each position
    # with a nondeterministically chosen one
    for i in range(n):
        j = guess(range(n))
        A[i], A[j] = A[j], A[i]
    # Reject unless the guessed permutation is nondecreasing
    for i in range(n - 1):
        if A[i] > A[i + 1]:
            return
    return A
```
This algorithm fills the elements `A[0]`, `A[1]`, …, `A[n-1]` left to right by swapping each of them with a nondeterministically selected one (position `j`). This ensures that all permutations are nondeterministically possible, and uses $n \lg n$ nondeterministic bits (i.e., $\lg n$ bits for each guess).^{3}
Then, we check whether we actually guessed a nondecreasing permutation, that is, whether we actually sorted the array. If we find a position `i` such that the two consecutive elements `A[i]` and `A[i+1]` are not in nondecreasing order, then we return nothing (`None` in Python), which rejects this computation and triggers the search for another accepting one.
Otherwise, well, we guessed the right permutation (actually, one among the right ones) and we can return the sorted array. Since every array of integers has at least one nondecreasing permutation, there will always be at least one accepting computation of the `ndsort` algorithm, which will give the expected output. This output is the same for all nondecreasing permutations; that is, even though we are using nondeterminism, the final result is actually deterministic.^{4}
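The `nondeterminism` library handles the search over guesses for you. As a self-contained stand-in (my own illustration, not the library's mechanism), here is a deterministic simulation that tries every possible sequence of guesses and returns the first accepting computation:

```python
from itertools import product

def ndsort_simulated(A):
    """Deterministic brute-force simulation of ndsort."""
    n = len(A)
    for guesses in product(range(n), repeat=n):  # all n**n guess sequences
        B = list(A)
        for i, j in enumerate(guesses):
            B[i], B[j] = B[j], B[i]
        # Accept only if the guessed permutation is nondecreasing
        if all(B[i] <= B[i + 1] for i in range(n - 1)):
            return B
```

Every accepting computation returns the same sorted array, matching the observation that the output is deterministic; of course, this simulation takes $O(n^n)$ time rather than the nondeterministic $\Theta(n \lg n + mn)$.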
Now, how much time does `ndsort` take?
Assuming that `guess(range(n))` takes $\Theta(\lg n)$ time in order to obtain $\lg n$ nondeterministic bits, the first loop takes $\Theta(n \lg n)$ time.
The second loop does $O(n)$ iterations, each comparing two adjacent elements of the array, which takes $O(m)$ time. In the worst case, when the array is actually sorted and the comparisons need to examine all $m$ bits because the integers are similar enough, this loop takes $\Theta(mn)$ time.
This gives a total runtime of $\Theta(n \lg n + mn)$. How does that compare to merge sort?
If the size of the integers is independent of the length $n$ of the array, i.e., if $m$ is $O(1)$, then both algorithms have a $\Theta(n \lg n)$ runtime. Notice that, at least from a theoretical standpoint, in this case it would be better to use a non-comparison-based algorithm such as counting sort, which would take linear time.^{5}
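For completeness, here is a minimal counting sort; this is my own sketch, with a hypothetical signature taking the value bound as a parameter. It runs in $\Theta(n + V)$ time for integers in the range $[0, V]$, which is linear when $V$ is a constant.

```python
def counting_sort(A, max_value):
    """Sort integers in range(max_value + 1) without comparisons."""
    counts = [0] * (max_value + 1)
    for x in A:              # tally each value
        counts[x] += 1
    out = []
    for v, c in enumerate(counts):
        out.extend([v] * c)  # emit each value as often as it occurred
    return out
```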
However, when the size $m$ of the integers actually depends on $n$, then `ndsort` beats merge sort. An example of a realistic situation where $m$ depends on $n$ is a database of items with unique IDs that you want to sort by ID. In that case, the IDs have at least $m = \lg n$ bits. Merge sorting them would then require $\Theta(n \lg^2 n)$ time, while sorting them with `ndsort` only takes $\Theta(n \lg n)$ time.
Since $m$ is multiplied by $n \lg n$ for merge sort, but only by $n$ for `ndsort`, the same reasoning applies to any non-constant value of $m$, be it larger or smaller than $\lg n$.
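To see the gap numerically, here is a back-of-the-envelope comparison of the two cost formulas with all hidden constants set to 1; the numbers are illustrative, not measurements.

```python
import math

n = 2**20                     # about a million items
m = math.ceil(math.log2(n))   # unique IDs need at least lg n = 20 bits

merge_sort_cost = m * n * math.log2(n)   # Θ(m n lg n)
ndsort_cost = n * math.log2(n) + m * n   # Θ(n lg n + m n)

print(merge_sort_cost / ndsort_cost)     # merge sort pays the lg n factor twice
```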
What if we have access to a few nondeterministic bits, but not as many as $n \lg n$? Can we still exploit that in order to do better than merge sort?^{6} Once again, the answer turns out to be yes.
The idea is to guess a sorted prefix of the array `A`, say of length $k$ (this requires $k \lg n$ nondeterministic bits), then check that it is indeed sorted, and deterministically sort the rest of the array with $O((n-k) \lg(n-k))$ comparisons requiring $O(m)$ time each. Since the two portions `A[0:k]` and `A[k:n]` of array `A` have been sorted separately, we still need to check that gluing them together actually yields a sorted array; luckily, this only requires checking that `A[k-1]` ≤ `A[k]`.
Here is a Python algorithm doing exactly that, and taking the length $k$ of the nondeterministic prefix of the array as input:^{7}
```python
from nondeterminism import *

@nondeterministic
def hybridsort(A, k=0):
    n = len(A)
    assert k <= n
    # Guess the first k elements nondeterministically
    for i in range(k):
        j = guess(range(n))
        A[i], A[j] = A[j], A[i]
    # Reject unless the guessed prefix is nondecreasing
    for i in range(k - 1):
        if A[i] > A[i + 1]:
            return
    # Sort the rest of the array deterministically
    A[k:n] = sorted(A[k:n])
    # Reject unless the two portions glue into a sorted array
    if 0 < k < n and A[k-1] > A[k]:
        return
    return A
```
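As with `ndsort`, the nondeterministic part can be simulated deterministically by brute force; this sketch of mine is exponential in $k$ rather than in $n$, since only the prefix is guessed.

```python
from itertools import product

def hybridsort_simulated(A, k=0):
    """Deterministic brute-force simulation of hybridsort."""
    n = len(A)
    assert k <= n
    for guesses in product(range(n), repeat=k):  # all n**k guess sequences
        B = list(A)
        for i, j in enumerate(guesses):
            B[i], B[j] = B[j], B[i]
        if any(B[i] > B[i + 1] for i in range(k - 1)):
            continue                  # guessed prefix not sorted: reject
        B[k:n] = sorted(B[k:n])       # sort the rest deterministically
        if 0 < k < n and B[k - 1] > B[k]:
            continue                  # the two portions do not glue: reject
        return B
```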
By letting $k = 0$, which is the default value, the algorithm behaves completely deterministically, and has runtime $\Theta(mn \lg n)$, the same as merge sort. By letting $k = n$, we obtain `ndsort` as analysed above, with its complexity of $\Theta(n \lg n + mn)$.
But we can play around a bit with the value of $k$ and see what happens. For instance, let $k = \frac{n}{2}$. Then `hybridsort` uses $\frac{n}{2} \lg n$ nondeterministic bits, and makes $\frac{n}{2}-1$ comparisons in the second loop, plus $\frac{n}{2} \lg \frac{n}{2}$ when sorting the second portion of the array, plus an extra comparison at the end, which gives a total of
$$\frac{n}{2} - 1 + \frac{n}{2} \lg \frac{n}{2} + 1 = \frac{n}{2} \left( 1 + \lg \frac{n}{2} \right) = \frac{n}{2} \lg n$$
comparisons, which is better than the exact number of comparisons made by merge sort and thus gives a better runtime for large enough values of $n$ and $m$.
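The three contributions $\frac{n}{2}-1$, $\frac{n}{2} \lg \frac{n}{2}$, and $1$ sum to exactly $\frac{n}{2} \lg n$, about half of merge sort's roughly $n \lg n$ comparisons. A quick numeric sanity check of that identity (my own verification):

```python
import math

for n in [2**e for e in range(2, 16)]:
    total = (n / 2 - 1) + (n / 2) * math.log2(n / 2) + 1
    assert abs(total - (n / 2) * math.log2(n)) < 1e-6  # equals (n/2) lg n
```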
We could slightly optimise the algorithm in the case where all elements are identical, or do a more sophisticated analysis showing that we almost never need to look at all $m$ bits when the elements of the array are distinct enough, but I am pretty sure that would still give a $\Theta(m n \lg n)$ runtime. ↩
You can actually run this code on your deterministic hardware by using the `nondeterminism` library. Cool, isn’t it? (Running it might actually incur a slight exponential slowdown in real-world time, though. 👀) ↩
We could slightly reduce the number of nondeterministic bits needed by having something like `j = guess(range(i, n))` rather than `j = guess(range(n))`, but this complicates the analysis without changing the asymptotic number of bits; you still need to be able to guess any of the $n!$ permutations, and that requires at least $\lg(n!)$ bits, which is $\Omega(n \lg n)$. ↩
This is, strictly speaking, only true if the integers we are sorting are the only data. If there is some ancillary data and the sorting key is not unique (for instance, if we are sorting records of people by birthdate) then multiple outputs are possible. In particular, this is not a stable sort. Notice that this happens for some classic algorithms too, for instance quicksort with random pivoting. ↩
From a practical standpoint, you would also need the constant hidden in the $O(1)$ to be small enough, since you need an array of that length in order to implement counting sort. ↩
This question (or, more precisely, a related one about the trade-off between comparisons and nondeterministic bits) was asked by Joshua Grochow on Twitter, where I was brainstorming about all this. ↩
Here the portion `A[k:n]` of `A` is sorted using the default Python sorting algorithm, which is called Timsort and makes a guaranteed $O(n \lg n)$ number of comparisons. Thus, it is equivalent to merge sort as far as we are concerned here. ↩