Rademacher Functions as Discrete Sines

2020-11-21 arithmetic

This is a short piece on the family of Rademacher functions It is taken from a piece I wrote for a class. As plus, it illustrates how KaTeX gets rendered on this blog.

Defintion

Sines are great, but their continus nature feels out of place for programmers maybe more familiar with digital values. Tasked with finding a discrete alternatives for the sines, we might come up with something that looks like square waves $S_T$,

$$ S_T(x) = \left{\begin{matrix} 1 & 0 \leq x < T/2 \ -1 & T/2 \leq x < T \end{matrix}\right. $$

$S_{2 \pi}$ compared to the regular sine.

defined on some interval $[0, T)$. This definition fulfills our requirement for a digital function when values ${-1, 1}$ are encoded as binary zero and one. To be as close as possible to the familiar trigonometric functions, we would set $T=2 \pi$ for a square wave that looks very much like the familiar sine.

It should not come as a surprise that such functions have been discovered and studied before. Indeed, the set of Rademacher functions originally described by German mathematician Hans Rademacher in 1922 [Original Paper] is in line with our intuition about discrete sines.

(Defintion) The Rademacher functions $r_k : \mathbb{R} \rightarrow { -1, 1 }$ are defined defined as

$$ r_k(x) = {(-1)}^{ \lfloor 2^k x \rfloor } $$

with $k \in \mathbb{N}$.

However, above definition of $r_k$ is based on Donald Knuth’s formulation, rather than the one original proposed by Rademacher himself as Knuth’s definition is closer to what a digital computer might end up evaluating [Knuth, p. 8]. Another equivalent definition of the Rademacher functions is

$$ r_k(x) = \text{sgn}\big(\sin (2^k \pi x)\big) $$

when we fix $\text{sgn}(0) = 1$ [Mathworld]. This formulation is useful because it illustrates the relationship between Rademacher functions and their trigonometric cousins. However, it can only be illustrative; discrete alternatives for the trigonometric functions must not involve evaluating the sine.

Parameters

Similar to how sines can be controlled with amplitude $A$, frequency $\omega$ and phase $\varphi$, it is easy to modify $r_k$ with parameters to control its amplitude $a$ and phase $p$. Less obvious is that intrinsic to the Rademacher functions is an equivalent to frequency, namely parameter $k$.

Let us investigate all these parameters in order. First of all, multiplying $r_k$ with amplitude $a$ changes the image of $r_k$ such that

$$ r_{k,a}(x) = a ; r_k(x) ; \in { -a, a }. $$

This is not really the same as amplitude $A$ of the sines which changes the image to

$$ A ; \sin(x) \in [-A, A], $$

because the sine is a surjective mapping. But for a discrete alternative, that is as far as we should go, maintaining the digital nature of the Rademacher functions. As for the next parameter, introducing a shift $p$ is equality trivial; merely adding $p$ to argument $x$ moves $r_k$ on the horizontal, viz.

$$ r_{k,p}(x) = r_k(x + p). $$

(Theorem) Now it is getting interseting. Intrinsic to each Rademacher function $r_k$ is parameter $k$ which controls the number of oscillations $r_k$ makes in the unit interval $[0, 1)$; function $r_k$ includes exactly $2^{k-1}$ oscillations on $[0,1)$.

(Proof) With each increment of $k$, exponent $\xi(x) := \lfloor 2^k x \rfloor$ grows to twice its previous size. In addition, flooring $2^k x$ acts like a filter that discards everything right of the decimal point. As a result, $\xi(x)$ is a stairway function with its step width controlled by $k$. With each increment of $k$, we get twice as many stairs of equal width. When evaluating $r_k(x)$, each time a new step is reached, the sign of $r_k(x)$ switches. This results in $2^{k-1}$ oscillations on $[0,1)$. □

With parameters $a$, $p$ and $k$, the Rademacher functions could be an alternative to the sines. For example, parameterized Rademacher functions are used to modulate messages in digital communication [Mohanty]. Their advantage compared to the sine is that they are very easy to compute, as we will see now.

Evaluating Rademacher Functions

In practical applications of the Rademacher functions, it is necessary to evaluate $r_k(x)$ in a reasonable amount of time. There are multiple ways to compuote $r_k$ efficiently on digital hardware, this section introduces two approaches.

Fixed Point Arithmetic

(Theorem) [Knuth, p. 8] notes that $r_k(x)$ can be computed quite elegantly if argument $x$ is represented as a fixed point number, that is as a binary string with a decimal point, viz.

$$ x = (\ldots ; c_2 ; c_1 ; c_0 ; . ; c_{-1} ; c_{-2} \ldots). $$

In that particular case we get

$$ r_k(x) = {(-1)}^{c_{-k}} = \left{\begin{matrix} 1 & c_{-k} = 0\ -1 & c_{-k} = 1 \end{matrix}\right. $$

(Proof) Multiplying $x$ with $2^k$ is equivalent to a shift to the left by $k$ bits, even in this decimal representation. We get

$$ \rho(x) := 2^k x = (\ldots c_{-k} ; . ; c_{-(k+1)} \ldots). $$

Taking the floor from $\rho$ means cutting all digits right of the decimal point. As such, exponent $\lfloor \rho \rfloor$ is an even integer in case $c_{-k} = 0$ and odd when $c_{-k} = 1$, resulting the formulation above. □

Fixed width numbers are used when more sophisticated floating point hardware is unavailable, for example on digital signal processors or embedded systems powered by microcontrollers [Kneusel, p. 117]. Knuth’s unorthodox approach to evaluating $r_k$ might be a surprisingly good fit.

Floating Point

On a system that represents fractional values as some kind of floating point number, like the one described by IEEE 754 [IEEE] [Kneusel, pp. 101], evaluating $r_k$ is also fast. The listing below illustrates this in C-like pseudo code, where function rademacher(k, x) computes $r_k(x)$.

float rademacher(int k, float x)
{
    // Compute x = (2^k) * x. Because `x' is represented as a floating
    // point number, it is sufficient to increment its exponent by `k'.
    x.exponent += k;

    // We are only interested in what is left of the decimal point.
    int p = (int) x;

    // Equivalent to computing (-1)^p.
    return (p % 2 == 0) ? 1 : -1;
}

On modern machines, all instructions required to execute rademacher should run in constant time, which should be obvious for incrementing x’s exponent and branching over p. Converting floating point x to integer p, effectively flooring x, can be fast when hardware support is available, for example provided by the fistp instruction on the ubiquitous x84_64 architecture [Intel, Vol. 2A 3-355].

Regardless whether we are using low-powered embedded hardware with fixed point numbers or more sophisticated computers with floating point units, evaluating $r_k$ is a simple operation that should not be a bottleneck for any application.

Orthogonality and Completeness

The process known as Fourier transform allows us to approximate functions on some interval to arbitrary accuracy. Very roughly speaking, it is based around infinite sums of sines and cosines. This is not possible with the Rademacher functions. A deep dive into this topic is out of scope for this article; for now we only wish to investigate why it is not feasible to use the Rademacher functions for this task.

We will define two related concepts and try to gain an intuition on what they mean. What we are looking for when we wish to build something like the Fourier transform is a set of functions $\phi_k$ orthogonal and complete on some interval $(a,b)$ as sums of such functions $\phi_k$ can be used to approximate periodic functions to arbitrary accuracy. Let us begin with the idea behind orthogonality.

Orthogonality

Just like orthonormal vectors make a base for a vector space, orthogonal sets of functions make a base for function spaces.

(Defintion) A set of functions $\phi_k$ is called orthogonal in interval $(a, b)$ if

$$ \int_a^b \lambda ; \phi_n(x) ; \phi_m(x) ; dx = \left{\begin{matrix} \lambda & n = m \ 0 & n \neq m \end{matrix}\right. $$

(Lebesgue integral) and orthonormal when $\lambda = 1$ [Beauchamp, p. 4].

The Rademacher functions make up a set of orthonormal functions on interval $(0,1)$. Indeed, this is why they were interesting to Rademacher in the first place [@rad]. Because the proof is quick and intuitive, it is included here.

(Proof) If $n=m$, we get

$$ \int_0^1 ; r_n(x) ; r_m(x) ; dx = \int_0^1 ; r_n^2(x) ; dx = \int_0^1 ; 1 ; dx = 1. $$

If on the other hand $n \neq m$, the product of two Rademacher functions $r_n$ and $r_m$, $n < m$, is negative just as often as it is positive. We need to realize two things.

$r_k$ has equally many points $x$ at which $r_k(x) = 1$ as it has points at which $r_k(x) = -1$.
$r_{k+1}$ has twice as many oscillations as $r_k$. Each oscillation of $r_n$ is overlaid by some $2^p$ oscillations from $r_m$.

The result is that $r_n(x) r_m(x)$ is negative just as often as it is positive, in consequence the integral is zero. □

Completeness

The second property we are after is completeness. If we wish to approximate functions $f$ with a series of functions $\phi_k$ to arbitrary accuracy, we should be able to sum up $\phi_k$ for $k=0,1,2,\ldots$ in such a way that the error between $f$ and series converges to zero.

(Definition) A set of functions $\phi_k$ orthogonal in $(a,b)$ is called complete in that interval if for any piecewise continuous function $f$, coefficients $c_n$ can be picked such that they minimize the mean-square error $E_N$,

$$ E_N = \int_a^b {\bigg( f(x) - \sum_{n=1}^{N} c_n ; \phi_n(x) \bigg)}^2 dx, $$

(Lebesgue integral) that is $E_N \rightarrow 0$ as $N \rightarrow \infty$ [Mathworld] [Beauchamp, p. 4].

While functions $r_k$ are orthogonal on the unit, they are not complete [Beauchamp, p. 8]. Here, the trigonometric functions are more more capable. The set of sines and cosines is orthogonal and complete on $(-\pi,\pi)$, making the Fourier transform possible [Mathworld]. However, other complete sets of orthogonal functions exist, far beyond the traditional sines. One such set is the set of Walsh functions.

Summary

The Rademacher functions match our intuition about a discrete sine. Functions $r_k$ have the same sign as their trigonometric counterparts and are easy to parameterize. Evaluating $r_k$ is fast, regardless whether the argument is represented as a fixed point or floating point number. The downside is that the Rademacher functions are not complete and as such sophisticated applications such as the Fourier transform are not possible with $r_k$.

Function $r_1$.	Function $r_2$.
Function $r_3$.	Function $r_4$.