# Rademacher Functions as Discrete Sines

This is a short piece on the family of Rademacher functions, taken from a piece I wrote for a class. As a plus, it illustrates how $\KaTeX$ gets rendered on this blog.

## Definition

Sines are great, but their continuous nature feels out of place for programmers more familiar with digital values. Tasked with finding a discrete alternative to the sine, we might come up with something that looks like a square wave $S_T$,

\[ S_T(x) = \left\{\begin{matrix} 1 & 0 \leq x < T/2 \\ -1 & T/2 \leq x < T \end{matrix}\right. \]

defined on some interval $[0, T)$. This definition fulfills our requirement for a digital function when values $\{-1, 1\}$ are encoded as binary zero and one. To be as close as possible to the familiar trigonometric functions, we would set $T=2 \pi$ for a square wave that looks very much like the familiar sine.

It should not come as a surprise that such functions have been
discovered and studied before. Indeed, the set of *Rademacher
functions* originally described by German mathematician Hans
Rademacher in 1922 [Original Paper] is in line with our intuition
about discrete sines.

**(Definition)** The *Rademacher functions* $r_k : \mathbb{R}
\rightarrow \{ -1, 1 \}$ are defined as

\[ r_k(x) = {(-1)}^{ \lfloor 2^k x \rfloor } \]

with $k \in \mathbb{N}$.

Note that the definition of $r_k$ above is based on Donald Knuth's formulation rather than the one originally proposed by Rademacher himself, as Knuth's definition is closer to what a digital computer might end up evaluating [Knuth, p. 8]. Another equivalent definition of the Rademacher functions is

\[ r_k(x) = \text{sgn}\big(\sin (2^k \pi x)\big) \]

when we fix $\text{sgn}(0) = 1$ [Mathworld]. This formulation is useful because it illustrates the relationship between Rademacher functions and their trigonometric cousins. However, it can only be illustrative; discrete alternatives for the trigonometric functions must not involve evaluating the sine.

## Parameters

Similar to how sines can be controlled with amplitude $A$, frequency $\omega$ and phase $\varphi$, it is easy to modify $r_k$ with parameters to control its amplitude $a$ and phase $p$. Less obvious is that the Rademacher functions have an intrinsic equivalent to frequency, namely the parameter $k$.

Let us investigate all these parameters in order. First of all,
multiplying $r_k$ with *amplitude* $a$ changes the image of $r_k$ such
that

\[ r_{k,a}(x) = a \; r_k(x) \; \in \{ -a, a \}. \]

This is not really the same as amplitude $A$ of the sines which changes the image to

\[ A \; \sin(x) \in [-A, A], \]

because the sine is a surjective mapping onto $[-A, A]$. But for a discrete
alternative, that is as far as we should go, maintaining the digital
nature of the Rademacher functions. As for the next parameter,
introducing a *shift* $p$ is equally trivial; merely adding $p$ to the
argument $x$ moves $r_k$ along the horizontal axis, viz.

\[ r_{k,p}(x) = r_k(x + p). \]

**(Theorem)** Now it is getting interesting. Intrinsic to each
Rademacher function $r_k$ is parameter $k$ which controls the number
of oscillations $r_k$ makes in the unit interval $[0, 1)$;
function $r_k$ includes exactly $2^{k-1}$ oscillations on $[0,1)$.

**(Proof)** With each increment of $k$, exponent $\xi(x) := \lfloor 2^k x \rfloor$
grows to twice its previous size. In addition, flooring $2^k x$ acts
like a filter that discards everything right of the decimal point. As a
result, $\xi(x)$ is a staircase function with its step width controlled
by $k$. With each increment of $k$, we get twice as many stairs of equal
width. When evaluating $r_k(x)$, each time a new step is reached, the
sign of $r_k(x)$ switches. This results in $2^{k-1}$ oscillations
on $[0,1)$. □

With parameters $a$, $p$ and $k$, the Rademacher functions could be an alternative to the sines. For example, parameterized Rademacher functions are used to modulate messages in digital communication [Mohanty]. Their advantage compared to the sine is that they are very easy to compute, as we will see now.

## Evaluating Rademacher Functions

In practical applications of the Rademacher functions, it is necessary to evaluate $r_k(x)$ in a reasonable amount of time. There are multiple ways to compute $r_k$ efficiently on digital hardware; this section introduces two approaches.

### Fixed Point Arithmetic

**(Theorem)** [Knuth, p. 8] notes that $r_k(x)$ can be computed
quite elegantly if the argument $x$ is represented as a fixed point
number, that is, as a binary string with a decimal point, viz.

\[ x = (\ldots \; c_2 \; c_1 \; c_0 \; . \; c_{-1} \; c_{-2} \ldots). \]

In that particular case we get

\[ r_k(x) = {(-1)}^{c_{-k}} = \left\{\begin{matrix} 1 & c_{-k} = 0\\ -1 & c_{-k} = 1 \end{matrix}\right. \]

**(Proof)** Multiplying $x$ by $2^k$ is equivalent to a shift to the
left by $k$ bits, or equivalently, to moving the point $k$ places to
the right. We get

\[ \rho(x) := 2^k x = (\ldots c_{-k} \; . \; c_{-(k+1)} \ldots). \]

Taking the floor of $\rho$ means cutting off all digits right of the decimal point. As such, the exponent $\lfloor \rho \rfloor$ is an even integer in case $c_{-k} = 0$ and odd when $c_{-k} = 1$, resulting in the formulation above. □

Fixed point numbers are used when more sophisticated floating point hardware is unavailable, for example on digital signal processors or embedded systems powered by microcontrollers [Kneusel, p. 117]. Knuth's unorthodox approach to evaluating $r_k$ might be a surprisingly good fit.
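Knuth's observation translates into a single bit test. The sketch below assumes a Q16.16 fixed point format (16 integer bits, 16 fractional bits); both the format choice and the function name are mine.

```c
#include <stdint.h>

/* x is given in Q16.16 fixed point. The digit c_{-k} from the proof
   above is bit (16 - k), so r_k(x) is just a shift and a mask.
   Assumes 1 <= k <= 16. */
int rademacher_q16(int k, uint32_t x)
{
    uint32_t c = (x >> (16 - k)) & 1u;
    return c ? -1 : 1;
}
```

For example, $x = 0.75$ is `0x0000C000` in Q16.16; its first fractional bit is set, so $r_1(0.75) = -1$.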

### Floating Point

On a system that represents fractional values as some kind of floating
point number, like the one described by IEEE 754 [IEEE]
[Kneusel, pp. 101], evaluating $r_k$ is also fast. The listing
below illustrates this in C-like code, where
function `rademacher(k, x)` computes $r_k(x)$.

```
#include <math.h>

float rademacher(int k, float x)
{
    // Compute x = (2^k) * x. Because `x' is represented as a floating
    // point number, ldexpf simply adds `k' to its exponent.
    x = ldexpf(x, k);
    // We are only interested in what is left of the decimal point.
    int p = (int) x;
    // Equivalent to computing (-1)^p.
    return (p % 2 == 0) ? 1 : -1;
}
```

On modern machines, all instructions required to execute `rademacher`
should run in constant time, which should be obvious for incrementing
`x`'s exponent and branching over `p`. Converting floating point `x`
to integer `p`, effectively flooring `x`, can be fast when hardware
support is available, for example via the `fistp` instruction
on the ubiquitous x86_64 architecture [Intel, Vol. 2A 3-355].

Regardless of whether we are using low-powered embedded hardware with fixed point numbers or more sophisticated computers with floating point units, evaluating $r_k$ is a simple operation that should not be a bottleneck for any application.

## Orthogonality and Completeness

The process known as *Fourier transform* allows us to approximate
functions on some interval to arbitrary accuracy. Very roughly
speaking, it is based around infinite sums of sines and cosines. This
is not possible with the Rademacher functions. A deep dive into this
topic is out of scope for this article; for now we only wish to
investigate why it is not feasible to use the Rademacher functions for
this task.

We will define two related concepts and try to gain an intuition for
what they mean. What we are looking for, when we wish to build
something like the Fourier transform, is a set of functions $\phi_k$
that is *orthogonal* and *complete* on some interval $(a,b)$, as sums
of such functions $\phi_k$ can be used to approximate periodic
functions to arbitrary accuracy. Let us begin with the idea behind
orthogonality.

### Orthogonality

Just like orthonormal vectors form a basis for a vector space, orthogonal sets of functions form a basis for function spaces.

**(Definition)** A set of functions $\phi_k$ is called *orthogonal* in
interval $(a, b)$ if

\[ \int_a^b \phi_n(x) \; \phi_m(x) \; dx = \left\{\begin{matrix} \lambda & n = m \\ 0 & n \neq m \end{matrix}\right. \]

(Lebesgue integral) and *orthonormal* when $\lambda = 1$ [Beauchamp, p. 4].

The Rademacher functions make up a set of orthonormal functions on the interval $(0,1)$. Indeed, this is why they were interesting to Rademacher in the first place. Because the proof is quick and intuitive, it is included here.

**(Proof)** If $n=m$, we get

\[ \int_0^1 \; r_n(x) \; r_m(x) \; dx = \int_0^1 \; r_n^2(x) \; dx = \int_0^1 \; 1 \; dx = 1. \]

If on the other hand $n \neq m$, assume $n < m$ without loss of generality. We need to realize two things.

1. $r_k$ takes the value $1$ on exactly half of $[0,1)$ and the value $-1$ on the other half.

2. $r_{k+1}$ has twice as many oscillations as $r_k$, so each oscillation of $r_n$ is overlaid by $2^{m-n}$ whole oscillations of $r_m$.

The result is that the product $r_n(x) \, r_m(x)$ is negative just as often as it is positive, and in consequence the integral is zero. □

### Completeness

The second property we are after is *completeness.* If we wish to
approximate functions $f$ with a series of functions $\phi_k$ to
arbitrary accuracy, we should be able to sum up $\phi_k$
for $k=0,1,2,\ldots$ in such a way that the error between $f$ and series
converges to zero.

**(Definition)** A set of functions $\phi_k$ orthogonal in $(a,b)$ is
called *complete* in that interval if for any piecewise continuous
function $f$, coefficients $c_n$ can be picked such that they minimize
the mean-square error $E_N$,

\[ E_N = \int_a^b {\bigg( f(x) - \sum_{n=1}^{N} c_n \; \phi_n(x) \bigg)}^2 dx, \]

(Lebesgue integral) that is $E_N \rightarrow 0$ as $N \rightarrow \infty$ [Mathworld] [Beauchamp, p. 4].

While the functions $r_k$ are orthogonal on the unit interval, they are not complete [Beauchamp, p. 8]. Here, the trigonometric functions are more capable. The set of sines and cosines is orthogonal and complete on $(-\pi,\pi)$, making the Fourier transform possible [Mathworld]. However, other complete sets of orthogonal functions exist, far beyond the traditional sines. One such set is the set of Walsh functions.

## Summary

The Rademacher functions match our intuition about a discrete sine. Functions $r_k$ have the same sign as their trigonometric counterparts and are easy to parameterize. Evaluating $r_k$ is fast, regardless of whether the argument is represented as a fixed point or a floating point number. The downside is that the Rademacher functions are not complete, so sophisticated applications such as the Fourier transform are not possible with $r_k$.