#### Putting Fürer Algorithm into Practice with the BPAS Library

##### Sviatoslav Covanov, Davood Mohajerani, Marc Moreno-Maza, Lin-Xiao Wang

Fast algorithms for integer and polynomial multiplication play an important role in scientific computing as well as in other disciplines. In 1971, Schönhage and Strassen designed an algorithm that multiplies two integers of at most $n$ bits in time $O(n \log n \log \log n)$. In 2007, Martin Fürer presented a new algorithm that runs in time $O \left(n \log n \cdot 2^{O(\log^* n)} \right)$, where $\log^* n$ is the iterated logarithm of $n$. We explain how Fürer's ideas can be put into practice for multiplying polynomials over a prime field $\mathbb{Z} / p \mathbb{Z}$, where $p$ is a generalized Fermat prime of the form $p = r^k + 1$, $k$ is a power of $2$, and $r$ is of machine word size. When $k$ is at least 8, we show that multiplication inside such a prime field can be efficiently implemented via the Fast Fourier Transform (FFT). Taking advantage of the Cooley-Tukey tensor formula and the fact that $r$ is a primitive $2k$-th root of unity in $\mathbb{Z} / p \mathbb{Z}$, we obtain an efficient implementation of the FFT over $\mathbb{Z} / p \mathbb{Z}$. This implementation outperforms comparable implementations that use either other encodings of $\mathbb{Z} / p \mathbb{Z}$ or other ways to perform multiplication in $\mathbb{Z} / p \mathbb{Z}$.
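To illustrate why it matters that $r$ is a primitive $2k$-th root of unity, here is a minimal Python sketch. It assumes elements of $\mathbb{Z} / p \mathbb{Z}$ are stored as $k$ digits in radix $r$, which is our reading of the encoding rather than something the abstract states. Because $r^k \equiv -1 \pmod p$, multiplying by a power of $r$ then reduces to a shift of the digit vector with sign changes on the digits that wrap around. The function names and the toy parameters $r = 6$, $k = 4$ are hypothetical, chosen for readability; this is not the BPAS API, and a real implementation would use a machine-word-sized $r$ and normalize the signed digits afterwards.

```python
# Toy parameters (assumption, for illustration only): p = r**k + 1 = 1297.
R, K = 6, 4
P = R**K + 1


def to_digits(x, r=R, k=K):
    """Radix-r digits of x, least significant first; assumes 0 <= x < r**k."""
    digits = []
    for _ in range(k):
        x, d = divmod(x, r)
        digits.append(d)
    return digits


def from_digits(digits, r=R, p=P):
    """Evaluate sum(digits[i] * r**i) mod p; digits may be negative."""
    acc = 0
    for d in reversed(digits):
        acc = (acc * r + d) % p
    return acc


def mul_by_power_of_r(digits, j):
    """Multiply by r**j using only a shift with sign flips (r**k = -1 mod p)."""
    k = len(digits)
    j %= 2 * k  # r has multiplicative order 2k
    out = [0] * k
    for i, d in enumerate(digits):
        pos, sign = i + j, 1
        while pos >= k:  # each wrap past position k contributes a factor of -1
            pos -= k
            sign = -sign
        out[pos] = sign * d
    return out


if __name__ == "__main__":
    x, j = 1000, 5
    shifted = mul_by_power_of_r(to_digits(x), j)
    # Both values agree: the shift computes x * r**j mod p without multiplying.
    print(from_digits(shifted), (x * pow(R, j, P)) % P)
```

In an FFT over $\mathbb{Z} / p \mathbb{Z}$, twiddle factors that are powers of $r$ can be applied this way, without a full modular multiplication.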
