Abstracts
Long talks:
- Factoring integers with ECM on the Kalray MPPA-256 processor
Jérémie Detrey (Inria, LORIA) The Kalray MPPA-256 is a recent low-power chip embedding 256 32-bit cores. As such, it can be used as an efficient co-processor for accelerating computations, bridging the gap between conventional CPUs and the more throughput-oriented GPUs, all the while consuming far less power. In this talk, we will present a fast and energy-efficient implementation of the Elliptic Curve Method (ECM, an algorithm for factoring integers) on this platform. After a brief presentation of ECM and of its important role in cryptanalysis, especially in the context of factoring RSA keys, we will glance at some of the key architectural features of the MPPA-256. We will then present an optimized library for multiprecision modular arithmetic, on top of which the ECM implementation was built, before concluding with a few benchmarks and comparisons with the state of the art. This is joint work with Masahiro Ishii, Pierrick Gaudry, Atsuo Inomata, and Kazutoshi Fujikawa.
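To convey the idea behind ECM, here is a minimal Python sketch of a toy stage 1 on short Weierstrass curves with affine arithmetic; the curve model, bound `B`, number of curves and the helper names are illustrative choices, not the Montgomery-curve, MPPA-optimized implementation discussed in the talk.

```python
import math, random

def ecm_stage1(N, B=200, curves=40):
    """Toy ECM stage 1: returns a non-trivial factor of N, or None.

    Picks random curves y^2 = x^3 + a*x + b (mod N) through a random point
    and multiplies that point by a very smooth scalar; a failing modular
    inversion reveals a factor of N via a gcd."""
    k = math.lcm(*range(2, B + 1))           # smooth multiplier (Python >= 3.9)

    class Factor(Exception):
        pass

    for _ in range(curves):
        x, y, a = (random.randrange(N) for _ in range(3))
        b = (y * y - x * x * x - a * x) % N  # so that (x, y) lies on the curve

        def add(P, Q):
            if P is None: return Q           # None encodes the point at infinity
            if Q is None: return P
            (x1, y1), (x2, y2) = P, Q
            if x1 == x2 and (y1 + y2) % N == 0:
                return None
            num, den = (3*x1*x1 + a, 2*y1) if P == Q else (y2 - y1, x2 - x1)
            g = math.gcd(den % N, N)
            if 1 < g < N:
                raise Factor(g)              # the "failure" ECM is waiting for
            if g == N:
                return None
            lam = num * pow(den, -1, N) % N  # modular inverse (Python >= 3.8)
            x3 = (lam * lam - x1 - x2) % N
            return (x3, (lam * (x1 - x3) - y1) % N)

        try:                                  # double-and-add scalar multiplication
            R, P, m = None, (x, y), k
            while m:
                if m & 1:
                    R = add(R, P)
                P = add(P, P)
                m >>= 1
        except Factor as e:
            return e.args[0]
    return None

print(ecm_stage1(101 * 103 * (2**31 - 1)))   # prints a non-trivial factor
```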
- Basic arithmetic in Flint and Nemo
Bill Hart (Technische Universität Kaiserslautern) We will discuss the latest developments in polynomial arithmetic and linear algebra in the Flint C library and in the new Julia package Nemo. We will show some benchmarks, including parallel benchmarks, and discuss what the future holds.
- Validated numerics for robust space mission design
Mioara Joldes (CNRS, LAAS) In this talk, we give an overview of several symbolic-numeric algorithms and software tools developed for robust space mission design. In this domain, certification of computations is at stake, and we aim at providing computer-aided proofs of numerical values, with validated and reasonably tight error bounds, without sacrificing efficiency. Applications include the reliable computation of collision probabilities between space objects in low Earth orbit and the computation of validated impulsive control for the spacecraft rendezvous problem. We mainly make use of D-finite functions and their truncated Taylor or Chebyshev series, together with rigorously computed error bounds. This talk is based on joint works with D. Arzelier, F. Bréhard, N. Brisebarre, J.-B. Lasserre, C. Louembet, A. Rondepierre, B. Salvy, and R. Serra.
Talks:
- On rounding error resilience, maximal attainable accuracy and parallel performance of the pipelined Conjugate Gradients method for large-scale linear systems in PETSc
Emmanuel Agullo (Inria, LaBRI) Pipelined Krylov solvers typically offer improved strong scaling on parallel HPC hardware compared to standard Krylov methods for large and sparse linear systems. In pipelined Krylov algorithms, the traditional synchronization bottleneck is mitigated by overlapping time-consuming global communications with useful computations. However, to achieve this communication-hiding strategy, pipelined methods introduce additional recurrence relations for a number of auxiliary variables that are required to update the guess for the solution. This work studies the influence of rounding errors on the convergence of the pipelined Conjugate Gradient (CG) method. We explain why roundoff effects have a significantly larger impact on the accuracy and convergence of the pipelined CG algorithm than on the traditional CG method. Furthermore, a model for the gap between the true residual and the recursively computed residual used in the algorithm is proposed. Based on this error model, we suggest an automated residual replacement strategy to reduce the effect of rounding errors on the final iterative solution. The resulting pipelined CG method with residual replacement improves the maximal attainable accuracy of pipelined CG while maintaining the efficient parallel performance of the pipelined method. This conclusion is substantiated by numerical results for a variety of benchmark problems.
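The gap between the recursively updated residual and the true residual, and the residual replacement idea, can be illustrated on the standard (non-pipelined) CG method; the NumPy sketch below is only a conceptual illustration with an arbitrary test matrix, not the PETSc pipelined variant studied in the talk.

```python
import numpy as np

def cg(A, b, iters=200, replace_every=None):
    """Conjugate Gradients with a recursively updated residual r.

    If replace_every is set, r is periodically replaced by the true
    residual b - A @ x (a simple residual replacement strategy)."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rr = r @ r
    gaps = []
    for k in range(1, iters + 1):
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap                       # recursive residual update
        if replace_every and k % replace_every == 0:
            r = b - A @ x                     # residual replacement
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
        gaps.append(np.linalg.norm((b - A @ x) - r))  # true vs recursive residual
    return x, gaps

# Ill-conditioned SPD test matrix (hypothetical example, not from the paper)
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((200, 200)))
A = Q @ np.diag(np.logspace(0, 10, 200)) @ Q.T
b = rng.standard_normal(200)
_, gaps_plain = cg(A, b)
_, gaps_rr = cg(A, b, replace_every=30)
print(f"final gap without replacement: {gaps_plain[-1]:.2e}")
print(f"final gap with    replacement: {gaps_rr[-1]:.2e}")
```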
- Model predictive control for spacecraft rendezvous hovering phases based on validated Chebyshev series approximations of the transition matrices
Paulo Ricardo Arantes Gilz (CNRS, LAAS) During spacecraft proximity missions, satisfying a set of space constraints is decisive for the successful achievement of the mission. However, the ability to design algorithms that fulfill these constraints depends directly on how precisely the trajectories of the spacecraft involved in the mission are known. Since precision is essential for these procedures, preventing and estimating the errors arising from approximations and numerical computations becomes a critical issue. The objective of this work is to provide a validated model predictive control algorithm for the hovering phases of orbital spacecraft rendezvous missions. The novelty is to employ a propagation method for approximate relative trajectories that provides bounds on the approximation errors committed. The validated model predictive controller is then based on iteratively solved optimization problems built on a two-fold approach: a) a method that generates Chebyshev series approximations of the solutions of linear ordinary differential equations; b) the conversion of polynomial non-negativity constraints into linear matrix inequalities (LMIs), allowing the optimization problems to be reformulated as semidefinite programs (SDPs).
- Newton-like validation of truncated Chebyshev series approximations of solutions of linear ODEs
Florent Bréhard (ENS Lyon, LIP) Obtaining rigorous error bounds for numerical solutions of ordinary differential equations (ODEs) is a major challenge, both for critical applications (aeronautics, aerospace, medicine, etc.) and for computer-assisted mathematical proofs. Often, nonlinear problems that are difficult to study can be linearized into linear ordinary differential equations (LODEs) around an equilibrium point or trajectory.
We will present an a posteriori validation algorithm which, given an approximate solution of an LODE expressed in the Chebyshev basis, returns a rigorous error bound. As is often the case in certified numerical analysis, the Banach fixed-point theorem plays a key role. We will then see how this procedure extends to the vector case (systems of coupled LODEs). Finally, we will illustrate the method with examples from space mechanics, such as the rendezvous problem and the station keeping of a satellite.
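Schematically, the fixed-point argument behind such a posteriori validation can be stated as follows (the notation is chosen here for illustration and may differ from that of the talk):

```latex
% L  : the linear differential/integral operator,  g : the right-hand side,
% x° : the approximate truncated Chebyshev solution,
% A  : a numerically computed approximate inverse of L.
\[
  T(x) = x - A\,(L x - g), \qquad
  \text{if } \|I - A L\| \le \lambda < 1 \text{ on a suitable coefficient space,}
\]
\[
  \text{then } \|x^\star - x^\circ\|
  \;\le\; \frac{\|A\,(L x^\circ - g)\|}{1-\lambda},
\]
% i.e. a rigorous error bound on the exact solution x* is obtained from the
% residual of x°, all norms being evaluated with interval arithmetic.
```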
- Modified FMA for exact low precision product accumulation
Nicolas Brunie (Kalray) The implementation of the Fused Multiply and Add (FMA) has been extensively studied in the literature for standard and large precisions. We suggest revisiting those studies for low precisions (16 bits and under). We introduce a variation of the mixed-precision FMA targeted at applications processing low-precision inputs (e.g. machine learning). We also introduce several versions of a fixed-point-based floating-point FMA which performs an exact accumulation of binary16 numbers. We study the implementation and area footprint of these operators in comparison with standard FMAs.
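A software analogue of exact binary16 product accumulation can be sketched in a few lines (this only illustrates the principle of a wide fixed-point accumulator, not the hardware operator of the talk): the product of two binary16 numbers is exact in binary64, and every such product is an integer multiple of 2^-48, so products can be added exactly in an unbounded integer.

```python
import numpy as np

def exact_fp16_dot(a, b):
    """Exactly accumulate the products of two binary16 vectors.

    Each product a[i]*b[i] is exact in binary64 (at most 22 significand bits)
    and is an integer multiple of 2**-48 (smallest subnormal squared), so
    scaling by 2**48 turns it into an exact Python integer."""
    acc = 0                                   # unbounded integer accumulator
    for x, y in zip(a.astype(np.float64), b.astype(np.float64)):
        acc += int(x * y * 2.0**48)           # exact: product times a power of two
    return acc * 2.0**-48                     # single rounding at the very end

rng = np.random.default_rng(1)
a = rng.standard_normal(10_000).astype(np.float16)
b = rng.standard_normal(10_000).astype(np.float16)
naive = np.float16(0)
for x, y in zip(a, b):
    naive = np.float16(naive + x * y)         # plain binary16 accumulation
print("exact :", exact_fp16_dot(a, b))
print("naive :", float(naive))
```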
- Verificarlo: a tool for detecting numerical instabilities using Monte Carlo arithmetic
Yohan Chatelain (UVSQ, Li-Parad) With the growth of computing power, numerical simulation codes incorporate ever more detail into their computational models. This complexity translates into increasingly large and intricate programs. Pinpointing the regions responsible for a degradation of numerical quality in such programs can be costly in time and human effort.
In this context, Verificarlo is a compiler that makes it possible to study the numerical sensitivity of a code through Monte Carlo Arithmetic (MCA), a stochastic arithmetic that models floating-point errors as a random variable. An interesting property of MCA is that it allows the number of significant digits of a given variable to be computed.
We have extended Verificarlo to clearly visualize the number of significant digits of a given variable over time. In this talk, we will see how this visualization lets us identify numerical instabilities, and how it lets us visualize the impact of compensated methods on the numerical stability of an industrial code.
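The core idea of MCA can be sketched in a few lines: perturb the result of every operation by a random relative error at a chosen virtual precision, run the computation many times, and read the number of significant digits off the dispersion of the samples. The sketch below is a simplified model of random rounding, not the Verificarlo instrumentation itself; the example expression is arbitrary.

```python
import math, random

def mca(x, t=53):
    """Random-rounding model: add a uniform relative perturbation of
    magnitude at most 2**(1-t) to x (t = virtual precision in bits)."""
    return x * (1.0 + (2.0 * random.random() - 1.0) * 2.0 ** (1 - t))

def significant_digits(a, b, samples=1000, t=24):
    """Evaluate (a + b) - a with perturbed arithmetic and estimate the
    number of significant decimal digits of the result."""
    results = [mca(mca(a + b, t) - a, t) for _ in range(samples)]
    mean = sum(results) / samples
    std = math.sqrt(sum((r - mean) ** 2 for r in results) / samples)
    return mean, -math.log10(std / abs(mean))

print(significant_digits(1.0e8, 1.0))  # heavy cancellation: almost no digits left
print(significant_digits(1.0, 1.0))    # benign case: close to the virtual precision
```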
- FP-ANR: a floating-point representation format for handling cancellations at run time
David Defour (UPVD, LIRMM) Using floating-point numbers is delicate, since many sources of error degrade the numerical quality of the data being manipulated. Among them, cancellation, which corresponds to the subtraction of two close floating-point numbers, can have disastrous effects. Yet there is no way to quantify this type of error a posteriori, other than through ad hoc software tools. In this talk, we will present a new floating-point representation format called "Floating-Point Adaptive Noise Reduction". This format embeds, directly in the representation of floating-point numbers, information about the initial uncertainties and about those generated by cancellations. It thus becomes possible to propagate uncertainties throughout the computation, all the way to the final result.
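A minimal illustration of the cancellation phenomenon that FP-ANR targets (this only shows the problem; the FP-ANR encoding itself is described in the talk, and the example function is arbitrary):

```python
import math

x = 1.2e-8
naive  = (1.0 - math.cos(x)) / (x * x)               # 1 - cos(x) cancels badly
stable = 0.5 * (math.sin(0.5 * x) / (0.5 * x)) ** 2  # same function, no cancellation
print(naive, stable)   # ~0.77 vs ~0.5: cancellation silently destroyed the result
```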
- Rounding errors of explicit numerical integration methods
Florian Faissole (Inria, LRI) Ordinary differential equations are frequently studied in scientific computing. Except in special cases, it is generally impossible to solve these equations exactly, which is why numerical methods are used to obtain a discretized solution. We focus on some of these methods, such as Euler's method and the Runge-Kutta methods. Their implementation uses floating-point numbers, which can lead to an accumulation of rounding errors. In order to guarantee the accuracy of these methods, we propose a fine-grained analysis of the rounding errors, yielding explicit and reasonable bounds. We take into account the mathematical properties of such methods, in particular linear stability, a property that guarantees their good behavior from a mathematical point of view.
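The accumulation of rounding errors can be observed by running the explicit Euler method on a simple linear ODE in two precisions; the sketch below is a generic illustration (test equation and step counts are arbitrary), not the certified bounds of the talk.

```python
import numpy as np

def euler(f, y0, t_end, n, dtype):
    """Explicit Euler with n steps, all arithmetic forced to `dtype`."""
    h = dtype(t_end) / dtype(n)
    y = dtype(y0)
    for _ in range(n):
        y = dtype(y + h * f(y))
    return y

f = lambda y: -y                       # y' = -y, exact solution exp(-t)
exact = float(np.exp(-1.0))
for n in (10**2, 10**4, 10**6):        # the largest run takes a few seconds
    y32 = euler(f, 1.0, 1.0, n, np.float32)
    y64 = euler(f, 1.0, 1.0, n, np.float64)
    # the method error shrinks with n, but the float32 rounding error takes over
    print(n, abs(float(y32) - exact), abs(float(y64) - exact))
```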
- A lattice basis reduction approach for the design of quantized FIR filters
Silviu Filip (University of Oxford, Mathematical Institute) Digital filtering tasks are important to many signal processing applications. In many cases, they are used in embedded settings where only limited-precision, fixed-point operators are available.
I will focus on a novel approach for the design of optimized FIR digital filters with finite-precision coefficients. Our approach combines ideas from algorithmic number theory (Euclidean lattice basis reduction) and approximation theory (good families of points for polynomial approximation). To show the effectiveness of our method, we compare it with other state-of-the-art alternatives.
This is joint work with Nicolas Brisebarre and Guillaume Hanrot.
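To see why coefficient quantization is a non-trivial problem, the baseline approach simply rounds each coefficient independently and degrades the frequency response; the talk's lattice-reduction method searches for a much better joint rounding. The sketch below only shows the baseline on an ad hoc filter (the taps, word lengths and grid are illustrative assumptions).

```python
import numpy as np

# Hypothetical 32-tap low-pass FIR filter designed in double precision
taps = np.hanning(32) * np.sinc(0.25 * (np.arange(32) - 15.5))
taps /= taps.sum()

def quantize(h, bits):
    """Round each coefficient independently to a `bits`-bit fixed-point value."""
    scale = 2.0 ** (bits - 1)
    return np.round(h * scale) / scale

def max_error(h, href, n=4096):
    """Maximum deviation of the frequency response |H - Href| on a grid."""
    w = np.linspace(0, np.pi, n)
    E = np.exp(-1j * np.outer(w, np.arange(len(h))))
    return np.max(np.abs(E @ (h - href)))

for bits in (8, 12, 16):
    print(bits, "bits:", max_error(quantize(taps, bits), taps))
```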
- Multiple precision floating point arithmetic on SIMD processors
Joris van der Hoeven (CNRS, LIX) Current general purpose libraries for multiple precision floating point arithmetic such as MPFR suffer from a large performance penalty with respect to hard-wired instructions. The performance gap tends to become even larger with the advent of wider SIMD arithmetic in both CPUs and GPUs. We present efficient algorithms for multiple precision floating point arithmetic that are suitable for implementations on SIMD processors.
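One classical building block for extending precision is the error-free transformation of a sum, which vectorizes trivially across SIMD lanes; here is a NumPy sketch of Knuth's TwoSum applied element-wise (the talk's algorithms and data layout are of course more elaborate and may differ).

```python
import numpy as np

def two_sum(a, b):
    """Error-free transformation: returns (s, e) with s = fl(a+b) and
    a + b = s + e exactly, element-wise on whole vectors ("SIMD lanes")."""
    s = a + b
    bp = s - a
    e = (a - (s - bp)) + (b - bp)
    return s, e

a = np.array([1.0, 1e16, 3.14], dtype=np.float64)
b = np.array([1e-20, 1.0, 2.7e-18], dtype=np.float64)
s, e = two_sum(a, b)
print(s)          # the rounded sums
print(e)          # the exact rounding errors, recovered lane-wise
```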
- Hierarchical Approach for Deriving a Reproducible LU factorization
Roman Iakymchuk (KTH Royal Institute of Technology, Sweden) In this talk, I will present a reproducible variant of the unblocked LU factorization for graphics processing units (GPUs). For this purpose, we build upon Level-1/2 BLAS kernels that deliver correctly rounded and reproducible results for the dot (inner) product, vector scaling, and the matrix-vector product. In addition, we draw a strategy to enhance the accuracy of the triangular solve via iterative refinement. Following a bottom-up approach, we finally construct a reproducible unblocked implementation of the LU factorization for GPUs, which accommodates partial pivoting for stability and can eventually be integrated into a high-performance and stable algorithm for the (blocked) LU factorization.
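The iterative refinement step mentioned above can be sketched as follows; this is a generic NumPy illustration with an arbitrary test system, not the GPU kernels of the talk.

```python
import numpy as np

def forward_sub(L, b):
    """Forward substitution for a lower-triangular system L x = b."""
    x = np.zeros_like(b)
    for i in range(len(b)):
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

def refined_trsv(L, b, steps=2):
    """Triangular solve followed by iterative refinement:
    each step solves L d = r for the residual r = b - L x."""
    x = forward_sub(L, b)
    for _ in range(steps):
        r = b - L @ x                 # ideally accumulated in extended precision
        x = x + forward_sub(L, r)
    return x

rng = np.random.default_rng(2)
n = 300
L = np.tril(rng.standard_normal((n, n))) + n * np.eye(n)
b = rng.standard_normal(n)
x = refined_trsv(L, b)
print(np.linalg.norm(b - L @ x))      # residual after refinement
```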
- Randomized Mixed-Radix Scalar Multiplication
Laurent Imbert (CNRS, LIRMM) Computing the scalar multiple of a point of an elliptic curve is a fundamental operation in curve-based cryptography. In this talk, I’ll present a new randomized yet constant-time algorithm for performing this operation with built-in countermeasures against a large class of side-channel attacks. Our algorithm uses a mixed-radix representation of the scalar.
- MPDI: a multiple-precision decimal interval arithmetic library
Clothilde Jeangoudoux (UPMC, LIP6) We describe the mechanisms and the implementation of a library providing multiple-precision decimal interval arithmetic. The goal of this library is to deliver an accurate and certified result in decimal. It contains correctly rounded elementary arithmetic operations, as well as a few transcendental functions such as sin, cos, and exp. Application domains are numerous and varied, ranging from financial computing to aeronautical engineering. In these fields, software is designed to meet a set of requirements expressed on decimal numbers, yet it runs on machines that perform binary operations. Conversions between decimal and binary numbers are unavoidable and introduce errors that can be amplified during the computation. There is therefore a real need for a correctly rounded multiple-precision decimal interval arithmetic library. Moreover, the chosen decimal representation is compatible with the IEEE 754-2008 standard for floating-point arithmetic and conforms to the interval format described in the IEEE 1788-2015 standard for interval arithmetic.
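The directed-rounding mechanism behind such a library can be imitated with Python's decimal module, which offers arbitrary precision and per-operation rounding modes; the toy sketch below only conveys the principle (the precision, operations and interval representation are illustrative choices, not the MPDI design).

```python
from decimal import Decimal, Context, ROUND_FLOOR, ROUND_CEILING

PREC = 34                                   # decimal128-like working precision
DOWN = Context(prec=PREC, rounding=ROUND_FLOOR)    # rounds toward -infinity
UP   = Context(prec=PREC, rounding=ROUND_CEILING)  # rounds toward +infinity

def iadd(x, y):
    """Interval addition [x] + [y] with outward decimal rounding."""
    return (DOWN.add(x[0], y[0]), UP.add(x[1], y[1]))

def imul(x, y):
    """Interval multiplication: min/max of the four directed products."""
    lo = min(DOWN.multiply(a, b) for a in x for b in y)
    hi = max(UP.multiply(a, b) for a in x for b in y)
    return (lo, hi)

def idiv_scalar(x, d):
    """Divide an interval by a positive decimal d with outward rounding."""
    return (DOWN.divide(x[0], d), UP.divide(x[1], d))

a = (Decimal("0.1"), Decimal("0.1"))        # the decimal number 0.1, exactly
b = (Decimal(2), Decimal(3))
print(iadd(a, b))
print(imul(a, b))
print(idiv_scalar((Decimal(1), Decimal(1)), Decimal(3)))  # 34-digit enclosure of 1/3
```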
- Simultaneous conversions between classical and modular representations using linear algebra
Romain Lebreton (UM2, LIRMM) The modular representation of integers is widely used, notably for its ability to reduce computations on large values to computations carried out in parallel on numbers of a chosen size. We present a new algorithm for converting integers between the classical positional representation and the modular one. This algorithm reduces the bulk of the conversion work to linear algebra on machine words.
Our algorithm is naturally designed to convert several integers simultaneously. In this case, its theoretical complexity improves on that of the naive methods, without however reaching the quasi-linear complexity of the fast algorithms. But it is in practice that our algorithm fares best: because it benefits from optimized linear algebra implementations, our program achieves the best running time among conversion algorithms over a wide intermediate range of matrix and integer sizes.
The main application of our algorithm is the multiplication of matrices with integer coefficients, and our experiments show running-time improvements over that same intermediate range of matrix and integer sizes.
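The underlying idea can be sketched as follows: write each integer in radix 2^16 as a row of digits, precompute the residues of the powers of 2^16 modulo each modulus, and a single matrix product then yields (up to a final cheap reduction) the residues of all integers modulo all moduli. This NumPy sketch only conveys the principle; the sizes, radix and moduli are arbitrary choices, not those of the talk.

```python
import numpy as np

BETA = 1 << 16                               # radix: one 16-bit digit per column

def to_digits(values, ndigits):
    """Row i holds the radix-2^16 digits of values[i] (least significant first)."""
    D = np.zeros((len(values), ndigits), dtype=np.int64)
    for i, v in enumerate(values):
        for j in range(ndigits):
            D[i, j] = v & (BETA - 1)
            v >>= 16
    return D

def simultaneous_residues(values, moduli, ndigits):
    D = to_digits(values, ndigits)           # (n_values x ndigits) digit matrix
    # (ndigits x n_moduli) matrix of BETA^j mod m; entries are small enough
    # for the int64 matrix product below to be exact
    P = np.array([[pow(BETA, j, m) for m in moduli] for j in range(ndigits)],
                 dtype=np.int64)
    R = D @ P                                # the bulk of the work: one matrix product
    return R % np.array(moduli, dtype=np.int64)

moduli = [65521, 65519, 65497]               # arbitrary word-size moduli
values = [123456789123456789, 987654321987654321, 2**120 - 1]
print(simultaneous_residues(values, moduli, ndigits=8))
print([[v % m for m in moduli] for v in values])   # check against direct reduction
```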
- Enclosures of Roundoff Errors using SDP
Victor Magron (CNRS, Verimag) Roundoff errors cannot be avoided when implementing numerical programs with finite precision. The ability to reason about rounding is especially important if one wants to explore a range of potential representations, for instance in the world of FPGAs. This problem becomes challenging when the program does not employ solely linear operations as non-linearities are inherent to many computational problems in real-world applications.
We present two frameworks which can be combined to provide interval enclosures of the maximal absolute roundoff error.
The framework for upper bounds is based on optimization techniques employing semidefinite programming (SDP) and sparse sums of squares certificates, which can be formally checked inside the Coq theorem prover.
The framework for lower bounds is based on a new hierarchy of convergent robust SDP approximations for certain classes of polynomial optimization problems. Each problem in this hierarchy can be exactly solved via SDP.
Our tool covers a wide range of nonlinear programs, including polynomials and transcendental operations as well as conditional statements. We illustrate the efficiency and precision of this tool on non-trivial programs coming from biology, optimization and space control.
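Schematically, the upper-bound framework relies on the standard rounding model and on relaxing the resulting polynomial optimization problem; the notation below is illustrative.

```latex
% Each floating-point operation introduces a bounded relative perturbation:
\[
  \mathrm{fl}(x \,\mathrm{op}\, y) = (x \,\mathrm{op}\, y)(1+\varepsilon),
  \qquad |\varepsilon| \le 2^{-p}.
\]
% Writing \hat{f}(x, \varepsilon) for the computed value and f(x) for the exact
% one, the quantity to bound is
\[
  r^\star \;=\; \max_{x \in X,\ |\varepsilon_i| \le 2^{-p}}
    \bigl|\hat{f}(x,\varepsilon) - f(x)\bigr| ,
\]
% a polynomial (or semialgebraic) optimization problem over the input box X and
% the perturbations, relaxed via SDP and sums-of-squares certificates.
```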
- Validating Semidefinite Programming Solvers for Polynomial Invariants
Pierre Roux (ONERA) Semidefinite programming (SDP) solvers are increasingly used as primitives in many program verification tasks to synthesize and verify polynomial invariants for a variety of systems, including programs, hybrid systems, and stochastic models. On the one hand, they provide a tractable alternative to reasoning about semi-algebraic constraints. On the other hand, the results are often unreliable due to "numerical issues", a term that covers a large number of causes such as floating-point rounding errors, ill-conditioned problems, failure of strict feasibility, and, more generally, the specifics of the algorithms used to solve SDPs. These issues influence whether the final numerical results are trustworthy or not.
In this talk, we report on the perils of using SDP solvers for common invariant synthesis tasks, characterizing the common failures that can lead to unreliable answers. Next, we review existing tools for guaranteed semidefinite programming and show that they often prove inadequate for our needs. Finally, we present a solution for verified semidefinite programming that can be used to check the reliability of the solution output by the solver. Since this verification is based on floating-point computations, it incurs only a very small overhead on the running time of the solver.
We will also demonstrate an implementation in the Coq proof assistant to obtain increased confidence in the non-trivial arguments involved, among which are error bounds on a floating-point matrix algorithm (a Cholesky decomposition).
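The flavor of that verification step can be conveyed by a short sketch: to certify that a matrix Q returned by an SDP solver is positive semidefinite despite rounding, one runs a floating-point Cholesky factorization on Q shifted by a bound on the accumulated rounding error; success of the factorization then proves positive definiteness. The shift used below is a placeholder, not the rigorously derived bound presented in the talk, and the test matrix is hypothetical.

```python
import numpy as np

def looks_psd(Q, shift):
    """Attempt a floating-point Cholesky factorization of Q - shift*I.

    If it succeeds and `shift` dominates the rounding error of the
    factorization, then Q is provably positive semidefinite."""
    try:
        np.linalg.cholesky(Q - shift * np.eye(len(Q)))
        return True
    except np.linalg.LinAlgError:
        return False

# Hypothetical example: a matrix as it might come out of an SDP solver
rng = np.random.default_rng(3)
M = rng.standard_normal((50, 50))
Q = M @ M.T + 1e-6 * np.eye(50)
shift = 1e-10 * np.trace(Q)       # placeholder bound, NOT the certified one
print(looks_psd(Q, shift))
```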
- Bridging High-Level Synthesis and Application-Specific Arithmetic: The Case Study of Floating-Point Summations
Yohann Uguen (INSA Lyon) FPGAs are well known for their ability to perform non-standard computations not supported by classical microprocessors. Many libraries of highly customizable application-specific IPs have exploited this capability. However, using such IPs usually requires handcrafted HDL, hence significant design effort. High-Level Synthesis (HLS) lowers the design effort thanks to the use of C/C++ dialects for programming FPGAs. However, the high-level C language becomes a hindrance when one wants to express non-standard computations: this language was designed for programming microprocessors and carries many restrictions inherited from that paradigm. This is especially true when computing with floating point, whose data types and evaluation semantics are defined by the IEEE-754 and C11 standards. If the high-level specification is a computation on the reals, then HLS imposes a very restricted implementation space. This work attempts to bridge FPGA application-specific efficiency and HLS ease of use. It specifically targets the ubiquitous floating-point summation-reduction pattern. A source-to-source compiler transforms selected floating-point additions into sequences of simpler operators using non-standard arithmetic formats. This improves performance and accuracy on several benchmarks, while keeping the ease of use of a high-level C description.
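The restriction mentioned above stems from the non-associativity of floating-point addition: a compiler honoring IEEE-754/C11 semantics cannot freely reorder a summation, since different orders yield different results. The small NumPy illustration below demonstrates the phenomenon only; it is independent of the HLS tool flow discussed in the talk.

```python
import numpy as np

rng = np.random.default_rng(4)
x = (rng.standard_normal(100_000) * 1e6).astype(np.float32)

def seq_sum(v):
    """Left-to-right float32 accumulation, as written in a plain C loop."""
    s = np.float32(0)
    for e in v:
        s = np.float32(s + e)
    return s

print(float(seq_sum(x)))                    # sequential float32 sum
print(float(np.sum(x)))                     # NumPy's (pairwise) float32 sum: differs
print(float(np.sum(x, dtype=np.float64)))   # float64 reference
```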
- Reliable verification of digitally implemented filters against frequency specifications
Anastasia Volkova (UPMC, LIP6) Reliable implementation of linear digital filters in finite precision relies on accurate error analysis. However, a small error in the time domain does not guarantee that the implemented filter satisfies the initial band specifications in the frequency domain. We propose a novel certified algorithm for the verification of a filter's transfer function, or of an existing finite-precision implementation. We show that this problem boils down to the verification of bounds on a rational function, and further to the positivity of a polynomial. Our algorithm is efficient enough to be used as a criterion in large implementation-space explorations. We ensure that there are no false positive answers; for negative answers, we give a tight bound on the margin of acceptable specifications. We demonstrate the application of our algorithm to the comparison of various finite-precision implementations of FIR and IIR filters that have already been fully designed, i.e. whose coefficients are quantized.
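In a nutshell, for a transfer function H(z) = b(z)/a(z) with real coefficients, a magnitude specification on a frequency band rewrites as the non-negativity of a trigonometric polynomial; the notation below is illustrative.

```latex
% Assuming a(z) has no zeros on the unit circle (stable filter), for all
% frequencies \omega in the band of interest:
\[
  \bigl|H(e^{i\omega})\bigr|^2
  = \frac{b(e^{i\omega})\,b(e^{-i\omega})}{a(e^{i\omega})\,a(e^{-i\omega})}
  \;\le\; \beta^2
  \quad\Longleftrightarrow\quad
  \beta^2\, a(e^{i\omega})\,a(e^{-i\omega}) - b(e^{i\omega})\,b(e^{-i\omega}) \;\ge\; 0,
\]
% and the left-hand side of the last inequality is a real trigonometric
% polynomial in \omega (a polynomial in \cos\omega), whose non-negativity on
% the band can be checked rigorously.
```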