Embedded Systems

Efficient Machine Learning in Hardware

Lecturer Oliver Bringmann
Head

Lecture Wednesday, 10 c.t. - 12
Sand 1, A301
17.04.24-24.07.24
Instructors Christoph Gerum
Researcher

Moritz Reiber
Researcher

Julia Werner
Researcher

Tutorial Wednesday, 12 c.t. - 14
Sand 1, A301
24.04.24-24.07.24
Type of course Lecture + Exercises (6 LP)
Course ID ML4420
Entry in course catalog Alma
Learning Platform Ilias

Topic

Recent breakthroughs in applying deep neural networks to a wide variety of machine learning tasks have been strongly driven by the availability of high-performance computing platforms. Unlike their biological model, however, artificial neural networks achieve high performance only at the cost of much higher energy demands. While the average power consumption of the entire human brain is comparable to that of a laptop computer (about 20 W), artificial intelligence often relies on large HPC systems whose energy demand is several orders of magnitude higher. This lecture discusses this problem and presents solutions for building energy- and resource-efficient architectures for machine learning in hardware. In this context, the following topics will be addressed:

  • Hardware architectures for machine learning: NPUs, GPUs, FPGAs, overlay architectures, SIMD architectures, domain-specific architectures, in/near memory computing, training vs. inference architectures
  • Efficient deep neural networks and tensor operator implementation
  • Network compression: pruning, quantization, knowledge distillation, conditional computing
  • Neural network optimization and deployment: hardware-aware neural architecture search (NAS), autotuning, transformation of tensor operations, tensor reshaping, hardware mapping, pipelining, latency hiding
  • AI system hardware/software codesign and energy-efficient machine learning
  • Emerging architectures: new switching devices to implement neural networks (Memristors, PCM), neuromorphic computing, quantum computing
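As a small illustration of the network compression topic listed above, the following is a minimal sketch of magnitude pruning followed by symmetric 8-bit quantization on a toy weight vector. It is not taken from the course material: the function names (`magnitude_prune`, `quantize_int8`, `dequantize`), the example weights, and the 50% sparsity target are assumptions chosen only for illustration.

```python
# Illustrative sketch only: names, weights and sparsity are assumptions,
# not part of the course material.

def magnitude_prune(weights, sparsity):
    """Set the smallest-magnitude fraction `sparsity` of the weights to zero."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices of the k weights with the smallest absolute value.
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate floating-point values."""
    return [v * scale for v in q]


weights = [0.82, -0.05, 0.31, -1.27, 0.02, 0.64, -0.11, 0.95]
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
# Worst-case rounding error introduced by quantization.
max_error = max(abs(a - b) for a, b in zip(pruned, restored))

print("pruned:   ", pruned)
print("quantized:", q)
print("max error:", max_error)
```

Pruned weights need no multiplication, and the remaining ones fit into one byte each instead of four, which is the kind of saving that efficient hardware can exploit. In practice, both steps are usually combined with fine-tuning to recover accuracy.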

Students gain in-depth knowledge of the challenges associated with energy-efficient machine learning hardware and of the respective state-of-the-art solutions. Different hardware architectures will be compared with regard to the trade-offs between energy consumption, complexity, computational speed, and the specificity of their applicability.

The main goals of the course are to learn which kinds of hardware architectures are used for machine learning, to understand why a particular architecture suits a particular application, and to learn how to implement machine learning algorithms efficiently in hardware.

Literature

  • Sze, Vivienne, et al. “Efficient processing of deep neural networks.” Synthesis Lectures on Computer Architecture 15.2 (2020): 1-341.