Embedded Systems

Mapping Transformer Models on Systolic Array Architectures

Master’s Thesis / Student Research Project

Abstract

Transformer models, which underpin large language models and other generative AI systems, rely heavily on the attention layer. Because AI hardware accelerators expose many mapping parameters, finding the parameter set that yields an efficient execution remains a significant challenge, especially when only a prototype of the accelerator is available. Mapping-parameter-agnostic performance estimators for AI hardware accelerators are therefore needed.
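
For context, scaled dot-product attention reduces to two matrix multiplications around a softmax, softmax(Q K^T / sqrt(d)) V; these matrix multiplications are what would be mapped onto the accelerator. A minimal PyTorch sketch, with tensor shapes chosen purely for illustration:

    import torch

    def scaled_dot_product_attention(q, k, v):
        # softmax(Q K^T / sqrt(d)) V
        d = q.size(-1)
        scores = q @ k.transpose(-2, -1) / d**0.5  # (seq, seq) similarity scores
        weights = torch.softmax(scores, dim=-1)    # attention weights per query
        return weights @ v                         # weighted sum of value vectors

    # Example: sequence length 8, head dimension 16 (illustrative sizes)
    q, k, v = (torch.randn(8, 16) for _ in range(3))
    out = scaled_dot_product_attention(q, k, v)    # shape (8, 16)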

The Abstract Computer Architecture Description Language (ACADL) allows for modeling arbitrary AI hardware accelerator architectures and for evaluating the performance of DNNs mapped onto the modeled architectures. The goal of this thesis project is to map the attention layer onto a systolic array architecture for the purpose of performance evaluation.
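
To illustrate why the mapping parameters matter (this is not the ACADL performance model), a first-order cycle estimate for an output-stationary systolic array can be sketched in a few lines of Python; the array dimensions, layer sizes, and latency model below are assumptions for illustration only:

    import math

    def systolic_matmul_cycles(m, k, n, rows, cols):
        # First-order estimate for an output-stationary (rows x cols) systolic
        # array computing an (m x k) @ (k x n) matrix multiplication: each output
        # tile accumulates over k steps plus rows + cols - 2 cycles of pipeline
        # fill/drain (assumed latency model, for illustration only).
        tiles = math.ceil(m / rows) * math.ceil(n / cols)
        return tiles * (k + rows + cols - 2)

    # Attention with sequence length s and head dimension d performs
    # Q @ K^T (s x d times d x s) and softmax(...) @ V (s x s times s x d).
    s, d = 512, 64                                   # hypothetical layer sizes
    rows = cols = 16                                 # hypothetical 16x16 array
    qk = systolic_matmul_cycles(s, d, s, rows, cols)
    av = systolic_matmul_cycles(s, s, d, rows, cols)
    print(qk + av)  # matmul cycles only; softmax is not modeled here

Sweeping rows and cols in even this simple model shows how strongly the cycle count depends on the chosen mapping, which is the effect the thesis would study with ACADL.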

Requirements

  • Python
  • PyTorch
  • C/C++
  • Successfully attended the lecture “Grundlagen der Rechnerarchitektur” (Fundamentals of Computer Architecture) and/or “Parallele Rechnerarchitekturen” (Parallel Computer Architectures) (optional)
  • Linux (optional)

Contact

Lübeck, Konstantin

Bringmann, Oliver