Transferring Fine-grained DNN Accelerator Architecture Performance Models to Coarse-grained Models
Bachelor’s Thesis / Master’s Thesis / Student Research Project
Abstract modeling of HW/SW systems is a relatively new research topic. This technique aims to capture only the essential parameters of software and hardware that influence their timing behavior.
Currently, performance models for deep learning accelerator architectures are either fine-grained at the scalar operation level (multiplication, addition, etc.) or coarse-grained at the tensor operation level (vector and matrix-matrix multiplication, etc.). While fine-grained models provide high accuracy, they are often inflexible and computationally expensive. Coarse-grained models are fast to compute and highly flexible, but less accurate. The goal of this student project is to develop a methodology for transferring fine-grained models at the scalar operation level to coarse-grained tensor-level models and to evaluate the resulting trade-off between performance and accuracy.
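To illustrate the two modeling granularities, the following sketch contrasts a fine-grained latency estimate, which walks over the tiled scalar MAC operations of a matrix multiplication, with a coarse-grained closed-form estimate at the tensor-operation level. All names, the systolic-array size `pe`, and the cost constant `mac_cycles` are illustrative assumptions, not part of any concrete accelerator model from this project.

```python
import math

def fine_grained_cycles(M, K, N, pe=16, mac_cycles=1):
    """Fine-grained sketch: iterate over every tile of scalar MAC
    operations mapped onto a pe x pe array (detailed, but slow to
    compute for large workloads)."""
    cycles = 0
    for m0 in range(0, M, pe):          # tile rows
        for n0 in range(0, N, pe):      # tile columns
            # each tile performs K sequential MACs per PE;
            # partial tiles at the edges could refine this estimate
            cycles += K * mac_cycles
    return cycles

def coarse_grained_cycles(M, K, N, pe=16, mac_cycles=1):
    """Coarse-grained sketch: a single closed-form expression at the
    tensor-operation level (fast to evaluate, less detailed)."""
    tiles = math.ceil(M / pe) * math.ceil(N / pe)
    return tiles * K * mac_cycles
```

For workloads that divide evenly into tiles the two estimates coincide; the fine-grained loop, however, is the natural place to model edge effects such as partially filled tiles, which a coarse-grained formula abstracts away. Transferring such per-operation detail into the closed-form model is the kind of trade-off this project investigates.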
- Konstantin Lübeck, Alexander Louis-Ferdinand Jung, Felix Wedlich, Oliver Bringmann: Work-in-Progress: Ultra-fast yet Accurate Performance Prediction for Deep Neural Network Accelerators
- Successfully attended the lectures “Grundlagen der Rechnerarchitektur” and/or “Parallele Rechnerarchitekturen” (optional)
- Linux (optional)