Embedded Systems

GOURD: Tensorizing Streaming Applications to Generate Multi-Instance Compute Platforms

by Patrick Schmid, Paul Palomero Bernardo, Christoph Gerum, and Oliver Bringmann
In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43(11): 4166-4177, 2024.

Keywords: Integrated circuits, Actuators, Design automation, Shape, Memory architecture, Focusing, Edge AI, Hardware, Sensors, Data orchestration, hardware platform generation, multidimensional data flow, synchronization

Abstract

In this article, we rethink the dataflow processing paradigm to a higher level of abstraction to automate the generation of multi-instance compute and memory platforms with interfaces to I/O devices (sensors and actuators). Since the different compute instances (NPUs, CPUs, DSPs, etc.) and I/O devices do not necessarily have compatible interfaces on a dataflow level, an automated translation is required. However, in multidimensional dataflow scenarios, it becomes inherently difficult to reason about buffer sizes and iteration order without knowing the shape of the data access pattern (DAP) that the dataflow follows. To capture this shape and the platform composition, we define a domain-specific representation (DSR) and devise a toolchain to generate a synthesizable platform, including appropriate streaming buffers for platform-specific tensorization of the data between incompatible interfaces. This allows platforms, such as sensor edge AI devices, to be easily specified by simply focusing on the shape of the data provided by the sensors and transmitted among compute units, giving the ability to evaluate and generate different dataflow design alternatives with significantly reduced design time.
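
The abstract's point that buffer sizes depend on the shape of the data access pattern can be illustrated with a small simulation. The sketch below is not the GOURD toolchain or its DSR. It is a minimal, hypothetical model that assumes a producer emitting a 2-D frame in row-major order and a consumer reading non-overlapping tiles in row-major tile order. The function name `peak_buffer_occupancy` and its parameters are invented for this illustration.

```python
from itertools import product


def peak_buffer_occupancy(height, width, tile_h, tile_w):
    """Return the peak number of elements a streaming buffer must hold.

    Assumed (hypothetical) setup: the producer emits a height x width
    frame one element at a time in row-major order; the consumer reads
    non-overlapping tile_h x tile_w tiles in row-major tile order and
    removes a tile from the buffer as soon as it is complete.
    """
    if height % tile_h or width % tile_w:
        raise ValueError("frame shape must be a multiple of the tile shape")

    # Consumer order: each tile is the set of (row, col) elements it needs.
    tiles = [
        {(r, c)
         for r in range(tr, tr + tile_h)
         for c in range(tc, tc + tile_w)}
        for tr in range(0, height, tile_h)
        for tc in range(0, width, tile_w)
    ]

    buffered = set()
    peak = 0
    next_tile = 0
    for elem in product(range(height), range(width)):  # producer order
        buffered.add(elem)
        peak = max(peak, len(buffered))
        # Drain every tile that is now complete, in consumer order.
        while next_tile < len(tiles) and tiles[next_tile] <= buffered:
            buffered -= tiles[next_tile]
            next_tile += 1
    return peak


if __name__ == "__main__":
    # Same 4 x 8 frame, three different consumer tile shapes.
    for shape in [(1, 8), (2, 2), (4, 1)]:
        print(shape, peak_buffer_occupancy(4, 8, *shape))
```

Under these assumptions, the same 4 x 8 frame needs very different amounts of buffering depending only on the consumer's tile shape: a full-row tile needs one row, while a tall, narrow tile needs almost the entire frame before the first tile can be handed over.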