DSAGEN: Democratizing Spatial Accelerator Research

Organizers

Jian Weng, Sihao Liu, Vidushi Dadu, Tony Nowatzki
PolyArch Research Group
University of California, Los Angeles
Date/Time: Saturday, Oct. 17th, 4PM - 7PM PST

Resources

Please download the binary release of our infrastructure, or build it from source. Both provide a full-stack implementation, including the extended ISA, binary linker, compiler, hardware simulator, RTL generator, Chipyard integration, and benchmarks. Please make sure the infrastructure runs on your machine before the tutorial.

To set up the environment, you need to:

Use this Dockerfile to instantiate a container that has all the dependent packages installed. If you are not familiar with Docker, please refer to the quick tour below.
The commands below are for the purposes of this tutorial. If you want to use our framework for your daily research, we highly recommend following the instructions on our project wiki.
$ zsh # Please DO USE zsh or the behavior of source setup.sh may be undesirable

$ cd ~

$ wget "[the binary download link]" -O dsa-release.zip # note: DO NOT omit the quotes

$ unzip dsa-release.zip

$ source dsa-framework/setup.sh

$ git clone https://github.com/polyarch/dsa-examples # Examples for programming

$ git clone https://github.com/polyarch/dsa-cgra-gen # For hardware generation

$ cd dsa-examples/manual/01_vector_add && ./run.sh answer.out # Verify your installation

~~NOTE: If the link for binary release is too slow to download an alternative link is here.~~

Overview

Fig. 1: Synthesizing Programmable Accelerators

As the benefits of transistor scaling wane, significant research has emerged on specialized accelerators, motivated by their promising performance and energy savings. While effective, they require intensive hardware and software engineering, and this effort must be repeated whenever the underlying application domain shifts.

Ideally, one would be able to generate accelerators based on the behavior of the applications, with those applications specified through a set of stable and user-friendly programming interfaces. In other words, we require a high-level synthesis flow for programmable accelerators. Figure 1 shows the paradigm of synthesizing programmable accelerators. In this tutorial, we will present our approach to programmable accelerators along with a research framework: DSAGEN, a full-stack infrastructure that includes compilation, simulation, and RTL implementation.

The first principle of our approach is to define a useful but restricted design space. Specifically, we use decoupled-spatial accelerators, where memory accesses are decoupled from computation pipelines, and the underlying hardware network, storage, and timing are exposed in the ISA. The second principle is to enable a rich accelerator design space by specifying architectures as a composition of simple primitives, including memories, processing elements, and network/synchronization components. An architecture instance can be represented as a graph, the architecture description graph (ADG), where each node is a hardware primitive. The ADG serves as an abstraction for the compiler (it is used to derive the ISA) as well as for RTL generation.
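To make the ADG idea concrete, here is a minimal Python sketch of a graph whose nodes are hardware primitives. It is an illustration only and not DSAGEN's actual representation: the class name `ADG`, the kind labels (`memory`, `pe`, `switch`, `sync`), the helper `tiny_adg`, and all node names are hypothetical, chosen to mirror the primitive categories named above.

```python
from dataclasses import dataclass, field

# Hypothetical primitive kinds, mirroring the categories named in the text.
KINDS = {"memory", "pe", "switch", "sync"}


@dataclass
class ADG:
    """A toy architecture description graph: nodes are hardware primitives."""
    nodes: dict = field(default_factory=dict)  # node name -> primitive kind
    edges: set = field(default_factory=set)    # directed (src, dst) links

    def add_node(self, name, kind):
        if kind not in KINDS:
            raise ValueError(f"unknown primitive kind: {kind}")
        self.nodes[name] = kind

    def connect(self, src, dst):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be declared first")
        self.edges.add((src, dst))

    def successors(self, name):
        """Names of nodes directly driven by `name`, in sorted order."""
        return sorted(d for s, d in self.edges if s == name)

    def reachable(self, start):
        """All nodes reachable from `start` by following directed links."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in self.successors(node):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


def tiny_adg():
    """Assumed example: one scratchpad, one sync element, two switches, two PEs."""
    g = ADG()
    g.add_node("spad", "memory")
    g.add_node("sync_in", "sync")
    g.add_node("sw0", "switch")
    g.add_node("sw1", "switch")
    g.add_node("pe0", "pe")
    g.add_node("pe1", "pe")
    for src, dst in [("spad", "sync_in"), ("sync_in", "sw0"), ("sw0", "pe0"),
                     ("sw0", "sw1"), ("sw1", "pe1"), ("pe0", "sw1")]:
        g.connect(src, dst)
    return g
```

A compiler-style query on this toy graph might ask which primitives data leaving the scratchpad can reach, for example `tiny_adg().reachable("spad")`. A real ADG would also carry per-node parameters (bit widths, supported operations, buffer depths), which this sketch omits.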

DSAGEN Framework: This approach is embodied in our framework, DSAGEN, an overview of which is shown in Figure 2. DSAGEN targets C programs with custom but application-neutral pragmas. The compiler infrastructure uses Clang and LLVM as a frontend, and ultimately represents programs as a decoupled dataflow graph plus memory streams. A low-level, assembly-level interface is provided for ninja programmers. We include a custom spatial-architecture compiler and backend. The hardware design space includes many spatial architecture optimizations from prior works [1]–[4]. The compiler backend generates programs embedded in a RISCV ISA for control. DSAGEN supports multicore simulation in gem5, and it uses Chisel for hardware generation.
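The "decoupled dataflow graph plus memory streams" representation can be pictured with a small Python model of vector add. This is a conceptual sketch under our own assumptions, not DSAGEN's intrinsics or compiler output: the function names (`linear_stream`, `read_stream`, `add_dfg`, `write_stream`, `vector_add`) and the flat-list memory model are hypothetical. The point it illustrates is that address generation and memory access live in streams, while the dataflow graph only sees values.

```python
def linear_stream(base, length, stride=1):
    """Address generator: yields base, base + stride, ... for `length` items."""
    for i in range(length):
        yield base + i * stride


def read_stream(memory, addresses):
    """Memory read stream: turns an address stream into a value stream."""
    for addr in addresses:
        yield memory[addr]


def add_dfg(a_values, b_values):
    """A one-node dataflow graph: consumes two value streams, emits their sums."""
    for a, b in zip(a_values, b_values):
        yield a + b


def write_stream(memory, addresses, values):
    """Memory write stream: stores each produced value at the next address."""
    for addr, val in zip(addresses, values):
        memory[addr] = val


def vector_add(memory, a_base, b_base, c_base, n):
    """c[i] = a[i] + b[i], with memory access decoupled from computation."""
    a = read_stream(memory, linear_stream(a_base, n))
    b = read_stream(memory, linear_stream(b_base, n))
    write_stream(memory, linear_stream(c_base, n), add_dfg(a, b))
```

For example, with `memory = [0, 1, 2, 3, 10, 20, 30, 40, 0, 0, 0, 0]`, calling `vector_add(memory, 0, 4, 8, 4)` fills `memory[8:12]` with the element-wise sums. Note that `add_dfg` contains no addresses at all, so changing the access pattern (for instance, a different stride) only touches the stream definitions.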

Syllabus and Schedule

Fig. 2: The stack of DSAGEN

Introduction (20min): [slides]

The Decoupled-Spatial Programming Paradigm
The Principle of Composing Hardware Primitives
DSAGEN: A Framework for Decoupled-Spatial Research

Basic Programming of DSAGEN (40min): [slides]

Vector Add

Hardware/Software Interface Overview
Writing a Dataflow Graph
Writing the Control Intrinsics

Vector Normalization

Signaled Accumulation
Concurrent DFGs
Additional Control Intrinsics

5-Min Break

Advanced Programming of DSAGEN (50min): [slides] [videos] code: #CWsm3wS

An Introduction to Data-Dependent Specialization
Hands on Exercise: Sparse Dot-Product
Multicore Implementation for SpMSpV

Automated and Modular Compilation (20min): [slides]

Pragma-Hinted Compilation
The Compilation Pipeline
Modular Compilation

5-Min Break

Composing your own architecture (40min): [slides]

The Scala-embedded DSL for composing your own architecture
Integrate the spatial accelerator into Chipyard!

Hacking DSAGEN for your own research (60min): [slides]

Adding a New Instruction Capability to the PEs

Extending the RISCV ISA

Though I really, really like this section, it was cut for the sake of time. If you are interested, you can find the material in the slides.

Docker Usage

Install Docker according to its official guide.
$ mkdir docker-build && cd docker-build
$ wget [the link to the Dockerfile]
$ docker build . # On success, you should see "Successfully built" followed by a hash
$ docker run -tid --privileged=true --hostname=dsagen --name=dsagen [the hash]
$ docker attach dsagen # Now you are at the / directory of this container
$ cd ~
$ <ctrl-p><ctrl-q> # Temporarily leave the container
$ exit # Terminate the container