Routers

Constellation’s router generators expose several microarchitectural configuration knobs. The default router microarchitecture follows the conventional pipelined virtual-channel router design. A router operates as follows:

  1. The head flit of the packet queries the router’s RouteComputer to determine the set of candidate next virtual channels it may allocate. This is the RC stage.

  2. The head flit of the packet queries the router’s VirtualChannelAllocator to allocate a virtual channel from the candidate set of next virtual channels. This is the VA stage.

  3. Flits ask the SwitchAllocator for access to the router’s crossbar switch in the SA stage. This stage also checks that the next virtual channel has an empty buffer slot to accommodate the flit.

  4. A flit traverses the crossbar switch in the ST stage.

Flow control is credit-based. Matching InputUnit and OutputUnit pairs exchange credits over a separate narrow channel to indicate the availability of buffer entries. A flit departing an InputUnit sends a credit backwards to the OutputUnit on its source router.
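The credit exchange can be sketched as a small software model. This is a minimal illustration of credit-based flow control in general, not Constellation’s hardware implementation; the names CreditSketch, CreditCounter, send, and creditReturned are hypothetical, and the model assumes one credit per buffer entry.

```scala
// Toy model of credit-based flow control for a single virtual channel.
// All names are hypothetical; this is not Constellation's RTL.
object CreditSketch {
  // The upstream (OutputUnit-side) view of a downstream buffer:
  // `credits` counts buffer entries currently known to be free.
  final case class CreditCounter(depth: Int, credits: Int) {
    require(credits >= 0 && credits <= depth, "credits must stay within [0, depth]")

    // A flit may be sent only while at least one downstream entry is free.
    def canSend: Boolean = credits > 0

    // Sending a flit consumes one credit.
    def send: CreditCounter = {
      require(canSend, "no credit available")
      copy(credits = credits - 1)
    }

    // A credit arriving from downstream means one buffer entry was freed.
    def creditReturned: CreditCounter = copy(credits = credits + 1)
  }

  // A counter for an initially empty downstream buffer of `depth` entries.
  def fresh(depth: Int): CreditCounter = CreditCounter(depth, depth)
}
```

For example, with a two-entry buffer the sender can issue two flits back to back and must then wait for a credit to return before sending a third.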

Note

Constellation treats terminal Ingress and Egress points as special instances of InputUnits and OutputUnits.

Figure (router.svg): The standard router microarchitecture

The routerParams field of the NoCParams case class returns per-router parameters, enabling heterogeneous designs with different router configurations.

Pipelining

The standard pipeline microarchitecture is a 4-stage router, with RC, VA, SA, and ST in separate stages. Two flags exist in the base router generator to reduce the stage count.

  • combineRCVA performs route-compute and virtual-channel-allocation in the same cycle. Routers which implement near-trivial routing policies may benefit from this setting.

  • combineSAST performs switch allocation and switch traversal in the same cycle. This is useful for low-radix routers.

Figure (pipeline_base): combineRCVA = false, combineSAST = false

Figure (pipeline_fast): combineRCVA = true, combineSAST = true
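The effect of the two flags on the head-flit pipeline can be sketched in a few lines. This is an illustrative model only: the object PipelineSketch and its functions are hypothetical helpers, not part of Constellation, and the model assumes each uncombined stage takes exactly one cycle.

```scala
// Illustrative model of how the two flags merge pipeline stages.
// PipelineSketch is a hypothetical helper, not a Constellation API.
object PipelineSketch {
  // Stage names follow the text; merged stages are written as "A+B".
  def stages(combineRCVA: Boolean, combineSAST: Boolean): Seq[String] = {
    val front = if (combineRCVA) Seq("RC+VA") else Seq("RC", "VA")
    val back  = if (combineSAST) Seq("SA+ST") else Seq("SA", "ST")
    front ++ back
  }

  // Number of pipeline stages seen by a head flit under this model.
  def depth(combineRCVA: Boolean, combineSAST: Boolean): Int =
    stages(combineRCVA, combineSAST).length
}
```

Under these assumptions, enabling both flags halves the pipeline from four stages to two, and enabling only one of them yields three.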

In the base microarchitecture, stalls can occur due to a delay in reallocating an output virtual channel to a new packet. In the first diagram below (coupleSAVA = false), observe that no flit traverses the switch in cycle 4, because the virtual channel freed in cycle 3 is not immediately available for reallocation.

When coupleSAVA is enabled, the freed virtual channel is made available for allocation in the same cycle. However, coupleSAVA can introduce long combinational paths on high-radix routers.

Figure (credit_stall): coupleSAVA = false

Figure (credit_stall_free): coupleSAVA = true

Payload Width

A router can specify its internal payloadWidth. When routers with different payload widths are connected by a channel, Constellation will autogenerate width-adapters on the channels if the widths are multiples of each other.

NoCParams(
  topology = Mesh2D(2, 2),
  // Routers 1 and 2 use a payload width of 128; all other routers use 64.
  routerParams = (i) => UserRouterParams(payloadWidth =
    if (i == 1 || i == 2) 128 else 64),
)

Figure: router_widths
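The "multiples of each other" condition can be checked with simple integer arithmetic. The sketch below is an assumption-labeled illustration: WidthSketch, adaptable, and ratio are hypothetical names, and the code only tests the divisibility condition stated above; it does not model how Constellation’s width adapters split or merge flits.

```scala
// Hypothetical helper that checks the width-adapter condition from the text.
object WidthSketch {
  // True when one payload width is an integer multiple of the other.
  def adaptable(a: Int, b: Int): Boolean = {
    require(a > 0 && b > 0, "payload widths must be positive")
    a % b == 0 || b % a == 0
  }

  // Ratio of the wider payload to the narrower one, if the widths are compatible.
  def ratio(a: Int, b: Int): Option[Int] =
    if (adaptable(a, b)) Some(math.max(a, b) / math.min(a, b)) else None
}
```

With the widths from the example above, 128 and 64 are compatible with a ratio of 2, whereas a pair such as 96 and 64 would not satisfy the condition.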

Virtual Channel Allocator

The design of the Virtual Channel Allocator has significant implications for the resulting performance of the NoC. Currently, two categories of allocators are implemented:

  • Single VC allocators allocate only a single VC per cycle. These are useful in networks where most packets are multi-flit, as only the head flit needs to query the allocator.

  • Multi VC allocators attempt to allocate multiple VCs per cycle.

The following allocator implementations are provided.

  • PIMMultiVCAllocator implements parallel iterative matching (PIM) for a separable allocator.

  • ISLIPMultiVCAllocator implements the iSLIP policy for a separable allocator.

  • RotatingSingleVCAllocator rotates priority across incoming requests.

  • PrioritizingSingleVCAllocator prioritizes certain VCs over others, according to the priorities given by the routing relation.
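To make the rotating policy concrete, the following is a minimal software sketch of a generic rotating-priority (round-robin) arbiter that grants at most one request per call. It is not the implementation of RotatingSingleVCAllocator; the object RotatingSketch and the function grant are hypothetical, and the rule of restarting the search just after the last granted index is an assumption of this sketch.

```scala
// Generic round-robin arbiter sketch; not Constellation's allocator code.
object RotatingSketch {
  // Scans the requests starting at index `start`, wrapping around, and grants
  // the first asserted request. Returns the granted index (if any) together
  // with the start index to use on the next call.
  def grant(requests: Seq[Boolean], start: Int): (Option[Int], Int) = {
    val n = requests.length
    require(n > 0 && start >= 0 && start < n, "start must index into requests")
    // Visit order: start, start + 1, ..., wrapping back to start - 1.
    val order = (0 until n).map(i => (start + i) % n)
    order.find(i => requests(i)) match {
      case Some(g) => (Some(g), (g + 1) % n) // rotate priority past the winner
      case None    => (None, start)          // nothing granted, priority unchanged
    }
  }
}
```

Because priority moves past each winner, a requester that keeps asserting its request is granted within at most one full rotation under this model.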