Routers
------------------

Constellation's router generators expose several microarchitectural configuration knobs.
The standard router microarchitecture follows the canonical virtual-channel router pipeline.
A brief description of the operation of a router follows:

 1. The head flit of the packet queries the router's ``RouteComputer`` to determine the set of
    candidate next virtual channels it may allocate. This is the **RC** stage.
 2. The head flit of the packet queries the router's ``VirtualChannelAllocator`` to allocate
    a virtual channel from the candidate set of next virtual channels. This is the **VA** stage.
 3. Flits ask the ``SwitchAllocator`` for access to the router's crossbar switch in the
    **SA** stage. This stage also checks that the next virtual channel has an empty buffer slot
    to accommodate the flit.
 4. A flit traverses the crossbar switch in the **ST** stage.

Flow control is credit-based. Matching ``InputUnit`` and ``OutputUnit`` pairs exchange credits over
a separate narrow channel to indicate the availability of buffer entries. A flit departing an ``InputUnit``
sends a credit back to the matching ``OutputUnit`` on the upstream router.
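
The credit exchange described above can be modeled in a few lines of plain Scala. This is only
a software sketch for intuition: ``CreditCounter`` and ``CreditDemo`` are hypothetical names, not
part of Constellation, and the model assumes one credit per free downstream buffer slot.

.. code:: scala

   // Hypothetical software model of the sender-side credit count; not Constellation hardware.
   final case class CreditCounter(credits: Int) {
     // A flit may be sent only while at least one downstream buffer slot is free.
     def canSend: Boolean = credits > 0

     // Sending a flit consumes one credit.
     def send: CreditCounter = {
       require(canSend, "no credits available")
       copy(credits = credits - 1)
     }

     // A returned credit means a flit departed the downstream InputUnit, freeing a slot.
     def creditReturned: CreditCounter = copy(credits = credits + 1)
   }

   object CreditDemo {
     // Returns (can send with no credits left?, can send after one credit comes back?).
     def run(): (Boolean, Boolean) = {
       val start    = CreditCounter(credits = 2) // downstream buffer holds two flits
       val drained  = start.send.send            // both slots are now occupied
       val refilled = drained.creditReturned     // one flit departed downstream
       (drained.canSend, refilled.canSend)
     }
   }

In this model the sender stalls as soon as its credit count reaches zero and resumes once a
credit returns.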

.. Note:: Constellation treats terminal Ingress and Egress points as special instances
	  of ``InputUnits`` and ``OutputUnits``.

.. Figure:: ../diagrams/router.svg
	    :width: 600px

	    The standard router micro-architecture

The ``routerParams`` field of the ``NoCParams`` case class maps each router index to that router's
parameters, enabling heterogeneous designs with different router configurations.


Pipelining
^^^^^^^^^^^^^^^^

The standard pipeline microarchitecture is a 4-stage router, with RC, VA, SA, and ST in separate stages.
Two flags exist in the base router generator to reduce the number of pipeline stages.

 - ``combineRCVA`` performs route-compute and virtual-channel-allocation in the same cycle. Routers
   which implement near-trivial routing policies may benefit from this setting.
 - ``combineSAST`` performs switch allocation and switch traversal in the same cycle. This is useful
   for low-radix routers.
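
As a rough illustration of how the two flags shorten the pipeline, the sketch below lists which
stages share a cycle under each setting. ``PipelineFlags`` is a hypothetical stand-in written for
this example; it is not Constellation's parameter class, and only the two flag names are taken
from the text above.

.. code:: scala

   // Hypothetical stand-in for the two flags; not Constellation's parameter class.
   final case class PipelineFlags(
     combineRCVA: Boolean = false,
     combineSAST: Boolean = false
   ) {
     // Stage groups in pipeline order; stages joined by "+" share one cycle.
     def stages: Seq[String] = {
       val front = if (combineRCVA) Seq("RC+VA") else Seq("RC", "VA")
       val back  = if (combineSAST) Seq("SA+ST") else Seq("SA", "ST")
       front ++ back
     }

     // Number of pipeline stages a head flit passes through in this router.
     def depth: Int = stages.length
   }

With both flags left at ``false`` the model yields four stages; setting both to ``true`` yields two.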

 .. |pipeline_base| image:: ../diagrams/pipeline_base.svg
    :scale: 100%


 .. |pipeline_fast| image:: ../diagrams/pipeline_fast.svg
    :scale: 100%

+----------------------------------------+--------------------------------------------+
| | ``combineRCVA = false``              | | ``combineRCVA = true``                   |
| | ``combineSAST = false``              | | ``combineSAST = true``                   |
+========================================+============================================+
| |pipeline_base|                        | |pipeline_fast|                            |
+----------------------------------------+--------------------------------------------+

In the base microarchitecture, stalls can occur due to a delay in reallocating an output virtual
channel to a new packet. In the left diagram below, observe that no flit traverses the
switch in cycle 4, due to the delay for the virtual channel to be freed in cycle 3.

When ``coupleSAVA`` is enabled, the freed virtual channel is made available in the
same cycle. However, ``coupleSAVA`` can introduce long combinational paths on high-radix
routers.

 .. |credit_stall| image:: ../diagrams/credit_stall.svg
    :scale: 100%


 .. |credit_stall_free| image:: ../diagrams/credit_stall_free.svg
    :scale: 100%

+----------------------------------------+--------------------------------------------+
| ``coupleSAVA = false``                 | ``coupleSAVA = true``                      |
+========================================+============================================+
| |credit_stall|                         | |credit_stall_free|                        |
+----------------------------------------+--------------------------------------------+

Payload Width
^^^^^^^^^^^^^^^^

A router can specify its internal ``payloadWidth``. When routers with different payload
widths are connected by a channel, Constellation will autogenerate width adapters
on the channel, provided that one width is an integer multiple of the other.


 .. |router_widths| image:: ../diagrams/router_widths.svg
    :scale: 200%

+-------------------------------------------------------------------------+--------------------+
| .. code:: scala                                                         | |router_widths|    |
|                                                                         |                    |
|    NoCParams(                                                           |                    |
|      topology = Mesh2D(2, 2),                                           |                    |
|      routerParams = (i) => UserRouterParams(payloadWidth =              |                    |
|        if (i == 1 || i == 2) 128 else 64),                              |                    |
|    )                                                                    |                    |
|                                                                         |                    |
+-------------------------------------------------------------------------+--------------------+
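
The multiple-of condition on payload widths can be checked with a small helper. The sketch below
is illustrative only: ``WidthCheck.adapterRatio`` is a hypothetical function, not part of
Constellation, and this section does not say how non-multiple widths are handled, so the helper
simply reports ``None`` in that case.

.. code:: scala

   // Hypothetical helper illustrating the multiple-of condition; not part of Constellation.
   object WidthCheck {
     // Some(ratio) when one width is an integer multiple of the other
     // (a ratio of 1 means the widths match and no adapter is needed), None otherwise.
     def adapterRatio(a: Int, b: Int): Option[Int] = {
       require(a > 0 && b > 0, "payload widths must be positive")
       val wide   = math.max(a, b)
       val narrow = math.min(a, b)
       if (wide % narrow == 0) Some(wide / narrow) else None
     }
   }

For the widths in the example above, ``WidthCheck.adapterRatio(128, 64)`` gives a ratio of 2,
meaning two narrow flits carry the payload of one wide flit.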


Virtual Channel Allocator
^^^^^^^^^^^^^^^^^^^^^^^^^

The design of the Virtual Channel Allocator has significant implications for the resulting
performance of the NoC. Currently, two categories of allocators are implemented:

 - **Single** VC allocators allocate only a single VC per cycle. These are useful in networks
   where most packets are multi-flit, as only the head flit needs to query the allocator.
 - **Multi** VC allocators attempt to allocate multiple VCs per cycle.

The following allocator implementations are provided.

 - ``PIMMultiVCAllocator`` implements parallel iterative matching (PIM) for a separable allocator.
 - ``ISLIPMultiVCAllocator`` implements the iSLIP policy for a separable allocator.
 - ``RotatingSingleVCAllocator`` rotates priority across incoming requests.
 - ``PrioritizingSingleVCAllocator`` prioritizes certain VCs over others, according to the
   priorities given by the routing relation.
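
To give a feel for the rotating policy, the sketch below models a generic round-robin,
single-grant arbiter in plain Scala. It is an assumption-laden illustration of the general
technique: ``RotatingArbiter`` and ``ArbiterDemo`` are hypothetical names, and the code is not
taken from ``RotatingSingleVCAllocator``.

.. code:: scala

   // Hypothetical software model of a round-robin, single-grant arbiter.
   final case class RotatingArbiter(n: Int, next: Int = 0) {
     // Returns the granted requester, if any, and the arbiter state for the next cycle.
     def grant(requests: Seq[Boolean]): (Option[Int], RotatingArbiter) = {
       require(requests.length == n, s"expected $n request bits")
       // Search for an active request starting at `next`, wrapping around.
       val winner = (0 until n).map(i => (next + i) % n).find(i => requests(i))
       // After a grant, the next search starts just past the winner.
       val updated = winner.map(w => copy(next = (w + 1) % n)).getOrElse(this)
       (winner, updated)
     }
   }

   object ArbiterDemo {
     val reqs = Seq(true, false, true)              // requesters 0 and 2 stay active
     val (g0, a1) = RotatingArbiter(3).grant(reqs)  // first grant
     val (g1, a2) = a1.grant(reqs)                  // second grant, priority rotated
     val (g2, _)  = a2.grant(reqs)                  // third grant
   }

With requesters 0 and 2 continuously active, the grants alternate between them instead of
always favoring requester 0.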