https://vengineer.hatenablog.com/entry/2020/02/28/060000

@Vengineer's Ramblings : Twitter
Welcome to the world of SystemVerilog. It all began with the release of SystemC v0.9.

There was one more Poplar example.

examples/code_examples/tensorflow/custom_op

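This example shows how to use a Poplar custom op from within TensorFlow code.

The custom op's code adds two tensors element by element.

The Poplar code is below. It seems that two functions (IsElementWise and Build) have to be defined. Build returns a poplar::program::Program, which seems to be the program that Poplar then compiles.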

// If an operation takes one or more tensors of the same shape,
// and performs an expression on only corresponding elements in
// the input tensors, and produces a tensor of the same shape,
// then it is elementwise.
extern "C" bool IsElementWise() { return true; }

// The Build function constructs the Poplar graph that computes the custom op.
extern "C" poplar::program::Program Build(
    poplar::Graph& graph, const std::vector<poplar::Tensor>& inputs,
    std::vector<poplar::Tensor>& outputs, const std::string& debugPrefix) {

Inside the program, apart from the APIs already covered in Part 1 and Part 2, there is the following:

// Get the tile mapping for the complete tensor. We will map the vertices so
// that they match the layout of the 'x' input tensor (inputs[0]). If the 'x'
// tensor was laid out differently to the other ones, then Poplar will
// insert code to move the data in the other tensors to the mapped tile. So
// ideally we would choose the best mapping for the vertices by analysing
// all of the tensor mappings.
auto tileMapping = graph.getTileMapping(inputs[0]);

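This code retrieves the tile mapping of the input (inputs[0]), that is, which of its elements live on which tile. A tensor may be spread across multiple tiles. As the code below shows, though, some tiles may hold nothing at all.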

for (unsigned tile = 0; tile != tileMapping.size(); ++tile) {
  // If a tile contains no elements of the tensor then do not create any
  // vertices for it.
  if (tileMapping[tile].empty()) {
    continue;
  }
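
To make the shape of the tile mapping concrete, here is a minimal host-side sketch in plain C++ with no Poplar dependency. As far as I understand, getTileMapping returns one list of intervals per tile. The Interval struct, the TileMapping alias, and the helpers tilesWithData and elementsOnTile are hypothetical names invented for this illustration; they are not part of the Poplar API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for poplar::Interval: a half-open range [begin, end)
// of element indices in the flattened tensor.
struct Interval {
  std::size_t begin;
  std::size_t end;
  std::size_t size() const { return end - begin; }
};

// One entry per tile; each entry lists the intervals stored on that tile.
using TileMapping = std::vector<std::vector<Interval>>;

// Return the indices of the tiles that hold at least one interval, i.e. the
// tiles that survive the "empty() -> continue" check in the excerpt.
std::vector<unsigned> tilesWithData(const TileMapping& mapping) {
  std::vector<unsigned> tiles;
  for (unsigned tile = 0; tile != mapping.size(); ++tile) {
    if (mapping[tile].empty()) {
      continue;  // Nothing of this tensor lives on this tile.
    }
    tiles.push_back(tile);
  }
  return tiles;
}

// Count how many tensor elements are stored on a given tile.
std::size_t elementsOnTile(const TileMapping& mapping, unsigned tile) {
  assert(tile < mapping.size());
  std::size_t n = 0;
  for (const Interval& iv : mapping[tile]) {
    n += iv.size();
  }
  return n;
}
```

With a mapping in which only tiles 0 and 2 hold data, tilesWithData returns just those two indices, which is why the loop in the example skips the rest.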

splitRegionsBetweenWorkers appears to divide the regions held on a tile among that tile's workers.

// Split up the regions of the inputs tensors so that they are evenly
// distributed between the workers on the tile.
auto vertexRegions = poputil::splitRegionsBetweenWorkers(
    target, tileMapping[tile], vectorWidth, 2 * vectorWidth);
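
To get a feel for what such a split might do, here is a simplified sketch in plain C++. It is not poputil's actual algorithm. I am assuming the third argument acts as a grain size (partition sizes are multiples of it) and the fourth as a minimum number of elements per worker. splitBetweenWorkers, Interval, and the explicit numWorkers parameter are hypothetical; the real function takes the worker count from the target.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for poplar::Interval: half-open range [begin, end).
struct Interval {
  std::size_t begin;
  std::size_t end;
};

// Split `regions` into at most `numWorkers` partitions. Every partition except
// possibly the last holds a multiple of `grainSize` elements, and the number of
// partitions is reduced so that each gets about `minElementsPerWorker` or more.
std::vector<std::vector<Interval>> splitBetweenWorkers(
    const std::vector<Interval>& regions, unsigned numWorkers,
    unsigned grainSize, unsigned minElementsPerWorker) {
  assert(numWorkers > 0 && grainSize > 0);
  std::size_t total = 0;
  for (const Interval& r : regions) {
    total += r.end - r.begin;
  }
  std::vector<std::vector<Interval>> out;
  if (total == 0) {
    return out;  // Nothing to distribute.
  }

  // Decide how many partitions to use.
  const std::size_t numGrains = (total + grainSize - 1) / grainSize;
  std::size_t parts = std::min<std::size_t>(numWorkers, numGrains);
  if (minElementsPerWorker > 0) {
    parts = std::min(parts,
                     std::max<std::size_t>(1, total / minElementsPerWorker));
  }
  const std::size_t grainsPerPart = (numGrains + parts - 1) / parts;
  const std::size_t perPart = grainsPerPart * grainSize;

  // Walk the regions in order, cutting them where a partition fills up.
  out.emplace_back();
  std::size_t room = perPart;
  for (Interval r : regions) {
    while (r.begin < r.end) {
      if (room == 0) {
        out.emplace_back();
        room = perPart;
      }
      const std::size_t take = std::min(room, r.end - r.begin);
      out.back().push_back(Interval{r.begin, r.begin + take});
      r.begin += take;
      room -= take;
    }
  }
  return out;
}
```

For example, 20 contiguous elements with 6 workers, a grain size of 2 and a minimum of 4 end up as 5 partitions of 4 elements each in this sketch, so one worker simply gets no vertex.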

After the split, for each set of regions that is not empty, a vertex is added and mapped to the tile.

for (const auto& regions : vertexRegions) {
  // If a region has no elements, then there is no need to add a vertex for
  // it.
  if (regions.empty()) {
    continue;
  }

  // Add codelets to tiles which work over the regions in the input
  // tensors.
  auto v = graph.addVertex(cs, poputil::templateVertex("VectorAdd", dType),
                           {{"z", xOutputFlat.slices(regions)},
                            {"x", xFlat.slices(regions)},
                            {"y", yFlat.slices(regions)}});

  // Map the vertex onto the appropriate tile.
  graph.setTileMapping(v, tile);

  // Provide a bogus cycle count estimate for the profiler.
  graph.setCycleEstimate(v, 1);
}

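Since the VectorAdd over the input data is processed all at once, the picture is that enough adders to cover every element of the input are, in effect, mapped onto the tiles.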
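
To spell out that picture, here is a host-side emulation in plain C++ of what I assume each VectorAdd vertex computes: it walks only its own regions and writes z = x + y there. Strictly speaking there is one vertex per worker's set of regions rather than one adder per element. vectorAddVertex, runComputeSet, and Interval are hypothetical names for this sketch; on the IPU the vertices of a compute set run in parallel, whereas this loop runs them one after another.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for poplar::Interval: half-open range [begin, end).
struct Interval {
  std::size_t begin;
  std::size_t end;
};

// Emulates a single "VectorAdd" vertex: add x and y over this vertex's regions
// only, leaving every other element of z untouched.
template <typename T>
void vectorAddVertex(const std::vector<Interval>& regions,
                     const std::vector<T>& x, const std::vector<T>& y,
                     std::vector<T>& z) {
  assert(x.size() == y.size() && y.size() == z.size());
  for (const Interval& r : regions) {
    assert(r.begin <= r.end && r.end <= z.size());
    for (std::size_t i = r.begin; i != r.end; ++i) {
      z[i] = x[i] + y[i];
    }
  }
}

// Emulates the whole compute set: one vertex per non-empty set of regions.
// Elements not covered by any region keep their initial value T{}.
template <typename T>
std::vector<T> runComputeSet(
    const std::vector<std::vector<Interval>>& vertexRegions,
    const std::vector<T>& x, const std::vector<T>& y) {
  std::vector<T> z(x.size(), T{});
  for (const auto& regions : vertexRegions) {
    if (regions.empty()) {
      continue;  // No vertex is created for an empty set of regions.
    }
    vectorAddVertex(regions, x, y, z);
  }
  return z;
}
```

If the regions handed to the vertices together cover the whole tensor, the result is the full elementwise sum; any element left out of every region is simply never written.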