Concepts Design Doc

This document describes core concepts of ElasticDL.

An ElasticDL model consists of two kinds of parameters:

dense parameters, and
embedding tables.

A dense parameter is a dense tensor with a name. An embedding table is a map from some ID to embedding vectors. An embedding table also has a name. Formalizing the concepts, we have

model = {dense parameter} + {embedding table}
dense parameter = tensor + name
embedding table = {id, tensor} + name

where the curly braces denote zero or more.

To update the model, workers compute and report gradients. Accordingly, we have two kinds of gradients:

dense gradient, and
embedding table gradient

The content of dense gradient is the same as that of the dense parameter. The content of embedding table gradient is the same as that of the embedding table.

On the parameter server, we’d prefer to maintain each embedding table as a map from ID to embedding vectors. With such a data structure, it is efficient to allocate memory for new embedding vectors. On the contrary, we’d concatenate embedding vectors into protobuf messages for parameter pulling and gradient pushing. We cannot use concatenated embedding vectors as the in-memory data structure on the PS, because allocating new embedding vectors involves resize the space of concatenated embedding vectors.

Let’s make a short summary, following is all the core concepts of ElasticDL include:

model = {dense parameter} + {embedding table}
dense parameter = tensor + name
embedding table = tensor + ID + name

Message Representation

There is a tensor proto message defined in TensorFlow, which meets our needs. We could reuse it directly.

We introduce an IndexedSlices proto message to represent the concatenated embedding vectors pulled from PS, and the concatenated embedding vectors of gradient waiting to be pushed to PS.

The definition of elasticdl.proto:

import "tensorflow/tensorflow/core/framework/tensor.proto"

message IndexedSlices {
  tensorflow.Tensor concat_tensors = 1;
  repeated int64 ids = 2;
}

message Model {
  int32 version = 1; // model updated times
  map<string, tensorflow.Tensor> dense_parameters = 2;
  map<string, IndexedSlices> embedding_tables = 3;
}

For in-memory part, we introduce an EmbeddingTable data structure.

type EmbeddingTable struct {
    Name            string
    Dim             int64
    Initializer     string
    EmbeddingVector map[int64]*tensorflow.Tensor
}

type Model struct {
    Version           int32
    InitStatus        bool
    DenseParameters   map[string]*tensorflow.Tensor
    EmbeddingTables   map[string]*EmbeddingTable
}

RPC Service

Following is some auxiliary messages needed by RPC services.

message PullDenseParametersRequest {
  int32 version = 1;
}

message PullDenseParametersResponse {
  bool initialized = 1;
  map<string, tensorflow.Tensor> = 2;
}

message PullEmbeddingTableRequest {
  string name = 1;
  repeated int64 indices = 2;
}

message EmbeddingTableInfo {
  string name = 1;
  int64 dim = 2;
  string initializer = 3;
}

message EmbeddingTableInfos {
  repeated EmbeddingTableInfo embedding_table_infos = 1
}

message PushGradientsResponse {
  bool accepted = 1;
  int32 version = 2;
}

Following is RPC services between PS and worker.

service Pserver {
  rpc pull_dense_parameters(PullDenseParametersRequest) returns (PullDenseParametersResponse);
  rpc pull_embedding_table(PullEmbeddingTableRequest) returns (IndexedSlices);
  rpc push_dense_paramters(Model) returns (google.protobuf.Empty);
  rpc push_embedding_table_infos(EmbeddingTableInfos) returns (google.protobuf.Empty);
  rpc push_gradients(Model) returns (PushGradientsResponse);
}