## Native GraphServer

Native GraphServer is a minimalistic binary capable of serving inference requests to SameDiff graphs via gRPC.

The idea is simple: you start GraphServer, optionally providing a path to a serialized graph, and you can immediately start sending inference requests.

You can also update graphs at runtime (e.g. if you've got a new model version, or want to do some A/B testing).

## Configuration

There's not too much to configure:

```
-p 40123       // TCP port to be used
-f filename.fb // path to a flatbuffers file with a serialized SameDiff graph
```
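
With those two options, launching the server could look like this. Note that the binary name `graph_server` is an assumption for illustration; check your build or distribution for the actual executable name.

```shell
# Hypothetical launch command; 'graph_server' stands in for the actual
# binary name, which may differ in your build.
graph_server -p 40123 -f filename.fb
```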

## gRPC endpoints

At this moment, GraphServer has 4 endpoints:

- RegisterGraph(FlatGraph)
- ReplaceGraph(FlatGraph)
- ForgetGraph(FlatDropRequest)
- InferenceRequest(FlatInferenceRequest)
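
To make the semantics of these calls concrete, here is a minimal in-memory sketch of the graph registry that the first three endpoints manipulate. This is an illustration only, not the actual GraphServer implementation: the class name, method names, and the use of `byte[]` for a serialized FlatGraph are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the registry behind the gRPC endpoints.
// Graphs are stored by numeric ID; byte[] stands in for a serialized FlatGraph.
class GraphRegistrySketch {
    private final Map<Long, byte[]> graphs = new HashMap<>();

    // RegisterGraph: add a new graph under the given ID.
    void registerGraph(long id, byte[] flatGraph) {
        if (graphs.containsKey(id))
            throw new IllegalStateException("Graph " + id + " is already registered");
        graphs.put(id, flatGraph);
    }

    // ReplaceGraph: swap the graph stored under an already-registered ID.
    void replaceGraph(long id, byte[] flatGraph) {
        if (!graphs.containsKey(id))
            throw new IllegalStateException("Graph " + id + " is not registered");
        graphs.put(id, flatGraph);
    }

    // ForgetGraph: remove a graph from serving.
    void forgetGraph(long id) {
        graphs.remove(id);
    }

    // InferenceRequest would look up the graph by ID and execute it;
    // here we only check whether the graph is available for serving.
    boolean canServe(long id) {
        return graphs.containsKey(id);
    }

    public static void main(String[] args) {
        GraphRegistrySketch registry = new GraphRegistrySketch();
        registry.registerGraph(0L, new byte[]{ /* serialized FlatGraph bytes */ });
        System.out.println("graph 0 servable: " + registry.canServe(0L));
        registry.forgetGraph(0L);
        System.out.println("graph 0 servable: " + registry.canServe(0L));
    }
}
```

Note the asymmetry between `registerGraph` and `replaceGraph`: registering an existing ID or replacing a missing one is rejected, which mirrors the advice below about using a new ID when a graph's input/output structure changes.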

#### RegisterGraph(FlatGraph)

This endpoint must be used if you want to add a graph to the serving process. A GraphServer instance can easily handle more than one graph for inference requests.

#### ReplaceGraph(FlatGraph)

This endpoint must be used if you want to update the model used for serving in a safe way. However, keep in mind that if the new graph expects a different structure of inputs/outputs, you might want to add it under a new ID instead.

#### ForgetGraph(FlatDropRequest)

This endpoint must be used if you want to remove a graph from serving for any reason.

#### InferenceRequest(FlatInferenceRequest)

This endpoint must be used for actual inference requests: you send inputs in and get outputs back. Simple as that.

## Models support

Native GraphServer is suited for serving SameDiff models via flatbuffers and gRPC. This means that anything importable into SameDiff (e.g. TensorFlow models) will work just fine with GraphServer.

We're also going to provide DL4J ComputationGraph and MultiLayerNetwork export to SameDiff, so GraphServer will also be able to serve DL4J and Keras models.

## Clients

At this moment we only provide a Java gRPC client wrapper suitable for inference. We'll add support for other languages and APIs (like a REST API) over time.

## Requirements

GraphServer relies on gRPC (provided by flatbuffers), and is supposed to work via TCP/IP, so you'll have to provide an open port.

## Docker & K8S

We provide a basic Dockerfile, which allows you to build a Docker image with GraphServer. The image is based on Ubuntu 18.04 and has a reasonably small footprint.
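
For orientation, a sketch of what such a Dockerfile might look like follows. This is not the Dockerfile shipped with GraphServer; the binary name, path, and port are assumptions carried over from the configuration example above.

```dockerfile
# Hypothetical sketch; the actual Dockerfile shipped with GraphServer may differ.
FROM ubuntu:18.04

# Assumed binary name and install location.
COPY graph_server /usr/local/bin/graph_server

# Port from the configuration example; remap as needed with `docker run -p`.
EXPOSE 40123
ENTRYPOINT ["/usr/local/bin/graph_server", "-p", "40123"]
```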

## Roadmap

We're going to provide additional functionality over time:

- JSON-based REST serving
- Clients for other languages
- Extended DL4J support: DL4J -> SameDiff model conversion, which will also enable a Keras -> DL4J -> SameDiff scenario
- Full ONNX support via SameDiff import
- RPM and DEB packages for simplified use outside of a Docker environment