Building models is just a small part of ML. A production solution requires so much more.
A production pipeline: Data ingestion -> Data validation -> Data transformation -> Model training -> Model analysis -> Production model -> Model serving.
Serving centralizes the model on a server (or on multiple servers behind a load balancer, typically cloud-based) rather than distributing it to every client.
TensorFlow Serving is part of TFX, the production-scale ML platform for TensorFlow.
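On Debian/Ubuntu (e.g. Colab), TensorFlow Serving is installed as the `tensorflow-model-server` package. A sketch of the standard APT setup from the TensorFlow Serving docs, run as notebook cells (assumes root access, as in Colab):

```
# Add the TensorFlow Serving APT repository, then install the model server.
!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \
 curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
!apt-get update && apt-get install -y tensorflow-model-server
```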
```python
import tensorflow as tf
from tensorflow import keras

# Export the trained Keras model in the SavedModel format (TF 1.x-style API).
tf.saved_model.simple_save(
    keras.backend.get_session(),
    export_path,
    inputs={'input_image': model.input},
    outputs={t.name: t for t in model.outputs}
)
```
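The `export_path` used above is not defined in these notes. TensorFlow Serving expects the SavedModel to live in a numbered version subdirectory under the model's base directory; a minimal sketch (MODEL_DIR, version, and the temp-dir choice are assumptions):

```python
import os
import tempfile

# TF Serving looks for SavedModels in numbered version subdirectories,
# e.g. <MODEL_DIR>/1, <MODEL_DIR>/2, ...
MODEL_DIR = tempfile.gettempdir()   # assumed base directory for this sketch
version = 1
export_path = os.path.join(MODEL_DIR, str(version))
print("export_path = {}".format(export_path))
```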
Inspect the SavedModel's inputs and outputs (signatures):

```
!saved_model_cli show --dir {export_path} --all
```
```python
import os

# Expose the model directory to the bash cell below.
os.environ["MODEL_DIR"] = MODEL_DIR
```
Start TensorFlow Serving in the background; `nohup` keeps it running even if the session is disconnected:

```bash
%%bash --bg
nohup tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=helloworld \
  --model_base_path="${MODEL_DIR}" > server.log 2>&1
```
(Server output, stdout and stderr, is redirected to server.log.)
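To confirm the server started correctly, you can inspect the tail of the log file created above:

```
!tail server.log
```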
The request payload is JSON: serialize the input data into the tensor-like list shape the model expects.
```python
import json
import numpy as np

# Two input examples, each a single value, matching the model's input shape.
xs = np.array([[9.0], [10.0]])
data = json.dumps({
    "signature_name": "serving_default",
    "instances": xs.tolist(),
})
print(data)
```
```
!pip install -q requests
```

Send the data to the server's REST endpoint and parse the response:

```python
import requests

headers = {"content-type": "application/json"}
json_response = requests.post(
    "http://localhost:8501/v1/models/helloworld:predict",
    data=data,
    headers=headers
)
print(json_response.text)
predictions = json.loads(json_response.text)["predictions"]
```
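For a classifier, each entry in `predictions` is a list of per-class probabilities. A minimal sketch (assuming a multi-class model, as in question 10 below) that reads off the most likely class for each input item:

```python
import numpy as np

# Each prediction is a list of per-class probabilities; argmax picks the
# most likely class for each input item.
for i, p in enumerate(predictions):
    print("item {}: class {} (probability {:.4f})".format(i, int(np.argmax(p)), max(p)))
```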
Question | Answer |
---|---|
1. What’s the name of the package you install to get TensorFlow Serving? | tensorflow-model-server |
2. What Unix command is used to start TensorFlow Serving so that it keeps running even if the session is disconnected? | nohup |
3. What’s the name of the production-scale ML platform for TensorFlow? | TFX |
4. What advantages do you get by running inference on a server instead of distributing the model to all your clients? | All of the above |
5. How do you prepare your model for serving? | Save it as a TensorFlow SavedModel, then deploy it to the server |
6. If you want to inspect the inputs and outputs of your model, what command do you use? | saved_model_cli |
7. If you want to start the model server on port 8501, what parameter do you use? | --rest_api_port |
8. If you want to pass a list of values (e.g. 8, 9, 10) to the server and have it perform inference on them, what’s the correct syntax for this data? | [[8], [9], [10]] |
9. If you publish V1 of a model called ‘helloworld’ and run it with a REST API on port 8501, what’s the URL of the endpoint used to run inference on localhost? | http://localhost:8501/v1/models/helloworld:predict |
10. After running inference using a model hosted on TF Serving, the following is returned. What data was sent to the model, and what do these return values mean? [ [5.77123615e-07, 2.66907847e-08, 4.7217938e-08, 1.97792871e-09, 5.31984341e-08, 0.00734644197, 3.1462946e-07, 0.0439051725, 0.000500570168, 0.948246837], [0.00227244, 6.12080342e-09, 0.967876315, 3.0579281e-06, 0.0183339939, 3.18483538e-11, 0.011510049, 1.38639566e-14, 4.19033222e-06, 4.40264526e-11], [1.45221502e-05, 0.999841571, 3.96758715e-08, 0.000131023204, 1.22008023e-05, 1.18227668e-08, 5.97860179e-08, 1.31281848e-08, 5.49047854e-07, 2.97885189e-10] ] | Three items were passed to a model that recognizes 10 classes, and it returned the probability of each item belonging to each class |