I want to know whether online inference has to be streaming inference. Does model.predict() have to be called once per data item and return a string for it? How can I modify this to receive all the data first and then run batch inference in one pass?
How to call the API during online inference?
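For reference, here is a minimal sketch of the per-example setup I mean. The class and method names are only illustrative assumptions, not the competition's actual starter-kit code:

```python
class BaseModel:
    # Hypothetical per-example contract: predict() is called once for each
    # test item and must return a single string (streaming inference).
    def predict(self, example: dict) -> str:
        raise NotImplementedError


class MyModel(BaseModel):
    def predict(self, example: dict) -> str:
        # Each call sees exactly one example; there is no hook that exposes
        # the full test set for a single batched forward pass.
        prompt = example.get("input", "")
        return f"answer for: {prompt}"


if __name__ == "__main__":
    # Rough illustration of how a streaming evaluator might drive the model:
    # one predict() call per row, outputs collected one string at a time.
    test_data = [{"input": "q1"}, {"input": "q2"}]
    model = MyModel()
    outputs = [model.predict(row) for row in test_data]
    print(outputs)
```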
At present we only support streaming inference. We are looking into the possibility of supporting batch inference and will let you know soon.