How do I call the API during online inference?

I want to know whether online inference has to be streaming inference. Does each piece of data have to be passed to model.predict() individually and return a string? How can I modify this so that all the data is received first and then inferred in a single batch?

At present we only support streaming inference. We are looking into the possibility of supporting batch inference and will let you know once it is available.
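
In the meantime, a common workaround is to do the batching on the caller's side: collect all incoming records first, then loop over them and call model.predict() once per record. The sketch below is a minimal example under the assumptions from the question (a Python client, and a model object whose predict() takes one record and returns a string); the _StubModel class and record values are placeholders, not part of the real API.

```python
# Client-side batching over a streaming-only predict API (sketch).
# Assumption: model.predict() accepts a single record and returns a string,
# as described in the question above.

def predict_batch(model, records):
    """Gather all records first, then run streaming inference one by one."""
    results = []
    for record in records:
        # Each call still goes through the streaming path; only the
        # bookkeeping is batched on the caller's side.
        results.append(model.predict(record))
    return results


class _StubModel:
    """Placeholder standing in for the real model; replace with your loaded model."""
    def predict(self, record):
        return f"prediction for {record}"


if __name__ == "__main__":
    model = _StubModel()
    records = ["sample-1", "sample-2", "sample-3"]  # hypothetical inputs
    print(predict_batch(model, records))
```

This does not change the per-record cost of streaming inference; it only lets the caller hand over one collection and get back one list of results.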