I want to know whether online inference has to be streaming inference. Does model.predict() have to be called once per data item and return a string for it? How can I modify this to receive all the data first and then run batch inference in one pass?
How to call the API during online inference?
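For reference, here is a minimal sketch of the per-example setup I mean. The class and method names are only illustrative assumptions, not the competition's actual starter-kit code:

```python
class BaseModel:
    # Hypothetical per-example contract: predict() is called once for each
    # test item and must return a single string (streaming inference).
    def predict(self, example: dict) -> str:
        raise NotImplementedError


class MyModel(BaseModel):
    def predict(self, example: dict) -> str:
        # Each call sees exactly one example; there is no hook that exposes
        # the full test set for a single batched forward pass.
        prompt = example.get("input", "")
        return f"answer for: {prompt}"


if __name__ == "__main__":
    # Rough illustration of how a streaming evaluator might drive the model:
    # one predict() call per row, outputs collected one string at a time.
    test_data = [{"input": "q1"}, {"input": "q2"}]
    model = MyModel()
    outputs = [model.predict(row) for row in test_data]
    print(outputs)
```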
At present we only support streaming inference. We are looking into the possibility of supporting batch inference and will let you know soon.