Did somebody try deploying three llama3-8b models on four T4s?

Did somebody try deploying three llama3-8b models on four T4s? I wonder whether it's possible to run three llama3-8b models at the same time on that hardware.

How large is each llama3-8b model? If each one is about 16 GB, it should be possible.
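For a rough back-of-the-envelope estimate (assuming FP16/BF16 weights at 2 bytes per parameter, and ignoring KV cache and runtime overhead), the weights alone come out to roughly 16 GB:

```python
# Rough estimate of llama3-8b weight memory.
# Assumes ~8.03B parameters and FP16/BF16 weights (2 bytes per parameter);
# KV cache and framework overhead are not included, so real usage is higher.
num_params = 8.03e9      # approximate parameter count of llama3-8b
bytes_per_param = 2      # FP16 / BF16
weight_gb = num_params * bytes_per_param / 1e9
print(f"Estimated weight memory: {weight_gb:.1f} GB")   # ~16.1 GB
```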

I tested it on my machine, and each model takes about 20 GB, so I wonder whether it can be deployed online. :face_with_raised_eyebrow:

There are 4 T4 cards, but the total GPU memory I see available is < 60 GB.
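A quick sanity check with the numbers from this thread (three ~20 GB deployments vs. four 16 GB T4s, ignoring fragmentation and per-process overhead, and not specific to any serving stack):

```python
# Feasibility check using the figures mentioned above (assumptions, not
# measurements of any particular framework).
per_model_gb = 20.0   # measured footprint of one llama3-8b deployment
num_models = 3
t4_mem_gb = 16.0      # nominal memory of one T4; usable is slightly less
num_gpus = 4

total_needed = per_model_gb * num_models     # 60 GB
total_available = t4_mem_gb * num_gpus       # 64 GB nominal

print(f"needed: {total_needed} GB, available: {total_available} GB")
# The aggregate barely fits on paper, but a single ~20 GB instance does not
# fit on one 16 GB T4, so each model would have to be split across cards or
# shrunk (smaller KV cache, quantization) to respect the per-GPU limit.
```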