Has anybody tried deploying three llama3-8b models on four T4s? I wonder whether it's possible to run three llama3-8b models at the same time.
How large is each llama3-8b model? If each one is about 16GB, it should be possible.
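For reference, a rough back-of-envelope for the weight footprint, assuming roughly 8B parameters stored in half precision (FP16/BF16, 2 bytes per parameter); the exact number depends on the checkpoint and any quantization, and it excludes KV cache and runtime overhead:

```python
# Rough weight-memory estimate for llama3-8b in half precision (FP16/BF16).
# Assumption: ~8e9 parameters at 2 bytes each; real checkpoints differ slightly.
params = 8.0e9           # approximate parameter count for llama3-8b
bytes_per_param = 2      # FP16 / BF16
weights_gib = params * bytes_per_param / (1024 ** 3)
print(f"weights only: ~{weights_gib:.1f} GiB")  # ~15 GiB, before KV cache / activations
```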
I tested it on my machine, and each model takes about 20GB of GPU memory, so I wonder whether it can be deployed online.
There are 4 T4 cards, but I see that the total GPU memory is < 60GB.
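A quick way to check what is actually available on each card before deciding placement; this is just a sketch using PyTorch's device queries, not tied to any particular serving framework:

```python
import torch

# Report free/total memory on each visible GPU.
# A T4 is nominally 16GB, but the driver and other processes reserve some of it,
# so the usable total across 4 cards can come in under the nominal 64GB.
for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    print(f"GPU {i}: {free / 1024**3:.1f} GiB free / {total / 1024**3:.1f} GiB total")
```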