You know, that is probably enough to run the smaller Granite models purely in VRAM :) The 350M will fit for sure, and the 1B should as well, though it might need to be quantised, depending on how many tokens you're feeding it.
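If you want to sanity-check that yourself, here's a rough back-of-envelope sketch in Python. It only counts the weights plus a plain attention KV cache, so treat the numbers as a lower bound. The function names are mine, and the layer/head values you pass to `kv_cache_gib` are something you'd have to look up in the model's config; I'm not quoting Granite's real architecture here, and hybrid models may cache less than this formula suggests.

```python
def weight_gib(n_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


def kv_cache_gib(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Approximate KV cache size in GiB for a standard attention stack.

    The factor of 2 covers keys and values. All architecture arguments
    are placeholders you need to fill in from the model's config.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * n_tokens / 2**30


if __name__ == "__main__":
    # Nominal parameter counts taken from the model names (350M, 1B).
    for name, n_params in [("350M", 350e6), ("1B", 1e9)]:
        for bits in (16, 8, 4):
            print(f"{name} at {bits}-bit: ~{weight_gib(n_params, bits):.2f} GiB of weights")
```

The weights are a fixed cost, while the KV cache grows linearly with the number of tokens, which is why a long context can push an otherwise comfortable 1B model towards needing quantisation.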
Which operating system? Do you have CUDA running on it?