I'm currently folding, thanks to Knish's great guide, on a temporary (free-credit) Azure VM instance with a Tesla V100 PCIe 16GB GPU. This GPU works great and has gone as high as 6.5 million PPD on projects like 13434, 13437, 13438 and 17316, with an average above 5 million PPD.
However, it has just folded a few WUs from project 16928 where the PPD was circa 1.6 million. I checked, and GPU utilisation was around 80%, with a power draw of 100W out of 250W (vs 200W on other projects). In short, I have the impression this GPU is largely underused by such WUs.
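For anyone who wants to take the same readings, here is a minimal sketch of one way to do it. It assumes `nvidia-smi` is on the PATH and that every queried field comes back as a plain number; the function names are made up for illustration and this is not necessarily how the figures above were obtained.

```python
import subprocess

# nvidia-smi query fields: utilisation (%), current draw (W), power limit (W).
QUERY_FIELDS = "utilization.gpu,power.draw,power.limit"


def parse_smi_line(line):
    """Parse one CSV line such as '80, 100.00, 250.00' into a dict.

    Assumes all three fields are numeric; a field reported as '[N/A]'
    would raise ValueError here.
    """
    util, draw, limit = (float(x.strip()) for x in line.split(","))
    return {
        "utilization_pct": util,
        "power_draw_w": draw,
        "power_limit_w": limit,
        "power_fraction": draw / limit,  # share of the power limit in use
    }


def read_gpu_stats():
    """Run nvidia-smi once and return one dict per GPU (hypothetical helper)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_smi_line(l) for l in out.splitlines() if l.strip()]
```

Calling `read_gpu_stats()` while a WU is running gives a quick view of how much of the power budget is being used. With the figures quoted above, `parse_smi_line("80, 100.00, 250.00")["power_fraction"]` comes out at 0.4.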
I had a look at GPUs.txt, and this GPU is listed as follows:
"0x10de:0x1db4:2:7:GV100GL [Tesla V100 PCIe 16GB] M 14028"
I understand the "7" is the GPU's "category" and that it affects the assignment process. I also noticed that my modest personal RTX 3060 Ti is classified as "8", although it averages only 3 million PPD on equivalent WUs.
So I wonder whether this GPU is properly identified, or whether it should be "upgraded" to "8" so that, provided there is no shortage of complex WUs, it is not assigned WUs that fail to make full use of its computing power.
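To make the GPUs.txt entry easier to read, here is a rough sketch that splits the line into its colon-separated fields. Only the meaning of the fourth field (the "category") comes from my reading above; `0x10de` is NVIDIA's PCI vendor ID, and the names `device_id` and especially `gpu_type` are my own guesses at what the other fields mean, not an official description of the file format.

```python
from typing import NamedTuple


class GpuEntry(NamedTuple):
    vendor_id: int    # PCI vendor ID (0x10de is NVIDIA)
    device_id: int    # PCI device ID (assumed meaning)
    gpu_type: int     # third field; the name is an assumption
    species: int      # the "category" discussed above
    description: str  # free text after the fourth colon


def parse_gpus_txt_line(line):
    """Split one GPUs.txt-style line into its five fields."""
    # Split on the first four colons only, so the description stays whole.
    vendor, device, gpu_type, species, description = (
        line.strip().strip('"').split(":", 4)
    )
    return GpuEntry(int(vendor, 16), int(device, 16),
                    int(gpu_type), int(species), description)


entry = parse_gpus_txt_line(
    "0x10de:0x1db4:2:7:GV100GL [Tesla V100 PCIe 16GB] M 14028"
)
print(entry.species, entry.description)
```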
I hope it makes sense.
Happy folding, all!
Tesla V100 classification
Moderators: Site Moderators, FAHC Science Team
Nvidia RTX 3060 Ti & GTX 1660 Super - AMD Ryzen 7 5800X - MSI MEG X570 Unify - 16 GB RAM - Ubuntu 20.04.2 LTS - Nvidia drivers 460.56
- Posts: 2522
- Joined: Mon Feb 16, 2009 4:12 am
- Location: Greenwood MS USA
Re: Tesla V100 classification
7 is Volta, like your card; Ampere is 8.
They are numbered by features, not performance.
https://en.wikipedia.org/wiki/Volta_(microarchitecture)
https://en.wikipedia.org/wiki/Ampere_(m ... hitecture)
The wiki on Ampere lists the differences.
I have an older Pascal card that outperforms my newer Turing cards, so newer is not always better.
Last edited by JimboPalmer on Sun Jan 03, 2021 12:40 pm, edited 2 times in total.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
Re: Tesla V100 classification
Thanks for the reply. I hadn't found any recent explanation of this classification, so once again I have learnt something.
Nvidia RTX 3060 Ti & GTX 1660 Super - AMD Ryzen 7 5800X - MSI MEG X570 Unify - 16 GB RAM - Ubuntu 20.04.2 LTS - Nvidia drivers 460.56
Re: Tesla V100 classification
Two more facts:
* Productivity is NOT linear. Take two projects and two GPUs, benchmark the 4 combinations, and the productivity figures don't quite make sense. On one project, one GPU may be much more productive than the other, while on another project the difference may be small. There are many factors, but the most obvious is that a protein with a small number of atoms performs comparatively poorly on a GPU with a large number of shaders. Nothing can really be done about that except to assign big proteins to GPUs with larger numbers of shaders ... but with the variations in the projects which happen to be active, that's not always possible.
* There's an ongoing project to revamp the GPU Species structure. GPUs are being benchmarked and the plan is to restructure everything based on performance instead of GPU generation. It's a complex project and there's no predicting when it will alter production assignments. Given the facts in the previous paragraph, it's a nearly impossible task.
Posting FAH's log:
How to provide enough info to get helpful support.