
Error was:
ERROR:exception: Error downloading array energyBuffer: clEnqueueReadBuffer (-36)
and I had to reboot to get it folding again!
A question: if (like this one) I get a "bad WU" that is probably my own fault, should I still report it?
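For anyone trying to read the number in that message: in the Khronos OpenCL header, status -36 is CL_INVALID_COMMAND_QUEUE. The sketch below pulls the call name and code out of a log line like the one above and looks the code up. The helper name `decode_opencl_error` and the regular expression are illustrative assumptions, not part of the FAH client, and the table is only a small subset of the OpenCL status codes.

```python
import re

# Small subset of status codes from the Khronos OpenCL header (cl.h).
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -36: "CL_INVALID_COMMAND_QUEUE",
    -38: "CL_INVALID_MEM_OBJECT",
}


def decode_opencl_error(log_line):
    """Hypothetical helper: return (call, code, name) or None if no match.

    Assumes the log line ends in the form 'clSomeCall (-NN)', as in the
    error quoted above.
    """
    match = re.search(r"(cl\w+)\s*\((-?\d+)\)", log_line)
    if match is None:
        return None
    call = match.group(1)
    code = int(match.group(2))
    return call, code, OPENCL_ERRORS.get(code, "unknown")


line = ("ERROR:exception: Error downloading array energyBuffer: "
        "clEnqueueReadBuffer (-36)")
print(decode_opencl_error(line))
```

The name alone does not say why the command queue became invalid; it only says the read-back failed at that point.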
bruce wrote: In this case, there's only one error report and it's from "lbford"
Yup, that's me- I don't fold using the same name as I post. No ulterior motive, it just happened that way.
bruce wrote: Stanford has done stability testing on consumer grade GPUs and has determined they're pretty good but not 100%
So I'm discovering
bruce wrote: Other DC projects routinely reassign every WU and require multiple completions of the same analysis to match. I guess part of that depends on whether you have more work to be assigned than Donors or you have more Donors than it takes to do all the work twice (plus).
I didn't know that. And, presumably, that's why the client keeps "sanity checking" the results as it goes, so it can (mostly!) back up to the last known good state… it's a more efficient paradigm in my opinion.
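As a rough illustration of the "sanity check and back up to the last known good state" idea (not the actual FAH client or core code, whose internals are not described in this thread), here is a minimal sketch. The function name, the finite-number check, and the retry limit are all assumptions made for the example.

```python
import math


def run_with_checkpoints(step_fn, state, n_steps,
                         checkpoint_every=10, max_retries=3):
    """Advance `state` for n_steps, rolling back on a failed sanity check.

    step_fn(step, state) returns the next state (a float here). A result
    that is NaN or infinite counts as a failed check (an assumed criterion).
    """
    checkpoint = (0, state)   # last known good (step, state)
    step = 0
    retries = 0
    while step < n_steps:
        candidate = step_fn(step, state)
        if not math.isfinite(candidate):
            retries += 1
            if retries > max_retries:
                # Give up: the equivalent of reporting a bad work unit.
                raise RuntimeError("too many failures since last checkpoint")
            step, state = checkpoint   # back up to the last good state
            continue
        state = candidate
        step += 1
        if step % checkpoint_every == 0:
            checkpoint = (step, state)
            retries = 0
    return state


# A step that glitches once at step 7, like a transient GPU error.
failures = {"left": 1}


def flaky_step(step, x):
    if step == 7 and failures["left"] > 0:
        failures["left"] -= 1
        return float("nan")
    return x + 1.0


result = run_with_checkpoints(flaky_step, 0.0, 20, checkpoint_every=5)
print(result)
```

A one-off glitch only costs the work done since the last checkpoint, whereas a fault that repeats at the same point exhausts the retries and the run is abandoned.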
7im wrote: Please tell us you are not stability testing on live data!
If I were testing I'd be running a lot more over-clock to find out where it broke at normal room temperatures… now I know to keep an eye on the room thermometer (and the weather forecast!) and back off a bit at ambients above about 26ºC.
bruce wrote: It's currently 95F / 35C in this room and I'm not comfortable either, but my GPUs are happy.
P5-133XL wrote: The past couple of days, outside has peaked in the low 90's F. It is extremely uncomfortable inside. I still do not have any problems keeping the GPU's less than 80C which is my temp goal.
I'd guess that a) those conditions aren't particularly unusual where you live and b) you have other priorities than pandering to your GPUs (like earning a living!), so they're set up to cope with anything but the most extreme conditions they're likely to encounter.
billford wrote: I noticed that an assortment of files were uploaded, including a log file, so I assumed an error report would be in there somewhere so the WU could be re-issued...
AFAIK, only a single file is uploaded upon WU completion/failure. IIRC, it would be WUresults.dat (or something similar).