Search found 206 matches
- Fri Nov 08, 2024 12:40 am
- Forum: Issues with a specific WU
- Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
- Replies: 14
- Views: 276
Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Hmmm... I don't really watch utilization numbers. I think that may have more to do with the number of atoms in a project - lower atom counts won't keep all CUDA cores busy.
- Thu Nov 07, 2024 7:13 pm
- Forum: Issues with a specific WU
- Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
- Replies: 14
- Views: 276
Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Hey Nicolas, if you frequently see 'attempting to restart from last good checkpoint' and then see the job continue, that may indicate the rig needs some maintenance (e.g. cleaning), or a possible hardware issue. I saw those messages now and then on another rig than my main one, but after a folding pa...
- Thu Nov 07, 2024 5:10 pm
- Forum: Issues with a specific WU
- Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
- Replies: 14
- Views: 276
Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Oh wow... this is a coincidence. A similar issue with 16780/17/0/107, and I'm not the first who encountered this either: https://apps.foldingathome.org/wu#project=16780&run=17&clone=0&gen=107. I literally ran thousands of jobs on this machine since the last time jobs blew up (and that was n...
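The job identifiers in this thread follow a project/run/clone/gen pattern, and the link quoted above carries the same four values as URL fragment parameters. A minimal sketch of building such a link, assuming that pattern holds for other jobs as well; the helper name `wu_url` is hypothetical and not part of any F@h tool:

```python
from urllib.parse import urlencode

# Base address taken from the link quoted in the post above.
WU_BASE = "https://apps.foldingathome.org/wu"


def wu_url(prcg: str) -> str:
    """Turn 'project/run/clone/gen' into a work-unit status link."""
    parts = prcg.strip().split("/")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected project/run/clone/gen, got {prcg!r}")
    keys = ("project", "run", "clone", "gen")
    # The four values go into the URL fragment as key=value pairs.
    return f"{WU_BASE}#{urlencode(dict(zip(keys, parts)))}"


print(wu_url("16780/17/0/107"))
```

For the job mentioned in the post this reproduces the quoted link, which makes it easy to check whether other users reported the same failure.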
- Thu Nov 07, 2024 5:07 pm
- Forum: Issues with a specific WU
- Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
- Replies: 14
- Views: 276
Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)
See below the full log for this particular job. It appears the potential energy is off at the starting point for this job.
12:30:27:WU00:FS01:Connecting to assign1.foldingathome.org:80
12:30:27:WU00:FS01:Assigned to work server 158.130.118.23
12:30:27:WU00:FS01:Requesting new work unit for slot 01: ...
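When comparing logs like this across machines, it can help to split each line into its parts. A rough sketch, assuming every line follows the `HH:MM:SS:WUnn:FSnn:message` layout seen in the excerpt above; the regular expression and the function name `split_log_line` are illustrative, not taken from the F@h client:

```python
import re

# Assumed layout, inferred from the quoted lines: HH:MM:SS:WUnn:FSnn:message
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):WU(\d+):FS(\d+):(.*)$")


def split_log_line(line: str):
    """Return (time, work-unit index, slot index, message), or None if no match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    time, wu, fs, message = m.groups()
    return time, int(wu), int(fs), message


print(split_log_line("12:30:27:WU00:FS01:Assigned to work server 158.130.118.23"))
```

Lines that do not carry the WU/FS prefix return `None`, so they can be skipped or handled separately.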
- Thu Nov 07, 2024 2:56 pm
- Forum: Issues with a specific WU
- Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
- Replies: 14
- Views: 276
Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Hola,
So far I've done 120 jobs from P18237 successfully, this particular one is the first one failing.
I don't know the science behind the cores and the jobs. If this project fails more often on your machine, while other projects all run fine, it makes me wonder what it's doing that's so special...
- Tue Nov 05, 2024 7:56 pm
- Forum: Issues with a specific WU
- Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
- Replies: 14
- Views: 276
Corrupted / bad job 18237/1069/0/71 (failing for all users)
Hi,
Job https://apps.foldingathome.org/wu#proje ... e=0&gen=71 is failing all the time on different systems, please pull it
- Mon Nov 04, 2024 3:28 am
- Forum: Discussions of General-FAH topics
- Topic: Effects of Running FAH on Laptop w/SSD
- Replies: 1
- Views: 120
Re: Effects of Running FAH on Laptop w/SSD
Technically, an SSD wears by write actions. Fah writes a bit of logging and job/checkpoint/result data, but not excessively so. Personally I don't worry about it. When running Fah on a laptop I'd worry more about cooling, and with an older laptop if it's fairly clean inside (too much dust will cause ...
- Tue Oct 08, 2024 3:36 pm
- Forum: GPU Projects and FahCores
- Topic: Failing to give GPU a job. May be problem with linux.
- Replies: 5
- Views: 5246
Re: Failing to give GPU a job. May be problem with linux.
If it's not related to patching, but to (re)boots, the next question will be: what version of F@h are you running? Can you post the top 100 or 200 lines of the log (also see viewtopic.php?t=26036)?
- Tue Oct 08, 2024 12:17 am
- Forum: V8.3.xx Open Beta
- Topic: Multiple powerful GPUs in a single rig?
- Replies: 3
- Views: 3876
Re: Multiple powerful GPUs in a single rig?
Each card will work on its own unit ~ having them work together on the same unit would be both more difficult and really inefficient. Each step in the process needs the entire status of the previous, so the GPU cards would spend more time talking with each other than actually calculating if they'd ...
- Tue Oct 08, 2024 12:11 am
- Forum: Issues with a specific server
- Topic: Work server: 128.104.69.82, Collection server: 128.174.73.74
- Replies: 15
- Views: 13260
Re: Work server: 128.104.69.82, Collection server: 128.174.73.74
All credits are in now. Many thanks, guys!
- Sat Oct 05, 2024 6:31 pm
- Forum: V8.3.xx Open Beta
- Topic: Upload Of Work Units
- Replies: 2
- Views: 3240
Re: Upload Of Work Units
That points to a server issue. Can you look up in the log to which server it tried to upload? I can't help with alerting an admin/researcher to check the host, maybe someone else can.
- Sat Oct 05, 2024 4:52 am
- Forum: GPU Projects and FahCores
- Topic: Failing to give GPU a job. May be problem with linux.
- Replies: 5
- Views: 5246
Re: Failing to give GPU a job. May be problem with linux.
On Debian based Linux (including Ubuntu and Pop) first install dkms, then (re)install the Nvidia driver and have it register with dkms. After that, when the kernel is updated, the driver is automatically recompiled for that kernel. dkms is a standard package, so you can install that with apt-get ins...
- Fri Oct 04, 2024 9:31 am
- Forum: Issues with a specific server
- Topic: Work server: 128.104.69.82, Collection server: 128.174.73.74
- Replies: 15
- Views: 13260
Re: Work server: 128.104.69.82, Collection server: 128.174.73.74
The job results are most likely logged, just not collected by the central server. It'll be a matter of restoring that connection, and then the points will be assigned. It has happened before. Someone just has to notify the people involved... As I understand there is monitoring, but different parties...
- Thu Oct 03, 2024 7:28 pm
- Forum: Issues with a specific server
- Topic: Work server: 128.104.69.82, Collection server: 128.174.73.74
- Replies: 15
- Views: 13260
Work server: 128.104.69.82, Collection server: 128.174.73.74
Hiya,
It looks like job results for these servers aren't picked up:
Work server: 128.104.69.82
Collection server: 128.174.73.74
In the logs I see:
14:51:33:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:16770 run:35 clone:0 gen:55 core:0x23 unit:0x370000000000000023000000824...
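To report which jobs are stuck, the project/run/clone/gen values can be pulled out of a "Sending unit results" line like the one above. A small sketch, assuming the fields are space-separated `key:value` pairs as in the quoted line; the sample string stops at `core:0x23` because the rest of the original line is truncated, and the function name `unit_fields` is hypothetical:

```python
import re

# key:value pairs separated by spaces, as in the quoted "Sending unit results" line
FIELD_RE = re.compile(r"(\w+):(\S+)")


def unit_fields(line: str) -> dict:
    """Extract the key:value fields that follow 'Sending unit results:'."""
    _, sep, tail = line.partition("Sending unit results:")
    if not sep:
        return {}  # not a "Sending unit results" line
    return dict(FIELD_RE.findall(tail))


sample = ("14:51:33:WU01:FS01:Sending unit results: id:01 state:SEND "
          "error:NO_ERROR project:16770 run:35 clone:0 gen:55 core:0x23")
fields = unit_fields(sample)
print("/".join(fields[k] for k in ("project", "run", "clone", "gen")))
```

The joined value uses the same project/run/clone/gen form that job identifiers take elsewhere on the forum.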
- Thu Oct 03, 2024 3:44 pm
- Forum: The Science of FAH -- questions/answers
- Topic: I Have a questions
- Replies: 1
- Views: 3425
Re: I Have a questions
F@h uses what it can of the GPU's capabilities, and what it needs with regard to RAM. To give a couple of examples:
- A modern GPU offers logic / tools for gaming (like raytracing) that can't be used for the calculations F@h is doing. On Nvidia, F@h can only use the CUDA cores.
- A GPU may offer more ...