
WS 171.64.65.124 server didn't like [Continuing Issues]

Posted: Fri Jul 24, 2015 2:16 pm
by ChristianVirtual
What are the reasons the server back at PG doesn't like a returned WU? I freshly installed an EC2 instance and folded the mentioned WU with my regular passkey. 100% per the log, no issues, but the server doesn't like it ... A very emotional server PG has :roll:

But seriously: what could have been wrong with it?

Code:

*********************** Log Started 2015-07-24T12:49:07Z ***********************
12:49:07:************************* Folding@home Client *************************
12:49:07: Website: http://folding.stanford.edu/
12:49:07: Copyright: (c) 2009-2014 Stanford University
12:49:07: Author: Joseph Coffland <Joseph@cauldrondevelopment.com>
12:49:07: Args: --child --lifeline 1840 /etc/fahclient/config.xml --run-as
12:49:07: fahclient --pid-file=/var/run/fahclient.pid --daemon
12:49:07: Config: /etc/fahclient/config.xml
12:49:07:******************************** Build ********************************
12:49:07: Version: 7.4.4
12:49:07: Date: Mar 4 2014
12:49:07: Time: 12:02:38
12:49:07: SVN Rev: 4130
12:49:07: Branch: fah/trunk/client
12:49:07: Compiler: GNU 4.4.7
12:49:07: Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
12:49:07: -fno-unsafe-math-optimizations -msse2
12:49:07: Platform: linux2 3.2.0-1-amd64
12:49:07: Bits: 64
12:49:07: Mode: Release
12:49:07:******************************* System ********************************
12:49:07: CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
12:49:07: CPU ID: GenuineIntel Family 6 Model 62 Stepping 4
12:49:07: CPUs: 16
12:49:07: Memory: 29.44GiB
12:49:07:Free Memory: 28.96GiB
12:49:07: Threads: POSIX_THREADS
12:49:07: OS Version: 3.13
12:49:07:Has Battery: false
12:49:07: On Battery: false
12:49:07: UTC Offset: 0
12:49:07: PID: 1842
12:49:07: CWD: /var/lib/fahclient
12:49:07: OS: Linux 3.13.0-48-generic x86_64
12:49:07: OS Arch: AMD64
12:49:07: GPUs: 0
12:49:07: CUDA: Not detected
12:49:07:***********************************************************************
12:49:07:<config>
12:49:07: <!-- Folding Slot Configuration -->
12:49:07: <client-type v='beta'/>
12:49:07: <gpu v='false'/>
12:49:07:
12:49:07:
12:49:07: <!-- Slot Control -->
12:49:07: <power v='full'/>
12:49:07:
12:49:07: <!-- User Information -->
12:49:07: <!-- Folding Slots -->
12:49:07: <slot id='0' type='CPU'/>
12:49:07:</config>
12:49:07:Switching to user fahclient
12:49:07:Trying to access database...
12:49:13:Successfully acquired database lock
12:49:13:Enabled folding slot 00: READY cpu:16
12:49:13:WU00:FS00:Starting
12:49:13:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/beta/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 704 -lifeline 1842 -checkpoint 15 -np 16
12:49:13:WU00:FS00:Started FahCore on PID 1853
12:49:13:WU00:FS00:Core PID:1857
12:49:13:WU00:FS00:FahCore 0xa4 started
12:49:13:WU00:FS00:0xa4:
12:49:13:WU00:FS00:0xa4:*------------------------------*
12:49:13:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
12:49:13:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
12:49:13:WU00:FS00:0xa4:
12:49:13:WU00:FS00:0xa4:Preparing to commence simulation
12:49:13:WU00:FS00:0xa4:- Looking at optimizations...
12:49:13:WU00:FS00:0xa4:- Files status OK
12:49:13:WU00:FS00:0xa4:- Expanded 923022 -> 1534204 (decompressed 166.2 percent)
12:49:13:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=923022 data_size=1534204, decompressed_data_size=1534204 diff=0
12:49:13:WU00:FS00:0xa4:- Digital signature verified
12:49:13:WU00:FS00:0xa4:
12:49:13:WU00:FS00:0xa4:Project: 9014 (Run 84, Clone 2, Gen 92)
12:49:13:WU00:FS00:0xa4:
12:49:13:WU00:FS00:0xa4:Assembly optimizations on if available.
12:49:13:WU00:FS00:0xa4:Entering M.D.
12:49:19:WU00:FS00:0xa4:Completed 0 out of 250000 steps (0%)
12:50:01:WU00:FS00:0xa4:Completed 2500 out of 250000 steps (1%)
12:50:43:WU00:FS00:0xa4:Completed 5000 out of 250000 steps (2%)
12:50:44:WARNING:Command server access denied
12:51:24:WU00:FS00:0xa4:Completed 7500 out of 250000 steps (3%)

14:00:09:WU00:FS00:0xa4:Completed 242500 out of 250000 steps (97%)
14:00:53:WU00:FS00:0xa4:Completed 245000 out of 250000 steps (98%)
14:01:38:WU00:FS00:0xa4:Completed 247500 out of 250000 steps (99%)
14:01:39:WU01:FS00:Connecting to 171.67.108.200:8080
14:02:23:WU00:FS00:0xa4:Completed 250000 out of 250000 steps (100%)
14:02:23:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
14:02:33:WU00:FS00:0xa4:
14:02:33:WU00:FS00:0xa4:Finished Work Unit:
14:02:33:WU00:FS00:0xa4:- Reading up to 915792 from "00/wudata_01.trr": Read 915792
14:02:33:WU00:FS00:0xa4:trr file hash check passed.
14:02:33:WU00:FS00:0xa4:- Reading up to 838976 from "00/wudata_01.xtc": Read 838976
14:02:33:WU00:FS00:0xa4:xtc file hash check passed.
14:02:33:WU00:FS00:0xa4:edr file hash check passed.
14:02:33:WU00:FS00:0xa4:logfile size: 23123
14:02:33:WU00:FS00:0xa4:Leaving Run
14:02:35:WU00:FS00:0xa4:- Writing 1780379 bytes of core data to disk...
14:02:35:WU00:FS00:0xa4:Done: 1779867 -> 1721350 (compressed to 96.7 percent)
14:02:35:WU00:FS00:0xa4: ... Done.
14:02:36:WU00:FS00:0xa4:- Shutting down core
14:02:36:WU00:FS00:0xa4:
14:02:36:WU00:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
14:02:37:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
14:02:37:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:9014 run:84 clone:2 gen:92 core:0xa4 unit:0x0000006cab40417c554e991fd660e145
14:02:37:WU00:FS00:Uploading 1.64MiB to 171.64.65.124
14:02:37:WU00:FS00:Connecting to 171.64.65.124:8080
14:02:37:WU00:FS00:Upload complete
14:02:37:WU00:FS00:Server responded WORK_QUIT (404)
14:02:37:WARNING:WU00:FS00:Server did not like results, dumping
14:02:37:WU00:FS00:Cleaning up
Found two more on my CPU:3
donation 00 cpu:3 0xa4 9007 226 1 95
donation 00 cpu:3 0xa4 9011 101 4 99

GPU slots seem OK ...

Re: 9014 (R 84, C 2, G 92) and others, server didn't like

Posted: Fri Jul 24, 2015 2:37 pm
by 7im
One example: the servers do a CRC-like data check on work units when they are returned. If the results are not within expected parameters, the WU is dumped.
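
As a rough illustration of what a "CRC-like" check means, here is a minimal Python sketch. The actual validation done by the work server is not shown in this thread, so the choice of CRC-32 and the function names `crc32_of` and `accept_upload` are assumptions made purely for the example.

```python
import zlib


def crc32_of(payload: bytes) -> int:
    """Return the CRC-32 of payload as an unsigned 32-bit integer."""
    return zlib.crc32(payload) & 0xFFFFFFFF


def accept_upload(payload: bytes, expected_crc: int) -> bool:
    """Hypothetical receiver-side check: accept only if the checksum matches.

    This is NOT the real work-server logic, only an illustration of
    comparing a checksum computed on arrival with the one the sender claimed.
    """
    return crc32_of(payload) == expected_crc


if __name__ == "__main__":
    results = b"example work unit results"
    checksum = crc32_of(results)  # what the sender would attach

    # Intact payload: the checksums agree.
    print(accept_upload(results, checksum))
    # Payload altered in transit: the checksums no longer agree.
    print(accept_upload(results + b"\x00", checksum))
```

A mismatch in a check of this kind tells the receiver only that the data differs from what was expected, not why.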

Server did not like results, dumping- 171.64.65.124

Posted: Fri Jul 24, 2015 2:42 pm
by billford
I wasn't going to report this, thought it was just a local glitch, but [URL edited out as posts have been merged - mod] (in the Beta section so I can't reply to it) made me think that perhaps it wasn't.

Code:

04:53:39:WU00:FS00:0xa4:Completed 250000 out of 250000 steps  (100%)
04:53:39:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
04:53:49:WU00:FS00:0xa4:
04:53:49:WU00:FS00:0xa4:Finished Work Unit:
04:53:49:WU00:FS00:0xa4:- Reading up to 922512 from "00/wudata_01.trr": Read 922512
04:53:49:WU00:FS00:0xa4:trr file hash check passed.
04:53:49:WU00:FS00:0xa4:- Reading up to 845368 from "00/wudata_01.xtc": Read 845368
04:53:49:WU00:FS00:0xa4:xtc file hash check passed.
04:53:49:WU00:FS00:0xa4:edr file hash check passed.
04:53:49:WU00:FS00:0xa4:logfile size: 23417
04:53:49:WU00:FS00:0xa4:Leaving Run
04:53:52:WU00:FS00:0xa4:- Writing 1793785 bytes of core data to disk...
04:53:53:WU00:FS00:0xa4:Done: 1793273 -> 1732801 (compressed to 96.6 percent)
04:53:53:WU00:FS00:0xa4:  ... Done.
04:55:09:WU00:FS00:0xa4:- Shutting down core
04:55:09:WU00:FS00:0xa4:
04:55:09:WU00:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
04:55:18:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
04:55:18:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:9013 run:44 clone:0 gen:44 core:0xa4 unit:0x0000003cab40417c554e970599d9d6e9
04:55:18:WU00:FS00:Uploading 1.65MiB to 171.64.65.124
04:55:18:WU00:FS00:Connecting to 171.64.65.124:8080
04:55:20:WU00:FS00:Upload complete
04:55:20:WU00:FS00:Server responded WORK_QUIT (404)
04:55:20:WARNING:WU00:FS00:Server did not like results, dumping
04:55:20:WU00:FS00:Cleaning up
Configs:

Code:

11:54:30:************************* Folding@home Client *************************
11:54:30:    Website: http://folding.stanford.edu/
11:54:30:  Copyright: (c) 2009-2014 Stanford University
11:54:30:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
11:54:30:       Args: --child --lifeline 1552 /etc/fahclient/config.xml --run-as
11:54:30:             fahclient --pid-file=/var/run/fahclient.pid --daemon
11:54:30:     Config: /etc/fahclient/config.xml
11:54:30:******************************** Build ********************************
11:54:30:    Version: 7.4.4
11:54:30:       Date: Mar 4 2014
11:54:30:       Time: 12:02:38
11:54:30:    SVN Rev: 4130
11:54:30:     Branch: fah/trunk/client
11:54:30:   Compiler: GNU 4.4.7
11:54:30:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
11:54:30:             -fno-unsafe-math-optimizations -msse2
11:54:30:   Platform: linux2 3.2.0-1-amd64
11:54:30:       Bits: 64
11:54:30:       Mode: Release
11:54:30:******************************* System ********************************
11:54:30:        CPU: Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz
11:54:30:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
11:54:30:       CPUs: 4
11:54:30:     Memory: 3.82GiB
11:54:30:Free Memory: 3.50GiB
11:54:30:    Threads: POSIX_THREADS
11:54:30: OS Version: 3.13
11:54:30:Has Battery: false
11:54:30: On Battery: false
11:54:30: UTC Offset: 1
11:54:30:        PID: 1554
11:54:30:        CWD: /var/lib/fahclient
11:54:30:         OS: Linux 3.13.0-37-generic x86_64
11:54:30:    OS Arch: AMD64
11:54:30:       GPUs: 1
11:54:30:      GPU 0: NVIDIA:3 GK110 [GeForce GTX 780 Ti]
11:54:30:       CUDA: 3.5
11:54:30:CUDA Driver: 7000
11:54:30:***********************************************************************
11:54:30:<config>
11:54:30:  <!-- Client Control -->
11:54:30:  <fold-anon v='true'/>
11:54:30:
11:54:30:  <!-- Folding Slot Configuration -->
11:54:30:  <gpu v='false'/>
11:54:30:
11:54:30:  <!-- HTTP Server -->
11:54:30:  <allow v='127.0.0.1 192.168.1.0/24'/>
11:54:30:
11:54:30:  <!-- Network -->
11:54:30:  <proxy v=':8080'/>
11:54:30:
11:54:30:  <!-- Remote Command Server -->
11:54:30:  <command-allow-no-pass v='127.0.0.1 192.168.1.0/24'/>
11:54:30:
11:54:30:  <!-- Slot Control -->
11:54:30:  <power v='full'/>
11:54:30:
11:54:30:  <!-- User Information -->
11:54:30:  <passkey v='********************************'/>
11:54:30:  <user v='[redacted]'/>
11:54:30:
11:54:30:  <!-- Work Unit Control -->
11:54:30:  <next-unit-percentage v='100'/>
11:54:30:
11:54:30:  <!-- Folding Slots -->
11:54:30:  <slot id='1' type='GPU'>
11:54:30:    <client-type v='advanced'/>
11:54:30:    <pause-on-start v='true'/>
11:54:30:  </slot>
11:54:30:  <slot id='0' type='CPU'>
11:54:30:    <client-type v='advanced'/>
11:54:30:    <cpus v='3'/>
11:54:30:  </slot>
11:54:30:</config>

Re: Server did not like results, dumping- 171.64.65.124

Posted: Fri Jul 24, 2015 2:45 pm
by ChristianVirtual
Thanks, at least I'm not alone with that one ... and it indicates a real issue.

Re: Server did not like results, dumping- 171.64.65.124

Posted: Fri Jul 24, 2015 2:49 pm
by billford
Found another one-

Code:

13:06:33:WU00:FS00:0xa4:Completed 247500 out of 250000 steps  (99%)
13:08:38:WU00:FS00:0xa4:Completed 250000 out of 250000 steps  (100%)
13:08:38:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
13:08:48:WU00:FS00:0xa4:Finished Work Unit:
13:08:48:WU00:FS00:0xa4:- Reading up to 872880 from "00/wudata_01.trr": Read 872880
13:08:48:WU00:FS00:0xa4:trr file hash check passed.
13:08:48:WU00:FS00:0xa4:- Reading up to 800900 from "00/wudata_01.xtc": Read 800900
13:08:48:WU00:FS00:0xa4:xtc file hash check passed.
13:08:48:WU00:FS00:0xa4:edr file hash check passed.
13:08:48:WU00:FS00:0xa4:logfile size: 23346
13:08:48:WU00:FS00:0xa4:Leaving Run
13:08:53:WU00:FS00:0xa4:- Writing 1699614 bytes of core data to disk...
13:08:53:WU00:FS00:0xa4:Done: 1699102 -> 1647270 (compressed to 96.9 percent)
13:08:53:WU00:FS00:0xa4:  ... Done.
13:10:02:WU00:FS00:0xa4:- Shutting down core
13:10:02:WU00:FS00:0xa4:
13:10:02:WU00:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
13:10:11:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
13:10:11:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:9009 run:126 clone:1 gen:57 core:0xa4 unit:0x00000049ab40417c554e8e5df5888194
13:10:11:WU00:FS00:Uploading 1.57MiB to 171.64.65.124
13:10:11:WU00:FS00:Connecting to 171.64.65.124:8080
13:10:13:WU00:FS00:Upload complete
13:10:13:WU00:FS00:Server responded WORK_QUIT (404)
13:10:13:WARNING:WU00:FS00:Server did not like results, dumping

Re: 9014 (Run 84, Clone 2, Gen 92), server didn't like it

Posted: Fri Jul 24, 2015 2:50 pm
by Joe_H
ChristianVirtual wrote:
But seriously: what could have been wrong with it?
Not sure, but I have also had a few WUs from this WS get the same message upon being returned. There have been a couple of issues with this WS over the last week or so, and it appears to have just come back online yesterday evening. This problem might be related to that. I will let the person responsible for the server know it is still having trouble.

My system has dumped the following WUs with the same error message:

Project: 9016 (Run 77, Clone 14, Gen 91)
Project: 9013 (Run 143, Clone 5, Gen 84)
Project: 9008 (Run 35, Clone 1, Gen 83)
Project: 9009 (Run 24, Clone 1, Gen 72)

The WUs that have records in the database show successful returns by other folders much earlier this year.
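
For anyone who wants to pull a list like the one above out of their own log, here is a small Python sketch. It assumes the log lines look like the excerpts posted in this thread (a "Sending unit results" line followed later by "Server did not like results, dumping" for the same WU/FS pair). The function name `dumped_units` and the `SAMPLE` lines are made up for the example.

```python
import re

# Matches the PRCG fields on a "Sending unit results" line.
SEND_RE = re.compile(
    r"Sending unit results:.*?project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)
# Matches the "HH:MM:SS:[WARNING:]WUnn:FSnn:" prefix of a client log line.
WU_RE = re.compile(r"^\d\d:\d\d:\d\d:(?:WARNING:)?(WU\d+):(FS\d+):")


def dumped_units(lines):
    """Yield (project, run, clone, gen) for each WU the server rejected."""
    pending = {}  # (WU, FS) -> PRCG from the latest "Sending unit results"
    for line in lines:
        prefix = WU_RE.match(line)
        if not prefix:
            continue
        key = prefix.groups()
        sent = SEND_RE.search(line)
        if sent:
            pending[key] = tuple(int(x) for x in sent.groups())
        elif "Server did not like results, dumping" in line and key in pending:
            yield pending.pop(key)


# Sample lines in the format shown earlier in this thread (unit id shortened).
SAMPLE = [
    "14:02:37:WU00:FS00:Sending unit results: id:00 state:SEND "
    "error:NO_ERROR project:9014 run:84 clone:2 gen:92 core:0xa4 unit:0x00",
    "14:02:37:WU00:FS00:Uploading 1.64MiB to 171.64.65.124",
    "14:02:37:WU00:FS00:Server responded WORK_QUIT (404)",
    "14:02:37:WARNING:WU00:FS00:Server did not like results, dumping",
]

if __name__ == "__main__":
    for project, run, clone, gen in dumped_units(SAMPLE):
        print(f"Project: {project} (Run {run}, Clone {clone}, Gen {gen})")
```

To run it against a real log, pass an open file object to `dumped_units` instead of `SAMPLE`.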

Re: WS 171.64.65.124 server didn't like [Continuing Issues]

Posted: Fri Jul 24, 2015 3:16 pm
by billford
Thanks Joe.

For both reporting it and the topic merge!

Re: WS 171.64.65.124 server didn't like [Continuing Issues]

Posted: Fri Jul 24, 2015 4:53 pm
by cxh
Thanks for reporting this! It appears that 171.64.65.124 was taken down while trying to set up another WS. I've restarted it, and it should be operational now. Sorry for any inconvenience!

Re: WS 171.64.65.124 server didn't like [Continuing Issues]

Posted: Fri Jul 24, 2015 11:50 pm
by ChristianVirtual
Thanks, some recent A4s got successfully uploaded and credited.