Project: 6881 (Run 333, Clone 11, Gen 93)

Moderators: Site Moderators, FAHC Science Team

brooknet
Posts: 11
Joined: Sat Apr 23, 2011 5:27 am

Post by brooknet »

There I was, thinking that I'd hit my limit for bad WUs this year (I had one a few months ago), when along comes another one...
This log has been edited to remove verbose, repetitive sections (marked with '...'). As stated before, the 'usb_disk' referred to is an ATA hard drive on a USB adaptor.

Code:

--- Opening Log file [June 28 15:41:38] 


# Linux Console Edition #######################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /media/usb_disk/Spatula/home/lex/FAH/run
Executable: ./fah6
Arguments: -verbosity 9 

[15:41:38] - Ask before connecting: No
[15:41:38] - User name: telexl (Team 86565)
[15:41:38] - User ID: 5D6EBFC80B535272
[15:41:38] - Machine ID: 1
[15:41:38] 
[15:41:38] Loaded queue successfully.
[15:41:38] 
[15:41:38] + Processing work unit
[15:41:38] Core required: FahCore_78.exe
[15:41:38] Core found.
[15:41:38] - Autosending finished units...
[15:41:38] Trying to send all finished work units
[15:41:38] + No unsent completed units remaining.
[15:41:38] - Autosend completed
[15:41:38] Working on Unit 04 [June 28 15:41:38]
[15:41:38] + Working ...
[15:41:38] - Calling './FahCore_78.exe -dir work/ -suffix 04 -checkpoint 15 -verbose -lifeline 2264 -version 602'
[15:41:38] 
[15:41:38] *------------------------------*
[15:41:38] Folding@Home Gromacs Core
[15:41:38] Version 1.90 (March 8, 2006)
[15:41:38] 
[15:41:38] Preparing to commence simulation
[15:41:38] - Ensuring status. Please wait.
[15:41:55] - Looking at optimizations...
[15:41:55] - Working with standard loops on this execution.
[15:41:56] - Previous termination of core was improper.
[15:41:56] - Files status OK
[15:41:56] - Expanded 375920 -> 1809460 (decompressed 481.3 percent)
[15:41:56] 
[15:41:56] Project: 6881 (Run 333, Clone 11, Gen 93)
[15:41:56] 
[15:41:56] Entering M.D.
[15:42:16] (Starting from checkpoint)
[15:42:16] Protein: ALZHEIMERS DISEASE AMYLOID
[15:42:16] 
[15:42:16] Writing local files
[15:42:33] Completed 17500 out of 250000 steps  (7%)
[15:57:34] Timered checkpoint triggered.
[15:59:50] Writing local files
[15:59:50] Completed 20000 out of 250000 steps  (8%)
[16:14:50] Timered checkpoint triggered.
[16:16:32] Writing local files
[16:16:32] Completed 22500 out of 250000 steps  (9%)
[16:31:32] Timered checkpoint triggered.
[16:33:13] Writing local files
[16:33:13] Completed 25000 out of 250000 steps  (10%)

...

[19:36:48] Completed 52500 out of 250000 steps  (21%)
[19:44:12] CoreStatus = 0 (0)
[19:44:12] Client-core communications error: ERROR 0x0
[19:44:12] Deleting current work unit & continuing...
[19:44:30] Trying to send all finished work units
[19:44:30] + No unsent completed units remaining.
[19:44:30] - Preparing to get new work unit...
[19:44:30] + Attempting to get work packet
[19:44:30] - Will indicate memory of 1000 MB
[19:44:30] - Detect CPU. Vendor: AuthenticAMD, Family: 6, Model: 8, Stepping: 1
[19:44:30] - Connecting to assignment server
[19:44:30] Connecting to http://assign.stanford.edu:8080/
[19:44:31] Posted data.
[19:44:31] Initial: 43AB; - Successful: assigned to (171.67.108.33).
[19:44:31] + News From Folding@Home: Welcome to Folding@Home
[19:44:31] Loaded queue successfully.
[19:44:31] Connecting to http://171.67.108.33:8080/
[19:44:33] Posted data.
[19:44:33] Initial: 0000; - Receiving payload (expected size: 376432)
[19:44:38] - Downloaded at ~73 kB/s
[19:44:38] - Averaged speed for that direction ~180 kB/s
[19:44:38] + Received work.
[19:44:38] + Closed connections
[19:44:43] 
[19:44:43] + Processing work unit
[19:44:43] Core required: FahCore_78.exe
[19:44:43] Core found.
[19:44:43] Working on Unit 05 [June 28 19:44:43]
[19:44:43] + Working ...
[19:44:43] - Calling './FahCore_78.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 2264 -version 602'
[19:44:43] 
[19:44:43] *------------------------------*
[19:44:43] Folding@Home Gromacs Core
[19:44:43] Version 1.90 (March 8, 2006)
[19:44:43] 
[19:44:43] Preparing to commence simulation
[19:44:43] - Looking at optimizations...
[19:44:43] - Created dyn
[19:44:43] - Files status OK
[19:44:44] - Expanded 375920 -> 1809460 (decompressed 481.3 percent)
[19:44:44] - Starting from initial work packet
[19:44:44] 
[19:44:44] Project: 6881 (Run 333, Clone 11, Gen 93)
[19:44:44] 
[19:44:44] Assembly optimizations on if available.
[19:44:44] Entering M.D.
[19:44:50] Protein: ALZHEIMERS DISEASE AMYLOID
[19:44:50] 
[19:44:50] Writing local files
[19:44:58] Extra SSE boost OK.
[19:44:59] Writing local files
[19:44:59] Completed 0 out of 250000 steps  (0%)
[19:54:35] Writing local files
[19:54:35] Completed 2500 out of 250000 steps  (1%)
[20:04:10] Writing local files
[20:04:10] Completed 5000 out of 250000 steps  (2%)
[20:13:46] Writing local files
[20:13:47] Completed 7500 out of 250000 steps  (3%)
[20:23:22] Writing local files
[20:23:23] Completed 10000 out of 250000 steps  (4%)
[20:32:58] Writing local files
[20:32:58] Completed 12500 out of 250000 steps  (5%)

...

[21:30:35] Completed 27500 out of 250000 steps  (11%)
[21:39:48] CoreStatus = 0 (0)
[21:39:48] Client-core communications error: ERROR 0x0
[21:39:48] Deleting current work unit & continuing...
[21:40:05] Trying to send all finished work units
[21:40:05] + No unsent completed units remaining.
[21:40:05] - Preparing to get new work unit...
[21:40:05] + Attempting to get work packet
[21:40:05] - Will indicate memory of 1000 MB
[21:40:05] - Connecting to assignment server
[21:40:05] Connecting to http://assign.stanford.edu:8080/
[21:40:06] Posted data.
[21:40:06] Initial: 43AB; - Successful: assigned to (171.67.108.33).
[21:40:06] + News From Folding@Home: Welcome to Folding@Home
[21:40:07] Loaded queue successfully.
[21:40:07] Connecting to http://171.67.108.33:8080/
[21:40:08] Posted data.
[21:40:08] Initial: 0000; - Receiving payload (expected size: 376432)
[21:40:09] - Downloaded at ~367 kB/s
[21:40:09] - Averaged speed for that direction ~218 kB/s
[21:40:09] + Received work.
[21:40:09] + Closed connections
[21:40:14] 
[21:40:14] + Processing work unit
[21:40:14] Core required: FahCore_78.exe
[21:40:14] Core found.
[21:40:14] Working on Unit 06 [June 28 21:40:14]
[21:40:14] + Working ...
[21:40:14] - Calling './FahCore_78.exe -dir work/ -suffix 06 -checkpoint 15 -verbose -lifeline 2264 -version 602'
[21:40:14] 
[21:40:14] *------------------------------*
[21:40:14] Folding@Home Gromacs Core
[21:40:14] Version 1.90 (March 8, 2006)
[21:40:14] 
[21:40:14] Preparing to commence simulation
[21:40:14] - Looking at optimizations...
[21:40:14] - Created dyn
[21:40:14] - Files status OK
[21:40:15] - Expanded 375920 -> 1809460 (decompressed 481.3 percent)
[21:40:15] - Starting from initial work packet
[21:40:15] 
[21:40:15] Project: 6881 (Run 333, Clone 11, Gen 93)
[21:40:15] 
[21:40:15] Assembly optimizations on if available.
[21:40:15] Entering M.D.
[21:40:21] Protein: ALZHEIMERS DISEASE AMYLOID
[21:40:21] 
[21:40:21] Writing local files
[21:40:29] Extra SSE boost OK.
[21:40:30] Writing local files
[21:40:30] Completed 0 out of 250000 steps  (0%)
[21:41:38] - Autosending finished units...
[21:41:38] Trying to send all finished work units
[21:41:38] + No unsent completed units remaining.
[21:41:38] - Autosend completed
[21:50:07] Writing local files
[21:50:07] Completed 2500 out of 250000 steps  (1%)
[21:59:42] Writing local files
[21:59:42] Completed 5000 out of 250000 steps  (2%)
[22:09:19] Writing local files
[22:09:19] Completed 7500 out of 250000 steps  (3%)
[22:18:54] Writing local files
[22:18:54] Completed 10000 out of 250000 steps  (4%)
[22:28:31] Writing local files
[22:28:31] Completed 12500 out of 250000 steps  (5%)
[22:38:05] Writing local files
[22:38:05] Completed 15000 out of 250000 steps  (6%)
[22:47:42] Writing local files
[22:47:42] Completed 17500 out of 250000 steps  (7%)
[22:57:19] Writing local files
[22:57:19] Completed 20000 out of 250000 steps  (8%)
[23:06:55] Writing local files
[23:06:55] Completed 22500 out of 250000 steps  (9%)
[23:16:32] Writing local files
[23:16:32] Completed 25000 out of 250000 steps  (10%)
[23:26:09] Writing local files
[23:26:09] Completed 27500 out of 250000 steps  (11%)
[23:35:22] CoreStatus = 0 (0)
[23:35:22] Client-core communications error: ERROR 0x0
[23:35:22] - Attempting to download new core...
[23:35:22] + Downloading new core: FahCore_78.exe
[23:35:22] Downloading core (/~pande/Linux/x86/Core_78.fah from www.stanford.edu)
[23:35:23] Initial: AFDE; + 10240 bytes downloaded

...

[23:35:26] Initial: 97FE; + 522240 bytes downloaded
[23:35:29] Initial: 6B45; + 1134407 bytes downloaded
[23:35:29] Verifying core Core_78.fah...
[23:35:29] Signature is VALID
[23:35:29] 
[23:35:29] Trying to unzip core FahCore_78.exe
[23:35:30] Decompressed FahCore_78.exe (3435296 bytes) successfully
[23:35:30] + Core successfully engaged
[23:35:30] Deleting current work unit & continuing...
[23:35:48] Trying to send all finished work units
[23:35:48] + No unsent completed units remaining.
[23:35:48] - Preparing to get new work unit...
[23:35:48] + Attempting to get work packet
[23:35:48] - Will indicate memory of 1000 MB
[23:35:48] - Connecting to assignment server
[23:35:48] Connecting to http://assign.stanford.edu:8080/
[23:35:49] Posted data.
[23:35:49] Initial: 43AB; - Successful: assigned to (171.67.108.33).
[23:35:49] + News From Folding@Home: Welcome to Folding@Home
[23:35:49] Loaded queue successfully.
[23:35:49] Connecting to http://171.67.108.33:8080/
[23:35:50] Posted data.
[23:35:50] Initial: 0000; - Receiving payload (expected size: 376432)
[23:35:52] - Downloaded at ~183 kB/s
[23:35:52] - Averaged speed for that direction ~211 kB/s
[23:35:52] + Received work.
[23:35:52] + Closed connections
[23:35:57] 
[23:35:57] + Processing work unit
[23:35:57] Core required: FahCore_78.exe
[23:35:57] Core found.
[23:35:57] Working on Unit 07 [June 28 23:35:57]
[23:35:57] + Working ...
[23:35:57] - Calling './FahCore_78.exe -dir work/ -suffix 07 -checkpoint 15 -verbose -lifeline 2264 -version 602'
[23:35:57] 
[23:35:57] *------------------------------*
[23:35:57] Folding@Home Gromacs Core
[23:35:57] Version 1.90 (March 8, 2006)
[23:35:57] 
[23:35:57] Preparing to commence simulation
[23:35:57] - Looking at optimizations...
[23:35:57] - Created dyn
[23:35:57] - Files status OK
[23:35:57] - Expanded 375920 -> 1809460 (decompressed 481.3 percent)
[23:35:57] - Starting from initial work packet
[23:35:57] 
[23:35:57] Project: 6881 (Run 333, Clone 11, Gen 93)
[23:35:57] 
[23:35:57] Assembly optimizations on if available.
[23:35:57] Entering M.D.
[23:36:03] Protein: ALZHEIMERS DISEASE AMYLOID
[23:36:03] 
[23:36:03] Writing local files
[23:36:12] Extra SSE boost OK.
[23:36:12] Writing local files
[23:36:12] Completed 0 out of 250000 steps  (0%)
[23:45:50] Writing local files
[23:45:50] Completed 2500 out of 250000 steps  (1%)
[23:55:26] Writing local files
[23:55:26] Completed 5000 out of 250000 steps  (2%)
[00:05:04] Writing local files
[00:05:04] Completed 7500 out of 250000 steps  (3%)
[00:14:41] Writing local files

...

[01:22:02] Completed 27500 out of 250000 steps  (11%)
[01:31:16] CoreStatus = 0 (0)
[01:31:16] Client-core communications error: ERROR 0x0
[01:31:16] Deleting current work unit & continuing...
[01:31:33] Trying to send all finished work units
[01:31:33] + No unsent completed units remaining.
[01:31:33] - Preparing to get new work unit...
[01:31:33] + Attempting to get work packet
[01:31:33] - Will indicate memory of 1000 MB
[01:31:33] - Connecting to assignment server
[01:31:33] Connecting to http://assign.stanford.edu:8080/
[01:31:34] Posted data.
[01:31:34] Initial: 43AB; - Successful: assigned to (171.67.108.33).
[01:31:34] + News From Folding@Home: Welcome to Folding@Home
[01:31:35] Loaded queue successfully.
[01:31:35] Connecting to http://171.67.108.33:8080/
[01:31:36] Posted data.
[01:31:36] Initial: 0000; - Receiving payload (expected size: 376432)
[01:31:40] - Downloaded at ~91 kB/s
[01:31:40] - Averaged speed for that direction ~187 kB/s
[01:31:40] + Received work.
[01:31:40] + Closed connections
[01:31:45] 
[01:31:45] + Processing work unit
[01:31:45] Core required: FahCore_78.exe
[01:31:45] Core found.
[01:31:45] Working on Unit 08 [June 29 01:31:45]
[01:31:45] + Working ...
[01:31:45] - Calling './FahCore_78.exe -dir work/ -suffix 08 -checkpoint 15 -verbose -lifeline 2264 -version 602'

[01:31:45] 
[01:31:45] *------------------------------*
[01:31:45] Folding@Home Gromacs Core
[01:31:45] Version 1.90 (March 8, 2006)
[01:31:45] 
[01:31:45] Preparing to commence simulation
[01:31:45] - Looking at optimizations...
[01:31:45] - Created dyn
[01:31:45] - Files status OK
[01:31:45] - Expanded 375920 -> 1809460 (decompressed 481.3 percent)
[01:31:45] - Starting from initial work packet
[01:31:45] 
[01:31:45] Project: 6881 (Run 333, Clone 11, Gen 93)
[01:31:45] 
[01:31:45] Assembly optimizations on if available.
[01:31:45] Entering M.D.
[01:31:51] Protein: ALZHEIMERS DISEASE AMYLOID
[01:31:51] 
[01:31:51] Writing local files
[01:32:00] Extra SSE boost OK.
[01:32:00] Writing local files
[01:32:00] Completed 0 out of 250000 steps  (0%)
[01:41:38] Writing local files
[01:41:38] Completed 2500 out of 250000 steps  (1%)
[01:51:13] Writing local files
[01:51:13] Completed 5000 out of 250000 steps  (2%)

...

[09:25:08] Completed 122500 out of 250000 steps  (49%)
[09:34:46] Writing local files
[09:34:46] Completed 125000 out of 250000 steps  (50%)
[09:41:38] - Autosending finished units...
[09:41:38] Trying to send all finished work units
[09:41:38] + No unsent completed units remaining.
[09:41:38] - Autosend completed
[09:44:23] Writing local files
[09:44:23] Completed 127500 out of 250000 steps  (51%)
[09:54:00] Writing local files
[09:54:00] Completed 130000 out of 250000 steps  (52%)
[10:03:37] Writing local files
[10:03:37] Completed 132500 out of 250000 steps  (53%)
At this point, the electricians were running some electrically-noisy cutting/drilling equipment and it caused the earth-leakage circuit breaker to trip, and off went all of the computers (I can't afford a UPS). It took me a few hours to fix it because I was asleep at the time (it was a late night..); those electricians start early.

Code:

--- Opening Log file [June 29 12:00:30] 


# Linux Console Edition #######################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /media/usb_disk/Spatula/home/lex/FAH/run
Executable: ./fah6
Arguments: -verbosity 9 

[12:00:30] - Ask before connecting: No
[12:00:30] - User name: telexl (Team 86565)
[12:00:30] - User ID: 5D6EBFC80B535272
[12:00:30] - Machine ID: 1
[12:00:30] 
[12:00:30] Loaded queue successfully.
[12:00:30] 
[12:00:30] + Processing work unit
[12:00:30] Core required: FahCore_78.exe
[12:00:30] Core found.
[12:00:30] - Autosending finished units...
[12:00:30] Trying to send all finished work units
[12:00:30] + No unsent completed units remaining.
[12:00:30] - Autosend completed
[12:00:30] Working on Unit 08 [June 29 12:00:30]
[12:00:30] + Working ...
[12:00:30] - Calling './FahCore_78.exe -dir work/ -suffix 08 -checkpoint 15 -verbose -lifeline 2078 -version 602'

[12:00:30] 
[12:00:30] *------------------------------*
[12:00:30] Folding@Home Gromacs Core
[12:00:30] Version 1.90 (March 8, 2006)
[12:00:30] 
[12:00:30] Preparing to commence simulation
[12:00:30] - Ensuring status. Please wait.
[12:00:47] - Looking at optimizations...
[12:00:47] - Working with standard loops on this execution.
[12:00:47] - Previous termination of core was improper.
[12:00:47] - Files status OK
[12:00:48] - Expanded 375920 -> 1809460 (decompressed 481.3 percent)
[12:00:48] 
[12:00:48] Project: 6881 (Run 333, Clone 11, Gen 93)
[12:00:48] 
[12:00:48] Entering M.D.
[12:01:08] (Starting from checkpoint)
[12:01:08] Protein: ALZHEIMERS DISEASE AMYLOID
[12:01:08] 
[12:01:08] Writing local files
[12:01:17] Completed 132500 out of 250000 steps  (53%)
[12:16:18] Timered checkpoint triggered.
[12:20:17] Writing local files
[12:20:17] Completed 135000 out of 250000 steps  (54%)
[12:35:18] Timered checkpoint triggered.
[12:37:15] Writing local files
[12:37:15] Completed 137500 out of 250000 steps  (55%)
[12:39:59] CoreStatus = 0 (0)
[12:39:59] Client-core communications error: ERROR 0x0
[12:39:59] Deleting current work unit & continuing...
[12:40:17] Trying to send all finished work units
[12:40:17] + No unsent completed units remaining.
[12:40:17] - Preparing to get new work unit...
[12:40:17] + Attempting to get work packet
[12:40:17] - Will indicate memory of 1000 MB
[12:40:17] - Detect CPU. Vendor: AuthenticAMD, Family: 6, Model: 8, Stepping: 1
[12:40:17] - Connecting to assignment server
[12:40:17] Connecting to http://assign.stanford.edu:8080/
[12:40:18] Posted data.
[12:40:18] Initial: 43AB; - Successful: assigned to (171.67.108.33).
[12:40:18] + News From Folding@Home: Welcome to Folding@Home
[12:40:18] Loaded queue successfully.
[12:40:18] Connecting to http://171.67.108.33:8080/
[12:40:20] Posted data.
[12:40:20] Initial: 0000; - Receiving payload (expected size: 376432)
[12:40:22] - Downloaded at ~183 kB/s
[12:40:22] - Averaged speed for that direction ~186 kB/s
[12:40:22] + Received work.
[12:40:22] + Closed connections
[12:40:27] 
[12:40:27] + Processing work unit
[12:40:27] Core required: FahCore_78.exe
[12:40:27] Core found.
[12:40:27] Working on Unit 09 [June 29 12:40:27]
[12:40:27] + Working ...
[12:40:27] - Calling './FahCore_78.exe -dir work/ -suffix 09 -checkpoint 15 -verbose -lifeline 2078 -version 602'

[12:40:27] 
[12:40:27] *------------------------------*
[12:40:27] Folding@Home Gromacs Core
[12:40:27] Version 1.90 (March 8, 2006)
[12:40:27] 
[12:40:27] Preparing to commence simulation
[12:40:27] - Looking at optimizations...
[12:40:27] - Created dyn
[12:40:27] - Files status OK
[12:40:27] - Expanded 375920 -> 1809460 (decompressed 481.3 percent)
[12:40:27] - Starting from initial work packet
[12:40:27] 
[12:40:27] Project: 6881 (Run 333, Clone 11, Gen 93)
[12:40:27] 
[12:40:27] Assembly optimizations on if available.
[12:40:27] Entering M.D.
[12:40:33] Protein: ALZHEIMERS DISEASE AMYLOID
[12:40:33] 
[12:40:33] Writing local files
[12:40:42] Extra SSE boost OK.
[12:40:42] Writing local files
[12:40:42] Completed 0 out of 250000 steps  (0%)
[12:50:20] Writing local files
[12:50:20] Completed 2500 out of 250000 steps  (1%)
[12:59:56] Writing local files
[12:59:56] Completed 5000 out of 250000 steps  (2%)
[13:09:34] Writing local files
[13:09:34] Completed 7500 out of 250000 steps  (3%)
[13:19:11] Writing local files
[13:19:11] Completed 10000 out of 250000 steps  (4%)
[13:28:48] Writing local files
[13:28:48] Completed 12500 out of 250000 steps  (5%)
[13:38:23] Writing local files
[13:38:23] Completed 15000 out of 250000 steps  (6%)
[13:48:01] Writing local files
[13:48:01] Completed 17500 out of 250000 steps  (7%)
[13:57:39] Writing local files
[13:57:39] Completed 20000 out of 250000 steps  (8%)
[14:07:16] Writing local files
[14:07:16] Completed 22500 out of 250000 steps  (9%)
[14:16:53] Writing local files
[14:16:53] Completed 25000 out of 250000 steps  (10%)
[14:26:32] Writing local files
[14:26:32] Completed 27500 out of 250000 steps  (11%)
[14:35:46] CoreStatus = 0 (0)
[14:35:46] Client-core communications error: ERROR 0x0
[14:35:46] Deleting current work unit & continuing...
[14:36:03] Trying to send all finished work units
[14:36:03] + No unsent completed units remaining.
[14:36:03] - Preparing to get new work unit...
[14:36:03] + Attempting to get work packet
[14:36:03] - Will indicate memory of 1000 MB
[14:36:03] - Connecting to assignment server
[14:36:03] Connecting to http://assign.stanford.edu:8080/
[14:36:04] Posted data.
[14:36:04] Initial: 43AB; - Successful: assigned to (171.67.108.33).
[14:36:04] + News From Folding@Home: Welcome to Folding@Home
[14:36:04] Loaded queue successfully.
[14:36:04] Connecting to http://171.67.108.33:8080/
[14:36:06] Posted data.
[14:36:06] Initial: 0000; - Receiving payload (expected size: 376432)
[14:36:07] - Downloaded at ~367 kB/s
[14:36:07] - Averaged speed for that direction ~222 kB/s
[14:36:07] + Received work.
[14:36:07] + Closed connections
[14:36:12] 
[14:36:12] + Processing work unit
[14:36:12] Core required: FahCore_78.exe
[14:36:12] Core found.
[14:36:12] Working on Unit 00 [June 29 14:36:12]
[14:36:12] + Working ...
[14:36:12] - Calling './FahCore_78.exe -dir work/ -suffix 00 -checkpoint 15 -verbose -lifeline 2078 -version 602'

[14:36:12] 
[14:36:12] *------------------------------*
[14:36:12] Folding@Home Gromacs Core
[14:36:12] Version 1.90 (March 8, 2006)
And so it repeats, over and over again: always the same unit, and usually aborting at 11%.
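
To make the pattern easier to see, here is a minimal sketch of how the abort points could be tallied from a log like the one above. It assumes only the line formats visible in this log ("Working on Unit NN", "Completed N out of M steps", "CoreStatus = 0 (0)"); the function name `abort_points` is made up for illustration and is not part of the FAH client.

```python
import re

# Patterns taken from the log lines quoted in this post.
UNIT = re.compile(r"Working on Unit (\d+)")
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")
ABORT = "CoreStatus = 0 (0)"


def abort_points(log_lines):
    """Return (unit, last_step, total_steps) for every run that ended in ABORT."""
    results = []
    unit, last = None, None
    for line in log_lines:
        m = UNIT.search(line)
        if m:
            # A new attempt starts: forget the previous progress.
            unit, last = m.group(1), None
            continue
        m = PROGRESS.search(line)
        if m:
            last = (int(m.group(1)), int(m.group(2)))
            continue
        if ABORT in line and unit is not None and last is not None:
            results.append((unit, last[0], last[1]))
    return results


# A few lines copied from the log above.
sample = """\
[19:44:43] Working on Unit 05 [June 28 19:44:43]
[21:30:35] Completed 27500 out of 250000 steps  (11%)
[21:39:48] CoreStatus = 0 (0)
[21:40:14] Working on Unit 06 [June 28 21:40:14]
[23:26:09] Completed 27500 out of 250000 steps  (11%)
[23:35:22] CoreStatus = 0 (0)
""".splitlines()

for unit, step, total in abort_points(sample):
    print(f"Unit {unit}: aborted after step {step} of {total} ({100 * step // total}%)")
```

Run against the full FAHlog.txt, this would list the last reported step before each "CoreStatus = 0 (0)" line, which is where the repeated 11% shows up.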

This is a dumb question, but I have to ask, out of curiosity: when a unit returns an error, why does the server send the same unit back again? Why doesn't it let someone else try it, and then, if the same unit aborts early on many different clients, raise some sort of catastrophic error flag in the FAH server, or at least sound an alarm or blink a light at an admin's distant workstation somewhere? I know that I am naive about the vast complexity of FAH, so a reply of 'RTFM' would probably suffice. :)

Lex
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Location: Land Of The Long White Cloud

Re: Project: 6881 (Run 333, Clone 11, Gen 93)

Post by PantherX »

There isn't any data in the WU Database yet:
No data back from query
I have marked it for a follow-up.

The WU is reassigned to you because some donors only want high-PPD WUs and delete the low-PPD ones. By handing the same WU back to the machine that dropped it, the server discourages that kind of cherry-picking.
brooknet
Posts: 11
Joined: Sat Apr 23, 2011 5:27 am

Re: Project: 6881 (Run 333, Clone 11, Gen 93)

Post by brooknet »

Hello PantherX,

Thanks for that, and for explaining why the unit is reassigned; I'll keep watching this topic.
bruce
Posts: 20822
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 6881 (Run 333, Clone 11, Gen 93)

Post by bruce »

All WUs have a sort of flag approximating what you've called a catastrophic error flag in the FAH server. It's called the "timeout" or the "Preferred Deadline" and the value changes, depending on the project. For Project 6881 the value is 14 days.

After some types of errors, the client makes an error report and the server will assign the WU to someone else immediately. After other types of errors (like yours) the client says "Deleting current work unit & continuing..." and the server doesn't know anything about the error, only that somehow your client lost the WU. After that type of error, the WU will often be reassigned to you, and the server may not give up on you for as much as 14 days from when it was initially assigned to you, though many other things might happen.
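
As a rough illustration of the two cases described above, here is a toy model in Python. It is only a sketch of the behaviour as described in this post, not the actual server logic: the function `next_assignee` and its parameters are hypothetical, and the real server may do "many other things" that this ignores.

```python
from datetime import datetime, timedelta

# Preferred Deadline quoted above for Project 6881.
PREFERRED_DEADLINE = timedelta(days=14)


def next_assignee(assigned_at, now, error_reported, original_donor, other_donor):
    """Toy model (hypothetical): who may receive the WU next."""
    if error_reported:
        # The client filed an error report: hand the WU to someone else at once.
        return other_donor
    if now - assigned_at >= PREFERRED_DEADLINE:
        # No report, but the deadline has passed: stop waiting for this donor.
        return other_donor
    # No report and still inside the deadline: the same donor may get it again.
    return original_donor


start = datetime(2011, 6, 28, 15, 41)
print(next_assignee(start, start + timedelta(hours=4), False, "brooknet", "someone else"))
print(next_assignee(start, start + timedelta(days=15), False, "brooknet", "someone else"))
```

The first call models the situation in this thread (WU silently deleted a few hours after assignment), the second what happens once the 14 days have run out.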
brooknet
Posts: 11
Joined: Sat Apr 23, 2011 5:27 am

Re: Project: 6881 (Run 333, Clone 11, Gen 93)

Post by brooknet »

Hello Bruce,

Thanks - I am beginning to understand this a bit more now. I didn't think it would be a simple matter of 'bad unit - delete forever!' but when this happened before, I was assigned the same bad WU for about a month solid, and the FAH client was just going around in a loop: load, process, segfault, and start again. You said it depends on the project, so the timeout/deadline may have been different for the other bad unit.

I have got another computer running Folding@Home now, in addition to the one that is being discussed here. Wouldn't you know it though, I fitted rather a cheap & nasty PSU to that computer, and it crashes as soon as the FAH client starts. Running 'sensors' shows that the 5V line is at 4.78V! I'll be getting another PSU fairly soon, so I can continue running Folding@Home on it.