PRCG 6941 0 227 94 Gone Missing Possibly


shunter
Posts: 84
Joined: Sun Apr 06, 2008 8:22 am
Location: Hertfordshire, United Kingdom

PRCG 6941 0 227 94 Gone Missing Possibly

Post by shunter »

Sometime in the last 10 hours I seem to have lost a completed unit, and I think it is the one that finished at about 23:41 GMT. The log shows two failed upload attempts and then a successful send to the Collection server, but the server reported a problem with the unit and my Units Completed count did not go up as usual.
It's possible I've been credited with some points but not the full amount, which would explain why Units Completed hasn't gone up, but why? The unit completed successfully, so what's the issue?
Thanks

[23:40:52] Completed 500000 out of 500000 steps  (100%)
[23:40:53] DynamicWrapper: Finished Work Unit: sleep=10000
[23:41:03] 
[23:41:03] Finished Work Unit:
[23:41:03] - Reading up to 3699696 from "work/wudata_08.trr": Read 3699696
[23:41:03] trr file hash check passed.
[23:41:03] edr file hash check passed.
[23:41:03] logfile size: 64204
[23:41:03] Leaving Run
[23:41:06] - Writing 3798836 bytes of core data to disk...
[23:41:07] Done: 3798324 -> 3521769 (compressed to 92.7 percent)
[23:41:07]   ... Done.
[23:41:36] - Shutting down core
[23:41:36] 
[23:41:36] Folding@home Core Shutdown: FINISHED_UNIT
[23:41:40] CoreStatus = 64 (100)
[23:41:40] Unit 8 finished with 63 percent of time to deadline remaining.
[23:41:40] Updated performance fraction: 0.614847
[23:41:40] Sending work to server
[23:41:40] Project: 6941 (Run 0, Clone 227, Gen 94)


[23:41:40] + Attempting to send results [January 30 23:41:40 UTC]
[23:41:40] - Reading file work/wuresults_08.dat from core
[23:41:40]   (Read 3522281 bytes from disk)
[23:41:40] Connecting to http://128.143.199.96:8080/
[23:42:43] - Couldn't send HTTP request to server
[23:42:43] + Could not connect to Work Server (results)
[23:42:43]     (128.143.199.96:8080)
[23:42:43] + Retrying using alternative port
[23:42:43] Connecting to http://128.143.199.96:80/
[23:43:46] - Couldn't send HTTP request to server
[23:43:46] + Could not connect to Work Server (results)
[23:43:46]     (128.143.199.96:80)
[23:43:46] - Error: Could not transmit unit 08 (completed January 30) to work server.
[23:43:46] - 1 failed uploads of this unit.
[23:43:46]   Keeping unit 08 in queue.
[23:43:46] Trying to send all finished work units
[23:43:46] Project: 6941 (Run 0, Clone 227, Gen 94)


[23:43:46] + Attempting to send results [January 30 23:43:46 UTC]
[23:43:46] - Reading file work/wuresults_08.dat from core
[23:43:46]   (Read 3522281 bytes from disk)
[23:43:46] Connecting to http://128.143.199.96:8080/
[23:44:49] - Couldn't send HTTP request to server
[23:44:49] + Could not connect to Work Server (results)
[23:44:49]     (128.143.199.96:8080)
[23:44:49] + Retrying using alternative port
[23:44:49] Connecting to http://128.143.199.96:80/
[23:45:53] - Couldn't send HTTP request to server
[23:45:53] + Could not connect to Work Server (results)
[23:45:53]     (128.143.199.96:80)
[23:45:53] - Error: Could not transmit unit 08 (completed January 30) to work server.
[23:45:53] - 2 failed uploads of this unit.


[23:45:53] + Attempting to send results [January 30 23:45:53 UTC]
[23:45:53] - Reading file work/wuresults_08.dat from core
[23:45:53]   (Read 3522281 bytes from disk)
[23:45:53] Connecting to http://128.143.231.201:8080/
[23:45:59] Posted data.
[23:45:59] Initial: 0000; - Uploaded at ~573 kB/s
[23:45:59] - Averaged speed for that direction ~477 kB/s
[23:45:59] - Server reports problem with unit.
[23:45:59]   Successfully sent unit 08 to Collection server.
[23:46:07] + Sent 1 of 1 completed units to the server
[23:46:07] - Preparing to get new work unit...
[23:46:07] Cleaning up work directory
[23:46:11] + Attempting to get work packet
[23:46:11] Passkey found
[23:46:11] - Will indicate memory of 1938 MB
[23:46:11] - Connecting to assignment server
[23:46:11] Connecting to http://assign.stanford.edu:8080/
[23:46:12] Posted data.
[23:46:12] Initial: 8F80; - Successful: assigned to (128.143.199.96).
[23:46:12] + News From Folding@Home: Welcome to Folding@Home
[23:46:12] Loaded queue successfully.
[23:46:12] Sent data
[23:46:12] Connecting to http://128.143.199.96:8080/
[23:47:15] - Couldn't send HTTP request to server
[23:47:15] + Could not connect to Work Server
[23:47:15] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[23:47:22] + Attempting to get work packet
[23:47:22] Passkey found
[23:47:22] - Will indicate memory of 1938 MB
[23:47:22] - Connecting to assignment server
[23:47:22] Connecting to http://assign.stanford.edu:8080/
[23:47:23] Posted data.
[23:47:23] Initial: 8F80; - Successful: assigned to (128.143.231.202).
[23:47:23] + News From Folding@Home: Welcome to Folding@Home
[23:47:23] Loaded queue successfully.
[23:47:23] Sent data
[23:47:23] Connecting to http://128.143.231.202:8080/
[23:47:24] Posted data.
[23:47:24] Initial: 0000; - Receiving payload (expected size: 3807487)
[23:47:27] - Downloaded at ~1239 kB/s
[23:47:27] - Averaged speed for that direction ~1239 kB/s
[23:47:27] + Received work.
[23:47:27] Trying to send all finished work units
[23:47:27] + No unsent completed units remaining.
[23:47:27] + Closed connections
[23:47:27] 
[23:47:27] + Processing work unit
[23:47:27] Core required: FahCore_a3.exe
[23:47:27] Core found.
[23:47:27] Working on queue slot 09 [January 30 23:47:27 UTC]
[23:47:27] + Working ...
[23:47:27] - Calling './FahCore_a3.exe -dir work/ -nice 19 -suffix 09 -np 2 -checkpoint 15 -forceasm -verbose -lifeline 2183 -version 634'

[23:47:27] 
[23:47:27] *------------------------------*
[23:47:27] Folding@Home Gromacs SMP Core
[23:47:27] Version 2.27 (Dec. 15, 2010)
[23:47:27] 
[23:47:27] Preparing to commence simulation
[23:47:27] - Assembly optimizations manually forced on.
[23:47:27] - Not checking prior termination.
[23:47:28] - Expanded 3806975 -> 4136808 (decompressed 108.6 percent)
[23:47:28] Called DecompressByteArray: compressed_data_size=3806975 data_size=4136808, decompressed_data_size=4136808 diff=0
[23:47:28] - Digital signature verified
[23:47:28] 
[23:47:28] Project: 6098 (Run 2, Clone 69, Gen 293)
[23:47:28] 
[23:47:28] Assembly optimizations on if available.
[23:47:28] Entering M.D.
[23:47:34] Mapping NT from 2 to 2 
[23:47:36] Completed 0 out of 500000 steps  (0%)
[00:40:59] Completed 5000 out of 500000 steps  (1%)
[01:32:46] Completed 10000 out of 500000 steps  (2%)
[02:24:31] Completed 15000 out of 500000 steps  (3%)
[03:15:34] - Autosending finished units... [January 31 03:15:34 UTC]
[03:15:34] Trying to send all finished work units
[03:15:34] + No unsent completed units remaining.
[03:15:34] - Autosend completed

P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am
Hardware configuration: Machine #1:

Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).

Machine #2:

Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.

Machine 3:

Dell Dimension 8400, 3.2GHz P4, 4x512MB Ram, Video card GTX 460, Windows 7 X32

I am currently folding on just the 5x GTX 460s for approx. 70K PPD
Location: Salem, OR, USA

Re: PRCG 6941 0 227 94 Gone Missing Possibly

Post by P5-133XL »

The message "Server reports problem with unit." means that the server detected some form of data corruption that the client did not catch. The message does not say what the cause was.

As to why the number of completed WUs did not go up: I checked the database, and it has no record of the unit being returned at all.
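
For anyone who wants to spot this pattern in their own log, here is a minimal Python sketch that lists each upload attempt and how it ended. It is only an illustration: the function name `summarize_uploads`, the outcome labels, and the `SAMPLE` excerpt are made up for this example, and the message patterns are taken solely from the log lines quoted in this thread, so other client versions may word them differently. It reports what the client logged, not what the server database recorded.

```python
import re

# Patterns are based only on the log lines quoted in this thread
# (assumption: other client versions may phrase these messages differently).
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def summarize_uploads(log_text):
    """Return one dict per '+ Attempting to send results' block in log_text."""
    attempts = []
    project = None   # last (project, run, clone, gen) seen in the log
    current = None   # the upload attempt currently being read, if any
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip blank lines and lines without a timestamp
        timestamp, message = match.groups()
        found = PROJECT_RE.search(message)
        if found:
            project = tuple(int(x) for x in found.groups())
            continue
        if message.startswith("+ Attempting to send results"):
            current = {"time": timestamp, "project": project,
                       "servers": [], "outcome": "unknown"}
            attempts.append(current)
            continue
        if current is None:
            continue
        if message.startswith(("+ Sent ", "+ Attempting to get work")):
            current = None  # the upload section of the log is over
        elif message.startswith("Connecting to "):
            current["servers"].append(message[len("Connecting to "):])
        elif message.startswith("- Error: Could not transmit"):
            current["outcome"] = "failed"
        elif message.startswith("- Server reports problem with unit"):
            current["outcome"] = "server reported problem"
        elif message.startswith("Successfully sent") and current["outcome"] == "unknown":
            current["outcome"] = "sent"
    return attempts


# Shortened excerpt of the log posted above, used only as demo input.
SAMPLE = """\
[23:43:46] Project: 6941 (Run 0, Clone 227, Gen 94)
[23:43:46] + Attempting to send results [January 30 23:43:46 UTC]
[23:43:46] Connecting to http://128.143.199.96:8080/
[23:44:49] - Couldn't send HTTP request to server
[23:44:49] Connecting to http://128.143.199.96:80/
[23:45:53] - Error: Could not transmit unit 08 (completed January 30) to work server.
[23:45:53] + Attempting to send results [January 30 23:45:53 UTC]
[23:45:53] Connecting to http://128.143.231.201:8080/
[23:45:59] - Server reports problem with unit.
[23:45:59]   Successfully sent unit 08 to Collection server.
[23:46:07] + Sent 1 of 1 completed units to the server
[23:46:11] Connecting to http://assign.stanford.edu:8080/
"""

if __name__ == "__main__":
    for attempt in summarize_uploads(SAMPLE):
        print(attempt["time"], attempt["project"], attempt["servers"], attempt["outcome"])
```

An attempt that ends as "server reported problem" is the case described above: the client considers the unit sent and clears it from the queue, even though the server flagged it.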
shunter
Posts: 84
Joined: Sun Apr 06, 2008 8:22 am
Location: Hertfordshire, United Kingdom

Re: PRCG 6941 0 227 94 Gone Missing Possibly

Post by shunter »

Well, that's a shame: a waste of 36 hours of crunching time.
Thanks for the info