Stats not updating

Moderators: Site Moderators, FAHC Science Team

-alias-
Posts: 121
Joined: Sun Feb 22, 2009 1:20 pm

Re: Stats not updating

Post by -alias- »

7im wrote:
-alias- wrote:As usual, nobody reacts with PG / Stanford to fix the problem.
Except they just fixed the OS STATS page reporting, so that blows your "as usual" theory completely out of the water and into completely unrecognizable dust.
I think you've missed my point here! The fact is that the stats server was down for about 24 hours, and PG did not discover that it was out of service until you notified them with a ping, and even then it took several hours before there was any reaction. I am an amateur with 6 servers spread across three buildings, but if an error occurs on one of them I am automatically notified by HFM.NET via e-mail to my smartphone. I can then log on to the server remotely and fix the problem promptly. PG is the professional party in this system, but it looks like they do not have any sort of automatic notification if something goes wrong on their side.
drougnor
Posts: 147
Joined: Tue Dec 29, 2009 2:21 am

Re: Stats not updating

Post by drougnor »

What you need to remember, -alias-, is that Stanford is a Research University, therefore their IT is focused on monitoring and maintaining the systems that are CORE to that research.

That core focus, unfortunately for us, DOESN'T include our outbound stats server.

So, we monitor the stats and report, and WHEN the appropriate IT folks can, they bring them back up. But we have to be patient and deal with the fact that a small IT team is already spread far too thin. Just like in EVERY professional IT setting.

my $.02. do with it what you will.
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Stats not updating

Post by 7im »

No offense, -alias-, but you have no way of knowing what notifications PG did or did not receive about the stats server. My ping post could very well have been redundant.

Not at all related to your position, but it could be said that the Ping post is as much to placate the perpetual complainers as any other function.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
kasi
Posts: 7
Joined: Fri Apr 18, 2014 2:29 am
Location: Australia

Re: Stats not updating

Post by kasi »

OK, I have chosen to do another task. I'm not sure if that is advisable with current tasks that take more than a day to complete, but I'll give it a go and see what happens. If that one goes missing as well I'll stop; from years of distributed computing I know better than to continue to send results to servers that are overloaded or malfunctioning.
kyleb
Pande Group Member
Posts: 272
Joined: Fri Mar 12, 2010 8:53 pm

Re: Stats not updating

Post by kyleb »

I'm looking into this. I definitely have received the results from 13001 R6 C0 G7, so I'm trying to figure out what went wrong here.
davidcoton
Posts: 1094
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Re: Stats not updating

Post by davidcoton »

kasi wrote: OK, I have chosen to do another task. I'm not sure if that is advisable with current tasks that take more than a day to complete, but I'll give it a go and see what happens. If that one goes missing as well I'll stop; from years of distributed computing I know better than to continue to send results to servers that are overloaded or malfunctioning.
I don't think there is a problem with the work being received.
Joe_H wrote: ... one of the stats logs may have not been processed into the database.
That implies that everything was received correctly, but the points were not transferred when the stats server came back. It does not imply an overloaded Collection Server, though a separate(?) problem with another server recently DID affect the return of some WUs. I have 90%+ confidence that the missing log file can be found and transferred, so the credit should turn up eventually.

David
Sunny
Posts: 16
Joined: Mon Mar 12, 2012 1:23 pm

Re: Stats not updating

Post by Sunny »

Points not yet updated for the following Project 13000 WU (completed yesterday morning, Apr 17):
18:17:20:WU00:FS01:0x17:Project: 13000 (Run 838, Clone 0, Gen 6)
18:17:20:WU00:FS01:0x17:Unit: 0x00000011538b3db753108822086b93a1

11:35:23:WU00:FS01:0x17:Completed 5000000 out of 5000000 steps (100%)
11:35:23:WU01:FS01:Connecting to assign-GPU.stanford.edu:80
11:35:24:WU01:FS01:News: Welcome to Folding@Home
11:35:24:WU01:FS01:Assigned to work server 171.64.65.56
11:35:24:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GF114 [GeForce GTX 560 Ti] from 171.64.65.56
11:35:24:WU01:FS01:Connecting to 171.64.65.56:8080
11:35:25:WU01:FS01:Downloading 4.32MiB
11:35:31:WU01:FS01:Download 53.48%
11:35:35:WU01:FS01:Download complete
11:35:35:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9408 run:451 clone:0 gen:1 core:0x17 unit:0x000000010a3b1e5c5342d982d96a60c7
11:35:37:WU00:FS01:0x17:Saving result file logfile_01.txt
11:35:37:WU00:FS01:0x17:Saving result file checkpointState.xml
11:35:39:WU00:FS01:0x17:Saving result file checkpt.crc
11:35:39:WU00:FS01:0x17:Saving result file log.txt
11:35:39:WU00:FS01:0x17:Saving result file positions.xtc
11:35:42:WU00:FS01:0x17:Folding@home Core Shutdown: FINISHED_UNIT
11:35:42:WU00:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
11:35:42:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13000 run:838 clone:0 gen:6 core:0x17 unit:0x00000011538b3db753108822086b93a1
11:35:42:WU00:FS01:Uploading 12.83MiB to 140.163.4.231
11:35:42:WU00:FS01:Connecting to 140.163.4.231:8080
11:35:42:WU01:FS01:Starting
11:35:42:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /root/spot/Folding/cores/www.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17 -dir 01 -suffix 01 -version 703 -lifeline 32361 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
11:35:42:WU01:FS01:Started FahCore on PID 32601
11:35:42:WU01:FS01:Core PID:32605
11:35:42:WU01:FS01:FahCore 0x17 started
11:35:43:WU01:FS01:0x17:*********************** Log Started 2014-04-17T11:35:43Z ***********************
11:35:43:WU01:FS01:0x17:Project: 9408 (Run 451, Clone 0, Gen 1)
11:35:43:WU01:FS01:0x17:Unit: 0x000000010a3b1e5c5342d982d96a60c7
11:35:43:WU01:FS01:0x17:CPU: 0x00000000000000000000000000000000
11:35:43:WU01:FS01:0x17:Machine: 1
11:35:43:WU01:FS01:0x17:Reading tar file state.xml
11:35:43:WU01:FS01:0x17:Reading tar file system.xml
11:35:44:WU01:FS01:0x17:Reading tar file integrator.xml
11:35:44:WU01:FS01:0x17:Reading tar file core.xml
11:35:44:WU01:FS01:0x17:Digital signatures verified
11:35:48:WU00:FS01:Upload 5.36%
11:35:55:WU00:FS01:Upload 10.23%
11:36:01:WU00:FS01:Upload 14.61%
11:36:07:WU00:FS01:Upload 19.00%
11:36:13:WU00:FS01:Upload 23.38%
11:36:19:WU00:FS01:Upload 26.30%
11:36:25:WU00:FS01:Upload 31.66%
11:36:31:WU00:FS01:Upload 35.56%
11:36:38:WU00:FS01:Upload 39.46%
11:36:44:WU00:FS01:Upload 43.84%
11:36:50:WU00:FS01:Upload 48.23%
11:36:56:WU00:FS01:Upload 52.12%
11:37:02:WU00:FS01:Upload 56.02%
11:37:08:WU00:FS01:Upload 60.40%
11:37:14:WU00:FS01:Upload 64.30%
11:37:21:WU00:FS01:Upload 68.68%
11:37:28:WU00:FS01:Upload 74.04%
11:37:34:WU00:FS01:Upload 77.94%
11:37:40:WU00:FS01:Upload 82.32%
11:37:46:WU00:FS01:Upload 86.71%
11:37:53:WU00:FS01:Upload 91.09%
11:37:59:WU00:FS01:Upload 94.99%
11:38:06:WU00:FS01:Upload 99.86%
11:38:19:WU00:FS01:Upload complete
11:38:19:WU00:FS01:Server responded WORK_ACK (400)
11:38:19:WU00:FS01:Final credit estimate, 41042.00 points
11:38:19:WU00:FS01:Cleaning up
folding_hoomer
Posts: 349
Joined: Sun Feb 10, 2013 6:06 pm
Hardware configuration: Sys 1: I7 2700K@4,4GHz with NH-C14
8GB G.Skill Sniper DDR3 1866MHz CL 9-10-9-28
MSI Z68A-GD65 (G3), various operating systems (WinXP, Ubuntu: 10.4.3 LTS, 12.04.2 LTS)
Optional: GTX560TI 448@stock/OC´d

Sys 2: I7 3930K@4,4GHz with Corsair H110
16GB G.Skill Ripjaws X DDR3 1866MHz CL 9-10-9-28
ASUS Rampage IV Formula, Ubuntu 10.10

Sys 3 i7 875K@3,826 GHz with Scythe Mine2
8GB G.Skill Sniper DDR3 1866MHz CL 9-10-9-28
MSI P55-GD80, Win7 64Bit Pro
Sapphire Radeon HD5870@1,163V 900/1250MHz
Sapphire Radeon HD7870@1,218V 1200/1300MHz

Sys 4 i7 2600K@4,4GHz with Scythe Mine2
8GB G.Skill Sniper DDR3 1866MHz CL 9-10-9-28
MSI Z68A-GD65 (G3), various operating systems (WinXP, Ubuntu: 10.4.3 LTS, 12.04.2 LTS)
Optional: GTX560TI 448@stock/OC´d

Optional:
ASUS P5Q Pro with Q9550
ASUS P5Q Pro with Q6300
Location: Bavaria, Germany

Re: Stats not updating

Post by folding_hoomer »

A member of my team is missing the credits for one 13001.
Member: hbf878, Team: 70335
WU 13001 (351,8,3)
Upload finished: 17 April, 16:47:12 UTC

Log:

15:49:36:WU01:FS00:0x17:Completed 4900000 out of 5000000 steps (98%)
16:16:04:WU01:FS00:0x17:Completed 4950000 out of 5000000 steps (99%)
******************************* Date: 2014-04-17 *******************************
16:42:31:WU01:FS00:0x17:Completed 5000000 out of 5000000 steps (100%)
16:43:01:WU01:FS00:0x17:Saving result file logfile_01.txt
16:43:01:WU01:FS00:0x17:Saving result file checkpointState.xml
16:43:04:WU01:FS00:0x17:Saving result file checkpt.crc
16:43:04:WU01:FS00:0x17:Saving result file log.txt
16:43:04:WU01:FS00:0x17:Saving result file positions.xtc
16:43:07:WU01:FS00:0x17:Folding@home Core Shutdown: FINISHED_UNIT
16:43:08:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
16:43:09:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:13001 run:351 clone:8 gen:3 core:0x17 unit:0x00000004538b3db75328b3622cd3349f
16:43:09:WU01:FS00:Uploading 12.83MiB to 140.163.4.231
16:43:09:WU01:FS00:Connecting to 140.163.4.231:8080
16:43:16:WU01:FS00:Upload 9.74%
16:43:23:WU01:FS00:Upload 11.69%
16:43:30:WU01:FS00:Upload 17.05%
16:43:42:WU01:FS00:Upload 19.48%
16:43:48:WU01:FS00:Upload 24.84%
16:43:59:WU01:FS00:Upload 27.27%
16:44:06:WU01:FS00:Upload 33.12%
16:44:16:WU01:FS00:Upload 35.06%
16:44:22:WU01:FS00:Upload 40.42%
16:44:33:WU01:FS00:Upload 42.86%
16:44:39:WU01:FS00:Upload 48.21%
16:44:51:WU01:FS00:Upload 50.65%
16:44:57:WU01:FS00:Upload 56.01%
16:45:03:WU01:FS00:Upload 58.44%
16:45:09:WU01:FS00:Upload 60.88%
16:45:15:WU01:FS00:Upload 63.80%
16:45:25:WU01:FS00:Upload 66.72%
16:45:33:WU01:FS00:Upload 70.62%
16:45:40:WU01:FS00:Upload 73.05%
16:45:47:WU01:FS00:Upload 75.97%
16:45:53:WU01:FS00:Upload 78.41%
16:45:59:WU01:FS00:Upload 81.33%
16:46:05:WU01:FS00:Upload 83.77%
16:46:11:WU01:FS00:Upload 86.69%
16:46:17:WU01:FS00:Upload 89.61%
16:46:24:WU01:FS00:Upload 92.05%
16:46:30:WU01:FS00:Upload 94.48%
16:46:36:WU01:FS00:Upload 97.40%
16:46:43:WU01:FS00:Upload 99.84%
16:47:12:WU01:FS00:Upload complete
16:47:12:WU01:FS00:Server responded WORK_ACK (400)
16:47:12:WU01:FS00:Final credit estimate, 39511.00 points
16:47:12:WU01:FS00:Cleaning up
Thanks in advance.
kasi
Posts: 7
Joined: Fri Apr 18, 2014 2:29 am
Location: Australia

Re: Stats not updating

Post by kasi »

Thank you kyleb for confirming that results have been received for 13001 R6 C0 G7.

I would prefer the credit if/when it can be fixed, but my main concerns here were to avoid burning coal wastefully and to contribute to scientific research.
MonsterBuilder
Posts: 1
Joined: Fri Apr 18, 2014 7:59 pm

Re: Stats not updating

Post by MonsterBuilder »

Looks like I lost one in the jumble as well - I've not been credited with a completed gpu WU since 4/14 - was this one received successfully?

02:03:56:WU00:FS00:0x17:*********************** Log Started 2014-04-16T02:03:55Z ***********************
02:03:56:WU00:FS00:0x17:Project: 13001 (Run 264, Clone 6, Gen 3)
02:03:56:WU00:FS00:0x17:Unit: 0x00000005538b3db753289aad433ed310
02:03:56:WU00:FS00:0x17:CPU: 0x00000000000000000000000000000000
02:03:56:WU00:FS00:0x17:Machine: 0
02:03:56:WU00:FS00:0x17:Digital signatures verified
02:03:56:WU00:FS00:0x17:Folding@home GPU core17
02:03:56:WU00:FS00:0x17:Version 0.0.52
02:03:56:WU00:FS00:0x17:  Found a checkpoint file
02:04:21:Started thread 10 on PID 4388
02:08:57:WU00:FS00:0x17:Completed 2500000 out of 5000000 steps (50%)
02:08:57:WU00:FS00:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
02:37:15:WU00:FS00:0x17:Completed 2550000 out of 5000000 steps (51%)

- snip -

******************************* Date: 2014-04-17 *******************************

03:50:30:WU00:FS00:0x17:Completed 4800000 out of 5000000 steps (96%)
04:53:56:WU00:FS00:0x17:Completed 4850000 out of 5000000 steps (97%)
05:22:04:WU00:FS00:0x17:Completed 4900000 out of 5000000 steps (98%)
05:49:46:WU00:FS00:0x17:Completed 4950000 out of 5000000 steps (99%)
06:17:27:WU00:FS00:0x17:Completed 5000000 out of 5000000 steps (100%)
06:17:27:WU02:FS00:Connecting to assign-GPU.stanford.edu:80
06:17:27:WU02:FS00:News: Welcome to Folding@Home
06:17:27:WU02:FS00:Assigned to work server 140.163.4.231
06:17:27:WU02:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:R575A [AMD Radeon HD7700 Series] from 140.163.4.231
06:17:27:WU02:FS00:Connecting to 140.163.4.231:8080
06:17:28:WU02:FS00:Downloading 4.84MiB
06:17:33:WU02:FS00:Download complete
06:17:33:WU02:FS00:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:13001 run:257 clone:6 gen:9 core:0x17 unit:0x00000013538b3db7532898b0b4095bfd
06:17:49:WU00:FS00:0x17:Saving result file logfile_01.txt
06:17:49:WU00:FS00:0x17:Saving result file checkpointState.xml
06:17:52:WU00:FS00:0x17:Saving result file checkpt.crc
06:17:52:WU00:FS00:0x17:Saving result file log.txt
06:17:52:WU00:FS00:0x17:Saving result file positions.xtc
06:17:55:WU00:FS00:0x17:Folding@home Core Shutdown: FINISHED_UNIT
06:17:55:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
06:17:55:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:13001 run:264 clone:6 gen:3 core:0x17 unit:0x00000005538b3db753289aad433ed310
06:17:55:WU00:FS00:Uploading 12.83MiB to 140.163.4.231
06:17:55:WU00:FS00:Connecting to 140.163.4.231:8080
06:17:56:WU02:FS00:Starting
06:17:56:WU02:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "C:/Users/Eric Buchanan/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe" -dir 02 -suffix 01 -version 703 -lifeline 4388 -checkpoint 15 -gpu 0 -gpu-vendor ati
06:17:57:WU02:FS00:Started FahCore on PID 5568
06:17:57:Started thread 13 on PID 4388
06:17:58:WU02:FS00:Core PID:8420
06:17:58:WU02:FS00:FahCore 0x17 started
06:17:58:WU02:FS00:0x17:*********************** Log Started 2014-04-17T06:17:58Z ***********************
06:17:58:WU02:FS00:0x17:Project: 13001 (Run 257, Clone 6, Gen 9)
06:17:58:WU02:FS00:0x17:Unit: 0x00000013538b3db7532898b0b4095bfd
06:17:58:WU02:FS00:0x17:CPU: 0x00000000000000000000000000000000
06:17:58:WU02:FS00:0x17:Machine: 0
06:17:58:WU02:FS00:0x17:Reading tar file state.xml
06:17:59:WU02:FS00:0x17:Reading tar file system.xml
06:17:59:WU02:FS00:0x17:Reading tar file integrator.xml
06:17:59:WU02:FS00:0x17:Reading tar file core.xml
06:17:59:WU02:FS00:0x17:Digital signatures verified
06:17:59:WU02:FS00:0x17:Folding@home GPU core17
06:17:59:WU02:FS00:0x17:Version 0.0.52
06:18:01:WU00:FS00:Upload 7.79%
06:18:07:WU00:FS00:Upload 13.15%
06:18:13:WU00:FS00:Upload 18.51%
06:18:19:WU00:FS00:Upload 23.87%
06:18:25:WU00:FS00:Upload 29.71%
06:18:31:WU00:FS00:Upload 34.58%
06:18:37:WU00:FS00:Upload 39.94%
06:18:43:WU00:FS00:Upload 45.30%
06:18:49:WU00:FS00:Upload 50.66%
06:18:55:WU00:FS00:Upload 56.02%
06:19:01:WU00:FS00:Upload 61.37%
06:19:07:WU00:FS00:Upload 66.73%
06:19:13:WU00:FS00:Upload 72.09%
06:19:19:WU00:FS00:Upload 77.45%
06:19:25:WU00:FS00:Upload 82.81%
06:19:31:WU00:FS00:Upload 88.16%
06:19:37:WU00:FS00:Upload 93.52%
06:19:43:WU00:FS00:Upload 98.88%
06:19:58:WU00:FS00:Upload complete
06:19:59:WU00:FS00:Server responded WORK_ACK (400)
06:19:59:WU00:FS00:Final credit estimate, 36677.00 points
06:19:59:WU00:FS00:Cleaning up

Valkyrie
Posts: 43
Joined: Sat Apr 14, 2012 9:03 pm
Location: Canada

Re: Stats not updating

Post by Valkyrie »

So did 140.163.4.231 accept a bunch of completed WU's and just keep them? That server seems to be a common theme here.
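One rough way to check that hunch is to tally, per server, the uploads that a client log shows as acknowledged. The Python below is only a sketch: it assumes log lines shaped like the ones pasted in this thread ("Uploading ... to <address>" followed later by "Server responded WORK_ACK" for the same WU and slot), and the function name `acked_uploads_by_server` is made up for illustration. A WORK_ACK only shows that the server acknowledged the upload; it says nothing about whether the points were credited.

```python
import re
from collections import Counter

# Patterns assume the FAHClient log layout seen in the logs posted above.
UPLOAD = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):(FS\d+):Uploading \S+ to (\S+)$")
ACK = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):(FS\d+):Server responded WORK_ACK")


def acked_uploads_by_server(lines):
    """Count acknowledged uploads per server address (hypothetical helper)."""
    pending = {}        # (WU id, slot id) -> server the upload was sent to
    counts = Counter()  # server address -> number of acknowledged uploads
    for raw in lines:
        line = raw.strip()
        m = UPLOAD.match(line)
        if m:
            pending[(m.group(1), m.group(2))] = m.group(3)
            continue
        m = ACK.match(line)
        if m:
            server = pending.pop((m.group(1), m.group(2)), None)
            if server is not None:
                counts[server] += 1
    return counts
```

Feeding it the lines of a saved log (for example `acked_uploads_by_server(open(path))`, where `path` is wherever your own log lives) gives a per-server count that can be compared between people reporting missing credit.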
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Stats not updating

Post by Joe_H »

Based on what has been reported in prior stats server problems, I would guess that one or more logs of point results were sent from the WS to the stats server but not processed. In the past, finding unprocessed logs has usually taken a few days. As I understand it, the information on which WU's were received but not credited is helpful in locating the specific logs.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
DocJonz
Posts: 244
Joined: Thu Dec 06, 2007 6:31 pm
Hardware configuration: Folding with: 4x RTX 4070Ti, 1x RTX 4080 Super
Location: United Kingdom

Re: Stats not updating

Post by DocJonz »

Joe_H wrote: Based on what has been reported in prior stats server problems, I would guess that one or more logs of point results were sent from the WS to the stats server but not processed. In the past, finding unprocessed logs has usually taken a few days. As I understand it, the information on which WU's were received but not credited is helpful in locating the specific logs.
I'm guessing I fall in that category - my stats for the 16th April were half what they should have been. How would I go about trying to determine which WU's went awry?
Folding Stats (HFM.NET): DocJonz Folding Farm Stats
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Stats not updating

Post by Joe_H »

DocJonz wrote:I'm guessing I fall in that category - my stats for the 16th April were half what they should have been. How would I go about trying to determine which WU's went awry?
Everybody's stats were short for the 16th, because the stats stopped updating about halfway through the day. The backlog was processed starting about 24 hours later, so you should have seen a bump in your points on the 17th as the WU's turned in during the outage were credited. As for determining which WU's, if any, did not get credited, you would have to compare the estimated points with what was eventually awarded. That can be relatively easy if you have a small number of machines and WU's, and not very easy if you have many. So far, though, it appears that the missing ones are mostly WU's from projects 13000 and 13001. As for reporting specific WU's as uncredited, they may already have enough reports to determine which logs they are looking for.
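For anyone with more than a handful of machines, part of that comparison can be scripted. What follows is a rough Python sketch, not an official tool: it assumes the client log contains "Sending unit results" and "Final credit estimate" lines in the same layout as the logs posted in this thread, and the names `estimated_credits`, `shortfall` and `awarded_points` are made up for illustration. The awarded figure still has to be read off the stats pages by hand.

```python
import re

# Patterns assume the FAHClient log layout seen in the logs posted above.
SEND = re.compile(
    r":(WU\d+):(FS\d+):Sending unit results:.*?"
    r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)
CREDIT = re.compile(r":(WU\d+):(FS\d+):Final credit estimate, ([\d.]+) points")


def estimated_credits(lines):
    """Map (project, run, clone, gen) to the client's estimated points."""
    in_flight = {}  # (WU id, slot id) -> (project, run, clone, gen)
    estimates = {}
    for line in lines:
        m = SEND.search(line)
        if m:
            wu, slot, *prcg = m.groups()
            in_flight[(wu, slot)] = tuple(int(x) for x in prcg)
            continue
        m = CREDIT.search(line)
        if m:
            key = in_flight.pop((m.group(1), m.group(2)), None)
            if key is not None:
                estimates[key] = float(m.group(3))
    return estimates


def shortfall(estimates, awarded_points):
    """Estimated total minus the points actually awarded (entered by hand)."""
    return sum(estimates.values()) - awarded_points
```

Filtering the result for projects 13000 and 13001 gives a short list of Project/Run/Clone/Gen values to post here. The estimate is only the client's own figure, so a small difference from the awarded points does not by itself mean a WU is missing.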
DocJonz
Posts: 244
Joined: Thu Dec 06, 2007 6:31 pm
Hardware configuration: Folding with: 4x RTX 4070Ti, 1x RTX 4080 Super
Location: United Kingdom

Re: Stats not updating

Post by DocJonz »

Thanks Joe_H.
I was expecting the spike when things were credited (as has happened before) ... but haven't seen one just yet.
I had 5x GTX 780's running 13000/1 over the period (in addition to high core-count CPU's) so maybe they are still in the 'missing' category.