Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 2:38 pm
by dschief
Bizarre symptom. I have 6 boxes with 10 occupied GPU slots, but the Stanford stats only show 9 active {within 7 days}.
I have checked all the Advanced Control setups/configs; user name and team are correct. Logs for all 10 slots look OK {uploads and downloads + ACK}.
I can't begin to think how to track down which card isn't being picked up, much less where the points are going.
Only posted here because I would guess the problem is between the collection server and stats reporting.
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 2:53 pm
by bollix47
You have one slot that hasn't turned in a work unit since Oct 22 which would have dropped your active count from 10 to 9 for the last seven days.
Here is its last return:
Donator: dschief Team: 13761
CPUId: XXXXXXXX45A75BD6
Credit: 5757 Credit Time: 2013-10-22 14:00:11
Entered into logs at: 2013-10-22 14:00:03
WU assigned to donor at: 2013-10-21 08:16:50
Days taken to complete WU: 1.24
Error code: 0
Hi dschief (team 13761),
Your WU (P8018 R185 C0 G314) was added to the stats database on 2013-10-22 14:00:11 for 5757 points of credit.
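To see why the count dropped, compare that last credit time with the 7-day window. A minimal Python sketch, assuming the forum post time and the stats timestamp can be compared directly (time zones are not stated anywhere above; `is_active` and `ACTIVE_WINDOW` are illustrative names, not part of any stats tool):

```python
from datetime import datetime, timedelta

# Timestamps copied from the posts above. Treating them as being in the
# same time zone is an assumption.
last_credit = datetime(2013, 10, 22, 14, 0, 11)  # "Credit Time" of the last return
checked_at = datetime(2013, 10, 30, 14, 53)      # time this reply was posted
ACTIVE_WINDOW = timedelta(days=7)                # "active within 7 days"


def is_active(last_return, now, window=ACTIVE_WINDOW):
    """Return True if the last return falls inside the activity window."""
    return now - last_return <= window


age = checked_at - last_credit
print(age.days, is_active(last_credit, checked_at))
```

The gap is a little over 8 days, so that ID falls outside the window; an offset of a few hours between time zones would not change that.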
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 4:19 pm
by dschief
Where do I find CPUId: XXXXXXXX45A75BD6, to help pinpoint it to 1 box?
I have gone back through the logs for all 10 slots. All 10 show uploads with ACK {400} within 24 hrs, some of the quicker ones multiple times.
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 4:50 pm
by bruce
dschief wrote:Where do I find CPUId: XXXXXXXX45A75BD6, to help pinpoint it to 1 box?
I have gone back through the logs for all 10 slots. All 10 show uploads with ACK {400} within 24 hrs, some of the quicker ones multiple times.
That value did appear in the V6 logs but V7 does not display it. It is rarely useful to the Donor. Just check the log for that WU. Leaving it out avoids the confusion of having two different things called a UserID and a UserName when we're really talking about a software ID for the client installation.
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 4:56 pm
by dschief
So since everything looks OK on my end {good logs} and V7 doesn't display the CPUId, there's no way to tell which box Stanford doesn't recognize, even though I'm getting ACKs back?
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 4:59 pm
by bruce
Give us the PRCG numbers of a WU from each client and we can spot which two may be using the same CPUID (assuming that you cloned a client from one machine to another and now two of them are being treated as the same one).
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 5:01 pm
by Joe_H
I don't know of a way to search for that CPUId with the current client. Instead, since by default the client keeps up to 16 older logs, I would search the current and older log files for that WU that was turned in. One format the character string would be in is 'Project: 8018 (Run 185, Clone 0, Gen 314)'. Most often you will only have been assigned a WU once ever, so the machine you find it on would be the one to check for errors in the configuration such as typos in your username.
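That search can be scripted. The following is only a sketch: it assumes the current and older logs are plain-text `.txt` files somewhere under one directory, and the function names and the `fah_data` path are placeholders of my own, not anything the client provides.

```python
import re
from pathlib import Path


def prcg_pattern(project, run, clone, gen):
    """Build a regex for the 'Project: P (Run R, Clone C, Gen G)' form quoted above."""
    return re.compile(
        rf"Project:\s*{project}\s*\(Run\s*{run},\s*Clone\s*{clone},\s*Gen\s*{gen}\)"
    )


def find_wu(log_dir, project, run, clone, gen):
    """Return (file name, line number, line) for each match in *.txt files under log_dir."""
    pattern = prcg_pattern(project, run, clone, gen)
    hits = []
    for path in sorted(Path(log_dir).rglob("*.txt")):
        # errors="replace" keeps the scan going if a log holds odd bytes.
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if pattern.search(line):
                    hits.append((path.name, lineno, line.rstrip()))
    return hits


# Example call (hypothetical directory name; point it at each box's log folder):
# for name, lineno, line in find_wu("fah_data", 8018, 185, 0, 314):
#     print(f"{name}:{lineno}: {line}")
```

Run it on each box in turn; the box whose logs contain P8018 R185 C0 G314 is the one that last reported under that CPUId.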
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 6:14 pm
by dschief
bruce wrote:Give us the PRCG numbers of a WU from each client and we can spot which two may be using the same CPUID (assuming that you cloned a client from one machine to another and now two of them are being treated as the same one).
I did not clone any client between boxes. I did, however, shuffle cards to eliminate a problem of having
a GTX460SE paired with a GTX560Ti. Would that have a similar result?
re: viewtopic.php?f=19&t=25099
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 6:35 pm
by P5-133XL
Just as a side note, the Stanford web site will not track a slot that has not yet returned a WU...
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 7:02 pm
by dschief
That's what's driving me crazy. I've got real-time logs on my end showing uploads with ACK from all 10 slots, most with 2+ a day.
The collection server must be taking them in since I'm seeing completed transactions, yet the stats server seems to be ignoring one particular card/slot/client.
If I let all boxes finish and shut down for 7 days, then do clean installs on all boxes, would that reset the values in the upstream servers?
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 7:22 pm
by P5-133XL
Supply one completed Project (run, gen, clone) from each slot and I'll look each one up ...
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 8:26 pm
by bruce
dschief wrote:bruce wrote:Give us the PRCG numbers of a WU from each client and we can spot which two may be using the same CPUID (assuming that you cloned a client from one machine to another and now two of them are being treated as the same one).
I did not clone any client between boxes. I did, however, shuffle cards to eliminate a problem of having
a GTX460SE paired with a GTX560Ti. Would that have a similar result?
re: viewtopic.php?f=19&t=25099
No. The CPUID is established when the client is (re-)installed. If you didn't move files between computers, that leaves the possibility that you misspelled your user name or something like that. Giving us the PRCGs of completed WUs from each client is probably the only way to figure it out. Specific combinations of GPUs would be unrelated information.
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 8:59 pm
by dschief
OPUS     Slot 0  7624 R17 C0 G239
KERMIT   Slot 0  5770 R0 C282 G8106
KERMIT   Slot 1  5770 R2 C262 G4881
Grendle  Slot 0  7626 R305 C0 G195
MARDUK   Slot 0  7660 R109 C0 G333
MARDUK   Slot 1  7626 R250 C0 G251
FRODO    Slot 0  7625 R117 C0 G235
FRODO    Slot 1  7660 R206 C0 G332
EORE     Slot 0  7625 R266 C0 G283
EORE     Slot 1  7626 R162 C0 G227
Re: Client/Slot not being tracked?
Posted: Wed Oct 30, 2013 9:39 pm
by bollix47
All 10 of those are showing in the database:
OPUS: Slot 0 7624 R17 C0 G239
CPUId: XXXXXXXX521D65EC
Hi dschief (team 13761),
Your WU (P7624 R17 C0 G239) was added to the stats database on 2013-10-29 21:07:01 for 14093 points of credit.
KERMIT Slot 0 5770 R0 C282 G8106
CPUId: XXXXXXXX75C9A2C0
Hi dschief (team 13761),
Your WU (P5770 R0 C282 G8106) was added to the stats database on 2013-10-30 13:07:39 for 353 points of credit.
1 5770 R2 C262 G4881
CPUId: XXXXXXXX75C9A2C0
Hi dschief (team 13761),
Your WU (P5770 R2 C262 G4881) was added to the stats database on 2013-10-30 10:07:46 for 353 points of credit.
Grendle Slot 0 7626 R305 C0 G195
CPUId: XXXXXXXX78B02975
Hi dschief (team 13761),
Your WU (P7626 R305 C0 G195) was added to the stats database on 2013-10-30 10:07:49 for 14093 points of credit.
MARDUK Slot 0 7660 R109 C0 G333
CPUId: XXXXXXXX6B739890
Hi dschief (team 13761),
Your WU (P7660 R109 C0 G333) was added to the stats database on 2013-10-30 06:07:52 for 4431 points of credit.
1 7626 R250 C0 G251
CPUId: XXXXXXXX6B739891
Hi dschief (team 13761),
Your WU (P7626 R250 C0 G251) was added to the stats database on 2013-10-29 21:07:01 for 14093 points of credit.
FRODO Slot 0 7625 R117 C0 G235
CPUId: XXXXXXXX6A448CBF
Hi dschief (team 13761),
Your WU (P7625 R117 C0 G235) was added to the stats database on 2013-10-30 06:07:52 for 14093 points of credit.
1 7660 R206 C0 G332
CPUId: XXXXXXXX6A448CC0
Hi dschief (team 13761),
Your WU (P7660 R206 C0 G332) was added to the stats database on 2013-10-29 17:07:06 for 4431 points of credit.
EORE Slot 0 7625 R266 C0 G283
CPUId: XXXXXXXX58030720
Hi dschief (team 13761),
Your WU (P7625 R266 C0 G283) was added to the stats database on 2013-10-29 07:07:56 for 14093 points of credit.
Slot 1 7626 R162 C0 G227
CPUId: XXXXXXXX58030721
Hi dschief (team 13761),
Your WU (P7626 R162 C0 G227) was added to the stats database on 2013-10-30 02:13:41 for 14093 points of credit.
Kermit slots 0 and 1 appear to have the same CPUId and if that's true then the two slots would only count as 1 in the stats. Please verify that the two PRCG numbers supplied for Kermit came from different slots. If so, you could set slot one to finish the current work unit, remove that slot after the WU uploads and re-add the slot. After the newly added slot has finished a work unit let us know the PRCG numbers and we can check the CPUId or you can check your stats after it uploads to see if your client count has changed to 10. It could take as long as an hour after uploading before it will show up in your stats.
If the second GPU was added after the software was installed, it's possible that the indices are not set correctly and that both work units are being processed on the same GPU. GPU-Z or other monitoring software can be used to verify whether this is happening. You could pause both slots and change the gpu-index for slot 0 from -1 to 0 and for slot 1 from -1 to 1. AFAIK the slot number does become part of the CPUId, which should make them different (see the last digit of the CPUId in the other multiple-GPU clients: slot 1 is one more than slot 0), so it's not clear to me why the indices would have anything to do with the problem, but it may be worth a try.
Another option which should work if the above doesn't change anything would be to finish both slots, uninstall the software including the data and reinstall the software. The installer is efficient at detecting the usable hardware and setting up multiple GPUs correctly as opposed to when adding a second GPU slot manually after the software has already been installed.
Re: Client/Slot not being tracked?
Posted: Thu Oct 31, 2013 6:08 am
by bruce
The raw CPUid is added to the slot number, giving what I would call a SlotID (though that's not what the label says). [In V6, it was a MachineID plus a UserID, and both terms were a bit obscure then, too.] Slots 0 and 1 on the same V7 client will have IDs that differ by one. The digits to the left will be the same (except if the least significant digit produces a carry).
CPUId: XXXXXXXX521D65EC OPUS Slot 0
CPUId: XXXXXXXX75C9A2C0 KERMIT Slot 0
CPUId: XXXXXXXX75C9A2C0 KERMIT Slot 1
CPUId: XXXXXXXX78B02975 Grendle Slot 0
CPUId: XXXXXXXX6B739890 MARDUK Slot 0
CPUId: XXXXXXXX6B739891 MARDUK slot 1
CPUId: XXXXXXXX6A448CBF FRODO Slot 0
CPUId: XXXXXXXX6A448CC0 FRODO slot 1
CPUId: XXXXXXXX58030720 EORE Slot 0
CPUId: XXXXXXXX58030721 EORE Slot 1
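Both points, the shared ID on KERMIT and the "differ by one" rule, can be checked mechanically. A small Python sketch using only the visible last 8 hex digits from the list above; the upper digits are masked, so wrapping at 32 bits is an assumption of this sketch, and `shared_ids` and `expected_id` are illustrative names:

```python
from collections import defaultdict

# (host, slot, visible hex digits of the reported CPUId), copied from the list above.
REPORTED = [
    ("OPUS", 0, "521D65EC"),
    ("KERMIT", 0, "75C9A2C0"),
    ("KERMIT", 1, "75C9A2C0"),
    ("Grendle", 0, "78B02975"),
    ("MARDUK", 0, "6B739890"),
    ("MARDUK", 1, "6B739891"),
    ("FRODO", 0, "6A448CBF"),
    ("FRODO", 1, "6A448CC0"),
    ("EORE", 0, "58030720"),
    ("EORE", 1, "58030721"),
]


def shared_ids(rows):
    """Map each ID reported by more than one slot to the slots reporting it."""
    seen = defaultdict(list)
    for host, slot, hex_id in rows:
        seen[hex_id].append((host, slot))
    return {hex_id: slots for hex_id, slots in seen.items() if len(slots) > 1}


def expected_id(slot0_hex, slot):
    """Slot 0's ID plus the slot number, as 8 hex digits (32-bit wrap assumed)."""
    return format((int(slot0_hex, 16) + slot) & 0xFFFFFFFF, "08X")


print(shared_ids(REPORTED))        # only KERMIT's two slots share an ID
print(expected_id("6A448CBF", 1))  # FRODO: the +1 carries out of the last digit
```

By the same rule, KERMIT slot 1 would be expected to report an ID ending in 75C9A2C1 rather than repeating slot 0's value.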
What settings appear in the configuration of KERMIT's slots? Did you edit that configuration about the time the machine count changed?
I would remove slot 1 from KERMIT and then add a new one. The only question I'd have is whether to reset the same modified values or adjust them.
Check as many recent WUs as you can find for those two slots. Are they being assigned duplicate WUs, or are they all different?