Lost Time
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 5
- Joined: Sun Dec 02, 2007 2:25 pm
- Hardware configuration: 2) Q6600 G0 @ 3.2Ghz / ASUS P5B & Dlx / 2gig OCZ Plat. R2 / Vista 32/XP
XFX 8800 GTS 512 - 177.35 drivers
Q6600 G0 @ 3.2Ghz / XFX nForce 780i SLI / 2gig OCZ Plat. R2 / Vista 32 /
2x XFX 9800GTX - 177.35 drivers
Lost Time
I hope this is the right place to post this topic, looked like a good spot to me.
I have been wondering why we couldn't be working on the next work unit while the finished one uploads? There is a lot of wasted time every day waiting, and uploading is a slow process compared to downloading. Is there a way to make this work? It would allow more work to get done in the same time period.
-
- Site Moderator
- Posts: 6359
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
- Contact:
Re: Lost Time
I have the same feeling ... it's even more important when the WUs are short (many uploads and downloads) or need to send or download huge files (I'm thinking about SMP WUs and some BigWUs too) ...
It should be easy to reverse the process: download a WU and then send the results back while working on the new one (instead of send result, get a new WU, process it).
If it's not possible to change this behavior, another solution would be to take upload and download time into account when benchmarking the WU.
-
- Posts: 1024
- Joined: Sun Dec 02, 2007 12:43 pm
Re: Lost Time
I don't think that Stanford can take the upload and download time into account. Are you proposing that FAH somehow compensate everyone for their network time at maybe 1 Gb (which is essentially zero) while the rest of us are somewhere between 47 KB (modem) for some people and maybe 500/1500 KB or better for others?
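To put rough numbers on how widely network time can vary between donors, here is a small back-of-the-envelope sketch. The 25 MB result size and the link speeds are illustrative assumptions, not measured FAH figures, and `transfer_seconds` is a hypothetical helper that ignores protocol overhead and latency.

```python
def transfer_seconds(size_mb: float, link_kbit_per_s: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and latency."""
    size_kbit = size_mb * 8 * 1000  # 1 MB taken as 1000 kB, 1 byte = 8 bits
    return size_kbit / link_kbit_per_s


# Hypothetical upload speeds in kbit/s (assumptions for illustration only).
links = {"dial-up modem": 48, "DSL upstream": 512, "gigabit LAN": 1_000_000}

for name, speed in links.items():
    minutes = transfer_seconds(25, speed) / 60
    print(f"{name:>14}: {minutes:8.2f} min to upload 25 MB")
```

The gap between the slowest and fastest link spans several orders of magnitude, which is why a single "network time" allowance would be hard to apply fairly to everyone.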
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Lost Time
Good idea. It's been suggested many times before. However, the best time to have added that feature was 4 years ago when everyone was still on dialup and connection speeds were slow. Now that everyone is moving to faster and faster broadband, the advantage gets to be less and less, and the idea slowly falls lower and lower on the priority list for new developments. Stanford has limited resources, so they have to set priorities.
There used to be some counter-arguments too, but I don't remember what they were, or even if they still apply.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Posts: 5
- Joined: Sun Dec 02, 2007 2:25 pm
- Hardware configuration: 2) Q6600 G0 @ 3.2Ghz / ASUS P5B & Dlx / 2gig OCZ Plat. R2 / Vista 32/XP
XFX 8800 GTS 512 - 177.35 drivers
Q6600 G0 @ 3.2Ghz / XFX nForce 780i SLI / 2gig OCZ Plat. R2 / Vista 32 /
2x XFX 9800GTX - 177.35 drivers
Re: Lost Time
I wasn't thinking about compensation, points are paid for work done. I was just wondering if it was possible to streamline the process to allow work to continue through the upload process. I don't know how the upload process works (If it needs more computer resources than just a file transfer) but I thought it would help the effort if we could eliminate the lost time.
Truly the point system is a great tool. It allows Stanford to direct the flow by awarding more points where the work is needed most. It also gives people a chance to see what they have done and combine their efforts with teams adding fun competition. But honestly if the points went away tomorrow, I would still be folding. In the end, it's about building a better future. The more work we can get done, the closer we get to that goal.
-
- Posts: 450
- Joined: Tue Dec 04, 2007 8:36 pm
Re: Lost Time
One very simple change would help a lot: download a new WU BEFORE uploading the result. Then we can get started on the next assignment while the upload is processed, and if the upload fails two or three times, we don't lose as much time. Downloads are often small compared to uploads, too, and my DSL download speed is higher than my upload speed.
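A rough way to compare the two orderings is to count the seconds per work unit during which nothing is being folded. This is only a sketch: the function names and strategy labels are made up for illustration, and it assumes a background upload does not slow down the download or the folding itself.

```python
def idle_seconds(upload_s: float, download_s: float, strategy: str) -> float:
    """Seconds per work unit during which the processor is not folding."""
    if strategy == "upload_then_download":
        # Ordering described in this thread: send result, then fetch new work.
        return upload_s + download_s
    if strategy == "download_then_upload":
        # Proposal: fetch new work first, upload in the background while folding.
        return download_s
    raise ValueError(f"unknown strategy: {strategy}")


def idle_hours_per_year(upload_s: float, download_s: float,
                        strategy: str, units_per_day: float) -> float:
    """Scale the per-unit idle time up to a year of continuous folding."""
    return idle_seconds(upload_s, download_s, strategy) * units_per_day * 365 / 3600


# Hypothetical figures: 390 s upload, 30 s download, 4 work units per day.
for s in ("upload_then_download", "download_then_upload"):
    print(s, round(idle_hours_per_year(390, 30, s, 4), 1), "idle hours/year")
```

With a slow upstream link the upload term dominates, so moving it off the critical path removes most of the idle time in this simplified model.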
Re: Lost Time
It doesn't work that way, folks, given how it is set up and has been for a long time.
I will explain it as simply as I can.
They (the servers) need the unit that you crunched back first. And there is more than one reason for this.
If you don't send back the one that the AS last assigned you (on that system), you will be assigned the same unit again. The exception to this is if the client reports that the unit was completed but is holding it in your queue because it could not connect to the WS or CS to upload the completed unit.
After this, I would assume that there is some sort of communication to the effect of "OK got prior unit returned, send new work unit", or "Holding prior completed unit for upload later, send new work unit."
I could be way off base here. But what you are suggesting, would be to circumvent the checks that are in place to make sure that they get the completed assigned work back first. Which defeats the purpose of having them in the first place.
As I stated above, there is more than one reason for these checks to be in place. But, I will not go into the details of the others.
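The check described in this post can be sketched roughly as follows. This is only a reading of the description above, not the actual server code: `ClientReport`, `next_assignment`, and the unit names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientReport:
    returned_unit: Optional[str]  # unit uploaded just now, if any
    queued_unit: Optional[str]    # unit finished but still waiting to upload


def next_assignment(last_assigned: Optional[str],
                    report: ClientReport,
                    fresh_unit: str) -> str:
    """Return the unit a hypothetical assignment server hands out next."""
    if last_assigned is None:
        return fresh_unit      # first contact: nothing outstanding
    if report.returned_unit == last_assigned:
        return fresh_unit      # "OK got prior unit returned, send new work unit"
    if report.queued_unit == last_assigned:
        return fresh_unit      # "Holding prior completed unit for upload later"
    return last_assigned       # otherwise the same unit is assigned again


print(next_assignment("wu-A", ClientReport(None, None), "wu-B"))
print(next_assignment("wu-A", ClientReport(None, "wu-A"), "wu-B"))
```

Under this reading, a client that simply asked for new work without reporting on the previous unit would get the same unit back, which is the safeguard the post says a download-first scheme would have to work around.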
-=MB=-
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Lost Time
Aha, one of those counter-arguments I mentioned... I remember now. There's no point in sending out a new WU if the last WU you completed is actually screwed up because of too much overclocking or whatever. There is actually a small benefit in waiting to see how the previous WU turns out before sending out the next WU.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Posts: 1
- Joined: Thu Jun 19, 2008 1:08 am
- Hardware configuration: Q6600 @ 3.25 ** 8800 GTS 512 ** 4 GB ** Vista Ultimate SMP & GPU
Phenom @ 2.64 ** HD 3850 512 ** 4 GB ** Vista Ultimate SMP & GPU
A64 3200+ ** 1 GB ** Windows Home Server Console
P4 3.0 ** 1 GB ** XP Pro Console
P4 1.8 ** 512 MB ** XP Pro Console
Re: Lost Time
Judging by the simplified process explained above, it seems possible, since a user can in fact get a new WU before sending the results of the current one. Why not just have the client request a new WU at 99% and tell the server that the next one is waiting in queue, when in reality it's only 99% complete or even 99.5%? With a broadband connection though, I can't see this time loss being too much unless you have hundreds of boxes.
"A drug is neither moral nor immoral - it's a chemical compound. The compound itself is not a menace to society until a human being treats it as if consumption bestowed a temporary license to act like an asshole."
-
- Posts: 5
- Joined: Sun Dec 02, 2007 2:25 pm
- Hardware configuration: 2) Q6600 G0 @ 3.2Ghz / ASUS P5B & Dlx / 2gig OCZ Plat. R2 / Vista 32/XP
XFX 8800 GTS 512 - 177.35 drivers
Q6600 G0 @ 3.2Ghz / XFX nForce 780i SLI / 2gig OCZ Plat. R2 / Vista 32 /
2x XFX 9800GTX - 177.35 drivers
Re: Lost Time
Ah, I see there is a reason why it works the way it does.
Thanks for answering my question, I thought it was worth asking.
-
- Posts: 704
- Joined: Tue Dec 04, 2007 6:56 am
- Hardware configuration: Ryzen 7 5700G, 22.40.46 VGA driver; 32GB G-Skill Trident DDR4-3200; Samsung 860EVO 1TB Boot SSD; VelociRaptor 1TB; MSI GTX 1050ti, 551.23 studio driver; BeQuiet FM 550 PSU; Lian Li PC-9F; Win11Pro-64, F@H 8.3.5.
[Suspended] Ryzen 7 3700X, MSI X570MPG, 32GB G-Skill Trident Z DDR4-3600; Corsair MP600 M.2 PCIe Gen4 Boot, Samsung 840EVO-250 SSDs; VelociRaptor 1TB, Raptor 150; MSI GTX 1050ti, 526.98 driver; Kingwin Stryker 500 PSU; Lian Li PC-K7B. Win10Pro-64, F@H 8.3.5. - Location: @Home
- Contact:
Re: Lost Time
7im wrote: Aha, one of those counter-arguments I mentioned... I remember now. There's no point in sending out a new WU if the last WU you completed is actually screwed up because of too much overclocking or whatever. There is actually a small benefit in waiting to see how the previous WU turns out before sending out the next WU.
Should be fairly simple to have the event that triggers the "FINISHED UNIT" message (or any other "cleanup chore" if applicable) in the log also trigger a "download new WU first" switch in the client. After all, after the FINISHED UNIT message is triggered, the client will still download a WU if the old WU is, for any reason, unable to be immediately uploaded...
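A minimal sketch of such a "download new WU first" switch might look like the following. Everything here is hypothetical: `on_finished_unit` and the `download`, `upload`, and `fold` callables are stand-ins, not part of the real client, and the sketch does not address the server-side checks discussed earlier in the thread.

```python
import threading


def on_finished_unit(finished_wu, download, upload, fold, log):
    """Hypothetical handler run when the client logs FINISHED UNIT."""
    next_wu = download()                       # fetch new work first
    uploader = threading.Thread(target=upload, args=(finished_wu,))
    uploader.start()                           # send results in the background
    log.append(f"folding {next_wu}")
    fold(next_wu)                              # keep the processor busy meanwhile
    uploader.join()                            # make sure the upload has finished
    return next_wu


# Usage with stub callables standing in for the real network and core calls.
log = []
result = on_finished_unit(
    "wu-1",
    download=lambda: "wu-2",
    upload=lambda wu: log.append(f"uploaded {wu}"),
    fold=lambda wu: None,
    log=log,
)
print(result, sorted(log))
```

The upload runs on its own thread, so the order in which the two log entries appear is not guaranteed; only the fact that both happen before the handler returns is.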
Ryzen 7 5700G, 22.40.46 VGA driver; MSI GTX 1050ti, 551.23 studio driver
Ryzen 7 3700X; MSI GTX 1050ti, 551.23 studio driver [Suspended]
-
- Posts: 270
- Joined: Sun Dec 02, 2007 2:26 pm
- Hardware configuration: Folders: Intel C2D E6550 @ 3.150 GHz + GPU XFX 9800GTX+ @ 765 MHZ w. WinXP-GPU
AMD A2X64 3800+ @ stock + GPU XFX 9800GTX+ @ 775 MHZ w. WinXP-GPU
Main rig: an old Athlon Barton 2500+ @2.25 GHz & 2* 512 MB RAM Apacer, Radeon 9800Pro, WinXP SP3+ - Location: Belgium, near the International Sea-Port of Antwerp
Re: Lost Time
I think the crux is someplace else.
Since I've re-joined Linux SMP (with the v6 client), I've encountered a few WU finishes (at 100%) where the client then just stops its natural sequence and halts.
It's still running, but not. You need to stop it with CTRL+C, then run the repair sequence: Qfix, delete queue entry, then Qfix again ...
I just did 2 of those repairs on both my SMP rigs!
It's very odd that the sequence for the queue entry delete runs for almost exactly 4 minutes; sounds familiar.
The waiting time after a WU finishes, after the upload of results and before a new WU is downloaded, is ... the same amount of time!
There's a bug somewhere; they just don't deem it important enough to fix it.
PROOF: After you've done the above repair sequence, you just restart the client; it will indicate that results are going to be sent back, and in the seconds following that message, a WU is downloaded and started.
After the upload has finished, you get to see the message confirming the successful upload and the "Thank You ..." message too (as usual).
In this last case, upload and download can happen simultaneously!
Maybe a good pointer for the coders!
- stopped Linux SMP w. HT on i7-860@3.5 GHz
....................................
Folded since 10-06-04 till 09-2010
Re: Lost Time
noorman wrote: There's a bug somewhere; they just don't deem it important enough to fix it.
Oh, come now. It's not a question of importance, it's a question of reproducibility. Every time they test their fix, it's going to work correctly but then when it gets out in the field, it's going to fail 1% of the time (or however often it fails now.) They can't fix a bug that doesn't happen when they test it.
If you can demonstrate a reproducible method to make this happen, they'd be glad to fix it -- and quickly, I suppose.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Lost Time
lost time
Pre-downloading may not be a good idea, but what should happen is that when we hit 100% the client sends the result and downloads the next unit at the same time (or at least downloads the next work unit, then sends the completed one once the download has finished), so that once the new unit has downloaded it can start working on that project. If the project failed and needs sending back, pre-downloading should not be used (basically, use FINISHED_UNIT to trigger the download; on a core error, send only and don't start a new unit, in case it is failing all the time, or Pause is used, or -oneunit).
It should be done for SMP and for some single console projects, as some of them are 25 MB+ in size to send back to the server. The time wasted waiting for the client to send the project back could have been used folding the next project. Ignoring the points you get, if you add up all that lost time idling while waiting for the unit to send (which might fail if it is very big), that's a lot of folding time that could have been used.
Running 2 clients is an easy fix for this problem, but you should not need to do that, and it makes the projects take longer to return to the server.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Lost Time
It's more helpful to the project to get the completed work unit back to them for study before downloading the next WU. Delaying the upload until after the download adds a small delay, but repeated many times over, the delay becomes very large. Even sharing the bandwidth with concurrent uploads/downloads adds a small delay.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.