Hi, I'm working on some algorithms that run on the GPU (using Brook, CUDA, ...). My problem is that my application is still slower than the CPU version I'm comparing it with. Can someone give me some general hints on how I could make my application faster (apart from buying a new GPU ;) )? If you...