C++ - CUDA debugging procedure for non-deterministic output
I'm debugging CUDA 4.0/Thrust-based image reconstruction code on an Ubuntu 10.10 64-bit system, and I've been trying to figure out how to debug a run-time error in which the output images appear to contain random "noise." There is no random number generator output in the code, so I expect the output to be consistent between runs, even if it's wrong. However, it's not...
I was wondering if anyone has a general procedure for debugging CUDA runtime errors such as these. I'm not using any shared memory in my CUDA kernels. I've taken pains to avoid race conditions involving global memory, but I may have missed something.
I've tried using GPU Ocelot, but it has problems recognizing some of my CUDA and CUSPARSE function calls.
Also, the code generally works. It's only when I change one setting that I get these non-deterministic results. I've checked all the code associated with that setting, but I can't figure out what I'm doing wrong. If I can distill it down to something I can post here, I might do that, but at this point it's too complicated to post here.
Are you sure all of your kernels have proper blocksize/remainder handling? The one place I have seen non-deterministic results occurred when we had data elements at the end of an array not being processed.
Our kernels were intended for data that was known to be an integer multiple of 256 elements, so we used a blocksize of 256 and did a simple division to get the number of blocks. When the data was changed to a different length, the leftover 255 or fewer elements never got processed. Those spots in the output then held random data.
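To make the arithmetic concrete, here is a rough host-side sketch in plain C++ of what goes wrong. It is not the original code: `kBlockSize`, `blocks_truncated`, `blocks_rounded_up`, and `count_unprocessed` are hypothetical names made up for illustration, and the nested loop only stands in for a kernel launch in which each thread handles one element behind an `if (i < n)` guard.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical constant matching the blocksize of 256 described above.
constexpr std::size_t kBlockSize = 256;

// Simple (truncating) division: drops the last partial block whenever
// n is not an exact multiple of kBlockSize.
constexpr std::size_t blocks_truncated(std::size_t n) {
    return n / kBlockSize;
}

// Ceiling division: adds one extra block to cover any remainder.
constexpr std::size_t blocks_rounded_up(std::size_t n) {
    return (n + kBlockSize - 1) / kBlockSize;
}

// CPU stand-in for a launch of num_blocks blocks of kBlockSize threads.
// Each simulated thread marks one element, guarded by a bounds check.
// Returns how many of the n elements were never touched.
std::size_t count_unprocessed(std::size_t n, std::size_t num_blocks) {
    std::vector<char> written(n, 0);
    for (std::size_t b = 0; b < num_blocks; ++b) {
        for (std::size_t t = 0; t < kBlockSize; ++t) {
            const std::size_t i = b * kBlockSize + t;
            if (i < n) {  // the guard a real kernel needs once blocks are rounded up
                written[i] = 1;
            }
        }
    }
    std::size_t missed = 0;
    for (char w : written) {
        if (!w) ++missed;
    }
    return missed;
}
```

With 1000 elements, `blocks_truncated(1000)` gives 3 blocks, which cover only 768 elements and leave 232 untouched. `blocks_rounded_up(1000)` gives 4 blocks, and the bounds check keeps the extra threads in the last block from running past the end of the array. On a GPU, the untouched elements would keep whatever was already in that memory, which would explain output that differs from run to run.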