Does OpenCL always zero-initialize device memory?
I've noticed that, often, global and constant device memory is initialized to 0. Is this a universal rule? I wasn't able to find anything about it in the standard.
No, it doesn't. For instance, I had a small kernel to test atomic add:
    kernel void atomicadd(volatile global int *result) {
        atomic_add(&result[0], 1);
    }
The calling host code (PyOpenCL + unittest):
    def test_atomic_add(self):
        ndrange = (4, 4)  # 16 work items, each adds 1
        result = np.zeros(1, dtype=np.int32)
        # Note: the buffer is allocated here but never initialized.
        out_buf = cl.Buffer(self.ctx, self.mf.WRITE_ONLY, size=result.nbytes)
        self.prog.atomicadd(self.queue, ndrange, ndrange, out_buf)
        cl.enqueue_copy(self.queue, result, out_buf).wait()
        self.assertEqual(result, 16)
This was returning the correct value when using the CPU. On an ATI HD 5450, however, the returned value was junk.
And if I recall correctly, on NVIDIA the first run returned the correct value, i.e. 16, but on the following runs the values were 32, 48, etc. It was reusing the same location, with the old value still stored there.
When I corrected the host code with this line (copying the 0 value into the buffer):
    out_buf = cl.Buffer(self.ctx, self.mf.WRITE_ONLY | self.mf.COPY_HOST_PTR, hostbuf=result)
everything worked fine on all devices.
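To make the 16, 32, 48 pattern easier to see, here is a rough pure-Python toy model. It is only a sketch under assumptions: `ToyDevice`, `alloc` and `run_atomic_add` are hypothetical names made up for illustration and are not part of OpenCL or PyOpenCL, and a real driver's allocator may behave quite differently. The point is just that an allocation which hands back the same memory without clearing it lets the old value leak into the next run, while supplying an initial value (as `COPY_HOST_PTR` does) avoids that.

```python
class ToyDevice:
    """Hypothetical stand-in for a device allocator that never clears memory."""

    def __init__(self):
        # One int of "device memory"; it happens to be zero only at start-up.
        self.memory = [0]

    def alloc(self, init=None):
        # Hand back the same slot every time. Overwrite it only when the
        # caller supplies an initial value (roughly what COPY_HOST_PTR does).
        if init is not None:
            self.memory[0] = init
        return self.memory


def run_atomic_add(buf, work_items=16):
    # Each of the 16 work items adds 1, as in the atomicadd kernel.
    for _ in range(work_items):
        buf[0] += 1
    return buf[0]


device = ToyDevice()

# Three runs without initialization: the old value is still in the slot.
uninitialized = [run_atomic_add(device.alloc()) for _ in range(3)]

# Three runs with an explicit zero copied in first.
initialized = [run_atomic_add(device.alloc(init=0)) for _ in range(3)]

print(uninitialized)  # [16, 32, 48]
print(initialized)    # [16, 16, 16]
```

The model says nothing about why the CPU returned 16 or the ATI card returned junk; it only mirrors the reuse effect I saw on NVIDIA.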