Fix #429 -- add branch-free paths for simple cases like CuArray{Float32}
#430
+54 −15