Commit 5f141b8c authored by Paal Kvamme

Multi-threading of RoundAndClip.

parent 0957e76c
@@ -1103,8 +1103,18 @@ DataBufferNd<T,NDim>::s_scaleFromFloat(const DataBuffer* in,
dst_type::value_type *dst_ptr = dst->data();
const typename src_type::value_type *src_ptr = src->data();
const typename src_type::value_type *src_end = src_ptr + src->allocsize();
-while (src_ptr < src_end)
-*dst_ptr++ = RoundAndClip<dst_type::value_type>(*src_ptr++ * a + b, defaultstorage);
+const std::int64_t totalsize = src->allocsize();
+// TODO-Performance: There is a risk of the OpenMP overhead being larger
+// than the speedup gained by multiple threads. I ran tests only in one
+// environment. It seemed safe by a wide margin if there was at least one
+// standard-sized brick (256 KB to 1 MB) being processed. I am still
+// worrying, because technically there is no upper limit to how much
+// overhead OpenMP might add, while doing serial processing has a fixed
+// and not that dramatic cost.
+#pragma omp parallel for if(totalsize >= 256*1024)
+for (std::int64_t ii=0; ii<totalsize; ++ii) {
+dst_ptr[ii] = RoundAndClip<dst_type::value_type>(src_ptr[ii] * a + b, defaultstorage);
+}
return dst;
}
}
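
For readers who want to try the threshold idea outside the library, here is a minimal, self-contained sketch of the same pattern: a scale-and-convert loop that goes parallel only when the element count reaches a cutoff. The names `round_and_clip` and `scale_from_float` are hypothetical stand-ins, not the library's `RoundAndClip` or `s_scaleFromFloat`; in particular, mapping NaN to zero is an assumption made only to keep the example well defined, and the sketch is meant for integer target types.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for RoundAndClip: round to nearest and clamp to the
// range of the integer type T. NaN maps to 0 here (an assumption, not the
// library's documented behavior).
template <typename T>
T round_and_clip(double value)
{
  if (std::isnan(value))
    return T(0);
  const double lo = static_cast<double>(std::numeric_limits<T>::lowest());
  const double hi = static_cast<double>(std::numeric_limits<T>::max());
  if (value <= lo)
    return std::numeric_limits<T>::lowest();
  if (value >= hi)
    return std::numeric_limits<T>::max();
  return static_cast<T>(std::round(value));
}

// Hypothetical helper: computes dst[i] = round_and_clip(src[i] * a + b).
// The if() clause keeps the loop serial for small inputs, where thread
// startup may cost more than the work itself. If the compiler is not run
// with OpenMP enabled, the pragma is ignored and the loop is always serial.
template <typename T>
std::vector<T> scale_from_float(const std::vector<float>& src, double a, double b)
{
  std::vector<T> dst(src.size());
  const std::int64_t totalsize = static_cast<std::int64_t>(src.size());
  const float* src_ptr = src.data();
  T* dst_ptr = dst.data();
#pragma omp parallel for if(totalsize >= 256*1024)
  for (std::int64_t ii = 0; ii < totalsize; ++ii) {
    dst_ptr[ii] = round_and_clip<T>(src_ptr[ii] * a + b);
  }
  return dst;
}
```

The loop is indexed rather than pointer-incremented because each iteration must be independent for `parallel for` to split the range across threads. Note that the cutoff counts elements, not bytes, so the right value may differ between environments and sample types.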