Reduce allocations. Don't create intermediate arrays only to consume
them right away. Manually fuse the two steps and decode straight into
the sum instead.
Furthermore, don't invoke a Reader; carve out the locations directly
in a loop instead.
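As a rough illustration of the idea, and not the code in this change,
the fused loop could look like the sketch below. It assumes the data is
a sequence of little-endian 64-bit words; sumWords is a hypothetical
name, and trailing bytes that do not fill a whole word are ignored.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// sumWords is a hypothetical helper. It treats buf as a sequence of
// little-endian uint64 words and adds them up (wrapping on overflow)
// without building an intermediate []uint64 and without a Reader.
// Assumption of this sketch: trailing bytes that do not fill a whole
// 8-byte word are ignored.
func sumWords(buf []byte) uint64 {
	var sum uint64
	for i := 0; i+8 <= len(buf); i += 8 {
		// Decode one word in place and fold it straight into the sum.
		sum += binary.LittleEndian.Uint64(buf[i : i+8])
	}
	return sum
}

func main() {
	buf := make([]byte, 16)
	binary.LittleEndian.PutUint64(buf[0:8], 1)
	binary.LittleEndian.PutUint64(buf[8:16], 2)
	fmt.Println(sumWords(buf))
}
```

Decoding in place avoids one slice allocation per call, and the loop
leaves nothing behind for the garbage collector.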
Taken together, these two changes speed up oshash computations by a
factor of 10 according to the benchmark tests. The main reason for
the speedup is a much lower memory allocation rate, which in turn
reduces GC pressure.
While here, add a benchmark for oshash computations and use it to
measure the performance.
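A benchmark of this kind might be shaped roughly as follows. This is a
sketch under assumptions, not the benchmark added here: sumWords and
benchSumWords are hypothetical names, the 64 KiB buffer size is
arbitrary, and testing.Benchmark is used only so the example runs as a
standalone program. In a real package the function would be named
BenchmarkXxx, live in a _test.go file, and run via go test -bench.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"testing"
)

// sink keeps the result alive so the compiler cannot discard the work.
var sink uint64

// sumWords is a hypothetical stand-in for the code under test: it sums
// little-endian uint64 words directly from the byte slice.
func sumWords(buf []byte) uint64 {
	var sum uint64
	for i := 0; i+8 <= len(buf); i += 8 {
		sum += binary.LittleEndian.Uint64(buf[i : i+8])
	}
	return sum
}

// benchSumWords is a hypothetical benchmark body.
func benchSumWords(b *testing.B) {
	buf := make([]byte, 64*1024) // arbitrary size chosen for this sketch
	b.ReportAllocs()             // record allocations per operation
	b.SetBytes(int64(len(buf)))  // lets the result include throughput
	b.ResetTimer()               // exclude the setup above from the measurement
	for i := 0; i < b.N; i++ {
		sink += sumWords(buf)
	}
}

func main() {
	res := testing.Benchmark(benchSumWords)
	fmt.Println(res.String(), res.MemString())
}
```

The allocs/op figure in the output is the number to watch when
comparing the old and new implementations.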