Timewarp - Projects - Compiler Optimization Transforms

Compiler optimizations for improving performance of Harris Corner Detection Algorithm on multicore/SIMD CPUs

Compiler Optimizations
SIMD Vectorization
AVX / SSE2
Parallelization / OpenMP
Image/Video processing algo
Course Project

The objective of this assignment is to optimize and tune the Harris corner detection algorithm for performance using locality, SIMD, and multicore parallelism transformations.
Using suitable compiler flags, transforms, and optimizations, we obtain a speedup of 11.5x over the unparallelized reference implementation and 13.5x over OpenCV with the GCC 4.9.2 compiler, and 11.3x over the unparallelized reference implementation and 14.6x over OpenCV with the ICC 15.0 compiler. All experiments were performed on an Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz [Haswell μarch, 4 cores, 64 KB L1 private / 256 KB L2 private / 8 MB L3 shared cache].
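
To make the three kinds of transformation concrete, here is a minimal C sketch of a Harris response kernel. It is not the project's code. The function name `harris_response`, the central-difference gradients, the 3x3 box window, and the decision to recompute gradients inside the window instead of storing intermediate gradient images (one possible locality transform) are all assumptions chosen for brevity. The OpenMP pragmas distribute rows across cores and ask the compiler to vectorize along each row.

```c
#include <assert.h>
#include <stddef.h>

/* Central-difference gradients at (y, x).
 * Caller guarantees 1 <= y < h-1 and 1 <= x < w-1. */
static inline float grad_x(const float *img, int w, int y, int x) {
    return 0.5f * (img[y * w + x + 1] - img[y * w + x - 1]);
}

static inline float grad_y(const float *img, int w, int y, int x) {
    return 0.5f * (img[(y + 1) * w + x] - img[(y - 1) * w + x]);
}

/* Hypothetical fused Harris response (illustrative, not the author's kernel).
 * img and resp are row-major w*h float buffers; k is the Harris constant.
 * A 2-pixel border of resp is left at zero. */
void harris_response(const float *img, float *resp, int w, int h, float k) {
    assert(img != NULL && resp != NULL && w >= 5 && h >= 5);

    for (size_t i = 0; i < (size_t)w * (size_t)h; ++i)
        resp[i] = 0.0f;

    /* Multicore: rows are independent, so split them across threads. */
    #pragma omp parallel for schedule(static)
    for (int y = 2; y < h - 2; ++y) {
        /* SIMD: each x iteration is independent of the others. */
        #pragma omp simd
        for (int x = 2; x < w - 2; ++x) {
            float sxx = 0.0f, syy = 0.0f, sxy = 0.0f;
            /* Locality: gradients are recomputed in the 3x3 window,
             * so no full-size Ix/Iy/Ixx/Iyy/Ixy images are written. */
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    float gx = grad_x(img, w, y + dy, x + dx);
                    float gy = grad_y(img, w, y + dy, x + dx);
                    sxx += gx * gx;
                    syy += gy * gy;
                    sxy += gx * gy;
                }
            }
            float det = sxx * syy - sxy * sxy;
            float trace = sxx + syy;
            resp[y * w + x] = det - k * trace * trace;
        }
    }
}
```

A call such as `harris_response(img, resp, w, h, 0.04f)` gives positive values near corners and negative values along straight edges. With GCC, `-std=c99 -O3 -fopenmp` enables the pragmas; without `-fopenmp` they are ignored and the same code runs serially.
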
Highlighted results are below. A detailed performance comparison of ICC vs. GCC, auto-vectorization, parallel scaling, etc. is available in the full report.

◉ Speedup and Execution time (in ms) by vectorization and locality transforms

	OpenCV	Reference	Optimized	Speedup by locality transforms
No Vectorize	3515.29	3767.32	2442.4	1.54x
Vectorize	3566.35	3035.41	930.90	3.26x
Vectorization Speedup	-	1.24x	2.62x	-
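
Each speedup entry in this table is a ratio of two execution times, baseline over optimized. The small C helper below (the name `speedup` is an assumption for illustration) reproduces the entries. It also shows that the gain from the unvectorized reference to the vectorized optimized build, 3767.32 / 930.90, is about 4.05x, which matches the product of the reference's vectorization speedup (1.24x) and the locality speedup of the vectorized build (3.26x).

```c
#include <assert.h>

/* Speedup of an optimized run relative to a baseline run.
 * Both arguments are execution times in milliseconds. */
double speedup(double baseline_ms, double optimized_ms) {
    assert(baseline_ms > 0.0 && optimized_ms > 0.0);
    return baseline_ms / optimized_ms;
}
```

For example, `speedup(3767.32, 2442.4)` gives the 1.54x locality entry and `speedup(2442.4, 930.90)` gives the 2.62x vectorization entry for the optimized build.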

◉ Speedup and Execution time (in ms) using ICC 15.0

	OpenCV	Reference	Optimized	Speedup w.r.t Reference
1 core	3567.95	2755.83	904.61	3.04x
2 core	-	1617.88	355.724	4.54x
4 core	-	1444.89	243.19	5.94x
Speedup by Parallelism	-	1.90x	3.72x	-

◉ Speedup and Execution time (in ms) using GCC 4.9.2

	OpenCV	Reference	Optimized	Speedup w.r.t Reference
1 core	3566.35	3035.41	930.90	3.26x
2 core	-	1990.6	422.54	4.71x
4 core	-	1940.92	264.73	7.34x
Speedup by Parallelism	-	1.56x	3.52x	-
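
One way to read the parallel-scaling rows is as parallel efficiency: the 1-core to N-core speedup divided by N. The C sketch below uses a hypothetical helper name, `parallel_efficiency`, and takes its inputs from the two tables above. By this measure the optimized build reaches roughly 0.93 on 4 cores with ICC 15.0 and roughly 0.88 with GCC 4.9.2, while the reference implementation stays below 0.5 with both compilers.

```c
#include <assert.h>

/* Parallel efficiency: (single-core time / N-core time) / N.
 * Times are in milliseconds; 1.0 would be perfect linear scaling. */
double parallel_efficiency(double t1_ms, double tn_ms, int cores) {
    assert(t1_ms > 0.0 && tn_ms > 0.0 && cores > 0);
    return (t1_ms / tn_ms) / (double)cores;
}
```

For example, `parallel_efficiency(904.61, 243.19, 4)` corresponds to the ICC optimized build and `parallel_efficiency(930.90, 264.73, 4)` to the GCC optimized build.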


10th Oct 2014
