| author | Even Rouault <even.rouault@spatialys.com> | 2020-03-09 22:33:40 +0100 |
|---|---|---|
| committer | Even Rouault <even.rouault@spatialys.com> | 2020-03-09 22:51:55 +0100 |
| commit | d9b1db0ff166aceb8c74517bcbe056678a048554 (patch) | |
| tree | 87528369f1216d0ed5f2019d1295ff206a664859 /docs/source/development/reference/cpp | |
| parent | 13782b19974ed8c99a8176692e688512c96e8481 (diff) | |
Approximate tmerc (Snyder): speed optimizations
fwd: 7% faster on Core-i7@2.6GHz (with FMA triggered), 22% faster on GCE Xeon@2GHz (with FMA)
inv: 31% faster on Core-i7@2.6GHz (with FMA triggered), 60% faster on GCE Xeon@2GHz (with FMA)
The optimizations consist of several things:
- optionally use the FMA (Fused Multiply-Add) instruction set with gcc >= 6.
Binaries are generated with the standard instruction set (SSE/SSE2),
and with one variant with FMA, and the appropriate version is selected automatically
at runtime. This gives a modest but measurable speedup. The speedup is more
noticeable on lower-clocked CPUs.
- inline mlfn and inv_mlfn
- for inv_mlfn, avoid recomputing sin()/cos() at each iteration stage,
by observing that the argument changes only slightly at each iteration,
and using approximations of sin()/cos(). The differences due to that approximation
are well below the 1e-11 tolerance threshold.
Differences in results are negligible (only found in areas where the approximations
of the Snyder formulas are already no longer valid)
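The sin()/cos() shortcut in the last item can be illustrated with a minimal sketch. It is not the code of this commit: the `AngleState` struct, the `stepBack` helper and the `kSmallStep` threshold are hypothetical names and values chosen for illustration. The idea is that when an iteration changes the angle by a small amount t, sin(phi - t) and cos(phi - t) follow from the angle-subtraction identities, with sin(t) and cos(t) replaced by short Taylor series.

```cpp
#include <cmath>

// Hypothetical helper type: an angle together with its cached sine/cosine.
struct AngleState {
    double phi;
    double s;  // sin(phi)
    double c;  // cos(phi)
};

// Assumed threshold below which the truncated series are used.
constexpr double kSmallStep = 1e-3;

inline AngleState makeAngleState(double phi) {
    return {phi, std::sin(phi), std::cos(phi)};
}

// Apply phi -= t and refresh the cached sine/cosine.
// For small |t| no call to sin()/cos() is made.
inline void stepBack(AngleState &a, double t) {
    a.phi -= t;
    if (std::fabs(t) < kSmallStep) {
        const double t2 = t * t;
        const double cos_t = 1.0 - 0.5 * t2;        // cos(t) ~ 1 - t^2/2
        const double sin_t = t * (1.0 - t2 / 6.0);  // sin(t) ~ t - t^3/6
        // sin(phi - t) = sin(phi) cos(t) - cos(phi) sin(t)
        const double s_new = a.s * cos_t - a.c * sin_t;
        // cos(phi - t) = cos(phi) cos(t) + sin(phi) sin(t)
        a.c = a.c * cos_t + a.s * sin_t;
        a.s = s_new;
    } else {
        // Step too large for the series: recompute exactly.
        a.s = std::sin(a.phi);
        a.c = std::cos(a.phi);
    }
}
```

With |t| < 1e-3, the first neglected terms are t^4/24 for cos(t) and t^5/120 for sin(t), i.e. below about 5e-14, which is consistent with staying under a 1e-11 tolerance.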
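For the FMA item in the list above, one mechanism GCC >= 6 offers for building a default and an FMA variant of the same function, with automatic selection at runtime, is the `target_clones` function attribute. The sketch below assumes that mechanism; the macro name `SKETCH_TARGET_CLONES_FMA` and the function `hornerSketch` are hypothetical and only illustrate the pattern on a multiply-add heavy loop.

```cpp
#include <cmath>

// Enable function multiversioning only where GCC is known to support it:
// gcc >= 6 (not clang), x86-64, glibc on Linux (ifunc support is required).
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 6 && \
    defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__)
#define SKETCH_TARGET_CLONES_FMA \
    __attribute__((target_clones("fma", "default")))
#else
#define SKETCH_TARGET_CLONES_FMA
#endif

// Hypothetical kernel: evaluates c[0] + c[1]*x + ... + c[n-1]*x^(n-1)
// with Horner's scheme. Each step is a multiply followed by an add,
// which the "fma" clone may compile to a single fused instruction.
SKETCH_TARGET_CLONES_FMA
double hornerSketch(const double *c, int n, double x) {
    double acc = 0.0;
    for (int i = n - 1; i >= 0; --i) {
        acc = acc * x + c[i];
    }
    return acc;
}
```

Callers simply call `hornerSketch`; where the attribute is active, the dynamic loader picks the clone matching the CPU. Because a fused multiply-add rounds once instead of twice, the two clones can differ in the last bits of the result.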
Before:
$ echo 8e5 9e6 | src/proj -d 12 +proj=utm +zone=31 -I +approx | src/proj -d 12 +proj=utm +zone=31 +approx
799997.896522093331 8999999.520601103082
$ echo 8e5 5e6 | src/proj -d 12 +proj=utm +zone=31 -I +approx | src/proj -d 12 +proj=utm +zone=31 +approx
800000.000007762224 4999999.999971268699
$ echo 18e5 9e6 | src/proj -d 12 +proj=utm +zone=31 -I +approx | src/proj -d 12 +proj=utm +zone=31 +approx
1079182.990696100984 8661150.574729491025
$ echo 18e5 5e6 | src/proj -d 12 +proj=utm +zone=31 -I +approx | src/proj -d 12 +proj=utm +zone=31 +approx
1799997.510861013783 4999999.567328464240
After:
$ echo 8e5 9e6 | src/proj -d 12 +proj=utm +zone=31 -I +approx | src/proj -d 12 +proj=utm +zone=31 +approx
799997.896522093331 8999999.520601103082
$ echo 8e5 5e6 | src/proj -d 12 +proj=utm +zone=31 -I +approx | src/proj -d 12 +proj=utm +zone=31 +approx
800000.000007762224 4999999.999971268699
$ echo 18e5 9e6 | src/proj -d 12 +proj=utm +zone=31 -I +approx | src/proj -d 12 +proj=utm +zone=31 +approx
1079182.990696124267 8661150.574729502201
$ echo 18e5 5e6 | src/proj -d 12 +proj=utm +zone=31 -I +approx | src/proj -d 12 +proj=utm +zone=31 +approx
1799997.510861013783 4999999.567328464240
Diffstat (limited to 'docs/source/development/reference/cpp')
0 files changed, 0 insertions, 0 deletions
