-rw-r--r-- docs/source/community/rfc/index.rst | 1
-rw-r--r-- docs/source/community/rfc/rfc-4.rst | 1406
2 files changed, 1407 insertions, 0 deletions
diff --git a/docs/source/community/rfc/index.rst b/docs/source/community/rfc/index.rst
index 3dd52a8d..a41dfe7d 100644
--- a/docs/source/community/rfc/index.rst
+++ b/docs/source/community/rfc/index.rst
@@ -14,3 +14,4 @@ the project.
rfc-1
rfc-2
rfc-3
+ rfc-4
diff --git a/docs/source/community/rfc/rfc-4.rst b/docs/source/community/rfc/rfc-4.rst
new file mode 100644
index 00000000..6cd9a82f
--- /dev/null
+++ b/docs/source/community/rfc/rfc-4.rst
@@ -0,0 +1,1406 @@
+.. _rfc4:
+
+====================================================================
+PROJ RFC 4: Remote access to grids and GeoTIFF grids
+====================================================================
+
+:Author: Even Rouault, Howard Butler
+:Contact: even.rouault@spatialys.com, howard@hobu.co
+:Status: Adopted
+:Implementation target: PROJ 7
+:Last Updated: 2020-01-10
+
+Motivation
+-------------------------------------------------------------------------------
+
+PROJ 6 brings undeniable advances in the management of coordinate
+transformations between datums by relying on and applying information available in
+the PROJ database. PROJ's rapid evolution from a cartographic projections
+library with a little bit of geodetic capability to a full geodetic
+transformation and description environment has highlighted the importance of
+the support data. Users desire the convenience of software doing the right
+thing with the least amount of fuss, and survey organizations wish to deliver
+their models across as wide a software footprint as possible. To get results
+with the highest precision, a grid file that defines a model that provides
+dimension shifts is often needed. The proj-datumgrid project centralizes grids
+available under an open data license and bundles them in different archives
+split along major geographical regions of the world.
+
+It is assumed that a PROJ user has downloaded and installed grid files that are
+referred to in the PROJ database. These files can be quite large in aggregate,
+and packaging support by major distribution channels is somewhat uneven due to
+their size, sometimes ambiguous licensing story, and difficult-to-track
+versioning and lineage. It is not always clear to the user, especially to
+those who may not be so familiar with geodetic operations, that the highest
+precision transformation may not always be applied if grid data is not
+available. Users want both convenience and correctness, and management of the
+shift files can be challenging to those who may not be aware of their
+importance to the process.
+
+The computing environment in which PROJ operates is also changing. Because the
+shift data can be so large (currently more than 700 MB of uncompressed data,
+and growing), deployment of high accuracy operations can be limited due to
+deployment size constraints (serverless operations, for example). Changing to a
+delivery format that supports incremental access over a network along with
+convenient access and compression will ease the resource burden the shift files
+present while allowing the project to deliver transformation capability with
+the highest known precision provided by the survey organizations.
+
+Adjustment grids also tend to be provided in many different formats depending
+on the organization and country that produced them. In PROJ, we have over time
+"standardized" on using horizontal shift grids as NTv2 and vertical shift grids
+using GTX. Both have poor general support as dedicated formats, limited
+metadata capabilities, and neither is necessarily "cloud optimized" for
+incremental access across a network.
+
+Summary of work planned by this RFC
+-------------------------------------------------------------------------------
+
+- Grids will be hosted by one or several Content Delivery Networks (CDN)
+- Grid loading mechanism will be reworked to be able to download grids or parts
+ of grids from an online repository. When opted in, users will no longer have to
+ manually fetch grid files and place them in PROJ_LIB.
+ Full and accurate capability of the software will no longer require hundreds
+ of megabytes of grid shift files in advance, even if only just a few of them
+ are needed for the transformations done by the user.
+- Local caching of grid files, or even part of files, so that users end up
+ mirroring what they actually use.
+- A grid shift format, for both horizontal and vertical shift grids (and in
+ potential future steps, for other needs, such as deformation models) will be
+ implemented.
+
+Locally available grids will of course remain usable, and using them will
+be the default behaviour.
+
+Network access to grids
+-------------------------------------------------------------------------------
+
+curl will be an optional build dependency of PROJ, added in autoconf and cmake
+build systems. It can be disabled at build time, but this must be
+an explicit setting of configure/cmake as the resulting builds have less functionality.
+When curl is enabled at build time, download of grids themselves will not be
+enabled by default at runtime. It will require explicit consent of the user, either
+through the API
+(:c:func:`proj_context_set_enable_network`), through the PROJ_NETWORK=ON
+environment variable, or the ``network = on`` setting of proj.ini.
+
+Regarding the minimum version of libcurl required, given GDAL experience that
+can build with rather ancient libcurl for similar functionality, we can aim for
+libcurl >= 7.29.0 (as being available in RHEL 7).
+
+An alternate pluggable network interface can also be set by the user in case
+support for libcurl was not built in, or if for the desired context of use, the
+user wishes to provide the network implementation (a typical use case could be
+QGIS that would use its Qt-based networking facilities to solve issues with
+SSL, proxy, authentication, etc.)
+
+A text configuration file, installed in ${installation_prefix}/share/proj/proj.ini
+(or ${PROJ_LIB}/proj.ini)
+will contain the URL of the CDN that will be used.
+The user may also override this setting with the
+:c:func:`proj_context_set_url_endpoint` or through the PROJ_NETWORK_ENDPOINT
+environment variable.
+
+The rationale for putting proj.ini in that location is
+that it is a well-known place by PROJ users, with the existing PROJ_LIB mechanics
+for systems like Windows where hardcoded paths at runtime aren't generally usable.
+
+C API
++++++
+
+The preliminary C API for the above is:
+
+.. code-block:: c
+
+ /** Enable or disable network access.
+ *
+ * @param ctx PROJ context, or NULL
+ * @param enable TRUE to enable network access, FALSE to disable it.
+ * @return TRUE if network access is possible. That is either libcurl is
+ * available, or an alternate interface has been set.
+ */
+ int proj_context_set_enable_network(PJ_CONTEXT* ctx, int enable);
+
+ /** Define URL endpoint to query for remote grids.
+ *
+ * This overrides the default endpoint set in the PROJ configuration file or with
+ * the PROJ_NETWORK_ENDPOINT environment variable.
+ *
+ * @param ctx PROJ context, or NULL
+ * @param url Endpoint URL. Must NOT be NULL.
+ */
+ void proj_context_set_url_endpoint(PJ_CONTEXT* ctx, const char* url);
+
+ /** Opaque structure for PROJ. Implementations might cast it to their
+ * structure/class of choice. */
+ typedef struct PROJ_NETWORK_HANDLE PROJ_NETWORK_HANDLE;
+
+ /** Network access: open callback
+ *
+ * Should try to read the size_to_read first bytes at the specified offset of
+ * the file given by URL url,
+ * and write them to buffer. *out_size_read should be updated with the actual
+ * amount of bytes read (== size_to_read if the file is larger than size_to_read).
+ * During this read, the implementation should make sure to store the HTTP
+ * headers from the server response to be able to respond to
+ * proj_network_get_header_value_cbk_type callback.
+ *
+ * error_string_max_size should be the maximum size that can be written into
+ * the out_error_string buffer (including terminating nul character).
+ *
+ * @return a non-NULL opaque handle in case of success.
+ */
+ typedef PROJ_NETWORK_HANDLE* (*proj_network_open_cbk_type)(
+ PJ_CONTEXT* ctx,
+ const char* url,
+ unsigned long long offset,
+ size_t size_to_read,
+ void* buffer,
+ size_t* out_size_read,
+ size_t error_string_max_size,
+ char* out_error_string,
+ void* user_data);
+
+ /** Network access: close callback */
+ typedef void (*proj_network_close_cbk_type)(PJ_CONTEXT* ctx,
+ PROJ_NETWORK_HANDLE* handle,
+ void* user_data);
+
+ /** Network access: get HTTP headers */
+ typedef const char* (*proj_network_get_header_value_cbk_type)(
+ PJ_CONTEXT* ctx,
+ PROJ_NETWORK_HANDLE* handle,
+ const char* header_name,
+ void* user_data);
+
+ /** Network access: read range
+ *
+ * Read size_to_read bytes from handle, starting at offset, into
+ * buffer.
+ * During this read, the implementation should make sure to store the HTTP
+ * headers from the server response to be able to respond to
+ * proj_network_get_header_value_cbk_type callback.
+ *
+ * error_string_max_size should be the maximum size that can be written into
+ * the out_error_string buffer (including terminating nul character).
+ *
+ * @return the number of bytes actually read (0 in case of error)
+ */
+ typedef size_t (*proj_network_read_range_type)(
+ PJ_CONTEXT* ctx,
+ PROJ_NETWORK_HANDLE* handle,
+ unsigned long long offset,
+ size_t size_to_read,
+ void* buffer,
+ size_t error_string_max_size,
+ char* out_error_string,
+ void* user_data);
+
+ /** Define a custom set of callbacks for network access.
+ *
+ * All callbacks should be provided (non NULL pointers).
+ *
+ * @param ctx PROJ context, or NULL
+ * @param open_cbk Callback to open a remote file given its URL
+ * @param close_cbk Callback to close a remote file.
+ * @param get_header_value_cbk Callback to get HTTP headers
+ * @param read_range_cbk Callback to read a range of bytes inside a remote file.
+ * @param user_data Arbitrary pointer provided by the user, and passed to the
+ * above callbacks. May be NULL.
+ * @return TRUE in case of success.
+ */
+ int proj_context_set_network_callbacks(
+ PJ_CONTEXT* ctx,
+ proj_network_open_cbk_type open_cbk,
+ proj_network_close_cbk_type close_cbk,
+ proj_network_get_header_value_cbk_type get_header_value_cbk,
+ proj_network_read_range_type read_range_cbk,
+ void* user_data);
+
+
+To make network access efficient, PROJ will internally have an in-memory cache
+of file ranges, so as to issue network requests only in chunks of 16 KB or multiples thereof,
+to limit the number of HTTP GET requests and minimize latency caused by network
+access. This is very similar to the behaviour of the GDAL
+`/vsicurl/ <https://gdal.org/user/virtual_file_systems.html#vsicurl-http-https-ftp-files-random-access>`_
+I/O layer. The plan is to mostly copy GDAL's vsicurl implementation inside PROJ, with
+needed adjustments and proper namespacing of it.
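+
+As a rough illustration of the chunking described above (this is not the actual
+PROJ or GDAL code), the following C sketch expands an arbitrary byte range to
+16 KB boundaries. The ``CHUNK_SIZE`` constant and the ``align_range_to_chunks``
+name are assumptions made for this example only.
+

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: the 16 KB chunk size mentioned in the text. */
#define CHUNK_SIZE 16384ULL

/* Expand the byte range [offset, offset + size) so that it starts and ends
 * on chunk boundaries. The aligned length is always a multiple of
 * CHUNK_SIZE. Hypothetical helper, not part of the PROJ API. */
void align_range_to_chunks(unsigned long long offset, size_t size,
                           unsigned long long *aligned_offset,
                           unsigned long long *aligned_size)
{
    assert(size > 0 && aligned_offset != NULL && aligned_size != NULL);

    /* Round the start down to the enclosing chunk boundary. */
    unsigned long long start = (offset / CHUNK_SIZE) * CHUNK_SIZE;

    /* Round the (exclusive) end up to the next chunk boundary. */
    unsigned long long end = offset + size;
    unsigned long long end_aligned =
        ((end + CHUNK_SIZE - 1) / CHUNK_SIZE) * CHUNK_SIZE;

    *aligned_offset = start;
    *aligned_size = end_aligned - start;
}
```

+
+With such a helper, a request for 100 bytes at offset 20000 would be served by
+fetching the single chunk starting at offset 16384.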
+
+A retry strategy (typically a delay with an exponential back-off and some random
+jitter) will be added to account for intermittent network or server-side failure.
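+
+A minimal sketch of such a delay computation is given below. The function name,
+the doubling factor, the cap and the jitter range are all illustrative
+assumptions; the RFC does not specify them.
+

```c
#include <assert.h>
#include <stdlib.h>

/* Delay, in milliseconds, to wait before retry number `attempt` (0-based).
 * The base delay doubles at each attempt, is capped at max_ms, and a random
 * jitter in [0, delay / 2] is added on top.
 * Hypothetical helper; all parameters are assumptions for illustration. */
unsigned retry_delay_ms(unsigned attempt, unsigned base_ms, unsigned max_ms)
{
    assert(base_ms > 0 && max_ms >= base_ms);

    unsigned delay = base_ms;
    for (unsigned i = 0; i < attempt && delay < max_ms; ++i) {
        /* Double the delay without overflowing past the cap. */
        delay = (delay > max_ms / 2) ? max_ms : delay * 2;
    }

    /* rand() is good enough for a sketch; the jitter spreads out retries
     * issued by many clients at the same time. */
    unsigned jitter = (unsigned)rand() % (delay / 2 + 1);
    return delay + jitter;
}
```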
+
+URL building
+++++++++++++
+
+The PROJ database has a ``grid_transformation`` table whose column ``grid_name``
+(and possibly ``grid2_name``) contains the name of the grid as indicated by the
+authority having registered the transformation (typically EPSG). As those
+grid names are not generally directly usable by PROJ, the PROJ database also
+has a ``grid_alternatives`` table that links original grid names to the ones used
+by PROJ. When network access is available and needed due to lack of a
+local grid, the full URL will be the
+endpoint from the configuration or set by the user, the basename of the PROJ
+usable filename, and the "tif" suffix. So if the CDN is at http://example.com
+and the name from ``grid_alternatives`` is egm96_15.gtx, then the URL will
+be http://example.com/egm96_15.tif
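+
+The rule above can be sketched in C as follows. ``build_grid_url`` is a
+hypothetical helper written for this illustration, not a PROJ function; it
+assumes the grid name has at most one meaningful extension and strips
+trailing slashes from the endpoint.
+

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<endpoint>/<basename>.tif" into out.
 * Returns 0 on success, -1 if the output buffer is too small.
 * Hypothetical helper illustrating the URL rule described in the text. */
int build_grid_url(const char *endpoint, const char *grid_filename,
                   char *out, size_t out_size)
{
    assert(endpoint != NULL && grid_filename != NULL && out != NULL);

    /* Basename: everything before the last '.', or the whole name if
     * there is no extension. */
    const char *dot = strrchr(grid_filename, '.');
    size_t base_len = dot ? (size_t)(dot - grid_filename)
                          : strlen(grid_filename);

    /* Ignore trailing slashes of the endpoint to avoid "//" in the URL. */
    size_t ep_len = strlen(endpoint);
    while (ep_len > 0 && endpoint[ep_len - 1] == '/')
        --ep_len;

    int n = snprintf(out, out_size, "%.*s/%.*s.tif",
                     (int)ep_len, endpoint, (int)base_len, grid_filename);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```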
+
+Grid loading
+++++++++++++
+
+The following files will be affected, in one way or another, by the changes described
+above:
+nad_cvt.cpp, nad_intr.cpp, nad_init.cpp, grid_info.cpp, grid_list.cpp, apply_gridshift.cpp,
+apply_vgridshift.cpp.
+
+In particular the current logic that consists of ingesting all the values of a
+grid/subgrid in the ct->cvs array will be completely modified, to enable
+access to grid values at a specified (x,y) location.
+
+proj_create_crs_to_crs() / proj_create_operations() impacts
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+Once network access is available, all grids known to the PROJ database
+(grid_transformation + grid_alternatives tables) will be assumed to be available,
+when computing the potential pipelines between two CRS.
+
+Concretely, this will be equivalent to calling
+:cpp:func:`proj_operation_factory_context_set_grid_availability_use`
+with the ``use`` argument set to a new enumeration value:
+
+.. code-block:: c
+
+ /** Results will be presented as if grids known to PROJ (that is
+ * registered in the grid_alternatives table of its database) were
+ * available. Used typically when networking is enabled.
+ */
+ PROJ_GRID_AVAILABILITY_KNOWN_AVAILABLE
+
+
+Local on-disk caching of remote grids
++++++++++++++++++++++++++++++++++++++
+
+As many workflows will tend to use the same grids over and over, a local
+on-disk caching of remote grids will be added. The cache will be a single
+SQLite3 database, in a user-writable directory shared by all applications using
+PROJ.
+
+Its total size will be configurable, with a default maximum size of 100 MB
+in proj.ini. The cache will also keep the timestamp of the last time it checked
+various global properties of the file (its size, Last-Modified and ETag headers).
+A time-to-live parameter, with a default of 1 day in proj.ini, will be used to
+determine whether the CDN should be hit to verify if the information in the
+cache is still up-to-date.
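+
+The time-to-live decision could look like the following sketch. The function
+name is hypothetical, and the exact comparison (strictly greater than the TTL)
+is an assumption; a negative TTL is treated as "no expiration", matching the
+``proj_grid_cache_set_ttl`` description below.
+

```c
#include <time.h>

/* Return 1 if the cached properties of a file (size, Last-Modified, ETag)
 * should be re-checked against the server, 0 otherwise.
 * Hypothetical helper, not part of the PROJ API. */
int cache_entry_needs_recheck(time_t last_checked, time_t now, int ttl_seconds)
{
    /* A negative TTL means the cached properties never expire. */
    if (ttl_seconds < 0)
        return 0;

    /* If the clock went backwards, be conservative and re-check. */
    if (difftime(now, last_checked) < 0.0)
        return 1;

    return difftime(now, last_checked) > (double)ttl_seconds;
}
```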
+
+.. code-block:: c
+
+ /** Enable or disable the local cache of grid chunks
+ *
+ * This overrides the setting in the PROJ configuration file.
+ *
+ * @param ctx PROJ context, or NULL
+ * @param enabled TRUE if the cache is enabled.
+ */
+ void proj_grid_cache_set_enable(PJ_CONTEXT *ctx, int enabled);
+
+ /** Override, for the considered context, the path and file of the local
+ * cache of grid chunks.
+ *
+ * @param ctx PROJ context, or NULL
+ * @param fullname Full name to the cache (encoded in UTF-8). If set to NULL,
+ * caching will be disabled.
+ */
+ void proj_grid_cache_set_filename(PJ_CONTEXT* ctx, const char* fullname);
+
+ /** Override, for the considered context, the maximum size of the local
+ * cache of grid chunks.
+ *
+ * @param ctx PROJ context, or NULL
+ * @param max_size_MB Maximum size, in mega-bytes (1024*1024 bytes), or
+ * negative value to set unlimited size.
+ */
+ void proj_grid_cache_set_max_size(PJ_CONTEXT* ctx, int max_size_MB);
+
+ /** Override, for the considered context, the time-to-live delay for
+ * re-checking if the cached properties of files are still up-to-date.
+ *
+ * @param ctx PROJ context, or NULL
+ * @param ttl_seconds Delay in seconds. Use negative value for no expiration.
+ */
+ void proj_grid_cache_set_ttl(PJ_CONTEXT* ctx, int ttl_seconds);
+
+ /** Clear the local cache of grid chunks.
+ *
+ * @param ctx PROJ context, or NULL.
+ */
+ void proj_grid_cache_clear(PJ_CONTEXT* ctx);
+
+The planned database structure is:
+
+.. code-block:: sql
+
+ -- General properties on a file
+ CREATE TABLE properties(
+ url TEXT PRIMARY KEY NOT NULL,
+ lastChecked TIMESTAMP NOT NULL,
+ fileSize INTEGER NOT NULL,
+ lastModified TEXT,
+ etag TEXT
+ );
+
+ -- Store chunks of data. To avoid any potential fragmentation of the
+ -- cache, the data BLOB is always set to the maximum chunk size of 16 KB
+ -- (right padded with 0-byte)
+ -- The actual size is stored in chunks.data_size
+ CREATE TABLE chunk_data(
+ id INTEGER PRIMARY KEY AUTOINCREMENT CHECK (id > 0),
+ data BLOB NOT NULL
+ );
+
+ -- Record chunks of data by (url, offset)
+ CREATE TABLE chunks(
+ id INTEGER PRIMARY KEY AUTOINCREMENT CHECK (id > 0),
+ url TEXT NOT NULL,
+ offset INTEGER NOT NULL,
+ data_id INTEGER NOT NULL,
+ data_size INTEGER NOT NULL,
+ CONSTRAINT fk_chunks_url FOREIGN KEY (url) REFERENCES properties(url),
+ CONSTRAINT fk_chunks_data FOREIGN KEY (data_id) REFERENCES chunk_data(id)
+ );
+ CREATE INDEX idx_chunks ON chunks(url, offset);
+
+ -- Doubly linked list of chunks. The next link is to go to the least-recently
+ -- used entries.
+ CREATE TABLE linked_chunks(
+ id INTEGER PRIMARY KEY AUTOINCREMENT CHECK (id > 0),
+ chunk_id INTEGER NOT NULL,
+ prev INTEGER,
+ next INTEGER,
+ CONSTRAINT fk_links_chunkid FOREIGN KEY (chunk_id) REFERENCES chunks(id),
+ CONSTRAINT fk_links_prev FOREIGN KEY (prev) REFERENCES linked_chunks(id),
+ CONSTRAINT fk_links_next FOREIGN KEY (next) REFERENCES linked_chunks(id)
+ );
+ CREATE INDEX idx_linked_chunks_chunk_id ON linked_chunks(chunk_id);
+
+ -- Head and tail pointers of the linked_chunks. The head pointer is for
+ -- the most-recently used chunk.
+ -- There should be just one row in this table.
+ CREATE TABLE linked_chunks_head_tail(
+ head INTEGER,
+ tail INTEGER,
+ CONSTRAINT lht_head FOREIGN KEY (head) REFERENCES linked_chunks(id),
+ CONSTRAINT lht_tail FOREIGN KEY (tail) REFERENCES linked_chunks(id)
+ );
+ INSERT INTO linked_chunks_head_tail VALUES (NULL, NULL);
+
+The chunks table will store 16 KB chunks (or less for terminating chunks).
+The linked_chunks and linked_chunks_head_tail tables will act as a doubly linked
+list of chunks, with the least recently used ones at the end of the list, which
+will be evicted when the cache saturates.
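+
+To make the list mechanics concrete, here is an in-memory C analogue of the
+``prev`` / ``next`` / ``head`` / ``tail`` columns. It only illustrates the
+least-recently-used bookkeeping; the real cache would perform the equivalent
+updates with SQL statements. All names and the fixed ``MAX_LINKS`` capacity
+are assumptions for this sketch.
+

```c
#include <assert.h>

#define MAX_LINKS 8 /* hypothetical capacity of this in-memory sketch */
#define NONE (-1)   /* plays the role of SQL NULL */

typedef struct {
    int prev[MAX_LINKS]; /* towards more recently used entries */
    int next[MAX_LINKS]; /* towards less recently used entries */
    int head;            /* most recently used link */
    int tail;            /* least recently used link */
} LruList;

void lru_init(LruList *l) { l->head = NONE; l->tail = NONE; }

/* Detach link `id` from the list, fixing neighbours and head/tail. */
static void lru_unlink(LruList *l, int id)
{
    int p = l->prev[id], n = l->next[id];
    if (p != NONE) l->next[p] = n; else l->head = n;
    if (n != NONE) l->prev[n] = p; else l->tail = p;
}

/* Insert link `id` (not currently in the list) at the head. */
void lru_push_front(LruList *l, int id)
{
    assert(id >= 0 && id < MAX_LINKS);
    l->prev[id] = NONE;
    l->next[id] = l->head;
    if (l->head != NONE) l->prev[l->head] = id;
    l->head = id;
    if (l->tail == NONE) l->tail = id;
}

/* Mark an already listed link as the most recently used one. */
void lru_touch(LruList *l, int id)
{
    assert(id >= 0 && id < MAX_LINKS);
    if (l->head == id) return;
    lru_unlink(l, id);
    lru_push_front(l, id);
}

/* Remove and return the least recently used link, or NONE if empty. */
int lru_evict(LruList *l)
{
    int id = l->tail;
    if (id != NONE) lru_unlink(l, id);
    return id;
}
```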
+
+The directory used to locate this database will be ${XDG_DATA_HOME}/proj
+(per https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html)
+where ${XDG_DATA_HOME} defaults to ${HOME}/.local/share on Unix builds
+and ${LOCALAPPDATA} on Windows builds. Exact details to be sorted out, but
+https://github.com/ActiveState/appdirs/blob/a54ea98feed0a7593475b94de3a359e9e1fe8fdb/appdirs.py#L45-L97
+can be a good reference.
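+
+For the Unix case, the lookup could be sketched as below. ``resolve_cache_dir``
+is a hypothetical helper that takes the two environment values as parameters
+(so that it can be exercised without touching the process environment); the
+Windows ``${LOCALAPPDATA}`` branch is deliberately left out of this sketch.
+

```c
#include <assert.h>
#include <stdio.h>

/* Write the directory holding the cache database into out.
 * xdg_data_home and home are the values of ${XDG_DATA_HOME} and ${HOME},
 * either of which may be NULL or empty.
 * Returns 0 on success, -1 if neither value is usable or out is too small.
 * Hypothetical helper covering Unix builds only. */
int resolve_cache_dir(const char *xdg_data_home, const char *home,
                      char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);

    int n = -1;
    if (xdg_data_home && *xdg_data_home) {
        n = snprintf(out, out_size, "%s/proj", xdg_data_home);
    } else if (home && *home) {
        /* Default of the XDG base directory specification. */
        n = snprintf(out, out_size, "%s/.local/share/proj", home);
    }
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

+
+A caller would typically pass ``getenv("XDG_DATA_HOME")`` and
+``getenv("HOME")`` as the first two arguments.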
+
+As this database might be accessed by several threads or processes at the same
+time, the code accessing it will carefully honour SQLite3 errors related
+to locks, and retry appropriately if another thread/process is currently
+locking the database. Accesses requiring a modification of the database will
+start with a BEGIN IMMEDIATE transaction so as to acquire a write lock.
+
+.. note:: This database should be hosted on a local disk, not a network one.
+ Otherwise SQLite3 locking issues are to be expected.
+
+CDN provider
+++++++++++++
+
+`Amazon Public Datasets <https://aws.amazon.com/opendata/public-datasets/>`_
+has offered to be a storage and CDN provider.
+
+The program covers storage and egress (bandwidth) of the data.
+They generally don't allow usage of CloudFront
+(their CDN) as part of the program (we would usually look to have it covered
+by credits), but in this instance, they would be fine to provide it.
+They'd only ask that we keep the CloudFront URL "visible" (as appropriate for
+the use case) so people can see where the data is hosted in case they go looking.
+Their terms can be seen at https://aws.amazon.com/service-terms/ and CloudFront
+has its own, small section. Those terms may change a bit from time to time for
+minor changes. Major changes to the service terms are assumed to be infrequent.
+There are also the Public Dataset Program terms at http://aws.amazon.com/public-datasets/terms/.
+Those also do not effectively change over time and are renewed on a 2 year basis.
+
+Criteria for grid hosting
++++++++++++++++++++++++++
+
+The grids hosted on the CDN will be exactly the ones collected,
+currently and in the future, by the `proj-datumgrid <https://github.com/OSGeo/proj-datumgrid/>`_
+initiative. In particular, new grids are accepted as long as
+they are released under a license that is compatible with the
+`Open Source Definition <https://opensource.org/osd-annotated>`_ and the source
+of the grid is clearly stated and verifiable. Suitable licenses include:
+
+- Public domain
+- X/MIT
+- BSD 2/3/4 clause
+- CC0
+- CC-BY (v3.0 or later)
+- CC-BY-SA (v3.0 or later)
+
+For new grids to be transparently used by the proj_create_crs_to_crs() mechanics,
+they must be registered in the PROJ database (proj.db) in the ``grid_transformation`` and
+``grid_alternatives`` tables. The nominal path to have a new record in the grid_transformation
+table is to have a transformation registered in the EPSG dataset (if there is no
+existing one), which will be subsequently imported into the PROJ database.
+
+Versioning, historical preservation of grids
+++++++++++++++++++++++++++++++++++++++++++++
+
+The policy regarding this should be similar to the one applied to
+`proj-datumgrid <https://github.com/OSGeo/proj-datumgrid/>`_, which even if
+not formalized, is around the following lines:
+
+- Geodetic agencies regularly release new versions of grids. Typically for the
+ USA, NOAA has released GEOID99, GEOID03, GEOID06, GEOID09, GEOID12A, GEOID12B,
+ GEOID18 for the NAVD88 to NAD83/NAD83(2011) vertical adjustments. Each of these
+ grids is considered by EPSG and PROJ as a separate object, with distinct filenames.
+ The release of a new version does not cause the old grid to be automatically removed.
+ That said, due to advertised accuracies and supersession rules of the EPSG dataset, the
+ most recent grid will generally be used for a CRS -> CRS transformation if the
+ user uses proj_create_crs_to_crs() (with the exception that if a VERT_CRS WKT
+ includes a GEOID_MODEL known to PROJ, an old version of the grid will be used).
+ If the user specifies a whole pipeline with an explicit grid name, it will be
+ of course strictly honoured.
+ As time goes on, the size of the datasets managed by proj-datumgrid will increase, and
+ we will have to explore how to manage that for the distributed .zip / .tar.gz
+ archives. This should not be a concern for CDN hosted content.
+
+- In case software-related conversion errors from the original grid format to the
+ one used by PROJ (be it GTX, NTv2 or GeoTIFF) would happen, the previous erroneous
+ version of the dataset would be replaced by the corrected one. In that situation,
+ this might have an effect with the local on-disk caching of remote grids. We will
+ have to see with the CDN providers used if we can use for example the ETag HTTP header
+ on the client to detect a change, so that old cached content is not erroneously
+ reused (if not possible, we'll have to use some text file listing the grid names and their
+ current md5sum).
+
+
+Grids in GeoTIFF format
+-------------------------------------------------------------------------------
+
+Limitations of current formats
+++++++++++++++++++++++++++++++
+
+Several formats exist depending on the ad-hoc needs and ideas of the original
+data producer. It would be appropriate to converge on a common format able to
+address the different use cases.
+
+- Not tiled. Tiling is a nice-to-have property for cloud-friendly access to
+ large files.
+- No support for compression
+- The NTv2 structure is roughly: header of main grid, data of main grid,
+ header of subgrid 1, data of subgrid 1, header of subgrid 2, data of subgrid 2,
+ etc. Due to the headers being scattered through the file, it is not possible
+ to retrieve all header information with a single HTTP GET request.
+- GTX format has no provision to store metadata besides the minimum georeferencing
+ of the grid. NTv2 is a bit richer, but does not allow extensible metadata.
+
+Discussion on choice of format
+++++++++++++++++++++++++++++++
+
+We have recently been made aware of other initiatives from the industry to come
+up with a common format to store geodetic adjustment data. Some discussions have
+happened recently within the OGC CRS Working group. Past efforts include
+Esri's proposed Geodetic data Grid eXchange Format, GGXF, briefly mentioned on
+page 86 of
+https://iag.dgfi.tum.de/fileadmin/IAG-docs/Travaux2015/01_Travaux_Template_Comm_1_tvd.pdf
+and page 66 of ftp://ftp.iaspei.org/pub/meetings/2010-2019/2015-Prague/IAG-Geodesy.pdf
+The current trend of those works would be to use a netCDF / HDF5 container.
+
+So, for the sake of completeness, we list hereafter a few potential candidate
+formats and their pros and cons.
+
+TIFF/GeoTIFF
+************
+
+Strong points:
+
+* TIFF is a well-known and widespread format.
+
+* The GeoTIFF encoding is a widely industry supported scheme to encode georeferencing.
+ It is now an `OGC standard <https://www.opengeospatial.org/standards/geotiff>`_
+
+* There are independent initiatives to share grids as GeoTIFF, like
+ `that one <https://www.agisoft.com/downloads/geoids/>`_
+
+* TIFF can contain multiple images (IFD: Image File Directory) chained together.
+ This is the mechanism used for multiple-page scanned TIFF files, or in the
+ geospatial field to store multi-resolution/pyramid rasters. So it can be
+ used with sub-grids as in the NTv2 format.
+
+* Extensive experience with the TIFF format, and its appropriateness for network
+ access, in particular through the `Cloud Optimized GeoTIFF initiative <https://www.cogeo.org/>`_
+ whose layout can make use of sub-grids efficient from a network access
+ perspective, because grid headers can be put at the beginning of the file, and
+ so be retrieved in a single HTTP GET request.
+
+* TIFF can be tiled.
+
+* TIFF can be compressed. Commonly found compression formats are DEFLATE, LZW,
+ combined with differential integer or floating point predictors
+
+* A TIFF image can contain a configurable number of channels/bands/samples.
+ In the rest of the document, we will use the sample terminology for this concept.
+
+* TIFF sample organization can be configured: either the values of different
+ samples are packed together (`PlanarConfiguration <https://www.awaresystems.be/imaging/tiff/tifftags/planarconfiguration.html>`_ = Contig), or put in separate tiles/strips
+ (PlanarConfiguration = Separate)
+
+* libtiff is a dependency commonly found in binary distributions of the
+ "ecosystem" to which PROJ belongs
+
+* libtiff benefits from many years of efforts to increase its security, for
+ example being integrated to the oss-fuzz initiative. Given the potential
+ fetching of grids, using security tested components is an important concern.
+
+* Browser-side: there are "ports" of libtiff/libgeotiff in the browser such
+ as https://geotiffjs.github.io/ which could potentially make a port of PROJ
+ easier.
+
+Weak points:
+
+* we cannot use libgeotiff, since it depends itself on PROJ (to resolve CRS
+ or components of CRS from their EPSG codes). That said, for PROJ intended
+ use, we only need to decode the ModelTiepointTag and ModelPixelScaleTag TIFF
+ tags, so this can be done "by hand"
+
+* the metadata capabilities of TIFF baseline are limited. The TIFF format comes
+ with a predefined set of metadata items whose keys have numeric values. That
+ said, GDAL has used for the last 20 years or so a dedicated tag,
+ `GDAL_METADATA <https://www.awaresystems.be/imaging/tiff/tifftags/gdal_metadata.html>`_
+ of code 42112 that holds an XML-formatted string able to store arbitrary
+ key-value pairs.
+
+netCDF v3
+*********
+
+Strong points:
+
+* The binary format description as given in
+ `OGC 10-092r3 <http://portal.opengeospatial.org/files/?artifact_id=43734>`_ is relatively simple,
+ but it would still probably be necessary to use libnetcdf-c to access it
+
+* Metadata can be stored easily in netCDF attributes
+
+
+Weak points:
+
+* No compression in netCDF v3
+
+* No tiling in netCDF v3
+
+* Multi-samples variables are located in different sections of the files
+ (correspond to TIFF PlanarConfiguration = Separate)
+
+* No natural way of having hierarchical / multigrids. They must be encoded as
+ separate variables
+
+* georeferencing in netCDF is somewhat less standardized than TIFF/GeoTIFF.
+ The generally used model is `the conventions for CF (Climate and Forecast)
+ metadata <http://cfconventions.org/>`_
+ but there is nothing really handy in them for simple georeferencing with
+ the coordinate of the upper-left pixel and the resolution. The practice is
+ to write explicit lon and lat variables with all values taken by the grid.
+ GDAL has for many years supported a simpler syntax, using a GeoTransform
+ attribute.
+
+* From the format description, its layout could be relatively cloud friendly,
+ except that libnetcdf has no API to plug an alternate I/O layer.
+
+* Most binary distributions of netCDF nowadays are based on libnetcdf v4, which
+ implies the HDF5 dependency.
+
+* From a few issues we identified a few years ago regarding crashes on corrupted
+ datasets, we contacted libnetcdf upstream, but they did not seem to be
+ interested in addressing those security issues.
+
+netCDF v4 / HDF5
+****************
+
+Note: The netCDF v4 format is a profile of the HDF5 file format.
+
+Strong points:
+
+* Compression supported (ZLIB and SZIP predefined)
+
+* Tiling (chunking) supported
+
+* Values of multi-sample variables can be interleaved together (similarly
+ to TIFF PlanarConfiguration = Contig) by using compound data types.
+
+* Hierarchical organization with groups
+
+* While the netCDF API does not provide an alternate I/O layer, this is
+ possible with the HDF5 API.
+
+* Grids can be indexed by more than 2 dimensions (for current needs, we
+ don't need more than 2D support)
+
+Weak points:
+
+* The `HDF 5 File format <https://support.hdfgroup.org/HDF5/doc/H5.format.html>`_
+ is more complex than netCDF v3, and likely more than TIFF. We do not have
+ in-depth expertise of it to assess its cloud-friendliness.
+
+* The ones mentioned for netCDF v3 regarding georeferencing and security apply.
+
+
+GeoPackage
+**********
+
+As PROJ has already a SQLite3 dependency, GeoPackage could be examined as a
+potential solution.
+
+Strong points:
+
+* SQLite3 dependency
+
+* OGC standard
+
+* Multi-grid capabilities
+
+* Tiling
+
+* Compression
+
+* Metadata capabilities
+
+Weak points:
+
+* GeoPackage mostly addresses the RGB(A) Byte use case, or via the tile gridded
+ data extension, single-sample non-Byte data. No native support for multi-sample
+ non-Byte data: each sample should be put in a separate raster table.
+
+* Experience shows that SQLite3 layout (at least the layout adopted when using
+ the standard libsqlite3) is not cloud friendly. Indices may be scattered in
+ different places of the file.
+
+Conclusions
+***********
+
+The 2 major contenders regarding our goals and constraints are GeoTIFF and HDF5.
+Given past positive experience and its long history, GeoTIFF remains our preferred
+choice.
+
+
+.. _description_geotiff_format:
+
+Description of the PROJ GeoTIFF format
+++++++++++++++++++++++++++++++++++++++
+
+The general principles that guide the following requirements and recommendations
+are such that files will be properly recognized by PROJ, and also by GDAL which
+is an easy way to inspect such grid files:
+
+- `TIFF 6.0 <https://www.awaresystems.be/imaging/tiff/specification/TIFF6.pdf>`_
+ based (could possibly be BigTIFF without code changes, if we ever
+ need some day to handle grids larger than 4GB)
+
+- `GeoTIFF 1.1 <http://docs.opengeospatial.org/is/19-008r4/19-008r4.html>`_ for the georeferencing.
+ GeoTIFF 1.1 is a recent standard, compared to the original GeoTIFF 1.0 version,
+ but its backward compatibility is excellent, so that should not cause much trouble
+ to readers that are not officially GeoTIFF 1.1 compliant.
+
+- Files hosted on the CDN will use a Geographic 2D CRS for the GeoTIFF GeoKeys.
+ That CRS is intended to be the interpolation CRS as defined in
+ `OGC Abstract Specification Topic 2 <http://docs.opengeospatial.org/as/18-005r4/18-005r4.html>`_,
+ that is the CRS to which grid values are referred.
+
+ Given that they will nominally be related to the EPSG dataset, the `GeodeticCRSGeoKey
+ <http://docs.opengeospatial.org/is/19-008r4/19-008r4.html#_requirements_class_geodeticcrsgeokey>`_
+ will be used to store the EPSG code of the CRS. If the CRS cannot be reliably
+ encoded through that key or other geokeys, the ``interpolation_crs_wkt`` metadata
+ item detailed afterwards should be used.
+
+ This CRS will generally be the source CRS (for geographic to
+ geographic horizontal shift grids, or geographic to vertical shift grids), but
+ for vertical to vertical CRS adjustment, this will be the geographic CRS to
+ which the grid is referenced. In some very rare cases of geographic to vertical
+ shift grids, the interpolation CRS might be a geographic CRS that is not the
+ same as the source CRS (in which ellipsoidal heights are expressed). The only
+ instance we have in mind is for the EPSG:7001 "ETRS89 to NAP height (1)" transformation
+ using the naptrans2008 VDatum-grid which is referenced to Amersfoort EPSG:4289
+ instead of ETRS89...
+
+ On the reading side, PROJ will ignore that information:
+ the CRS is already stored in the source_crs or interpolation_crs column of the
+ grid_transformation table.
+
+ For geographic to vertical shift files (geoid models), the GeoTIFF 1.1
+ convention will be used to store the value of the `VerticalGeoKey
+ <http://docs.opengeospatial.org/is/19-008r4/19-008r4.html#_requirements_class_verticalgeokey>`_.
+ So a geoid model that applies to WGS 84 EPSG:4979 will have GeodeticCRSGeoKey = 4326
+ and VerticalGeoKey = 4979.
+
+- Files hosted on the CDN will use the GeoTIFF defined `ModelTiepointTag and ModelPixelScaleTag
+ <http://docs.opengeospatial.org/is/19-008r4/19-008r4.html#_raster_to_model_coordinate_transformation_requirements>`_ TIFF tags
+ to store the coordinates of the upper-left pixel and the resolution of the pixels.
+ On the reading side, they will be required and ModelTransformationTag will be ignored.
+
+ .. note::
+
+ Regarding anti-meridian handling, a variety of possibilities exist.
+ We do not attempt to standardize this and files hosted on the CDN will use
+ a georeferencing close to that of the original data producer.
+ For example, NOAA vertical grids that apply to Conterminous USA might even have a top-left
+ longitude beyond 180 (for consistency with Alaska grids, whose origin is < 180).
+ Anti-meridian handling in PROJ probably has issues. This RFC does not attempt
+ to address them in particular, as they are believed to be orthogonal to the
+ topics it covers, and to be mostly implementation issues.
+
+- Files hosted on the CDN will use the `GTRasterTypeGeoKey
+ <http://docs.opengeospatial.org/is/19-008r4/19-008r4.html#_requirements_class_gtrastertypegeokey>`_
+ = PixelIsPoint convention.
+ This is the convention used by most existing grid formats currently. Note that GDAL
+ typically uses a PixelIsArea convention (but can handle both conventions), so the
+ georeferencing it displays when opening a .gsb or .gtx file appears to have a
+ half-pixel shift relative to the coordinates stored in the original grid file. On
+ the reading side, PROJ will accept both conventions (for equivalent georeferencing,
+ the value of the origin in a PixelIsArea convention is shifted by a half-pixel
+ towards the upper-left direction). The behaviour is unspecified if this GeoKey is absent.
+
+- Files hosted on the CDN will be tiled, presumably with 256x256 tiles (small
+ grids that are smaller than 256x256 will use a single strip). On the reading
+ side, PROJ will accept TIFF files with any strip or tile organization.
+ Tiling is expressed by specifying the TileWidth, TileHeight, TileOffsets
+ and TileByteCounts tags. Strip organization is expressed by specifying the
+ RowsPerStrip, StripByteCounts and StripOffsets tags.
+
+- Files hosted on the CDN will use `Compression
+ <https://www.awaresystems.be/imaging/tiff/tifftags/compression.html>`_ = DEFLATE
+ or LZW (to be determined, possibly with
+ `Predictor <https://www.awaresystems.be/imaging/tiff/tifftags/predictor.html>`_ = 2
+ or 3).
+ On the reading side, PROJ will accept TIFF files with any compression method
+ (appropriate for the data types and PhotometricInterpretation considered)
+ supported by the libtiff build used by PROJ. Of course uncompressed files will be supported.
+
+- Files hosted on the CDN will use little-endian byte ordering. On the reading
+ side, libtiff will transparently handle both little-endian and big-endian
+ ordering.
+
+- Files hosted on the CDN will use PlanarConfiguration=Separate.
+ The tools described in a later section will order blocks so that blocks needed
+ for a given location are close to each other.
+ On the reading side, PROJ will also handle PlanarConfiguration=Contig.
+
+- Files hosted on the CDN will generally use Float32 (BitsPerSample=32 and SampleFormat=IEEEFP).
+ Files may be created using Signed Int 16 (
+ `BitsPerSample <https://www.awaresystems.be/imaging/tiff/tifftags/bitspersample.html>`_ =16 and
+ `SampleFormat <https://www.awaresystems.be/imaging/tiff/tifftags/sampleformat.html>`_ = INT),
+ Unsigned Int 16 (BitsPerSample=16 and SampleFormat=UINT), Signed Int 32 or Unsigned Int 32 generally with an
+ associated scale/offset.
+ On the reading side, only those data types will be supported.
+
+- Files hosted on the CDN will have a `PhotometricInterpretation
+ <https://www.awaresystems.be/imaging/tiff/tifftags/photometricinterpretation.html>`_ = MinIsBlack.
+ It will be assumed, and ignored on the reading side.
+
+- Files hosted on the CDN will nominally have:
+
+ * `SamplesPerPixel <https://www.awaresystems.be/imaging/tiff/tifftags/samplesperpixel.html>`_ = 2
+ for horizontal shift grid, with the first sample being the latitude offset
+ and the second sample being the longitude offset.
+
+ * SamplesPerPixel = 1 for vertical shift grids.
+
+ In the future, different values of SamplesPerPixel may be used to accommodate
+ other needs. For example for deformation models, SamplesPerPixel = 3 to combine
+ horizontal and vertical adjustments.
+ And even for the current identified needs of horizontal or vertical shifts,
+ more samples may be present (to indicate for example uncertainties), but
+ will be ignored by PROJ.
+
+ The `ExtraSamples <https://www.awaresystems.be/imaging/tiff/tifftags/extrasamples.html>`_
+ tag should be set to a value of SamplesPerPixel - 1 (given the rules that
+ apply for PhotometricInterpretation = MinIsBlack).
+
+- The `ImageDescription <https://www.awaresystems.be/imaging/tiff/tifftags/imagedescription.html>`_
+ tag may be used to convey extra information about the name, provenance, version
+ and last updated date of the grid.
+ Will be set when possible for files hosted on the CDN.
+ Ignored by PROJ.
+
+- The `Copyright <https://www.awaresystems.be/imaging/tiff/tifftags/copyright.html>`_
+ tag may be used to convey extra information about the copyright and license of the grid.
+ Will be set when possible for files hosted on the CDN.
+ Ignored by PROJ.
+
+- The `DateTime <https://www.awaresystems.be/imaging/tiff/tifftags/datetime.html>`_
+ tag may be used to convey the date at which the file has been created or
+ converted. In case of a file conversion, for example from NTv2, this will be
+ the date at which the conversion has been performed. The ``ImageDescription``
+ tag however will contain the latest of the CREATED or UPDATED fields from the NTv2 file.
+ Will be set when possible for files hosted on the CDN.
+ Ignored by PROJ.
+
+- Files hosted on the CDN will use the `GDAL_NODATA
+ <https://www.awaresystems.be/imaging/tiff/tifftags/gdal_nodata.html>`_ tag to encode
+ the value of the nodata / missing value, when it applies to the grid.
+
+ If offset and/or scaling is used, the nodata value corresponds to the raw value,
+ before applying offset and scaling.
+ The value found in this tag, if present, will be honoured (to the extent to
+ which current PROJ code makes use of nodata).
+ For floating point data, writers are strongly discouraged from using non-finite values
+ (+/- infinity, NaN) as nodata, to maximize interoperability.
+ The GDAL_NODATA value applies to all samples of a given TIFF IFD.
+
+- Files hosted on the CDN will use the `GDAL_METADATA
+ <https://www.awaresystems.be/imaging/tiff/tifftags/gdal_metadata.html>`_ tag to encode extra
+ metadata not supported by baseline or extended TIFF.
+
+ * The root XML node should be ``GDALMetadata``
+
+ * Zero, one or several child XML nodes ``Item`` may be present.
+
+ * An Item should have a ``name`` attribute, and a child text node with its value.
+ ``role`` and ``sample`` attributes may be present for attributes that have
+ special semantics (recognized by GDAL). The value of ``sample`` should be
+ an integer value between 0 and number_of_samples - 1.
+
+ * Scale and offset to convert integer raw values to floating point values
+ may be expressed with XML ``Item`` elements whose name attribute is respectively
+ ``SCALE`` and ``OFFSET``, and their ``role`` attribute is respectively ``scale``
+ and ``offset``. The decoded value will be: {offset} + {scale} * raw_value_from_geotiff_file
+
+ For an offset value of 1 and scaling of 2, the following payload should be
+ stored:
+
+ .. code-block:: xml
+
+ <GDALMetadata>
+ <Item name="OFFSET" sample="0" role="offset">1</Item>
+ <Item name="SCALE" sample="0" role="scale">2</Item>
+ </GDALMetadata>
+
+ * The type of the grid must be specified with an ``Item`` whose ``name`` is set
+ to ``TYPE``.
+
+ Values recognized by PROJ currently are:
+
+ - ``HORIZONTAL_OFFSET``: implies the presence of at least two samples.
+ The first sample must contain the latitude offset and the second
+ sample must contain the longitude offset.
+ Corresponds to PROJ ``hgridshift`` method.
+
+ - ``VERTICAL_OFFSET_GEOGRAPHIC_TO_VERTICAL``: implies the presence of at least one sample.
+ The first sample must contain the vertical adjustment. Must be used when
+ the source/interpolation CRS is a Geographic CRS and the target CRS a Vertical CRS.
+ Corresponds to PROJ ``vgridshift`` method.
+
+ - ``VERTICAL_OFFSET_VERTICAL_TO_VERTICAL``: implies the presence of at least one sample.
+ The first sample must contain the vertical adjustment. Must be used when
+ the source and target CRS are Vertical CRS.
+ Corresponds to PROJ ``vgridshift`` method.
+
+ - ``GEOCENTRIC_TRANSLATION``: implies the presence of at least 3 samples.
+ The first 3 samples must be respectively the geocentric adjustments along
+ the X, Y and Z axis. Must be used when the source and target CRS are
+ geocentric CRS. The interpolation CRS must be a geographic CRS.
+ Corresponds to PROJ ``xyzgridshift`` method.
+
+ - ``VELOCITY``: implies the presence of at least 3 samples.
+ The first 3 samples must be respectively the velocities along
+ the E(ast), N(orth), U(p) axis in the local topocentric coordinate system.
+ Corresponds to PROJ ``deformation`` method.
+
+ For example:
+
+ .. code-block:: xml
+
+ <Item name="TYPE">HORIZONTAL_OFFSET</Item>
+
+ * The description of each sample must be specified with an Item whose ``name``
+ attribute is set to ``DESCRIPTION`` and ``role`` attribute to ``description``.
+
+ Values recognized by PROJ for this Item are currently:
+
+ + ``latitude_offset``: valid for TYPE=HORIZONTAL_OFFSET. Sample values should be
+ the value to add to a latitude expressed in the CRS encoded in the GeoKeys
+ to obtain a latitude value expressed in the target CRS.
+
+ + ``longitude_offset``: valid for TYPE=HORIZONTAL_OFFSET. Sample values should be
+ the value to add to a longitude expressed in the CRS encoded in the GeoKeys
+ to obtain a longitude value expressed in the target CRS.
+
+ + ``geoid_undulation``: valid for TYPE=VERTICAL_OFFSET_GEOGRAPHIC_TO_VERTICAL.
+ For a source CRS being a geographic CRS and a target CRS being a vertical CRS,
+ sample values should be the value to add to a geoid-related height (that
+ is expressed in the target CRS) to
+ get an ellipsoidal height (that is expressed in the source CRS), also
+ called the geoid undulation.
+ Note the possible confusion related to what is the source CRS and target CRS and
+ the semantics of the value stored (to convert from the source to the target,
+ one must subtract the value contained in the grid). This is the convention
+ used by the `EPSG:9665 <https://www.epsg-registry.org/export.htm?gml=urn:ogc:def:method:EPSG::9665>`_
+ operation method.
+
+ + ``vertical_offset``: valid for TYPE=VERTICAL_OFFSET_VERTICAL_TO_VERTICAL.
+ For a source and target CRS being vertical CRS,
+ sample values should be the value to add to an elevation expressed in the
+ source CRS to obtain an elevation expressed in the target CRS.
+
+ + ``x_translation`` / ``y_translation`` / ``z_translation``: valid for
+ TYPE=GEOCENTRIC_TRANSLATION.
+ Sample values should be the value to add to the input geocentric coordinates
+ expressed in the source CRS to obtain geocentric coordinates expressed in the target CRS.
+
+ + ``east_velocity`` / ``north_velocity`` / ``up_velocity``: valid for
+ TYPE=VELOCITY.
+ Sample values should be the velocity in a linear/time unit in an ENU local
+ topocentric coordinate system.
+
+ For example:
+
+ .. code-block:: xml
+
+ <Item name="DESCRIPTION" sample="0" role="description">latitude_offset</Item>
+ <Item name="DESCRIPTION" sample="1" role="description">longitude_offset</Item>
+
+ Other values may be used (not used by PROJ):
+
+ + ``latitude_offset_accuracy``: valid for TYPE=HORIZONTAL_OFFSET. Sample values should be
+ the accuracy of corresponding latitude_offset samples. Generally in metre (if converted from NTv2)
+
+ + ``longitude_offset_accuracy``: valid for TYPE=HORIZONTAL_OFFSET. Sample values should be
+ the accuracy of corresponding longitude_offset samples. Generally in metre (if converted from NTv2)
+
+ * The sign convention for the values of the ``longitude_offset`` channel
+ should be indicated with an Item named ``positive_value`` whose value
+ can be ``west`` or ``east``. NTv2 products originally use a ``west``
+ convention, but when converting from them to GeoTIFF, the sign of those
+ samples will be inverted so they use a more natural ``east`` convention.
+ If this item is absent, the default value is ``east``.
+
+ * The unit of the values stored in the grid must be specified for each
+ sample through an Item of name ``UNITTYPE`` and role ``unittype``.
+ Valid values should be the name of entries from the EPSG ``unitofmeasure``
+ table. To maximize interoperability, writers are strongly encouraged to
+ limit themselves to the following values:
+
+ For linear units:
+
+ - ``metre`` (default value assumed if absent for vertical shift grid files, and value used for files stored on PROJ CDN)
+ - ``US survey foot``
+
+ For angular units:
+
+ - ``degree``
+ - ``arc-second`` (default value assumed if absent for longitude and latitude offset samples of horizontal shift grid files, and value used for files stored on PROJ CDN)
+
+ For velocity units:
+
+ - ``millimetres per year``
+
+ The longitude and latitude offset samples should use the same unit.
+ The geocentric translation samples should use the same unit.
+ The velocity samples should use the same unit.
+
+ Example:
+
+ .. code-block:: xml
+
+ <Item name="UNITTYPE" sample="0" role="unittype">arc-second</Item>
+ <Item name="UNITTYPE" sample="1" role="unittype">arc-second</Item>
+
+ * The ``target_crs_epsg_code`` metadata item should be present.
+ For a horizontal shift grid, this is the EPSG
+ code of the target geographic CRS. For a vertical shift grid, this is the
+ EPSG code of the target vertical CRS.
+ If the target CRS has no associated EPSG code, ``target_crs_wkt`` must be
+ used.
+ Ignored by PROJ currently.
+
+ * The ``target_crs_wkt`` metadata item must be present if the
+ ``target_crs_epsg_code`` cannot be used.
+ Its value should be a valid WKT string according to
+ `WKT:2015 <http://docs.opengeospatial.org/is/12-063r5/12-063r5.html>`_
+ or `WKT:2019 <http://docs.opengeospatial.org/is/18-010r7/18-010r7.html>`_.
+ Ignored by PROJ currently.
+
+ * The ``source_crs_epsg_code`` metadata item must be present if the source
+ and interpolation CRS are not the same (typical use case is vertical CRS to vertical CRS
+ transformation), because the GeoKeys encode the interpolation CRS and not the source CRS.
+ If the source CRS has no associated EPSG code, ``source_crs_wkt`` must be
+ used.
+ Ignored by PROJ currently.
+
+ * The ``source_crs_wkt`` metadata item must be present if the
+ ``source_crs_epsg_code`` cannot be used.
+ Its value should be a valid WKT string according to WKT:2015 or WKT:2019.
+ Ignored by PROJ currently.
+
+ * The ``interpolation_crs_wkt`` metadata item may be present if the GeoKeys
+ cannot be used to express reliably the interpolation CRS.
+ Its value should be a valid WKT string according to WKT:2015 or WKT:2019.
+ Ignored by PROJ currently.
+
+ * The ``recommended_interpolation_method`` metadata item may be present to
+ describe the method to use to interpolate values at locations not
+ coincident with nodes stored in the grid file. Potential values: ``bilinear``,
+ ``bicubic``.
+ Ignored by PROJ currently.
+
+ * The ``area_of_use`` metadata item can be used to indicate plain text information
+ about the area of use of the grid (like "USA - Wisconsin"). In case of multiple
+ subgrids, it should be set only on the first one, but applies to the whole
+ set of grids, not just the first one.
+
+ * The ``grid_name`` metadata item should be present if there are
+ subgrids for this grid (that is grids whose extent is contained in the extent
+ of this grid), or if this is a subgrid.
+ It is intended to be a relatively short identifier.
+ Will be ignored by PROJ (this information can be inferred from the grids' extents).
+
+ * The ``parent_grid_name`` metadata item should be present if this is a
+ subgrid and its value should be equal to the parent's ``grid_name``.
+ Will be ignored by PROJ (this information can be inferred from the grids' extents).
+
+ * The ``number_of_nested_grids`` metadata item should be present if there are
+ subgrids for this grid (that is grids whose extent is contained in the extent
+ of this grid).
+ Will be ignored by PROJ (this information can be inferred from the grids' extents).
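+
+The GDAL_METADATA rules above can be illustrated with a short Python sketch that
+reads a ``GDALMetadata`` payload and decodes a raw sample value as
+{offset} + {scale} * raw_value. This is only an illustration, not PROJ code: the
+function names ``parse_gdal_metadata`` and ``decode`` are hypothetical, and falling
+back to a scale of 1 and an offset of 0 when the items are absent is an assumption.
+
```python
import xml.etree.ElementTree as ET


def parse_gdal_metadata(xml_text):
    """Split a GDALMetadata payload into dataset-level and per-sample items."""
    root = ET.fromstring(xml_text)
    if root.tag != "GDALMetadata":
        raise ValueError("root XML node should be GDALMetadata")
    dataset_items = {}
    sample_items = {}  # sample index -> {item name: value}
    for item in root.findall("Item"):
        name = item.get("name")
        value = (item.text or "").strip()
        sample = item.get("sample")
        if sample is None:
            dataset_items[name] = value
        else:
            sample_items.setdefault(int(sample), {})[name] = value
    return dataset_items, sample_items


def decode(raw_value, items, nodata=None):
    """Return offset + scale * raw_value, or None for a nodata sample.

    The nodata value is compared against the raw value, that is before
    offset and scaling are applied.
    """
    if nodata is not None and raw_value == nodata:
        return None
    scale = float(items.get("SCALE", 1.0))    # assumed default when absent
    offset = float(items.get("OFFSET", 0.0))  # assumed default when absent
    return offset + scale * raw_value


payload = """<GDALMetadata>
  <Item name="TYPE">VERTICAL_OFFSET_GEOGRAPHIC_TO_VERTICAL</Item>
  <Item name="DESCRIPTION" sample="0" role="description">geoid_undulation</Item>
  <Item name="UNITTYPE" sample="0" role="unittype">metre</Item>
  <Item name="OFFSET" sample="0" role="offset">1</Item>
  <Item name="SCALE" sample="0" role="scale">2</Item>
</GDALMetadata>"""

dataset_items, sample_items = parse_gdal_metadata(payload)
print(dataset_items["TYPE"])
print(decode(10, sample_items[0]))              # 1 + 2 * 10
print(decode(-32768, sample_items[0], -32768))  # raw nodata value
```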
+
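+The sign conventions described above for ``longitude_offset`` (``positive_value``)
+and ``geoid_undulation`` are easy to get wrong, so here is a minimal sketch of how a
+reader might apply them. The function names are hypothetical and this is not the
+PROJ implementation; offsets are assumed to be in arc-second and heights in metre.
+
```python
ARCSEC_PER_DEGREE = 3600.0


def apply_horizontal_offset(lon_deg, lat_deg, lon_offset_arcsec,
                            lat_offset_arcsec, positive_value="east"):
    """Add arc-second offsets to a (longitude, latitude) pair in degrees.

    With positive_value="west", a positive longitude offset moves the point
    westwards, so it is subtracted from the east-positive longitude.
    """
    if positive_value not in ("east", "west"):
        raise ValueError("positive_value must be 'east' or 'west'")
    sign = 1.0 if positive_value == "east" else -1.0
    lon = lon_deg + sign * lon_offset_arcsec / ARCSEC_PER_DEGREE
    lat = lat_deg + lat_offset_arcsec / ARCSEC_PER_DEGREE
    return lon, lat


def ellipsoidal_to_geoid_height(ellipsoidal_height, geoid_undulation):
    """Source (geographic) to target (vertical): subtract the grid value."""
    return ellipsoidal_height - geoid_undulation


def geoid_to_ellipsoidal_height(geoid_height, geoid_undulation):
    """Target (vertical) to source (geographic): add the grid value."""
    return geoid_height + geoid_undulation


print(apply_horizontal_offset(2.0, 48.0, 3.6, -7.2))
print(apply_horizontal_offset(2.0, 48.0, 3.6, -7.2, positive_value="west"))
print(ellipsoidal_to_geoid_height(100.0, 45.0))
```
+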
+Example
++++++++
+
+https://github.com/rouault/sample_proj_gtiff_grids/blob/master/ntf_r93.tif has
+been converted from https://github.com/OSGeo/proj-datumgrid/blob/master/ntf_r93.gsb
+with https://github.com/rouault/sample_proj_gtiff_grids/blob/master/ntv2_to_gtiff.py
+
+::
+
+ $ tiffinfo ntf_r93.tif
+
+ TIFF Directory at offset 0x4e (78)
+ Image Width: 156 Image Length: 111
+ Bits/Sample: 32
+ Sample Format: IEEE floating point
+ Compression Scheme: AdobeDeflate
+ Photometric Interpretation: min-is-black
+ Extra Samples: 3<unspecified, unspecified, unspecified>
+ Samples/Pixel: 4
+ Rows/Strip: 111
+ Planar Configuration: separate image planes
+ ImageDescription: NTF (EPSG:4275) to RGF93 (EPSG:4171). Converted from ntf_r93.gsb (version IGN07_01, last updated on 2007-10-31)
+ DateTime: 2019:12:09 00:00:00
+ Copyright: Derived from work by IGN France. Open License https://www.etalab.gouv.fr/wp-content/uploads/2014/05/Open_Licence.pdf
+ Tag 33550: 0.100000,0.100000,0.000000
+ Tag 33922: 0.000000,0.000000,0.000000,-5.500000,52.000000,0.000000
+ Tag 34735: 1,1,1,3,1024,0,1,2,1025,0,1,2,2048,0,1,4275
+ Tag 42112: <GDALMetadata>
+ <Item name="grid_name">FRANCE</Item>
+ <Item name="target_crs_epsg_code">4171</Item>
+ <Item name="TYPE">HORIZONTAL_OFFSET</Item>
+ <Item name="UNITTYPE" sample="0" role="unittype">arc-second</Item>
+ <Item name="DESCRIPTION" sample="0" role="description">latitude_offset</Item>
+ <Item name="positive_value" sample="1">east</Item>
+ <Item name="UNITTYPE" sample="1" role="unittype">arc-second</Item>
+ <Item name="DESCRIPTION" sample="1" role="description">longitude_offset</Item>
+ <Item name="UNITTYPE" sample="2" role="unittype">arc-second</Item>
+ <Item name="DESCRIPTION" sample="2" role="description">latitude_offset_accuracy</Item>
+ <Item name="UNITTYPE" sample="3" role="unittype">arc-second</Item>
+ <Item name="DESCRIPTION" sample="3" role="description">longitude_offset_accuracy</Item>
+ </GDALMetadata>
+
+ Predictor: floating point predictor 3 (0x3)
+
+
+::
+
+ $ listgeo ntf_r93.tif
+
+ Geotiff_Information:
+ Version: 1
+ Key_Revision: 1.1
+ Tagged_Information:
+ ModelTiepointTag (2,3):
+ 0 0 0
+ -5.5 52 0
+ ModelPixelScaleTag (1,3):
+ 0.1 0.1 0
+ End_Of_Tags.
+ Keyed_Information:
+ GTModelTypeGeoKey (Short,1): ModelTypeGeographic
+ GTRasterTypeGeoKey (Short,1): RasterPixelIsPoint
+ GeodeticCRSGeoKey (Short,1): Code-4275 (NTF)
+ End_Of_Keys.
+ End_Of_Geotiff.
+
+ GCS: 4275/NTF
+ Datum: 6275/Nouvelle Triangulation Francaise
+ Ellipsoid: 7011/Clarke 1880 (IGN) (6378249.20,6356515.00)
+ Prime Meridian: 8901/Greenwich (0.000000/ 0d 0' 0.00"E)
+ Projection Linear Units: User-Defined (1.000000m)
+
+ Corner Coordinates:
+ Upper Left ( 5d30' 0.00"W, 52d 0' 0.00"N)
+ Lower Left ( 5d30' 0.00"W, 40d54' 0.00"N)
+ Upper Right ( 10d 6' 0.00"E, 52d 0' 0.00"N)
+ Lower Right ( 10d 6' 0.00"E, 40d54' 0.00"N)
+ Center ( 2d18' 0.00"E, 46d27' 0.00"N)
+
+::
+
+ $ gdalinfo ntf_r93.tif
+
+ Driver: GTiff/GeoTIFF
+ Files: ntf_r93.tif
+ Size is 156, 111
+ Coordinate System is:
+ GEOGCRS["NTF",
+ DATUM["Nouvelle Triangulation Francaise",
+ ELLIPSOID["Clarke 1880 (IGN)",6378249.2,293.466021293627,
+ LENGTHUNIT["metre",1]]],
+ PRIMEM["Greenwich",0,
+ ANGLEUNIT["degree",0.0174532925199433]],
+ CS[ellipsoidal,2],
+ AXIS["geodetic latitude (Lat)",north,
+ ORDER[1],
+ ANGLEUNIT["degree",0.0174532925199433]],
+ AXIS["geodetic longitude (Lon)",east,
+ ORDER[2],
+ ANGLEUNIT["degree",0.0174532925199433]],
+ ID["EPSG",4275]]
+ Data axis to CRS axis mapping: 2,1
+ Origin = (-5.550000000000000,52.049999999999997)
+ Pixel Size = (0.100000000000000,-0.100000000000000)
+ Metadata:
+ AREA_OR_POINT=Point
+ grid_name=FRANCE
+ target_crs_epsg_code=4171
+ TIFFTAG_DATETIME=2019:12:09 00:00:00
+ TIFFTAG_IMAGEDESCRIPTION=NTF (EPSG:4275) to RGF93 (EPSG:4171). Converted from ntf_r93.gsb (version IGN07_01, last updated on 2007-10-31)
+ TYPE=HORIZONTAL_OFFSET
+ Image Structure Metadata:
+ COMPRESSION=DEFLATE
+ INTERLEAVE=BAND
+ Corner Coordinates:
+ Upper Left ( -5.5500000, 52.0500000) ( 5d33' 0.00"W, 52d 3' 0.00"N)
+ Lower Left ( -5.5500000, 40.9500000) ( 5d33' 0.00"W, 40d57' 0.00"N)
+ Upper Right ( 10.0500000, 52.0500000) ( 10d 3' 0.00"E, 52d 3' 0.00"N)
+ Lower Right ( 10.0500000, 40.9500000) ( 10d 3' 0.00"E, 40d57' 0.00"N)
+ Center ( 2.2500000, 46.5000000) ( 2d15' 0.00"E, 46d30' 0.00"N)
+ Band 1 Block=156x111 Type=Float32, ColorInterp=Gray
+ Description = latitude_offset
+ Unit Type: arc-second
+ Band 2 Block=156x111 Type=Float32, ColorInterp=Undefined
+ Description = longitude_offset
+ Unit Type: arc-second
+ Metadata:
+ positive_value=east
+ Band 3 Block=156x111 Type=Float32, ColorInterp=Undefined
+ Description = latitude_offset_accuracy
+ Unit Type: arc-second
+ Band 4 Block=156x111 Type=Float32, ColorInterp=Undefined
+ Description = longitude_offset_accuracy
+ Unit Type: arc-second
+
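+The half-pixel difference between the ``listgeo`` and ``gdalinfo`` outputs above
+follows from the PixelIsPoint / PixelIsArea conventions. A minimal sketch of the
+arithmetic, assuming a north-up grid whose tiepoint is attached to raster position
+(0, 0); the function names are illustrative and belong to neither PROJ nor GDAL:
+
```python
def point_to_area_origin(tie_x, tie_y, res_x, res_y):
    """Upper-left corner of the pixel whose centre is the PixelIsPoint tiepoint."""
    return tie_x - res_x / 2.0, tie_y + res_y / 2.0


def node_position(tie_x, tie_y, res_x, res_y, col, row):
    """Georeferenced position of the grid node at (col, row), PixelIsPoint."""
    return tie_x + col * res_x, tie_y - row * res_y


# Values taken from the ntf_r93.tif example: tiepoint (-5.5, 52), 0.1 degree
# resolution, 156 columns and 111 rows.
origin = point_to_area_origin(-5.5, 52.0, 0.1, 0.1)
last_node = node_position(-5.5, 52.0, 0.1, 0.1, 155, 110)
print(origin)     # close to the gdalinfo origin (-5.55, 52.05)
print(last_node)  # close to (10.0, 41.0), the lower-right grid node
```
+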
+Multi-grid storage
+++++++++++++++++++
+
+Formats like NTv2 can contain multiple subgrids. This can be transposed to
+TIFF by using several IFDs chained together with the last 4 bytes (or 8 bytes
+for BigTIFF) of an IFD pointing to the offset of the next one.
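+
+As an illustration of that chaining, the following sketch walks the IFD chain of
+a classic (non-BigTIFF) file held in memory, using only the TIFF 6.0 layout: a
+2-byte entry count, 12 bytes per entry, then the 4-byte offset of the next IFD.
+The function name ``ifd_offsets`` is hypothetical; PROJ itself relies on libtiff
+for this.
+
```python
import struct


def ifd_offsets(data):
    """Return the file offsets of all chained IFDs of a classic TIFF."""
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"  # little-endian
    elif byte_order == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF uses magic number 43)")
    offsets = []
    while offset != 0:
        if offset in offsets:
            raise ValueError("loop in IFD chain")
        offsets.append(offset)
        (entry_count,) = struct.unpack_from(endian + "H", data, offset)
        # The link to the next IFD is stored in the last 4 bytes of the IFD,
        # right after the directory entries; 0 terminates the chain.
        link_position = offset + 2 + 12 * entry_count
        (offset,) = struct.unpack_from(endian + "I", data, link_position)
    return offsets
```
+
+Applied to a multi-grid file, the returned list would contain one offset per
+grid, in the order in which the grids are chained.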
+
+The first IFD should have a full description according to the
+:ref:`Description of the PROJ GeoTIFF format <description_geotiff_format>`.
+Subsequent IFDs may have a more compact description, omitting, for example, CRS
+information if it is identical to the main IFD (which should be the case for
+the currently envisioned use cases), or Copyright / ImageDescription metadata
+items.
+
+Each IFD will have its
+`NewSubfileType <https://www.awaresystems.be/imaging/tiff/tifftags/newsubfiletype.html>`_
+tag set to 0.
+
+If a low-resolution grid is available, it should be put before subgrids of
+higher resolution in the IFD chain. On reading, PROJ will use the
+value from the highest-resolution grid that contains the point of interest.
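+
+The selection rule can be sketched as follows. This is a simplified illustration
+under stated assumptions, not the PROJ implementation: the ``Grid`` class and
+``select_grid`` function are hypothetical, extents are plain bounding boxes in
+degrees, and a single ``resolution`` value stands for the node spacing.
+
```python
from dataclasses import dataclass


@dataclass
class Grid:
    name: str
    west: float
    south: float
    east: float
    north: float
    resolution: float  # node spacing in degrees (smaller means finer)

    def contains(self, lon, lat):
        return self.west <= lon <= self.east and self.south <= lat <= self.north


def select_grid(grids, lon, lat):
    """Return the finest grid containing (lon, lat), or None if there is none."""
    candidates = [g for g in grids if g.contains(lon, lat)]
    if not candidates:
        return None
    return min(candidates, key=lambda g: g.resolution)


grids = [
    Grid("parent", 0.0, 0.0, 10.0, 10.0, 0.5),
    Grid("child", 2.0, 2.0, 4.0, 4.0, 0.1),
]
print(select_grid(grids, 3.0, 3.0).name)  # inside both grids
print(select_grid(grids, 8.0, 8.0).name)  # only inside the parent grid
```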
+
+For efficient reading from the network, files hosted on the CDN will use
+a layout similar to the one described in the `low level paragraph of the Cloud Optimized GeoTIFF
+GDAL driver page <https://gdal.org/drivers/raster/cog.html#low-level>`_.
+
+The layout for a file converted from NTv2 will for example be:
+
+- TIFF/BigTIFF header/signature and pointer to first IFD (Image File Directory)
+- "ghost area" indicating the generating process
+- IFD of the first grid, followed by TIFF tags values, excluding the TileOffsets and TileByteCounts arrays
+- ...
+- IFD of the last grid, followed by TIFF tags values, excluding the GDAL_METADATA tag, TileOffsets and TileByteCounts arrays
+- TileOffsets and TileByteCounts arrays for first IFD
+- ...
+- TileOffsets and TileByteCounts arrays for last IFD
+- Value of GDAL_METADATA tag for IFDs following the first IFD
+- First IFD: Data corresponding to latitude offset of Block_0_0
+- First IFD: Data corresponding to longitude offset of Block_0_0
+- First IFD: Data corresponding to latitude offset of Block_0_1
+- First IFD: Data corresponding to longitude offset of Block_0_1
+- ...
+- First IFD: Data corresponding to latitude offset of Block_n_m
+- First IFD: Data corresponding to longitude offset of Block_n_m
+- ...
+- Last IFD: Data corresponding to latitude offset of Block_0_0
+- Last IFD: Data corresponding to longitude offset of Block_0_0
+- Last IFD: Data corresponding to latitude offset of Block_0_1
+- Last IFD: Data corresponding to longitude offset of Block_0_1
+- ...
+- Last IFD: Data corresponding to latitude offset of Block_n_m
+- Last IFD: Data corresponding to longitude offset of Block_n_m
+
+If longitude_offset_accuracy and latitude_offset_accuracy are present, this
+will be followed by:
+
+- First IFD: Data corresponding to latitude offset accuracy of Block_0_0
+- First IFD: Data corresponding to longitude offset accuracy of Block_0_0
+- ...
+- First IFD: Data corresponding to latitude offset accuracy of Block_n_m
+- First IFD: Data corresponding to longitude offset accuracy of Block_n_m
+- ...
+- Last IFD: Data corresponding to latitude offset accuracy of Block_0_0
+- Last IFD: Data corresponding to longitude offset accuracy of Block_0_0
+- ...
+- Last IFD: Data corresponding to latitude offset accuracy of Block_n_m
+- Last IFD: Data corresponding to longitude offset accuracy of Block_n_m
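+
+The ordering of the data blocks listed above can be summarised by a small
+generator. This is only a sketch of the described layout, not the conversion
+script: ``block_order`` is a hypothetical name, and the two block indices simply
+mirror the ``Block_i_j`` notation of the lists.
+
```python
def block_order(n_ifds, n_i, n_j, with_accuracy=False):
    """Yield (ifd, i, j, sample) tuples in the order blocks are written.

    Offsets of all IFDs come first, interleaved per block; the accuracy
    samples of all IFDs, when present, follow in the same fashion.
    """
    groups = [("latitude_offset", "longitude_offset")]
    if with_accuracy:
        groups.append(("latitude_offset_accuracy", "longitude_offset_accuracy"))
    for samples in groups:
        for ifd in range(n_ifds):
            for i in range(n_i):
                for j in range(n_j):
                    for sample in samples:
                        yield ifd, i, j, sample


for entry in block_order(n_ifds=2, n_i=1, n_j=2):
    print(entry)
```
+
+Keeping the latitude and longitude blocks of a given location adjacent is what
+lets a reader fetch both with a single contiguous range request.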
+
+.. note::
+
+ TIFF has another mechanism to link IFDs, the SubIFD tag. This potentially
+ makes it possible to define a hierarchy of IFDs (similar to HDF5 groups). There is no
+ support for that in most TIFF-using software, notably GDAL, and no compelling
+ need to have a nested hierarchy, so a "flat" organization with the standard IFD chaining
+ mechanism is adopted.
+
+Examples of multi-grid dataset
+++++++++++++++++++++++++++++++
+
+https://github.com/rouault/sample_proj_gtiff_grids/blob/master/GDA94_GDA2020_conformal.tif has
+been converted from https://github.com/OSGeo/proj-datumgrid/blob/master/oceania/GDA94_GDA2020_conformal.gsb
+with https://github.com/rouault/sample_proj_gtiff_grids/blob/master/ntv2_to_gtiff.py
+
+It contains 5 subgrids. All essential metadata to list the subgrids and their
+georeferencing is contained within the first 3 KB of the file.
+
+The file size is 4.8 MB using DEFLATE compression and floating-point predictor.
+To be compared with the 83 MB of the original .gsb file.
+
+https://github.com/rouault/sample_proj_gtiff_grids/blob/master/ntv2_0.tif has
+been converted from https://github.com/OSGeo/proj-datumgrid/blob/master/north-america/ntv2_0.gsb
+
+It contains 114 subgrids. All essential metadata to list the subgrids and their
+georeferencing is contained within the first 40 KB of the file.
+
+
+Tooling
++++++++
+
+A script will be developed to accept a list of individual grids to combine
+together into a single file.
+
+A ntv2_to_gtiff.py convenience script will be created to convert NTv2 grids,
+including their subgrids, to the above
+described GeoTIFF layout.
+
+A validation Python script will be created to check that a file meets the above
+described requirements and recommendations.
+
+Build requirements
+++++++++++++++++++
+
+The minimum libtiff version will be 4.0 (RHEL 7 ships with libtiff 4.0.3).
+To be able to read grids stored on the CDN, libtiff will need to be built against
+zlib to have DEFLATE and LZW support, which is met by all known binary distributions
+of libtiff.
+
+The libtiff dependency can be disabled at build time, but this must be
+an explicit setting of configure/cmake as the resulting builds have less functionality.
+
+Dropping grid catalog functionality
+-------------------------------------------------------------------------------
+
+While digging through existing code, I more or less discovered that the PROJ
+code base has the concept of a grid catalog. This is a feature apparently triggered by
+using the +catalog=somefilename.csv in a PROJ string, where the CSV file lists
+grid names, their extent, priority and date. It seems to be an alternative to using
++nadgrids with multiple grids, with the extra ability to interpolate shift values between
+several grids if a +date parameter is provided and the grid catalog mentions a
+date for each grid.
+It was added in June 2012 per `commit fcb186942ec8532655ff6cf4cc990e5da669a3bc
+<https://github.com/OSGeo/PROJ/commit/fcb186942ec8532655ff6cf4cc990e5da669a3bc>`_
+
+This feature is likely unknown to most users as there is no known documentation for
+it (neither in current documentation, nor in `historic one <https://web.archive.org/web/20160601000000*/http://trac.osgeo.org/proj/wiki/GenParms>`_).
+It is not tested by PROJ tests either, so its working status is unknown. It would
+likely make implementation of this RFC easier if this was removed. This would result in
+completely dropping the gridcatalog.cpp and gc_reader.cpp files, their call sites
+and the catalog_name and datum_date parameters from the PJ structure.
+
+In case similar functionality would be needed, it might be later reintroduced
+as an extra mode of :ref:`hgridshift`, or using a dedicated transformation method,
+similarly to the :ref:`deformation` one,
+and possibly combining the several grids to interpolate among in the same file,
+with a date metadata item.
+
+Backward compatibility issues
+-------------------------------------------------------------------------------
+
+None anticipated, except the removal of the (presumably little used) grid catalog
+functionality.
+
+Potential future related work
+-----------------------------
+
+The foundations set in the definition of the GeoTIFF grid format should hopefully
+be reused to extend them to support deformation models (was initially discussed
+per https://github.com/OSGeo/PROJ/issues/1001).
+
+Definition of such an extension is out of scope of this RFC.
+
+Documentation
+-------------------------------------------------------------------------------
+
+- New API functions will be documented.
+- A dedicated documentation page will be created to explain the working of
+ network-based access.
+- A dedicated documentation page will be created to describe the GeoTIFF based
+ grid format. Mostly reusing above material.
+
+Testing
+-------------------------------------------------------------------------------
+
+A number of GeoTIFF formulations (tiled vs untiled, PlanarConfiguration Separate vs
+Contig, data types, scale+offset vs not, etc.) will be tested.
+
+For testing of network capabilities, a mix of real hits to the CDN and use of
+the alternate pluggable network interface to test edge cases will be used.
+
+Proposed implementation
+-------------------------------------------------------------------------------
+
+A proposed implementation is available at https://github.com/OSGeo/PROJ/pull/1817
+
+Tooling scripts are currently available at https://github.com/rouault/sample_proj_gtiff_grids/
+(will be ultimately stored in PROJ repository)
+
+Adoption status
+-------------------------------------------------------------------------------
+
+The RFC was adopted on 2020-01-10 with +1's from the following PSC members:
+
+* Kristian Evers
+* Even Rouault
+* Thomas Knudsen
+* Howard Butler
+* Kurt Schwehr