Image georectification and feature tracking toolbox : ImGRAFT

The use of time-lapse camera systems is becoming an increasingly popular method for data acquisition. The camera setup is often cost-effective and simple, allowing for a large amount of data to be accumulated over a variety of environments for relatively minimal effort. The acquired data can, with the correct post-processing, result in a wide range of useful quantitative and qualitative information in remote and dangerous areas. The post-processing requires a significant amount of steps to transform images into meaningful data for quantitative analysis, such as velocity fields. To the best of our knowledge at present a complete, openly available package that encompasses georeferencing, georectification and feature tracking of terrestrial, oblique images is still absent. This study presents a complete, yet adaptable, opensource package developed in MATLAB, that addresses and combines each of these post-processing steps into one complete suite in the form of an “Image GeoRectification and Feature Tracking” (ImGRAFT: http://imgraft.glaciology.net) toolbox. The toolbox can also independently produce other useful outputs, such as viewsheds, georectified and orthorectified images. ImGRAFT is primarily focused on terrestrial oblique images, for which there are currently limited post-processing options available. In this study, we illustrate ImGRAFT for glaciological applications on a small outlet glacier Engabreen, Norway.


Introduction
The use of terrestrial photography as a means of understanding spatio-temporal landscape evolution and change is not a new concept. It spans a vast range of disciplines in-cluding: disaster monitoring (Mulsow et al., 2013); glacier motion (Flotron, 1973;Harrison et al., 1986;Ahn and Box, 2010); mountain ecosystem understanding (Aschenwald et al., 2001); hydrological monitoring (Parajka et al., 2012;Danielson and Sharp, 2013); and snow monitoring (Smith Jr. et al., 2003;Corripio, 2004;Härer et al., 2013). It is a cheap, cost effective, simple method that allows the researcher to obtain a vast array of information about their study site. Today, more and more disciplines are discovering the immense power of terrestrial photography for both qualitative and quantitative applications due to the high repeat imaging capacity. The quantitative aspect relies heavily on the ability of the image to be georectified to a meaningful coordinate system. In order to achieve this, ground control points (GCPs) are needed and a good high resolution digital elevation model (DEM) often makes this process more successful. The conversion from image coordinates to realworld coordinates gives each image pixel a true estimate of the space they represent. In its simplest form, this might be the actual scale each pixel represents in metres. The more complex rectification includes a full registration of the image to an established coordinate system through georeferencing.
Examples of quantitative data are velocity fields of glaciers and other mass movement, such as a landslide or rock glacier (Kääb and Vollmer, 2000;Kääb, 2002;Debella-Gilo and Kääb, 2011). Here we shall focus on velocity measurements however, in addition to velocity, cameras provide other additional supporting information about the field site that is otherwise only obtained from prolonged field campaigns; for example, the exact timing of the first snowfall and at which elevation. This data can support and validate other records from the area, such as precipitation gauges. In some cases 24 A. Messerli and A. Grinsted: ImGRAFT: image georectification and feature tracking toolbox time-lapse imagery has been used to validate seismic data to detect large calving events at large outlet glaciers (Walter et al., 2010).
In this paper we present the new Image GeoRectification and Feature Tracking (ImGRAFT) toolbox. We perform a full georectification of the images using a newly developed processing chain. Once rectified, images are used to produce velocity fields at the glacier test site Engabreen. The ice displacement is determined using a cross-correlation feature tracking algorithm. Previous studies stretching back to the 1970s have also used time-lapse imagery as a means of monitoring glacial flow (e.g. Flotron, 1973;Harrison et al., 1986). These studies used various approaches to achieve the same result of obtaining ice flow estimates by tracking either existing features on the ice such as crevasses (Harrison et al., 1992;Evans, 2000;Ahn and Box, 2010) or specific targets placed on the glacier (Harrison et al., 1986). In our example, the "features" are automatically defined by the software as surface textures and patterns visible on the glacier surface, on which we then run the normalised cross-correlation algorithm. Here we present our method as an open source "toolbox" for georectification and feature tracking terrestrial images. Further full working examples, the source code and additional detailed information can be found in the examples section of the toolbox website at: http://imgraft.glaciology.net/documentation.

Motivation
The most successful existing software usually focus on either feature tracking or georectification (Corripio, 2004;Härer et al., 2013). To date the most commonly used publicly available feature tracking software are IMCORR (US National Snow and Ice Data Center (NSIDC), Boulder, CO), COSI-Corr (California Institute of Technology (CalTech), Pasadena, CA) and CIAS (Kääb, 2013). Both IMCORR and COSI-Corr, are optimised for use with aerial and satellite imagery, where the rectification process is fairly straight forward compared to that of an oblique terrestrial image. CIAS can be used for terrestrial imagery, however they still need to be rectified externally from the software.
Photogeoref, developed by Corripio (2004), and the more recent release of PRACTICE (Photo Rectification And Clas-sificaTIon SoftwarE) Härer et al. (2013), which is based on Photogeoref, focus mainly on the georectification of oblique images. During the testing stages using the aforementioned software we found difficulties with automation due to the use of separate existing georectification and feature tracking tools. Additionally, a workflow was needed that was able to deal with camera motion and lens distortion efficiently. Another concern was that traditional image registration as a preprocessing step can introduce loss of image quality and detail through resampling. As a result we were prompted to develop a new toolbox that met all the requirements, including the georectification and feature tracking processes all to be contained within one MATLAB package. Batch processing of the entire workflow is easily achieved through a case specific custom script, a feature not often available in other image feature tracking tools. The aim of the toolbox is to provide users with flexibility to adapt the code to suite their needs, using the demonstration and documentation online at http: //imgraft.glaciology.net/documentation as a basis to structure and implement the toolbox's functions.

Field setup and data
The test site for ImGRAFT is located at Engabreen in northern Norway (Fig. 1). Engabreen is a small Arctic valley glacier and outlet of the large Svartisen Ice Cap. Engabreen has a large icefall located at approximately 850 m a.s.l. An icefall is a steep area of the glacier where there is high ice flow and as a result extensional flow, leading to extensive development of large crevasses and unstable ice blocks known as séracs (Benn and Evans, 2010). In previous years, attempts have been made to instrument the icefall however, due to the nature of the moderate flow ( > 300 m yr −1 ) and the extensive crevassing the longevity of any instrument in this region is generally short-lived.
Our camera setup in the field consisted of one Canon Rebel T3 (1100D) single-lens reflex camera controlled by a Harbortronics DigiSnap intervalometer setup (https://www. harbortronics.com/Products/TimeLapsePackage/). The camera was programmed to take seven pictures per day at 3hourly intervals at the following times: 05:04; 08:04; 11:04; 14:04; 17:04; 20:04; and 23:04 CEST. During our 6-month monitoring period we experienced a drift in the intervalometer of approximately 5 sec over 6 months. We used a Sigma prime lens with a focal length of 30 mm. Our camera was mounted in a fibreglass, water-tight enclosure on a on a solid metal frame structure that was concreted into the ground. The camera system was powered by an 11.1 V 9000 mAh lithium polymer battery pack and supported by a 5 Watt solar panel. The camera was positioned on the eastern margin of the glacier, at about 770 m elevation on the valley side overlooking an icefall (Fig. 6a). The average height of the surface of the glacier measured was approximately 550 m a.s.l., so the camera was approximately 220 m above the average surface and 120 m above the highest point we measured. The look angle of the camera was approximately 13 • .
A key component of ImGRAFT and other georectification processes is the DEM. This is used in the georectification stage and therefore it is beneficial for it to be recent and of high resolution. Fortunately in our case, a high resolution DEM was produced from an airborne laser scan, which took place during the monitoring period on the 25 August 2013. This is extremely useful for georectifying the time-lapse images as both a DEM and at least one of the time-lapse images from the exact same time exist. Therefore, we have an absolute surface that we were able to use to rectify our images. In addition to this, a high resolution (10 cm) orthophoto of the entire area of the laser scan was taken from a camera mounted on board the plane. This combination of the DEM and orthophoto made the selection of additional GCPs easier and we were able to select many visible features in the camera field of view (FOV) that significantly aided the subsequent georectification of our images. We overlaid the orthophoto onto the DEM and picked out the features for our GCPs manually. These features included the entrance to a subterranean tunnel on the western valley side, spray painted boulders, other large distinct boulders and the edges of dominant, persistent snow patches in gulleys. In addition to the rock features, we were also able to use the crevasses as GCPs as a result of the exact overlap of the image and DEM acquisition. A small number of GCPs were measured using a global positioning system (GPS); these included the tunnel and the spray painted boulder.

Method
We present the method in separate sections to clearly distinguish between the processes contained within ImGRAFT (Sect. 4.3) and those not. In the first two sections (Sects. 4.1 and 4.2) we describe the image preparation and DEM preparation that are required but not directly contained within the ImGRAFT toolbox. The DEM preparation stage can be written into to the ImGRAFT processing chain, as is shown in the demonstration scripts in the documentation on the ImGRAFT website (http://imgraft.glaciology.net/ documentation).

Image preparation
Firstly we converted all collected images from RAW to tiff format using dcraw (Coffin, 2009). We chose a linear gamma curve in the conversion to preserve the dynamic range of the bright ice. We manually inspected the images to determine if they were suitable for feature tracking purposes or not, and removed all images that were deemed to be unusable. These included images that were taken at night and those where either all or a significant portion of the glacier were obscured by cloud or fog. Images where there appeared to be heavy rainfall were also removed as the raindrops themselves lead to extra distortion as they settle on the camera window. To allow for a better comparison, we also selected image pairs at the same time of day, as the sun illumination was more consistent over the glacier and valley. This also means that the shadowing around large crevasse features is similar between each image (Ahn and Box, 2010). The images that were collected in early spring when snowfall was still regular presented another problem: a lack of rock features for detecting camera motion; a lack of distinct glacier features (as they are covered by snow); and finally a rapidly changing surface. Lastly, the surface features change rapidly from one day to the next, either through new snowfall or as we saw later on, rapid melting.

DEM preparation
The DEM is a fundamental input in our method (Fig. 3) and needs to be prepared correctly for our purpose. We initially used a 1 m high resolution DEM, however on inspection of the results we identified some complications. The complications arose as large persistent features such as crevasses and séracs moved down the glacier over time. As a result it implies that after a day the DEM that encompasses the high resolution crevasses detail no longer represents the ice in the images. This is because the high peaks and low troughs of these crevasses are now downstream. For example, points that correspond to a peak of a crevasse in the image could be a trough in the DEM and vice versa, due to the motion of the glacier. To avoid this problem, we decided to "fill" our crevasses in the DEM, to ensure it corresponds to the visual surface seen by the feature tracking algorithm (templatematch). This removes the large local variability in the glacier surface caused by these crevasses and séracs to achieve a smoother surface. A snapshot of our DEM "filling" can be seen in Fig. 2, where the following described surfaces are represented by the three lines.
There are many technical solutions for how you might fill crevasses in the DEM. Generally these methods need the specification of a vertical scale to define what is considered a crevasse and a horizontal scale to define the spatial extent of the crevasses to be bridged. Here, we outline a computationally efficient approach, which uses image filtering techniques. First we smooth the DEM (z) with a Gaussian spa- Figure 2. A zoomed-in snapshot of a slice through the DEM. This highlights how the DEM has been "filled". The green line shows the high frequency surface structure from crevasses, which are removed in the final "filled" surface (blue). The red line shows the first stage of the smoothing, which results in a lower surface. tial filter (S g ), which results in a DEM that lies lower than the original (red line in Fig. 2). The deviation between the smoothed DEM and the original surface (green line in Fig. 2) is passed through a non-linear function (exp) and spatially smoothed with a disk filter (S d ). This strongly weighs positive deviations from the smoothed DEM, i.e. crevasse tops. The final DEM (z filled ) is the smoothed DEM plus the upper surface DEM (blue line in Fig. 2): where a is the weighting constant. We apply a final smoothing (S g ) to reduce the stepped nature of z filled . The characteristic length scale of the smoothing operators has been chosen to bridge the largest crevasses, and the weighting constant a has been chosen to be the same order of magnitude as the standard deviation between z and S g .
We are fortunate to have obtained a DEM on the 25 August 2013 between 10:20 and 11:13 CEST, and an image was taken on the same day at 11:04 CEST. We are therefore confident that as a result of the simultaneous image and DEM acquisition, our rectification is accurate particularly with regards to the stable rock regions within the image. As we are monitoring a dynamic surface (ice), in all other images the ice surface has changed in relation to the DEM and therefore must have a lowering/raising function applied to it to correct for the glacier surface evolution. This is due to the alteration of the ice surface as a result of melting/snowfall. At Engabreen we experience a significant surface lowering on the order of 10 m on the lower tongue during a single melt season. In this example, we derive the elevation change factor for the ice from direct mass balance measurements taken at the glacier at monthly intervals throughout the operational period of the camera in 2013. In cases where such observations are not possible, then other methods of estimating surface lowering should be investigated. An estimation of the surface elevation change will aid in the correction of the DEM.

ImGRAFT: processing
Here we present the major steps in the processing chain as illustrated in Fig. 3. All of the specific ImGRAFT terminology is listed and defined in Table 1 and Fig. 3, and is subsequently written in italics in the text. The following sections focus in more detail on the unique ImGRAFT features and provide an overview of the standard processes. Further information about the practical aspects of ImGRAFT along with a full working example can be found on the toolbox website (http://imgraft.glaciology.net/). This includes the link to the source code, documentation and further examples.

Camera motion and model camera determination
As for any time-lapse camera setup, minimising movement of the camera is vital. Even though the camera was mounted on a solid structure it is almost impossible to avoid some form of camera motion. This can be due to strong winds, thermal expansion of the camera enclosure or of the mounting platform and human interference, for example when changing SD memory cards. This motion introduces errors and need to be corrected for, as does the distortion around the edge of the image as a result of the curvature of the lens. In order to account for any motion and distortion we generate a model camera for each corresponding image. For clarity, we stress that the model camera contains all of the physical camera information from the actual time-lapse camera, plus  any optimised camera parameters (e.g. rotations and changes in view direction), that differ from that of the model master camera. Each image taken from the time-lapse camera has an associated model camera containing this updated information. We determine the camera view parameters for a master image (see Table 1) from GCPs to generate the model master camera (Fig. 3a). The view direction, focal lengths, and lens distortion model are optimised to minimise the square projection error of the GCPs using a modified Levenberg-Marquardt algorithm (Fletcher, 1971) in the form of optimizecam. The model camera formulation has a close correspondence to that of OpenCV (Bradski, 2000) which is loosely based on Claus and Fitzgibbon (2005). We reference all other model cameras (Fig. 3b) to the model master camera. Due to the inclusion of a full distortion model, this opens up the opportunity to use lower quality cameras with higher distortion.  Fig. 3 and frequently referred to in the main body of the text. The first column presents the variable or function name, the second column indicates if it is a variable or a function and the third column provides a short description of the associated term.

Image (V)
A single terrestrial photo taken from the time-lapse camera.
Image pair (V) Two images consisting of image A and image B. Note that image A and image B are not fixed.
Model master camera (V) The camera coordinates and specifications that are measured in the field, optimised within the ImGRAFT processing and relate best to the GCPs.
Master image (V) Fully rectified terrestrial image used to rectify subsequent images to. This is the image that corresponds to the master camera.
Model camera (V) A set of parameters describing the view for the associated image. There is a model camera for every image in the processing. This model camera is an updated camera view direction that includes any motion observed in the camera, in terms of yaw, pitch and roll. A full rotational model is applied to fully capture any motion in the camera caused by movement of the camera housing, such as from wind. Note that the model cameras are optimised camera view directions relative to the master camera. These are not physical cameras, just updated view parameters including any camera shift.

GCPs (V)
Ground control points (GCPs) are used to determine the view of the model master camera/image. They consist of features that are clearly identifiable in the images that have a known coordinate, such as a stable prominent boulder or a static feature such as a tunnel entrance or a distinct mountain peak. Where possible it is an advantage to have GCPs spread evenly across the image.

Template (V)
The small extract taken from Image A that contains the surface textures and features we wish to locate in Image B (see Fig. 4 for a schematic example). The template is centred around the grid points we define. An example of the grid with the associated points can be seen in Fig. 5.

Search region (V)
This is a constrained area in Image B where the templatematch algorithm searches for the best match of the template.

Point (V)
Each point is defined as a 2-D and 3-D centre coordinates around which the template is defined. We choose these points in map coordinates to generate a regular static grid, to ease comparison between velocity fields.
Templatematch (F) This feature tracking algorithm uses a correlation based matching algorithm such as NCC to find the highest correlation of the template from image A within the search region in image B. The coordinate that is registered as the displacement between image A and image B is the centre of the template location that is found to have the highest correlation in image B. Note that templatematch algorithm is used on both the stable features in the image in templatematch rock and also on the dynamic features in the image, such as the ice surface.
Optimizecam (F) Function used to minimise the misfit between the 3-D and 2-D coordinates in the model camera determination and optimise camera view parameters.
Inverseproject (F) Inverse projection function that projects the image coordinates to world coordinates (2-D to 3-D).
We determine the model cameras of all the other images by determining the camera motion relative to the model master camera/image. As the physical time-lapse camera is mounted on a solid, stationary platform all relative motion is in the form of rotation. Therefore we only optimise the rotational parameters in Fig. 3b, when generating the subse-quent model cameras. Typically camera motion is accounted for in a pre-processing step, where all the images are coregistered. This can result in compromising the image quality that can occur from cropping, rotating and re-saving. As a result our approach to dealing with camera motion differs, as we account for camera motion in the projection calcu-  Figure 4. Schematic diagram to show the different components of the templatematch process. Note that the template and search region are not to scale. A unique template will be extracted from image A for every point defined in the grid (see Fig. 5). The user defines the size of both the template and the search region (Table 1).

Figure 5.
Screen shot of the templatematch stage of the processing chain. The regularly spaced grid can be seen in left hand image and the corresponding "tracked" points can be seen in the right hand image. The marker colour corresponds to the signal minus the noise, yellow indicating a good match between image pair.
lations. Camera motion, as discussed previously, is caused by a variety of uncontrollable factors that subsequently lead to an offset between image pairs. The main motion experienced is the rotation about the vertical (yaw), as the camera was mounted on a round pole. However, we experience rotation about all three axes (compass direction/yaw, inclination/pitch, horizon-tilt/roll). We firstly need to determine the offset. This is done by tracking stable rock features on both the near and far valley sides. In order to determine this motion we use the templatematch function in ImGRAFT.
Templatematch uses a standard normalised cross-correlation (NCC) algorithm (Heid and Kääb, 2012), to match the small subset templates defined about points on a grid in image A (see Figs. 4 and 5). The user defines the size of the template in order to capture some surface texture and pattern, examples of which can found in Fig. 4. Contrary to some other feature tracking tools, we do not define individual features such as a distinct crevasse or boulder, but rather a small area automatically selected based on the template size criteria. We subsequently search for the same texture and pattern contained in the template in the second image in the image pair, image B. In order to reduce the computation time of searching for the best correlation in image B, we define a constrained area, the search region, within which the template is searched for. The search region must always be bigger than the template, and the location of the search region is based around the original location of the selected points. The recorded location of the template displacement is defined as the point of highest correlation between the input template and the search region. We use this NCC based templatem-atch function to both track motion in the bedrock (used to determine camera motion), as well as for determining displacement of the deformational surface, e.g. glacier motion (see Sect. 4.3.2). By running templatematch on the stable bedrock we are able to determine the amount and type of motion experienced in the camera. We then use this information to "update" the view direction of the model camera.
In practice, we project the master image pixel coordinates to the new model camera and optimise the camera view using optimizecam, by minimising the reprojection error (see http://imgraft.glaciology.net/). Instead of correcting the image, we correct the camera orientation based on the offset. Our approach is advantageous since we only save a text file of model camera parameters, rather than a large corrected image.

Template matching
This stage of the method refers directly to the templatematch function of ImGRAFT, which has been discussed in the previous section in relation to bedrock motion. Here we apply the tempaltematch to the moving ice surface (Fig. 3c). ImGRAFT tracks features between image pairs, image A and image B (Figs. 3 and 5) by a process called templatematch (see Sect. 4.3.1 and Table 1). Image A refers to the template image and image B refers to the search image, which together form an image pair. Image pairs are any combination of images from the data set, where generally image A is the first image in time and image B is the later image. The optimal time interval between the image pairs varies depending on how much motion is expected and what the resolution the image is. In our case, an interval of 1 week is a good balance between expected motion (approximately 5-7 m total), the resolution of the camera and limited change in the appearance of surface texture between the image acquisitions. ImGRAFT uses the NCC algorithm as a measure of template similarity, which generally performs well (Heid and Kääb, 2012) but other measures such as phase correlation and optical flow analysis have been suggested in the literature (Ahn and Box, 2010;Ahn and Howat, 2011;Heid and Kääb, 2012;Vogel et al., 2012). We hope to include more templatematch methods in future versions of ImGRAFT.
The NCC method is sensitive to changing light conditions around the feature and to reduce this effect we only choose image pairs where illumination is similar, i.e. we only use image pairs taken at the same time of day. One way to reduce false matches is by reducing the size of the search region and centring it on a prior estimate of the location in the second image in the image pair, image B (Fig. 3). In our case the prior guess is based on the centre coordinate of the template in image A, reprojected to the view from the model camera from image B. This prior guess accounts for camera motion, but as mentioned previously, it could be improved with a background ice flow estimate. We obtain subpixel displacements by bi-cubic intensity interpolation as used in Debella-Gilo and Kääb (2011), followed by local weighing of the NCC peak (3-by-3 pixel) neighbourhood.
It is common to track points on a regular grid based on pixel coordinates. However, due to the geometry of the glacier this corresponds to an irregular grid in 3-D space, often characterised by gaps in the velocity field. This is because the grid is not fixed in space but rather image specific as a result of camera motion. Instead we use a static, regular (25 m) geographic grid (Fig. 5) to track our templates on the ice, thereby consistently tracking the same coordinate in each image pair, rather than tracking a feature through time. This allows for a better comparison between velocity fields from different time periods.

Georectification and displacement
Oblique imagery lacks crucial spatial information needed to extract useful quantitative distance (dimension) information as the image is a 2-D representation of a 3-D landscape (Corripio, 2004;Härer et al., 2013). Georectification is the process whereby we assign a 3-D real world coordinate to the corresponding pixel in the 2-D image (Fig. 3d). The model camera directly allows us to calculate the 2-D pixel coordinates of any 3-D world coordinates. However, we are interested in the 3-D coordinates of features in the image, i.e. we want to make an "inverse" projection using inverseproject, from 2-D pixel coordinates to 3-D coordinates, constrained to the visible part of the DEM. In principle the inverseproject function performs a form of ray tracing to generate the 3-D coordinates of the 2-D points using the two model cameras from each of the images in the image pairs and the DEM. In ImGRAFT, we project the visible part of the DEM to 2-D camera view coordinates, and use standard interpolation techniques to obtain 3-D coordinates for any image pixel. The georectification process is carried out on the offset data for all image pair combinations in the time-lapse time series. The actual displacement is calculated as the difference between the 3-D points between image A and image B. It is this displacement that is used in the following velocity calculation.

Velocity calculation
The feature tracking returns the pixel coordinates of the feature in the image pair ( image A and image B). The corresponding 3-D world coordinates are obtained using the inverse projection of the model cameras (Fig. 3d). The velocities can then be calculated from the change in geographic location and dividing the displacement by the time interval between the images in the image pair.
Having very oblique angles from the camera to the measurement surface can amplify errors along the look vector. In such cases it can be beneficial to only look at the velocity  Table 2. (b) Example of the raw velocity field produced as output from the ImGRAFT toolbox. (c) Standard deviation plot of velocity fields produced from all the image pairs in Table 2. (d) Median velocity plot for all the image pairs presented in Table 2. Note that we use the median over the mean as it is robust to outliers. component along the flow direction (i.e. local slope direction or along the centreline). We do not show our data projected in the downslope projection as due to the complex nature of the ice flow in our study site, we risk losing valuable flow information contained within the full velocity component.
In the example presented in the study, we display the unfiltered data points in Fig. 6. Some filters can be applied to the velocity fields, including correlation coefficient (CC) thresholds and signal to noise ratio (SNR) threshold. However, we experienced that this removed more positive matches than mismatches, therefore we decided to carry out our calculations on unfiltered data. It is possible to test varying CC and SNR filter thresholds, for example a strict CC threshold set at 0.9, where any matched points with a correlation peak below 0.9 would be removed, thereby leaving only the matches with the highest correlation according to the NCC algorithm used in the templatematch (Fig. 3d).

Quality control
Subsets of the velocity field were extracted from regions of slow, high and medium flow. The slow flow regions around the margins was one obvious quality control area where one expects the highest drag from the valley sides. We assume that small errors will propagate to areas furthest away from the camera, and therefore we begin our quality control checks here. We inspect the velocity fields to identify the velocity profile across the glacier. If the margin areas indicate high velocities, similar to those in the centre of the glacier, it is indicative of a potential error in our processing chain of that image pair; for example, if one of the model camera view direction parameters is incorrect. In order to correct for this, the processing highlighted in the second processing stage in (Fig. 3b) must be re-run to reproduce the offset between stable features. This can then be used to recalculate the model camera. If this does not solve the anomalous velocity pattern 32 A. Messerli and A. Grinsted: ImGRAFT: image georectification and feature tracking toolbox then there may be a problem with the image itself. Light fog and rain can cause problems in the feature tracking of the features in the image leading to mismatches. If this problem persists after re-running the processing, the velocity field is removed.

Error estimation
There are multiple sources of errors that propagate through the processing chain and end up in the final 3-D positions and velocity estimates. Errors in the GCPs will result in errors in the model master camera and consequently all other model cameras referenced to it. Uncertainties in the templatematch of the stable rock features will propagate to the model cameras. Uncertainties in the pixel coordinates of features, model cameras, and the DEM will as a result propagate through the inverse projections to the estimate of the 3-D position. All of these uncertainties can be accounted for by Monte Carlo sampling of the uncertainties provided the uncertainties in the GCPs, DEM, and the uncertainty of the feature tracking. This can be incorporated into Im-GRAFT processing chain, and an example is given in the frequently asked question section of the ImGRAFT website (http://imgraft.glaciology.net/documentation). A related, more approximate method is to use an error budget approach. In our case we have a near perfect DEM and a strongly constrained viewing geometry, and the errors in our estimates are dominated by errors in feature tracking, and a simpler more empirical approach can be used. In our Engabreen example, we estimate the uncertainty from the sample variance between independent estimates of the velocity. We expect this estimate to be greater than the actual uncertainty as it also includes the variance due to real velocity variations between samples. When we estimate the error variance, it is therefore desirable to choose a set of independent velocity samples with little velocity variability. This is accomplished by choosing independent image pairs with a high degree of temporal overlap. Ideally all the samples used in the calculation should span an equal temporal range, however here we estimate the error from image pairs spanning 5 to 10 days (Table 2). A more in-depth analysis of the actual error estimates is presented in the results (Sect. 6). Error estimation is not directly incorporated into the toolbox, instead we present our case specific method outlined above as a suggestion for similar cases. Error estimations for other studies should be addressed on a case-by-case basis and the most appropriate method of error estimation should be applied. As ImGRAFT is open source, it allows users to incorporate the most appropriate error estimation method for their data set.

Results
ImGRAFT produces consistent velocity fields over the mid icefall section of Engabreen. They match the expected flow pattern of a small alpine glacier, for which we expect slow flow at the margins where there is highest friction along the valley walls and fastest flow in the centre of the glacier. Im-GRAFT is able to capture the specific velocity pattern at Engabreen, which include the extensional and compressional flow as the ice flows in, through and out of the icefalls. The icefalls can be clearly seen on the image in Fig. 6a, where there are large crevasse fields indicate the icefalls. The uppermost icefall ends as the camera viewshed begins, the flow then enters an overdeepening in the upper-centre of the viewshed, before finally entering the lower icefall in the centre of the viewshed before ending in a compressional zone at the base of the icefall. Subset 1 in Fig. 6 is located at the base of the lower icefall. The high flow can be separated into two distinct areas; firstly the edge of the upper icefall flowing into an overdeepening, and secondly the ice flowing out of the overdeepening into the lower icefall as demonstrated by the yellow/orange areas in Fig. 6b. In these areas the ice experiences extensional flow and in the examples in Fig. 6 the glacier achieved speeds of between 60 and 80 cm day −1 , which contrasts to the compressional flow experienced in the overdeepening and the base of the icefall where speeds are typically around 50 cm day −1 (see Table 2). The velocity fields produced using ImGRAFT match well with the velocity pattern observed in two SAR velocity fields (Unpublished data, Schellenberger, 2014) covering the same area. Similar magnitudes of velocity averaging around 50 cm day −1 in the same viewshed area are observed. As well as comparing to the SAR maps, we also improve the existing surface velocity estimates from Engabreen (Jackson et al., 2005). We significantly improve the temporal coverage due to the high number of images acquired each day. We achieve a dense velocity field at Engabreen, albeit over a smaller area than Jackson et al. (2005). The Jackson et al. (2005) study uses IMCORR software, to feature track two orthorectified aerial images from 2002. They found that the central part of the glacier was moving slower than the margins. Our results and preliminary SAR data indicate that this is likely an artefact of the processing due to the long time interval between image acquisitions (> than 20 days). This long interval between images results in features deforming beyond recognition and are thus no longer feasibly trackable.
We estimate the associated error in the velocities in Table 2, following the approach outlined in Sect. 5. We take the standard deviation of all the velocity estimates presented in Table 2, which results in a conservative estimate of the uncertainty for the image pair spanning the longest time interval (ID: 1191x and 1292y). We assume that the error in the displacement is constant and the error in the velocity fields scales directly with the time interval, which we use to estimate the uncertainty for the shorter time intervals. Table 2 summarises all of the image pairs used in the error analysis. All image pair IDs that have the suffix x consist of pairs taken at 11:04 CEST, and all IDs that have the suffix y are image pairs taken at 14:04 CEST. These periods are chosen as they both have high quality images available with similar lighting both at 11:04 and at 14:04 CEST. We selected a small area (black box, Fig. 6) for more detailed analysis. This area was chosen as it appears not to be affected substantially by changes in illumination. Therefore, we attribute any existing error in this region to the templatematch function. This is indicated by the low standard deviation of velocity estimates here (Fig. 6c). Here we compare the offset time period velocity calculations for all image pairs presented in Fig. 2. Figure 6d is the median plot for all the ten time periods together. The final column in Table 2 shows the overall spread in medians between all the time periods. When we only compare the same hourly periods in Table 2, then the range is lower at only 5 cm day −1 . This is our simple error estimation of the velocity calculation from our feature tracking processing chain. The average velocity for the whole feature tracked area is approximately 60 cm day −1 therefore our er-ror estimate of 5 cm day −1 equates to a rough error estimate of ±8 % error. This error estimate is based on unfiltered data where known mismatched points are included. It is expected that the error will reduce if known mismatches are removed through filtering as mentioned in the methods section.
Further results using ImGRAFT to produce velocity fields over Greenland can be found in the following study (Messerli et al, 2014).

Conclusions
We present a flexible, open source Image GeoRectification And Feature Tracking toolbox (ImGRAFT), We apply it to our test case, Engabreen, an outlet glacier in Arctic Norway. ImGRAFT incorporates all the processing steps needed to transform monoscopic, terrestrial, oblique images into a velocity field. ImGRAFT assimilates the rectification of the images and subsequent feature tracking into one toolbox. These features are advantageous, as current existing software tend to address only one of these processes, and thereby require numerous software to complete the entire processing. Furthermore, the source code is freely available and adaptable, allowing the user to tailor the toolbox to meet their specific processing needs, whilst all being contained within the MAT-LAB environment. Additional benefits are the inclusion of a full distortion model, which opens up the possibility to use images taken on lower quality cameras with lower quality lenses. This significantly increases the diversity of the toolbox as it accommodates a wide range of image sources and possibilities for feature tracking. We offer suggestions of how to prepare the images and DEM correctly for input to ImGRAFT and additionally provide a comprehensive online documentation (http://imgraft.glaciology.net/) and demonstrations. We provide some guidelines for error assessment within the context of our Engabreen example, in which we propose an empirical approach for error assessment that incorporates the accumulated errors throughout the processing chain. Our continuously updated online documentation offers users further pre/post-processing tips and other example cases. Our aim is to allow for further algorithm development and improvement through our own efforts and those within the user community. ImGRAFT provides a flexible, adaptable tool to process large volumes of imagery with a high degree of automation, in order to obtain quantitative data in the form of displacement. It produces consistent, velocity fields that require minimal post-processing and filtering. ImGRAFT has been developed with a focus on glaciological applications, and in this paper we only consider terrestrial based imagery. However, it has also been tested on a variety of satellite data, with encouraging results (e.g. using Landsat-8, Messerli et al, 2014). Applications to other mass movement environments is achievable with slight modifications of the processing chain.