Interoperability with the Earth Science Remote Access Tool (ESRAT)

Robert Raskin and Elaine Dobinson
Physical Oceanography Distributed Active Archive Center (DAAC)
Jet Propulsion Laboratory
Pasadena, CA 91109

The Earth Science Remote Access Tool (ESRAT) is an HTTP-based client-server application that facilitates Internet access to distributed Earth Science raster data. ESRAT addresses the variety of data formats used in the Earth Sciences by providing a common data model, with translators to and from the data models inherent in many standard formats. It also enables applications that normally read local data in one particular format to access remote datasets in that format or in any other supported format.

The client side of ESRAT features a Java applet GUI that allows the user to specify spatial/temporal regions of interest. A query to the server returns a summary of datasets satisfying the search criteria, displayed visually as layers that can be interactively selected. Selected data subsets are retrieved by the server and loaded directly into an application package (MATLAB, in this case) by invoking a helper application in a web browser. Spatial/temporal subsetting is carried out on the server side prior to data transmission, reducing bandwidth requirements. Remote data from multiple sources and in multiple formats can be combined readily into a single MATLAB script.
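The bandwidth saving from server-side subsetting can be illustrated with a minimal sketch. It assumes a regular latitude/longitude grid held in memory; the `Grid` structure and the `subset` function are hypothetical names chosen for illustration and are not part of ESRAT or DODS.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Grid:
    """A regular lat/lon grid (hypothetical structure, for illustration only)."""
    lats: List[float]           # row coordinates
    lons: List[float]           # column coordinates
    values: List[List[float]]   # values[i][j] belongs to (lats[i], lons[j])


def subset(grid: Grid, lat_min: float, lat_max: float,
           lon_min: float, lon_max: float) -> Grid:
    """Return only the cells whose centers fall inside the bounding box."""
    rows = [i for i, lat in enumerate(grid.lats) if lat_min <= lat <= lat_max]
    cols = [j for j, lon in enumerate(grid.lons) if lon_min <= lon <= lon_max]
    return Grid(
        lats=[grid.lats[i] for i in rows],
        lons=[grid.lons[j] for j in cols],
        values=[[grid.values[i][j] for j in cols] for i in rows],
    )


# A 1-degree global grid: 180 x 360 = 64800 cells.
lats = [-89.5 + i for i in range(180)]
lons = [-179.5 + j for j in range(360)]
world = Grid(lats, lons, [[0.0] * 360 for _ in range(180)])

# A 10 x 10 degree region of interest keeps 100 cells.
region = subset(world, 30.0, 40.0, -130.0, -120.0)
print(len(region.lats), len(region.lons))  # 10 10
```

In this example only 100 of the 64800 cells would need to be transmitted. The sketch ignores time and regions that cross the date line.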

The server side includes master directory, dataset catalog, and data access servers, implemented as C++ CGI programs. The master directory contains a list of dataset holdings at local or remote sites and their associated URLs. The dataset catalogs contain the spatial/temporal bounds of each data subset; the swath, grid, and point-network data models are currently supported. For swath data, each cross-track is individually cataloged in space and time. When a user requests a spatial/temporal subset of swath data, the server concatenates any contiguous cross-tracks satisfying the selection criteria and returns a swath polygon to the client.
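The cross-track selection described above might be sketched as follows. This is an illustration under stated assumptions, not the ESRAT implementation: the `CrossTrack` record, its fields, and the function names are hypothetical, each cross-track is reduced to a bounding box, and the sketch returns runs of cross-track indices rather than building the swath polygon itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CrossTrack:
    """Catalog entry for one scan line (hypothetical fields)."""
    index: int      # position of the scan line along the orbit track
    time: float     # acquisition time, e.g. seconds since an epoch
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float


def matches(ct: CrossTrack, t0: float, t1: float,
            lat0: float, lat1: float, lon0: float, lon1: float) -> bool:
    """True if the entry lies in the time window and overlaps the box."""
    return (t0 <= ct.time <= t1
            and ct.lat_min <= lat1 and ct.lat_max >= lat0
            and ct.lon_min <= lon1 and ct.lon_max >= lon0)


def contiguous_runs(catalog: List[CrossTrack], t0: float, t1: float,
                    lat0: float, lat1: float,
                    lon0: float, lon1: float) -> List[Tuple[int, int]]:
    """Group matching cross-tracks into (first_index, last_index) runs."""
    runs: List[Tuple[int, int]] = []
    for ct in sorted(catalog, key=lambda c: c.index):
        if not matches(ct, t0, t1, lat0, lat1, lon0, lon1):
            continue
        if runs and ct.index == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], ct.index)   # extend the current run
        else:
            runs.append((ct.index, ct.index))    # start a new run
    return runs


# Ten scan lines, each covering one degree of latitude.
catalog = [CrossTrack(i, float(i), float(i), float(i + 1), 0.0, 10.0)
           for i in range(10)]
print(contiguous_runs(catalog, 0.0, 9.0, 2.5, 4.5, 0.0, 10.0))  # [(2, 4)]
```

Each run corresponds to one contiguous piece of swath. A region crossed on two separate passes yields two runs, and therefore two polygons.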

ESRAT is currently built on the Distributed Oceanographic Data System (DODS) developed at the University of Rhode Island and the Massachusetts Institute of Technology [1]. DODS provides the capability to:

To add a new supported format, translation between that format's data model and the DODS data model must be provided. One of the lessons learned in the development of ESRAT was that 100% interoperability among all formats is difficult to achieve. However, partial interoperability is sufficient for most Earth Science applications, because certain translation combinations are unlikely to be needed in practice.
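The role of an intermediate data model can be sketched with a small translator registry. Everything here is an assumption made for illustration: the two toy formats (`csvline` and `json`), the dictionary used as the common model, and all function names are hypothetical and unrelated to the actual DODS data model. With a common model, each format needs at most one reader and one writer instead of a translator for every pair of formats, and omitting a writer gives the kind of partial interoperability described above.

```python
import json
from typing import Callable, Dict

# Common (intermediate) model for this sketch: {"name": str, "values": [float]}
Common = Dict[str, object]

readers: Dict[str, Callable[[str], Common]] = {}
writers: Dict[str, Callable[[Common], str]] = {}


def read_csv_line(text: str) -> Common:
    """Toy format: 'name,v1,v2,...' on a single line."""
    name, *vals = text.strip().split(",")
    return {"name": name, "values": [float(v) for v in vals]}


def write_csv_line(data: Common) -> str:
    return ",".join([str(data["name"])] + [repr(v) for v in data["values"]])


def read_json(text: str) -> Common:
    obj = json.loads(text)
    return {"name": obj["name"], "values": [float(v) for v in obj["values"]]}


readers["csvline"] = read_csv_line
writers["csvline"] = write_csv_line
readers["json"] = read_json
# No JSON writer is registered: translation *into* JSON is unsupported,
# which models partial interoperability.


def translate(text: str, src: str, dst: str) -> str:
    """Translate via the common model: src reader, then dst writer."""
    if src not in readers:
        raise KeyError(f"no reader for format {src!r}")
    if dst not in writers:
        raise KeyError(f"no writer for format {dst!r}")
    return writers[dst](readers[src](text))


print(translate('{"name": "sst", "values": [1, 2.5]}', "json", "csvline"))
# sst,1.0,2.5
```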

In the near future, we expect to convert the entire tool to Java. Java provides a high level of interoperability because both its bytecode and its internal representation of numbers are platform independent. This feature permits Java's data model (in terms of arrays, classes, and streams), when enhanced with class libraries developed for spatial data types, to be used as the intermediate data model. This approach will increase the number of supported clients and servers by building on the work of software vendors developing Java translators and interfaces. Java has the further advantage of allowing users to package executable code as metadata that accompanies the dataset. The code might be used to properly geolocate, subset, interpret, or analyze the data.

We also plan to convert our catalog databases to a commercial object-relational database management system (ORDBMS) with spatial data type extensions. This will provide greater GIS functionality, including better support for spatial operations on swaths. A global quadtree representation of the catalog entries is being designed to keep global searches efficient when the system is expanded to include large volumes of data.
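One common way to realize a global quadtree is to give each cell a key with one digit per level of subdivision. The sketch below assumes that scheme; the source does not say which design is planned. For simplicity it also assumes that each catalog entry is represented by a single point rather than by its full spatial bounds. The names `quadkey` and `QuadIndex` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def quadkey(lon: float, lat: float, depth: int) -> str:
    """Key of the quadtree cell containing (lon, lat), one digit per level.

    Digit = 1 if the point is in the eastern half, plus 2 if it is in the
    northern half, of the current cell.
    """
    west, east, south, north = -180.0, 180.0, -90.0, 90.0
    digits: List[str] = []
    for _ in range(depth):
        mid_lon = (west + east) / 2
        mid_lat = (south + north) / 2
        digit = 0
        if lon >= mid_lon:
            digit += 1
            west = mid_lon
        else:
            east = mid_lon
        if lat >= mid_lat:
            digit += 2
            south = mid_lat
        else:
            north = mid_lat
        digits.append(str(digit))
    return "".join(digits)


class QuadIndex:
    """Catalog entries grouped by the key of their finest-level cell."""

    def __init__(self, depth: int = 6) -> None:
        self.depth = depth
        self.cells: Dict[str, List[str]] = defaultdict(list)

    def add(self, entry_id: str, lon: float, lat: float) -> None:
        self.cells[quadkey(lon, lat, self.depth)].append(entry_id)

    def in_cell(self, prefix: str) -> List[str]:
        """Entries inside the (coarser) cell named by a key prefix.

        This sketch scans every stored key; a sorted key list or a
        database index would turn the prefix test into a range query.
        """
        return sorted(e for key, ids in self.cells.items()
                      if key.startswith(prefix) for e in ids)


index = QuadIndex(depth=6)
index.add("a", -118.1, 34.2)
index.add("b", -117.0, 33.0)
index.add("c", 10.0, 50.0)
print(quadkey(-118.1, 34.2, 3))  # 203
print(index.in_cell("203"))      # ['a', 'b']
```

Because every cell's key begins with the key of its parent, a coarse search region maps to a short prefix and a fine one to a long prefix, so one index can serve searches at any scale.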

This ongoing work is funded by the EOSDIS Prototype Office.

Reference:
[1] Gallagher, J. and G. Milkowski, "Data Transport Within the Distributed Oceanographic Data System", 4th International World Wide Web Conference Proceedings, 691-700, 1995.