The Data Locator project is a search engine for finding GRIB and NetCDF meteorological data here at ESRL / GSD. For example, when a FIM model run generates hundreds of megabytes of FIM data, that data and its metadata are registered in our THREDDS servers and our MySQL database, making them searchable via a web client or web service. Here is what the data flow looks like:
Data Flow Diagram

The Data Locator software consists of three main software components.
- THREDDS Spider - this Java program opens an HTTP connection to a THREDDS catalog and walks through the catalog XML, extracting relevant metadata and inserting it into the MySQL database.
- Data Locator Web Service - this Java web service accepts calls from client programs (or other web services) and searches the metadata in the MySQL database for matches. It then constructs a list of calls to a WCS that will return the requested data, and passes this list of calls (WCS URLs) back to the calling program. For details on how to invoke this web service, look here.
- Data Locator Web Client - this HTML and JSP based web application provides a web form for specifying search criteria (e.g. the longitude-latitude bounding box, the catalog(s) to search, the date(s) and time(s)), and then invokes the Data Locator Web Service to find GRIB/NetCDF meteorological datasets that match the search criteria. It displays the matching datasets as links on the web page. Users can download the subsetted data as a NetCDF file, or view it on a map in the web page or in Google Earth. To see this web client in action, look here.
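The catalog walk performed by the THREDDS Spider can be illustrated with a minimal sketch. It is not the project's actual code: it uses the JDK's built-in DOM parser rather than JDOM so that it is self-contained, it reads the catalog from an in-memory string instead of an HTTP connection, and it only collects the `name` and `urlPath` attributes of `dataset` elements. The class name, method names, and sample catalog are hypothetical, and the MySQL insert step is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch of a THREDDS catalog walk; not the Data Locator source.
class CatalogWalkSketch {

    /** Returns {name, urlPath} pairs for every dataset element that has a urlPath. */
    static List<String[]> collectDatasets(InputStream catalogXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // THREDDS catalogs declare a default namespace
            Document doc = factory.newDocumentBuilder().parse(catalogXml);
            List<String[]> found = new ArrayList<>();
            walk(doc.getDocumentElement(), found);
            return found;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse catalog XML", e);
        }
    }

    /** Depth-first walk: container datasets have no urlPath, so only leaves are recorded. */
    private static void walk(Element element, List<String[]> found) {
        if ("dataset".equals(element.getLocalName()) && element.hasAttribute("urlPath")) {
            found.add(new String[] {
                element.getAttribute("name"), element.getAttribute("urlPath") });
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                walk((Element) child, found);
            }
        }
    }

    public static void main(String[] args) {
        // Made-up catalog fragment, used only for illustration.
        String xml =
            "<catalog xmlns=\"http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0\">"
          + "<dataset name=\"FIM runs\">"
          + "<dataset name=\"fim_example.nc\" urlPath=\"fim/fim_example.nc\"/>"
          + "</dataset>"
          + "</catalog>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (String[] entry : collectDatasets(in)) {
            // A real spider would insert this row into the metadata database here.
            System.out.println(entry[0] + " -> " + entry[1]);
        }
    }
}
```

In a real spider the input stream would come from the HTTP connection to the catalog, and nested `catalogRef` elements would need to be followed as well; both are omitted here to keep the sketch short.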
Web Client Process Diagram
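
The WCS URLs that the Data Locator Web Service hands back to the web client can be pictured with the following sketch, which turns a bounding box and a time into a WCS request URL. It is an illustration under stated assumptions, not the service's actual implementation: the parameter names follow the WCS 1.0.0 key-value GetCoverage encoding, while the class name, method name, endpoint, coverage name, and `NetCDF3` format value are hypothetical. A real server may require further parameters (for example a CRS).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of building a WCS GetCoverage URL; not the Data Locator source.
class WcsUrlSketch {

    /** Builds a GetCoverage URL for one coverage, subset by bounding box and time. */
    static String buildGetCoverageUrl(String wcsEndpoint, String coverage,
                                      double minLon, double minLat,
                                      double maxLon, double maxLat,
                                      String isoTime) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("service", "WCS");
        params.put("version", "1.0.0");
        params.put("request", "GetCoverage");
        params.put("coverage", coverage);
        params.put("bbox", minLon + "," + minLat + "," + maxLon + "," + maxLat);
        params.put("time", isoTime);
        params.put("format", "NetCDF3"); // assumed format name

        StringBuilder url = new StringBuilder(wcsEndpoint).append('?');
        boolean first = true;
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(entry.getKey()).append('=').append(encode(entry.getValue()));
            first = false;
        }
        return url.toString();
    }

    /** Percent-encodes a query value as UTF-8. */
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // Endpoint and coverage name are made up for illustration.
        System.out.println(buildGetCoverageUrl(
            "http://example.org/thredds/wcs/fim/fim_example.nc", "Temperature",
            -110.0, 35.0, -100.0, 45.0, "2010-06-01T12:00:00Z"));
    }
}
```

A service in this style would build one such URL per matching dataset and return the whole list, leaving it to the client to fetch the subsetted data.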

Software Flow Diagram

Libraries and Software Used
- Tomcat 6
- MySQL 5
- Axis 2
- ncWMS BETA1.0
- Java 6
- JDOM
- SQLExecutor 1.38
- commons-httpclient-3.1.jar
- THREDDS Data Server 3.1613
- Google Earth
Software Documentation
- Source code Javadocs
- Ant build.xml