About

The Data Locator project is a search engine for finding GFS meteorological data here at ESRL / GSD. Some of this data resides on our public file system, some on our mass store, and some on NCDC's Nomads HTTP server, where it is indexed on the Nomads THREDDS server. The metadata that can be extracted from these four locations is inserted into our MySQL database, making it searchable via a web client or web service. Here is what the data flow looks like:

GFS Data Flow Diagram

The Data Locator software consists of three main components.

  1. Four Java Spider Programs
    1. Nomads THREDDS Spider opens an HTTP connection to an NCDC Nomads THREDDS catalog and walks through the catalog XML, extracting relevant metadata and inserting it into the MySQL database.
    2. Nomads HTTP Spider scans the Nomads HTTP server for files that match the spec (e.g. *.sanl, *.sfcanl), extracts what information it can from the filename (e.g. start Z in hours) and the file information (e.g. file size in MB), and inserts it into the MySQL database.
    3. Public File System Spider scans the directories and files below a top level directory, extracts what information it can from the filename (e.g. start Z in hours) and the system file information (e.g. file size in MB), and inserts it into the MySQL database.
    4. Mass Store Spider scans the directories and files below a top level directory of the mass store, extracts what information it can from the filename (e.g. start Z in hours) and the system file information (e.g. file size in MB), and inserts it into the MySQL database.
  2. Data Locator Web Service - this Java web service accepts calls from client programs (or other web services) and searches the metadata in the MySQL database for matches. It then returns a list of file paths or URLs for the matching data. For details on how to invoke this web service, look here.
  3. Data Locator Web Client - this HTML and JSP based web application provides a web form for specifying search criteria (e.g. the catalog(s) to search, the date(s) and time(s)), and then invokes the Data Locator Web Service to find meteorological datasets that match the search criteria. It displays the matching datasets as links on the web page.
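
As a rough illustration of what the file-scanning spiders do, here is a minimal sketch in Java. It is not the project's actual source code: the class name FileSpiderSketch, the FileMetadata holder, and the assumption that filenames carry the cycle as "tHHz" (e.g. gfs.t06z.sanl) are all hypothetical, and the MySQL insert step is left out.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FileSpiderSketch {

    /** Hypothetical metadata holder; the real database schema is not shown on this page. */
    static final class FileMetadata {
        final String path;
        final int startZHours;   // -1 when the filename carries no recognizable cycle
        final double sizeMb;

        FileMetadata(String path, int startZHours, double sizeMb) {
            this.path = path;
            this.startZHours = startZHours;
            this.sizeMb = sizeMb;
        }
    }

    // Assumed naming convention: "<model>.tHHz.<type>", e.g. "gfs.t06z.sanl".
    private static final Pattern CYCLE = Pattern.compile("\\.t(\\d{2})z\\.");

    /** True when the filename matches the spec mentioned above (*.sanl, *.sfcanl). */
    static boolean matchesSpec(String fileName) {
        return fileName.endsWith(".sanl") || fileName.endsWith(".sfcanl");
    }

    /** Extracts the start Z hour from the filename, or -1 if it cannot be found. */
    static int parseStartZ(String fileName) {
        Matcher m = CYCLE.matcher(fileName);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    /** Walks the tree below topLevelDir and collects metadata for matching files. */
    static List<FileMetadata> scan(Path topLevelDir) throws IOException {
        final List<FileMetadata> found = new ArrayList<FileMetadata>();
        Files.walkFileTree(topLevelDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                String name = file.getFileName().toString();
                if (matchesSpec(name)) {
                    // File size in MB, here taken as bytes / 1024^2 (an assumption).
                    double sizeMb = attrs.size() / (1024.0 * 1024.0);
                    found.add(new FileMetadata(file.toString(), parseStartZ(name), sizeMb));
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }
}
```

With the assumed naming convention, FileSpiderSketch.parseStartZ("gfs.t06z.sanl") returns 6. Each FileMetadata entry returned by scan corresponds to one row that a spider would insert into the database.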

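To illustrate the catalog walk that the Nomads THREDDS Spider performs, the sketch below parses a THREDDS catalog document and collects the urlPath attribute of every dataset element, including nested ones. It is only a sketch under stated assumptions: the class name CatalogWalkSketch is hypothetical, it uses the JDK's built-in DOM parser rather than JDOM, it assumes unprefixed element names, and it takes the catalog XML as a string, so the HTTP fetch and the database insert are omitted.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class CatalogWalkSketch {

    /** Returns the urlPath of every <dataset> element that has one, in document order. */
    static List<String> collectUrlPaths(String catalogXml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(catalogXml.getBytes(StandardCharsets.UTF_8)));
            List<String> paths = new ArrayList<String>();
            walk(doc.getDocumentElement(), paths);
            return paths;
        } catch (Exception e) {
            // Wrap parser errors so callers do not have to handle checked exceptions.
            throw new IllegalArgumentException("Could not parse catalog XML", e);
        }
    }

    /** Depth-first walk: record this element if it is a dataset with a urlPath, then recurse. */
    private static void walk(Element element, List<String> paths) {
        // Assumes unprefixed element names, i.e. the catalog uses a default namespace.
        if ("dataset".equals(element.getNodeName()) && element.hasAttribute("urlPath")) {
            paths.add(element.getAttribute("urlPath"));
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                walk((Element) child, paths);
            }
        }
    }
}
```

A real spider would also have to follow catalogRef elements to linked catalogs and read further metadata from each dataset; both are left out here to keep the walk itself visible.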

Web Client Process Diagram



Software Flow Diagram



Libraries and Software Used
 
     Tomcat 6
     MySQL 5
     Axis 2
     THREDDS Data Server 3.1701
     Java 6
     JDOM
     SQLExecutor 1.41
     commons-httpclient-3.1.jar
     NCDC Nomads Server

Software Documentation
 
     Source code
     Javadocs
     Ant build.xml
   

 

 

   
 
  Page last updated on May 12, 2009