Making your own weather forecast Part 2

Now we need to download some data to process.

One thing that tripped me up initially (being relatively naive about how weather models work) was the simple fact that the WRF system requires the output of other models to run a prediction. Why? WRF is what they call a mesoscale or local forecasting model, meaning it doesn’t try to understand what the whole world is doing; it focuses on one small area (such as a state or region). It therefore needs a global model (such as the GFS) to tell it what’s going on outside its focus area (for example, what weather patterns are moving into the forecast area). So WRF takes the output of a global model as input and does a far more detailed analysis at the local level, giving (in theory) a better forecast than the GFS alone could.*

So let’s download GFS data for our model run. What I do is simply head over to NOAA and find the range of GFS products (products = output) I’d like to process here. A ton of data is stored there, and you’ll see there are multiple folders with the same date. It’s really kind of a dumping ground for output data, and it can be quite daunting to get your bearings in. So let’s walk through it slowly.

At first you’ll see a range of folders; these are the outputs of GFS runs done at different times during the day, so in theory the latest one is going to be the most accurate. For me, doing this on August 22nd of 2014, the following folders appeared way at the bottom of the page:

[Screenshot: directory listing showing the folders gfs.2014082200, gfs.2014082206, gfs.2014082212 and gfs.2014082218]

You can see there’s a GFS model run for the 22nd of August at 00, 06, 12 and 18 hours (UTC). Let’s jump into the gfs.2014082218 folder (the 18-hour, or 18Z, run) since that’s the most recent forecast available. Doing so puts me at this URL:

http://www.ftp.ncep.noaa.gov/data/nccf/com/gfs/prod/gfs.2014082218/

Now, there’s a TON of data in here. It’s all various output from the global model run, but what we’re looking for is data packaged in a particular format (grib2). All of the files follow a special naming scheme (so you can do regular-expression matches to grab exactly the data you need), but for our purposes let’s simply do a page search for files containing the character sequence “pgrbf”.
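For reference, my reading of that naming scheme (double-check it against the listing you actually see) is roughly:

gfs.tCCz.pgrbfFF.grib2

where CC is the model cycle (00, 06, 12 or 18 UTC) and FF is the forecast hour, so gfs.t18z.pgrbf03.grib2 would be the 3-hour forecast from the 18Z run.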

Searching the page for that string highlights the relevant files:

[Screenshot: the directory listing with the pgrbf *.grib2 files highlighted]

We’re interested in the 16-megabyte files ending in *.grib2. So let’s go ahead and download the files from forecast hour 00 through hour 09 (the ones we see in the snippet above).

To do so on Linux, I typically move into the directory containing all my other directories (such as WPS and WRFV3) and create a new one called DATA. From within this directory I then execute wget to download everything, so something like this:

...
wget http://www.ftp.ncep.noaa.gov/data/nccf/com/gfs/prod/gfs.2014082218/gfs.t18z.pgrbf00.grib2
...

…and so on until all the files I’m interested in are downloaded (typically I write a little script to do this; a sketch of one follows the screenshot below, but that’s up to you). When done, the directory should look like this:

[Screenshot: listing of the DATA directory showing the downloaded grib2 files]
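If you’d rather script it, here’s a minimal sketch of the sort of loop I mean (run it from inside the DATA directory). It assumes the 18Z run from August 22nd and the 3-hourly files out to hour 09, which matches the interval_seconds = 10800 we’ll set later; adjust the date, cycle and hour list to whatever run is actually available when you try this, since NOAA only keeps the last few days of runs on that server.

#!/bin/bash
# Hypothetical download helper: grab the grib2 files for one GFS run.
RUN_DATE=20140822
CYCLE=18
BASE=http://www.ftp.ncep.noaa.gov/data/nccf/com/gfs/prod/gfs.${RUN_DATE}${CYCLE}

for HH in 00 03 06 09; do
    wget "${BASE}/gfs.t${CYCLE}z.pgrbf${HH}.grib2"
done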

 

And now the magic begins.
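Before we dive in, here’s the rough sequence of commands we’re about to walk through; treat it purely as a roadmap, since every step (including the namelist edits it glosses over) is explained below.

cd WPS
./link_grib.csh ../DATA/
ln -s ungrib/Variable_Tables/Vtable.GFS Vtable
# edit namelist.wps (dates, domain, metgrid settings)
./ungrib.exe
./geogrid.exe
./metgrid.exe
cd ../WRFV3/test/em_real
# edit namelist.input and copy in the met_em* files from WPS
mpirun -np 1 real.exe
mpirun -np 8 wrf.exe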

  1. Move into the WPS directory and execute the ./link_grib.csh script. What this will do is generate symbolic links to the grib2 data files within the WPS directory. So we execute ./link_grib.csh ../DATA/
  2. Listing the directory contents now reveals a sequence of files named GRIBFILE.*** where *** is a unique moniker suffix. These files are symbolic links to your recently downloaded GRIB2 data files.
  3. Now we’re going to unpack (or “ungrib”) these data files. To do so, we’re going to use a tool called ungrib.exe. ungrib.exe uses a variable table to know which fields to unpack from the GFS GRIB2 files, so we’ll create a link to the GFS table by typing ln -s ungrib/Variable_Tables/Vtable.GFS Vtable.
  4. One last step before running ungrib.exe: we need to update the namelist.wps configuration file so ungrib knows what range of dates to unpack. Unfortunately it’s not immediately obvious which dates are encoded within the GFS files, so you’ll need a utility called wgrib2 to inspect the files and report the dates they harbor. Instructions for compiling it are here; for me this was painless (I also copied the binary to /usr/local/bin). To find the start and end times, run wgrib2 against the GFS files for forecast hours 00 and 09 (see the example invocation just after this list), then take the dates from its output and update the namelist.wps file within the WPS directory. The contents of the file should look like this (given the dates we get for this run):
    &share
    max_dom = 1,
    start_date = '2014-08-22_18:00:00',
    end_date = '2014-08-23_03:00:00',
    interval_seconds = 10800
    /
    
  5. Now launch ./ungrib.exe. When all is done you should have a series of output files beginning with “FILE:”. These are the unpackaged GRIB2 files in an intermediate format for use by WPS.
  6. We now need to launch geogrid.exe, which is used to generate an understanding of the local terrain for the weather model run. To do so we need to, again, update our namelist.wps configuration file with the geospatial coordinates of the region by placing the following immediately after the share section from the previous step (be sure to update geog_data_path to point to the directory within which you unpacked the geog data from the previous post). NOTE: For Lincoln, NE (my hometown), you can simply copy/paste what I put here, otherwise update with your own coordinates (the lat/lon values should be the only ones you need to change).
    &geogrid
    e_we = 75,
    e_sn = 70,
    dx = 30000,
    dy = 30000,
    map_proj = 'lambert',
    ref_lat = 40.8106,
    ref_lon = -96.6803,
    truelat1 = 40.8106,
    truelat2 = 40.8106,
    stand_lon = -96.6803,
    geog_data_path = '/home/brush/Downloads/weather/geog/'
    /
  7. Now run geogrid.exe. As it executes you’ll see it process a number of geographical data points in preparation for the full run.
  8. metgrid.exe is run next. This bridges the gap between the GFS data and the geographical data by horizontally interpolating the former over the latter; basically it ties the two together. This requires another addition to namelist.wps, albeit a shorter and simpler one:
    &metgrid
    fg_name = 'FILE'
    io_form_metgrid = 2,
    /
    

    Now execute metgrid.exe.

  9. We are now done with the WPS directory, meaning we have prepped all of the data and are now ready for the actual model run. To do this we will update one more configuration file and execute two applications located within the WRFV3/test/em_real directory. So change into that directory and open up namelist.input. Within this file you’ll immediately see a bunch of variables, but luckily we only have to change a few. Update the following fields under the “&time_control” section:
    run_days = 0,
    run_hours = 9,
    run_minutes = 0,
    run_seconds = 0,
    start_year = 2014,
    start_month = 08,
    start_day = 22,
    start_hour = 18,
    start_minute = 00,
    start_second = 00,
    end_year = 2014,
    end_month = 08,
    end_day = 23,
    end_hour = 03,
    end_minute = 00,
    end_second = 00,
    

    Now update the following fields under the “&domains” section…

    time_step = 180,
    time_step_fract_num = 0,
    time_step_fract_den = 1,
    max_dom = 1,
    e_we = 75,
    e_sn = 70,
    e_vert = 35,
    p_top_requested = 5000,
    dx = 30000, 10000, 3333.33,
    dy = 30000, 10000, 3333.33,
  10. To launch the two apps responsible for actually generating the model we need to install mpirun (this is part of MPI, a system that enables distributed computing), and you can do so with your favorite package manager (I installed openmpi).
  11. Copy all of the files generated by metgrid (starting with “met_em”) into your current directory.
  12. Run mpirun -np 1 real.exe.
  13. Then run mpirun -np 8 wrf.exe, sit back, and wait….
  14. When done you should have a file whose name starts with “wrfout”; this is the model output! You’re done! You can then work with this file using any number of WRF post-processing utilities and languages (one quick way to peek inside it is sketched at the end of this post).
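As promised back in step 4, here’s roughly how I pull the dates out of the GFS files with wgrib2 (run these from the DATA directory). The exact inventory text will vary by file, but every record line carries a d=YYYYMMDDHH field:

wgrib2 gfs.t18z.pgrbf00.grib2 | head -1
# e.g. 1:0:d=2014082218:HGT:10 mb:anl:
wgrib2 gfs.t18z.pgrbf09.grib2 | head -1
# e.g. 1:0:d=2014082218:HGT:10 mb:9 hour fcst:

The d= field is the run’s initialization time. The hour-00 file is valid right at that time, which gives the start_date (2014-08-22_18), and the hour-09 file is valid nine hours later, which gives the end_date (2014-08-23_03).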

* It’s worth noting that WRF is capable of doing a global analysis by way of “nesting forecasts”, but it’s not very well supported and not recommended for accurate forecasts (most people I’ve talked to consume GFS or other global model output).
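Finally, picking up from step 14: before reaching for a full post-processing tool, a quick sanity check is to dump the output file’s header with ncdump, which came along for the ride when you installed NetCDF in Part 1. The exact filename depends on your domain and start time; for the run configured above it should look something like this:

ncdump -h wrfout_d01_2014-08-22_18:00:00 | less
# prints the dimensions, the (long) list of output variables, and the global attributes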

Making your own weather forecast Part 1

It’s totally possible given the WRF model (short for Weather Research and Forecasting). Here I’m going to give you a step-by-step guide to downloading version 3.6.1 of the software, installing it, and then downloading the data needed to make your own custom forecast. Since I live in Nebraska, I’m going to set it up to do a model run for Lincoln (the capital), but you can easily make the necessary changes to run it for whatever location you’d like.

The Prerequisites: 

First, a few things need to be in order. This stuff is meant for Linux. You can apparently run it on Windows using Cygwin, but it appears to be happiest when installed and run under a Linux distribution. Mine is Ubuntu 14.04, 64-bit. Your mileage may vary depending on which flavor of Linux you use, but for any Debian-based system these instructions should go smoothly. It’s worth noting I’m by no means an expert in Linux and I was able to get this thing going just fine, so hopefully this guide will be helpful even for novices.

You’re going to need some disk space: not a lot, but a few gigabytes to spare. That’s because the data you’ll be downloading to get this thing to run isn’t insignificant, and neither is the data it generates.

One thing to bear in mind is that this system, as seems common for many “systems” running on Linux, is made up of multiple programs, all working in concert with one another. Typically the output from one is fed into the next, and so the chain continues. It can be slightly daunting at first to know who is talking to whom, so I recommend glancing around this page and reading up on the 30,000-foot view (I won’t talk about it here since it’s fairly well described in the link provided). The reason I bring this up is that there is no single source-code package; you’ll need to download, compile, and set up multiple programs (such is life), yet they all fall under the umbrella of the “Model System” as I’m describing it.

The Setup: 

One last note before we get started: I wrote this guide while doing another install (for the second time, on a different Linux box). The order of steps isn’t necessarily important; it’s just how I did it. If you notice any irregularities and would like to mix things up a bit, you can likely do so.

  1. Download the source for the latest version of the WPS program here.
  2. Download the source for the latest version of the WRF program here.
  3. Download the geographical input data for the model system here. A couple of things to note about this step: first, what you’re downloading is terrain data and the like, which allows the model to properly build a view of the environment around which it’s generating its predictions. Second, it’s really not important to know what this data contains (unless you really want to). And finally, they don’t make it easy to know what, exactly, you’re supposed to download (there are different versions of this data for different versions of the software). If the page I link to above hasn’t changed much, the link to click is titled “Download Complete Dataset”, and it’s actually the header of a table column.
  4. Now untar/gunzip the files you’ve downloaded. I downloaded everything into a single directory to make management easier. Once everything is unpacked, I’m left with three folders named geog, WPS and WRFV3 (hint: to stay in sync with this tutorial, rename your folders to match).
  5. Next we download some third-party packages needed to build the WPS/WRF programs. Start by downloading HDF5 (a library for manipulating data formats used by WRF) from here. Once again, untar/gunzip the file. Move into the directory and configure the install by typing ./configure --prefix=/usr/local. Once this is done, type make install to install the package (there’s a condensed cheat sheet of these library builds after this list). NOTE: Make sure you run the install step with sudo.
  6. There’s another third-party package, NetCDF, also used for the specialized data formats WRF relies on. There appear to be versions for both the C and Fortran compilers; for completeness I downloaded and installed both (though I suspect only the Fortran version is strictly needed). To download the C and Fortran versions go here. Unpackage them and, for each, run ./configure --prefix=/usr/local and then make install.
  7. Before proceeding you’ll need to set an environment variable the compilation scripts use to locate NetCDF. If you’re using bash as your shell, type export NETCDF=/usr/local.
  8. NOTE: If you don’t have gfortran (the GNU Fortran compiler) installed, you should do so now (on Ubuntu, you can use the aptitude or apt-get package manager). ALSO NOTE: The compile scripts that follow are written for csh, not bash, so make sure you have csh installed as well (again, use apt-get, aptitude or your package manager of choice).
  9. Now cd into the WRFV3 directory and type ./configure. When I ran the script I was prompted twice: once for the location of the NetCDF /include directory and again for the NetCDF /lib directory. Type them in accordingly. Next you’ll be prompted for the build chain to use when compiling the WRFV3 package; I chose option #32 (x86_64 Linux, gfortran compiler with gcc (serial)). One final prompt comes your way, asking for nesting options (I chose option #1, basic). Now you’ll want to actually do the compilation, and since we’ll be running real-world simulations, we’ll tell the build system to set things up accordingly by typing ./compile em_real. And away it goes….
  10. Now to install some more third-party packages required for compiling the WPS package. Go here and search for the phrase “external libraries”; you’ll see three: Jasper, PNG and zlib. Download each of them, then ./configure --prefix=/usr/local and make install each one accordingly (seriously, this was a pretty easy setup for me).
  11. Next cd into the WPS directory and type ./configure. As with the WRFV3 script, this will prompt you for the locations of the NetCDF include and library directories, which you specify just the same. It will then prompt you for the compilers (both Fortran and C) to use for building WPS; on my system (see above) I chose option #1 (Linux x86_64, gfortran (serial)). Once you select your option, it’ll tell you to run ./compile to start the compilation process, and you should do that now.
  12. If all compiled well, you’re done!
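For reference, here’s a condensed cheat sheet of the third-party setup from steps 5, 6, 8 and 10. It assumes Ubuntu with apt-get and that each tarball unpacks into its own directory under your working directory; the directory name below is a placeholder, so substitute whichever versions you actually downloaded.

# compilers and shell needed by the build scripts (step 8)
sudo apt-get install gfortran csh

# the same configure-and-install dance for HDF5, NetCDF (C and Fortran), Jasper, PNG and zlib
cd hdf5-x.y.z            # placeholder; use the directory your tarball unpacked into
./configure --prefix=/usr/local
make
sudo make install
cd ..
# ...repeat for each of the other libraries...

# point the WRF/WPS configure scripts at the NetCDF install (step 7)
export NETCDF=/usr/local

# then ./configure and ./compile em_real inside WRFV3 (step 9),
# and ./configure and ./compile inside WPS (step 11)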

In my next post I’ll describe how to download data and actually stitch all of these programs together.