9. Working with Time Series
Here, automate Steps 2-5 and 7-8 from above to retrieve data for the entire month of July 2018, enabling visualization and statistical analysis of the progression of drought conditions in Europe.
You will need to download the VNP09GA data product for tile h18v03 for the month of July 2018 to complete Section 9. There are multiple options for downloading these files, including via a text file containing links to each of the files on the LP DAAC Data Pool, or through NASA Earthdata Search using the query provided in the link. Make sure that you download the files into the `inDir` directory defined above to complete this section.
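If you saved the download links to a text file, a minimal sketch like the one below can retrieve the granules programmatically. This is an illustration only: the `links.txt` filename and the `EARTHDATA_TOKEN` environment variable are assumptions, and LP DAAC downloads require NASA Earthdata Login credentials (supplied here as a bearer token).
import requests                                                              # Assumes the requests package is installed
with open('links.txt') as linksFile:                                         # Hypothetical text file of download links
    urls = [line.strip() for line in linksFile if line.strip()]
headers = {'Authorization': 'Bearer {}'.format(os.environ['EARTHDATA_TOKEN'])}  # Hypothetical token variable
for url in urls:
    response = requests.get(url, headers=headers)                            # Request each granule
    response.raise_for_status()                                              # Stop if a download fails
    with open(os.path.join(inDir, url.split('/')[-1]), 'wb') as out:
        out.write(response.content)                                          # Save the granule into inDir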
The cell below will find all the .h5 files in that directory, create a list of all the file names, and print the results.
fileList = sorted([file for file in os.listdir() if file.endswith('.h5') and file.startswith('VNP09GA')])  # Search the working directory (inDir) for VNP09GA .h5 files, sorted chronologically
for f in fileList: print(f)
VNP09GA.A2018182.h18v03.001.2018183084726.h5
VNP09GA.A2018183.h18v03.001.2018184074906.h5
VNP09GA.A2018184.h18v03.001.2018185074620.h5
VNP09GA.A2018185.h18v03.001.2018186075154.h5
VNP09GA.A2018186.h18v03.001.2018187181948.h5
VNP09GA.A2018187.h18v03.001.2018190194607.h5
VNP09GA.A2018188.h18v03.001.2018190220857.h5
VNP09GA.A2018189.h18v03.001.2018190213502.h5
VNP09GA.A2018190.h18v03.001.2018191075356.h5
VNP09GA.A2018191.h18v03.001.2018192075252.h5
VNP09GA.A2018192.h18v03.001.2018193085141.h5
VNP09GA.A2018193.h18v03.001.2018194083824.h5
VNP09GA.A2018194.h18v03.001.2018195081724.h5
VNP09GA.A2018195.h18v03.001.2018196075344.h5
VNP09GA.A2018196.h18v03.001.2018197081921.h5
VNP09GA.A2018197.h18v03.001.2018199192103.h5
VNP09GA.A2018198.h18v03.001.2018222183129.h5
VNP09GA.A2018199.h18v03.001.2018200091624.h5
VNP09GA.A2018200.h18v03.001.2018201080455.h5
VNP09GA.A2018201.h18v03.001.2018202072301.h5
VNP09GA.A2018202.h18v03.001.2018203073754.h5
VNP09GA.A2018203.h18v03.001.2018204141303.h5
VNP09GA.A2018204.h18v03.001.2018205083730.h5
VNP09GA.A2018205.h18v03.001.2018206080321.h5
VNP09GA.A2018206.h18v03.001.2018207164400.h5
VNP09GA.A2018207.h18v03.001.2018208073329.h5
VNP09GA.A2018208.h18v03.001.2018209083918.h5
VNP09GA.A2018209.h18v03.001.2018210082556.h5
VNP09GA.A2018210.h18v03.001.2018211084421.h5
VNP09GA.A2018211.h18v03.001.2018214203714.h5
VNP09GA.A2018212.h18v03.001.2018214223310.h5
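Each filename encodes the acquisition date as the year and day of year following the 'A' (e.g., A2018182 is day 182 of 2018, i.e., July 1, 2018); this is how the loop in Section 9a recovers each file's date. As a quick check using the datetime module imported earlier as `dt`:
yeardoy = fileList[0].split('.')[1][1:]                            # Grab '2018182' from the first filename
print(dt.datetime.strptime(yeardoy, '%Y%j').strftime('%m/%d/%Y'))  # Prints 07/01/2018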
date = [] # Create empty list to store dates of each file
i = 0 # Set up iterator for automation in cell block below
9a. Automation of Steps 2-5, 7-8
This section provides steps to loop through each daily VNP09GA file from July 2018 and import the data, apply the scale factor, apply a land/water mask, filter by quality, calculate VIs, georeference, and export each layer as a GeoTIFF.
Near the end of the code block below, each day's NDVI observation is appended to a 3D array containing all NDVI observations for July 2018.
for t in fileList:
    yeardoy = t.split('.')[1][1:]                                       # Split filename to retrieve observation date (YYYYDDD)
    outName = t.rsplit('.', 1)[0]                                       # Keep filename for output name
    date1 = dt.datetime.strptime(yeardoy, '%Y%j').strftime('%m/%d/%Y')  # Convert date to MM/DD/YYYY
    date.append(date1)                                                  # Append to list of dates
    f = h5py.File(os.path.normpath(t), "r")                             # Read in VIIRS HDF-EOS5 file
    h5_objs = []                                                        # Create empty list
    f.visit(h5_objs.append)                                             # Retrieve each object and append to list
    fileMetadata = f['HDFEOS INFORMATION']['StructMetadata.0'][()].split()  # Read file metadata
    fileMetadata = [m.decode('utf-8') for m in fileMetadata]            # Clean up file metadata
    allSDS = [o for grid in grids for o in h5_objs if isinstance(f[o], h5py.Dataset) and grid in o]  # Create list of SDS in file
    r = f[[a for a in allSDS if 'M5' in a][0]]                          # Open SDS M5 = Red
    g = f[[a for a in allSDS if 'M4' in a][0]]                          # Open SDS M4 = Green
    b = f[[a for a in allSDS if 'M3' in a][0]]                          # Open SDS M3 = Blue
    n = f[[a for a in allSDS if 'M7' in a][0]]                          # Open SDS M7 = NIR
    red = r[()] * scaleFactor                                           # Apply scale factor
    green = g[()] * scaleFactor                                         # Apply scale factor
    blue = b[()] * scaleFactor                                          # Apply scale factor
    nir = n[()] * scaleFactor                                           # Apply scale factor
    rgb = np.dstack((red, green, blue))                                 # Create RGB array
    rgb[rgb == fillValue * scaleFactor] = 0                             # Set fill values to 0 for plotting
    qf = f[[a for a in allSDS if 'QF5' in a][0]][()]                    # Import QF5 SDS
    red = np.ma.MaskedArray(red, np.in1d(qf, goodQF, invert=True))      # Apply QF mask to red data
    green = np.ma.MaskedArray(green, np.in1d(qf, goodQF, invert=True))  # Apply QF mask to green data
    blue = np.ma.MaskedArray(blue, np.in1d(qf, goodQF, invert=True))    # Apply QF mask to blue data
    nir = np.ma.MaskedArray(nir, np.in1d(qf, goodQF, invert=True))      # Apply QF mask to NIR data
    qf2 = f[[a for a in allSDS if 'QF2' in a][0]][()]                   # Import QF2 SDS
    red = np.ma.MaskedArray(red, np.in1d(qf2, land, invert=True))       # Apply land/water mask to red data
    green = np.ma.MaskedArray(green, np.in1d(qf2, land, invert=True))   # Apply land/water mask to green data
    blue = np.ma.MaskedArray(blue, np.in1d(qf2, land, invert=True))     # Apply land/water mask to blue data
    nir = np.ma.MaskedArray(nir, np.in1d(qf2, land, invert=True))       # Apply land/water mask to NIR data
    ndvi = ndviCalc(red, nir)                                           # Calculate NDVI
    evi = eviCalc(red, nir, blue)                                       # Calculate EVI
    ndvi[np.where(np.logical_or(ndvi < 0, ndvi > 1))] = np.nan          # Set out-of-range NDVI values to nan
    evi[np.where(np.logical_or(evi < 0, evi > 1))] = np.nan             # Set out-of-range EVI values to nan
    p2, p98 = np.percentile(rgb, (2, 98))                               # Calculate values at 2nd/98th percentiles
    rgbStretched = exposure.rescale_intensity(rgb, in_range=(p2, p98))  # Contrast stretch RGB range
    rgbStretched = exposure.adjust_gamma(rgbStretched, 0.5)             # Perform gamma correction
    fig = plt.figure(figsize=(10, 10))                                  # Set the figure size
    ax = plt.Axes(fig, [0, 0, 1, 1])                                    # Create axes spanning the entire figure
    ax.set_axis_off()                                                   # Turn off axes
    fig.add_axes(ax)                                                    # Add the axes to the figure
    ax.imshow(rgbStretched, interpolation='bilinear', alpha=0.9)        # Plot a natural color RGB
    fig.savefig('{}{}_RGB.png'.format(outDir, outName))                 # Export RGB as png
    plt.close(fig)                                                      # Close the figure
    params = {'red':{'data':red, 'band':'M5'}, 'green':{'data':green, 'band':'M4'},  # Create dict for each layer
              'blue':{'data':blue, 'band':'M3'}, 'ndvi':{'data':ndvi, 'band':'NDVI'},
              'evi':{'data':evi, 'band':'EVI'}, 'qf':{'data':qf, 'band':'QF'},
              'qf2':{'data':qf2, 'band':'QF2'}, 'rgb':{'data':rgb, 'band':'RGB'}}
    for p in params:
        try:
            data = params[p]['data']                                    # Define array to be exported
            data[data.mask == True] = fillValue                         # Set masked values to the fill value
        except AttributeError:
            pass                                                        # Unmasked arrays (e.g., quality layers) skip this step
        outputName = os.path.normpath('{}{}_{}.tif'.format(outDir, outName, params[p]['band']))  # Generate output filename
        nRow, nCol = data.shape[0], data.shape[1]                       # Define rows/columns from array shape
        dataType = gdal_array.NumericTypeCodeToGDALTypeCode(data.dtype) # Define output data type
        driver = gdal.GetDriverByName('GTiff')                          # Select GDAL GeoTIFF driver
        if p == 'rgb':                                                  # RGB exports are handled differently
            data = params[p]['data']                                    # Define the array to export
            dataType = gdal_array.NumericTypeCodeToGDALTypeCode(data.dtype)  # Define output data type
            options = ['PHOTOMETRIC=RGB', 'PROFILE=GeoTIFF']            # Set options to RGB TIFF
            outFile = driver.Create(outputName, nCol, nRow, 3, dataType, options=options)  # Create 3-band GeoTIFF
            for b in range(data.shape[2]):                              # Loop through each band (3)
                outFile.GetRasterBand(b+1).WriteArray(rgb[:,:,b])       # Write to output bands 1-3
                outFile.GetRasterBand(b+1).SetNoDataValue(0)            # Set fill value for each band
            outFile.GetRasterBand(1).SetColorInterpretation(gdal.GCI_RedBand)    # Define red band
            outFile.GetRasterBand(2).SetColorInterpretation(gdal.GCI_GreenBand)  # Define green band
            outFile.GetRasterBand(3).SetColorInterpretation(gdal.GCI_BlueBand)   # Define blue band
        else:
            outFile = driver.Create(outputName, nCol, nRow, 1, dataType)  # Create single-band GeoTIFF
            band = outFile.GetRasterBand(1)                             # Get band 1
            band.WriteArray(data)                                       # Write data array to band 1
            band.FlushCache()                                           # Flush data to disk
            band.SetNoDataValue(float(fillValue))                       # Set fill value
        outFile.SetGeoTransform(geoInfo)                                # Set geotransform
        outFile.SetProjection(prj)                                      # Set projection
        outFile = None                                                  # Close file
    if i == 0:                                                          # On the first iteration, set up the 3D raster stack
        ndviTS = ndvi[np.newaxis, :, :]                                 # Add 3rd axis for stacking
    else:
        ndvi = ndvi[np.newaxis, :, :]                                   # Add a third axis
        ndviTS = np.ma.append(ndviTS, ndvi, axis=0)                     # Append to 'master' 3D array
    print('Processed file: {} of {}'.format(i+1, len(fileList)))        # Print the progress
    i += 1                                                              # Increase iterator by one
Processed file: 1 of 31
Processed file: 2 of 31
Processed file: 3 of 31
Processed file: 4 of 31
Processed file: 5 of 31
Processed file: 6 of 31
Processed file: 7 of 31
Processed file: 8 of 31
Processed file: 9 of 31
Processed file: 10 of 31
Processed file: 11 of 31
Processed file: 12 of 31
Processed file: 13 of 31
Processed file: 14 of 31
Processed file: 15 of 31
Processed file: 16 of 31
Processed file: 17 of 31
Processed file: 18 of 31
Processed file: 19 of 31
Processed file: 20 of 31
Processed file: 21 of 31
Processed file: 22 of 31
Processed file: 23 of 31
Processed file: 24 of 31
Processed file: 25 of 31
Processed file: 26 of 31
Processed file: 27 of 31
Processed file: 28 of 31
Processed file: 29 of 31
Processed file: 30 of 31
Processed file: 31 of 31
Notice above that the for loop also stacked the monthly observations into a three-dimensional array that we will use to generate time series visualizations in the final section.
ndviTS = np.ma.MaskedArray(ndviTS, np.in1d(ndviTS, fillValue, invert = False)) # Mask fill values in NDVI time series (3d array)
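With fill values masked, the stack can be summarized along its time axis. As a quick illustration (not part of the original workflow), the `np.ma` routines below honor the mask while confirming the stack's dimensions and computing a per-pixel temporal mean:
print(ndviTS.shape)                                  # (time, rows, columns) -- one layer per day of July
ndviMean = np.ma.mean(ndviTS, axis=0)                # Per-pixel mean NDVI across all 31 days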
9b. Time Series Analysis
This section provides steps for calculating basic statistics on the NDVI time series created above, exporting the results to a .csv file, and visualizing the mean and standard deviation of NDVI over Central Europe to gain a better understanding of the progression of drought in the region.
First, set up a pandas dataframe to store the statistics.
# Create dataframe with column names
stats_df = pd.DataFrame(columns=['File Name', 'Dataset', 'Date', 'Count', 'Minimum', 'Maximum', 'Range','Mean','Median',
'Upper Quartile', 'Lower Quartile', 'Upper 1.5 IQR', 'Lower 1.5 IQR', 'Standard Deviation', 'Variance'])
Statistics: The statistics calculated below include box plot statistics, such as the total number of values/pixels (n), the median, and the interquartile range (IQR) of the sample data for values within a feature from a selected date/layer. Also calculated are the `Lower 1.5 IQR`, which represents the lowest datum still within 1.5 IQR of the lower quartile, and the `Upper 1.5 IQR`, which represents the highest datum still within 1.5 IQR of the upper quartile. Additional statistics generated for each layer/observation/feature combination include min/max/range, mean, standard deviation, and variance.
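As a toy illustration of the whisker rule (separate from the workflow below; the sample values are made up), the upper whisker is the largest value at or below `upper quartile + 1.5 * IQR`, and the lower whisker is the smallest value at or above `lower quartile - 1.5 * IQR`:
sample = np.array([0.1, 0.45, 0.5, 0.55, 0.6, 0.95])  # Made-up NDVI-like values with two extremes
q1, q3 = np.percentile(sample, [25, 75])              # Lower and upper quartiles
iqr = q3 - q1                                         # Interquartile range
print(sample[sample <= q3 + 1.5 * iqr].max())         # Upper 1.5 IQR (highest datum within upper whisker): 0.6
print(sample[sample >= q1 - 1.5 * iqr].min())         # Lower 1.5 IQR (lowest datum within lower whisker): 0.45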
Below, use NumPy functions to calculate statistics on the quality-masked, land-only data. The `format(value, '.4f')` call ensures that the statistics are rounded to four decimal places.
i = 0                                                                # Set up iterator
for n in ndviTS:                                                     # Loop through each layer (day) in NDVI time series
    ndvi = n.data[~n.mask]                                           # Exclude poor quality and water (masked) pixels
    ndvi = ndvi[~np.isnan(ndvi)]                                     # Exclude nan pixel values
    n = len(ndvi)                                                    # Count of values in array
    min_val = float(format((np.min(ndvi)), '.4f'))                   # Minimum value in array
    max_val = float(format((np.max(ndvi)), '.4f'))                   # Maximum value in array
    range_val = (min_val, max_val)                                   # Range of values in array
    mean = float(format((np.mean(ndvi)), '.4f'))                     # Mean of values in array
    std = float(format((np.std(ndvi)), '.4f'))                       # Standard deviation of values in array
    var = float(format((np.var(ndvi)), '.4f'))                       # Variance of values in array
    median = float(format((np.median(ndvi)), '.4f'))                 # Median of values in array
    quartiles = np.percentile(ndvi, [25, 75])                        # 1st (25) & 3rd (75) quartiles of values in array
    upper_quartile = float(format((quartiles[1]), '.4f'))            # Upper quartile
    lower_quartile = float(format((quartiles[0]), '.4f'))            # Lower quartile
    iqr = quartiles[1] - quartiles[0]                                # Interquartile range
    iqr_upper = upper_quartile + 1.5 * iqr                           # 1.5 IQR above the upper quartile
    iqr_lower = lower_quartile - 1.5 * iqr                           # 1.5 IQR below the lower quartile
    top = float(format(np.max(ndvi[ndvi <= iqr_upper]), '.4f'))      # Highest datum within 1.5 IQR of upper quartile
    bottom = float(format(np.min(ndvi[ndvi >= iqr_lower]), '.4f'))   # Lowest datum within 1.5 IQR of lower quartile
    d = {'File Name': fileList[i], 'Dataset': 'NDVI', 'Date': date[i],
         'Count': n, 'Minimum': min_val, 'Maximum': max_val, 'Range': range_val,
         'Mean': mean, 'Median': median, 'Upper Quartile': upper_quartile,
         'Lower Quartile': lower_quartile, 'Upper 1.5 IQR': top,
         'Lower 1.5 IQR': bottom, 'Standard Deviation': std, 'Variance': var}
    stats_df = pd.concat([stats_df, pd.DataFrame([d])], ignore_index=True)  # Append the row (DataFrame.append was removed in pandas 2.0)
    i = i + 1                                                        # Add one to iterator
# Remove variables that are no longer needed
del d, n, min_val, max_val, range_val, mean, std, var, median, quartiles, upper_quartile
del bottom, axes, yeardoy, lower_quartile, iqr, iqr_upper, iqr_lower, top
Next, export the pandas dataframe containing the statistics to a .csv file.
stats_df.to_csv('{}{}-Statistics.csv'.format(outDir, outName), index = False) # Export statistics to CSV
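As an optional sanity check, the exported file can be read back in with pandas to confirm its contents:
pd.read_csv('{}{}-Statistics.csv'.format(outDir, outName)).head()    # Preview the first rows of the exported statistics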
Use the previously calculated statistics to plot a time series of mean NDVI for the month of July 2018. Taking the figure one step further, plot the standard deviation as a confidence interval around the mean NDVI.
np.arange(0,len(ndviTS),1)
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30])
fig = plt.figure(1, figsize=(15, 10)) # Set the figure size
fig.set_facecolor("white") # Set the background color
dates = np.arange(0,len(ndviTS),1) # Arrange all dates
ax = fig.add_subplot(111) # Create a subplot
ax.plot(stats_df['Mean'], lw=2.5, color='black') # Plot the mean as a black line
ax.plot(stats_df['Mean'], 'o', ms=10, color='lightgreen') # Plot each observation as a green circle
# Add and subtract the std dev from the mean to form the upper and lower intervals for the plot, then plot linear regression
ax.fill_between(dates,stats_df.Mean-stats_df['Standard Deviation'],stats_df.Mean+stats_df['Standard Deviation'],color='#254117')
ax.plot(np.unique(dates), np.poly1d(np.polyfit(dates, stats_df['Mean'], 1))(np.unique(dates)), lw=3.5, c='yellow');
ax.set_xticks((np.arange(0,len(fileList)))) # Set the x ticks
ax.set_xticklabels(date, rotation=45,fontsize=12) # Set the x tick labels
ax.set_yticks((np.arange(0.4,1.01, 0.1))) # Arrange the y ticks
ax.set_yticklabels(np.round(np.arange(0.4,1.01, 0.1), 1),fontsize=12,fontweight='bold') # Set the y tick labels (rounded to avoid floating point artifacts)
ax.set_xlabel('Date',fontsize=16,fontweight='bold') # Set x-axis label
ax.set_ylabel("{}({})".format('NDVI', 'unitless'),fontsize=16,fontweight='bold') # Set y-axis label
ax.set_title('VIIRS Surface Reflectance-derived NDVI: July 2018',fontsize=22,fontweight='bold') # Set title
fig.savefig('{}{}_{}_Mean_TS.png'.format(outDir, outName, 'NDVI'), bbox_inches='tight') # Set up filename and export