
Convolutional Neural Networks (CNNs) for Earth Systems Science

06 June 2024

By Thomas Martin

Convolutional Neural Networks (CNNs) are a powerful class of deep learning models widely applied in Earth science for image analysis, classification, and regression problems. With the Keras framework in Python, CNNs can efficiently process and extract spatial features from 2D and 3D remote sensing data, model output, and other Earth Systems Science (ESS) data types.

An important feature of CNNs for ESS is their relative translation invariance (i.e., insensitivity to where specific features sit in a 2D or 3D array), a characteristic that emerges from their architectural design. It starts with local receptive fields: each unit analyzes a small region of the input rather than the entire image, and as layers are stacked the effective receptive field grows, so the network captures features at progressively larger scales. CNNs also employ weight sharing, where the same set of weights is applied at every spatial location. Weight sharing lets the network detect a feature regardless of its position in the input image. Finally, CNN architectures typically incorporate pooling layers, which downsample feature maps, reducing spatial dimensions while retaining essential information. Downsampling sharpens the network's focus on salient features and diminishes its sensitivity to small spatial shifts, further reinforcing translation invariance.
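As a rough illustration of weight sharing and pooling (a sketch that is not part of the original workflow), the NumPy snippet below slides a single shared kernel over a toy array, then repeats the exercise with the same feature moved to a different location. The helper `convolve_valid`, the all-ones kernel, and the 8x8 toy fields are hypothetical choices made only for this example.

```python
import numpy as np

def convolve_valid(image, kernel):
    """Slide one shared kernel over a 2D array (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same weights are reused at every location (weight sharing).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 8x8 fields containing the same 2x2 "feature" at two positions.
field_a = np.zeros((8, 8))
field_a[1:3, 1:3] = 1.0
field_b = np.zeros((8, 8))
field_b[4:6, 3:5] = 1.0  # shifted 3 rows down and 2 columns right

kernel = np.ones((3, 3))  # hypothetical filter that responds to bright blobs

map_a = convolve_valid(field_a, kernel)
map_b = convolve_valid(field_b, kernel)

# The response pattern moves along with the feature...
peak_a = np.unravel_index(np.argmax(map_a), map_a.shape)
peak_b = np.unravel_index(np.argmax(map_b), map_b.shape)

# ...while a global max pool reports the same value for both positions.
pooled_a = map_a.max()
pooled_b = map_b.max()
```

Here the peak response moves by the same offset as the feature, while the pooled value is identical for both fields. Real networks use learned kernels and local rather than global pooling, so the invariance they achieve is only approximate.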


The image at right, from Visual Guide to Applied Convolution Neural Networks, shows how the filtering process works for a CNN. After a filter (or kernel) size is chosen, the filter array is populated with random values, then multiplied element by element with each sub-array of matching size in the image and summed (the convolution step) to create a feature map. The same is done for every other filter in the layer, each with its own set of values. During training, the filter values are adjusted iteratively so that the resulting feature maps best match the existing training data. In this case, the network learns features that match those in a training set picturing dogs, and makes a prediction about whether the image being processed also represents a dog.
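The filtering step described above can be sketched in a few lines of NumPy. This is an illustrative example rather than the code behind the figure: the 6x6 random "image", the choice of four 3x3 filters, and the variable names are all assumptions. The random filter values are only a starting point; in a real CNN, training would then update them.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(seed=0)

image = rng.random((6, 6))            # stand-in for a single-channel image
filters = rng.normal(size=(4, 3, 3))  # four 3x3 filters with random starting values

# Every 3x3 sub-array of the image: shape (4, 4, 3, 3).
windows = sliding_window_view(image, (3, 3))

# Multiply each sub-array by each filter and sum: one feature map per filter.
feature_maps = np.einsum("ijkl,fkl->fij", windows, filters)

print(feature_maps.shape)  # (4, 4, 4)
```

Without padding, a 3x3 filter shrinks each 6x6 input to a 4x4 feature map, which is why CNN layers often use 'same' padding to keep the output the same size as the input.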

CNNs bear resemblance to standard filtering analysis, primarily through their shared use of convolutional operations. In both approaches, the convolution operation serves as a core mechanism for feature extraction. Moreover, both CNNs and standard filtering analysis operate hierarchically, capturing spatial hierarchies of features within the input. Through multiple convolutional layers, CNNs progressively extract higher-level features by amalgamating information from lower-level features, mirroring the hierarchical processing seen in standard filtering analysis.

While CNNs are well loved, they do have downsides:

Data Requirements: CNNs need large labeled datasets for training.
Overfitting Risk: CNNs are prone to overfitting, where they memorize training data rather than generalize to new examples.
Interpretability Challenges: While some XAI (explainable AI) techniques exist to interpret the inputs and outputs of CNNs, these tools are not perfect.
Tabular Data: CNNs are generally not an appropriate model choice for tabular datasets.

These are not the only downsides, but they are worth keeping in mind for your specific project.

Quick Code Block

CNNs in Keras 3.0 can be defined in around 10 lines of code:

    import keras
    from keras.layers import Conv2D, ELU

    # Define a sequential model
    model = keras.Sequential([
        # First convolutional layer with 32 filters of size 5x5 and same padding
        Conv2D(32, (5, 5), padding='same', strides=(1, 1)),
        # Exponential Linear Unit (ELU) activation function for non-linearity
        ELU(),
        # Second convolutional layer with 32 filters of size 5x5 and same padding
        Conv2D(32, (5, 5), padding='same'),
        # ELU activation function
        ELU(),
        # Third convolutional layer with 1 filter of size 5x5 and same padding
        Conv2D(1, (5, 5), padding='same'),
        # No activation since we are solving a regression problem
    ])
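To get a feel for how small this model is, the sketch below counts its trainable parameters by hand in plain Python. It assumes a single input channel, which the post does not state, and the helper `conv2d_params` is a hypothetical name introduced for illustration. The count includes one bias per filter (the Keras default); the ELU layers add no trainable parameters.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

in_channels = 1  # assumption: the input channel count is not given in the post

layers = [32, 32, 1]  # filters per Conv2D layer, as in the model above
total = 0
channels = in_channels
for n_filters in layers:
    n_params = conv2d_params(5, 5, channels, n_filters)
    print(f"Conv2D with {n_filters:>2} filters: {n_params} parameters")
    total += n_params
    channels = n_filters  # this layer's output channels feed the next layer

print(f"Total trainable parameters: {total}")  # 27265 under the assumption above
```

Because every layer uses 'same' padding with a stride of 1, the output grid has the same height and width as the input, so the model maps each input field to a prediction field of matching size.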

If you want to explore more on how this CNN was used to predict future pressure levels, take a look at the WeatherBench notebook:

ESS Research that uses CNNs

Figure from article #1 at left.

Additional Resources for Learning about CNNs

Thomas Martin is an AI/ML Software Engineer at the NSF Unidata Program Center. Have questions? Contact support-ml@unidata.ucar.edu or book an office hours meeting with Thomas on his Calendar.

Posted by Unidata News