Satellite imagery powers a number of downstream applications at Climate; its value extends beyond image-based scouting maps. We use it to derive new data layers that map and monitor environmental or crop conditions globally in near real time. Those data layers can then feed other agronomic models, such as those behind seed, fungicide, or irrigation recommendations.
A Critical Layer in Analysis Ready Data
Cloud and cloud shadow masks are the most critical components of what we refer to as the Unusable Data Mask (UDM), which ensures that only high-quality satellite data is used for downstream analysis. Optical satellite sensors cannot “see” through clouds, and cloud shadows introduce large artifacts in the reflectance values of ground targets. The size, shape, form, and height of clouds (and therefore of their shadows) vary by latitude, terrain, and micro-climate.
Analysis Ready Data (ARD) processes let us focus our attention on insights and on developing further recommendations for farmers. They require an accurate, scalable cloud and shadow detection solution at a resolution finer than the field level (a.k.a. the subfield scale).
Figure 1: Cloud and shadow problem on remote sensing image at subfield scale
While most satellite data providers usually supply UDM(s) with cloud information, cloud shadow information is often not available. In addition, the accuracy of vendor-provided UDM(s) is often not optimized at the subfield scale, creating greater potential for interpretation error in action-driven digital agriculture applications.
Cloud and Cloud Shadow Detection Model
Deep learning models, state-of-the-art methods to solve many image-based object detection and segmentation problems, are a promising technique for detecting cloud cover in satellite imagery. These models can extract both spatial features and spectral features to effectively detect clouds and their shadows.
An encoder-decoder framework (a variant of SegNet) is used for cloud segmentation. Figure 2 shows the framework: the encoder is a typical convolutional neural network without a fully connected layer, whereas the decoder maps the low-resolution feature map of the encoded image back to the original size. The model is also designed to handle images of arbitrary size, so it can be applied more flexibly.
Figure 2: Cloud detection with encoder-decoder framework
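To make the encoder-decoder size bookkeeping concrete, here is a minimal sketch that only traces feature-map sizes; it is not the model itself. It assumes an encoder built from `depth` stages of 2x2 pooling and a decoder with the same number of 2x upsampling stages, and it assumes that arbitrary input sizes are handled by padding each dimension up to a multiple of `2 ** depth` and cropping the output afterwards. The function names, the default depth, and the padding strategy are all illustrative assumptions, not details from the original design.

```python
def padded_size(size, depth):
    """Round `size` up to the next multiple of 2**depth (assumed padding strategy)."""
    factor = 2 ** depth
    return ((size + factor - 1) // factor) * factor


def trace_feature_map_sizes(height, width, depth=3):
    """List the (height, width) of the feature map after each stage.

    The first entry is the padded input, followed by `depth` encoder stages
    (each halves the size) and `depth` decoder stages (each doubles it).
    """
    h, w = padded_size(height, depth), padded_size(width, depth)
    sizes = [(h, w)]
    for _ in range(depth):   # encoder: 2x2 pooling halves each dimension
        h, w = h // 2, w // 2
        sizes.append((h, w))
    for _ in range(depth):   # decoder: upsampling doubles each dimension
        h, w = h * 2, w * 2
        sizes.append((h, w))
    return sizes


# A 100 x 70 pixel image is padded, encoded, and decoded back to the padded
# size; the prediction would then be cropped to the original 100 x 70.
for stage_size in trace_feature_map_sizes(100, 70, depth=3):
    print(stage_size)
```

Because the decoder returns to the padded input size, cropping the padding yields a per-pixel mask aligned with the original image regardless of its dimensions.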
For shadow segmentation, we stacked the raw bands together with a binary cloud mask band as inputs. The model still produces some false positive detections on hill shadows and bodies of water, so we also apply a post-processing operation to the detected cloud shadows.
We determined that, for more accurate shadow detection, we could leverage the relationship between a cloud and its shadow, which depends on the sunlight angle, the satellite sensor angle, and the cloud height at the time of image acquisition. Specifically, if no cloud is found close to a detected shadow, that shadow is treated as a false detection. By leveraging this geometry, the number of false positives can be substantially reduced.
Figure 3: Simplified geometric relationship between cloud and shadow
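The geometric check can be sketched roughly as follows. This is a simplified illustration, not the production post-processing: it assumes flat terrain and a nadir-looking sensor (so the sensor angle is ignored), measures the sun azimuth clockwise from north, and tries a hypothetical range of cloud heights because the true height is unknown. All function names, the height range, and the pixel tolerance are assumptions made for the example.

```python
import math


def shadow_offset(cloud_height_m, sun_zenith_deg, sun_azimuth_deg, pixel_size_m):
    """Return the (d_row, d_col) pixel offset from a cloud to its shadow.

    Assumes flat terrain and a nadir view. Rows increase southward and
    columns increase eastward, as in a north-up raster.
    """
    distance_m = cloud_height_m * math.tan(math.radians(sun_zenith_deg))
    azimuth = math.radians(sun_azimuth_deg)
    # The shadow falls on the side of the cloud facing away from the sun.
    d_east = -distance_m * math.sin(azimuth)
    d_north = -distance_m * math.cos(azimuth)
    return -d_north / pixel_size_m, d_east / pixel_size_m


def has_source_cloud(shadow_px, clouds, zenith, azimuth, pixel_size_m,
                     heights_m, tolerance_px):
    """True if some candidate cloud height puts a cloud pixel above this shadow."""
    row, col = shadow_px
    for height in heights_m:
        d_row, d_col = shadow_offset(height, zenith, azimuth, pixel_size_m)
        src_row, src_col = round(row - d_row), round(col - d_col)
        for dr in range(-tolerance_px, tolerance_px + 1):
            for dc in range(-tolerance_px, tolerance_px + 1):
                if (src_row + dr, src_col + dc) in clouds:
                    return True
    return False


def filter_shadows(shadow_pixels, cloud_pixels, zenith, azimuth, pixel_size_m,
                   heights_m=range(500, 10001, 500), tolerance_px=1):
    """Keep only shadow pixels that can be traced back to a detected cloud."""
    clouds = set(cloud_pixels)
    return [px for px in shadow_pixels
            if has_source_cloud(px, clouds, zenith, azimuth, pixel_size_m,
                                heights_m, tolerance_px)]


# Sun due south at 45 degrees zenith, 10 m pixels: the first shadow lines up
# with the cloud, the second has no cloud anywhere along the sun direction.
print(filter_shadows([(100, 50), (300, 300)], [(200, 50)], 45, 180, 10))
```

A hill shadow or a dark water body typically has no cloud along the back-projected sun direction, so a check of this kind would discard it.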
Model Evaluation
Improving our cloud detection model will greatly enhance the experience a farmer has with the Climate FieldView™ platform by enabling more accurate analysis of imagery. Our platform users, and all the services within the platform that benefit from our cloud detection algorithms, will see fewer falsely detected clouds and shadows.
Our deep learning-based cloud mask outperforms traditional machine learning-based cloud masks in both precision and recall. Approximately 80% of our customers will see improvements related to cloud masks if the new model is deployed! This also improves the performance of any downstream models that use satellite images as input in scientific research.
Figure 4: Two examples of cloud mask comparison with ML and DL methods
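For readers less familiar with the two metrics, the sketch below shows how pixel-level precision and recall could be computed for a binary mask against a hand-labeled reference. The function name and the toy masks are illustrative assumptions; they are not the evaluation data or code behind the comparison above.

```python
def precision_recall(predicted, reference):
    """Pixel-level precision and recall of a binary mask against a reference.

    Both masks are 2D lists of 0/1 values with the same shape.
    """
    true_pos = false_pos = false_neg = 0
    for pred_row, ref_row in zip(predicted, reference):
        for pred, ref in zip(pred_row, ref_row):
            if pred and ref:
                true_pos += 1
            elif pred:
                false_pos += 1   # flagged as cloud, but the reference says clear
            elif ref:
                false_neg += 1   # missed a cloud pixel
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Toy 2 x 3 masks: 1 marks a cloud pixel.
predicted_mask = [[1, 1, 0],
                  [0, 1, 0]]
reference_mask = [[1, 0, 0],
                  [0, 1, 1]]
print(precision_recall(predicted_mask, reference_mask))
```

Higher precision means fewer clear pixels wrongly masked out; higher recall means fewer cloud or shadow pixels slipping through to downstream analysis.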
Model Deployment at Global Scale
Climate consumes tens of thousands of satellite field images per day, so the cloud and cloud shadow identification algorithm needs to be deployed as part of a scalable ARD processing pipeline. We proposed a processing pipeline that cuts large images into tiles, applies the deep learning models to each tile, and stitches the results back together with few image artifacts, so that it scales to many image sizes.
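One common way to tile and stitch, shown here as a minimal sketch, is to cut overlapping tiles and average the predictions wherever tiles overlap, which softens seams at tile borders. This is an assumed strategy for illustration and not necessarily what the production pipeline does; `tile_starts`, `predict_tiled`, the tile size, and the overlap are hypothetical, and `predict_tile` stands in for the trained model. Plain nested lists keep the example dependency-free; a real pipeline would use an array library.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles covering [0, length) with at least `overlap` pixels shared."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, tile - overlap))
    starts.append(length - tile)  # clamp the last tile to the image edge
    return starts


def predict_tiled(image, predict_tile, tile=256, overlap=32):
    """Apply `predict_tile` to overlapping tiles and average the overlaps.

    `image` is a 2D list; `predict_tile` maps a 2D tile to per-pixel scores
    of the same shape.
    """
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for r0 in tile_starts(height, tile, overlap):
        for c0 in tile_starts(width, tile, overlap):
            patch = [row[c0:c0 + tile] for row in image[r0:r0 + tile]]
            scores = predict_tile(patch)
            for dr, score_row in enumerate(scores):
                for dc, score in enumerate(score_row):
                    total[r0 + dr][c0 + dc] += score
                    count[r0 + dr][c0 + dc] += 1
    return [[total[r][c] / count[r][c] for c in range(width)]
            for r in range(height)]


# Stand-in "model": a per-pixel brightness threshold.
def bright_pixels(patch):
    return [[1.0 if value > 0.5 else 0.0 for value in row] for row in patch]


image = [[((r * 7 + c) % 5) / 4 for c in range(7)] for r in range(10)]
stitched = predict_tiled(image, bright_pixels, tile=4, overlap=1)
print(len(stitched), len(stitched[0]))
```

Clamping the last tile to the image edge keeps every tile the same size, which is convenient for batching, at the cost of a larger overlap on the final row and column of tiles.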
The journey to develop cloud and cloud shadow detection solutions highlights a typical set of problems in which scaling our analysis efficiently across the globe is what enables production implementation of data-driven algorithms. This project requires cross-domain knowledge, including data science, machine learning, remote sensing, atmospheric science, agronomy, and engineering infrastructure. At Climate we are working on many more projects that improve our customers’ experience by applying data science in agriculture.