Innovative Approaches to Crop Classification Using AI
Chapter 1: Introduction to the AI4FoodSecurity Challenge
Machine learning competitions offer a good opportunity to explore new methods and concepts without being tied to a specific business context. Platforms such as Kaggle are well known for hosting open competitions in which teams can showcase their solutions.
In this article, I will discuss the AI4FoodSecurity challenge, in which our team, EagleEyes, achieved second place on the leaderboard. Here’s what this article will cover:
- Overview of the AI for Earth Observation Platform
- Challenge Goals: Classifying crop types in South Africa and Germany
- Dataset: Satellite imagery and vegetation indices
- Model: Utilizing the Pixel-Set Encoder with Lightweight Temporal Attention
- Evaluation of our team’s approach
The code for replicating our solution is accessible on GitHub.
Chapter 2: Understanding the AI4FoodSecurity Challenge
The AI4FoodSecurity Challenge was organized by AI4EO, an initiative aimed at merging Artificial Intelligence with Earth Observation. Supported by the European Space Agency and various industry and academic partners, AI4EO announces several challenges each year, offering well-curated training datasets and evaluations by experts. The top-performing teams receive rewards from sponsors, including access to computational resources, high-quality satellite imagery, and internship opportunities at prestigious research institutions.
In the AI4FoodSecurity challenge, participants were provided with satellite datasets from South Africa and Germany, including three different types of satellite data:
- Sentinel-1: Radar data
- Sentinel-2: Visual and infrared data
- Planet: Visual and infrared data
All datasets spanned an entire vegetation period and came with agricultural field boundaries and ground truth labels indicating the crop type of each field.
Chapter 3: Data Exploration and Challenges
The primary objective of the challenge was to determine the crop type of each field from satellite imagery. For South Africa, the test data came from the same year but a different location, which is a spatial shift. For Germany, the test data came from a neighboring area in the following year, which introduces both a spatial and a temporal shift.
Focusing on the South African dataset, we began by analyzing the frequency of each crop type. Notably, the "Lucerne/Medics" class was disproportionately represented compared to others.
Class imbalance can degrade model performance, because a model can score well on the training data by favoring the dominant class. One way to mitigate this is to use a weighted loss function during training, so that mistakes on rare classes cost more.
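To make the idea of a weighted loss concrete, here is a minimal sketch of how per-class weights could be derived from label frequencies. The function name `inverse_frequency_weights`, the scaling convention, and the label counts are all illustrative assumptions, not taken from our solution.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    The scaling is chosen so that a perfectly balanced dataset would give
    every class a weight of 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up label list (one label per field); these are NOT the real class counts.
labels = ["Lucerne/Medics"] * 6 + ["Wheat"] * 3 + ["Other crop"] * 1
weights = inverse_frequency_weights(labels)

for crop, weight in sorted(weights.items()):
    print(f"{crop}: {weight:.2f}")
```

The rarest class receives the largest weight. Weights like these would then multiply the per-sample loss of the corresponding class during training.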
Chapter 4: Normalized Difference Vegetation Index (NDVI) Analysis
Different crop types exhibit distinct visual characteristics throughout their growth cycles. For instance, wheat fields are vibrant green early in the season but turn yellow before harvest, while meadows maintain their green appearance throughout.
We computed the Normalized Difference Vegetation Index (NDVI), which combines the red and near-infrared bands and is commonly used to identify vegetation in satellite imagery.
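For reference, the standard definition is NDVI = (NIR - Red) / (NIR + Red). The sketch below computes it for a single pixel. The reflectance values are made up for illustration, and returning 0.0 for a pixel with no signal is a convention chosen here, not something prescribed by the index.

```python
def ndvi(nir, red):
    """Compute NDVI = (NIR - Red) / (NIR + Red) for one pixel."""
    denominator = nir + red
    if denominator == 0:
        # Convention chosen for this sketch: no signal maps to 0.0.
        return 0.0
    return (nir - red) / denominator


# Hypothetical reflectance values, for illustration only.
vegetation = ndvi(nir=0.50, red=0.08)
bare_soil = ndvi(nir=0.30, red=0.25)

print(f"vegetation: {vegetation:.2f}")
print(f"bare soil:  {bare_soil:.2f}")
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so its NDVI is closer to 1 than that of bare soil.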
Comparing the NDVI averaged over all fields of each crop type, we observed that Lucerne/Medics could be visually distinguished from the other four crops. We chose to use only the Planet and Sentinel-1 datasets, because cloud cover in the Sentinel-2 data complicated processing.
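The per-crop averaging can be sketched as follows, assuming each field has already been reduced to one NDVI value per acquisition date. The function name `mean_ndvi_by_crop` and the numbers are hypothetical.

```python
from collections import defaultdict


def mean_ndvi_by_crop(fields):
    """Average NDVI time series per crop type.

    `fields` is a list of (crop_label, ndvi_series) pairs. All series are
    expected to have the same length; zip() would silently truncate to the
    shortest series otherwise.
    """
    grouped = defaultdict(list)
    for crop, series in fields:
        grouped[crop].append(series)
    return {
        crop: [sum(values) / len(values) for values in zip(*series_list)]
        for crop, series_list in grouped.items()
    }


# Made-up per-field NDVI series with three acquisition dates each.
fields = [
    ("Lucerne/Medics", [0.6, 0.7, 0.7]),
    ("Lucerne/Medics", [0.8, 0.7, 0.5]),
    ("Wheat", [0.3, 0.8, 0.2]),
]
profiles = mean_ndvi_by_crop(fields)

for crop, profile in profiles.items():
    print(crop, [round(value, 2) for value in profile])
```

Plotting one such profile per crop type gives the kind of comparison described above.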
Chapter 5: Model Implementation and Evaluation
We decided to use an existing model, the Pixel-Set Encoder with Lightweight Temporal Attention (PselTae). Its architecture rests on two steps:
- Randomly extract pixels from agricultural fields and calculate their summary statistics.
- Analyze time series data of these statistics using lightweight temporal attention.
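The first step can be sketched in a few lines. This is a simplification under stated assumptions: it shows only the random sampling and the pooling into summary statistics, whereas the actual encoder also passes pixels through learned layers. The names `sample_pixel_set` and `summarize`, the set size, and the pixel values are hypothetical.

```python
import random
import statistics


def sample_pixel_set(pixels, set_size, rng):
    """Draw `set_size` pixels at random from one field at one date.

    Falls back to sampling with replacement when the field has fewer
    pixels than `set_size`.
    """
    if len(pixels) >= set_size:
        return rng.sample(pixels, set_size)
    return rng.choices(pixels, k=set_size)


def summarize(pixel_set):
    """Return per-band means followed by per-band standard deviations."""
    bands = list(zip(*pixel_set))
    means = [statistics.fmean(band) for band in bands]
    stds = [statistics.pstdev(band) for band in bands]
    return means + stds


rng = random.Random(0)
# Made-up field with five pixels and two spectral bands per pixel.
field_pixels = [(0.10, 0.40), (0.12, 0.42), (0.08, 0.38), (0.11, 0.45), (0.09, 0.41)]

pixel_set = sample_pixel_set(field_pixels, set_size=3, rng=rng)
features = summarize(pixel_set)
print(features)
```

Repeating this for every acquisition date turns each field into a time series of fixed-length feature vectors, regardless of the field's shape or size.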
PselTae has several advantages over traditional neural network architectures. It reduces sensitivity to spurious spatial patterns by not enforcing spatial dependencies between neighboring pixels. Additionally, its lightweight attention mechanism requires less training data, making it suitable for datasets with fewer than 10,000 samples.
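To give an intuition for the temporal attention step, here is a heavily simplified, single-head sketch. It is not the model's actual implementation: the real mechanism uses learned parameters, whereas this sketch scores each date's feature vector directly against a fixed `query` that I made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    shift = max(scores)
    exps = [math.exp(score - shift) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attention_pool(sequence, query):
    """Collapse a time series of feature vectors into one vector.

    Each time step is scored by its scaled dot product with `query`;
    the output is the softmax-weighted sum of the time steps.
    """
    dim = len(query)
    scores = [
        sum(q * x for q, x in zip(query, step)) / math.sqrt(dim)
        for step in sequence
    ]
    weights = softmax(scores)
    pooled = [
        sum(weight * step[i] for weight, step in zip(weights, sequence))
        for i in range(dim)
    ]
    return weights, pooled


# Made-up features for three acquisition dates, two features each.
sequence = [[0.1, 0.2], [0.9, 0.8], [0.3, 0.1]]
query = [1.0, 1.0]  # hypothetical; this would be learned in a real model

weights, pooled = attention_pool(sequence, query)
print([round(weight, 3) for weight in weights])
print([round(value, 3) for value in pooled])
```

The attention weights indicate which dates contribute most to the pooled representation, which is then passed on to a classifier.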
To address class imbalance, we used focal loss, and we employed 5-fold cross-validation to make the most of the limited dataset. We trained a separate PselTae model for each satellite dataset because their spatial and temporal resolutions differ. Finally, we combined the predictions from all models into a single crop type prediction per field.
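Two of these ideas are easy to show in code. The sketch below implements the standard focal loss for a single sample, and combines per-model class probabilities by simple averaging. The value `gamma=2.0` and the choice of averaging are assumptions for illustration; the exact settings and combination rule of our solution are in the GitHub repository.

```python
import math


def focal_loss(probabilities, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    `probabilities` are predicted class probabilities and `target` is the
    index of the true class. With gamma = 0 this is plain cross-entropy.
    """
    p_t = max(probabilities[target], 1e-12)  # guard against log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def average_predictions(per_model_probabilities):
    """Average class probabilities across models (one list per model)."""
    n_models = len(per_model_probabilities)
    return [sum(column) / n_models for column in zip(*per_model_probabilities)]


# A confidently correct prediction is down-weighted far more than a poor one.
easy = focal_loss([0.9, 0.05, 0.05], target=0)
hard = focal_loss([0.2, 0.5, 0.3], target=0)
print(f"easy sample: {easy:.4f}, hard sample: {hard:.4f}")

# Made-up probabilities for one field from two models, over two classes.
combined = average_predictions([[0.6, 0.4], [0.2, 0.8]])
predicted_class = combined.index(max(combined))
print(combined, predicted_class)
```

Because well-classified samples contribute little to the loss, training focuses on the hard samples, which are often those from rare classes.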
Chapter 6: Results and Future Directions
We present our results for crop type classification in South Africa:
Because the test labels are private, we cannot compare our predictions against ground truth directly. However, our score placed us second on the leaderboard.
For further insights, check out our implementation on GitHub:
- GitHub - crlna16/ai4foodsecurity: AI4FoodSecurity Challenge: Identify crops within season and in…
For additional context on environmental data science, refer to my previous article:
- Environmental Data Science: An Introduction: Examples, challenges, and perspectives for working with environmental data.
References
- Data License: Data processed from MLHUB. Planet data CC-BY-SA-2.0; Sentinel-1 data CC-BY-4.0.