AI Models and Agriculture: How to Ensure Accurate Data Collection

August 07, 2023

The first blog in the Artificial Intelligence in Agriculture series, Artificial Intelligence: Leading the Way Forward for Precision Agriculture, covered the evolution of Precision Agriculture and how AI tools are pushing technologies forward and making them more accessible to farmers. In this second blog, AI Models and Agriculture: How to Ensure Accurate Data Collection, we focus on the particular characteristics of the agriculture industry that make applying AI methodologies more complex than in other industries.

The blog series is based on Taranis’ recent white paper, Agriculture and Artificial Intelligence: How to Effectively Implement AI in Precision Agriculture.

Data, Data, and More Data

The most important factor in an AI model's performance is the volume and quality of its training data. The model's ability to recognize and classify objects correctly depends on how closely the training data matches the data it encounters in the real world. Therefore, gathering a rich and diverse dataset and then annotating it (i.e., labeling text, video, image, or audio data so that machines can make sense of it) takes up a significant share of AI practitioners' time and effort.

In an agricultural setting, quality crop imagery means capturing data from a wide range of scenarios and environmental conditions, including varying lighting, different seed varieties, soil types, and geographical regions. Certain inherent characteristics of agriculture create obstacles in the collection of this necessary data. Below, we will discuss two of these challenges.

Timing is Everything

The first hurdle during data collection in agriculture is the short time window in which we must collect the data. Subsequently, we must train the AI model in a similarly short time frame.

Let’s look at the scenario of smart agriculture technology used in the early detection of weeds and diseases to understand this issue better.

When using imagery to detect weeds and diseases, AI practitioners must collect data at specific intervals during the plants' emergence and growth cycle. Timing is key: problems must be detected early enough to alert ag advisors and farmers. The images therefore need to be collected at the growth stages of the crop when emerging crops, weeds, and diseases are visible.

Row crops, such as corn and soybeans, grow quickly during the season. As a result, the window for gathering weed and disease data at a given growth stage lasts only one to three weeks each year.

The challenge of getting to fields during the brief windows and collecting sufficient training data, i.e., aerial imagery, is further compounded by two more factors:

The time window varies for each field, depending on the planting date and crop type.
To be useful, the images must be collected across a wide geographical region, e.g., the U.S. Corn Belt.

Collecting the data is challenging because of these geographic and time requirements, yet critical for AI models to perform properly.
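To make the scheduling problem concrete, here is a minimal sketch of how a per-field collection window could be derived from planting date and crop type. The field IDs, the `WINDOW_BY_CROP` table, and its day offsets are hypothetical values chosen for illustration; they are not agronomic guidance and not a description of how Taranis schedules flights.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical values: days after planting when the imaging window opens,
# and how many days it stays open. Illustrative only.
WINDOW_BY_CROP = {
    "corn": (14, 21),
    "soybean": (10, 14),
}


@dataclass(frozen=True)
class Field:
    field_id: str
    crop: str
    planting_date: date


def collection_window(field: Field) -> tuple[date, date]:
    """Return the (start, end) dates of the imaging window for one field."""
    offset_days, length_days = WINDOW_BY_CROP[field.crop]
    start = field.planting_date + timedelta(days=offset_days)
    return start, start + timedelta(days=length_days)


def fields_due(fields: list[Field], today: date) -> list[str]:
    """List the IDs of fields whose window is open on the given day."""
    due = []
    for field in fields:
        start, end = collection_window(field)
        if start <= today <= end:
            due.append(field.field_id)
    return due


fields = [
    Field("IA-001", "corn", date(2023, 4, 25)),
    Field("IL-014", "soybean", date(2023, 5, 10)),
    Field("NE-007", "corn", date(2023, 5, 20)),
]
print(fields_due(fields, date(2023, 5, 22)))  # ['IA-001', 'IL-014']
```

Even in this toy example, three fields yield three different windows, which illustrates why covering a region as large as the Corn Belt is a logistics problem as much as a modeling one.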

Data: Don’t Drift Away

In the previous AI in Agriculture blog, we explained that AI systems process large amounts of labeled data, analyze the data by searching for correlations and patterns, and use those patterns to predict future scenarios. The AI learning process involves collecting data and then creating rules for how to turn the data into actionable information. The rules, known as algorithms, provide computing devices with specific instructions for how to complete a task.

A significant change in the input data shifts the distribution of the data the model sees. If the data collected in the field differs from the data the model was trained on, the model might not function optimally.

This phenomenon is called data drift. Data drift is a significant issue in agriculture because biological systems are dynamic and constantly in flux. When data drift occurs, the model must be retrained, or it will not perform properly.
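One common way to check for drift is to compare the distribution of a feature in the training data with the same feature in newly collected data. The sketch below uses the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions). The feature (a per-image score such as mean greenness), the synthetic values, and the 0.2 threshold are all assumptions made for illustration, not a method described in the white paper.

```python
import bisect
import random


def ks_statistic(reference, current):
    """Largest absolute gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    ref_sorted, cur_sorted = sorted(reference), sorted(current)
    largest_gap = 0.0
    # Both empirical CDFs only change at observed values, so checking those is enough.
    for x in ref_sorted + cur_sorted:
        cdf_ref = bisect.bisect_right(ref_sorted, x) / len(ref_sorted)
        cdf_cur = bisect.bisect_right(cur_sorted, x) / len(cur_sorted)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_cur))
    return largest_gap


DRIFT_THRESHOLD = 0.2  # hypothetical cut-off; tune per feature in practice


def has_drifted(reference, current, threshold=DRIFT_THRESHOLD):
    """Flag drift when the KS statistic exceeds the chosen threshold."""
    return ks_statistic(reference, current) > threshold


# Synthetic stand-ins for a per-image feature in three batches of imagery.
rng = random.Random(0)
training = [rng.gauss(0.50, 0.10) for _ in range(500)]
same_conditions = [rng.gauss(0.50, 0.10) for _ in range(500)]
next_season = [rng.gauss(0.80, 0.10) for _ in range(500)]

print(has_drifted(training, same_conditions))
print(has_drifted(training, next_season))
```

A check like this only signals that the inputs have changed; deciding whether the change is large enough to justify retraining still depends on how the model's accuracy responds.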

Here are two examples of factors that change from season to season, potentially causing data drift in agricultural AI models:

Weeds – Weed species in a field may change over time for various reasons: strains of weed species can develop resistance to herbicides, new herbicides become available on the market, or growers change their farming practices, for example by adopting no-till. These changes alter the weed distribution and, hence, reduce how well the existing training dataset represents what is actually in the field.

Crop Diseases – Disease prevalence may fluctuate, and new diseases may appear and spread during a growing season. For example, Tar Spot, a fungal disease of corn, was first discovered in the U.S. in 2015 and has since spread throughout much of the Midwest. Adopting different seed strains may also increase or decrease the prevalence of certain diseases.

In the diagram, we see how data can drift over the course of two seasons. It contains a collection of images that show symptoms of the disease Brown Spot in soybean, gathered over two successive seasons, and projected onto a two-dimensional graph. The color of each data point corresponds to the season in which the data was collected.

Although the images are all from the same crop (soybeans), and exhibit the same disease symptoms (Brown Spot), there is a clear drift in the distribution of the data from one season to the next, and this drift is likely to increase over successive seasons.
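A two-dimensional view of this kind can be produced by reducing a numeric representation of each image to two dimensions and coloring the points by season. The source does not say which projection was used for the diagram, so the sketch below assumes principal component analysis (PCA) and uses synthetic 16-dimensional vectors as stand-ins for image features; the sizes and the deliberate shift between seasons are illustrative only.

```python
import numpy as np


def project_to_2d(features: np.ndarray) -> np.ndarray:
    """Project each row onto the first two principal components (PCA via SVD)."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Synthetic stand-ins for image feature vectors from two seasons.
rng = np.random.default_rng(0)
season_1 = rng.normal(loc=0.0, scale=1.0, size=(200, 16))
season_2 = rng.normal(loc=1.0, scale=1.0, size=(200, 16))  # shifted on purpose

points = project_to_2d(np.vstack([season_1, season_2]))
season = np.array([1] * 200 + [2] * 200)

# Distance between the two seasons' centroids in the projected plane.
centroid_1 = points[season == 1].mean(axis=0)
centroid_2 = points[season == 2].mean(axis=0)
shift = float(np.linalg.norm(centroid_2 - centroid_1))
print(points.shape, round(shift, 2))
```

Plotting `points` as a scatter chart colored by `season` would give a picture comparable to the diagram, and the centroid distance offers one rough number for how far the data has moved between seasons.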

Because of data drift, AI models must be retrained continuously so that they adapt to the changing data distributions and their performance does not degrade.

So What’s the Solution?

The application of AI in agriculture can be complex because of the dynamic nature of the biological systems at play. The spread of weeds and diseases is difficult to predict and varies from season to season. Each field's growing season is different, which makes the window for collecting data hard to plan.

How do we capture the right data at the right time?

Stay tuned for our next blog, where we delve into the methods Taranis uses to overcome the complexities of applying AI in agriculture. You can also read the entire white paper here.
