Building a flexible high-volume pipeline
Uploading large volumes of data and training machine learning models require a lot of processing power, which can be costly. As a growing startup, Taranis did not want to invest millions of pounds in infrastructure, but still needed to handle large volumes of data. “Each of our drone flights collects around 10,000 images, and each image is between 10MB and 20MB,” says Eli Bukchin, co-founder and CTO at Taranis. “We looked for a way of getting those images into our system as quickly as possible, as well as improving our machine learning training performance.”
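To put those figures in perspective, the per-flight upload volume can be worked out directly from the numbers Eli quotes. The short sketch below is only illustrative arithmetic: the function name is hypothetical and decimal units (1GB = 1,000MB) are assumed.

```python
# Figures quoted in the text: ~10,000 images per flight, 10MB to 20MB each.
IMAGES_PER_FLIGHT = 10_000
MIN_IMAGE_MB = 10
MAX_IMAGE_MB = 20


def flight_volume_gb(images: int, image_mb: float) -> float:
    """Return the total upload size in GB (assumes 1 GB = 1,000 MB)."""
    return images * image_mb / 1_000


low_gb = flight_volume_gb(IMAGES_PER_FLIGHT, MIN_IMAGE_MB)
high_gb = flight_volume_gb(IMAGES_PER_FLIGHT, MAX_IMAGE_MB)
print(f"One flight uploads roughly {low_gb:.0f} to {high_gb:.0f} GB")
```

On these assumptions, a single drone flight produces somewhere between 100GB and 200GB of imagery.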
“Agriculture is a seasonal business, so we have certain months of peak activity followed by quiet months, and we also have peaks throughout the day. During quiet times, we can scale back all our high-level Compute Engine GPU resources automatically so we don’t have to prepare our system in advance.” Eli Bukchin, Co-founder and CTO, Taranis
To do that, Taranis migrated to GCP. “The Google data centers in South America, Australia, and Europe offer really fast connectivity, and can handle the large volumes of drone images we upload. We have a throughput of around 30TB in total,” says Eli. “For processing the images once they are uploaded, we use V100 GPUs on Compute Engine. We have the capacity to easily scale from 1,000 to 4,000 V100s, and the system scales automatically when new images arrive. Information is deduced from the images, which is then uploaded to a Cloud SQL database before being served to our customers. We also have an image processing pipeline on Kubernetes Engine for our satellite images, in addition to using Cloud Functions and Cloud Pub/Sub.”
Flexibility and scalability are key to the Taranis image serving pipeline. “Agriculture is a seasonal business, so we have certain months of peak activity followed by quiet months, and we also have peaks throughout the day,” says Eli. “During quiet times, we can scale back all our high-level Compute Engine GPU resources automatically so we don’t have to prepare our system in advance.”
Preventing crop loss with AI
“For our machine learning model training pipeline, we use TensorFlow,” Eli explains. “Choosing TensorFlow has helped us develop our models rapidly, as there is a lot of support available through the open source community. To develop our models, we use tens of millions of photographs that we have collected over the past year and a half, which have been analyzed and tagged. Each photo might have up to a thousand items of interest, such as insect damage or leaf discoloration, so the data volumes are really significant. In total, we have processed around 100 million distinct features in around 700,000 images.”
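As a rough sanity check on the last two figures in that quote, the average number of tagged features per image can be estimated. The sketch below simply divides the quoted totals; both are approximate, so the result is an order-of-magnitude estimate rather than a measured value.

```python
# Approximate totals quoted in the text.
TOTAL_FEATURES = 100_000_000  # "around 100 million distinct features"
TOTAL_IMAGES = 700_000        # "around 700,000 images"

# Mean number of tagged features per processed image.
avg_features_per_image = TOTAL_FEATURES / TOTAL_IMAGES
print(f"About {avg_features_per_image:.0f} features per image on average")
```

That works out to roughly 143 features per image, which fits with the statement that a single photo might contain up to a thousand items of interest.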
The insights provided through Taranis’s dashboards give farmers the information they need to intervene early and prevent crop loss. “We enable farmers to target problems with concrete solutions, like adding fertilizer in a particular area that is low in nutrients. We are also looking to integrate our platform with autonomous farm machinery so a farmer can deploy the solution in a few clicks.”
A faster, more stable infrastructure
Thanks to GCP, Taranis can now upload its drone images far more quickly. “Before, uploading was three or four times slower, and could take up to a day rather than a few hours,” says Eli. “The difference is really substantial.”
Moving to GCP has also helped Taranis to reduce time spent on operations and simplify system updates. “We now release features almost continuously,” says Eli. “With Kubernetes, we can run parallel new and old versions, so we have fewer downtimes and we don’t have to schedule updates. This allows us to improve the product faster. We can test better and get faster feedback on new products and services.”
“Our sales team doesn’t need to worry about bringing a new customer on board; our infrastructure can scale to meet the workload generated by millions more acres. We are also able to work with customers in locations that were previously unreachable,” he adds.
Migrating to GCP has also helped Taranis to reduce its costs. “Our cost per photo taken is now ten times lower,” says Eli.
Now, Taranis is planning to start using more data analytics tools on GCP. “We’re currently considering how Cloud Bigtable, BigQuery, and Cloud Dataflow might fit with our business needs,” says Eli. “As we run a lot of training instances, we need a way to analyze that data better, so we’re currently building the GCP infrastructure to gain BI insights into our own system.”
“Our focus for the upcoming year is further geographical expansion, and continuously improving our machine learning models, as well as working on categorizing new ways to detect diseases,” says Ofir Schlam, co-founder and CEO at Taranis. “Thanks to Google Cloud Platform, we’re able to focus on the bigger picture.”