Bayer Crop Sciences Cultivates Analytics for Crops, Seeds, and Digital Farming
Artificial intelligence for Precision Ag, and continuous learning and improvement
The End-to-end Data Pipeline
Michelle Lacy, data strategy lead for R&D in the Plant Biotechnology Division, finds her place between business and IT. "My team is trying to bring data, data science, and IT together."
She says that the reason a data scientist or IT professional is doing something can sometimes get lost. "They're more focused on building that product or model. So my job is to bring that context and keep them focused. 'You're doing this because we're trying to protect our soybean franchise or we are working with The Climate Corporation to provide digital solutions to the farmers.'"
This data-centric philosophy is relatively new to Bayer Crop Sciences, says Ms. Lacy. "Because we're talking about tailored solutions, and a tailored solution combines all of the pillars within Bayer, you need an end-to-end data pipeline. What is your crop protection, your germplasm, your trait? You've got to bring all those together. Starting at discovery all the way to handing off to product supply, you've got to know what the data flow is end-to-end."
Visualization, Virtualization, Adoption
Spotfire software has been used at Bayer since 2002, even before it was a TIBCO product. The same is true for data virtualization software. Where TIBCO Spotfire visual analytics solved the problem of getting users closer to their data, TIBCO Data Virtualization software solved the problem of understanding data sources and schemas and bringing them together without altering the physical data source.
When Bayer first acquired Spotfire software, there were 40 users; today there are 14,000. In 2009, the company staged a bake-off and invited TIBCO's biggest competitors. "We gave them all the same dataset, and we had a checklist," says Lacy. The company knew how Spotfire software performed and wanted to see how the others compared.
"Scalability was huge, and I’m not going to pick on any of the vendors, but one of them dropped out extremely quickly. The way TIBCO uses infrastructure and architecture makes it very attractive, because some of the competitors performed just as well as TIBCO, but you need a lot of resources to keep them running because they store everything. You always run out of disk space. TIBCO uses client and server resources and database resources, all three of those, so you can have a very performant platform that doesn’t cost, you know, a trillion dollars. Spotfire software blows everyone out of the water."
In 2010, the Bayer team knew the Spotfire solution filled a gap, but didn't know how big the gap was or how many people, across a multitude of backgrounds, it affected. After a Spotfire "showcase," usage grew to 300 within the year. A plan was needed to keep the solution well supported. "You really have to pay attention to make sure the hardware and the performance is at a high level. And train users," says Lacy. "You have to keep showing the power of the platform to keep adoption on the rise: 'This is the power of it. This is how it helps you.'"
Artificial Intelligence for Precision Agriculture
Bayer's customers are mostly tech-savvy farmers who strive for very high, very cost-effective productivity. The trend everyone is chasing, "Precision Ag," which places crops in the right place at the right time, requires a lot of data.
Traditionally a farmer would plant a field, spray the entire field with fertilizer and pesticide, and use full-coverage irrigation. But fields are not uniform, and plants need different conditions depending on where they are in the field. Some don't need fertilizer, some need more water, some less.
"Using image analytics, you can look at your field as a whole and then overlay the different data layers: soil, irrigation, fertilization, and say, 'All right. I understand my field now,'" says Lacy.
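The layer-overlay idea can be illustrated with a minimal Python sketch. It is not Bayer's image analytics platform: the grid layout, the layer names (`soil_moisture`, `nitrogen`), the thresholds, and the `CellPlan` type are all hypothetical, chosen only to show how stacking per-cell layers turns one uniform field treatment into per-cell decisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellPlan:
    """Treatment decision for one grid cell of the field (hypothetical type)."""
    row: int
    col: int
    irrigate: bool
    fertilize: bool


def overlay_layers(soil_moisture, nitrogen, moisture_min=0.30, nitrogen_min=20.0):
    """Overlay two same-shaped grid layers and decide a treatment per cell.

    The thresholds are illustrative assumptions, not agronomic advice.
    """
    same_rows = len(soil_moisture) == len(nitrogen)
    same_cols = all(len(a) == len(b) for a, b in zip(soil_moisture, nitrogen))
    if not (same_rows and same_cols):
        raise ValueError("layers must have the same grid shape")

    plans = []
    for r, (moisture_row, nitrogen_row) in enumerate(zip(soil_moisture, nitrogen)):
        for c, (moisture, n_level) in enumerate(zip(moisture_row, nitrogen_row)):
            plans.append(
                CellPlan(
                    row=r,
                    col=c,
                    irrigate=moisture < moisture_min,    # dry cell: needs water
                    fertilize=n_level < nitrogen_min,    # low nitrogen: needs fertilizer
                )
            )
    return plans


# Two made-up 2x2 layers covering the same field.
soil_moisture = [[0.25, 0.40], [0.35, 0.10]]
nitrogen = [[15.0, 30.0], [18.0, 25.0]]

plans = overlay_layers(soil_moisture, nitrogen)
for plan in plans:
    print(plan)
```

In this toy field, only the cells that fall below a threshold are flagged, so water and fertilizer go where the layers say they are needed rather than across the whole field.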
Bayer uses drones taking high-definition pictures to monitor crops. "And are you going to have someone look at each of these images? That just kills the ROI of that entire effort. With AI and algorithms, we can bring in these images that are huge and generate the data that would be very difficult for a human to do by themselves. That is pretty powerful, and it's the foundation of our image analytics platform."
Analytics Use Cases
Image analytics can also help Bayer understand plant traits. For areas damaged by tornadoes, for example, drone images, stitched together using AI, show which crops stood up and which fell over during storms. "We can say, 'These are the ones that we're going to take forward because with all the wind damage, they're still standing.' It's hard to see that at our level, but overhead images are great."
Bayer also uses Spotfire analytics to track crops around the world and act early: determining whether any equipment is needed, monitoring weather conditions, checking the status of planting, and tracking other factors.
The chemistry organization had only one statistician. "People were doing what we called 'practicing statistics without a license,' a lot of weird stuff," says Lacy. With Spotfire software's use of the R language, Lacy's team got the statistician to put models in R, and they deployed them as standard applications along with data virtualization. "Everyone used the same models. It was consistent, and worked with just a button push. We got them to stop the pipeline they were using, and we simplified it with Spotfire analytics."
FAIR Principles for Everyone
Bayer's data principles, findable, accessible, interoperable, and reusable (FAIR), set out a sort of bill of rights. Users should be able to find their data very easily. Following cybersecurity policies, they should be able to access the data needed to make decisions. Data should be interoperable, so it can be used across all kinds of users. And it should be reusable. "We don't want folks redoing the same experiment to generate their own data," says Lacy. "You want that data out there so it can be reused for different types of analyses. If you're in environmental protection, you look at the data very differently than someone in chemistry, but it's the same data type."
Regarding data types, Ms. Lacy also comments on the need for an open platform that has no restrictions. "Different functional groups generate data differently, so you have to have a platform that moves at the speed of the organization."
Process Governance and Improvement
With 14,000 users all trying to be innovative, how do you make sure they're not coming to the wrong conclusions or looking at data in a way that it wasn't meant to be looked at? Ms. Lacy says that governance is a huge part of that, along with creating a culture that understands all the personas: the data scientist and data engineer, and the research scientist and operations person. Her team wants to understand what they all do with data, along with their pain points and resource gaps, and, if there's a better way, help them find it.
"We don't have unlimited data scientists and statisticians, but Spotfire software gives us a platform where we can reuse knowledge, put an algorithm behind a button, so they can feel confident that they're using the correct algorithm for the data even though they don't have a data scientist in their group. Those are the ways that we try to govern. We don't govern the people, we govern the process."
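The idea of putting "an algorithm behind a button" and governing the process rather than the people might look something like the following Python sketch. This is not Spotfire's API, and the article describes the real models as written in R; the registry `APPROVED`, the decorator `approved_for`, the function `run_button`, and the data type names are all hypothetical, used only to show how a vetted algorithm can be bound to a data type so that users never choose the method themselves.

```python
from statistics import mean, median

# Hypothetical registry: one vetted algorithm per data type.
APPROVED = {}


def approved_for(data_type):
    """Decorator a statistician would use to publish a vetted algorithm."""
    def register(func):
        APPROVED[data_type] = func
        return func
    return register


@approved_for("yield_bu_per_acre")
def summarize_yield(values):
    # Illustrative choice: the mean is taken as the agreed summary for yield.
    return round(mean(values), 2)


@approved_for("residue_ppm")
def summarize_residue(values):
    # Illustrative choice: the median resists outliers in residue readings.
    return median(values)


def run_button(data_type, values):
    """What the 'button' does: look up and run the approved algorithm.

    Users pass only their data and its type; the method is fixed by the registry.
    """
    try:
        algorithm = APPROVED[data_type]
    except KeyError:
        raise ValueError(f"no approved algorithm for {data_type!r}") from None
    return algorithm(values)


print(run_button("yield_bu_per_acre", [50.0, 60.0, 70.0]))
print(run_button("residue_ppm", [1, 100, 3]))
```

Because every group calls the same registered function for a given data type, results stay consistent across teams, and an unregistered data type fails loudly instead of being analyzed with an ad hoc method.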
Continuous Learning for Continuous Improvement
Bayer employs a cyclical test-and-learn pipeline. "You do your experiment, you take your measurements, you get the results, you make a decision. You're always learning. Even after we have a product out the door, what did we learn, what did the farmers tell us about it, we feed that data right back into it."
Understanding and feeding the entire data pipeline also brings Bayer users closer, because knowing who's ahead of you (in terms of analyses and results) and who’s behind you in the pipeline can make a huge difference on many levels.