As the pace of business continues to accelerate, software and data science teams find themselves under pressure to deliver more business value in less time. Software publishers and enterprise development teams have attempted to address the issue with Agile development practices, which are cross-functional in nature, but Agile alone does not guarantee that the code running on a developer's machine will work the same way in production. DevOps closes the gap by promoting collaboration and project visibility across development and IT operations, which accelerates the delivery of better-quality software.
Data scientists and data science teams often face challenges that are similar to the challenges software development teams face. For example, some of them lack the cross-functional collaboration and support they need to ensure their work is timely and actually provides business value. In addition, their algorithms and models don't always operate as they should in production because conditions or the data have changed.
"For all the work data scientists put into designing, testing and optimizing their algorithms, the real tests come when they are put into use," said Michael Fauscette, chief research officer at business solutions review platform provider G2 Crowd. "From Facebook's newsfeed to stock market 'flash crashes,' we see what happens when algorithms go bad. The best algorithms must be continuously tested and improved."
DevOps practices can help data scientists address some of the challenges they face, but they're not a silver bullet. Data science has some notable differences that also need to be considered.
Following are a few things data scientists and their organizations should consider.
Achieve More Consistent Results and Predictability
Like application software, models may run well in a lab environment, but perform differently when applied in production.
"Models and algorithms are software [so] data scientists face the traditional problems when moving to production – untracked dependencies, incorrect permissions, missing configuration variables," said Clare Gollnick, CTO and chief data scientist at dark web monitoring company Terbium Labs. "The ‘lab to real world’ problem is really a restatement of the problem of model generalization. We build models based on historical, sub-sampled data [and then expect that model] to perform on future examples even if the context changes over time. DevOps can help close this gap by enabling iterative and fast hypothesis testing [because] 'fail fast' has nice parallels to the ‘principle of falsifiability’ in science. If [a hypothesis] is wrong, we should reject [it] quickly and move on."
One reason a model may fail to generalize is overfitting, which occurs when a model is so complex that it starts finding patterns in noise. To prevent that result, data scientists use methods including out-of-sample testing and cross-validation. Those methods, which are familiar to data scientists, are part of the model-building process, according to Jennifer Prendki, head of Search and Smarts Engineering at enterprise software company Atlassian.
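Cross-validation, one of the methods Prendki points to, estimates how a model will do on data it hasn't seen by repeatedly holding part of the training set out. Here is a minimal sketch using scikit-learn; the synthetic dataset and the choice of classifier are stand-ins, not a prescription:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder data standing in for real training examples.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

model = RandomForestClassifier(random_state=0)

# 5-fold cross-validation: each fold is held out once as an
# out-of-sample test set, which helps expose overfitting.
scores = cross_val_score(model, X, y, cv=5)
print(f"Mean accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```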
"The biggest challenge, model-wise, comes from non-stationary data. Due to seasonality or other effects, a model that performed well yesterday can fail miserably tomorrow," she said. "Another challenge comes from the fact that models are trained on historical (static) data and then applied in runtime. This can lead to performance issues as data scientists are not used to thinking about performance."
Enable More Consistent Processes
DevOps gives developers and IT operations professionals visibility into what the other is doing, enabling lifecycle views and approaches rather than the traditional hand-offs that tend to cause dissension, finger-pointing and rework. Data scientists are involved in problem-solving lifecycles from the formation of a hypothesis to hypothesis testing, data collection, analysis and insights, but they may lack the collaboration and support they need from other parts of the organization.
"[Data scientists] may be amazing at designing and training a model, but making it production-ready, tested and deployable easily is generally something new," said Russell Smith, co-founder and CTO of quality assurance platform provider Rainforest QA. "If you find a data scientist that can do that, you've found the new unicorn."
Data failures can happen randomly over time, particularly when one is working in a system that uses external data. According to Terbium Labs' Clare Gollnick, an extreme or a problematic data point might not be observed for days or weeks, so a data-focused DevOps culture needs to rely even more heavily on continuous monitoring as part of the lifecycle and feedback loop.
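In practice, a monitor along these lines can be as simple as flagging individual values that fall far outside the training distribution the moment they arrive, rather than waiting for a scheduled test run. The sketch below is hypothetical; the z-score threshold and the alerting behavior are placeholder assumptions:

```python
import numpy as np

class ExtremeValueMonitor:
    """Flags incoming values far outside the training distribution."""

    def __init__(self, training_values, z_threshold=6.0):
        self.mean = float(np.mean(training_values))
        self.std = float(np.std(training_values))
        self.z_threshold = z_threshold  # arbitrary choice for this sketch

    def check(self, value):
        z = abs(value - self.mean) / self.std
        if z > self.z_threshold:
            # In production this would page someone or log to a dashboard.
            print(f"Extreme value {value:.2f} (z={z:.1f}) -- investigate")
        return z

monitor = ExtremeValueMonitor(np.random.default_rng(1).normal(size=10_000))
monitor.check(8.5)  # an outlier that may only appear weeks after deployment
```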
Machine learning systems need that same ongoing attention because they must be retrained on a regular basis. The frequency with which a model is retrained is usually set arbitrarily by the data scientist who developed it, according to Atlassian's Jennifer Prendki.
"Machine Learning models are probabilistic [so] a model doesn’t suddenly ‘stop working;’ its predictions get progressively worse over time and therefore it is hard to make a call when the model needs revision," she said. "A DevOps-type approach is definitely valuable; however, the challenge is based on the ability to identify meaningful and appropriate metrics to monitor the system."
Improve Quality
DevOps and other modern software development practices, including continuous delivery, emphasize the need for continuous testing. Similarly, data scientists should be monitoring their models and algorithms more often than they do.
"Testing is a massive looming weak spot for data science," said Rainforest QA's Russell Smith. "Testing, especially when you're deploying changes, will give you and your team the confidence things are working as expected. Continuous testing can also help [ensure] that models that generally receive ever-changing or new content are behaving as expected, [which] is especially applicable if the models are training themselves or are re-trained. Currently, this is only happening with the most advanced teams, but it should be a much wider practice."
Testing is less straightforward in data science than it is in software development, however. For one thing, the definition of success in data science is vague, according to Terbium Labs' Clare Gollnick.
"Ground truth is often not known, so there is nothing concrete to test against," said Gollnick. "We may choose instead to seek probabilistic improvement. The stochastic nature of these metrics can make automated tests difficult, if not entirely elusive. By necessity, we rely more heavily on continuous monitoring than continuous testing."
Break Down Organizational Barriers
Developers and IT operations have traditionally been at odds because their responsibilities were divided: Developers built software, and operations ensured it ran in production. Similarly, data scientists may find themselves at odds with others in the organization, including developers and operations, a divide DevOps practices might help bridge.
Tensions may arise between software engineers and data scientists because their orientations differ. Clare Gollnick of Terbium Labs said data science tries to ascertain whether something works in a particular way, while traditional engineering testing attempts to prove that it does.
Rainforest QA's Russell Smith sees friction between data scientists and operations. Unless data scientists are doing their own ops or they've embraced DevOps, someone else has to deploy, run and monitor their systems.
Advance Security
As software teams have endeavored to deliver software faster, more types of testing have "shifted left," meaning developers aren't just running unit tests (as has historically been the case); they're now running other types of tests, including performance testing, load testing and, more recently, security testing. The Shift Left trend doesn't mean that testers, quality assurance or security professionals aren't needed; it simply means that if security is built into products from the beginning, fewer of the issues that tend to delay software delivery arise later in the lifecycle.
Clare Gollnick of Terbium Labs said the Shift Left movement is causing data scientists to engage more in engineering thinking than security thinking, but security may be next as the cycle continues, particularly given the importance of data security.
Atlassian addresses machine learning-related security by building models on a tenant-by-tenant basis to avoid accidental transfer learning that might leak information from one company to another. In the case of enterprise search, data scientists could instead build models using all available data and, based on permission settings, filter out results a specific user isn't authorized to see. While that approach may seem sound, some of the information in the training data is learned by the algorithm and transferred into the model itself, making it possible for a user to infer the content of pages they cannot access.
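A rough sketch of what tenant-by-tenant isolation can look like in code: each tenant's model is trained only on that tenant's data and looked up strictly by tenant ID, so nothing learned from one company's documents can surface in another's results. The class and method names below are invented for illustration and are not Atlassian's implementation:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

class PerTenantModels:
    """One isolated model per tenant -- no shared parameters, no transfer."""

    def __init__(self):
        self._models = {}

    def train(self, tenant_id, documents, labels):
        pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
        pipeline.fit(documents, labels)  # sees only this tenant's data
        self._models[tenant_id] = pipeline

    def predict(self, tenant_id, documents):
        # Raises KeyError rather than falling back to another tenant's model.
        return self._models[tenant_id].predict(documents)
```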
Lisa Morgan, https://www.informationweek.com