Building stochastic software is hard

Traditional software development is about programming a system to reach deterministic outcomes. But machine learning is about building software with stochastic outcomes. And building stochastic software, especially in organizations that are used to building deterministic software, is hard.

Challenge 1: Communicating Impact and Value

Organizations typically measure developer productivity by features shipped and code quality by how bug-free the software is. But these metrics don’t make sense for machine learning teams, which might spend months iterating on a model, perhaps taking different approaches to prediction and layering a combination of heuristics and ML-based techniques. How do you know whether what you are building is on the right track, and how do you communicate to stakeholders what exactly you spent months doing?

I have found that it’s helpful to separate the development efforts focused on enabling machine learning from those focused on actually doing machine learning. This lets you set very different success metrics for the two workstreams.

The second thing that is quite critical is to treat software bugs differently from prediction errors. Software bugs occur when a deterministic system doesn’t do what was expected. Prediction errors occur when a stochastic system makes an error. And with prediction errors, you have to ask: is the error really an error, or did the model do exactly what you trained it to do, and the real world simply doesn’t operate the way you expected?
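As an illustration, here is a minimal sketch of what that separation could look like in inference code. The model interface, logger names, and confidence threshold are all hypothetical assumptions, not any particular library’s API: exceptions get routed to a bug log, while suspect-but-valid predictions get routed to a separate prediction-error log.

```python
# Sketch: route deterministic failures (bugs) and stochastic outcomes
# (prediction errors) to separate logs. Names and thresholds are illustrative.
import logging
from dataclasses import dataclass

bug_log = logging.getLogger("software_bugs")            # deterministic failures
prediction_log = logging.getLogger("prediction_errors")  # stochastic outcomes

@dataclass
class Prediction:
    label: str
    confidence: float

def predict_with_tracking(model, features: dict) -> Prediction | None:
    try:
        label, confidence = model.predict(features)  # assumed model interface
    except Exception:
        # A crash is a software bug: the deterministic code path failed.
        bug_log.exception("Inference raised an exception for input %s", features)
        return None

    prediction = Prediction(label=label, confidence=confidence)
    if prediction.confidence < 0.5:  # illustrative threshold
        # Not a bug: the model behaved as trained, but the outcome is suspect.
        prediction_log.info(
            "Low-confidence prediction %r (%.2f) for input %s",
            prediction.label, prediction.confidence, features,
        )
    return prediction
```

Keeping the two streams apart makes it possible to report "zero bugs this sprint" and "prediction quality is still improving" as separate, honest statements.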

Once you have this structure in place, it becomes easier to communicate regular progress to stakeholders. You can show the different approaches you took, the iterations you made, and the rationale behind your decisions. It also allows you to provide more context around the performance metrics of your model. Ultimately, this helps stakeholders understand the complexity and effort involved in building and improving stochastic software.

Challenge 2: Building a continuous feedback loop

The other challenge of building stochastic software is how critical it is to close the feedback loop with your end users. Testing that your model is working is not as simple as reviewing a few test cases to ensure the system behaves as intended. Determined users will usually flag software bugs when they hit errors in your software, but they are much less likely to flag incorrect predictions. How do you know whether what you built is even working, and how can you improve it, without significant testing?

To address this issue, it is critical that you implement a mechanism for users to provide feedback on model predictions directly within your product UX. Ideally, there is little friction or effort required from the user to give you feedback, and bonus points if feedback is built into their existing workflow or interactions with your software.
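Here is a minimal sketch of what low-friction feedback capture could look like, assuming a hypothetical prediction payload and an in-memory store: every prediction shown to the user carries an ID, so a one-click rating in the UI can be tied back to the exact model output it refers to.

```python
# Sketch: attach an ID to each served prediction so a one-click rating widget
# can reference it. The field names, storage, and rating scale are assumptions.
import uuid
import datetime

feedback_store: list[dict] = []  # stand-in for a real database table

def serve_prediction(label: str, confidence: float) -> dict:
    """Return the prediction payload along with an ID the UI can echo back."""
    return {
        "prediction_id": str(uuid.uuid4()),
        "label": label,
        "confidence": confidence,
    }

def record_feedback(prediction_id: str, helpful: bool, comment: str = "") -> None:
    """Called by a one-click thumbs-up/down widget in the product UI."""
    feedback_store.append({
        "prediction_id": prediction_id,
        "helpful": helpful,
        "comment": comment,
        "recorded_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })

# Example flow: show a prediction, then capture a one-click rating.
payload = serve_prediction(label="spam", confidence=0.91)
record_feedback(payload["prediction_id"], helpful=False, comment="This was not spam")
```

The design choice that matters is the prediction ID: without it, a thumbs-down is just a complaint; with it, the feedback points at a specific input, output, and model version you can investigate.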

Once you do get feedback, it’s important to review it and incorporate it into model improvements on a systematic basis. Otherwise you are just shipping poor-quality software into the ether. From conversations with ML engineers, it’s crazy how often this happens with ML teams.
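One hypothetical way to make that review systematic is to aggregate negative feedback on a regular cadence so the worst-performing areas surface first. The record fields and grouping key below are assumptions continuing the sketch above.

```python
# Sketch: group "not helpful" ratings by predicted label each review cycle to
# prioritize where model improvements are needed. Fields are illustrative.
from collections import Counter

def summarize_feedback(feedback: list[dict]) -> Counter:
    """Count negative ratings per predicted label to prioritize review."""
    return Counter(item["label"] for item in feedback if not item["helpful"])

sample_feedback = [
    {"label": "spam", "helpful": False},
    {"label": "spam", "helpful": False},
    {"label": "not_spam", "helpful": True},
]
print(summarize_feedback(sample_feedback).most_common())
# [('spam', 2)] -> spam predictions are drawing the most complaints this cycle
```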

Model maintenance is a lot less exciting than building new models, so it's often hard to keep really smart ML engineers excited about model improvements. But if you find a way to tie back model improvements to business objectives or product KPIs, it becomes a lot easier to justify time spent on model improvements to the team and other stakeholders.

In summary, it’s important for organizations that want to pursue ML projects to acknowledge that building stochastic software is an ongoing process that requires continuous improvement and adaptation. It may take time and iteration to achieve the desired outcomes, and having organizational support and understanding of this iterative nature is crucial for success.

© Malavika Balachandran Tadeusz