                       | Research                                      | Production
Requirements           | SOTA model performance on benchmark datasets  | Different stakeholders have different requirements
Computational priority | Fast training, high throughput                | Fast inference, low latency
Data                   | Static                                        | Constantly shifting
Fairness               | Often not a focus                             | Must be considered
Interpretability       | Often not a focus                             | Must be considered

Factors

Requirements

Researchers’ most common objective is model performance on benchmark datasets, which often leads to techniques that make models too complex to be useful in the wild. In production, many stakeholders are involved, and each has their own requirements, such as latency, throughput, updatability, and profit. When developing an ML project, it is important to understand the requirements of all stakeholders involved, and how strict those requirements are.

Computational Priorities

A common mistake is focusing too much on model development and not enough on model deployment and maintenance. During model development, training is the bottleneck; after deployment, inference is the bottleneck.
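
To make the throughput-versus-latency contrast concrete, here is a rough illustrative sketch (not from the source; it uses a toy scikit-learn model on synthetic data) that times one large batched prediction against one-request-at-a-time predictions:

```python
import time
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(10_000, 20))
y_train = rng.integers(0, 2, size=10_000)
X_serve = rng.normal(size=(1_000, 20))

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Throughput-oriented: score everything in one large batch (typical offline/research workload).
start = time.perf_counter()
model.predict(X_serve)
print(f"batched:    {time.perf_counter() - start:.4f} s for 1,000 samples")

# Latency-oriented: score one request at a time (typical online serving pattern).
start = time.perf_counter()
for x in X_serve:
    model.predict(x.reshape(1, -1))
print(f"one-by-one: {time.perf_counter() - start:.4f} s for 1,000 samples")
```

Batching maximises samples processed per second, but an online user only cares about how long their single request takes.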

Once the model is deployed into the real world, latency matters a lot because users are impatient. When thinking about latency, remember that latency is not an individual number but a distribution: report percentiles (e.g. p50, p95, p99) rather than a single average.
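
As a minimal sketch of treating latency as a distribution (the predict function below is a hypothetical stand-in for a real model call):

```python
import time
import numpy as np

rng = np.random.default_rng(0)

def predict(x):
    # Hypothetical stand-in for a real inference call, with a bit of jitter.
    time.sleep(0.001 + abs(rng.normal(0, 0.0005)))

latencies_ms = []
for _ in range(200):
    start = time.perf_counter()
    predict(None)
    latencies_ms.append((time.perf_counter() - start) * 1000)

# Report percentiles rather than a single average: the p99 value is what the
# slowest 1% of requests experience, and it is usually far worse than the mean.
p50, p95, p99 = np.percentile(latencies_ms, [50, 95, 99])
print(f"p50={p50:.2f} ms  p95={p95:.2f} ms  p99={p99:.2f} ms")
```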

Data

Datasets used in the research phase are often static, clean, and well-formatted, whereas in production, data is much noisier, more unstructured, and constantly shifting (if it is available at all). Privacy and regulatory concerns must also be considered when working with users’ data.
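
As one illustrative way to notice such shift (not a method prescribed by the source), a simple two-sample test comparing a feature's training and production distributions, here on synthetic data with scipy's ks_2samp, might look like:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)  # distribution at training time
prod_feature = rng.normal(loc=0.3, scale=1.2, size=5_000)   # shifted production distribution

# Kolmogorov-Smirnov two-sample test: a tiny p-value suggests the production
# feature no longer matches the data the model was trained on.
result = ks_2samp(train_feature, prod_feature)
print(f"KS statistic={result.statistic:.3f}, p-value={result.pvalue:.3g}")
```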

Fairness

ML algorithms do not predict the future, but encode the past, thus perpetuating the biases in the data and more. You or someone in your life might already be a victim of biased algorithms without even knowing it.

Further Reading: Weapons of Math Destruction by Cathy O’Neil
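
As a minimal, hypothetical illustration of one such check (the predictions and group labels below are made up), comparing positive prediction rates across a sensitive attribute, a demographic-parity-style gap, might look like:

```python
import numpy as np

# Hypothetical model predictions (1 = approved) and a sensitive attribute per user.
preds = np.array([1, 1, 1, 0, 1, 0, 1, 0, 0, 0])
group = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

# Demographic parity compares how often each group receives the positive outcome.
rate_a = preds[group == "A"].mean()
rate_b = preds[group == "B"].mean()
print(f"positive rate A={rate_a:.2f}, B={rate_b:.2f}, gap={abs(rate_a - rate_b):.2f}")
```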

Interpretability

Researchers are not incentivised to work on model interpretability, as most ML research is evaluated on model performance alone. However, interpretability is a requirement for most ML use cases (a sketch of one common interpretability technique follows the list below) because:

  • Model trust: Interpretability allows both business leaders and end users to understand why a decision was made and to detect potential biases.
  • Updatability: Developers can more easily debug and improve the model.
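
The sketch below shows one common generic interpretability technique, permutation feature importance, using scikit-learn on a public toy dataset; it is an illustration of the idea, not a method prescribed by the source:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much test accuracy drops;
# large drops indicate features the model relies on most.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
top = sorted(zip(X.columns, result.importances_mean), key=lambda t: t[1], reverse=True)[:5]
for name, importance in top:
    print(f"{name}: {importance:.3f}")
```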