What is AI Inference?
Definition
Inference is the process of running a trained AI model on new, unseen data to produce predictions, decisions, or other outputs. Training teaches a model to recognize patterns; inference applies that learned knowledge in the real world, for example to answer user requests, classify data, or generate text.
Training vs Inference
| Aspect | Training | Inference |
|---|---|---|
| When | During model development | During production use |
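| Frequency | Once, or periodically when retraining | Continuous: on demand (often real-time) or in batches |
| Compute | Intensive and expensive per run | Lighter per request, but cost accumulates with request volume |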
| Output | Trained model weights | Predictions or generated outputs |
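
The contrast in the table can be made concrete with a toy example. The following is a minimal sketch in plain Python, not a production recipe: it assumes a one-feature logistic regression model, and the data, the function names `train` and `predict_proba`, and the hyperparameters are all made up for illustration. Training loops over labeled data many times and returns weights; inference is a single forward pass that uses those weights without changing them.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Inference: one forward pass with fixed parameters."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(samples, labels, epochs=500, lr=0.5):
    """Training: repeated passes over labeled data that update the parameters."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Gradient of the log loss with respect to the pre-sigmoid score
            error = predict_proba(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Toy labeled data (illustrative only): the label is 1 when the feature is large.
samples = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 1, 1]

# Training output: model weights
weights, bias = train(samples, labels)

# Inference output: predictions for inputs that were not in the training set
for x in ([0.5], [2.5]):
    print(x, round(predict_proba(weights, bias, x), 3))
```

Here `train` performs 2,000 parameter updates (500 passes over 4 examples), while each call to `predict_proba` is a single cheap computation. That is the sense in which inference is lighter per request, even though a deployed model may serve far more inference calls than it ever saw training steps.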