In this talk, we challenge the common notion that parameter and operation counts are the primary indicators of AI model performance, showing that effective assessment of AI efficiency goes beyond model size alone. We'll navigate alternative metrics suited to real-time processing environments, consider how performance varies across different hardware, and explore the role of input batching in the throughput-latency tradeoff. Gain valuable insights through profiling in our quest for better performance measurement approaches. Join us if you're curious about the inner workings of AI models, whatever your current level of AI experience!