
KeplerBoy 39 days ago \| on: Kimi vendor verifier – verify accuracy of inferenc...

Also, it's not just about running an obviously worse quant. Running different GPU kernels or inference engines also matters. It's easy to write an implementation that is faster, and thus cheaper, but numerically much noisier and less accurate.
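
To make the "faster but noisier" point concrete, here is a toy sketch in plain Python. It is not modeled on any real kernel or inference engine. It assumes just one possible source of divergence: the same fp16 inputs are reduced once with a wide (float64) accumulator and once with an accumulator rounded back to fp16 after every add. The helper names (`to_fp16`, `dot_wide_accumulator`, `dot_fp16_accumulator`, `compare`) are made up for illustration.

```python
import math
import random
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def dot_wide_accumulator(xs, ys):
    """Dot product that accumulates in float64 (the 'careful' variant)."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


def dot_fp16_accumulator(xs, ys):
    """Dot product whose running sum is rounded to fp16 after every add.

    This stands in for a hypothetical cheaper kernel that keeps its
    accumulator in low precision.
    """
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = to_fp16(acc + x * y)
    return acc


def compare(n: int = 4096, seed: int = 0):
    """Return (wide_error, fp16_error) against an exactly rounded sum."""
    rng = random.Random(seed)
    # Both variants see identical fp16 inputs, so only the reduction differs.
    xs = [to_fp16(rng.random()) for _ in range(n)]
    ys = [to_fp16(rng.random()) for _ in range(n)]
    exact = math.fsum(x * y for x, y in zip(xs, ys))
    wide_error = abs(dot_wide_accumulator(xs, ys) - exact)
    fp16_error = abs(dot_fp16_accumulator(xs, ys) - exact)
    return wide_error, fp16_error


if __name__ == "__main__":
    wide_error, fp16_error = compare()
    print(f"float64 accumulator error: {wide_error:.3e}")
    print(f"fp16 accumulator error:    {fp16_error:.3e}")
```

Both variants use the same "weights" and the same quantization of the inputs, so a check that only looks at the quant level would not tell them apart. Real kernels can differ in other ways too (reduction order, fused ops, approximate math functions), which this sketch doesn't try to cover.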