Based in Seattle, Washington, OctoML was founded by the team that created Apache TVM, an open-source machine learning compiler that optimizes models so they can run efficiently in the cloud or on edge devices (CPUs, GPUs, NPUs, and other accelerators), saving time, energy, and cost.

Our customers upload their trained deep learning models (from TensorFlow, PyTorch, Keras, ONNX, MXNet, etc.) to the Octomizer (our SaaS product based on Apache TVM), which outputs an optimized, packaged version of the model tuned for the target hardware. Our platform also benchmarks the model on various target hardware options, allowing customers to choose the most cost-efficient device that still meets the latency and throughput requirements of their use case.
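The device-selection step described above can be illustrated with a minimal Python sketch. It is not the Octomizer API: the `BenchmarkResult` type, the `pick_device` function, the device names, and every number below are hypothetical, chosen only to show how benchmark results might be filtered by latency and throughput requirements and then ranked by cost.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BenchmarkResult:
    """One benchmark row for one device (hypothetical structure, not an OctoML type)."""
    device: str
    latency_ms: float          # measured latency per inference
    throughput_per_s: float    # inferences per second, assumed to be > 0
    cost_per_hour_usd: float   # hourly price of the device

    @property
    def cost_per_million_usd(self) -> float:
        # Cost of one million inferences at the measured throughput.
        inferences_per_hour = self.throughput_per_s * 3600
        return self.cost_per_hour_usd / inferences_per_hour * 1_000_000


def pick_device(
    results: Sequence[BenchmarkResult],
    max_latency_ms: float,
    min_throughput_per_s: float,
) -> Optional[BenchmarkResult]:
    """Return the cheapest device that meets both requirements, or None."""
    eligible = [
        r for r in results
        if r.latency_ms <= max_latency_ms
        and r.throughput_per_s >= min_throughput_per_s
    ]
    return min(eligible, key=lambda r: r.cost_per_million_usd, default=None)


# Illustrative numbers only; these are not real benchmark data.
results = [
    BenchmarkResult("device-a", latency_ms=12.0, throughput_per_s=400.0, cost_per_hour_usd=0.90),
    BenchmarkResult("device-b", latency_ms=30.0, throughput_per_s=150.0, cost_per_hour_usd=0.20),
    BenchmarkResult("device-c", latency_ms=18.0, throughput_per_s=250.0, cost_per_hour_usd=0.40),
]

best = pick_device(results, max_latency_ms=20.0, min_throughput_per_s=200.0)
print(best.device if best else "no device meets the requirements")
```

In this sketch, the device with the lowest cost per inference is excluded because it misses the latency requirement, so the choice falls to the cheapest device among those that qualify.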

Solution Briefs

  • OctoML - Optimizing Compiler for Deep Learning

    At OctoML we’re making machine learning faster, more portable, and easier to deploy.


Insights

  • Deploy ML Models Faster on Arm Using Apache TVM with OctoML (Arm Tech Talk)

    In this webinar, OctoML shows you how to solve machine learning deployment challenges by making ML models faster and easier to put into production.

  • ML inside ML compilers using open source TVM with OctoML (Arm Tech Talk)

    A coffee talk about machine learning compilers with Jason Knight, co-founder and CPO at OctoML.
