I kept receiving so many errors following these steps that I re-flashed my Jetson, so I could point out the missing steps for a Jetson Orin Nano Dev Kit also running JetPack 6. I am assuming your instructions are based on a system where some of these steps had already been done at an earlier time.
The first missing step is that the user is not automatically added to the existing docker group:
sudo usermod -aG docker $USER
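Note that the group change only applies to new login sessions; this quick check (my addition, not from the guide) confirms it took effect:

```shell
# Group membership is evaluated at login, so log out and back in
# (or use `newgrp docker` in an interactive shell) first.
# The docker group should then appear in the current user's groups:
id -nG
```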
Next, cmake must be updated to 3.26 in order to run build.py:
sudo apt purge cmake
pip install --upgrade cmake
Log off and log back in, then cd back to the onnxruntime-genai directory so the new cmake is on PATH (verify with cmake --version).
I removed the --parallel flag; it is probably best to leave it out of the documented command, since it is not used in the described environment.
python3 build.py --use_cuda --cuda_home /usr/local/cuda-12.2 --skip_tests --skip_csharp
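As a sanity check before building (my own addition; JetPack 6 ships CUDA 12.2, but the versioned directory name can vary), confirm the path passed to --cuda_home actually contains the toolkit:

```shell
# Check that nvcc exists under the CUDA home passed to build.py.
CUDA_HOME=/usr/local/cuda-12.2
if [ -x "$CUDA_HOME/bin/nvcc" ]; then
    "$CUDA_HOME/bin/nvcc" --version
else
    echo "nvcc not found under $CUDA_HOME - check the CUDA install path"
fi
```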
The compiled files will be located in the build/Linux/RelWithDebInfo/wheel directory, and we only need the .whl file (onnxruntime_genai_cuda-0.4.0.dev0-cp310-cp310-linux_aarch64.whl in my case).
The install seems to go fine.
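For completeness, the install step looked like this (the wheel filename is from my build and will differ for other versions or Python tags; the existence guard is mine):

```shell
# Install the wheel produced by build.py; skip quietly if it is absent.
WHEEL=build/Linux/RelWithDebInfo/wheel/onnxruntime_genai_cuda-0.4.0.dev0-cp310-cp310-linux_aarch64.whl
if [ -f "$WHEEL" ]; then
    pip3 install "$WHEEL"
fi
```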
The Phi-3 model downloads successfully from Hugging Face, and the example scripts download fine.
Running
python3 phi3v.py -m cuda-int4-rtn-block-32
results in:
Loading model...
Traceback (most recent call last):
File "/home/travis/phi3v.py", line 66, in <module>
run(args)
File "/home/travis/phi3v.py", line 16, in run
model = og.Model(args.model_path)
onnxruntime_genai.onnxruntime_genai.OrtException: CUDA execution provider is not enabled in this build.
I am unsure where to go from here or what step is missing. Please keep in mind this is a 100% fresh build.
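One check that might help narrow it down (my own diagnostic sketch, not from the instructions; it assumes the plain onnxruntime Python package is what ends up importable): list which execution providers that wheel was compiled with. If CUDAExecutionProvider is missing there, a CPU-only onnxruntime build may be taking precedence over the CUDA one.

```python
# Diagnostic sketch: report the execution providers of whatever
# onnxruntime is importable. Guarded so it also runs on a machine
# where onnxruntime is not installed at all.
try:
    import onnxruntime as ort
    providers = ort.get_available_providers()
except ImportError:
    providers = []  # no onnxruntime installed

print(providers)  # a CUDA-enabled build should list "CUDAExecutionProvider"
```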