Hi, I'm Jambo, a Microsoft Learn Student Ambassador. This article shows how to run the quantized Phi-3-vision model in ONNX format on the Jetson platform and perform inference for image+te...
git clone https://github.com/microsoft/onnxruntime-genai
cd onnxruntime-genai
git checkout 940bc102a317e886f488ad5e120533b96a34ddcd
ONNXRuntime
wget http://jetson.webredirect.org:8000/jp6/cu124/onnxruntime-gpu-1.19.0.tar.gz
mkdir ort
tar -xvf onnxruntime-gpu-1.19.0.tar.gz -C ort
mv ort/include/onnxruntime/onnxruntime_c_api.h ort/include/
rm -rf ort/include/onnxruntime/
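After these steps it can be worth confirming that the `ort` directory looks the way the build expects. The function below is only a sketch: `check_ort_layout` is a hypothetical helper name, and the `lib/libonnxruntime.so*` location is an assumption about what the tarball contains, so adjust it if your extracted files differ.

```shell
#!/usr/bin/env bash
# Sketch: check that an extracted ONNX Runtime directory has the layout
# produced by the steps above. check_ort_layout is a hypothetical helper.
check_ort_layout() {
    local ort_dir="${1:-ort}"
    local status=0

    # The header moved by the `mv` step should sit directly under include/.
    if [ ! -f "$ort_dir/include/onnxruntime_c_api.h" ]; then
        echo "missing: $ort_dir/include/onnxruntime_c_api.h" >&2
        status=1
    fi

    # The nested directory should be gone after the `rm -rf` step.
    if [ -d "$ort_dir/include/onnxruntime" ]; then
        echo "unexpected: $ort_dir/include/onnxruntime/ still exists" >&2
        status=1
    fi

    # Assumption: the shared library is shipped under lib/ in the tarball.
    if ! ls "$ort_dir"/lib/libonnxruntime.so* >/dev/null 2>&1; then
        echo "missing: $ort_dir/lib/libonnxruntime.so*" >&2
        status=1
    fi

    return "$status"
}

# Usage: check_ort_layout ort && echo "ort layout looks OK"
```

The function prints each problem it finds to stderr and returns a non-zero status if anything is missing, so it can be chained in front of the build step.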
If you encounter the following issues at runtime, try installing cuDNN 9.
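Before reinstalling anything, you may want to see which cuDNN libraries the dynamic linker already knows about. This is a rough sketch rather than an official check: `cudnn_sonames` is a hypothetical helper that filters `ldconfig -p` output, and it assumes cuDNN was installed somewhere the linker cache covers.

```shell
#!/usr/bin/env bash
# Sketch: list the distinct libcudnn sonames found in `ldconfig -p` style
# text read from stdin. cudnn_sonames is a hypothetical helper name.
cudnn_sonames() {
    # Keep only names such as libcudnn.so.9, then de-duplicate.
    # `|| true` keeps the pipeline from failing when nothing matches.
    { grep -o 'libcudnn\.so\.[0-9][0-9]*' || true; } | sort -u
}

# Usage: ldconfig -p | cudnn_sonames
```

If the output does not include `libcudnn.so.9`, cuDNN 9 is probably not visible to the linker, which fits the suggestion above to install it.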
If the above process still doesn’t work, you will have to compile ONNXRuntime yourself to get the library, and then compile onnxruntime-genai against it. I will also update my article.
I am still looking for a more convenient process. The most convenient options would be a Docker image or a precompiled whl provided by the ONNX team. Dusty has contacted me and expressed interest in creating the image, but this will take time. I also appreciate you trying my method, as it helps me identify where the issues are.