I executed the command:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:Phi-3-mini-4k-instruct-q4.gguf llama-api-server.wasm -p phi-3-chat

but it fails with: unknown option nn-preload
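This error usually means the WasmEdge WASI-NN plugin is not installed: the --nn-preload option is provided by that plugin, not by the base runtime, so a plain WasmEdge build rejects it. A hedged sketch of reinstalling with the GGML backend, assuming the official install.sh script and its --plugin flag (check the current WasmEdge install docs before running):

```shell
# Reinstall WasmEdge together with the WASI-NN GGML plugin.
# Assumption: the official installer script and its --plugin flag;
# the script URL and flag name may change between releases.
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh \
    | bash -s -- --plugin wasi_nn-ggml

# Pick up the new environment in the current shell.
source "$HOME/.wasmedge/env"

# Verify the plugin was installed; a WASI-NN shared library
# should appear in the plugin directory.
ls "$HOME/.wasmedge/plugin"
```

After this, the same wasmedge command with --nn-preload should be recognized.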
I have downloaded the Phi-3-mini-128k-instruct model, which ships as two safetensors files, and I load it with:

model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-128k-instruct", device_map="cuda", torch_dtype="auto", trust_remote_code=True)

How do I merge those files for inference (maybe merging is not needed)? Just
how ...
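No merge step is needed: transformers resolves sharded checkpoints through an index file (model.safetensors.index.json) that maps each weight name to the shard containing it, and from_pretrained reads every shard automatically. A minimal stdlib sketch of that mechanism, where JSON files stand in for the binary safetensors shards and all file and weight names are illustrative:

```python
import json
import os
import tempfile

# Two "shards", as transformers would write them for a large model.
# Real shards are binary .safetensors files; JSON stands in here.
shards = {
    "model-00001-of-00002.safetensors": {"embed.weight": [1, 2]},
    "model-00002-of-00002.safetensors": {"lm_head.weight": [3, 4]},
}

# The index file maps each weight name to the shard that holds it.
index = {
    "weight_map": {
        name: fname for fname, tensors in shards.items() for name in tensors
    }
}

with tempfile.TemporaryDirectory() as d:
    # Write the index and the shard files, mimicking a checkpoint dir.
    with open(os.path.join(d, "model.safetensors.index.json"), "w") as f:
        json.dump(index, f)
    for fname, tensors in shards.items():
        with open(os.path.join(d, fname), "w") as f:
            json.dump(tensors, f)

    # "Loading": read the index, then fetch each weight from its shard.
    # This is what from_pretrained does internally; the shards are never
    # merged into a single file on disk.
    with open(os.path.join(d, "model.safetensors.index.json")) as f:
        weight_map = json.load(f)["weight_map"]
    state_dict = {}
    for name, fname in weight_map.items():
        with open(os.path.join(d, fname)) as f:
            state_dict[name] = json.load(f)[name]

print(sorted(state_dict))  # → ['embed.weight', 'lm_head.weight']
```

So pointing from_pretrained at the model directory (or repo ID) is enough; as long as both safetensors files and the index JSON sit together, the loader reassembles the full state dict in memory.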