Adam Treat
d90d003a1d
Latest rebase on llama.cpp with gguf support.
2023-10-05 18:16:19 -04:00
Aaron Miller
0bc2274869
bump llama.cpp version + needed fixes for that
2023-08-31 15:29:54 -04:00
Aaron Miller
57dc0c8953
adjust eval buf sizes to pass long input test
2023-06-30 21:07:21 -03:00
Aaron Miller
8d19ef3909
backend: factor out common elements in model code ( #1089 )
...
* backend: factor out common structs in model code
prepping to hack on these by hopefully making there be fewer places to fix the same bug
rename
* use common buffer wrapper instead of manual malloc
* fix replit compile warnings
2023-06-28 17:35:07 -07:00
Aaron Miller
28d41d4f6d
falcon: use *model-local* eval & scratch bufs ( #1079 )
...
fixes memory leaks copied from ggml/examples based implementation
2023-06-27 16:09:11 -07:00
Aaron Miller
198b5e4832
add Falcon 7B model
...
Tested with https://huggingface.co/TheBloke/falcon-7b-instruct-GGML/blob/main/falcon7b-instruct.ggmlv3.q4_0.bin
2023-06-27 14:06:39 -03:00