Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							ea2ced8c8b 
							
						 
					 
					
						
						
							
							fix json EOF handling  
						
						
						
						
					 
					
						2025-02-27 14:34:13 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							cc6f995795 
							
						 
					 
					
						
						
							
							fix #includes  
						
						
						
						
					 
					
						2025-02-27 11:50:16 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							d4e9a6177b 
							
						 
					 
					
						
						
							
							finished initial impl of /show and tested -> hangs!  
						
						
						
						
					 
					
						2025-02-26 20:01:58 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							7ce2ea57e0 
							
						 
					 
					
						
						
							
							implement /api/show (not tested)  
						
						
						
						
					 
					
						2025-02-26 19:47:25 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							85eaa41e6d 
							
						 
					 
					
						
						
							
							base url should include /api/  
						
						
						
						
					 
					
						2025-02-26 17:18:56 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							e872f1db2d 
							
						 
					 
					
						
						
							
							underscores to dashes
						
						
						
						
					 
					
						2025-02-26 17:14:39 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							86de26ead2 
							
						 
					 
					
						
						
							
							implement and test /api/tags  
						
						
						
						
					 
					
						2025-02-26 16:58:00 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							4c5dcf59ea 
							
						 
					 
					
						
						
							
							rename the class to "OllamaClient"  
						
						
						
						
					 
					
						2025-02-26 16:48:05 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							06475dd113 
							
						 
					 
					
						
						
							
							WIP: use Boost::json for incremental parsing and reflection  
						
						
						
						
					 
					
						2025-02-26 15:57:11 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							927e963076 
							
						 
					 
					
						
						
							
							parse the JSON response  
						
						
						
						
					 
					
						2025-02-25 16:11:40 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							b5144decde 
							
						 
					 
					
						
						
							
							fix #includes  
						
						
						
						
					 
					
						2025-02-25 14:58:04 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							407cb81725 
							
						 
					 
					
						
						
							
							stop using C++20 modules  
						
						
						
						
						2025 is too soon to use C++ features from 2020 without running into bugs
in every build tool that touches the project. 
						
						
					 
					
						2025-02-25 12:10:40 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							1699e77e97 
							
						 
					 
					
						
						
							
							WIP: get build working on macOS  
						
						
						
						
					 
					
						2025-02-25 12:10:09 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							ebe6352fc8 
							
						 
					 
					
						
						
							
							WIP (hit a clang bug causing an incorrect compiler error)  
						
						
						
						
					 
					
						2025-02-25 12:10:08 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							196c387bf7 
							
						 
					 
					
						
						
							
							WIP: bring back old backend so we can test the gpt4all-chat build  
						
						
						
						
					 
					
						2025-02-25 12:01:37 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							729a5b0d9f 
							
						 
					 
					
						
						
							
							ollama-hpp immediately segfaulted. will try something else  
						
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-02-25 12:00:48 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							f4a350d606 
							
						 
					 
					
						
						
							
							WIP: working fmt dep  
						
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-02-25 12:00:48 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							c6d0a1f2b9 
							
						 
					 
					
						
						
							
							enable color diagnostics with ninja  
						
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-02-25 12:00:48 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							b194d71e86 
							
						 
					 
					
						
						
							
							WIP: backend dependencies  
						
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-02-25 12:00:48 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
						
						
							
						
						
							8e94409be9 
							
						 
					 
					
						
						
							
							WIP: gpt4all backend stub  
						
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-02-25 12:00:05 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							96aeb44210 
							
						 
					 
					
						
						
							
							backend: build with CUDA compute 5.0 support by default (#3499)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-02-19 11:27:06 -05:00 
						 
				 
			
				
					
						
							
							
								ThiloteE 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							02e12089d3 
							
						 
					 
					
						
						
							
							Add Granite arch to model whitelist (#3487)
						
						
						
						Signed-off-by: ThiloteE <73715071+ThiloteE@users.noreply.github.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-02-12 14:17:49 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							22ebd42c32 
							
						 
					 
					
						
						
							
							Misc fixes for undefined behavior, crashes, and build failure (#3465)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-02-06 11:22:52 -05:00 
						 
				 
			
				
					
						
							
							
								ThiloteE 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6ef0bd518e 
							
						 
					 
					
						
						
							
							Whitelist OLMoE and Granite MoE (#3449)
						
						
						
						Signed-off-by: ThiloteE <73715071+ThiloteE@users.noreply.github.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-02-04 18:00:07 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							343a4b6b6a 
							
						 
					 
					
						
						
							
							Support DeepSeek-R1 Qwen (#3431)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2025-01-29 09:51:50 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0c70b5a5f4 
							
						 
					 
					
						
						
							
							llamamodel: add missing softmax to fix temperature (#3202)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-12-04 10:56:19 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							225bf6be93 
							
						 
					 
					
						
						
							
							Remove binary state from high-level API and use Jinja templates (#3147)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Signed-off-by: Adam Treat <treat.adam@gmail.com>
Co-authored-by: Adam Treat <treat.adam@gmail.com> 
						
						
					 
					
						2024-11-25 10:04:17 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f07e2e63df 
							
						 
					 
					
						
						
							
							Use the token cache to infer greater n_past and reuse results (#3073)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-10-31 11:19:12 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c3357b7625 
							
						 
					 
					
						
						
							
							Enable more warning flags, and fix more warnings (#3065)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-10-18 12:11:03 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8e3108fe1f 
							
						 
					 
					
						
						
							
							Establish basic compiler warnings, and fix a few style issues (#3039)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-10-09 09:11:50 -04:00 
						 
				 
			
				
					
						
							
							
								AT 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ea1ade8668 
							
						 
					 
					
						
						
							
							Use different language for prompt size too large. (#3004)
						
						
						
						Signed-off-by: Adam Treat <treat.adam@gmail.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-09-27 12:29:22 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f9d6be8afb 
							
						 
					 
					
						
						
							
							backend: rebase llama.cpp on upstream as of Sep 26th (#2998)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-09-27 12:05:59 -04:00 
						 
				 
			
				
					
						
							
							
								Ikko Eltociear Ashimine 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1047c5e038 
							
						 
					 
					
						
						
							
							docs: update README.md (#2979)
						
						
						
						Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Signed-off-by: AT <manyoso@users.noreply.github.com>
Co-authored-by: AT <manyoso@users.noreply.github.com> 
						
						
					 
					
						2024-09-23 16:12:52 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							69782cf713 
							
						 
					 
					
						
						
							
							chat(build): fix broken installer on macOS (#2973)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-09-20 15:34:20 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							39005288c5 
							
						 
					 
					
						
						
							
							server: improve correctness of request parsing and responses (#2929)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-09-09 10:48:57 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ca151f3519 
							
						 
					 
					
						
						
							
							repo: organize sources, headers, and deps into subdirectories (#2917)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-08-27 17:22:40 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6518b33697 
							
						 
					 
					
						
						
							
							llamamodel: use greedy sampling when temp=0 (#2854)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-08-13 17:04:50 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7463b2170b 
							
						 
					 
					
						
						
							
							backend(build): set CUDA arch defaults before enable_language(CUDA) (#2855)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-08-13 14:47:48 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							971c83d1d3 
							
						 
					 
					
						
						
							
							llama.cpp: pull in fix for Kompute-related nvidia-egl crash (#2843)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-08-13 11:10:10 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							26113a17fb 
							
						 
					 
					
						
						
							
							don't use ranges::contains due to clang incompatibility (#2812)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-08-08 11:49:01 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							de7cb36fcc 
							
						 
					 
					
						
						
							
							python: reduce size of wheels built by CI, other build tweaks (#2802)
						
						
						
						* Read CMAKE_CUDA_ARCHITECTURES directly
* Disable CUBINs for python build in CI
* Search for CUDA 11 as well as CUDA 12

Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-08-07 11:27:50 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							be66ec8ab5 
							
						 
					 
					
						
						
							
							chat: faster KV shift, continue generating, fix stop sequences (#2781)
						
						
						
						* Don't stop generating at end of context
* Use llama_kv_cache ops to shift context
* Fix and improve reverse prompt detection
* Replace prompt recalc callback with a flag to disallow context shift 
						
						
					 
					
						2024-08-07 11:25:24 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							51bd01ae05 
							
						 
					 
					
						
						
							
							backend: fix extra spaces in tokenization and a CUDA crash (#2778)
						
						
						
						Also potentially improves accuracy of BOS insertion, token cache, and logit indexing.

Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-08-01 10:46:36 -04:00 
						 
				 
			
				
					
						
							
							
								AT 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8c834a5177 
							
						 
					 
					
						
						
							
							Update llama.cpp to include upstream Llama 3.1 RoPE fix. (#2758)
						
						
						
						Signed-off-by: Adam Treat <treat.adam@gmail.com> 
						
						
					 
					
						2024-07-27 14:14:19 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2a7fe95ff4 
							
						 
					 
					
						
						
							
							llamamodel: always print special tokens (#2701)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-07-22 13:32:17 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4ca1d0411f 
							
						 
					 
					
						
						
							
							llamamodel: add DeepSeek-V2 to whitelist (#2702)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-07-22 13:32:04 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							290c629442 
							
						 
					 
					
						
						
							
							backend: rebase llama.cpp submodule on latest upstream (#2694)
						
						
						
						* Adds support for GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Kompute support)
* Also enables Kompute support for StarCoder2, XVERSE, Command R, and OLMo
* Includes a number of Kompute resource management fixes

Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-07-19 14:52:58 -04:00 
						 
				 
			
				
					
						
							
							
								AT 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ca72428783 
							
						 
					 
					
						
						
							
							Remove support for GPT-J models. (#2676)
						
						
						
						Signed-off-by: Adam Treat <treat.adam@gmail.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-07-17 16:07:37 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6cb3ddafd6 
							
						 
					 
					
						
						
							
							llama.cpp: update submodule for CPU fallback fix (#2640)
						
						
						
						Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-07-10 17:56:19 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							bd307abfe6 
							
						 
					 
					
						
						
							
							backend: fix a crash on inputs greater than n_ctx (#2498)
						
						
						
						This fixes a regression in commit 4fc4d94b ("fix chat-style prompt
templates (#1970)"), which moved some return statements into a new
function (LLModel::decodePrompt) without making them return from the
parent as well.

Signed-off-by: Jared Van Bortel <jared@nomic.ai> 
						
						
					 
					
						2024-07-01 11:33:46 -04:00