Commit Graph

231 Commits

Author SHA1 Message Date
DenOfEquity 4f825bc070 fix zeroed cond_t5 device (#2701)
The tensor could be created on the wrong device if the user has enough VRAM for Forge to run the text encoders on the GPU.
The fix moves it to the cond_l device.
2025-02-27 22:01:35 +00:00
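A minimal sketch of the kind of fix described above; only cond_t5 and cond_l come from the commit, the helper name and dimensions are hypothetical:

```python
import torch

def make_zero_cond_t5(cond_l: torch.Tensor, t5_len: int = 256, t5_dim: int = 4096) -> torch.Tensor:
    # When no T5 conditioning exists, a zeroed stand-in is used. Creating
    # it with an explicit device keeps it alongside cond_l instead of on
    # whatever device torch would default to.
    return torch.zeros(cond_l.shape[0], t5_len, t5_dim,
                       device=cond_l.device, dtype=cond_l.dtype)
```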
DenOfEquity 3ce373939e this is useful (spiece.model for SD3 T5 tokenizer) (#2699) 2025-02-27 20:40:19 +00:00
DenOfEquity f23bc80d2f SD3+ (#2688)
Co-authored-by: graemeniedermayer graemeniedermayer@users.noreply.github.com
2025-02-27 17:54:44 +00:00
DenOfEquity 8dd92501e6 Add SDXL refiner model (#2686)
add sdxlrefiner
adjust some Settings
custom CLIP-G support
2025-02-25 10:49:47 +00:00
DenOfEquity 184bb04f8d increased support for custom CLIPs (#2642)
more forms recognised
can now be applied to SD1.5, SDXL, (SD3)
2025-02-21 12:01:39 +00:00
DenOfEquity 4a30c15769 update for Save Checkpoint button (#2636)
The Save Checkpoint function in Checkpoint Merger previously produced unusable checkpoints for non-Flux architectures because keys had names not recognised by the model loader.
2025-02-09 13:21:15 +00:00
hako-mikan daee4c0d8f Add force refresh to LoRA Loader refresh function (#2584) 2025-01-28 16:04:44 -05:00
DenOfEquity 2051c4100b Apply emphasis mode changes immediately (#2423)
Previously, emphasis mode changes were applied only on model load.
Also added handling for mode 'None' (literal interpretation), which previously gave results identical to mode 'Ignore'.
2024-12-08 19:25:39 +00:00
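A minimal sketch of the distinction that commit describes; the helper is hypothetical, and the real parser in Forge's prompt-attention code also handles nesting and escapes:

```python
def interpret_emphasis(text: str, weight: float, mode: str) -> tuple[str, float]:
    # Three behaviours for a parsed "(text:weight)" span:
    if mode == "None":
        # literal interpretation: the syntax stays in the prompt as text
        return f"({text}:{weight})", 1.0
    if mode == "Ignore":
        # syntax is recognised and stripped, but the weight is discarded
        return text, 1.0
    # normal emphasis modes: syntax is stripped and the weight is applied
    return text, weight
```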
DenOfEquity 19a9a78c9b fix/workaround for potential memory leak (#2315)
unload old models when their reference count is <= 2
(in practice, only extra copies of JointTextEncoder were observed, never KModel or IntegratedAutoencoderKL)
#2281 #2308 and others
2024-11-14 22:05:54 +00:00
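A minimal sketch of the idea with hypothetical names for the registry and helper. A model whose only live reference is the registry entry itself has nothing else using it and can be dropped:

```python
import gc
import sys
import torch

def unload_stale_models(loaded_models: list) -> None:
    # Inside this loop a model referenced by nothing else still shows three
    # references: the registry list, the loop variable, and sys.getrefcount's
    # own argument. The commit's "reference count <= 2" is the same floor,
    # counted without the getrefcount temporary.
    keep = []
    for model in loaded_models:
        if sys.getrefcount(model) <= 3:   # nobody else holds it
            continue                      # drop it from the registry
        keep.append(model)
    loaded_models[:] = keep
    gc.collect()                          # reclaim the dropped models
    if torch.cuda.is_available():
        torch.cuda.empty_cache()          # hand freed VRAM back
```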
catboxanon 6e1a7908b4 Fallback to estimated prediction if .yaml prediction not suitable (#2273) 2024-11-06 23:49:18 -05:00
catboxanon f4afbaff45 Only use .yaml config prediction if actually set (#2272) 2024-11-06 23:44:12 -05:00
catboxanon 05b01da01f Fix ldm .yaml loading typo (#2248) 2024-11-02 10:29:13 -04:00
catboxanon 90a6970fb7 Compatibility for ldm .yaml configs (#2247) 2024-11-02 10:16:51 -04:00
layerdiffusion d50f390c7e rewrite mu shift 2024-10-31 22:57:26 -07:00
catboxanon 6f4350d65f Fix typo in .yaml config load (#2228) 2024-10-31 07:09:27 -04:00
catboxanon b691b1e755 Fix .yaml config loading (#2224) 2024-10-30 16:18:44 -04:00
Symbiomatrix 41a21f66fd Restore embedding filepath. (#2030) 2024-10-26 23:53:48 +01:00
catboxanon edeb2b883f Fix loading diffusers format VAEs (#2171) 2024-10-24 14:27:57 -04:00
catboxanon edc46380cc Automatically enable ztSNR when applicable (#2122) 2024-10-19 20:33:34 -04:00
catboxanon aba35cde5f Fix Zero Terminal SNR option 2024-10-19 12:28:14 -04:00
catboxanon f620f55e56 Fall back to the already-detected prediction type if no applicable one is found 2024-10-19 07:47:56 -04:00
catboxanon 5ec47a6b93 Fix model prediction detection
Closes #1109
2024-10-19 06:18:02 -04:00
DenOfEquity cc378589a4 (T5) pad chunks to length of largest chunk (#1990)
When the prompt is chunked using the BREAK keyword, chunks are padded to a minimum size of 256 tokens - but chunks can be longer. torch.stack then fails if the chunks are not all the same size, so find the largest and pad all of them to match.
#1988 (the report doesn't quite identify the real issue; prompts longer than 255 tokens work fine)
2024-10-07 11:37:06 +01:00
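A minimal sketch of the fix, assuming each chunk is a [tokens, dim] embedding tensor; the helper name and the F.pad call are illustrative, not the commit's code:

```python
import torch
import torch.nn.functional as F

def stack_padded(chunks: list[torch.Tensor], min_len: int = 256) -> torch.Tensor:
    # torch.stack needs identical shapes, so pad every chunk up to the
    # longest one (never below the 256-token floor) before stacking.
    target = max(min_len, max(c.shape[0] for c in chunks))
    padded = [F.pad(c, (0, 0, 0, target - c.shape[0])) for c in chunks]
    return torch.stack(padded)
```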
DenOfEquity 2467c88c50 fix for XPU (#1997)
use float32 for XPU, same as previous fix for MPS
2024-10-06 14:33:47 +01:00
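A sketch of the device-based dtype fallback both fixes describe (the helper is hypothetical):

```python
import torch

def encoder_dtype(device: torch.device) -> torch.dtype:
    # XPU, like MPS before it, misbehaves in half precision here,
    # so fall back to float32; other devices keep float16.
    return torch.float32 if device.type in ("xpu", "mps") else torch.float16
```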
Conor Nash 8bd7e0568f Get Flux working on Apple Silicon (#1264)
Co-authored-by: Conor Nash <conor@nbs.consulting>
2024-09-13 15:40:11 +01:00
Panchovix c13b26ba27 Rephrase low GPU warning (#1761)
Emphasise that it indicates performance degradation, and that it applies to this diffusion process.
2024-09-09 14:30:22 -03:00
layerdiffusion efe6fed499 add a way to exchange variables between modules 2024-09-08 20:22:04 -07:00
layerdiffusion f40930c55b fix 2024-09-08 17:24:53 -07:00
layerdiffusion 44eb4ea837 Support T5 & CLIP text encoder LoRA from OneTrainer
requested in #1727
plus some cleanups/licenses
PS: LoRA requests must include a download URL to at least one LoRA
2024-09-08 01:39:29 -07:00
layerdiffusion a8a81d3d77 fix offline quant lora precision 2024-08-31 13:12:23 -07:00
layerdiffusion 79b25a8235 move code 2024-08-31 11:31:02 -07:00
layerdiffusion 33963f2d19 always compute on-the-fly lora weights when offload 2024-08-31 11:24:23 -07:00
layerdiffusion 70a555906a use safer code 2024-08-31 10:55:19 -07:00
layerdiffusion 1f91b35a43 add signal_empty_cache 2024-08-31 10:20:22 -07:00
layerdiffusion ec7917bd16 fix 2024-08-30 15:37:15 -07:00
layerdiffusion d1d0ec46aa Maintain patching-related code
1. fix several problems related to layerdiffuse not being unloaded
2. fix several problems related to Fooocus inpaint
3. slightly speed up on-the-fly LoRAs by precomputing them in the computation dtype
2024-08-30 15:18:21 -07:00
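A minimal sketch of item 3; the function is hypothetical and the real patcher folds this into its weight-patch path. The LoRA pair is cast to the computation dtype once, rather than on every forward pass:

```python
import torch

def precompute_lora_delta(up: torch.Tensor, down: torch.Tensor, alpha: float,
                          compute_dtype: torch.dtype = torch.float16) -> torch.Tensor:
    # Done once when the LoRA is attached; each forward pass then adds a
    # ready-made delta instead of re-casting up/down every step.
    return alpha * (up.to(compute_dtype) @ down.to(compute_dtype))
```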
layerdiffusion f04666b19b Attempt #1575 2024-08-30 09:41:36 -07:00
layerdiffusion 4c9380c46a Speed up quant model loading and inference ...
... based on three observations:
1. torch.Tensor.view on one big tensor is slightly faster than calling torch.Tensor.to on multiple small tensors.
2. but torch.Tensor.to with a dtype change is significantly slower than torch.Tensor.view.
3. "baking" the model on the GPU is significantly faster than computing on the CPU at model load.

This mainly influences inference for Q8_0 and Q4_0/1/K, and loading for all quants.
2024-08-30 00:49:05 -07:00
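A minimal sketch of observation 1 under assumed names: one flat buffer holding all quantized weights, per-parameter shapes known from the checkpoint, and a GPU available:

```python
import math
import torch

def slice_params(flat: torch.Tensor, shapes: list[tuple[int, ...]],
                 device: str = "cuda") -> list[torch.Tensor]:
    # One .to() transfer for the whole buffer, then free views per
    # parameter - instead of a separate .to() per small tensor.
    flat = flat.to(device)
    out, offset = [], 0
    for shape in shapes:
        n = math.prod(shape)
        out.append(flat[offset:offset + n].view(shape))
        offset += n
    return out
```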
layerdiffusion 3d62fa9598 reduce prints 2024-08-29 20:17:32 -07:00
layerdiffusion 95e16f7204 maintain loading-related code
1. revise model moving order
2. less verbose printing
3. some misc minor speedups
4. some bnb-related maintenance
2024-08-29 19:05:48 -07:00
layerdiffusion d339600181 fix 2024-08-28 09:56:18 -07:00
layerdiffusion 81d8f55bca support pytorch 2.4 new normalization features 2024-08-28 09:08:26 -07:00
layerdiffusion 0abb6c4686 Second Attempt for #1502 2024-08-28 08:08:40 -07:00
layerdiffusion f22b80ef94 restrict baking to 16bits 2024-08-26 06:16:13 -07:00
layerdiffusion 388b70134b fix offline loras 2024-08-25 20:28:40 -07:00
layerdiffusion b25b62da96 fix T5 not baked 2024-08-25 17:31:50 -07:00
layerdiffusion cae37a2725 fix dequant of unbaked parameters 2024-08-25 17:24:31 -07:00
layerdiffusion 13d6f8ed90 revise GGUF by precomputing some parameters
rather than computing them in each diffusion iteration
2024-08-25 14:30:09 -07:00
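A minimal sketch of the idea using a hypothetical Q8_0-style layout (int8 blocks plus one scale per block, so len(qweight) == len(scales) * block); the scale expansion happens once at load instead of every step:

```python
import torch

class PrecomputedQuant:
    def __init__(self, qweight: torch.Tensor, scales: torch.Tensor, block: int = 32):
        self.qweight = qweight                                  # int8, flat
        # done once at load; previously recomputed per diffusion iteration
        self.scale_per_elem = scales.repeat_interleave(block)

    def dequantize(self, dtype: torch.dtype = torch.float16) -> torch.Tensor:
        # the per-step cost is now a cast and one multiply
        return self.qweight.to(dtype) * self.scale_per_elem.to(dtype)
```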
lllyasviel f82029c5cf support more t5 quants (#1482)
let's hope this is the last time people randomly invent new state dict key formats
2024-08-24 12:47:49 -07:00
layerdiffusion f23ee63cb3 always set empty cache signal as long as any patch happens 2024-08-23 08:56:57 -07:00