feat: Gemma 4 support #591

Open: giladgd wants to merge 9 commits into master from gilad/gemma4

giladgd (Member) commented Apr 6, 2026

Description of change

  • feat: Gemma 4 support
  • feat: automatically enable flash attention when optimal
  • feat: improve inference performance when a grammar is active
  • feat: more precise resource usage estimation
  • feat: resource usage capping
  • feat: useMmap: "auto"
  • feat: support Q1_0 quant
  • fix: MXFP4_MOE quant name
  • fix: Vulkan backend successful load detection even when no devices are available

Resolves #594
Fixes #600

Pull-Request Checklist

  • Code is up-to-date with the master branch
  • npm run format to apply eslint formatting
  • npm run test passes with this change
  • This pull request links relevant issues as Fixes #0000
  • There are new or updated unit tests validating the change
  • Documentation has been updated to reflect this change
  • The new commits and pull request title follow conventions explained in pull request guidelines (PRs that do not follow this convention will not be merged)

@giladgd giladgd requested a review from ido-pluto April 6, 2026 20:17
@giladgd giladgd self-assigned this Apr 6, 2026
@0x7s0lt1

🙏

@EnderBoy9217

🙏

giladgd (Member, Author) commented Apr 22, 2026

Still working on it, hope to finish in the next few days or so.
It’s going to be a bigger change than I initially planned but will significantly improve general stability and performance.

@giladgd giladgd marked this pull request as ready for review April 28, 2026 09:05

4 participants