Google Gemma Models: lightweight, state-of-the-art open models from Google

Select Task
Select Gemma Model

Pointers

  • First response after model change will be slower (model loading lazily).
  • Switching models clears chat history.
  • Larger models need more memory but give better results.
Examples
0.1 2

Temperature: Lower values make the output more deterministic.

0.1 1

Top P: Lower values make the output more focused.

1 100

Top K: Lower values make the output more focused.

1 2

Repetition Penalty: Penalizes repeated tokens to reduce repetition in the output.

512 2048

Max Tokens: Sets the maximum number of tokens the model can generate in one response.