Model Parameters

Updated Jun 13, 2026 ·

Overview

Most default settings are already good when we start testing LM Studio, but we can adjust them to have a more predictable output.

Setting	Purpose	Effect
Temperature	Controls randomness in output generation	Lower values produce more predictable responses, while higher values produce more varied and creative responses
Sampling (Top K, Top P, Min P)	Controls which token candidates are considered	Limits or filters possible next tokens before one is selected
Repeat Penalty	Reduces repeated words or phrases	Discourages the model from generating the same tokens repeatedly
CPU Threads	Controls CPU usage during inference	Determines how many CPU threads are used when generating responses
w
These settings affect how the model chooses the next word or token in its response.

Recap: Token Generation

A model generates text one token at a time. A token can be a word, part of a word, or a symbol.

For example, if the prompt is:

The sky is

The model may consider outputs like:

blue
clear
visible
bright

Each possible token has a probability. The model then chooses from those possible tokens based on the generation settings.

Temperature

Temperature controls how random or predictable the output is.

Temperature	Behavior	Result
`0`	No randomness	Same output every time
`0.1`	Very predictable	Mostly safe and consistent
`1`	More random	More varied and creative
Higher values	Very random	Less predictable output

Low temperature makes the model choose the most likely tokens more often. High temperature makes the model consider more varied tokens.

Top K

Top K limits how many token choices are considered.

Top K value	Meaning
`1`	Only the most likely token is used
`5`	Only the top 5 tokens are considered
`40`	The top 40 tokens are considered

If Top K is set to 1, the model becomes very predictable because it always chooses the most likely token.

Top P

Top P considers tokens based on combined probability.

Top P value	Meaning
`0.5`	Only enough tokens to reach 50% probability are considered
`0.9`	More tokens are considered
`0.95`	Even more token options are included

Top P is also called nucleus sampling. It helps balance quality and variety.

Min P

Min P removes tokens that are too unlikely.

Low probability tokens are ignored
Helps reduce strange outputs
Works together with Top K and Top P

For example, if a token has only a very small chance of being useful, Min P can remove it from the list of possible choices.

Repeat Penalty

Repeat penalty discourages the model from repeating the same words or phrases too much.

Helps avoid repeated text
Useful for longer responses
Default value is usually fine

Most of the time, you do not need to change repeat penalty.

Example: Using Temperature

In the example below, the same prompt is tested with different temperature values.

Prompt:

The sky is

Expected output with temperature 0:

The sky is blue.

If you run the same prompt again with temperature 0, you should usually get the same result.

If you increase the temperature to 1, you might get a different response because higher temperature allows more variation.

Simple Settings Guide

Goal	Recommendation
Predictable answers	Lower temperature
Creative writing	Higher temperature
Consistent testing	Temperature `0` or Top K `1`
Balanced output	Use defaults
Less repetition	Keep repeat penalty enabled

For most use cases, the default settings are already good. Change them only when you need a specific behavior.

Overview​

Recap: Token Generation​

Temperature​

Top K​

Top P​

Min P​

Repeat Penalty​

Example: Using Temperature​

Simple Settings Guide​