cortex models
This command allows you to start, stop, and manage local or remote models within Cortex.
Usage:
You can use the `--verbose` flag to display more detailed output of the internal processes. To apply this flag, use the following format: `cortex --verbose [subcommand]`.
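For example, a minimal invocation that runs the `models list` subcommand with verbose logging enabled (assuming Cortex is installed and on your `PATH`):
```sh
# Show detailed internal logs while listing models
cortex --verbose models list
```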
- macOS/Linux
- Windows
cortex models [options] [subcommand]
cortex.exe models [options] [subcommand]
Options:
| Option | Description | Required | Default value | Example |
|---|---|---|---|---|
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
Subcommands:
cortex models get
This CLI command calls the corresponding API endpoint.
This command returns the details of a model specified by a `model_id`.
Usage:
- macOS/Linux
- Windows
cortex models get <model_id>
cortex.exe models get <model_id>
For example, it returns the following:
{ "ai_template":"<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n","created":9223372036854775888,"ctx_len":4096,"dynatemp_exponent":1.0,"dynatemp_range":0.0,"engine":"llama-cpp","files":["models/cortex.so/llama3.2/3b-gguf-q4-km/model.gguf"],"frequency_penalty":0.0,"gpu_arch":"","id":"Llama-3.2-3B-Instruct","ignore_eos":false,"max_tokens":4096,"min_keep":0,"min_p":0.05000000074505806,"mirostat":false,"mirostat_eta":0.10000000149011612,"mirostat_tau":5.0,"model":"Llama-3.2-3B-Instruct","n_parallel":1,"n_probs":0,"name":"llama3.2:3b-gguf-q4-km","ngl":29,"object":"model","os":"","owned_by":"","penalize_nl":false,"precision":"","presence_penalty":0.0,"prompt_template":"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_message}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n","quantization_method":"","repeat_last_n":64,"repeat_penalty":1.0,"result":"OK","seed":-1,"stop":["<|eot_id|>"],"stream":true,"system_template":"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n","temperature":0.69999998807907104,"text_model":false,"tfs_z":1.0,"top_k":40,"top_p":0.89999997615814209,"typ_p":1.0,"user_template":"<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n","version":"2"}
This command uses the `model_id` of a model that you have downloaded or that is available in your file system.
Options:
| Option | Description | Required | Default value | Example |
|---|---|---|---|---|
| `model_id` | The identifier of the model you want to retrieve. | Yes | - | `mistral` |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
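For instance, to retrieve the details shown above (assuming the model with ID `llama3.2:3b-gguf-q4-km` has already been downloaded):
```sh
# Print the full configuration of a downloaded model
cortex models get llama3.2:3b-gguf-q4-km
```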
cortex models list
This CLI command calls the corresponding API endpoint.
This command lists all the downloaded local and remote models.
Usage:
- macOS/Linux
- Windows
cortex models list [options]
cortex.exe models list [options]
For example, it returns the following:
```
+---------+---------------------------------------------------------------------------+
| (Index) | ID                                                                        |
+---------+---------------------------------------------------------------------------+
| 1       | llama3.2:3b-gguf-q4-km                                                    |
+---------+---------------------------------------------------------------------------+
| 2       | tinyllama:1b-gguf                                                         |
+---------+---------------------------------------------------------------------------+
| 3       | TheBloke:Mistral-7B-Instruct-v0.1-GGUF:mistral-7b-instruct-v0.1.Q2_K.gguf |
+---------+---------------------------------------------------------------------------+
```
Options:
| Option | Description | Required | Default value | Example |
|---|---|---|---|---|
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
cortex models start
This CLI command calls the corresponding API endpoint.
This command starts the model specified by a `model_id`.
Usage:
- macOS/Linux
- Windows
cortex models start [options] <model_id>
cortex.exe models start [options] <model_id>
This command uses the `model_id` of a model that you have downloaded or that is available in your file system.
Options:
| Option | Description | Required | Default value | Example |
|---|---|---|---|---|
| `model_id` | The identifier of the model you want to start. | Yes | Prompt to select from the available models | `mistral` |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
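For example, to start a downloaded model (the ID below comes from the `cortex models list` output above); if you omit the model ID, the CLI prompts you to select from the available models:
```sh
# Start the model identified by its model_id
cortex models start tinyllama:1b-gguf
```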
cortex models stop
This CLI command calls the corresponding API endpoint.
This command stops the model specified by a `model_id`.
Usage:
- macOS/Linux
- Windows
cortex models stop <model_id>
cortex.exe models stop <model_id>
This command uses the `model_id` of a model that you have started before.
Options:
| Option | Description | Required | Default value | Example |
|---|---|---|---|---|
| `model_id` | The identifier of the model you want to stop. | Yes | - | `mistral` |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
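For example, to stop the model started in the previous example:
```sh
# Stop the running model
cortex models stop tinyllama:1b-gguf
```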
cortex models delete
This CLI command calls the corresponding API endpoint.
This command deletes a local model specified by a `model_id`.
Usage:
- macOS/Linux
- Windows
cortex models delete <model_id>
cortex.exe models delete <model_id>
This command uses the `model_id` of a model that you have downloaded or that is available in your file system.
Options:
| Option | Description | Required | Default value | Example |
|---|---|---|---|---|
| `model_id` | The identifier of the model you want to delete. | Yes | - | `mistral` |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
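For example, to delete a local model (the `mistral` ID below is illustrative):
```sh
# Delete the local model identified by `mistral`
cortex models delete mistral
```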
cortex models update
This CLI command calls the corresponding API endpoint.
This command updates the `model.yaml` file of a local model.
Usage:
- macOS/Linux
- Windows
cortex models update [options]
cortex.exe models update [options]
Options:
| Option | Description | Required | Default value | Example |
|---|---|---|---|---|
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
| `--model_id` | Unique identifier for the model. | Yes | - | `--model_id my_model` |
| `--name` | Name of the model. | No | - | `--name "GPT Model"` |
| `--model` | Model type or architecture. | No | - | `--model GPT-4` |
| `--version` | Version of the model to use. | No | - | `--version 1.2.0` |
| `--stop` | Stop token to terminate generation. | No | - | `--stop "</s>"` |
| `--top_p` | Sampling parameter for nucleus sampling. | No | - | `--top_p 0.9` |
| `--temperature` | Controls randomness in generation. | No | - | `--temperature 0.8` |
| `--frequency_penalty` | Penalizes repeated tokens based on frequency. | No | - | `--frequency_penalty 0.5` |
| `--presence_penalty` | Penalizes repeated tokens based on presence. | No | 0.0 | `--presence_penalty 0.6` |
| `--max_tokens` | Maximum number of tokens to generate. | No | - | `--max_tokens 1500` |
| `--stream` | Stream output tokens as they are generated. | No | false | `--stream true` |
| `--ngl` | Number of model layers to offload to the GPU. | No | - | `--ngl 4` |
| `--ctx_len` | Maximum context length in tokens. | No | - | `--ctx_len 1024` |
| `--engine` | Compute engine for running the model. | No | - | `--engine CUDA` |
| `--prompt_template` | Template for the prompt structure. | No | - | `--prompt_template "###"` |
| `--system_template` | Template for system-level instructions. | No | - | `--system_template "SYSTEM"` |
| `--user_template` | Template for user inputs. | No | - | `--user_template "USER"` |
| `--ai_template` | Template for AI responses. | No | - | `--ai_template "ASSISTANT"` |
| `--os` | Operating system environment. | No | - | `--os Ubuntu` |
| `--gpu_arch` | GPU architecture specification. | No | - | `--gpu_arch A100` |
| `--quantization_method` | Quantization method for model weights. | No | - | `--quantization_method int8` |
| `--precision` | Floating-point precision for computations. | No | float32 | `--precision float16` |
| `--tp` | Tensor parallelism degree. | No | - | `--tp 4` |
| `--trtllm_version` | Version of the TensorRT-LLM library. | No | - | `--trtllm_version 2.0` |
| `--text_model` | The model used for text generation. | No | - | `--text_model llama2` |
| `--files` | File path or resources associated with the model. | No | - | `--files config.json` |
| `--created` | Creation date of the model. | No | - | `--created 2024-01-01` |
| `--object` | The object type (e.g., model or file). | No | - | `--object model` |
| `--owned_by` | The owner or creator of the model. | No | - | `--owned_by "Company"` |
| `--seed` | Seed for random number generation. | No | - | `--seed 42` |
| `--dynatemp_range` | Range for dynamic temperature scaling. | No | - | `--dynatemp_range 0.7-1.0` |
| `--dynatemp_exponent` | Exponent for dynamic temperature scaling. | No | - | `--dynatemp_exponent 1.2` |
| `--top_k` | Top-K sampling to limit token selection. | No | - | `--top_k 50` |
| `--min_p` | Minimum probability threshold for tokens. | No | - | `--min_p 0.1` |
| `--tfs_z` | Tail-free sampling parameter (z). | No | - | `--tfs_z 0.5` |
| `--typ_p` | Typicality-based token selection probability. | No | - | `--typ_p 0.9` |
| `--repeat_last_n` | Number of last tokens to consider for the repetition penalty. | No | - | `--repeat_last_n 64` |
| `--repeat_penalty` | Penalty for repeating tokens. | No | - | `--repeat_penalty 1.2` |
| `--mirostat` | Mirostat sampling method for stable generation. | No | - | `--mirostat 1` |
| `--mirostat_tau` | Target entropy for Mirostat. | No | - | `--mirostat_tau 5.0` |
| `--mirostat_eta` | Learning rate for Mirostat. | No | - | `--mirostat_eta 0.1` |
| `--penalize_nl` | Penalize new lines in generation. | No | false | `--penalize_nl true` |
| `--ignore_eos` | Ignore the end-of-sequence token. | No | false | `--ignore_eos true` |
| `--n_probs` | Number of probability outputs to return. | No | - | `--n_probs 5` |
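For example, a sketch that adjusts a few sampling settings in one call (the model ID and values below are illustrative):
```sh
# Update temperature, top_p, and context length in the model's model.yaml
cortex models update --model_id my_model --temperature 0.8 --top_p 0.9 --ctx_len 2048
```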
cortex models import
This command imports a local model using the model's `gguf` file.
This CLI command calls the corresponding API endpoint.
Usage:
- macOS/Linux
- Windows
cortex models import --model_id <model_id> --model_path </path/to/your/model.gguf>
cortex.exe models import --model_id <model_id> --model_path </path/to/your/model.gguf>
Options:
| Option | Description | Required | Default value | Example |
|---|---|---|---|---|
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
| `--model_id` | The identifier of the model. | Yes | - | `mistral` |
| `--model_path` | The path of the model source file. | Yes | - | `/path/to/your/model.gguf` |
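For instance, assuming you have a GGUF file at `~/models/tinyllama.gguf` (both the ID and the path below are illustrative):
```sh
# Register a local GGUF file with Cortex under a chosen model_id
cortex models import --model_id tinyllama-local --model_path ~/models/tinyllama.gguf
```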