Table of Contents

Namespace Glitch9.AIDevKit

Classes

AIApiException
AIAsset
AIAssetExtensions
AIBehaviour
AIClientException
AIClient<TSelf, TSettings>
AIDevKitHub

Central hub exposed by AIDevKit for customization and user context.

AIDevKitSettings
AIProviderSettings

Base class for AI client settings. This class is used to store API keys and other settings related to AI clients.

AIProviders
AIRequest
AllowedTools
Annotation
AnnotationJsonConverter

Polymorphic JSON converter for Annotation and its derived types. Uses the "type" discriminator field.

AnnotationWrapper

Non-flattened wrapper for different types of annotations.

AnthropicModel
ApiPopupAttribute
ApiRefAttribute
ApiSpecificAttribute
ApiSpecificPropertyAttribute
ApproximateLocation
AudioData
AudioDelta
AudioGenerationRequest<TSelf, TPrompt>
AudioIsolationParameters
AudioIsolationRequest
AudioPart
AudioPrice
AudioUsage
BaseApiSpecificPropertyAttribute
BinaryGenerativeAudioEvent

A generative audio event that contains binary audio data.

BinaryGenerativeAudioEventParser
BrokenResponseException
ClickAction
CodeGenerationRequest

Added 2025.05.28. Task for generating code snippets or scripts for Unity C#.

CodeInterpreter

A tool that runs Python code to help generate a response to a prompt.

CodeInterpreter.FileIdSet

Code interpreter container.

CodeInterpreterOutput

A tool call to run code.

CodeInterpreterOutputImage
CodeInterpreterOutputLogs
CodeInterpreterParameters
CodeInterpreterResult

Be careful. This is not a separate tool call, but a sub-object used within CodeInterpreterCall.

ComparisonFilter

A filter used to compare a specified attribute key to a given value using a defined comparison operation.

CompletionRequest

Legacy completion request for models that do not support chat-based interactions.

CompoundFilter

Combines multiple filters using 'and' or 'or'.
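
As a rough illustration of how a ComparisonFilter nests inside a CompoundFilter, the sketch below shows the OpenAI-style attribute-filter JSON these classes are assumed to model; the keys, values, and operators are placeholders, and the actual C# construction API may differ.

// Hedged sketch: an "and" compound filter wrapping two comparison filters,
// expressed as the underlying JSON these classes are assumed to represent.
string attributeFilterJson = @"{
  ""type"": ""and"",
  ""filters"": [
    { ""type"": ""eq"",  ""key"": ""region"", ""value"": ""us"" },
    { ""type"": ""gte"", ""key"": ""year"",   ""value"": 2024 }
  ]
}";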

ComputerAction
ComputerUse

A tool that controls a virtual computer.

ComputerUseCall

A tool call to a computer use tool. See the computer use guide for more information.

ComputerUseOutput

The output of a computer tool call.

ComputerUseParameters
ComputerUseSafetyCheck
ComputerUseScreenshotInfo
ContainerFileCitation
ContentPart

Base class for different types of content parts in a message. Each content part has a defined type, such as Text, Image(Url/Base64/FileId), Audio(Base64), or File(Base64/FileId).

ContentPart<T>

Base class for different types of content parts in a message. Each content part has a defined type, such as Text, Image(Url/Base64/FileId), Audio(Base64), or File(Base64/FileId).

ConversationItem
ConversationItemExtensions
ConversationItemStatus
ConversationItemType
CountTokensOutput
CountTokensRequest
CustomTool

A custom tool that processes input using a specified format.

CustomToolCall

A call to a custom tool created by the model.

CustomToolChoice
CustomToolFormat

Polymorphic format for custom tool input.

CustomToolFormatConverter

Polymorphic converter for CustomToolFormat.

CustomToolOutput

The output of a custom tool call from your code, being sent back to the model.

CustomToolParameters
DeleteModelRequest
DeltaEventBase
DeltaEvent<T>
DoubleClickAction
DragAction
DragAction.Coordinate
ElevenLabsModel
ElevenLabsTypes

Types used only by the ElevenLabs API.
These types live here instead of the ElevenLabs assembly because they are used by GENTask and the UnityEditor Generator Windows.

EmbeddingPrompt
EmbeddingRequest
EmbeddingResult
Embedding_OpenAI
EmptyResponseException
FieldRefAttribute
FileCatalog

ScriptableObject database for storing file data. This database is used to keep track of the files available in the AI library.

FileCatalog.Repo

Database for storing file data.

FileCitation
FileData
FileDeleteRequest
FileDownloadRequest
FilePart
FilePath
FileSearch

A tool that searches for relevant content from uploaded files.

FileSearch.RankingOptions

Ranking options for search.

FileSearchOutput
FileSearchParameters
FileSearchResult
FileUploadRequest
FindAction
FineTuningFile

A JSONL file is a text file where each line is a valid JSON object. This format is commonly used for training data in machine learning tasks, including fine-tuning.
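
For reference, the hypothetical snippet below appends one chat-formatted training example as a single JSONL line; the message content, file name, and exact schema are assumptions and depend on the provider and fine-tuning method.

// Hedged example: one JSON object per line (OpenAI-style chat format).
// Requires: using System.IO;
string line = @"{""messages"":[{""role"":""user"",""content"":""Hello""},{""role"":""assistant"",""content"":""Hi there!""}]}";
File.AppendAllText("training.jsonl", line + "\n");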

FineTuningRequest
FluentApiRequestBuilderExt
FluentApiRequestCallerExt

Beginner-friendly fluent extension methods that create request objects for generative AI. These helpers do not send any network calls until you invoke .ExecuteAsync().

  • Pattern: host.GENXxx().SetModel(...).ExecuteAsync()
  • Thin factories only; they return strongly-typed *Request objects.
  • No background work, no I/O, no async until .ExecuteAsync().
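
To make the pattern above concrete, here is a minimal sketch following the documented shape host.GENXxx().SetModel(...).ExecuteAsync(); the GENText() extension name, SetModel signature, and model id are assumptions, not the exact API.

// Hedged sketch of the fluent pattern: a thin factory, then configuration,
// then a single network call at ExecuteAsync().
string haiku = await "Write a haiku about Unity."
    .GENText()                 // builds a *Request object; no I/O yet
    .SetModel("gpt-4o-mini")   // assumed model identifier
    .ExecuteAsync();           // the request is sent only here
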
FluentApiRequestOptions

Originally RESTOptions was used for this, but it was too confusing, so the options were split out into a separate DTO.

FluentApiRequest<TSelf, TResult>
FreePrice
FrequencyPenalty

Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

Function

Structured representation of a function declaration as defined by the OpenAPI 3.0.3 specification. Included in this declaration are the function name and parameters. This declaration represents a block of code that can be used as a Tool by the model and executed by the client.

FunctionCall

A tool call to run a function. See the function calling guide for more information.

FunctionOutput

The output of a function tool call.

FunctionParameters
FunctionPropertyAttribute

Attribute for marking properties as function parameters in JSON Schema for LLM function calls.

Note: It's a duplicate of JsonSchemaPropertyAttribute for clarity and intent.

FunctionSchemaAttribute

OpenAI styled JSON Schema attribute for annotating classes for LLM function calls.

Note: It's a duplicate of StrictJsonSchemaAttribute for clarity and intent.
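
As a hedged sketch of how these two attributes are meant to be combined, the class below is annotated for an LLM function call; the attribute constructor arguments (the descriptions) are assumptions and may not match the real signatures.

// Hedged sketch: a function-call argument type annotated with the attributes above.
// Constructor arguments are illustrative only.
[FunctionSchema("Gets the current weather for a city.")]
public class GetWeatherArgs
{
    [FunctionProperty("City name, e.g. 'Seoul'")]
    public string City { get; set; }

    [FunctionProperty("Temperature unit: 'celsius' or 'fahrenheit'")]
    public string Unit { get; set; }
}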

FunctionToolChoice
GeneratedAudio
GeneratedBase<T>

You never know in advance whether an AI-generated result will be a single value or multiple values, so this class represents both cases: a single value or an array of values.

GeneratedExtensions
GeneratedImage

Represents the Url or the content of an image generated by Image Generation AI.

GeneratedOutput<T>
GeneratedText

Represents a generated text result from an AI model.

Generated<T>

New generic-type based design for File outputs

GenerativeRequestStreamCompleter<TEvent, TChunk, TOutput>
GenerativeRequest<TSelf, TInput, TOutput, TChunk, TEvent>

Abstract base class for all generative AI tasks. Provides common properties and methods for handling prompts, models, outputs, and execution.

GenerativeSequence

Orchestrates a series of generative tasks (text/image/audio) where each step can consume the previous output.

  • Build a pipeline with Append* methods, then run once with ExecuteAsync().
  • Keeps the most recent outputs (text/image/audio) in an internal buffer for the next step.
  • Type safety at runtime: each appended task must return the expected type (e.g., string for Text).
  • Stops on first exception; wrap ExecuteAsync() in your own try/catch if you need partial tolerance.
await new GENSequence()
    .AppendText(new GENResponseTask(new TextPrompt("Give me a short poem about the ocean.")))
    .AppendTextToImage(text => new GENImageTask(new TextPrompt($"Illustrate: {text}")))
    .AppendInterval(0.5f)
    .AppendImageToAudio(tex => new GENSpeechTask(new TextPrompt("Narrate the poem over ambient waves.")))
    .ExecuteAsync();
GetCreditsRequest

Get total credits purchased and used for the authenticated user

GetCustomModelRequest
GetModelRequest
GetVoiceRequest
GoogleModel
GoogleTypes

Types used only by the Google API.
These types live here instead of the Google assembly because they are used by GENTask and the UnityEditor Generator Windows.

GoogleTypes.UploadMetadata
GrammarCustomToolFormat

A grammar defined by the user.

HostedToolChoice

Only for Responses API. Indicates that the model should use a built-in tool to generate a response.
Learn more about built-in tools: https://platform.openai.com/docs/guides/tools

Allowed types (2025-09-21):

  • file_search
  • web_search_preview
  • computer_use_preview
  • code_interpreter
  • image_generation
HyperParameters

The hyperparameters used for the fine-tuning job.

IPromptExtensions
ImageCompressionLevel

output_compression (integer or null, optional, defaults to 100): the compression level (0-100%) for the generated images. This parameter is only supported for gpt-image-1 with the webp or jpeg output formats.

ImageData
ImageDelta
ImageGenerationOutput

An image generation request made by the model.

ImageGenerationRequest

Task for generating image(s) from text using supported models (e.g., OpenAI DALL·E, Google Imagen).

ImageGenerationTool

A tool that generates images using a model like gpt-image-1.

ImageInpaintingRequest

Task for editing an existing image based on a text prompt and optional mask (OpenAI or Google Gemini).

ImageParameters
ImagePart
ImagePrice
ImagePrompt

A specialized prompt for various image-related requests, such as image inpainting, rotation, animation, etc.
This class is used to pass the instruction and the image to the respective image model for processing.

ImageQualitySwitchAttribute
ImageReference

A reference to an image, either by file ID or base64-encoded data.

ImageSizeSwitchAttribute
ImageUsage
InappropriatePromptException
IncompleteDetails

Details on why the response is incomplete. Will be null if the response is not incomplete.

IncompleteResponseException
InputAudioBufferEvent
InterruptedResponseException
InvalidPromptException
ItemReference
JsonSchemaFormat
KeyboardTypeAction
LanguageModelRequest<TSelf, TInput, TOutput>

Base class for text generation tasks using LLM models. Supports instructions, role-based prompts, and attachments.

ListCustomModelsRequest
ListCustomVoicesRequest
ListFilesRequest
ListModelsRequest
ListVoicesRequest
LocalPropertyAttribute
LocalShell

A tool that allows the model to execute shell commands in a local environment.

LocalShellCall

A call to run a command on the local shell.

LocalShellOutput

The output from a local shell tool call.

LocalShellParameters
Location
LogProb
LogitBias

Optional. Modify the likelihood of specified tokens appearing in the completion.

Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling.

The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. Defaults to null.
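
For illustration, the sketch below builds the token-id-to-bias map described above; token id 50256 is just an example, since ids depend on the tokenizer of the model in use.

// Hedged sketch of a logit bias map. Requires: using System.Collections.Generic;
// On the wire this becomes: "logit_bias": { "50256": -100 }
var logitBias = new Dictionary<int, int>
{
    { 50256, -100 }   // effectively bans this token
};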

Logprobs

Whether to return log probabilities of the Output tokens or not. If true, returns the log probabilities of each Output token returned in the message content. This option is currently not available on the gpt-4-vision-preview model. Defaults to false.

Mcp

Give the model access to additional tools via remote Model Context Protocol (MCP) servers.

McpApprovalRequest

A request for human approval of a tool invocation.
Model > User

McpApprovalRequestEvent
McpApprovalResponse

A response to an MCP approval request.
User > Model

McpException
McpHttpException
McpListToolsCallOutput

A list of tools available on an MCP server.
Model > User

McpOutput

An invocation of a tool on an MCP server. This is both a tool call (from model to user) and a tool call output (from user to model).
Model > User, and you should send the corresponding output back to the model.

McpParameters
McpProtocalException
McpServerConnectorNotFoundException
McpServerNotFoundException
McpToolApprovalFilter

Specify which of the MCP server's tools require approval.

McpToolChoice
McpToolExecutionException
McpToolInfo

Information about a tool available on an MCP server.

McpToolPermissionInfo
McpToolRefList

List of allowed tool names or a filter object.

Message
MessageContent

Text:

  • ChatCompletion > ChatChoice[] > Message[] > MessageContent > StringOrPart > Text

MessageContentPart:

  • ChatCompletion > ChatChoice[] > Message[] > MessageContent > StringOrPart > MessageContentPart[]
MessageMapper
MicrosoftTypes
Model

ScriptableObject representation of a generative AI model with metadata, configuration, and pricing information. Supports token limits, ownership, creation time, and dynamic pricing for various content types (text, image, audio).

ModelCatalog

ScriptableObject database for storing model data. This database is used to keep track of the models available in the AI library.

ModelCatalog.Repo

Database for storing model data.

ModelFamily

Defines the family names of various AI models and services.

Warning!! DO NOT MAKE THIS INTO AN ENUM.
An enum would make this hard to maintain: inserting a new family in between existing families would break the order.

ModelNotFoundInLibraryException
ModelNotFoundOnApiException
ModelPopupAttribute
ModelPrice
ModelRefAttribute
ModelResponseException
ModelTypeExtensions
ModerationParameters
ModerationPrompt

Not directly used as a prompt, but other prompts can convert to this type for moderation requests.
This class is used to pass the text and optional images to the moderation model for processing.

ModerationRequest

Audio not supported yet.

ModerationResult
MouseActionBase
MoveAction
NCount

The number of responses to generate. Must be between 1 and 10.

NonStreamGenerativeRequest<TSelf, TInput, TOutput>
NoopGenerativeParameters
NotSupportedEndpointException

Exception thrown when a requested endpoint is not supported by the specified API.

OpenAIModel
OpenAITypes

Types used only by the OpenAI API.
These types live here instead of the OpenAI assembly because they are used by GENTask and the UnityEditor Generator Windows.

OpenPageAction
OpenRouterModel
PerplexityTypes
PresencePenalty

Number between -2.0 and 2.0; defaults to 0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

PressKeyAction
ProjectContext
Prompt
PromptBase<T>
PromptFeedback

A set of feedback metadata for the prompt specified in GenerateContentRequest.Contents.

PromptTemplate

A reference to a predefined prompt template stored on the AI provider's servers.
This allows you to use complex prompt templates without having to include the full text of the prompt in your request.
Instead, you can simply reference the prompt by its unique identifier and provide any necessary variables for substitution.
This can help to keep your requests smaller and more manageable, especially when working with large or complex prompts.

Example Template: "Write a daily report for ${name} about today's sales. Include top 3 products."
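
As a hedged illustration of the reference-plus-variables idea (OpenAI-style; other providers differ), the sketch below pairs a placeholder template id with the ${name} variable from the example above.

// Hedged sketch: referencing a stored prompt template by id and supplying
// variables for substitution. The id and variable value are placeholders.
var promptRef = new
{
    id = "pmpt_abc123",                  // server-side template identifier (example)
    variables = new { name = "Alice" }   // substituted into ${name}
};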

Prompt<T>
ProviderBridgeAttribute
ProviderBridgeRegistry
RateLimitExceededException
RealtimeApiException
RealtimeSessionStatusEvent
Reasoning
ReasoningOptions
RecordMergeOptions
RedactedReasoning

Anthropic-specific class. Represents a block of content where the model's internal reasoning or "thinking" has been intentionally hidden (redacted) before being returned to the client.

RequestPrice
RequestUsage
ResponseFormat
ResponseMessage
SafetyFeedback

Safety feedback for an entire request.

This field is populated if content in the input and/or response is blocked due to safety settings. SafetyFeedback may not exist for every HarmCategory. Each SafetyFeedback will return the safety settings used by the request as well as the lowest HarmProbability that should be allowed in order to return a result.

SafetyIdentifier

A stable identifier used to help detect users of your application who may be violating OpenAI's usage policies. The ID should be a string that uniquely identifies each user. We recommend hashing their username or email address to avoid sending OpenAI any identifying information. https://platform.openai.com/docs/guides/safety-best-practices#safety-identifiers

SafetyRating

A safety rating associated with a GenerateContentCandidate.

SafetySetting

Safety setting, affecting the safety-blocking behavior. Passing a safety setting for a category changes the allowed probability that content is blocked.

SafetySettingExtensions
ScreenshotAction
ScrollAction
SearchAction
Seed

Random seed for deterministic sampling (when supported):

  • Purpose – Reproduce the same output across runs with identical inputs.
  • Scope – Holds only if provider, model/deployment, version, and all params are unchanged.
  • null – Lets the service choose a random seed (non-deterministic).
  • Range – 0–4,294,967,295 (32-bit).
  • Support – Some models/services ignore seeds; if unsupported, this has no effect.
SegmentObject
ServerDictionary

Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers.
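
A minimal sketch of the metadata shape described above, assuming a plain dictionary view of it: up to 16 pairs, keys up to 64 characters, values up to 512 characters (strings, booleans, or numbers).

// Hedged example metadata. Requires: using System.Collections.Generic;
var metadata = new Dictionary<string, object>
{
    { "project", "my-game" },
    { "scene",   "Level_03" },
    { "debug",   true }
};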

ShellCommand
ShellCommandCatalog
ShellCommandCatalog.Repo
ShellCommandEntry
SoundEffectGenerationRequest

Task for generating sound effects based on a text prompt.

SpeechGenerationOptions
SpeechGenerationRequest

Task for generating synthetic speech (text-to-speech) using the specified model.

SpeechGenerationRequestBase<TSelf, TPrompt>
SpeechParameters
SpeechSpeed

The speed of the model's spoken response as a multiple of the original speed. 1.0 is the default speed. 0.25 is the minimum speed. 1.5 is the maximum speed. This value can only be changed in between model turns, not while a response is in progress.

This parameter is a post-processing adjustment to the audio after it is generated; it's also possible to prompt the model to speak faster or slower.

SpeechTranslationRequest

Task for translating speech into English text using the speech translation model.

SpokenLanguagePopupAttribute
StatefulItem
StreamEventArgs
StreamEventArgs<TType>
StreamOptions
StrictJsonSchema

OpenAI styled JSON Schema for strict response formatting.

StrictJsonSchemaAttribute

OpenAI styled Strict JSON Schema attribute for annotating classes.

StrictJsonSchemaAttribute > StrictJsonSchema > JsonSchemaFormat(ResponseFormat)
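
As a hedged sketch of the flow above (attribute to schema to response format), the class below is annotated for strict structured output; the attribute constructor argument is an assumption and the real attribute may take different or no arguments.

// Hedged sketch: a response type annotated for strict JSON Schema output.
[StrictJsonSchema("weather_report")]   // schema name is illustrative
public class WeatherReport
{
    public string City { get; set; }
    public float TemperatureC { get; set; }
}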

StrictJsonSchemaExtensions
StringOrTextAsset
StringOrTextAssetExtensions
StructuredOutputRequest<T>

Task for generating structured output (e.g., JSON) using an LLM model.

StructuredOutput<T>
SystemMessage
Temperature

Sampling temperature: controls randomness in output.

  • Lower = deterministic
  • Higher = creative
Range: 0.0–2.0 (typical: 0.7–1.0).
TextCustomToolFormat

Unconstrained free-form text format.

TextDelta
TextDeltaThrottler

A throttler that collects irregularly arriving text fragments in an internal buffer and emits them merged at a fixed interval (30Hz by default).
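
The sketch below shows only the general idea described above (buffer incoming chunks, flush on a fixed timer), not the class's actual API; the names, the 33 ms period, and the cancellation handling are assumptions.

// Hedged conceptual sketch: collect deltas, emit merged text at ~30 Hz.
// Requires: using System.Text; using System.Threading; using System.Threading.Tasks;
var buffer = new StringBuilder();

void OnDelta(string chunk) { lock (buffer) buffer.Append(chunk); }

async Task FlushLoopAsync(Action<string> emit, CancellationToken ct)
{
    while (!ct.IsCancellationRequested)
    {
        await Task.Delay(33, ct);   // ~30 Hz; throws OperationCanceledException on cancel
        string merged;
        lock (buffer) { merged = buffer.ToString(); buffer.Clear(); }
        if (merged.Length > 0) emit(merged);
    }
}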

TextEditorParameters
TextOutput
TextPart

Text, Refusal, InputText, OutputText content part.

TextResponseOptions
TextSpansEvent
TimeWindowExtensions
TokenCount

Used to set 'max_tokens', 'max_completion_tokens', 'max_output_tokens', etc. Must be greater than or equal to 1024. Set it to null to disable the limit.

TokenId
TokenPrice
TokenUsage
TokenizeOutput
TokenizeRequest
Tool

Base class for all tools, includes type.

ToolCall
ToolCallArgs
ToolCallEvent
ToolChoice

This can be a String or an Object. Specifies a tool the model should use; use it to force the model to call a specific tool.

ToolMessage

The API no longer sends these. This type is only used to send tool outputs from the client side.

ToolOutput
ToolOutputEvent
ToolOutputTimeoutException
ToolReference
ToolStatusEvent
ToolTypeExtensions
TopK
TopP
Transcript
TranscriptionParameters
TranscriptionPrice
TranscriptionRequest

Task for converting speech audio into text (speech-to-text).

TranscriptionRequestBase<TSelf, TOutput, TEvent>
TranscriptionUsage
TruncationStrategy
UnhandledToolCallException
UnknownAction
UnknownItem
UploadedFile
UploadedFileExtensions
UrlCitation
UrlSearchSource
Usage
UsageMetadata

Usage metadata returned by AI service providers after a generation request. Contains token usage details for billing and monitoring.

UserMessage
VerboseTranscript

Represents a verbose json transcription response returned by model, based on the provided input. Used by OpenAI, GroqCloud, and other compatible services.

VideoGenerationRequest
VideoParameters
VisualMediaGenerationRequest<TTask, TPrompt, TOutput, TChunk, TEvent>
Voice
VoiceCatalog

ScriptableObject database for storing voice data used for TTS (Text-to-Speech) and other voice-related tasks.

VoiceCatalog.Repo

Database for storing voice data.

VoiceChangeParameters
VoiceChangeRequest
VoiceData
VoicePopupAttribute
VoiceStyleConverter
VoiceUtil
WaitAction
WebSearch

Search the Internet for sources related to the prompt.

WebSearchAction
WebSearchFilter

Filters for the search.

WebSearchOptions
WebSearchOptionsWrapper
WebSearchOutput

A tool call to perform a web search action.
This tool call does not have a corresponding output class, as the results are returned via text messages.

WebSearchParameters
WebSearchPreview

This tool searches the web for relevant results to use in a response.

WebSearchPrice
WebSearchSource
WebSearchUsage
WordObject
XSearchParameters

Structs

FluentApiRequestType
ImageQuality

The quality of the image that will be generated. HD creates images with finer details and greater consistency across the image. This param is only supported for DallE3.

ImageSize
ServiceTier

The service tier to use for the request. "auto" lets the system choose the appropriate tier based on context. Different providers may have different tier names and meanings. See provider documentation for details.

StreamHeader
TextSpan
ToolType
TruncationType
VideoSize

Interfaces

IAssetData
IAssetFilter<T>
IChatApiListenerBase
IComputerUseResult
ICreditData
IDeltaListener<TEvent>
IErrorHandler
IEventListener<T>
IFileSearchFilter
IFineTuningResult
IGeneratedFiles
IGeneratedOutput
IGenerativeAudioEvent
IGenerativeEvent<TChunk, TOutput>
IGenerativeImageEvent
IGenerativeParameters
IGenerativeRequest
IImageDeltaListener
IInputAudioBufferListener
ILanguageModelRequest
IListener
IMcp
IModelData

Interface for model data retrieved from various AI APIs. (e.g., /v1/models) This interface defines the properties that all model data should implement. It is used to standardize the model data across different AI providers.

IModeratable
IMultiProviderJsonWriter<TModel>
INoopStreamEvent<TOutput>
INoopStreamEvent<TChunk, TOutput>
IPrompt

'Prompt' is a general term for the input given to an AI model to generate a response.
This can include text prompts, image prompts, audio prompts, or any combination thereof.
The prompt can be a simple string, a more complex object, or even a file.
The purpose of the prompt is to guide the AI model in generating a relevant and accurate response.

IPromptWithFiles

Prompts that require loading (e.g., from files) should implement this interface.
This ensures that any necessary loading operations are handled before the prompt is used.

IProviderBridge
IProviderData
IRealtimeApiListener
IResponsesApiListener
ISequentialRequest

Interface for tasks that can be executed as part of a sequence.

ITextDeltaListener
ITextSpanData
ITextSpansListener
IToolCallArgsListener
IToolCallOutput
IToolOutputListener
IToolParameters
IToolStatusListener
ITranscriptionEvent
IUploadedFile

Represents a file object retrieved from an AI provider (e.g., /v1/files).

  • Provides common properties such as file size, MIME type, and timestamps.
  • Normalizes file information across providers like OpenAI and Google.
  • Used as the return type for file-related tasks (e.g., UploadFileTask, DownloadFileTask).
IUsageHandler
IUserProfile

Attach this interface to your user class to enable AIDevKit features. This interface is used to provide user-specific context and settings.

IVoiceData

Interface for voice data retrieved from various AI APIs.
This interface defines the properties that all voice data should implement. It is used to standardize the voice data across different AI providers.

Enums

Api

Identifies the available AI service providers for API integrations. It is not named 'Provider' because some services are self-hosted/local (e.g., Ollama).

ArtStyle
ChatRole
CodeReferenceSource
ComparisonType
CompoundType
ContentFormat
CustomToolFormatType
ElevenLabsTypes.InputFormat

The format of input audio. Options are 'pcm_s16le_16' or 'other'. For pcm_s16le_16, the input audio must be 16-bit PCM at a 16kHz sample rate, single channel (mono), and little-endian byte order. Latency will be lower than with passing an encoded waveform.

ElevenLabsTypes.OutputFormat

Output format of the generated audio. Formatted as codec_sample_rate_bitrate. So an mp3 with a 22.05kHz sample rate at 32kbps is represented as mp3_22050_32. MP3 with a 192kbps bitrate requires you to be subscribed to Creator tier or above. PCM with a 44.1kHz sample rate requires you to be subscribed to Pro tier or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs. Default is mp3_44100_128.

EmbedTaskType

Google Only. Task type for embedding content.

GameGenre
GameTheme
Gender

Mainly used as a TTS Voice property.

GeneratedImageFormat
GoogleTypes.AspectRatio
GoogleTypes.PersonGeneration
GoogleTypes.Resolution
HarmBlockThreshold

Block at and beyond a specified harm probability.

HarmCategory

Represents the category of harm that a piece of content may fall into. This is used in moderation tasks to classify content based on its potential harm.

HarmProbability

Probability that a prompt or candidate matches a harm category.

ImageBackground
InputAudioBufferEvent.Type
ItemStatus

Used by Responses API.
Unified status for all response items (messages, tool calls, searches, code-interpreter runs, generations, etc.).

Core lifecycle values (in_progress, completed, incomplete, failed) map directly to the OpenAI Responses API and are populated when items are returned from the API.
Domain-phase values (searching, generating, interpreting, partial) are used by AIDevKit to represent more detailed sub-states for search, tool execution, code interpretation, and streaming generation.

  • For input messages (system, developer, user), this field is optional.
  • For assistant output items (assistant messages, tools, searches, code interpreter, etc.), this field is required when returned from the API or internal pipelines.
  • partial indicates a non-final, intermediate snapshot (e.g., while streaming).
LanguageTone
McpServerConnectionType
MediaGenOp
MicrosoftTypes.TranscriptionApi
Modalities

"Modality" refers to the type or form of data that a model is designed to process, either as input or output. In AI and machine learning contexts, modality describes the nature of the information being handled — such as text, image, audio, or video.

For example:

  • A text-to-text model like GPT-4 processes text inputs and generates text outputs.
  • A text-to-image model like DALL·E takes text prompts and produces images.
  • A multimodal model like Gemini can process multiple types of data simultaneously, such as combining text and image inputs.

The concept of modality helps categorize models based on the kinds of sensory or informational data they handle, and is especially important for understanding the capabilities and limitations of a model.

ModelCapabilities

Unified model capabilities enum. Combines capabilities across different model types for easier management.

ModelType

Types of AI Models. Multi-modal models such as Gemini should be classified under their primary function, typically as Language Models.

MouseButton
NamingRule

Defines rules for generating unique file names.

OSMask
OpenAITypes.AudioStreamFormat
OpenAITypes.Fidelity
OpenAITypes.ImageDetail
OpenAITypes.ImageStyle

The style of the generated images. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images. This param is only supported for DallE3.

OpenAITypes.MediaAspect
OpenAITypes.SpeechOutputFormat
OpenAITypes.UploadPurpose
PerplexityTypes.WebSearchMode
Platform
PromptType
RealtimeSessionStatus
ReasoningEffort
ReasoningFormat

GroqCloud-specific parameter

ReasoningOptions.SummaryLevel
RequireApproval
ResponseVerbosity
SearchContextSize
StopReason

The reason the model stopped generating tokens. It is also called "finish_reason" in some APIs.

TextSpanType
TextType
TimeWindow
TokenCountPreset
TokenType
ToolChoiceMode
TranscriptFormat
UsageType
VoiceAge
VoiceCategory

The category of the voice.

VoiceStyle
VoiceType
WebSearchLocationMode