Developing Provider
Providers are KiraAI's unified interfaces for calling large language models, speech synthesis, image generation, and other models. KiraAI has built-in common model providers, but they cannot meet all scenarios.
This chapter will guide you through developing a KiraAI provider.
File Structure
All providers are located in the core/provider/src directory.
The file structure of a KiraAI provider is as follows:
my_provider/
├── provider.py # Provider main logic
├── model_clients.py # Optional, separate file for model client implementations
├── manifest.json # Provider manifest file
├── schema.json # Provider configuration parameters Schema file
├── requirements.txt # Provider dependencies
└── README.md # Provider descriptionProvider Manifest File
The provider manifest file (manifest.json) is a configuration file used by KiraAI to identify and load providers. It contains basic provider information such as name, version, dependencies, etc.
{
"name": "Provider name, such as OpenAI",
"version": "1.0.0",
"author": "Author",
"description": "Provider description",
"repo": "Provider repository URL or null"
}Provider Configuration Parameters Schema File
{
"provider_config": {
"config_key": {
"name": "Configuration item display name",
"type": "Configuration item type",
"default": "Configuration item default value",
"hint": "Configuration item description"
}
},
"model_config": {
"model_type": {
"config_key": {
"name": "Configuration item display name",
"type": "Configuration item type",
"default": "Configuration item default value",
"hint": "Configuration item description"
}
}
}
}The following are explanations of the related fields:
provider_config: Provider configuration parameters, used to configure global parameters of the provider, such as API keys, model lists, etc.model_config: Model configuration parameters, used to configure model parameters, such as model name, temperature, maximum length, etc.model_type: Model type, available model types are:llm: Large language modeltts: Text-to-speech modelstt: Speech-to-text modelimage: Image generation modelvideo: Video generation modelembedding: Embedding modelrerank: Reranking model
config_key: Configuration item key, used to identify the configuration item.name: Configuration item display name, used to display on WebUI.type: Configuration item type, such asstring,integer,float, etc.default: Configuration item default value.hint: Configuration item description, used to display on WebUI.enum: Optional value list, only valid fortypeasstring.
Provider Main Logic
Taking OpenAI provider's main logic as an example:
from core.provider import ModelType, BaseProvider
from .model_clients import OpenAILLMClient, OpenAIImageClient, OpenAIEmbeddingClient
class OpenAIProvider(BaseProvider):
models = {
ModelType.LLM: OpenAILLMClient,
ModelType.IMAGE: OpenAIImageClient,
ModelType.EMBEDDING: OpenAIEmbeddingClient
}
def __init__(self, provider_id, provider_name, provider_config):
super().__init__(provider_id, provider_name, provider_config)The provider class needs to inherit from the BaseProvider class, which defines the basic interface of the provider.
The specific model clients are defined in the class attribute models of the provider class, with the key being the model type and the value being the model client class.
The provider class is automatically loaded when KiraAI starts. KiraAI will load the corresponding model client classes based on the configuration in the provider manifest file and model type, and provide a visual configuration interface in the WebUI.
Model Clients
WARNING
Some model client base classes may change with version updates, but in most cases, they will be compatible with older versions.
LLM Client
The LLM client is responsible for interacting with large language models, including calling the model, processing requests and responses, etc.
from core.provider import ModelInfo, LLMModelClient
from core.provider.llm_model import LLMRequest, LLMResponse
class OpenAILLMClient(LLMModelClient):
def __init__(self, model: ModelInfo):
super().__init__(model)
async def chat(self, request: LLMRequest, **kwargs) -> LLMResponse:
# Call OpenAI API
...
return LLMResponse(...)The LLM client needs to implement the chat method, which is responsible for calling the large language model and returning the model's response.
For the definitions of LLMRequest and LLMResponse, please refer to...
TTS Client
The TTS client is responsible for interacting with the text-to-speech model, receiving text input, and returning a speech object.
from core.provider import ModelInfo, TTSModelClient
from core.chat.message_elements import Record
class CustomTTSClient(TTSModelClient):
def __init__(self, model: ModelInfo):
super().__init__(model)
async def text_to_speech(self, text: str, **kwargs) -> Record:
# Call TTS API
...
return Record(...)STT Client
The STT client is responsible for interacting with the speech recognition model, receiving a speech object, and returning text.
from core.provider import ModelInfo, STTModelClient
from core.chat.message_elements import Record
class CustomSTTClient(STTModelClient):
def __init__(self, model: ModelInfo):
super().__init__(model)
async def speech_to_text(self, record: Record, **kwargs):
# Call STT API
...
return ...IMAGE Client
The IMAGE client is responsible for interacting with the image generation model, receiving text or image input, and returning an image object.
from core.provider import ModelInfo, ImageModelClient
from core.chat.message_elements import Image
class CustomImageClient(ImageModelClient):
def __init__(self, model: ModelInfo):
super().__init__(model)
async def text_to_image(self, prompt: str, **kwargs) -> Image:
# Call IMAGE API
...
return Image(...)
async def image_to_image(self, prompt: str, image: Image, **kwargs) -> Image:
# Call IMAGE API
...
return Image(...)VIDEO Client
The VIDEO client is responsible for interacting with the video generation model, receiving text or video input, and returning a video object.
from core.provider import ModelInfo, VideoModelClient
from core.chat.message_elements import Video
class CustomVideoClient(VideoModelClient):
def __init__(self, model: ModelInfo):
super().__init__(model)
async def generate_video(self, prompt: str, ref: list[Image] = None, duration: int = 5, **kwargs) -> Video:
# Call VIDEO API
...
return Video(...)EMBEDDING Client
The EMBEDDING client is responsible for interacting with the embedding model, receiving text input, and returning an embedding vector.
from core.provider import ModelInfo, EmbeddingModelClient
class CustomEmbeddingClient(EmbeddingModelClient):
def __init__(self, model: ModelInfo):
super().__init__(model)
async def embed(self, texts: list[str]) -> list[list[float]]:
# Call EMBEDDING API
...
return ...RERANK Client
The RERANK client is responsible for interacting with the reranking model, receiving text input, and returning reranking results.
from typing import Optional
from core.provider import ModelInfo, RerankModelClient
from core.provider.llm_model import RerankResult
class CustomRerankClient(RerankModelClient):
def __init__(self, model: ModelInfo):
super().__init__(model)
async def rerank(
self,
query: str,
documents: list[str],
top_n: Optional[int] = None,
**kwargs
) -> list[RerankResult]:
# Call RERANK API
...
return ...