by fasuizu-br • Uncategorized
Production AI APIs providing speech, text, image, and LLM inference services via REST endpoints and MCP servers.
Real-time speech recognition and pronunciation assessment.
Natural language processing capabilities like sentiment analysis and PII detection.
Image processing features such as background removal and face restoration.
Brainiall AI APIs offer a comprehensive suite of AI services including pronunciation assessment, text-to-speech, speech-to-text, NLP, image processing, and large language model inference. These services are accessible through REST APIs and MCP servers designed for AI agents, enabling fast, scalable, and versatile AI integration. The platform supports multiple authentication methods, extensive model options, and is compatible with popular AI SDKs and frameworks.
Assess English pronunciation quality from audio. Scores pronunciation at four levels: overall, sentence, word, and phoneme. Each score is 0-100. Phonemes are returned in both IPA and ARPAbet notation. Sub-300ms inference latency.

Args:
  audio_base64: Base64-encoded audio data. Supports WAV, MP3, OGG, and WebM formats.
  text: The reference English text that the speaker was expected to read aloud.
  audio_format: Audio format hint, one of 'wav', 'mp3', 'ogg', 'webm'. Defaults to 'wav'.

Returns: dict with keys:
  - overallScore (int 0-100): Overall pronunciation quality
  - sentenceScore (int 0-100): Sentence-level fluency and accuracy
  - words (list): Per-word scores, each containing:
    - word (str): The word
    - score (int 0-100): Word pronunciation score
    - phonemes (list): Per-phoneme scores with IPA/ARPAbet notation
  - decodedTranscript (str): What the model heard (ASR transcript)
  - transcript (str): Reference text
  - confidence (float 0-1): Scoring confidence
  - warnings (list[str]): Quality warnings if any
  - audioQuality (dict): Audio metrics (SNR, peak/RMS dB, etc.)
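A minimal sketch of how a client might prepare the arguments described above and read the per-word scores back. The helper names (`build_assessment_args`, `weak_words`), the threshold of 60, and the sample response are illustrative assumptions, not part of the service; only the argument and key names come from the description.

```python
import base64

# Formats listed in the tool description for audio_format.
SUPPORTED_FORMATS = {"wav", "mp3", "ogg", "webm"}


def build_assessment_args(audio_bytes, text, audio_format="wav"):
    """Build the argument dict for the pronunciation tool (hypothetical helper)."""
    if audio_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio_format: {audio_format!r}")
    return {
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "text": text,
        "audio_format": audio_format,
    }


def weak_words(result, threshold=60):
    """Return words scoring below `threshold` (the cutoff is an assumption)."""
    return [w["word"] for w in result.get("words", []) if w["score"] < threshold]


# Made-up response shaped like the documented return value.
sample_result = {
    "overallScore": 70,
    "sentenceScore": 72,
    "words": [
        {"word": "the", "score": 92, "phonemes": []},
        {"word": "thought", "score": 48, "phonemes": []},
    ],
}

print(weak_words(sample_result))  # ['thought']
```

How the argument dict is actually sent (REST request or MCP tool call) depends on the endpoint and tool names, which this listing does not give.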
Check if the pronunciation assessment service is healthy and ready.

Returns: dict with keys:
  - status (str): 'healthy' or error state
  - modelLoaded (bool): Whether the scoring model is loaded
  - version (str): API version
Get the full phoneme inventory supported by the pronunciation scorer. Returns a list of all English phonemes the engine can assess, including ARPAbet symbol, IPA equivalent, example word, and phoneme category (vowel, consonant, diphthong).

Returns: list of dicts, each with keys:
  - arpabet (str): ARPAbet symbol (e.g. 'AA', 'TH')
  - ipa (str): IPA notation
  - example (str): Example word containing the phoneme
  - category (str): vowel, consonant, or diphthong
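As an illustration of working with the inventory, the sketch below groups entries by their `category` field. The function name `group_by_category` and the three sample entries (including the example words) are assumptions made for this example, not output from the service.

```python
from collections import defaultdict


def group_by_category(inventory):
    """Map each category to the ARPAbet symbols it contains."""
    groups = defaultdict(list)
    for phoneme in inventory:
        groups[phoneme["category"]].append(phoneme["arpabet"])
    return dict(groups)


# Illustrative entries shaped like the documented list items.
sample_inventory = [
    {"arpabet": "AA", "ipa": "ɑ", "example": "father", "category": "vowel"},
    {"arpabet": "TH", "ipa": "θ", "example": "think", "category": "consonant"},
    {"arpabet": "AY", "ipa": "aɪ", "example": "my", "category": "diphthong"},
]

print(group_by_category(sample_inventory))
```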
Transcribe audio to text with word-level timestamps. Converts spoken English audio into text with optional word-level timestamps and per-word confidence scores.

Args:
  audio_base64: Base64-encoded audio data (WAV, MP3, OGG, FLAC, WebM).
  audio_format: Audio format hint. Auto-detected from magic bytes if omitted.
  include_timestamps: Whether to include word-level timing (default: true).

Returns: dict with keys:
  - text (str): Full decoded transcript
  - words (list): Per-word results with timestamps, each containing:
    - word (str): The transcribed word
    - start (float): Start time in seconds
    - end (float): End time in seconds
    - confidence (float 0-1): Word-level confidence
  - audioDurationMs (int): Audio duration in milliseconds
  - metadata (dict): Processing time, audio length, model version
  - audioQuality (dict): Audio metrics (SNR, peak/RMS dB, etc.)
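One way to use the word-level output is sketched below: estimating speaking rate from `audioDurationMs` and flagging low-confidence words. Both helpers, the 0.5 confidence cutoff, and the sample response are assumptions for illustration; only the key names come from the description.

```python
def words_per_minute(result):
    """Estimate speaking rate from the word count and audio duration."""
    duration_ms = result["audioDurationMs"]
    if duration_ms <= 0:
        raise ValueError("audioDurationMs must be positive")
    return len(result["words"]) * 60000 / duration_ms


def low_confidence_words(result, min_confidence=0.5):
    """Return (word, start, end) for words below the assumed cutoff."""
    return [
        (w["word"], w["start"], w["end"])
        for w in result["words"]
        if w["confidence"] < min_confidence
    ]


# Made-up response shaped like the documented return value.
sample_transcript = {
    "text": "hello big world",
    "words": [
        {"word": "hello", "start": 0.0, "end": 0.4, "confidence": 0.98},
        {"word": "big", "start": 0.5, "end": 0.7, "confidence": 0.31},
        {"word": "world", "start": 0.8, "end": 1.3, "confidence": 0.95},
    ],
    "audioDurationMs": 1500,
}

print(words_per_minute(sample_transcript))      # 120.0
print(low_confidence_words(sample_transcript))  # [('big', 0.5, 0.7)]
```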
Check if the speech-to-text service is healthy and ready.

Returns: dict with keys:
  - status (str): 'healthy' or error state
  - modelLoaded (bool): Whether the STT model is loaded
  - version (str): API version
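Both health checks return the same three keys, so a client could gate requests on a single readiness test. The sketch below assumes a service is ready only when `status` is 'healthy' and `modelLoaded` is true; the function name `is_ready` and the version strings are placeholders.

```python
def is_ready(health):
    """Treat a service as ready only if it is healthy and its model is loaded."""
    return health.get("status") == "healthy" and health.get("modelLoaded") is True


# Made-up health responses shaped like the documented return value.
print(is_ready({"status": "healthy", "modelLoaded": True, "version": "x.y"}))   # True
print(is_ready({"status": "healthy", "modelLoaded": False, "version": "x.y"}))  # False
```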
Scores are informational only and provided “as is” without warranty. AgentHotspot assumes no liability for actions taken based on these ratings.