Fhddos proxies Volcengine OpenSpeech text-to-speech (TTS) requests through the /volcark/openspeech/* path. You call the endpoint with your Fhddos API key, and the platform injects the upstream VolcArk TTS credentials automatically, so your Fhddos key never reaches Volcengine’s logs. Request and response bodies are identical to those described in the official Volcengine OpenSpeech documentation.
export BASE_URL="https://aiapi.fhddos.com"
export TOKEN="oh-xxxxxxxxxxxxxxxx"
Authorization: Bearer <TOKEN>
Content-Type: application/json
All TTS endpoints require a channel_id query parameter pointing to an enabled VolcArk channel. Your administrator must configure the channel and optionally set the custom_parameter.tts credentials within it.
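As a minimal sketch of how the base URL, token, and channel_id fit together, the helper below assembles an endpoint URL and the two request headers. The function names build_tts_url and build_headers are illustrative assumptions, not part of any Fhddos SDK.

```python
from urllib.parse import urlencode

BASE_URL = "https://aiapi.fhddos.com"


def build_tts_url(path: str, channel_id: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and a TTS path, appending the required channel_id."""
    if not channel_id:
        raise ValueError("channel_id is required for every TTS endpoint")
    query = urlencode({"channel_id": channel_id})
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}?{query}"


def build_headers(token: str) -> dict:
    """Headers for HTTP TTS calls: the Fhddos API key plus a JSON content type."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

For example, build_tts_url("/volcark/openspeech/api/v1/tts", "123") yields the same URL used in the V1 curl call further down this page.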

Supported Endpoints

HTTP Interfaces

| Version | Mode | Path |
| --- | --- | --- |
| V1 | Non-streaming (full audio at once) | POST /volcark/openspeech/api/v1/tts?channel_id=<channel_id> |
| V3 | HTTP unidirectional streaming | POST /volcark/openspeech/api/v3/tts/unidirectional?channel_id=<channel_id> |
| V3 | Long-text async: submit | POST /volcark/openspeech/api/v3/tts/submit?channel_id=<channel_id> |
| V3 | Long-text async: query | POST /volcark/openspeech/api/v3/tts/query?channel_id=<channel_id> |

WebSocket Interfaces

| Version | Mode | Path |
| --- | --- | --- |
| V1 | Binary unidirectional stream | GET /volcark/openspeech/api/v1/tts/ws_binary?channel_id=<channel_id> |
| V3 | Unidirectional stream | GET /volcark/openspeech/api/v3/tts/unidirectional/stream?channel_id=<channel_id> |
| V3 | Bidirectional stream | GET /volcark/openspeech/api/v3/tts/bidirection?channel_id=<channel_id> |

Credential Injection

Your administrator configures TTS credentials in the VolcArk channel’s custom_parameter.tts field:
{
  "tts": {
    "v1": {
      "token": "<v1_access_token>"
    },
    "v3": {
      "app_id": "<X-Api-App-Id>",
      "access_key": "<X-Api-Access-Key>",
      "resource_id": "seed-tts-1.1"
    }
  }
}
  • V1 token: If Authorization is absent from your request, Fhddos auto-sets Authorization: Bearer;<token> on the upstream call.
  • V3 credentials: If X-Api-App-Id, X-Api-Access-Key, or X-Api-Resource-Id are absent, Fhddos injects them from the channel config.
If you prefer to pass credentials directly in your request headers (e.g. for testing), Fhddos won’t overwrite headers you’ve already set.
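The "fill in only what is missing" rule for V3 credentials can be sketched as follows. This is an illustration of the behavior described above, not Fhddos's actual implementation; the function name fill_missing_v3_headers and the mapping table are assumptions made for the example.

```python
# Maps each V3 credential header to its key in custom_parameter.tts.v3.
V3_HEADER_KEYS = {
    "X-Api-App-Id": "app_id",
    "X-Api-Access-Key": "access_key",
    "X-Api-Resource-Id": "resource_id",
}


def fill_missing_v3_headers(client_headers: dict, v3_config: dict) -> dict:
    """Return a copy of client_headers with absent V3 credential headers added.

    Headers the client already set are left untouched.
    """
    merged = dict(client_headers)
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in client_headers}
    for header, config_key in V3_HEADER_KEYS.items():
        if header.lower() not in present and config_key in v3_config:
            merged[header] = v3_config[config_key]
    return merged
```

Under this sketch, a request that already carries X-Api-App-Id keeps its own value, while the access key and resource ID are taken from the channel config.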

V1 Non-Streaming HTTP

The V1 endpoint synthesizes the full audio in one shot and returns it as a base64-encoded string:
curl -X POST "$BASE_URL/volcark/openspeech/api/v1/tts?channel_id=123" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "app": {
      "appid": "appid123",
      "token": "any_non_empty_string",
      "cluster": "volcano_tts"
    },
    "user": {
      "uid": "uid123"
    },
    "audio": {
      "voice_type": "zh_male_M392_conversation_wvae_bigtts",
      "encoding": "mp3",
      "speed_ratio": 1.0
    },
    "request": {
      "reqid": "550e8400-e29b-41d4-a716-446655440000",
      "text": "Hello from Volcengine TTS",
      "operation": "query"
    }
  }'
The response body follows the official Volcengine format, containing code, message, data (base64 audio), sequence, and addition.
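Because the audio arrives as base64 text in the data field, a client has to decode it before writing a file. A minimal sketch, assuming the response has already been parsed from JSON into a dict (decode_v1_audio is a hypothetical helper name):

```python
import base64


def decode_v1_audio(response: dict) -> bytes:
    """Decode the base64 audio carried in the V1 response's data field."""
    data = response.get("data")
    if not data:
        # Surface the upstream code and message rather than failing silently.
        raise ValueError(
            f"no audio in response: code={response.get('code')} "
            f"message={response.get('message')}"
        )
    return base64.b64decode(data)
```

The returned bytes can be written to disk as-is, for example to an .mp3 file when encoding is set to mp3.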

V3 HTTP Unidirectional Streaming

The V3 streaming endpoint delivers audio in multiple JSON chunks over an HTTP stream. Each chunk contains a base64-encoded audio segment:
curl -N "$BASE_URL/volcark/openspeech/api/v3/tts/unidirectional?channel_id=123" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "X-Control-Require-Usage-Tokens-Return: text_words" \
  -d '{
    "user": {
      "uid": "12345"
    },
    "req_params": {
      "text": "Welcome to Fhddos, your unified AI model gateway.",
      "speaker": "zh_female_shuangkuaisisi_moon_bigtts",
      "audio_params": {
        "format": "mp3",
        "sample_rate": 24000
      }
    }
  }'
Set X-Control-Require-Usage-Tokens-Return: text_words to receive a usage field in the final chunk that shows the billable character count. Fhddos passes X-Tt-Logid through the response headers to help with debugging.
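Reassembling the stream on the client side might look like the sketch below. It assumes each chunk arrives as one JSON object per line, with the audio segment in a data field and the billing information in a usage field on the final chunk; check the Volcengine documentation for the exact framing before relying on this. The name collect_stream is illustrative.

```python
import base64
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[bytes, Optional[dict]]:
    """Concatenate base64 audio segments and capture the usage field, if any."""
    audio = bytearray()
    usage = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive or separator lines
        chunk = json.loads(line)
        if chunk.get("data"):
            audio.extend(base64.b64decode(chunk["data"]))
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
    return bytes(audio), usage
```

Any HTTP client that exposes the response body as an iterator of lines can feed this function.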

V3 Long-Text Async Tasks

For long texts, use the two-step submit/query flow.

Step 1: Submit

curl -X POST "$BASE_URL/volcark/openspeech/api/v3/tts/submit?channel_id=123" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "user": {"uid": "12345"},
    "unique_id": "5dad8cff-aa5d-496d-a83e-e9c902f4d460",
    "req_params": {
      "text": "This is a longer text that will be synthesized asynchronously by Volcengine TTS.",
      "speaker": "zh_male_bvlazysheep",
      "audio_params": {
        "format": "mp3",
        "sample_rate": 24000
      }
    }
  }'
Response:
{
  "code": 20000000,
  "data": {
    "req_text_length": 11,
    "task_id": "e7438a29-ed47-4ef8-98a6-0a10a503d8b0",
    "task_status": 1
  },
  "message": "ok"
}

Step 2: Query

Poll using the task_id returned from submit:
curl -X POST "$BASE_URL/volcark/openspeech/api/v3/tts/query?channel_id=123" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "e7438a29-ed47-4ef8-98a6-0a10a503d8b0"
  }'
When complete, the response includes:
| Field | Description |
| --- | --- |
| audio_url | Time-limited signed URL to download the synthesized audio file |
| sentences | Sentence-level and character-level timestamps |
| req_text_length | Original input character count |
| synthesize_text_length | Actual synthesized character count |
| task_status | 1 = Running, 2 = Success, 3 = Failure |

Fhddos does not modify any of these fields. Parse them as described in the official Volcengine documentation.
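The polling step can be sketched as a small loop driven by task_status. The example assumes the query response nests its fields under data, as the submit response does; the query callable, the interval, and the attempt limit are all hypothetical parameters chosen for illustration.

```python
import time
from typing import Callable

RUNNING, SUCCESS, FAILURE = 1, 2, 3  # task_status values from the table above


def wait_for_task(query: Callable[[str], dict], task_id: str,
                  interval: float = 2.0, max_attempts: int = 30) -> dict:
    """Call query(task_id) until the task succeeds, fails, or attempts run out."""
    for _ in range(max_attempts):
        data = query(task_id).get("data", {})
        status = data.get("task_status")
        if status == SUCCESS:
            return data  # expected to contain audio_url, sentences, etc.
        if status == FAILURE:
            raise RuntimeError(f"TTS task {task_id} failed")
        time.sleep(interval)  # still running: wait before the next poll
    raise TimeoutError(f"TTS task {task_id} did not finish in {max_attempts} polls")
```

Passing the HTTP call in as query keeps the loop independent of any particular HTTP library and easy to exercise with canned responses.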

WebSocket Transparent Proxy

For WebSocket-based TTS (V1 binary or V3 streaming), Fhddos operates as a byte-level transparent proxy:
  • At connection time, Fhddos uses channel_id to select the VolcArk channel and injects TTS auth headers.
  • During the session, all WebSocket frames are forwarded bidirectionally without parsing or modification.
  • If either side disconnects, Fhddos closes the other connection immediately.
To migrate existing Volcengine WebSocket code to Fhddos, replace only the host and path:
# Before (direct Volcengine)
wss://openspeech.bytedance.com/api/v1/tts/ws_binary

# After (via Fhddos)
wss://aiapi.fhddos.com/volcark/openspeech/api/v1/tts/ws_binary?channel_id=<channel_id>
All handshake headers, binary protocol framing (Protocol version, Message type, event, payload), and audio content remain unchanged.
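The host-and-path swap shown above can also be done programmatically. The helper below is a sketch using only the Python standard library; to_fhddos_ws_url is a hypothetical name, and the default host and prefix are taken from the example URLs on this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_fhddos_ws_url(volc_url: str, channel_id: str,
                     host: str = "aiapi.fhddos.com",
                     prefix: str = "/volcark/openspeech") -> str:
    """Rewrite a direct Volcengine WebSocket URL to go through Fhddos."""
    parts = urlsplit(volc_url)
    # Keep any existing query parameters and add the required channel_id.
    query = dict(parse_qsl(parts.query))
    query["channel_id"] = channel_id
    return urlunsplit((parts.scheme, host, prefix + parts.path, urlencode(query), ""))
```

Only the URL changes; the WebSocket client code that sends and receives frames stays as it was.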

Voice and Audio Configuration

Configure the voice and output format in the audio (V1) or req_params.audio_params (V3) fields of your request:
| Parameter | Description | Example Values |
| --- | --- | --- |
| voice_type / speaker | Voice ID from the Volcengine voice library | zh_female_shuangkuaisisi_moon_bigtts |
| encoding / format | Audio codec | mp3, pcm, ogg_opus |
| speed_ratio | Playback speed multiplier | 0.5 to 2.0, default 1.0 |
| sample_rate | Output sample rate (Hz) | 8000, 16000, 24000 |
Refer to the Volcengine OpenSpeech documentation for the full list of available voice IDs and parameter constraints.
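Since V1 and V3 use different field names for the same settings, a pair of small builders can keep the two shapes straight. This is a sketch that uses only the field names appearing in the request examples on this page; the function names are hypothetical, and no value ranges are enforced.

```python
def v1_audio(voice_type: str, encoding: str = "mp3",
             speed_ratio: float = 1.0) -> dict:
    """Build the V1 request's audio object."""
    return {
        "voice_type": voice_type,
        "encoding": encoding,
        "speed_ratio": speed_ratio,
    }


def v3_req_params(text: str, speaker: str, fmt: str = "mp3",
                  sample_rate: int = 24000) -> dict:
    """Build the V3 request's req_params object, including audio_params."""
    return {
        "text": text,
        "speaker": speaker,
        "audio_params": {"format": fmt, "sample_rate": sample_rate},
    }
```

The results slot into the audio key of a V1 body and the req_params key of a V3 body respectively.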

Security Note

When your request reaches Fhddos, Authorization: Bearer oh-xxxxxxxx is used to authenticate you with Fhddos. Before forwarding to Volcengine, Fhddos strips this header entirely so your Fhddos key never appears in Volcengine’s access logs.