GLM by Zhipu AI: Chat, Web Search via Fhddos

GLM is Zhipu AI’s family of general-purpose language models, available through Fhddos via two interface options: an OpenAI-compatible chat completions endpoint and an Anthropic-compatible messages endpoint. Start with POST /v1/chat/completions for most use cases, and switch to POST /v1/messages only when your application explicitly depends on the Anthropic Messages protocol.

Environment setup

export BASE_URL="https://aiapi.fhddos.com"
export TOKEN="your-fhddos-token"

Verify access and list available models:

curl "$BASE_URL/v1/models" \
  -H "Authorization: Bearer $TOKEN"
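The same check can be scripted. The sketch below uses only Python's standard library and assumes the endpoint returns an OpenAI-style list under a `data` key, which this page does not confirm. The helper names `build_models_request`, `model_ids`, and `list_models` are illustrative, not part of any Fhddos SDK.

```python
import json
import urllib.request


def build_models_request(base_url: str, token: str) -> urllib.request.Request:
    """Build the GET /v1/models request with bearer authentication."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/models",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def model_ids(body: dict) -> list[str]:
    """Extract model ids from the response body.

    Assumption: the body follows the OpenAI list shape {"data": [{"id": ...}]}.
    """
    return [item["id"] for item in body.get("data", []) if "id" in item]


def list_models(base_url: str, token: str) -> list[str]:
    """Perform the network call and return the ids the token is authorised for."""
    request = build_models_request(base_url, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_ids(json.load(response))
```

Calling `list_models(os.environ["BASE_URL"], os.environ["TOKEN"])` would print the authorised ids if the response shape matches the assumption above.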

Available models

Model	Notes
`glm-5.1`	Latest generation general-purpose model
`glm-5`	New-generation general model
`glm-4.7`	General text; natively supports the `web_search` tool
`glm-4.6`	Earlier text model

The actual set of models available to your token may differ. Run GET /v1/models to see the authorised list.

Interface options

OpenAI Compatible

POST /v1/chat/completions: recommended for most use cases because it preserves GLM’s native thinking and web_search fields.

Anthropic Compatible

POST /v1/messages — use this only when your app or SDK explicitly requires the Anthropic Messages protocol.

OpenAI-compatible chat completions

Send a standard POST /v1/chat/completions request using any OpenAI SDK or HTTP client:

curl -X POST "$BASE_URL/v1/chat/completions" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.7",
    "messages": [
      {"role": "system", "content": "You are a concise technical assistant."},
      {"role": "user", "content": "Introduce GLM in three sentences."}
    ],
    "max_tokens": 256
  }'
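For clients that do not use an OpenAI SDK, the curl call above maps directly onto a plain HTTP request. This is a minimal sketch using Python's standard library; `build_chat_request` and `send` are hypothetical helper names, and the defaults simply mirror the values in the curl example.

```python
import json
import urllib.request


def build_chat_request(base_url, token, messages, model="glm-4.7", max_tokens=256):
    """Build a POST /v1/chat/completions request equivalent to the curl example."""
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from `send` makes the payload easy to inspect or log before anything goes over the network.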

GLM-specific extensions

The OpenAI-compatible entry point transparently passes through two GLM-native fields.

The thinking object controls chain-of-thought reasoning:

{
  "model": "glm-5.1",
  "thinking": {
    "type": "disabled"
  }
}

tools.type=web_search enables native web search (see Web search below).
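If reasoning is toggled per request, a small helper can attach the thinking object to an existing payload. This is a sketch only: `with_thinking` is a hypothetical name, and the two `type` values used are the ones that appear on this page ("enabled" and "disabled").

```python
import copy


def with_thinking(payload: dict, enabled: bool) -> dict:
    """Return a copy of the payload with the GLM thinking object set.

    The input payload is left untouched so it can be reused across calls.
    """
    updated = copy.deepcopy(payload)
    updated["thinking"] = {"type": "enabled" if enabled else "disabled"}
    return updated
```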

Response structure

A standard chat call returns the familiar OpenAI response shape:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "GLM is Zhipu AI's flagship language model series..."
      }
    }
  ],
  "usage": { "prompt_tokens": 32, "completion_tokens": 48, "total_tokens": 80 },
  "reasoning_content": "..."
}

When a web_search tool is active and the upstream returns evidence, a top-level web_search array is appended to the response:

{
  "choices": [{ "message": { "content": "Today's date is 2026-06-03." } }],
  "web_search": [
    {
      "title": "Current Local Time in Beijing, Beijing Municipality, China",
      "link": "https://www.timeanddate.com/worldclock/china/beijing",
      "refer": "ref_1"
    }
  ]
}
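Reading these responses defensively helps, since the web_search array and reasoning_content may be absent. The sketch below follows the field locations shown in the two examples above (reasoning_content and web_search at the top level); `summarize_response` is an illustrative name, not an SDK function.

```python
def summarize_response(response: dict) -> dict:
    """Collect the answer text, reasoning, token usage, and cited sources.

    Missing fields come back as None (or an empty list for sources)
    instead of raising KeyError.
    """
    choices = response.get("choices") or []
    message = choices[0].get("message", {}) if choices else {}
    sources = [
        {"ref": item.get("refer"), "title": item.get("title"), "link": item.get("link")}
        for item in response.get("web_search") or []
    ]
    return {
        "content": message.get("content"),
        "reasoning": response.get("reasoning_content"),
        "total_tokens": (response.get("usage") or {}).get("total_tokens"),
        "sources": sources,
    }
```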

Web search

GLM’s web search capability uses tools.type=web_search, but it requires a full web_search sub-object — not just a bare type field.

Correct request shape

curl -X POST "$BASE_URL/v1/chat/completions" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.7",
    "messages": [
      {
        "role": "user",
        "content": "Answer with only the current Beijing date in YYYY-MM-DD format, citing your sources."
      }
    ],
    "tools": [
      {
        "type": "web_search",
        "web_search": {
          "enable": true,
          "search_engine": "search_pro_jina",
          "search_result": true,
          "search_query": "Beijing current date today timeanddate",
          "count": 3
        }
      }
    ],
    "max_tokens": 900
  }'

`web_search` sub-object fields

Field	Description	Recommendation
`enable`	Activate web search	Always pass `true`
`search_engine`	Search engine identifier	Use `search_pro_jina` for best results
`search_result`	Include search evidence in response	Pass `true` to receive the `web_search` array
`search_query`	Explicit search query string	Supply this for time-sensitive queries
`count`	Number of search results to retrieve	Typically 3–5

The minimal form below is invalid and will return an error:

{
  "tools": [{ "type": "web_search" }]
}

You must include the web_search sub-object with at least "enable": true.
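A client-side check can catch the invalid form before the request is sent. The following is a rough sketch of such a check, based only on the rule stated above; `web_search_tool_errors` is a hypothetical helper, and the server may enforce additional constraints not covered here.

```python
def web_search_tool_errors(tools: list) -> list[str]:
    """Return a list of problems found in web_search tool entries.

    An empty list means every web_search tool carries a sub-object
    with "enable" set to true. Other tool types are ignored.
    """
    errors = []
    for index, tool in enumerate(tools):
        if tool.get("type") != "web_search":
            continue
        sub_object = tool.get("web_search")
        if not isinstance(sub_object, dict):
            errors.append(f"tools[{index}]: missing web_search sub-object")
        elif sub_object.get("enable") is not True:
            errors.append(f"tools[{index}]: web_search.enable must be true")
    return errors
```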

When to use explicit `search_query`

General questions — let the model decide

For broad or timeless questions (e.g. “Summarise recent AI trends”), pass only enable: true and let GLM choose its own search terms.

{
  "type": "web_search",
  "web_search": { "enable": true }
}

Time-sensitive questions — provide an explicit query

For questions about today’s date, current weather, recent news, or latest announcements, explicitly provide search_query and set search_result: true. Then read the top-level web_search array in the response instead of relying solely on the model’s natural-language answer for source verification.

{
  "type": "web_search",
  "web_search": {
    "enable": true,
    "search_result": true,
    "search_query": "AI news today 2026",
    "count": 5
  }
}
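Both shapes can come from one builder, so the choice between them reduces to whether a query is supplied. This is a minimal sketch under that assumption; `web_search_tool` and its parameters are illustrative names, and the default `count` of 5 is taken from the example above rather than from any documented default.

```python
def web_search_tool(search_query=None, count=5, search_engine=None):
    """Build a web_search tool entry.

    Without a query, only "enable" is set and the model picks its own terms.
    With a query, search_result is switched on so that the response can
    include the top-level web_search evidence array.
    """
    config = {"enable": True}
    if search_query:
        config["search_result"] = True
        config["search_query"] = search_query
        config["count"] = count
    if search_engine:
        config["search_engine"] = search_engine
    return {"type": "web_search", "web_search": config}
```

For example, `web_search_tool()` yields the general-question form, while `web_search_tool("AI news today 2026")` yields the time-sensitive form.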

Anthropic-compatible messages

Use POST /v1/messages when your application or SDK explicitly targets the Anthropic Messages API protocol:

curl -X POST "$BASE_URL/v1/messages" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.7",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Hello, who are you?"}
    ]
  }'

Verify tool and thinking extension support per channel when using the Anthropic-compatible entry point. Not all GLM-native extensions are guaranteed to round-trip through the Anthropic protocol layer.
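When moving an existing chat payload to this entry point, note that the Anthropic Messages protocol carries the system prompt in a top-level `system` field rather than as a message, and requires `max_tokens`. The sketch below performs that reshaping; `to_messages_payload` is a hypothetical helper, and whether the Fhddos endpoint honours the `system` field is an assumption to verify per channel.

```python
def to_messages_payload(chat_payload: dict, default_max_tokens: int = 256) -> dict:
    """Reshape an OpenAI-style chat payload for POST /v1/messages.

    System messages are lifted into the top-level "system" field; GLM-native
    extensions such as "thinking" and "tools" are deliberately not copied,
    because their support on this entry point is unverified.
    """
    system_parts = []
    messages = []
    for message in chat_payload.get("messages", []):
        if message.get("role") == "system":
            system_parts.append(message["content"])
        else:
            messages.append({"role": message["role"], "content": message["content"]})
    payload = {
        "model": chat_payload["model"],
        "max_tokens": chat_payload.get("max_tokens", default_max_tokens),
        "messages": messages,
    }
    if system_parts:
        payload["system"] = "\n\n".join(system_parts)
    return payload
```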

Compatibility boundaries

Feature	OpenAI entry (`/v1/chat/completions`)	Anthropic entry (`/v1/messages`)
GLM `thinking` field	✅ Passed through	Verify per channel
GLM `web_search` native tool	✅ Supported	Verify per channel
`web_search` evidence array in response	✅ Returned at top level	Verify per channel
OpenAI SDK compatibility	✅ Full	❌ Not applicable
Anthropic SDK compatibility	❌ Not applicable	✅ Full

Troubleshooting

invalid_tool — web_search tool requires web_search object

Cause: You sent {"tools": [{"type": "web_search"}]} without the required web_search sub-object.

Fix: Always include the full sub-object:

{
  "tools": [
    {
      "type": "web_search",
      "web_search": { "enable": true }
    }
  ]
}

Channel unavailable / upstream error

Cause: The GLM channel your token is routed to may be temporarily overloaded or undergoing maintenance.

Fix: Retry with exponential back-off. If the error persists for more than a few minutes, check the Fhddos status page or contact support with the platform_id from your failed request.
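One way to implement the retry is sketched below. It doubles the wait after each failure, caps it, and adds jitter so that concurrent clients do not retry in lockstep. The names `backoff_delays` and `call_with_retry`, the default attempt count, and the choice to retry only on network errors, HTTP 5xx, and 429 are all assumptions; this page does not specify which status codes the channel returns.

```python
import random
import time


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delays before each retry: base, 2*base, 4*base, ... limited to cap."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]


def call_with_retry(send, attempts=5, base=1.0, cap=30.0,
                    sleep=time.sleep, jitter=random.random):
    """Call send() up to `attempts` times with exponential back-off.

    `send` is any zero-argument callable that performs the request.
    urllib's URLError and HTTPError both subclass OSError, so they are
    caught here; `sleep` and `jitter` are injectable for testing.
    """
    last_error = None
    for delay in [0.0] + backoff_delays(attempts - 1, base, cap):
        if delay:
            sleep(delay + jitter())  # jitter adds up to one extra second
        try:
            return send()
        except OSError as error:
            status = getattr(error, "code", None)  # set on HTTPError only
            if status is not None and status < 500 and status != 429:
                raise  # client errors such as 400 or 401 will not succeed on retry
            last_error = error
    raise last_error
```

A typical call wraps the request in a lambda, for example `call_with_retry(lambda: urllib.request.urlopen(request, timeout=60))`.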

Search results are stale or missing today's information

Cause: GLM performed an autonomous search with weak query terms, or search_result was not set to true.

Fix:

Add "search_result": true to the web_search sub-object.
Pass an explicit search_query that includes date-anchoring keywords (e.g. "today 2026", "current date").
Read the top-level web_search evidence array in the response to verify sources rather than trusting only the model’s prose answer.

reasoning_content is empty or missing

Cause: Thinking/reasoning is disabled for the selected model, or "thinking": {"type": "disabled"} was set explicitly.

Fix: Remove the thinking field or set "type": "enabled" for models that support it (e.g. glm-5.1).
