Gemma 4 Inference (Yamify)
Yamify hosts Gemma 4 so you can ship chatbots and AI features without running model infra.
Endpoint
Yamify provides an OpenAI-compatible Chat Completions endpoint:
- Base URL: https://www.yamify.co/api/inference/v1
- Chat completions: POST /chat/completions
- Models: GET /models
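The routes above can be combined into a small helper. This is a minimal sketch, assuming Node 18+ (global fetch) and assuming GET /models returns an OpenAI-style list of the form { data: [{ id }] }, which this page does not confirm. The names endpointUrl and listModelIds are hypothetical.

```javascript
// Base URL taken from the endpoint list above.
const BASE_URL = "https://www.yamify.co/api/inference/v1";

// Join the base URL and a route such as "/models" without doubling slashes.
function endpointUrl(route) {
  return BASE_URL.replace(/\/+$/, "") + "/" + String(route).replace(/^\/+/, "");
}

// Fetch the model list and return the model ids.
// Assumption: the response body follows the OpenAI shape { data: [{ id: "..." }] }.
// fetchImpl is injectable so the function can be exercised without network access.
async function listModelIds(apiKey, fetchImpl = fetch) {
  const res = await fetchImpl(endpointUrl("/models"), {
    headers: { authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`GET /models failed with HTTP ${res.status}`);
  const body = await res.json();
  return (body.data ?? []).map((m) => m.id);
}
```

Possible usage: listModelIds(process.env.YAMIFY_INFERENCE_API_KEY).then(console.log).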
Auth (MVP)
Create an API key in Yamify → Dashboard → Settings → Gemma Inference (Heroku-like).
Then send it as:
Authorization: Bearer <YOUR_KEY>
Each key is shown only once, when it is created, so store it in a password manager.
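In application code, the header can be built in one place so that a missing key fails early instead of producing an unauthenticated request. This is a minimal sketch; authHeaders is a hypothetical helper name, and the environment variable name is the one used in the curl test on this page.

```javascript
// Build the request headers from an API key, failing fast when it is missing.
function authHeaders(apiKey) {
  if (typeof apiKey !== "string" || apiKey.trim() === "") {
    throw new Error("Missing API key: set YAMIFY_INFERENCE_API_KEY");
  }
  return {
    "content-type": "application/json",
    authorization: `Bearer ${apiKey.trim()}`,
  };
}
```

Possible usage: authHeaders(process.env.YAMIFY_INFERENCE_API_KEY), passed as the headers option of a fetch call.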
Quick test (curl)
export YAMIFY_INFERENCE_API_KEY="yamify_sk_..."
curl -sS https://www.yamify.co/api/inference/v1/chat/completions \
-H 'content-type: application/json' \
-H "authorization: Bearer $YAMIFY_INFERENCE_API_KEY" \
-d '{"model":"yamify/gemma4","stream":false,"messages":[{"role":"user","content":"ping"}],"max_tokens":32}'
If you get a 503 or a “taking longer than expected” message, switch to streaming (below).
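The fallback rule above can be sketched in code. This page does not say how the “taking longer than expected” message is delivered, so the sketch assumes it arrives in the body of a non-2xx response. The names shouldSwitchToStreaming and tryNonStreaming are hypothetical.

```javascript
// True when a failed non-streaming attempt should be retried with streaming.
// Assumption: the "taking longer than expected" text appears in the error body.
function shouldSwitchToStreaming(status, bodyText = "") {
  return status === 503 || /taking longer than expected/i.test(bodyText);
}

// Try a non-streaming request and report whether the caller should retry with
// "stream": true. fetchImpl is injectable so the logic can be tested offline.
async function tryNonStreaming(apiKey, messages, fetchImpl = fetch) {
  const res = await fetchImpl("https://www.yamify.co/api/inference/v1/chat/completions", {
    method: "POST",
    headers: { "content-type": "application/json", authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({ model: "yamify/gemma4", stream: false, messages, max_tokens: 32 }),
  });
  const text = await res.text();
  // Only inspect the body on failures, so model output cannot trigger a retry.
  if (res.ok) return { retryWithStreaming: false, completion: JSON.parse(text) };
  if (shouldSwitchToStreaming(res.status, text)) return { retryWithStreaming: true };
  throw new Error(`HTTP ${res.status}: ${text}`);
}
```

When retryWithStreaming is true, the caller would resend the same messages with "stream": true and read the response as an SSE stream.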
Streaming (recommended)
Streaming returns tokens as they are generated, using server-sent events (SSE). It is the most reliable way to integrate: users see output sooner, and long generations avoid proxy timeouts.
curl -sS -N --http1.1 https://www.yamify.co/api/inference/v1/chat/completions \
-H 'content-type: application/json' \
-H "authorization: Bearer $YAMIFY_INFERENCE_API_KEY" \
-d '{"model":"yamify/gemma4","stream":true,"messages":[{"role":"user","content":"Reply with exactly: pong"}],"max_tokens":8,"temperature":0}'
This returns an OpenAI-style SSE stream (data: {...}\n\n), ending with data: [DONE].
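A client reading this stream without an SDK has to split it into events and pull the text out of each one. The sketch below is one way to do that, assuming events are separated by "\n\n" as shown above and that each JSON payload carries its text at choices[0].delta.content (the OpenAI chunk shape, assumed here rather than confirmed by this page). parseSseChunk is a hypothetical name.

```javascript
// Parse buffered SSE text into content deltas.
// Returns { deltas, done, rest }, where rest is a trailing partial event that
// should be prepended to the next chunk read from the network.
function parseSseChunk(buffer) {
  const events = buffer.split("\n\n");
  const rest = events.pop(); // the last piece may be an incomplete event
  const deltas = [];
  let done = false;
  for (const event of events) {
    for (const line of event.split("\n")) {
      if (!line.startsWith("data:")) continue; // skip comments and other fields
      const payload = line.slice(5).trim();
      if (payload === "[DONE]") {
        done = true;
        break;
      }
      // Assumption: OpenAI-style chunk with text at choices[0].delta.content.
      const text = JSON.parse(payload).choices?.[0]?.delta?.content;
      if (text) deltas.push(text);
    }
    if (done) break;
  }
  return { deltas, done, rest };
}
```

In a fetch-based client, each decoded chunk from the response body would be appended to rest from the previous call before parsing again, until done is true.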
Node.js example (OpenAI SDK)
npm i openai
import OpenAI from "openai";
// Point the OpenAI SDK at Yamify's OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.YAMIFY_INFERENCE_API_KEY,
  baseURL: "https://www.yamify.co/api/inference/v1",
});
// Top-level await needs an ES module (a .mjs file, or "type": "module" in package.json).
const completion = await client.chat.completions.create({
  model: "yamify/gemma4",
  messages: [{ role: "user", content: "Say hi from Gemma 4." }],
  stream: false,
});
console.log(completion.choices[0].message.content);
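Since streaming is the recommended mode, the same SDK call can also be made with stream: true. This is a minimal sketch, assuming the endpoint's chunks follow the OpenAI shape that the SDK expects. streamReply is a hypothetical helper that takes a client configured as in the example above.

```javascript
// Stream a reply and return the full text.
// client: an OpenAI SDK instance pointed at the Yamify base URL.
// onToken: called with each text delta as it arrives.
async function streamReply(client, prompt, onToken = () => {}) {
  const stream = await client.chat.completions.create({
    model: "yamify/gemma4",
    messages: [{ role: "user", content: prompt }],
    stream: true,
  });
  let full = "";
  // With stream: true the SDK returns an async iterable of chunks.
  for await (const chunk of stream) {
    const text = chunk.choices?.[0]?.delta?.content ?? "";
    if (text) {
      onToken(text);
      full += text;
    }
  }
  return full;
}
```

Possible usage: await streamReply(client, "Say hi from Gemma 4.", (t) => process.stdout.write(t));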
Deploy a sample chatbot on Yamify (custom app)
Use the ready-to-deploy sample:
- GitHub (public): https://github.com/yamify-org/yamify-public/tree/main/examples/gemma4-chatbot
Deploy steps:
- In Yamify, go to Deploy from GitHub or ZIP (inside your workspace / Yam).
- Paste the repo URL above (Yamify will clone and build it automatically).
- Set env var YAMIFY_INFERENCE_API_KEY (same key you created in Settings).
- Deploy → Yamify will give you a public URL (it appears on your Yam page under Live apps).
Expected URL shape (example):
https://<project>.<username>.apps.yamify.co
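For scripts that need the URL before opening the dashboard, the shape above can be filled in programmatically. This is only a sketch: expectedAppUrl is a hypothetical helper, and the lowercase DNS-label check is an assumption about valid names, not a documented Yamify rule. The URL shown under Live apps remains the authoritative one.

```javascript
// Build the expected public URL from the shape shown above.
// Assumption: project and username are used as plain DNS labels.
function expectedAppUrl(project, username) {
  const label = /^[a-z0-9]([a-z0-9-]*[a-z0-9])?$/;
  const parts = [project, username].map((p) => String(p).toLowerCase());
  for (const p of parts) {
    if (!label.test(p)) throw new Error(`Not a valid DNS label: ${p}`);
  }
  return `https://${parts[0]}.${parts[1]}.apps.yamify.co`;
}
```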
Once it’s live, open the URL and start chatting. Gemma 4 is the model behind it.