# Gemma 4 Inference (Yamify)
Yamify hosts Gemma 4 so you can ship chatbots and AI features without running model infra.
## Endpoint

Yamify provides an OpenAI-compatible Chat Completions endpoint:

- Base URL: `https://www.yamify.co/api/inference/v1`
- Chat completions: `POST /chat/completions`
- Models: `GET /models`
## Auth (MVP)

Create an API key in Yamify → Dashboard → Settings → API keys, then send it as:

```
Authorization: Bearer <YOUR_KEY>
```

Keys are shown only once, when created, so store them in a password manager.
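If you call the API with raw `fetch` (the OpenAI SDK sets this header for you), the header can be built like this. A minimal sketch — `authHeaders` is a hypothetical helper, not part of any Yamify SDK:

```typescript
// Hypothetical helper: build the Authorization header Yamify expects.
// HTTP header names are case-insensitive, so lowercase is fine.
export function authHeaders(apiKey: string): Record<string, string> {
  return { authorization: `Bearer ${apiKey}` };
}
```

Pass the result into a `fetch` call's `headers`, alongside `"content-type": "application/json"`.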
## Quick test (curl)

```bash
curl -sS https://www.yamify.co/api/inference/v1/chat/completions \
  -H "content-type: application/json" \
  -H "authorization: Bearer $YAMIFY_API_KEY" \
  -d '{
    "model": "yamify/gemma4",
    "stream": false,
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Write a 1-paragraph README for a TODO app." }
    ],
    "temperature": 0.2
  }' | jq
```
If you get a "warming up" error, wait ~2 minutes and retry; the model is pulled on first use.
## Streaming (recommended)

Gemma 4 can be slow to answer on CPU. For the best UX (and to avoid proxy timeouts), use streaming:

```bash
curl -sS --no-buffer https://www.yamify.co/api/inference/v1/chat/completions \
  -H "content-type: application/json" \
  -H "authorization: Bearer $YAMIFY_API_KEY" \
  -d '{
    "model": "yamify/gemma4",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Count from 1 to 5." }
    ],
    "temperature": 0.2
  }'
```

This returns an OpenAI-style SSE stream (`data: {...}\n\n` events), ending with `data: [DONE]`.
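If you consume the stream with raw `fetch` instead of an SDK, each event line has to be parsed by hand. A minimal TypeScript sketch — `parseSSELine` is a hypothetical helper, and it assumes OpenAI-style streaming chunks:

```typescript
// Hypothetical helper: extract the text delta from one SSE line.
// Assumes OpenAI-style streaming chunks of the form:
//   data: {"choices":[{"delta":{"content":"..."}}]}
// Returns null for the terminal "data: [DONE]" event, non-data lines,
// and chunks with no text delta.
export function parseSSELine(line: string): string | null {
  if (!line.startsWith("data: ")) return null;
  const payload = line.slice("data: ".length).trim();
  if (payload === "[DONE]") return null;
  const chunk = JSON.parse(payload);
  return chunk.choices?.[0]?.delta?.content ?? null;
}
```

Split the response body on newlines and feed each line through this helper; `null` marks `[DONE]`, keep-alives, and empty deltas.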
## Node.js example (OpenAI SDK)

```bash
npm i openai
```

```ts
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.YAMIFY_API_KEY,
  baseURL: "https://www.yamify.co/api/inference/v1",
});

const completion = await client.chat.completions.create({
  model: "yamify/gemma4",
  messages: [{ role: "user", content: "Say hi from Gemma 4." }],
  stream: false,
});

console.log(completion.choices[0].message.content);
```
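For the streaming variant, the SDK yields chunks as an async iterable. A sketch of accumulating them — `collectStream` is a hypothetical helper, and the chunk shape shown is the OpenAI streaming format:

```typescript
// Hypothetical helper (not part of the OpenAI SDK): concatenate the
// text deltas of an OpenAI-style streaming response into one string.
type StreamChunk = { choices: { delta: { content?: string } }[] };

export async function collectStream(
  stream: AsyncIterable<StreamChunk>
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    // Deltas may omit `content` (e.g. role-only or final chunks).
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}

// With the SDK, `stream: true` returns such an async iterable:
//   const stream = await client.chat.completions.create({
//     model: "yamify/gemma4",
//     messages: [{ role: "user", content: "Count from 1 to 5." }],
//     stream: true,
//   });
//   console.log(await collectStream(stream));
```

In a UI you would typically append each delta as it arrives instead of collecting the whole string.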
## Deploy a sample chatbot on Yamify (custom app)

- Create a new Next.js app:

  ```bash
  npx create-next-app@latest gemma4-chatbot --ts --eslint
  cd gemma4-chatbot
  npm i openai
  ```

- Add `app/api/chat/route.ts`:
```ts
import OpenAI from "openai";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const client = new OpenAI({
    apiKey: process.env.YAMIFY_API_KEY,
    baseURL: "https://www.yamify.co/api/inference/v1",
  });

  const completion = await client.chat.completions.create({
    model: "yamify/gemma4",
    messages,
    temperature: 0.2,
  });

  return Response.json({ reply: completion.choices[0].message.content });
}
```
- Deploy it with Yamify:
  - Go to Dashboard → Deploy from GitHub or ZIP
  - Push this repo to GitHub and paste the URL
  - Set the `YAMIFY_API_KEY` env var in Yamify for the app (the same key you use locally)
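Once deployed, the app's frontend can call the route with a plain `fetch`. A sketch — `buildChatRequest` is a hypothetical helper matching the `{ messages }` body the route above expects:

```typescript
// Hypothetical helper: build the fetch options for POSTing to /api/chat.
export type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

export function buildChatRequest(messages: ChatMessage[]) {
  return {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ messages }),
  };
}

// In a component:
//   const res = await fetch("/api/chat", buildChatRequest([
//     { role: "user", content: "Hi" },
//   ]));
//   const { reply } = await res.json();
```

Keeping the API key server-side in the route handler means the browser never sees `YAMIFY_API_KEY`.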