Gemma 4 Inference (Yamify)

Yamify hosts Gemma 4 so you can ship chatbots and AI features without running model infra.

Endpoint

Yamify provides an OpenAI-compatible Chat Completions endpoint:

  • Base URL: https://www.yamify.co/api/inference/v1
  • Chat completions: POST /chat/completions
  • Models: GET /models

Auth (MVP)

Create an API key in Yamify → Dashboard → Settings → Gemma Inference (Heroku-like).

Then send it as:

  • Authorization: Bearer <YOUR_KEY>

Each key is shown only once, when it is created, so store it in a password manager right away.
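As a minimal sketch of how the endpoint and the Bearer header fit together, the snippet below calls GET /models and returns the model IDs. It assumes Node 18+ (global fetch) and assumes the response follows the OpenAI list shape ({ data: [{ id }] }), which this page does not confirm. The helper names authHeaders and listModelIds are hypothetical.

```javascript
// Base URL from the Endpoint section.
const BASE_URL = "https://www.yamify.co/api/inference/v1";

// Build the auth header every request needs: the key is sent as a Bearer token.
function authHeaders(apiKey) {
  if (!apiKey) {
    throw new Error("Missing API key (set YAMIFY_INFERENCE_API_KEY)");
  }
  return { authorization: `Bearer ${apiKey}` };
}

// List model IDs via GET /models.
// Assumption: the response uses the OpenAI list shape { data: [{ id }, ...] }.
// fetchImpl is injectable so the function can be exercised without network access.
async function listModelIds(apiKey, fetchImpl = fetch) {
  const res = await fetchImpl(`${BASE_URL}/models`, {
    headers: authHeaders(apiKey),
  });
  if (!res.ok) {
    throw new Error(`GET /models failed with HTTP ${res.status}`);
  }
  const body = await res.json();
  return (body.data ?? []).map((model) => model.id);
}
```

Typical usage would be `await listModelIds(process.env.YAMIFY_INFERENCE_API_KEY)`. Reading the key from the environment keeps it out of source control.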

Quick test (curl)

export YAMIFY_INFERENCE_API_KEY="yamify_sk_..."

curl -sS https://www.yamify.co/api/inference/v1/chat/completions \
  -H 'content-type: application/json' \
  -H "authorization: Bearer $YAMIFY_INFERENCE_API_KEY" \
  -d '{"model":"yamify/gemma4","stream":false,"messages":[{"role":"user","content":"ping"}],"max_tokens":32}'

If you get a 503 or a “taking longer than expected” message, switch to streaming (below).

Streaming returns tokens as they are generated, using server-sent events (SSE). It is the most reliable way to integrate: users see output immediately, and the open stream avoids proxy timeouts on long generations.

curl -sS -N --http1.1 https://www.yamify.co/api/inference/v1/chat/completions \
  -H 'content-type: application/json' \
  -H "authorization: Bearer $YAMIFY_INFERENCE_API_KEY" \
  -d '{"model":"yamify/gemma4","stream":true,"messages":[{"role":"user","content":"Reply with exactly: pong"}],"max_tokens":8,"temperature":0}'

This returns an OpenAI-style SSE stream (data: {...}\n\n), ending with data: [DONE].
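If you consume the stream without an SDK, the format described above can be parsed incrementally. The following is a sketch, not Yamify's own client code. It assumes each data payload follows the OpenAI chunk shape (choices[0].delta.content), which this page implies but does not spell out. createSseParser is a hypothetical helper name.

```javascript
// Incremental parser for an OpenAI-style SSE stream.
// Feed it raw text chunks as they arrive; push() returns the text deltas
// found in the events completed so far.
function createSseParser() {
  let buffer = "";
  let done = false;
  return {
    push(chunk) {
      const deltas = [];
      if (done) return deltas; // ignore anything after [DONE]
      buffer += chunk;
      // Events are separated by a blank line; keep the trailing partial event buffered.
      const events = buffer.split(/\r?\n\r?\n/);
      buffer = events.pop();
      for (const event of events) {
        for (const line of event.split(/\r?\n/)) {
          if (!line.startsWith("data:")) continue; // skip comments and other SSE fields
          const payload = line.slice(5).trim();
          if (payload === "") continue;
          if (payload === "[DONE]") {
            done = true;
            return deltas;
          }
          // Assumption: chunks follow the OpenAI shape choices[0].delta.content.
          const text = JSON.parse(payload).choices?.[0]?.delta?.content;
          if (text) deltas.push(text);
        }
      }
      return deltas;
    },
    get done() {
      return done;
    },
  };
}
```

Buffering matters because a network chunk can end in the middle of an event, so the parser only decodes JSON once it has seen the blank line that terminates the event.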

Node.js example (OpenAI SDK)

npm i openai

// Save as an ES module (e.g. chat.mjs) so that top-level await works.
import OpenAI from "openai";

// Point the OpenAI SDK at Yamify's OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.YAMIFY_INFERENCE_API_KEY,
  baseURL: "https://www.yamify.co/api/inference/v1",
});

const completion = await client.chat.completions.create({
  model: "yamify/gemma4",
  messages: [{ role: "user", content: "Say hi from Gemma 4." }],
  stream: false,
});

// Print the assistant's reply text.
console.log(completion.choices[0].message.content);
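Since streaming is the recommended integration path, here is a sketch of the same call with stream: true. With that flag the OpenAI SDK returns an async iterable of chunks. The helper below (collectStream, a hypothetical name) accepts any async iterable, so it can be tried without the SDK. It assumes the chunks carry text in choices[0].delta.content.

```javascript
// Concatenate the text deltas from a stream of chat completion chunks.
// onToken is called with each delta so a UI can render text as it arrives.
async function collectStream(stream, onToken = () => {}) {
  let text = "";
  for await (const chunk of stream) {
    // Some chunks (e.g. the first or last) may carry no content.
    const delta = chunk.choices?.[0]?.delta?.content ?? "";
    if (delta) {
      onToken(delta);
      text += delta;
    }
  }
  return text;
}

// Usage with the client from the example above (not executed here):
// const stream = await client.chat.completions.create({
//   model: "yamify/gemma4",
//   messages: [{ role: "user", content: "Say hi from Gemma 4." }],
//   stream: true,
// });
// const reply = await collectStream(stream, (t) => process.stdout.write(t));
```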

Deploy a sample chatbot on Yamify (custom app)

Use the ready-to-deploy sample:

  • GitHub (public): https://github.com/yamify-org/yamify-public/tree/main/examples/gemma4-chatbot

Deploy steps:

  1. In Yamify, go to Deploy from GitHub or ZIP (inside your workspace / Yam).
  2. Paste the repo URL above (Yamify will clone and build it automatically).
  3. Set the env var YAMIFY_INFERENCE_API_KEY to the key you created in Settings.
  4. Click Deploy. Yamify gives you a public URL, which appears on your Yam page under Live apps.
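If you adapt the sample or write your own app, it helps to fail fast when the env var from step 3 is missing. This is a hypothetical sketch, not code taken from the sample repository; requireEnv is an illustrative name.

```javascript
// Read a required environment variable, failing at startup with a clear
// message instead of sending unauthenticated requests later.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing required env var ${name}`);
  }
  return value.trim();
}

// Usage at app startup (not executed here):
// const apiKey = requireEnv("YAMIFY_INFERENCE_API_KEY");
```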

Expected URL shape (example):

  • https://<project>.<username>.apps.yamify.co

Once it's live, open the URL and start chatting. Gemma 4 is the model behind it.