Gemma 4 Inference (Yamify)

Yamify hosts Gemma 4 so you can ship chatbots and AI features without running model infra.

Endpoint

Yamify provides an OpenAI-compatible Chat Completions endpoint:

Base URL: https://www.yamify.co/api/inference/v1
Chat completions: POST /chat/completions
Models: GET /models

Auth (MVP)

Create an API key in Yamify → Dashboard → Settings → Gemma Inference (Heroku-like).

Then send it as:

Authorization: Bearer <YOUR_KEY>

Each key is shown only once, when it is created, so store it in a password manager.
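As a rough illustration of how the endpoint and the Bearer header fit together, the sketch below assembles a Chat Completions request without sending it. The base URL, path, header, and model name are taken from this page; `buildChatRequest` is a hypothetical helper name, not part of any Yamify SDK.

```javascript
// Sketch: assemble a Chat Completions request for Yamify without sending it.
// BASE_URL, the path, the Bearer header, and the model name come from this page;
// buildChatRequest itself is a hypothetical helper.
const BASE_URL = "https://www.yamify.co/api/inference/v1";

function buildChatRequest(apiKey, messages, options = {}) {
  if (!apiKey) {
    throw new Error("Missing API key: set YAMIFY_INFERENCE_API_KEY");
  }
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${apiKey}`,
      },
      // Extra options (max_tokens, temperature, stream) are merged into the body.
      body: JSON.stringify({ model: "yamify/gemma4", messages, ...options }),
    },
  };
}

// Usage (Node 18+ ships a global fetch):
// const { url, init } = buildChatRequest(
//   process.env.YAMIFY_INFERENCE_API_KEY,
//   [{ role: "user", content: "ping" }],
//   { max_tokens: 32 }
// );
// const res = await fetch(url, init);
```

Keeping request construction separate from sending makes it easy to log or unit-test the exact payload before any key leaves your machine.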

Quick test (curl)

export YAMIFY_INFERENCE_API_KEY="yamify_sk_..."

curl -sS https://www.yamify.co/api/inference/v1/chat/completions \
  -H 'content-type: application/json' \
  -H "authorization: Bearer $YAMIFY_INFERENCE_API_KEY" \
  -d '{"model":"yamify/gemma4","stream":false,"messages":[{"role":"user","content":"ping"}],"max_tokens":32}'

If you get a 503 or a “taking longer than expected” message, switch to streaming (below).
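One way to automate that switch is sketched below. It only reacts to the 503 status, since this page does not say how the "taking longer than expected" message is delivered. `chatWithStreamingFallback` and the `send` callback are hypothetical names.

```javascript
// Sketch: retry in streaming mode when the non-streaming attempt returns 503.
// `send` is a hypothetical caller-supplied function: it receives { stream }
// and resolves to a fetch-style Response (only `status` is read here).
async function chatWithStreamingFallback(send) {
  const first = await send({ stream: false });
  if (first.status !== 503) {
    return { response: first, streamed: false };
  }
  // 503 on the non-streaming call: ask again with stream: true.
  const retry = await send({ stream: true });
  return { response: retry, streamed: true };
}

// Usage (Node 18+ global fetch):
// const send = ({ stream }) =>
//   fetch("https://www.yamify.co/api/inference/v1/chat/completions", {
//     method: "POST",
//     headers: {
//       "content-type": "application/json",
//       authorization: `Bearer ${process.env.YAMIFY_INFERENCE_API_KEY}`,
//     },
//     body: JSON.stringify({
//       model: "yamify/gemma4",
//       stream,
//       messages: [{ role: "user", content: "ping" }],
//       max_tokens: 32,
//     }),
//   });
// const { response, streamed } = await chatWithStreamingFallback(send);
```

The caller still has to read the body differently depending on `streamed`: JSON for the first case, an SSE stream for the second.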

Streaming (recommended)

Streaming returns tokens as they are generated, using server-sent events (SSE). It is the most reliable way to integrate: it gives the best UX and avoids proxy timeouts.

curl -sS -N --no-buffer --http1.1 https://www.yamify.co/api/inference/v1/chat/completions \
  -H 'content-type: application/json' \
  -H "authorization: Bearer $YAMIFY_INFERENCE_API_KEY" \
  -d '{"model":"yamify/gemma4","stream":true,"messages":[{"role":"user","content":"Reply with exactly: pong"}],"max_tokens":8,"temperature":0}'

This returns an OpenAI-style SSE stream (data: {...}\n\n), ending with data: [DONE].
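To make the wire format concrete, here is a minimal sketch that extracts the generated text from a complete SSE payload. It assumes each chunk follows OpenAI's streaming shape (`choices[0].delta.content`), which this page implies but does not spell out. `parseSseText` is a hypothetical helper.

```javascript
// Sketch: pull the generated text out of an OpenAI-style SSE payload.
// parseSseText is a hypothetical helper; the chunk shape
// (choices[0].delta.content) is assumed to match OpenAI's streaming format.
function parseSseText(raw) {
  let text = "";
  // Events are separated by a blank line.
  for (const event of raw.split(/\r?\n\r?\n/)) {
    for (const line of event.split(/\r?\n/)) {
      if (!line.startsWith("data:")) continue; // skip comments and other fields
      const payload = line.slice("data:".length).trim();
      if (payload === "[DONE]") return text; // end-of-stream sentinel
      if (payload === "") continue;
      const chunk = JSON.parse(payload);
      text += chunk.choices?.[0]?.delta?.content ?? "";
    }
  }
  return text;
}

const sample =
  'data: {"choices":[{"delta":{"content":"po"}}]}\n\n' +
  'data: {"choices":[{"delta":{"content":"ng"}}]}\n\n' +
  "data: [DONE]\n\n";

console.log(parseSseText(sample)); // prints "pong"
```

This version expects the whole payload as one string. A live stream arrives in arbitrary network chunks, so a real client would buffer incoming bytes and only parse up to the last complete blank-line boundary.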

Node.js example (OpenAI SDK)

npm i openai

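// Run as an ES module (for example, save as index.mjs) so top-level await works.
import OpenAI from "openai";

// Point the OpenAI SDK at Yamify's OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.YAMIFY_INFERENCE_API_KEY,
  baseURL: "https://www.yamify.co/api/inference/v1",
});

// Non-streaming request: the full reply arrives in a single response.
const completion = await client.chat.completions.create({
  model: "yamify/gemma4",
  messages: [{ role: "user", content: "Say hi from Gemma 4." }],
  stream: false,
});

console.log(completion.choices[0].message.content);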
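Since streaming is the recommended mode, the same SDK call can be made with `stream: true`, in which case the SDK returns an async iterable of chunks. The sketch below shows one way to consume it; `collectText` is a hypothetical helper, and the chunk shape is assumed to match OpenAI's streaming format.

```javascript
// Sketch: accumulate text from a streamed completion.
// collectText is a hypothetical helper. It accepts any async iterable of
// OpenAI-style chunks, such as the stream the openai SDK returns when
// stream: true is set.
async function collectText(stream, onToken = () => {}) {
  let text = "";
  for await (const chunk of stream) {
    // Some chunks (role header, final chunk) carry no content.
    const token = chunk.choices?.[0]?.delta?.content ?? "";
    if (token) onToken(token);
    text += token;
  }
  return text;
}

// With the `client` from the example above:
// const stream = await client.chat.completions.create({
//   model: "yamify/gemma4",
//   messages: [{ role: "user", content: "Say hi from Gemma 4." }],
//   stream: true,
// });
// const reply = await collectText(stream, (t) => process.stdout.write(t));
```

The `onToken` callback is where a chatbot would push partial output to the UI, while the return value gives the full reply for logging or storage.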

Deploy a sample chatbot on Yamify (custom app)

Use the ready-to-deploy sample:

GitHub (public): https://github.com/yamify-org/yamify-public/tree/main/examples/gemma4-chatbot

Deploy steps:

1. In Yamify, go to Deploy from GitHub or ZIP (inside your workspace / Yam).
2. Paste the repo URL above. Yamify clones and builds it automatically.
3. Set the env var YAMIFY_INFERENCE_API_KEY to the key you created in Settings.
4. Deploy. Yamify gives you a public URL, which appears on your Yam page under Live apps.
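A missing env var is an easy mistake to make in the steps above. If you adapt the sample or write your own app, a fail-fast check at startup can surface it immediately. This is a generic sketch, not code from the sample repo, and `requireEnv` is a hypothetical helper.

```javascript
// Sketch: fail fast at startup if a required env var is missing.
// requireEnv is a hypothetical helper, not code from the sample repo.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing required env var ${name}`);
  }
  return value.trim();
}

// Usage at app startup:
// const apiKey = requireEnv("YAMIFY_INFERENCE_API_KEY");
```

Throwing at boot turns a confusing 401 on the first chat message into a clear error in the deploy logs.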

Expected URL shape (example):

https://<project>.<username>.apps.yamify.co

Once it’s live, open the URL and chat — Gemma 4 is the model behind it.
