Screen Vision

Vision AI that watches your game in real time

Sidekick AI's screen vision reads your gameplay frame by frame. The companion already knows the boss, the phase, and the loot — so voice coaching matches the moment instead of waiting for you to describe it.

Add to Steam Wishlist

How It Works

Reads your screen frame by frame

The companion sees what you see, sampling frames on a regular cadence. It picks up all of it: the boss health bar, your position, the cursor over a chest, the icon of the spell that just popped.

Identifies game state, not just pixels

Vision AI turns raw frames into structured understanding: which game, which encounter, which phase, which mechanic is winding up. The companion talks about the situation, not the screen.

Acts on the moment, not the description

Because the companion already sees the scene, you skip the entire describe-then-respond loop. You react to Sidekick's voice; Sidekick reacts to your game.

On-device frame analysis where possible

Where possible, frame analysis runs on your machine, both for speed and to keep your gameplay private. Only the data needed to generate the next coaching line is sent off your machine.

Why screen vision is the headline feature

Every AI gaming companion claims to help in real time. The honest test is whether the companion can act on the moment without you describing it. A chatbot that needs you to type “I'm at half health and Malenia just started Waterfowl Dance” before it gives advice has already lost the moment. By the time you finish typing, the fight is over.

Screen vision collapses the loop. The companion sees the boss health bar, sees the Waterfowl windup animation, sees your stamina, and calls the dodge timing in voice before you can articulate what's happening. That's the entire pitch of real-time AI coaching, and screen vision is what makes it real instead of marketing.

What “reads your screen” means in practice

Vision AI doesn't just dump pixels into a language model. The pipeline turns each captured frame into structured signals: which game is running, what scene is on screen, which UI elements are visible, what the player avatar is doing, which enemies are present, and what state the player and enemies are in. Those signals are what the coaching layer actually reasons about.

That structure is why Sidekick can make precise calls instead of vague observations. The companion can say “you're at 30% HP, back out and chug an Estus” because the vision layer extracted your HP and your flask count — not because the model guessed.
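As a rough illustration of that split between extraction and reasoning, the sketch below models the vision layer's output as a small record and the coaching layer as a rule over it. The `FrameSignals` type, its field names, and the 30% threshold are hypothetical stand-ins chosen for this example; they are not Sidekick's actual schema or logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FrameSignals:
    """Hypothetical structured output of the vision layer for one frame."""
    game: str
    encounter: Optional[str]
    hp_fraction: float  # 0.0 to 1.0, as read from the health bar
    flask_count: int    # healing charges visible in the HUD


def coaching_line(signals: FrameSignals, low_hp: float = 0.30) -> Optional[str]:
    """Return a coaching line, or None when there is nothing worth saying."""
    if signals.hp_fraction > low_hp:
        return None  # healthy enough: stay quiet
    pct = round(signals.hp_fraction * 100)
    if signals.flask_count > 0:
        return f"You're at {pct}% HP, back out and chug an Estus."
    return f"You're at {pct}% HP with no flasks left, play it safe."
```

Under these assumptions, the call depends only on values extracted from the frame. A frame at 80% HP yields no line at all, and a frame at 30% HP yields a different line depending on whether any flasks remain.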

The category difference

Most AI assistants and chatbots are blind to your game. Character.AI, ChatGPT, Replika — none of them can see what you see. They can chat about a game you describe to them, but they can't coach during play because the loop is too slow.

The AI gaming companion category exists because screen vision changed what was possible. Sidekick AI is built around that change. The 3D avatar, the voice layer, the HypeReel highlight workflow — all of it sits on top of the vision layer being good enough that the companion already knows what's happening when it speaks.

Frequently Asked Questions

How does Sidekick AI's screen vision actually work?
Sidekick captures frames from your gameplay window on a regular cadence and runs them through a vision language model tuned for games. The model identifies what's on screen — the game, the scene, the active mechanic, the player state — and passes that structured understanding to the coaching layer. The coaching layer decides what (if anything) to say. The result is voice tips that match what's actually happening, not generic advice.
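The flow described in this answer (capture on a cadence, analyze, then decide whether to speak) can be sketched as a simple loop. This is a minimal sketch, not Sidekick's implementation: the four callables, the `interval_s` cadence, and the `max_frames` cap are all hypothetical placeholders for components this page describes only at a high level.

```python
import time
from typing import Callable, Optional


def run_coaching_loop(
    capture: Callable[[], bytes],
    analyze: Callable[[bytes], dict],
    decide: Callable[[dict], Optional[str]],
    speak: Callable[[str], None],
    interval_s: float = 0.5,
    max_frames: int = 10,
) -> int:
    """Capture frames on a fixed cadence; speak only when decide() returns a line.

    Returns the number of coaching lines spoken.
    """
    spoken = 0
    for _ in range(max_frames):
        started = time.monotonic()
        frame = capture()       # grab the current frame (placeholder)
        state = analyze(frame)  # vision model -> structured state (placeholder)
        line = decide(state)    # coaching layer: a line, or None to stay quiet
        if line is not None:
            speak(line)
            spoken += 1
        # Sleep off whatever remains of the interval to hold the cadence.
        remaining = interval_s - (time.monotonic() - started)
        if remaining > 0:
            time.sleep(remaining)
    return spoken
```

Passing the stages in as callables keeps the loop independent of any particular capture method or model, and letting `decide` return `None` reflects the point that the coaching layer may choose to say nothing.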
Which games does screen vision work with?
Any PC game that runs in a standard window. There's no per-game integration required because the vision layer reads the rendered screen rather than the game's internal state. Single-player and co-op titles are where the experience is sharpest because the coaching content is tuned for them — Elden Ring, Baldur's Gate 3, Hollow Knight, Dark Souls 3, Resident Evil 4, Silent Hill 2, Lethal Company, Phasmophobia, Minecraft, and more.
Does screen vision slow down my game?
No measurable impact on most setups. The frame capture is lightweight and happens outside the game's render loop. The vision analysis runs on a separate thread or device. Sidekick is designed so your frame rate and input latency stay where they are — coaching is the value, not the bottleneck.
Can Sidekick see HUD elements, menus, and inventory screens?
Yes. The vision layer reads the whole rendered frame, including UI elements like health bars, mini-maps, inventory grids, and dialog boxes. This is how Sidekick can say things like “you're at 30% health, back off” or “that spell scroll in the loot drop is worth picking up.”
What about spoilers? Will Sidekick reveal late-game content?
The coaching layer is tuned to talk about what's on screen right now, not to volunteer information about content you haven't reached. If a story beat is about to trigger, Sidekick won't pre-empt it. If you actively ask for help on a puzzle whose solution involves later content, the companion can warn you and let you decide.
Is screen vision different from streaming or screen recording?
Yes. Streaming tools capture and broadcast your screen to viewers. Screen recording saves your screen to a file. Sidekick's screen vision reads frames in real time to generate coaching audio; the frames are not broadcast to viewers or saved as a recording. Sidekick is built to coexist with your existing streaming setup rather than replace it.
Does screen vision work on multi-monitor setups?
Yes. You select which window or display Sidekick reads. The companion only sees the surface you point it at, so a second monitor with Discord, OBS, or a wiki tab stays private.
Can I turn screen vision off temporarily?
Yes. There's a clear toggle to pause vision capture. When vision is paused, the companion still talks but stops referencing the screen — useful for cutscenes, story moments, or when you just want company without coaching.

Ready to play smarter?

Sidekick AI uses vision AI to watch your screen and coach you in real time. Try the free demo on Steam.
