The Caption Formula: Hook, Value, Ask (And When to Break It)

Most captions that actually work follow the same three-part shape — and once you can see it, you stop writing every post from scratch. Here's the formula, a worked example, and the moments when ignoring it entirely is the smarter move.

Dave Smith

Staring at a blank caption box is its own special kind of dread. You've got the photo. You've got thirty seconds before the kettle boils. And the cursor just blinks at you, waiting for words you haven't got.

Here's the good news: most captions that work follow the same simple shape. Once you can see it, you stop writing from scratch every time. You're just filling in three slots — hook, value, ask — and you're done before the tea's brewed.

The three parts, plainly

The hook is your first line. It's the only bit you can count on being seen, because on most platforms the rest gets folded behind a "...more". If the first line doesn't earn a tap, nothing else you wrote matters. So it has to do one job: make someone pause.

A good hook usually does one of three things. It says something slightly unexpected ("We turned away three jobs this week, and I'm glad we did"). It names a problem the reader recognises ("Nobody tells you how quiet January is when you're self-employed"). Or it asks a question they actually want answered ("Ever wonder why your bread goes stale faster in summer?"). What it never does is start with "We are delighted to announce." That's a press release, not a hook.

The value is the middle: the reason the post exists. This is where you actually say the thing. Teach something, explain a decision, tell the small story behind the photo, let people in on how the work gets done. It doesn't need to be long. Two or three sentences of genuine substance beat a paragraph of warm air. The test is simple: if a reader screenshots this and nothing else, have they got something worth keeping? A tip, a reason, a laugh, a "huh, didn't know that." If yes, you've delivered value.

The ask is the last line, and it's the part most people skip. An ask is just a gentle nudge towards one specific action. "Tell me your worst January story." "Save this for the next time your bread goes stale." "Pop in before noon — that's when the good ones go." Notice these are small and concrete. You're not begging for sales. You're giving people one easy thing to do next, because a post with no exit just trails off and gets scrolled past.

Putting it together

Say you run a small garden centre and you've got a photo of a tray of tomato seedlings.

Without the formula, you'd probably write: "Tomato seedlings now in stock! 🍅 Come and grab yours." Fine. Forgettable.

With it: *"Most people kill their tomatoes in the first fortnight."* (hook) *"It's almost always overwatering — seedlings want their soil to dry out a touch between drinks, not stay soggy. Poke a finger in; if it's damp two knuckles down, leave it."* (value) *"These trays just landed if you fancy another go this year — and tell me, what's the one plant you can never keep alive?"* (ask)

Same photo. Same thirty seconds. One of them earns a comment and gets saved; the other vanishes. The difference isn't talent or budget. It's structure.

When to break it

Now, the bit nobody mentions: the formula is a scaffold, not a law. Lean on it when you're stuck, and walk away from it when the moment calls for something else.

Skip the ask when the post is purely human — a thank-you to your team, a tribute to a regular who's moved away, a quiet "we're shut Monday for the bank holiday, see you Tuesday." Tacking "what do you think? 👇" onto a heartfelt post cheapens it. Let it just be what it is.

Drop the hook gymnastics when the news *is* the hook. "We're closed today, burst pipe, back tomorrow" doesn't need a clever opener. Clarity is the hook.

And sometimes the whole thing is one line. A genuinely funny observation, a single striking photo with a three-word caption, a real-time "the queue is out the door right now." Over-building those kills them. The formula exists to rescue the posts you'd otherwise abandon, not to flatten every post into the same shape.

The point of having a shape

The reason any of this matters isn't that captions are some dark art. It's that a repeatable structure removes the part of posting that actually stops you — the staring. When you know a caption is just three small decisions, you stop treating each one like an essay. You write it, you post it, you get on with your day.

That's also roughly how we think about it at Aunty Social. The platform learns how your business actually talks, then drafts captions with those same hook-value-ask bones already in place, so you're editing and approving rather than wrestling a blank box at half nine at night. Same idea, less typing, but the formula works whether a tool's helping or it's just you and the kettle.

Try it on your next three posts. Hook, value, ask. Then, on the fourth, break it on purpose — and notice that you only knew you were breaking a rule because you'd finally learned what the rule was.