Most gear guides assume you shoot horizontal and worry about vertical later. Short-form flips that. A YouTube Short lives or dies in a 9:16 frame, and the kit that serves it best is different from the camera-and-mic stack built for 16:9 long-form. This guide is built around vertical from the first dollar, with three tiered build paths so you spend in the right order. If you also publish horizontal long-form, our beginner camera guide covers that side; this one is Shorts-first.
Shorts gear is different because the frame is vertical, the viewing distance is a phone held at arm's length, and the watch context is sound-on scrolling. Those three facts reshape the kit. Vertical framing rewards devices that capture true 9:16 instead of cropping. The phone-sized viewing surface means resolution loss from cropping is more visible than you would expect. And sound-on autoplay makes audio quality disproportionately important, because a Short with bad audio gets swiped away in the first second.
The practical consequence is that the spending priority inverts. For long-form, many creators buy the camera first. For Shorts, the camera is the part you already own (your phone), and the first two upgrades that move the needle are audio and stabilization. Spend there before anything else.
Shoot vertical-native unless you publish long-form from the same take, in which case shoot horizontal at high resolution and crop. This is the single most important decision in a Shorts workflow, and it has a clean rule. Vertical-native capture keeps your full sensor resolution in the 9:16 frame. Cropping a 16:9 horizontal clip to 9:16 throws away roughly two-thirds of the frame's pixels, though a 4K source still leaves about 1215 x 2160, enough for a sharp 1080p Short. A 1080p horizontal source leaves only about 607 x 1080, which has to be upscaled and looks soft.
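To see what the crop costs, here is a minimal Python sketch of the arithmetic. It assumes a 16:9 horizontal source and a full-height, 9:16 crop. The function name `vertical_crop` and the choice to round the width down to an even number (which many video encoders expect) are illustrative assumptions, not part of any particular editing tool.

```python
def vertical_crop(width: int, height: int) -> tuple[int, int, float]:
    """Largest 9:16 crop that fits inside a horizontal frame.

    Returns (crop_width, crop_height, fraction_of_pixels_kept).
    Assumes the source is wider than 9:16, e.g. a 16:9 clip.
    """
    crop_h = height                # keep the full frame height
    crop_w = height * 9 // 16      # width that gives a 9:16 shape
    crop_w -= crop_w % 2           # assumption: round down to an even width
    kept = (crop_w * crop_h) / (width * height)
    return crop_w, crop_h, kept


for name, (w, h) in {"4K": (3840, 2160), "1080p": (1920, 1080)}.items():
    cw, ch, kept = vertical_crop(w, h)
    enough = cw >= 1080 and ch >= 1920
    print(f"{name}: crop {cw}x{ch}, keeps {kept:.0%} of pixels, "
          f"covers 1080x1920: {enough}")
```

Under these assumptions the 4K clip still covers a full 1080 x 1920 Short after cropping, while the 1080p clip does not, which is why the rule says to shoot horizontal at high resolution when a crop is unavoidable.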
This is why phones and vertical-native devices like the DJI Pocket 3 dominate pure short-form. They sidestep the crop entirely. A mirrorless camera can produce great Shorts, but only by either rigging it sideways with an L-bracket or accepting the 4K-to-vertical crop, and most short-form creators decide the extra weight is not worth it.
A tiered build path is a spending order that upgrades the highest-impact gear first, so each dollar buys the most quality. Rather than a flat product list, the three tiers (phone-only, roughly $300, and roughly $800) tell you what to add and in what sequence. Start at the bottom and climb only as your channel earns it.
The best mic for Shorts is a compact wireless lavalier, because it clips to your shirt and captures clean voice however far you stand from the phone, including outdoors. Short-form is shot on the move and at arm's length or farther, where built-in phone mics collapse, so a transmitter near your mouth is the single biggest audio upgrade. The DJI Mic Mini and Hollyland Lark M2 are the leading sub-$200 options, and both clip on without a visible boom.
For static desk Shorts you can use a budget USB or shotgun mic instead, but the moment you film outside or while walking, the wireless lav wins decisively. Audio quality is not a polish item in short-form; it is a retention item, because sound-on autoplay means a viewer hears your audio before they consciously decide to keep watching. For a full breakdown of microphone options across price points, see our YouTube microphone guide.
You need a gimbal only if you shoot moving or walking Shorts; for static talking-head Shorts, a small tripod is enough. Modern phones already have strong EIS (electronic image stabilization), so a gimbal is a polish upgrade rather than a hard requirement. The deciding question is your content: vlogs, walk-and-talks, and follow-the-action Shorts benefit enormously, while sit-down or desk Shorts do not.
If you do shoot motion, a smartphone gimbal in the $90 to $160 range, like the DJI Osmo Mobile 6 or the Insta360 Flow 2 Pro, adds mechanical stabilization plus subject tracking that keeps you centered while you move. For static Shorts, put that money toward audio or lighting instead, because a tripod and good electronic stabilization already cover you.
To shoot true vertical on a mirrorless or DSLR camera, you either rotate the body physically with an L-bracket and cage, or shoot horizontal at high resolution and crop to 9:16 in editing. Most cameras cannot record a native vertical file, so these are your only two routes. Rotating the camera preserves full resolution but makes monitoring and audio rigging awkward; cropping is simpler and, from a 4K source, still produces a clean 1080p Short.
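If you take the crop route outside a graphical editor, one possible approach is FFmpeg's `crop` and `scale` video filters. The Python sketch below only builds the command and does not run it. It assumes FFmpeg is installed, that the source is 16:9, and that a centered crop suits the shot. The helper names and file names are hypothetical; most editing apps offer the same operation as a 9:16 canvas or reframe setting.

```python
def center_crop_filter(width: int, height: int,
                       out_w: int = 1080, out_h: int = 1920) -> str:
    """Build an FFmpeg -vf string: centered 9:16 crop, then scale to the output size."""
    crop_w = height * 9 // 16      # full-height 9:16 crop width
    crop_w -= crop_w % 2           # assumption: even width for encoder compatibility
    x = (width - crop_w) // 2      # left offset that centers the crop
    # crop=w:h:x:y takes the window; scale=w:h resizes it to the delivery size.
    # Rounding the crop width makes the aspect ratio off by well under 1%.
    return f"crop={crop_w}:{height}:{x}:0,scale={out_w}:{out_h}"


def build_command(src: str, dst: str, width: int, height: int) -> list[str]:
    """Return the FFmpeg argument list (video re-encoded, audio stream copied)."""
    return ["ffmpeg", "-i", src,
            "-vf", center_crop_filter(width, height),
            "-c:a", "copy", dst]


# Hypothetical file names, for illustration only.
print(" ".join(build_command("horizontal_4k.mp4", "short_vertical.mp4", 3840, 2160)))
```

A centered crop only works when the subject stays mid-frame. If you move side to side, adjust the `x` offset or use a reframing tool that tracks the subject.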
For creators who are short-form-first, the vertical-native device is almost always the right call. The big camera earns its place only when the same footage also feeds a horizontal long-form video. AI reframing tools can also auto-crop horizontal footage to vertical, and our friends at PickAI reviewed the AI tools content creators use to automate that reframing.
At minimum you need a phone that shoots 1080p vertical, a clip-on or wireless microphone, and a way to stabilize. A modern smartphone covers the camera, a $30 to $170 mic covers audio, and a $30 tripod or $90 gimbal covers stabilization. The phone-only tier works to start; the audio and stabilization upgrades matter far more than a dedicated camera for short-form.
Yes, a current smartphone is genuinely enough to make competitive YouTube Shorts. Phones shoot natively vertical, have strong stabilization and autofocus, and avoid the resolution loss from cropping horizontal footage to 9:16. The two upgrades worth making first are a clip-on microphone and a small gimbal or tripod.
A compact wireless lavalier like the DJI Mic Mini or Hollyland Lark M2 is the best mic for Shorts because it clips to your shirt and captures clean voice however far you stand from the phone, including outdoors. For desk or static Shorts, a budget USB or shotgun mic also works. The key is getting the mic close to your mouth; on-phone mics degrade quickly past arm's length.
You need a gimbal only if you shoot moving or walking Shorts; for static talking-head Shorts a small tripod is enough. Modern phones have strong electronic stabilization, so a gimbal is a polish upgrade rather than a requirement. If most of your Shorts involve walking or following action, a $90 to $160 smartphone gimbal noticeably improves the footage.
Most mirrorless cameras cannot record true vertical, so you either rotate the camera physically with an L-bracket and rig, or shoot horizontal at high resolution and crop to 9:16 in editing. Cropping a 4K horizontal clip still yields a sharp 1080p Short. A vertical-native device like the DJI Pocket 3 or a phone avoids the crop entirely.
The biggest production mistakes in Shorts are weak audio and flat front-on lighting, both of which trigger an instant swipe. Because Shorts autoplay sound-on in a fast-scrolling feed, a viewer judges your clip in roughly one second, and bad audio or a washed-out, shadowless face reads as low effort before they process a single word. Fixing these two things does more for retention than any camera upgrade.
None of these fixes require expensive gear. A window, a clip-on mic, and a one-second hook beat a $349 camera used badly. For the full lighting decision framework that applies to vertical setups too, see our lighting setup guide, and for the audio side, the wireless options in our microphone guide cover everything a Shorts creator needs.
Build your Shorts kit vertical-first and spend in order: phone, then audio, then stabilization, then a vertical-native camera. The phone-only tier proves you will post; the $300 tier with a wireless mic and gimbal delivers the biggest visible jump; the $800 tier with a DJI Pocket 3 is for full-time short-form creators. Shoot vertical-native unless the same take also feeds a long-form video, and never crop a Short-only clip from horizontal.