New to filmmaking or post-production? The video editing industry uses hundreds of terms. You may already know some of them, while others may be unfamiliar.
Don't worry. We've outlined an A-Z list of the most popular and important terms and definitions in video editing and videography.
Also, we will update our video editing terminology glossary regularly. Stay tuned!
A-roll is the primary footage that tells the story or the main plot. In non-narrative videos or interviews, A-roll usually consists of the talking head who carries the story, while B-roll serves as supplementary footage that provides additional information or visuals.
Alpha channel stores the transparency information of images or videos. It tells the video editing software which part is transparent in the footage.
An alpha matte feature, in software that supports it, creates a matte based on the alpha information, thus separating certain parts of the image from their surroundings.
Image formats such as PNG, TIFF, and GIF can contain alpha information. Video formats that can carry alpha information include Apple ProRes 4444, HEVC, and WebM.
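To make the role of the alpha value concrete, here is a minimal sketch of how an editor might blend one foreground pixel over a background pixel. The function name `blend_over` and the 0.0 to 1.0 alpha scale are our own illustrative choices, not taken from any particular software.

```python
def blend_over(foreground, background, alpha):
    """Blend one RGB pixel over another.

    alpha is assumed to run from 0.0 (fully transparent foreground)
    to 1.0 (fully opaque foreground).
    """
    return tuple(
        round(f * alpha + b * (1 - alpha))
        for f, b in zip(foreground, background)
    )


red = (255, 0, 0)
blue = (0, 0, 255)

print(blend_over(red, blue, 1.0))   # (255, 0, 0): opaque, only the foreground shows
print(blend_over(red, blue, 0.0))   # (0, 0, 255): transparent, only the background shows
print(blend_over(red, blue, 0.25))  # (64, 0, 191): mostly background
```

Real software applies the same idea to every pixel of every frame, usually on the GPU.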
Animation is the process of creating an illusion of motion by displaying a series of figures in sequence. Traditionally, it involved drawing successive images on transparent celluloid sheets (cels), and it later evolved into a computer-aided process.
Animation types include hand-drawn animation, 2D and 3D animation with CGI, motion graphics, and stop motion featuring puppets, paper cutouts, and clay figures.
Aperture is the opening in the lens diaphragm that controls the amount of light passing through the lens. It is written as an f-number (f/2.8, f/4, f/5.6), which is the ratio of the lens's focal length to the diameter of the aperture. A smaller f-number corresponds to a larger aperture that allows more light to hit the sensor.
A larger aperture (lower f-stop) results in a shallow depth of field, creating background blur for aesthetic effect.
Aperture, shutter speed, and ISO are the main factors that influence exposure in photography and videography.
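As a rough illustration of how f-numbers relate to light, the sketch below compares two apertures on the same lens. It assumes the light admitted is proportional to the area of the opening, so it scales with the inverse square of the f-number; the function names are ours.

```python
import math


def relative_light(f_number_a, f_number_b):
    """How many times more light f_number_a admits than f_number_b."""
    return (f_number_b / f_number_a) ** 2


def stops_between(f_number_a, f_number_b):
    """Difference in exposure stops (each stop doubles the light)."""
    return math.log2(relative_light(f_number_a, f_number_b))


print(relative_light(2.8, 5.6))  # 4.0: f/2.8 lets in four times the light of f/5.6
print(stops_between(2.8, 5.6))   # 2.0: a two-stop difference
```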
Aspect Ratio describes how wide an image is in relation to its height proportionally. For instance, an image with 6000 pixels horizontally and 4000 pixels vertically has an aspect ratio of 6000:4000, or 3:2 when reduced to the lowest terms.
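The reduction to lowest terms can be sketched in a few lines of Python by dividing both sides by their greatest common divisor. The helper name `aspect_ratio` is just for illustration.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce pixel dimensions to the simplest width:height ratio."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


print(aspect_ratio(6000, 4000))  # (3, 2)
print(aspect_ratio(1920, 1080))  # (16, 9)
```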
Assembly edit, or first assembly, is the process of arranging footage on the timeline to create a story outline. It is done after the editor has reviewed all the footage and has a general idea of how the story unfolds.
It can be seen as the draft of the story before the rough cut, and it is usually 2 to 3 times the duration of the final, released version.
Bit Depth (Color Depth)
Bit depth refers to the number of bits used to define each of the RGB channels (red, green, blue). 8-bit color permits 2^8 = 256 steps of color values per channel in a pixel, which amounts to 256 x 256 x 256 = 16,777,216 (over 16.7 million) colors.
Likewise, a 10-bit system produces 1,073,741,824 colors, and a 12-bit system achieves a whopping 68,719,476,736 colors.
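If you want to check these figures yourself, the arithmetic is short. The sketch below assumes three color channels with the same bit depth each; `color_count` is a name we made up.

```python
def color_count(bits_per_channel, channels=3):
    """Number of distinct colors for a given per-channel bit depth."""
    steps_per_channel = 2 ** bits_per_channel
    return steps_per_channel ** channels


for bits in (8, 10, 12):
    print(f"{bits}-bit: {color_count(bits):,} colors")
# 8-bit: 16,777,216 colors
# 10-bit: 1,073,741,824 colors
# 12-bit: 68,719,476,736 colors
```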
Bit rate is the amount of data contained in video and audio streams per second, commonly measured in Mbps (megabits per second). A higher bit rate usually corresponds to better image quality and hence a larger file size.
Commonly recommended bit rates for various resolutions are 35-45 Mbps for 4K, 16 Mbps for 2K, 8 Mbps for 1080p, and 5 Mbps for 720p.
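Because bit rate is data per second, you can estimate a file's size from its bit rate and duration. The following is only a back-of-the-envelope sketch: it assumes a constant bit rate and ignores container overhead, and the function name is ours.

```python
def estimated_size_mb(bitrate_mbps, duration_seconds):
    """Approximate file size in megabytes (8 bits = 1 byte)."""
    return bitrate_mbps * duration_seconds / 8


# A 10-minute 1080p video at 8 Mbps
print(estimated_size_mb(8, 10 * 60))  # 600.0
```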
Channels refer to the color channels that store color information and the alpha channel that contains transparency information. For color channels, there are modes such as RGB, YUV, and HSV.
For instance, RGB mode has three channels: red, green, and blue. Each channel is a grayscale image that records the intensity of its color across the source image.
Chroma describes the intensity and purity of a color, or its colorfulness. Related concepts are hue and value (lightness). Together, the three traits are essential to describe color.
Chroma keying is a feature that isolates parts of the image by a color (or more specifically, a chrominance value) designated by the user. Editors can then replace, change, or remove the targeted area. For instance, one can remove a green screen by keying out the background and then compositing the subject with other scenes.
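Here is a deliberately naive sketch of the idea behind keying out a green screen. It tests whether green dominates a pixel's RGB values rather than working on true chrominance as professional keyers do, and the thresholds `dominance` and `min_green` are arbitrary assumptions of ours.

```python
def chroma_key_alpha(pixel, dominance=1.5, min_green=100):
    """Return 0 (transparent) for green-screen pixels, 255 (opaque) otherwise."""
    r, g, b = pixel
    is_green_screen = g >= min_green and g > dominance * max(r, b)
    return 0 if is_green_screen else 255


# One bright green background pixel, one skin-tone pixel, one dark green pixel
row = [(20, 220, 30), (200, 180, 160), (10, 90, 10)]
print([chroma_key_alpha(p) for p in row])  # [0, 255, 255]
```

The resulting values can serve as an alpha channel, so the keyed-out area becomes transparent when the shot is composited over another scene.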
Codec is a computer program or compression algorithm that encodes or decodes video and audio streams. The term is a portmanteau of coder and decoder (or compressor and decompressor). The decoder uncompresses data for playback and editing, and the encoder compresses data, such as when exporting a video.
Common codecs for video are HEVC (H.265), AVC (H.264), VP9, AV1, and ProRes. For audio, there are MP3, WMA, AAC, FLAC, ALAC, Opus, etc.
AVC (H.264): AVC (Advanced Video Coding) is a video compression standard co-developed by ITU and ISO. Codecs that conform to this standard are widely used for video recording, compression, and distribution via discs or streaming. AVC is also referred to as H.264 or MPEG-4 Part 10. The H.264 name follows the ITU convention, and the MPEG-4 Part 10 name comes from its place as one of the parts of ISO's MPEG-4 standard.
HEVC (H.265): HEVC (High Efficiency Video Coding) is another compression standard developed by the same groups that created AVC. This newer standard offers a better compression rate than H.264 at the same level of image quality, saving up to 50% of the file size. HEVC is also known as H.265 from the ITU or MPEG-H Part 2 from ISO.
VVC (H.266): VVC (Versatile Video Coding) is designed as the successor to HEVC to provide even better compression. The standard was finalized in 2020, though it has yet to be rolled out widely in software and platforms.
Color correction is a post-production phase to manipulate color, white balance, exposure, contrast, and so on. The aim is to make the footage look natural as the human eye sees it in real life and to achieve a consistent appearance for multiple shots.
Color grade is the term mostly used to describe the end result of the colorist's work, which usually includes both the correction of technical issues and the more creative expression of the stylizing phase.
There are primary corrections for normalizing, balancing, and shot matching, and then secondary corrections for specific areas. The creative grade can happen in both the primary and the secondary correction.
Compositing is the process of combining multiple layers of images into a single image. In visual effects (VFX), it usually involves live-action footage and CGI assets. Notable examples of compositing are chroma keying green screen or blue screen footage, color blending, and digital matting. The pre-digital era relied on tricks such as multiple exposures, traveling mattes, and sodium vapor screens.
Video compression in a broader sense involves a variety of tasks such as motion estimation, variable-length coding, and transforms. Depending on whether any image information is discarded to save bits, compression is either lossless or lossy. Videos compressed with lossy compression tend to be smaller than losslessly compressed files.
In its narrow sense, video compression encodes video using fewer bits by identifying and replacing redundant data. HEVC/H.265 offers up to 50% better data compression than AVC/H.264 under the same quality conditions.
A container is a wrapper to store video streams, audio streams, metadata, and other elements allowed by the container. For instance, containers such as MKV can wrap subtitles together with media data.
A container can be identified by its file extension. Popular container formats include MP4, MOV, MKV, AVI, and FLV for video, and MP3, M4A, M4B, and WAV for audio.
MP4: MP4 is a container format with the extension .mp4 for video files. It is probably the most widely used format and thus offers the best compatibility for editing and playing.
WAV: WAV is an audio container that stores uncompressed PCM data in most cases. It offers better sound quality than a lossy MP3 does and consequently has a larger file size.
Cutaway is the insertion of another shot to interrupt a continuous action in film and video editing. It is a technique used by cinematographers and editors to avoid certain continuity problems, create tension, or reveal the internal thoughts of a character.
B-roll footage is often used for cutaway shots.
Dimensions of a video refer to its width and height measured in pixels. For instance, a DCI 4K video has dimensions of 4096 x 2160 pixels, while UHD 4K has dimensions of 3840 x 2160 pixels.
Dissolve is a classic type of video transition between shots. Instead of using a straight cut that jumps immediately to the next shot, a dissolve is more gradual. The beginning shot and the adjacent shot overlap for some duration.
Fade is another popular type of video transition. It starts in full brightness and fades out into another scene. There are options to fade into white, fade into black, or fade into the next shot.
FPS represents the frame rate of motion pictures, that is, the number of images played per unit of time. The abbreviation stands for frames per second, since the unit of time is usually one second. A higher frame rate, such as 120 fps or 240 fps, produces smoother action in the video, while footage below roughly 12 fps is generally not perceived as continuous motion.
Grayscale is a series of visual tones that cover true black, true white, and the transitional grays in between. A grayscale image has only one channel, which represents brightness. The pixel value in an 8-bit grayscale image ranges from 0 to 255, corresponding to black and white respectively.
HDR (High Dynamic Range) is a processing technology that increases the brightness and contrast range of an image. Dynamic range describes the total difference between the lightest area and the darkest one. A higher dynamic range preserves more detail in both highlights and shadows.
Histogram is a chart that shows the approximate distribution of data. In most video and image post-production software, the horizontal axis shows image luminance from true black to true white, and the vertical axis shows the number of pixels with the given luminance.
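To show what the software is counting, here is a small sketch that builds a luminance histogram from a list of 8-bit pixel values. The four-bin layout and the sample values are made up for illustration; real editors typically use one bin per brightness level.

```python
def luminance_histogram(pixels, bins=4, max_value=255):
    """Count how many pixels fall into each brightness range."""
    bin_width = (max_value + 1) / bins
    counts = [0] * bins
    for value in pixels:
        counts[int(value // bin_width)] += 1
    return counts


# Hypothetical luminance values: a few shadows, midtones, and highlights
pixels = [0, 12, 70, 130, 200, 255, 255]
print(luminance_histogram(pixels))  # [2, 1, 1, 3]
```

Each number is the height of one bar, running from the darkest range on the left to the brightest on the right.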
J-cut is a video editing technique that lets the audio of the next shot start before its video appears. It is usually used to give a dramatic introduction to the video.
Jump cut is an editing trick that breaks the conventional rules of time and space. It cuts the shot and switches to another shot abruptly to show the audience the passage of time or create an intense atmosphere.
Keying separates certain elements from the rest of an image by color and brightness. When a part has been keyed out, it becomes fully transparent and can be removed or replaced by other elements. This method is often used for visual effects and color correction.
Key frame (or keyframe) defines the state of an object at a certain frame. After you set the start and end keyframes, the video editing software generates the intermediate frames based on the two points.
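How the in-between values are generated depends on the software and the easing you choose. As a minimal sketch, assuming plain linear interpolation between two keyframes (the function and parameter names are ours):

```python
def interpolate(start_frame, start_value, end_frame, end_value, frame):
    """Linearly interpolate a property value between two keyframes."""
    progress = (frame - start_frame) / (end_frame - start_frame)
    return start_value + progress * (end_value - start_value)


# Opacity animated from 0 at frame 0 to 100 at frame 24
print([interpolate(0, 0.0, 24, 100.0, f) for f in (0, 6, 12, 24)])
# [0.0, 25.0, 50.0, 100.0]
```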
L-cut is an editing trick that lets the video of a shot end before its audio, so the sound carries over into the next shot. It lets you make a subtle transition from one shot to the next.
Layering, in video editing, means stacking multiple media elements in the project timeline. By doing so, you can combine several videos, images, and text in the same frame and play them simultaneously. It is often done for picture-in-picture and split-screen videos.
Letterbox is a technique to display a video recorded in landscape orientation on a narrower screen while retaining its original aspect ratio. Videos with letterbox effects are filled with black bars on the top and the bottom.
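The height of the black bars follows from the two aspect ratios. The sketch below assumes the video is scaled to fill the full screen width and is wider in proportion than the screen; the function name and the sample dimensions are illustrative.

```python
def letterbox_bars(screen_w, screen_h, video_w, video_h):
    """Return (scaled video height, height of each black bar) in pixels."""
    scaled_height = screen_w * video_h / video_w
    if scaled_height > screen_h:
        raise ValueError("video is not wider than the screen; no letterbox needed")
    bar_height = (screen_h - scaled_height) / 2
    return round(scaled_height), round(bar_height)


# A 2.4:1 video (3840 x 1600) shown on a 16:9 screen (1920 x 1080)
print(letterbox_bars(1920, 1080, 3840, 1600))  # (800, 140)
```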
Lip sync (lip synchronization) is a technique that matches the lip movements of a singing or speaking person with an existing song or pre-recorded speech.
The lower third is a graphical element with text placed in the lower area of a video to give the viewer more information, such as the name of a person or a place shown on the screen. It does not always occupy the lower third of the screen. The name just indicates an inconspicuous place that does not distract the audience from the video.
Luma means the brightness in images. Luma and luminance are sometimes used interchangeably, but they are different. Luma is a video engineering term that expresses the weighted sum of the gamma-corrected components of RGB, while luminance is a physical measure that represents the weighted sum of the linear components of RGB.
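The difference shows up clearly in numbers. This sketch uses the Rec. 709 channel weights, and it approximates the conversion from gamma-corrected to linear values with a simple power of 2.2, which is an assumption on our part (real standards define more precise transfer functions). RGB values are assumed to be in the 0.0 to 1.0 range.

```python
REC709_WEIGHTS = (0.2126, 0.7152, 0.0722)  # red, green, blue


def weighted_sum(rgb):
    return sum(weight * channel for weight, channel in zip(REC709_WEIGHTS, rgb))


def luma(rgb_gamma):
    """Weighted sum of the gamma-corrected RGB components."""
    return weighted_sum(rgb_gamma)


def relative_luminance(rgb_gamma, gamma=2.2):
    """Weighted sum of linear RGB, linearized with an approximate gamma."""
    return weighted_sum([channel ** gamma for channel in rgb_gamma])


mid_gray = (0.5, 0.5, 0.5)
print(round(luma(mid_gray), 3))                # 0.5
print(round(relative_luminance(mid_gray), 3))  # 0.218
```

The same pixel yields two quite different numbers, which is why the two terms should not be mixed up.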
Mashup, as a video editing term, means a combination of multiple pre-existing videos, even if they have no discernible relation to each other. It could be a film trailer remix, a YouTube Poop, a supercut, and more.
Mask is a feature in video editing software. It outlines an area in a selected video and makes it transparent so that you can cover, blur, duplicate, and apply effects. A mask can be either a preset or a custom shape.
Match cut is an editing technique extensively used in movies and vlogs to combine two different shots coherently. It comes in three types: graphic match, action match, and audio match. By joining two shots that share some elements, such as the shape or trajectory of objects or the sound in the clips, a transition between two unrelated scenes can look smooth, seamless, and natural without any effects.
Matte is a feature that defines the transparency of a selected area so that other layers can show through or you can generate a new shot to fill that part. The difference between a mask and a matte is that a mask is applied only to a specific layer, while a matte affects all the layers.
Montage is a video editing technique of assembling a series of separate short clips into a continuous whole to tell a story. The main purpose is to advance the story over time and space. The clips may or may not follow chronological order.
Motion graphics is a branch of animation and gives static graphics movement without following a specific narrative. It can illustrate abstract ideas visually and make the message more engaging and entertaining. Motion graphics are often used when designing logos, posters, intros, transitions, and outros for videos.
NLE (Non-Linear Editing) is a method of video and audio editing. It gives editors full freedom to modify any video/audio clip in any order and does not make changes to the original files. Compared with linear editing, NLE is more efficient and allows much more creativity.
Pacing is a term describing the rhythm within a shot. It is always in sync with the storyline with the help of voiceover, video, music, and script. Pacing can be either conscious or unconscious.
PAL (Phase Alternating Line) is a video format standard used in many European and Asian countries. A PAL video is displayed at a rate of 25 frames per second and each frame is composed of 625 interlaced lines.
Pixel Aspect Ratio
Pixel aspect ratio describes how the width of a pixel compares to its height. Most modern systems display images with tiny square pixels (1:1), but some formats use non-square pixels, such as 1440 x 1080 HDV video, whose pixels are stretched at a ratio of 4:3 to fill a 16:9 frame.
Post-production is the last part of video production. It covers all the stages after shooting videos, including assembling shots one by one, adding music and visual effects, color grading, sound design, and even making trailers in filmmaking.
POV (Point of View) mostly refers to a shot that represents the perspective of a character, such as the first, second, and third person, or a certain object. It may also refer to the point of view of a person's story, in the form of storytelling.
Rendering in video editing is the process of compositing all the source media in a project into a final file for output. It requires a lot of calculation to mix video, audio, effects, transitions, animations, and so on.
Unlike exporting, rendering is meant for playing or previewing the entire project in a video editor before it is exported.
Resolution means the number of pixels both horizontally and vertically in each frame of a video. It largely determines the definition and clarity of your video. Generally, the more pixels there are, the higher the resolution is, and the clearer the video image is. But the quality of a video also depends on several other factors.
Some examples of resolutions are 1920 x 1080 (1080p, also known as Full HD), 1280 x 720 (720p, also known as HD), and 3840 x 2160 (also known as 4K UHD or 4K UHDTV).
Ripple edit is a trim action or tool that moves an edit point and causes all the following clips to move forward or backward so that no gap is left between the clips. When you delete or add a video clip in the timeline of a video editor, the ripple edit feature closes the gaps between the clips automatically. Unlike a rolling edit, a ripple edit changes the overall length of the project.
Rolling edit is another trim action or tool. It rolls the cut point between two adjacent clips while keeping the overall length of the project unchanged. For example, when you roll the cut point earlier, the rolling edit shortens the first clip by moving its Out point and lengthens the second clip by moving its In point by the same number of frames.
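The difference between the two trims is easy to see if you model a timeline as a list of clip durations. This is a simplified sketch with made-up function names and durations in frames, not how any particular editor stores its timeline.

```python
def ripple_trim(durations, index, delta):
    """Trim the Out point of one clip; later clips shift, so the total changes."""
    result = list(durations)
    result[index] += delta
    return result


def rolling_trim(durations, index, delta):
    """Move the cut between clip `index` and the next clip; the total stays the same."""
    result = list(durations)
    result[index] += delta
    result[index + 1] -= delta
    return result


clips = [100, 80, 120]  # three clips, 300 frames in total

rippled = ripple_trim(clips, 0, -20)
print(rippled, sum(rippled))  # [80, 80, 120] 280

rolled = rolling_trim(clips, 0, -20)
print(rolled, sum(rolled))    # [80, 100, 120] 300
```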
Rotoscoping is an animation technique that traces over each frame of a motion picture and draws frame by frame to form realistic graphic assets for animation or live-action projects. In the visual effects industry, movie makers use rotoscoping to create complex animation effects and scenes by creating a matte or mask for an element, such as an object or character.
A rough cut, or a rough edit, is the initial edited version of a video or movie that still needs refining.
Scrubbing is manually moving the slider, cursor or playhead back and forth across the timeline in order to find the exact cut point.
A slideshow refers to a presentation of a series of photo slides and still images on a projection screen or electronic display device. A slideshow video is a video composed of flipping digital photos and background music.
Smash cut is a video editing technique that cuts the footage at an unexpected moment and switches to a next scene that stands in sharp contrast to it, for example, an abrupt transition from a quiet shot to a chaotic fight. It is often used to add drama, comedy, or great tension to the story.
Stitching refers to combining multiple videos together to produce a panorama video. The technology is usually involved in 360-degree videos.
A storyboard is like a graphic script made up of a series of pictures and illustrations, shot by shot, showing how your story and plot begin, develop, and end.
Timelapse is a photography technique in which frames are captured at a much lower rate than the playback frame rate. When you play back the video at normal speed, time appears to move faster. The timelapse effect can also be achieved by increasing the video speed in post-production.
Transcoding is the process of converting a video or audio file by changing its encoding, which is different from converting the container format. An example is converting an MP4 file encoded with HEVC/H.265 to an MP4 file encoded with AVC/H.264.
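You can work out how much faster time appears to pass from the capture interval and the playback frame rate. A small sketch, with illustrative function names and numbers:

```python
def timelapse_speedup(capture_interval_s, playback_fps):
    """How many seconds of real time pass per second of playback."""
    return capture_interval_s * playback_fps


def playback_duration_s(real_duration_s, capture_interval_s, playback_fps):
    """Length of the finished timelapse in seconds."""
    frames_captured = real_duration_s / capture_interval_s
    return frames_captured / playback_fps


# One frame captured every 2 seconds, played back at 30 fps
print(timelapse_speedup(2, 30))          # 60
print(playback_duration_s(3600, 2, 30))  # 60.0: one hour becomes one minute
```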
Transition is a creative post-production method to connect one shot or scene of a video to another. There are various transition effects to convey the scene shift, the smooth switch, or the passage of time.
Video Editing Software
Video editing software is software for non-linear editing of video materials. It lets you cut and merge video sources, combine them with pictures, background music, special effects, and other materials, and generate new videos with a different look and feel.
Some examples are VideoProc Vlogger, Adobe Premiere Pro, Final Cut Pro, DaVinci Resolve, Lightworks, HitFilm Express, etc.
Video reel (also called demo reel, showreel, or sizzle reel) is a collection of short video clips that serves as a personal portfolio to showcase one's abilities.
Virtual Reality (VR) is a technology that creates a three-dimensional, computer-generated environment in which a person (a player) can be immersed and with which they can interact.
Vlog is the short form of video blog or video log, and refers to a personal website or social media account where most or all of the posts are videos, especially short videos.
Voiceover (or voice over) is a production technique of recording a voice from a third-party or outside perspective, mainly for narrating a story, giving a statement, or making an explanation.
YUV is an image/video color encoding system. Unlike RGB, the YUV color space separates brightness from color, which helps save transmission bandwidth. The Y in YUV stands for luma (brightness), and U and V stand for chrominance (color).