JPEG vs WebP vs AVIF: The Perceptual Quality Math Behind 'Quality 80'

Save the same 4 MB product photo at "quality 80" in three formats. The JPEG comes out at 580 KB. The WebP at 340 KB. The AVIF at 190 KB. The files look roughly the same on screen at typical web sizes. So which one was "really" quality 80?

None of them, exactly. The 0-100 quality knob is a vendor convention. Each format implements it against a different underlying compression strategy, and the number doesn't correspond to anything measurable about human vision. When one format looks worse than another at the same setting, the cause is usually a difference in how the format models perception, not in how aggressively it crushed the bits.

The interesting story is what each codec assumes about your eyes, and where those assumptions break.

What "quality" actually controls

In every lossy image codec, the quality slider is a knob on a quantization step. The image is transformed (DCT for JPEG, prediction-plus-DCT for WebP, several adaptive transforms for AVIF), each coefficient is divided by a quantizer value, rounded to an integer, and stored. Higher quantizer means coarser rounding, smaller files, more lost detail. The slider just maps 0-100 onto a set of quantizer values that the encoder author designed to feel intuitive.
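The divide-round-store step is small enough to sketch directly. The following is a minimal illustration, not any codec's actual code: the coefficient values and step sizes are invented for the example, and real encoders choose their own rounding rules.

```python
def quantize(coeff: float, step: int) -> int:
    """Encoder side: divide a transform coefficient by the step and round."""
    # Python's round() rounds halves to even; real encoders use their own rule.
    return round(coeff / step)


def dequantize(level: int, step: int) -> int:
    """Decoder side: multiply the stored integer back by the step."""
    return level * step


# Hypothetical coefficients from one block (illustrative values only).
coeffs = [300.0, -47.0, 19.0, 7.0, -3.0]

for step in (4, 16):
    levels = [quantize(c, step) for c in coeffs]
    restored = [dequantize(level, step) for level in levels]
    print(f"step={step}: stored={levels} restored={restored}")
```

With the larger step, the small coefficients collapse to zero and cannot be recovered, which is where both the byte savings and the lost detail come from.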

Three things vary across codecs:

The transform itself: how pixels get converted into the coefficients that get quantized.
The perceptual weighting: which coefficients are protected and which are sacrificed.
The mapping from "quality 80" onto specific quantizer numbers.

JPEG, WebP, and AVIF make different choices on all three. So "quality 80" in JPEG and "quality 80" in AVIF aren't different settings of the same dial. They're different dials, attached to different machines, calibrated by people who never coordinated.

JPEG: DCT and a 1992 quantization table

JPEG was formally specified as ITU-T T.81 in 1992 and packaged for general use as JFIF the same year. It works on 8x8 blocks of pixels. The image is first converted from RGB to YCbCr (separating luma from chroma), then each 8x8 block of each channel is transformed via the discrete cosine transform into 64 frequency coefficients. The top-left coefficient (the DC term) carries the block's average value for that channel; the rest describe how the values vary across the block at increasing spatial frequency.
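To see why the top-left coefficient acts as a block average, here is a direct, unoptimized 2D DCT-II in Python. It is a sketch under assumptions: it uses the orthonormal scaling (under which the top-left term equals 8 times the block mean), skips JPEG's level shift, and bears no resemblance to the fast integer transforms real encoders use.

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block (direct O(N^4) form)."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * total
    return out


# A perfectly flat block (the level shift by 128 is skipped for clarity).
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)

dc = coeffs[0][0]
largest_ac = max(abs(coeffs[u][v])
                 for u in range(N) for v in range(N) if (u, v) != (0, 0))
print(f"DC = {dc:.3f}, largest other coefficient = {largest_ac:.3e}")
```

A flat block puts all of its energy into the single top-left term and leaves the other 63 coefficients at zero (up to floating-point error), which is the ideal case for the quantization step that follows.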

The quantization step divides each coefficient by a per-position value from a quantization matrix. The standard luminance table in Annex K.1 of the spec is hand-tuned: it divides high-frequency coefficients by larger numbers, on the theory that the human eye is less sensitive to fine detail than to overall brightness. That theory was developed against still-image targets in 1992 and has not been revisited in the standard since.

The quality slider scales the entire table. The Independent JPEG Group's libjpeg defines "quality 50" as the literal table values from Annex K. Quality 100 scales them down toward 1 (preserving everything). Quality 0 scales them up toward 255 (destroying almost everything). Quality 75 was the IJG default for years and remains the closest thing to an industry consensus for "good enough" web JPEG.
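The scaling rule is short enough to write out. The sketch below follows the logic of libjpeg's jpeg_quality_scaling and its baseline clamp as best I can reproduce it; treat it as an illustration rather than a transcription of the library source. The table is the Annex K luminance table in row-major order.

```python
# Luminance quantization table from Annex K of ITU-T T.81 (row-major order).
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def quality_to_scale(quality: int) -> int:
    """Map a 0-100 quality value onto a percentage scale factor."""
    quality = max(1, min(100, quality))   # quality 0 is treated as 1
    if quality < 50:
        return 5000 // quality
    return 200 - 2 * quality


def scaled_table(quality: int, base=BASE_LUMA):
    """Scale the base table and clamp each entry to the baseline range 1..255."""
    scale = quality_to_scale(quality)
    return [max(1, min(255, (value * scale + 50) // 100)) for value in base]


for q in (0, 50, 75, 100):
    table = scaled_table(q)
    print(q, "first row:", table[:8])
```

Quality 50 reproduces the table unchanged, quality 100 clamps every entry to 1, and quality 0 clamps every entry to 255, matching the description above.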

There is no universal calibration. Photoshop's quality scale doesn't match libjpeg's, which doesn't match the one in browsers' Canvas API. Two tools producing "quality 80 JPEG" can produce files of substantially different size and visual quality. The format itself has no opinion on what 80 means.

WebP: VP8 intra-frame prediction

WebP shipped from Google in September 2010 as the still-image use of the VP8 video codec. The mental model: a JPEG is one frame of nothing; a WebP is one frame of something. Instead of treating each block in isolation, WebP runs intra-frame prediction first. It predicts each block's content from its already-decoded neighbors (left, above, above-left), then DCT-transforms and quantizes only the prediction *error*.

When neighboring blocks share content (most areas of a photo), the prediction error is small. Small residuals mostly round to zero or to small integers after quantization, and those cost few bits to store, so WebP spends fewer bytes than JPEG at the same perceptual quality. WebP files at "quality 80" usually weigh 25 to 35 percent less than JPEG files at "quality 80" of the same image.
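A toy version of predict-then-code-the-error makes the size of the residual concrete. This sketch assumes a single simplified "vertical" mode that repeats the row of pixels above the block; VP8 chooses among several prediction modes per block, and the pixel values here are invented.

```python
def vertical_predict(above_row, size):
    """Predict a size x size block by repeating the row of pixels above it."""
    return [list(above_row) for _ in range(size)]


def residual(block, prediction):
    """Per-pixel difference between the actual block and its prediction."""
    return [[b - p for b, p in zip(block_row, pred_row)]
            for block_row, pred_row in zip(block, prediction)]


# Hypothetical 4x4 luma block and the already-decoded row directly above it.
above = [120, 121, 123, 126]
block = [
    [120, 121, 124, 126],
    [121, 122, 124, 127],
    [121, 122, 125, 127],
    [122, 123, 125, 128],
]

res = residual(block, vertical_predict(above, 4))
for row in res:
    print(row)
```

The pixel values sit around 120, but the residual never exceeds 2. It is that residual, not the raw block, that goes through the transform and quantizer.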

WebP's quality scale is calibrated against its own internal quantizer, not libjpeg's. WebP quality 75 is typically closer to JPEG quality 85 in perceptual fidelity and roughly 30 percent smaller in bytes. WebP also has a separate lossless mode (PNG-like, completely different encoder) that does not use the quality slider to control fidelity. Most tools that compress to WebP at "quality 80" use the lossy path without saying so.

WebP's weakness is in flat smooth areas: skies, blank walls, large solid backgrounds. Block prediction works well there in the strict mathematical sense, but the residual quantization noise becomes visually obvious because there is nothing else to mask it. Photographs of textured subjects compress cleanly in WebP. Renders of UI mockups, where most pixels share a color, expose its banding artifacts.

AVIF: AV1 turned sideways

AVIF was published in February 2019 as the still-image profile of the AV1 video codec, which the Alliance for Open Media had finalized the previous year. AV1 was designed to compete with HEVC, and it carried into AVIF the same toolbox: more sophisticated intra-frame prediction with directional modes that handle diagonal edges, variable block sizes from 4x4 up to 128x128, an asymmetric discrete sine transform where DCT performs poorly, and entropy coding tuned for finer distinctions in coefficient distribution.

The result is a format that typically produces still images 40 to 60 percent smaller than JPEG at perceptually matched quality, and 25 to 40 percent smaller than WebP. The cost is encoding time. AV1's reference encoder, libaom, is famously slow. Even the production-tuned rav1e and SVT-AV1 encoders run an order of magnitude slower than libjpeg per pixel. Decoding is fast enough on modern devices (Chrome, Firefox, and Safari all decode AVIF natively, generally with optimized software decoders rather than dedicated hardware), but the encoder side still costs CPU.

AVIF's quality slider in most consumer encoders is a wrapper around AV1's quantizer parameter (cq-level), inverted and rescaled. "Quality 80" in AVIF often corresponds to cq-level around 28, but the mapping varies between encoders. Quality 100 selects the finest quantizer available, which many tools treat as lossless or near-lossless. Quality 50 is typically the floor of "still acceptable for photographs."

Where AVIF surprises people is in its handling of subtle color. It supports 10- and 12-bit samples, and its color signaling is more expressive than what typical JPEG pipelines carry: HDR, wide-gamut Display P3, and ICC profiles all pass through correctly. A JPEG saved on a wide-gamut Mac and viewed on a wide-gamut iPhone often looks more muted than the original, usually because the export converted it to sRGB or dropped the embedded profile, and 8-bit JPEG has little headroom for wide gamuts in any case. AVIF preserves the source gamut. For product photography where color fidelity matters, that is a real advantage that has nothing to do with file size.

A side-by-side at "quality 80"

The same 2400x1600 photograph of a textured ceramic vase against a plain wall, encoded with current production encoders:

Format	File size	Encoder	Visible artifacts at 1:1
Original PNG	11.4 MB	n/a	None (reference)
JPEG quality 80	720 KB	libjpeg-turbo 2.1	Mild blockiness in the wall's gradient
WebP quality 80	480 KB	libwebp 1.3	Slight banding in the wall, sharper vase edges
AVIF quality 80	290 KB	libaom-av1 3.7	Faint mosquito noise on vase edges, clean wall

Same source, same nominal quality, three byte counts in a 2.5x range, three different artifact profiles. The AVIF file is the smallest and has the cleanest flat areas. The WebP is the middle ground. The JPEG is the largest and shows the most familiar artifact pattern, which is partly why "JPEG quality 80" still feels like a known quantity to anyone who has worked with images for 20 years. Familiar artifacts are easier to forgive than unfamiliar ones.

How "quality" is supposed to be measured

The industry's preferred quality metrics have been moving away from raw signal math toward perceptual modeling.

The oldest metric is peak signal-to-noise ratio, a logarithmic measure of mean squared error between the original and the compressed image. PSNR is easy to compute but correlates only loosely with how humans perceive quality. Two images with identical PSNR can look obviously different. The metric rewards faithful reproduction of pixel-exact values, which is not how vision works.
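A small sketch shows both the formula and its blind spot. The pixel lists below are invented: one distortion nudges every pixel by 2, the other leaves 15 pixels untouched and pushes a single pixel off by 8. Both have the same mean squared error, so PSNR cannot tell them apart.

```python
import math


def psnr(original, compressed, max_value=255):
    """Peak signal-to-noise ratio in dB for two equal-length pixel sequences."""
    if len(original) != len(compressed):
        raise ValueError("images must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 16-pixel grayscale patch and two distorted versions of it.
reference = [100] * 16
spread = [102] * 16              # every pixel off by 2 (MSE = 4)
spike = [100] * 15 + [108]       # one pixel off by 8 (MSE = 64 / 16 = 4)

print(f"spread: {psnr(reference, spread):.2f} dB")
print(f"spike:  {psnr(reference, spike):.2f} dB")
```

A uniform brightness shift and a single stray pixel are very different things to look at, yet they score identically here.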

Structural Similarity Index, introduced by Zhou Wang and colleagues in 2004, models local correlations of luminance, contrast, and structure across small windows of the image. SSIM matches subjective opinion much better than PSNR and is still the most common metric in codec benchmarking. Modern variants like MS-SSIM (multi-scale) and CW-SSIM (complex wavelet) refine it further.
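The core SSIM formula can be sketched on a single window. This is a simplification under stated assumptions: the full metric slides a weighted window across the image and averages the local scores, whereas this version treats the whole (invented) pixel list as one window. The constants follow the commonly used defaults of K1 = 0.01 and K2 = 0.03.

```python
def ssim_window(x, y, max_value=255):
    """SSIM for one window of samples (no sliding window, no weighting)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / (n - 1)
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    c1 = (0.01 * max_value) ** 2   # stabilizes the luminance term
    c2 = (0.03 * max_value) ** 2   # stabilizes the contrast/structure term
    return (((2 * mu_x * mu_y + c1) * (2 * cov + c2))
            / ((mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)))


# Hypothetical 16-sample gradient and two distortions with equal squared error.
reference = list(range(100, 116))
shifted = [v + 2 for v in reference]       # uniform brightness shift
spike = reference[:-1] + [reference[-1] + 8]  # one sample pushed off by 8

print(f"shifted: {ssim_window(reference, shifted):.4f}")
print(f"spike:   {ssim_window(reference, spike):.4f}")
```

The two distortions have the same mean squared error, but the uniform shift keeps the gradient's structure intact and scores close to 1, while the spike scores lower because it changes the local structure.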

Butteraugli, open-sourced by Jyrki Alakuijala at Google in 2017, goes further. It models known properties of the human visual system (contrast sensitivity at different spatial frequencies, color masking, edge perception) and produces a per-pixel difference map weighted by how likely each pixel is to be noticed. Butteraugli is what JPEG XL's reference encoder uses internally to drive its rate-distortion decisions, and it is the closest the industry has to a perceptually meaningful score.

The point of mentioning these is to make clear that "quality 80" on a slider is not measuring anything. It is an encoder parameter. The actual quality of the result, in the sense of how it looks to a human, depends on the codec's perceptual model and the content of the image. A photograph of skin tones, an architectural render, and a screenshot of a styled dashboard will all respond differently to "quality 80" in each codec, and none of those responses are predictable from the number alone.

What this means for picking a target quality

For web photography, start at quality 80 in any of the three formats and visually inspect the output. The starting point is reasonable across all three; the deviation from there is content-dependent and codec-dependent. If the result looks fine, you can usually drop another 5 to 10 points on the slider for AVIF and WebP without visible degradation, while JPEG tends to fall apart faster below quality 70.

For UI assets with solid colors and sharp edges, WebP often beats JPEG and AVIF beats both, but the artifacts also become more visible. Inspect at 1:1, not at thumbnail scale. The banding that's invisible in a 200-pixel thumbnail can be obvious at full resolution.

For batch work where one quality setting covers many images, use the lowest setting that still looks acceptable for the *worst* image in the batch, not the best. A quality that works for a textured product photo may produce visible artifacts on a flat-color logo. The codec doesn't know which image is which; you do.
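The batch rule can be sketched as a search per image followed by a maximum across the batch. Everything named here is hypothetical: score_at stands in for "encode at this quality and measure it with whatever metric you trust", the two scorer functions are invented stand-ins on a 0-100 scale, and the binary search assumes the score never decreases as quality rises, which real encoders only approximately satisfy.

```python
from typing import Callable, Iterable


def lowest_acceptable_quality(score_at: Callable[[int], float],
                              threshold: float,
                              lo: int = 1, hi: int = 100) -> int:
    """Smallest quality whose score meets the threshold (binary search)."""
    if score_at(hi) < threshold:
        return hi  # even the top setting falls short; caller should check
    while lo < hi:
        mid = (lo + hi) // 2
        if score_at(mid) >= threshold:
            hi = mid          # mid is acceptable; try lower
        else:
            lo = mid + 1      # mid is not acceptable; go higher
    return lo


def batch_quality(scorers: Iterable[Callable[[int], float]],
                  threshold: float) -> int:
    """One setting for the whole batch: the worst image decides."""
    return max(lowest_acceptable_quality(s, threshold) for s in scorers)


# Invented stand-ins for "encode and score" on two kinds of image.
def textured_photo(quality: int) -> int:
    return 50 + quality // 2


def flat_logo(quality: int) -> int:
    return 30 + (quality * 7) // 10


print(lowest_acceptable_quality(textured_photo, 90))
print(lowest_acceptable_quality(flat_logo, 90))
print(batch_quality([textured_photo, flat_logo], 90))
```

In this made-up case the photo would be fine at a lower setting than the logo, so the logo sets the quality for the batch.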

For comparing codecs, file size alone is misleading. Two files of similar size in different formats are not equivalent. Compare byte size at matched perceptual quality (measured with SSIM or Butteraugli, or eyeballed against the same source), not at matched encoder settings.

If you want to play with the same image at varying quality and watch the file size and visual fidelity change in real time, Komprs puts the slider, the size readout, and the before/after viewer in one window. Drag the quality down until you can see the artifacts, then back it up a notch. That is the procedure that works, in any format, on any image, and it is the part the codec literature consistently leaves for the reader.
