spigel: image comparison with pHash and dHash

ichioda/spigel is a TypeScript library for image comparison and hashing. The idea is simple: you hand it two images, it computes a visual fingerprint for each, and it gives you back a distance that tells you how similar they look.

The fun part is that there's no machine learning model anywhere in here, and nothing gets run through a neural network. Instead, the library uses perceptual hashing: plain math that squashes what an image looks like into a short string you can compare.

Basic usage looks like this:

import { compare } from "spigel";

const result = await compare("./example.png", "./example2.png", {
  algorithm: "phash",
  humanize: true
});

// { distance: "high similarity", hashes: { hashA: "...", hashB: "..." } }

What is perceptual hashing?

A cryptographic hash like SHA-256 changes completely the moment a single byte in the file changes. That's great for integrity checks and useless for judging whether two images look alike. Resize a photo, recompress it as JPEG, or nudge the contrast a little, and the bytes shift all over the place even though the picture still looks basically the same to you.

Perceptual hashing is after a different question: "do these two images look the same to a person?"

Don't treat the answer as gospel. It's a signal. The smaller the distance between two hashes, the more visually similar they probably are. The bigger the distance, the stronger the hint that they're different.
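
The text above doesn't say how spigel measures distance internally. A common choice for perceptual hashes is Hamming distance, which counts the bits that differ between two fingerprints. Here's a minimal sketch of that idea for hex-encoded hashes. The name hammingDistanceHex is made up for illustration and is not part of spigel.

```typescript
// Hypothetical helper (not part of spigel): counts how many bits differ
// between two equal-length hexadecimal hash strings.
function hammingDistanceHex(a: string, b: string): number {
  if (a.length !== b.length) {
    throw new Error("hashes must have the same length");
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    const x = parseInt(a.charAt(i), 16);
    const y = parseInt(b.charAt(i), 16);
    if (Number.isNaN(x) || Number.isNaN(y)) {
      throw new Error("hashes must be hexadecimal strings");
    }
    // XOR leaves a 1 wherever the two 4-bit values disagree.
    let diff = x ^ y;
    while (diff > 0) {
      distance += diff & 1;
      diff >>= 1;
    }
  }
  return distance;
}
```

With 64-bit hashes this gives a number between 0 (same fingerprint) and 64 (every bit flipped), which lines up with the "smaller means more similar" reading.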

dHash: local differences

dHash, or difference hash, runs on one small idea: instead of storing the color of each pixel, it stores the direction of change from one pixel to the next.

In spigel, the dHash flow goes like this:

load the image from a file path or Buffer;
resize it to 9x8;
convert to grayscale;
compare each pixel with the one right next to it;
turn those comparisons into a hex string.

Since the grid is 9 columns by 8 rows, each row gives you 8 comparisons between neighbors. Across 8 rows, that's 64 binary decisions about how the image behaves locally, which fit in 16 hex characters.
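
The comparison and hex steps can be sketched in a few lines. This is not spigel's source, just an illustration under some assumptions: the image has already been resized to 9x8 and converted to grayscale, the pixels arrive as a row-major Uint8Array, and a bit is 1 when the left pixel is brighter than its right neighbor. The library may pick the opposite convention or a different bit order.

```typescript
const DHASH_WIDTH = 9;
const DHASH_HEIGHT = 8;

// Hypothetical sketch: builds a 64-bit difference hash from 72 grayscale
// values (9 columns x 8 rows, row-major) and returns 16 hex characters.
function dHashFromGrayscale(pixels: Uint8Array): string {
  if (pixels.length !== DHASH_WIDTH * DHASH_HEIGHT) {
    throw new Error("expected 9x8 = 72 grayscale values");
  }
  let hex = "";
  let nibble = 0;
  let bitsInNibble = 0;
  for (let row = 0; row < DHASH_HEIGHT; row++) {
    // 9 pixels per row give 8 left/right comparisons.
    for (let col = 0; col < DHASH_WIDTH - 1; col++) {
      const left = pixels[row * DHASH_WIDTH + col];
      const right = pixels[row * DHASH_WIDTH + col + 1];
      // Assumed convention: 1 means "gets darker moving right".
      nibble = (nibble << 1) | (left > right ? 1 : 0);
      bitsInNibble++;
      if (bitsInNibble === 4) {
        hex += nibble.toString(16);
        nibble = 0;
        bitsInNibble = 0;
      }
    }
  }
  return hex;
}
```

A completely flat image never has a brighter left neighbor, so under this convention every bit comes out 0.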

It's fast, and it's handy when you're trying to catch near-identical images: duplicate thumbnails, small compression differences, copies with tiny tweaks. It's also easy to explain. Two images look alike if their light and dark patterns rise and fall in a similar way.

pHash: visual structure through frequencies

pHash, or perceptual hash, is going after something more structural. It doesn't stop at the differences between neighboring pixels. It moves the image into frequency space using DCT (Discrete Cosine Transform).

In spigel, the pHash process:

converts the image to grayscale;
resizes it to 32x32;
applies DCT over the pixel matrix;
looks at an 8x8 region of low frequencies;
compares each coefficient against the mean;
produces a final binary string.

Low frequencies carry the overall composition of the image: masses of light, broad contrast, the main shapes. That's why pHash tends to hold up better than a pixel-by-pixel comparison when there's resizing, compression, or small visual changes in the mix.
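
To make the DCT steps concrete, here's a rough sketch that starts from a 32x32 grayscale grid (so the resize and grayscale steps are assumed done already). It's an illustration, not spigel's code. The orthonormal DCT-II scaling, the choice to include the DC term in the mean, and the name pHashFromGrayscale are all assumptions. Only the 8x8 block of lowest frequencies is computed, since that's all the hash needs.

```typescript
const PHASH_SIZE = 32; // input grid is 32x32
const PHASH_LOW = 8;   // keep the 8x8 lowest-frequency coefficients

// Hypothetical sketch: returns a 64-character string of "0" and "1".
function pHashFromGrayscale(pixels: Uint8Array): string {
  if (pixels.length !== PHASH_SIZE * PHASH_SIZE) {
    throw new Error("expected 32x32 = 1024 grayscale values");
  }
  // Precompute the DCT-II basis for the 8 lowest frequencies
  // (orthonormal scaling is an assumption here).
  const basis: number[][] = [];
  for (let u = 0; u < PHASH_LOW; u++) {
    const scale = Math.sqrt((u === 0 ? 1 : 2) / PHASH_SIZE);
    const row: number[] = [];
    for (let x = 0; x < PHASH_SIZE; x++) {
      row.push(scale * Math.cos(((2 * x + 1) * u * Math.PI) / (2 * PHASH_SIZE)));
    }
    basis.push(row);
  }
  // 2D DCT restricted to the low-frequency corner.
  const coefficients: number[] = [];
  for (let v = 0; v < PHASH_LOW; v++) {
    for (let u = 0; u < PHASH_LOW; u++) {
      let sum = 0;
      for (let y = 0; y < PHASH_SIZE; y++) {
        for (let x = 0; x < PHASH_SIZE; x++) {
          sum += pixels[y * PHASH_SIZE + x] * basis[u][x] * basis[v][y];
        }
      }
      coefficients.push(sum);
    }
  }
  // One bit per coefficient: is it above the mean of the 64 values?
  const mean = coefficients.reduce((acc, c) => acc + c, 0) / coefficients.length;
  return coefficients.map((c) => (c > mean ? "1" : "0")).join("");
}
```

Some pHash variants leave the DC coefficient out of the average, or use the median instead, because the DC term is so much larger than the rest. The text only says "the mean", so the sketch sticks with that.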

In practice, dHash is your friend for quick and simple comparisons. pHash usually does better when you want to tolerate small transformations and still judge similarity by the overall structure of the image.

The spigel API

The library gives you three main functions:

import { compare, compareHash, hash } from "spigel";

compare takes two images, computes their hashes, and returns the distance:

await compare("./a.png", "./b.png", {
  algorithm: "dhash",
  humanize: true
});

hash just computes the fingerprint of a single image:

const imageHash = await hash("./a.png", "phash");

compareHash compares two fingerprints you already computed:

await compareHash(hashA, hashB, {
  algorithm: "phash",
  humanize: false
});

This split is handy because you can keep image processing and comparison apart. In a real system you'd compute hashes at upload time, store them in the database, and compare later without ever reopening the original files.
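
Here's a rough sketch of that "hash once, compare later" pattern, with an in-memory array standing in for the database. Everything in it is hypothetical: the StoredHash shape, the HashStore class, and the bit-counting helper are illustration only. In a real setup the hashes would come from spigel's hash function and the comparison could go through compareHash.

```typescript
// Hypothetical record: an image ID plus the hex hash computed at upload time.
interface StoredHash {
  id: string;
  hash: string;
}

// Counts differing bits between two equal-length hex strings.
function bitsDifferent(a: string, b: string): number {
  if (a.length !== b.length) {
    throw new Error("hash lengths differ");
  }
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    let diff = parseInt(a.charAt(i), 16) ^ parseInt(b.charAt(i), 16);
    while (diff > 0) {
      count += diff & 1;
      diff >>= 1;
    }
  }
  return count;
}

// In-memory stand-in for a database table of hashes.
class HashStore {
  private readonly rows: StoredHash[] = [];

  // Called once per upload, after the hash has been computed.
  add(id: string, hash: string): void {
    this.rows.push({ id, hash });
  }

  // Returns stored IDs within maxDistance bits of the query, closest first.
  // No image file is opened here; only the stored strings are compared.
  findSimilar(hash: string, maxDistance: number): { id: string; distance: number }[] {
    return this.rows
      .map((row) => ({ id: row.id, distance: bitsDifferent(row.hash, hash) }))
      .filter((match) => match.distance <= maxDistance)
      .sort((a, b) => a.distance - b.distance);
  }
}
```

A linear scan like this is fine for small collections. Larger ones usually need an index built for near-neighbor lookups, which is outside what this sketch covers.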

Turn humanize on and the numeric distance becomes a readable label such as "identical", "high similarity", "low similarity", or "different".
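
A mapping like that might look something like the sketch below. The cut-off values are invented for illustration, assuming a 64-bit hash. spigel's actual thresholds aren't given here, and humanizeDistance is a hypothetical name.

```typescript
type SimilarityLabel =
  | "identical"
  | "high similarity"
  | "low similarity"
  | "different";

// Hypothetical mapping from a bit distance to a label.
// highMax and lowMax are made-up defaults, not spigel's real cut-offs.
function humanizeDistance(
  distance: number,
  highMax: number = 10,
  lowMax: number = 20
): SimilarityLabel {
  if (distance === 0) return "identical";
  if (distance <= highMax) return "high similarity";
  if (distance <= lowMax) return "low similarity";
  return "different";
}
```

Keeping the cut-offs as parameters makes it easy to tune them per use case instead of baking one opinion into the code.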

Use cases

spigel earns its keep anywhere "looks similar" matters more than "byte-for-byte identical."

The obvious one is image deduplication. When users upload the same photo at different sizes or compression levels, perceptual hashes help you group the copies together without leaning on the filename.
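
As a sketch of that grouping step, here's a simple greedy pass over hashes that were computed earlier. It assumes binary-string hashes of equal length and a hand-picked maxDistance. The HashedImage shape and both function names are hypothetical, and greedy grouping depends on input order, so treat it as a starting point and not a clustering algorithm you'd ship as-is.

```typescript
// Hypothetical input: a file name plus its hash as a string of "0"/"1".
interface HashedImage {
  name: string;
  hash: string;
}

// Counts the positions where two equal-length binary strings differ.
function differingPositions(a: string, b: string): number {
  if (a.length !== b.length) {
    throw new Error("hash lengths differ");
  }
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    if (a.charAt(i) !== b.charAt(i)) count++;
  }
  return count;
}

// Puts each image into the first group whose first member is close enough,
// or starts a new group when none is.
function groupNearDuplicates(
  images: HashedImage[],
  maxDistance: number
): HashedImage[][] {
  const groups: HashedImage[][] = [];
  for (const image of images) {
    const match = groups.find(
      (group) => differingPositions(group[0].hash, image.hash) <= maxDistance
    );
    if (match) {
      match.push(image);
    } else {
      groups.push([image]);
    }
  }
  return groups;
}
```

Notice that file names never enter the comparison. They only ride along so you can tell which files ended up together.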

Then there's similar-image search. Instead of searching only by metadata, you can store hashes and rank candidate images by their distance to a query image.

Moderation and security get value out of it too. Abuse detection tools, anti-spam systems, marketplaces, and avatar systems can use hashes to catch re-uploads of known images, even when someone tries to sneak in small changes.

On the product engineering side, there's a practical use in visual testing. Interface screenshots drift by a few pixels between builds. A perceptual hash helps you separate that visual noise from changes big enough to actually look into.

Limitations

Perceptual hashes don't understand meaning. Two images with a similar composition but completely different subjects might score too close together. And if an image gets aggressively cropped, heavily rotated, buried under a big overlay, or messed with on purpose, the distance can stop reflecting what a human would see.

There's also no universal threshold. What counts as "similar" depends on the domain. Product photos, memes, scanned documents, and UI screenshots all have different tolerances.

So the best way to use spigel is to treat the distance as a decision feature, not a final judge. Feed it into ranking, alerts, human review, initial filtering, or grouping.

Why this library is interesting

spigel is small, readable, and practical. It uses sharp for image manipulation, implements pHash and dHash in TypeScript, accepts file paths or Buffer, and gives you a simple API for comparing images or pre-computed hashes.

That fits nicely with tools that need to stay close to real product behavior: uploads, thumbnails, bots, media pipelines, automation, and anything that has to quickly decide whether two images are really the same thing wearing different clothes.

The value here isn't in promising artificial intelligence. It's in doing a common visual operation in a way that's predictable, explainable, and cheap.

Repository: github.com/ichioda/spigel