spigel: image comparison with pHash and dHash

ichioda/spigel is a TypeScript library for image comparison and hashing. The idea is simple: take two images, compute a visual fingerprint for each, and return a distance that tells you how similar they look.

The cool part is that it does this without any machine learning model. Instead of running images through a neural network, the library uses perceptual hashing algorithms — math functions that squash visual features into a short, comparable string.

Basic usage looks like this:

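import { compare } from "spigel";

// Hash both images with pHash and compare the two fingerprints.
const result = await compare("./example.png", "./example2.png", {
  algorithm: "phash",
  humanize: true // return a readable label instead of a number
});

console.log(result);
// { distance: "high similarity", hashes: { hashA: "...", hashB: "..." } }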

What is perceptual hashing?

A cryptographic hash like SHA-256 changes completely when a single byte in the file changes. That's great for integrity checks, but terrible for images: if you resize a photo, recompress it as JPEG, or tweak the contrast slightly, the bytes change a lot even though the image still looks pretty much the same.
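To make the contrast concrete, here is a small sketch using Node's built-in crypto module. The two byte arrays are made-up stand-ins for image files that differ in a single byte, and `sha256Hex` is a helper name chosen for this example.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of a byte array.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Two stand-in "files" that differ in exactly one byte.
const original = new Uint8Array([137, 80, 78, 71, 0, 1, 2, 3]);
const tweaked = Uint8Array.from(original);
tweaked[7] = 4;

const digestA = sha256Hex(original);
const digestB = sha256Hex(tweaked);

// Count the hex positions where the two digests disagree.
let differing = 0;
for (let i = 0; i < digestA.length; i++) {
  if (digestA.charAt(i) !== digestB.charAt(i)) differing++;
}
console.log(`${differing} of ${digestA.length} hex characters differ`);
```

A one-byte edit scrambles most of the digest, so the two values say nothing about how close the inputs were.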

Perceptual hashing solves a different problem. It tries to answer: "do these two images look the same to a person?"

The result shouldn't be taken as absolute truth. It's a signal. The smaller the distance between two hashes, the more visually similar they're expected to be. The larger the distance, the stronger the hint that they're different.
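The text above doesn't say which metric spigel uses for its distance. Hamming distance (the number of differing bits) is the usual choice for fixed-length perceptual hashes, so the sketch below assumes it. `hammingDistanceHex` is a name invented for this example, not part of spigel's API.

```typescript
// Hamming distance between two equal-length hex strings:
// the number of bit positions in which they differ.
function hammingDistanceHex(a: string, b: string): number {
  if (a.length !== b.length) {
    throw new Error("hashes must have the same length");
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    const x = parseInt(a.charAt(i), 16);
    const y = parseInt(b.charAt(i), 16);
    if (Number.isNaN(x) || Number.isNaN(y)) {
      throw new Error(`non-hex character at position ${i}`);
    }
    // XOR leaves a 1 wherever the two 4-bit values disagree; count those bits.
    let diff = x ^ y;
    while (diff !== 0) {
      distance += diff & 1;
      diff >>= 1;
    }
  }
  return distance;
}

console.log(hammingDistanceHex("ffff0000ffff0000", "ffff0000ffff0001")); // 1
```

For a 64-bit hash the result ranges from 0 (same fingerprint) to 64 (every bit flipped).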

dHash: local differences

dHash, or difference hash, works on a small and elegant idea: instead of storing the color of each pixel, it stores the direction of change between neighboring pixels.

In spigel, the dHash flow goes like this:

  • load the image from a file path or Buffer;
  • resize it to 9x8;
  • convert to grayscale;
  • compare each pixel with the one right next to it;
  • turn those comparisons into a hex string.

Because the grid is 9 columns wide, each of the 8 rows yields 8 comparisons between adjacent pixels, for 8 × 8 = 64 bits in total. Each bit records whether brightness rises or falls from one pixel to its neighbor.
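The comparison and packing steps can be sketched without any image library, assuming the resize and grayscale steps have already produced 72 brightness values (spigel uses sharp for that part). The function name `dHashFromGray`, the bit convention (1 when the left pixel is brighter), and the bit-to-hex packing order are assumptions made for illustration; spigel's own implementation may differ.

```typescript
// dHash over a grayscale grid stored row by row.
// With the default 9x8 grid this yields 64 bits, packed into 16 hex characters.
function dHashFromGray(pixels: ArrayLike<number>, width = 9, height = 8): string {
  if (pixels.length !== width * height) {
    throw new Error(`expected ${width * height} pixels, got ${pixels.length}`);
  }
  let bits = "";
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width - 1; x++) {
      const left = pixels[y * width + x];
      const right = pixels[y * width + x + 1];
      // Assumed convention: 1 when brightness drops from left to right.
      bits += left > right ? "1" : "0";
    }
  }
  // Pack every group of 4 bits into one hex character.
  let hex = "";
  for (let i = 0; i < bits.length; i += 4) {
    hex += parseInt(bits.slice(i, i + 4), 2).toString(16);
  }
  return hex;
}

// A synthetic 9x8 gradient that gets brighter from left to right in every row.
const rising = Array.from({ length: 72 }, (_, i) => (i % 9) * 10);
console.log(dHashFromGray(rising)); // "0000000000000000"
```

Only the direction of change is stored, so adding the same constant to every pixel leaves the hash untouched.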

This method tends to be fast and handy when you want to catch near-identical images — duplicate thumbnails, small compression differences, or copies with minor tweaks. It's also easy to explain: two images look alike if their light/dark patterns go up and down in a similar way.

pHash: visual structure through frequencies

pHash, or perceptual hash, tries to capture something more structural. It doesn't just look at differences between neighboring pixels — it transforms the image into frequency space using DCT (Discrete Cosine Transform).

In spigel, the pHash process:

  • converts the image to grayscale;
  • resizes it to 32x32;
  • applies DCT over the pixel matrix;
  • looks at an 8x8 region of low frequencies;
  • compares each coefficient against the mean;
  • produces a final binary string.

Low frequencies represent the overall composition of the image: masses of light, broad contrast, main shapes. That's why pHash tends to be more robust than pixel-by-pixel comparison when there's resizing, compression, or small visual changes involved.
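Here is a minimal sketch of the DCT and thresholding steps, assuming a square grayscale matrix is already available. It uses a naive, orthonormally scaled DCT-II and takes the mean over all 64 low-frequency coefficients, reading the step list literally; some pHash implementations leave the first (DC) coefficient out of the mean, and spigel's exact choices aren't stated here. `pHashFromGray` and `lowFreqSize` are hypothetical names.

```typescript
// pHash sketch over a square grayscale matrix (32x32 in the flow above).
// Returns one "0"/"1" character per low-frequency coefficient, row by row.
function pHashFromGray(gray: number[][], lowFreqSize = 8): string {
  const n = gray.length;
  if (n < lowFreqSize || gray.some((row) => row.length !== n)) {
    throw new Error("expected a square matrix at least lowFreqSize wide");
  }
  // cosTable[k][i] = cos((2i + 1) * k * PI / (2n)), the DCT-II basis.
  const cosTable: number[][] = [];
  for (let k = 0; k < lowFreqSize; k++) {
    const row: number[] = [];
    for (let i = 0; i < n; i++) {
      row.push(Math.cos(((2 * i + 1) * k * Math.PI) / (2 * n)));
    }
    cosTable.push(row);
  }
  // Orthonormal DCT-II scaling factors.
  const scale = (k: number): number => (k === 0 ? Math.sqrt(1 / n) : Math.sqrt(2 / n));

  // Only the top-left lowFreqSize x lowFreqSize block of the 2D DCT is computed.
  const coefficients: number[] = [];
  for (let v = 0; v < lowFreqSize; v++) {
    for (let u = 0; u < lowFreqSize; u++) {
      let sum = 0;
      for (let y = 0; y < n; y++) {
        for (let x = 0; x < n; x++) {
          sum += gray[y][x] * cosTable[u][x] * cosTable[v][y];
        }
      }
      coefficients.push(scale(u) * scale(v) * sum);
    }
  }
  // Assumption: the mean includes the DC term (the first coefficient).
  const mean = coefficients.reduce((acc, c) => acc + c, 0) / coefficients.length;
  return coefficients.map((c) => (c > mean ? "1" : "0")).join("");
}

// A synthetic 32x32 diagonal gradient as stand-in image data.
const gradient = Array.from({ length: 32 }, (_, y) =>
  Array.from({ length: 32 }, (_, x) => x + 2 * y)
);
console.log(pHashFromGray(gradient).length); // 64
```

Because every coefficient and the mean scale together, multiplying all pixel values by the same factor leaves the bits unchanged, which is one reason the hash tolerates global brightness and contrast changes.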

In practice, dHash is good for quick and simple comparisons; pHash tends to work better when you want to tolerate small transformations while keeping a broader sense of similarity.

The spigel API

The library exposes three main functions:

import { compare, compareHash, hash } from "spigel";

compare takes two images, computes their hashes, and returns the distance:

await compare("./a.png", "./b.png", {
  algorithm: "dhash",
  humanize: true
});

hash computes just the fingerprint of a single image:

const imageHash = await hash("./a.png", "phash");

compareHash compares two pre-computed fingerprints:

await compareHash(hashA, hashB, {
  algorithm: "phash",
  humanize: false
});

This design is useful because it lets you separate image processing from comparison. In a real system, you can compute hashes at upload time, store them in the database, and compare later without reopening the original files.
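As a rough sketch of that split, the class below stands in for a database table of stored hashes. In a real system the hex strings would come from spigel's hash function at upload time; here they are hard-coded so the example runs on its own. `HashIndex`, `findSimilar`, `bitDistance`, and the use of Hamming distance are all assumptions for illustration, not part of spigel.

```typescript
// Count differing bits between two equal-length hex strings.
function bitDistance(a: string, b: string): number {
  if (a.length !== b.length) throw new Error("hashes must have the same length");
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    let diff = parseInt(a.charAt(i), 16) ^ parseInt(b.charAt(i), 16);
    while (diff !== 0) {
      distance += diff & 1;
      diff >>= 1;
    }
  }
  return distance;
}

// In-memory stand-in for a table of (image id, hex hash) rows.
class HashIndex {
  private readonly entries = new Map<string, string>();

  // Called once per upload, after the hash has been computed.
  add(id: string, hexHash: string): void {
    this.entries.set(id, hexHash);
  }

  // Ids whose stored hash is within maxDistance bits of the query, nearest first.
  findSimilar(hexHash: string, maxDistance: number): { id: string; distance: number }[] {
    const matches: { id: string; distance: number }[] = [];
    this.entries.forEach((stored, id) => {
      const distance = bitDistance(stored, hexHash);
      if (distance <= maxDistance) matches.push({ id, distance });
    });
    return matches.sort((x, y) => x.distance - y.distance);
  }
}

const index = new HashIndex();
index.add("a", "ffff0000ffff0000");
index.add("b", "ffff0000ffff0007");
index.add("c", "0000ffff0000ffff");

// Later, a new upload is checked against stored hashes only; no files are reopened.
console.log(index.findSimilar("ffff0000ffff0001", 5));
// [ { id: "a", distance: 1 }, { id: "b", distance: 2 } ]
```

A linear scan like this is fine for small collections; larger ones usually need some form of indexing, which is outside the scope of this sketch.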

When humanize is turned on, the numeric distance becomes a readable label like identical, high similarity, low similarity, or different.
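The mapping from number to label could look roughly like the sketch below. The four labels come from the text above, but the cut-off values and the `humanizeDistance` name are invented for illustration; spigel's actual thresholds may be different.

```typescript
type SimilarityLabel = "identical" | "high similarity" | "low similarity" | "different";

// Map a bit distance to a readable label.
// The 15% and 40% cut-offs are hypothetical, not taken from spigel.
function humanizeDistance(distance: number, totalBits = 64): SimilarityLabel {
  if (distance < 0 || distance > totalBits) {
    throw new RangeError(`distance must be between 0 and ${totalBits}`);
  }
  if (distance === 0) return "identical";
  const ratio = distance / totalBits;
  if (ratio <= 0.15) return "high similarity";
  if (ratio <= 0.4) return "low similarity";
  return "different";
}

console.log(humanizeDistance(5)); // "high similarity"
```

Working with the fraction of differing bits rather than a raw count keeps the same cut-offs usable if the hash length changes.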

Use cases

spigel makes sense anywhere "looks similar" matters more than "byte-for-byte identical."

The obvious one is image deduplication. If users upload the same photo at different sizes or compression levels, perceptual hashes help group copies together without relying on the filename.

Another use is similar image search. Instead of searching only by metadata, a system can store hashes and compute distances against candidate images.

There's also value in moderation and security. Abuse detection tools, anti-spam systems, marketplaces, and avatar systems can use hashes to catch re-uploads of known images — even when someone tries to make small changes.

For product engineering, there's a practical use in visual testing. Interface screenshots can shift by a few pixels between builds. A perceptual hash helps separate visual noise from changes big enough to investigate.

Limitations

Perceptual hashes don't understand meaning. If two images have a similar composition but mean completely different things, the algorithm may still report a small distance between them. In the other direction, if an image gets aggressively cropped, heavily rotated, covered with a large overlay, or manipulated on purpose, the distance might stop reflecting what a human would see.

There's also no universal threshold. What counts as "similar" depends on the domain: product photos, memes, scanned documents, and UI screenshots all have different tolerances.

That's why the best way to use spigel is to treat the distance as a decision feature, not a final judge. It can feed into ranking, alerts, human review, initial filtering, or grouping.

Why this library is interesting

spigel is small, readable, and practical. It uses sharp for image manipulation, implements pHash and dHash in TypeScript, accepts either a file path or a Buffer as input, and offers a simple API for comparing images or pre-computed hashes.

That fits well with tools that need to stay close to real product behavior: uploads, thumbnails, bots, media pipelines, automation, and systems that need to quickly decide if two images should be treated as variants of the same thing.

The value here isn't in promising artificial intelligence. It's in doing a common visual operation in a predictable, explainable, and cheap way.

Repository: github.com/ichioda/spigel