oatsandsugar 4 days ago

They trained a model to create embeddings that were 1024-dimension vectors, with each dimension stored as a 32-bit float.

This gave them a baseline performance of 100% with an embedding size of 4,096 bytes.
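Roughly, in numpy terms (an illustrative sketch with made-up values, not their actual code):

    import numpy as np

    # A hypothetical 1024-dimension float32 embedding (values made up for illustration)
    rng = np.random.default_rng(0)
    embedding = rng.standard_normal(1024).astype(np.float32)

    print(embedding.nbytes)  # 1024 dimensions * 4 bytes each = 4096 bytes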

They then experimented with lopping off the second half of the embedding, leaving 512 dimensions at 2,048 bytes.
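The truncation is basically just slicing, something like this (the re-normalization step is a common follow-up, but that part is my assumption, not something the write-up states):

    import numpy as np

    embedding = np.random.default_rng(0).standard_normal(1024).astype(np.float32)

    # Keep only the first 512 dimensions
    truncated = embedding[:512]
    print(truncated.nbytes)  # 512 dimensions * 4 bytes each = 2048 bytes

    # Re-normalizing after truncation is common so cosine similarity still behaves
    # sensibly (an assumption here, the write-up doesn't say whether they did it)
    truncated = truncated / np.linalg.norm(truncated)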

They also experimented with just flattening each dimension to 1 bit, 0 or 1 (0 for negative, 1 for positive), reducing the size of the embedding to a minuscule 128 bytes.
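The binary version amounts to a sign test plus bit-packing, something like:

    import numpy as np

    embedding = np.random.default_rng(0).standard_normal(1024).astype(np.float32)

    # 1 bit per dimension: 1 for positive values, 0 for negative, packed 8 per byte
    bits = (embedding > 0).astype(np.uint8)
    packed = np.packbits(bits)

    print(packed.nbytes)  # 1024 bits / 8 = 128 bytes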

Counterintuitively, the binary simplification was not only way smaller, it also ended up slightly more performant than the 512-dimension truncation (96.46% vs. 95.22% of baseline).
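The write-up doesn't spell out exactly how those percentages were measured, but once you're in binary land, similarity search typically becomes a Hamming-distance comparison over the packed bits, which is part of why the size win matters so much in practice. A rough sketch of the mechanics (illustrative only):

    import numpy as np

    rng = np.random.default_rng(0)
    a = np.packbits((rng.standard_normal(1024) > 0).astype(np.uint8))
    b = np.packbits((rng.standard_normal(1024) > 0).astype(np.uint8))

    # Hamming distance between two packed binary embeddings:
    # XOR the bytes, then count the bits that differ. Smaller = more similar.
    hamming = int(np.unpackbits(np.bitwise_xor(a, b)).sum())
    print(hamming)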

This result is wild to me.