dpkg --compare-versions

Q: How are Debian package version strings compared?

A: The rules are mandated by Debian Policy, and dpkg is considered the reference implementation.

Comparing Debian package version strings is not trivial: many programs implement the comparison themselves and get corner cases wrong, me included. Therefore use dpkg --compare-versions or another well-tested implementation, for example apt.apt_pkg.version_compare() for Python or debversion for PostgreSQL. Continue reading if you want to understand the comparison yourself, which is important when you increment the version of UCS packages. The format is: [_epoch_:]_upstream_version_[-_debian_revision_]

  1. The epoch is the last resort when version numbers go backward and should be used sparingly. It is reserved for cases such as upstream changing its version scheme incompatibly, or for fixing serious errors. The epoch is a small number that defaults to 0, and it is not included in the filename. This makes comparing packages by filename wrong, as the filename misses the epoch!
  2. The upstream_version is a sequence of alphanumeric characters plus the characters full stop, plus, hyphen and tilde. The pattern [0-9A-Za-z.+~-]+? is not greedy: it will capture all hyphens except the last one, which is used to separate the debian_revision.
  3. The debian_revision is optional and is the part after the last hyphen character. Therefore the same character set as in upstream_version is allowed, except for the hyphen character: the pattern [0-9A-Za-z.+~]+ is greedy.
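The three rules above can be sketched as a single regular expression. This is only an illustration: the names VERSION_RE and parse_version are made up for this example, and dpkg has its own parser rather than this pattern.

```python
import re

# Hypothetical pattern built from the three rules above (not taken from dpkg).
VERSION_RE = re.compile(
    r"(?:(?P<epoch>[0-9]+):)?"                # optional epoch before a colon
    r"(?P<upstream>[0-9A-Za-z.+~-]+?)"        # minimal: leaves the last hyphen alone
    r"(?:-(?P<revision>[0-9A-Za-z.+~]+))?"    # optional part after the last hyphen
)


def parse_version(version):
    """Split a version string into (epoch, upstream_version, debian_revision)."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"not a valid version string: {version!r}")
    return int(match["epoch"] or 0), match["upstream"], match["revision"] or ""


print(parse_version("3:1.2"))      # (3, '1.2', '')
print(parse_version("1.2-3-4.5"))  # (0, '1.2-3', '4.5')
```

A missing epoch is returned as 0 and a missing debian_revision as the empty string.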

When two version strings are compared, they are compared section by section: first numerically by epoch, second by upstream_version and only last by debian_revision. The last two sections require a special algorithm for comparison:

  • Each section is split into alternating runs of non-digit characters and digits, always starting with a non-digit run, which may be empty.
  • Runs of digits are compared numerically; a missing number counts as 0.
  • Runs of non-digits are compared lexically as strings, using a modified ASCII order:
  • ~ sorts before everything, even the empty string. Think of it as "minus epsilon"; it is often used for alpha, beta and release-candidate versions or for backports, as they must sort before the final version.
  • After the empty string follow the upper-case letters A-Z and then the lower-case letters a-z.
  • Last come the plus +, hyphen - (only in upstream_version) and full stop ., in this order.
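The rules in this list can be sketched in Python as follows. This is a simplified illustration, not dpkg's code: all function names are made up for this example, and the input is assumed to be a syntactically valid version string. For real work, use dpkg or one of the implementations named above.

```python
import re
import string
from itertools import zip_longest


def order(ch):
    """Weight of one non-digit character in the modified ASCII order."""
    if ch == "~":
        return -1              # sorts before the empty string (weight 0)
    if ch in string.ascii_letters:
        return ord(ch)         # A-Z, then a-z
    return ord(ch) + 256       # '+', '-', '.' sort after all letters


def split_runs(section):
    """Split 'a1.2' into [('a', 1), ('.', 2)]: a non-digit run, then a number."""
    pairs = re.findall(r"([^0-9]*)([0-9]*)", section)
    return [(text, int(num or 0)) for text, num in pairs if text or num]


def compare_text(a, b):
    """Compare two non-digit runs; a missing character has weight 0."""
    for ca, cb in zip_longest(a, b):
        wa = 0 if ca is None else order(ca)
        wb = 0 if cb is None else order(cb)
        if wa != wb:
            return -1 if wa < wb else 1
    return 0


def compare_section(a, b):
    """Compare two upstream_version or two debian_revision strings."""
    runs = zip_longest(split_runs(a), split_runs(b), fillvalue=("", 0))
    for (text_a, num_a), (text_b, num_b) in runs:
        result = compare_text(text_a, text_b)
        if result:
            return result
        if num_a != num_b:
            return -1 if num_a < num_b else 1
    return 0


def compare_versions(a, b):
    """Return -1, 0 or 1 if version a sorts before, equal to or after b."""
    def split(version):
        epoch, colon, rest = version.partition(":")
        if not colon:
            epoch, rest = "0", version
        upstream, hyphen, revision = rest.rpartition("-")  # last hyphen only
        if not hyphen:
            upstream, revision = rest, ""
        return int(epoch), upstream, revision

    (epoch_a, up_a, rev_a), (epoch_b, up_b, rev_b) = split(a), split(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1
    return compare_section(up_a, up_b) or compare_section(rev_a, rev_b)


print(compare_versions("1~rc2", "1"))  # -1
print(compare_versions("1-A", "1-2"))  # 1
```

Padding the shorter side with an empty non-digit run and the number 0 is what makes a missing debian_revision compare like a revision of 0.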

Examples

I have inserted blanks between the three groups to help you parse the version numbers.

Examples for parsing

1.2
epoch=0
upstream_version=("", 1, ".", 2)
debian_revision=("")
3 : 1.2
epoch=3
upstream_version=("", 1, ".", 2)
debian_revision=("")
1.2 - 3
epoch=0
upstream_version=("", 1, ".", 2)
debian_revision=("", 3)
1.2-3 - 4.5
epoch=0
upstream_version=("", 1, ".", 2, "-", 3)
debian_revision=("", 4, ".", 5)
1 - deb9
epoch=0
upstream_version=("", 1)
debian_revision=("deb", 9)

Examples for comparing

1 < 2
{epoch: 0, upstream: ("", 1), debian: ("")}
{epoch: 0, upstream: ("", 2), debian: ("")}
_debian_revision_ defaults to 0 if not given
2 < 2 : 1
{epoch: 0, upstream: ("", 2), debian: ("")}
{epoch: 2, upstream: ("", 1), debian: ("")}
_epoch_ defaults to 0 if not given
1~rc2 < 1
{epoch: 0, upstream: ("", 1, "~rc", 2), debian: ("")}
{epoch: 0, upstream: ("", 1), debian: ("")}
`~` sorts before the empty string
1 < 1.2
{epoch: 0, upstream: ("", 1), debian: ("")}
{epoch: 0, upstream: ("", 1, ".", 2), debian: ("")}
the empty string sorts before **everything** except `~`
1 < 1+gitABC123DEF
{epoch: 0, upstream: ("", 1), debian: ("")}
{epoch: 0, upstream: ("", 1, "+gitABC", 123, "DEF"), debian: ("")}
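`+` sorts after the empty string. Do not rely on this scheme to order hexadecimal numbers such as Git hashes: they are split into letter and digit runs, so for example `1A` sorts before `19`.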
1 < 1 - 2
{epoch: 0, upstream: ("", 1), debian: ("")}
{epoch: 0, upstream: ("", 1), debian: ("", 2)}
1 - 3 < 1-2 - 3
{epoch: 0, upstream: ("", 1), debian: ("", 3)}
{epoch: 0, upstream: ("", 1, "-", 2), debian: ("", 3)}
The _upstream_version_ changes because only the last hyphen separates the _debian_revision_
1 - 2 > 1 - 2~bpo9
{epoch: 0, upstream: ("", 1), debian: ("", 2)}
{epoch: 0, upstream: ("", 1), debian: ("", 2, "~bpo", 9)}
12.0.1-2 - dp1A~4.4.0.202011022025 > 12.0.1 - 3A~4.4.0.202108311259
{epoch: 0, upstream: ("", 12, ".", 0, ".", 1, "-", 2), debian: ("dp", 1, "A~", 4, ".", 4, ".", 0, ".", 202011022025)}
{epoch: 0, upstream: ("", 12, ".", 0, ".", 1), debian: ("", 3, "A~", 4, ".", 4, ".", 0, ".", 202108311259)}
You have accidentally changed the _upstream_version_ by introducing an **additional hyphen before** the _debian_revision_
1 - A > 1 - 2
{epoch: 0, upstream: ("", 1), debian: ("A")}
{epoch: 0, upstream: ("", 1), debian: ("", 2)}
`ord("0")=48` would sort before `ord("A")=65`, but the former is a digit while the latter is a non-digit. Therefore they are compared differently and `A` wins, because the comparison always starts with the non-digit part, where `A` is compared against the empty string.
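To check examples like these against the reference implementation, dpkg can be called from a script. The helper below is a minimal sketch: the name dpkg_compare is made up for this example, and it assumes that the dpkg binary is installed. dpkg --compare-versions exits with status 0 if the relation holds.

```python
import shutil
import subprocess


def dpkg_compare(a, op, b):
    """Return True if "a op b" holds; op is one of lt, le, eq, ne, ge, gt."""
    result = subprocess.run(
        ["dpkg", "--compare-versions", a, op, b],
        check=False,  # a non-zero exit status is an answer, not an error
    )
    return result.returncode == 0


if shutil.which("dpkg"):  # only run the examples where dpkg is available
    print(dpkg_compare("1~rc2", "lt", "1"))
    print(dpkg_compare("1-2", "gt", "1-2~bpo9"))
```

Pass the version strings without the blanks that were inserted above for readability.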

Greedy vs. minimal

What most confuses people is that the upstream_version seems to be greedy in regard to hyphens:

Only the last hyphen separates upstream_version from debian_revision, so all except the last hyphen should be captured by upstream_version. Doesn’t this make it greedy?

Actually no:

  • Because the debian_revision is optional, the greedy version [0-9-]+(-[0-9]+)? would mistakenly capture everything as upstream_version: it does not stop before the last hyphen and leaves nothing for the debian_revision.
  • The minimal version [0-9-]+?(-[0-9]+)? leaves the last part to debian_revision while capturing only the remaining prefix correctly as upstream_version.
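The difference can be tried out with Python's re module. The two patterns below are a sketch using the full character sets from the format description; the names GREEDY and MINIMAL are made up for this example. Both patterns must match the whole string and both treat the debian_revision as optional.

```python
import re

UPSTREAM = r"[0-9A-Za-z.+~-]"           # upstream_version may contain hyphens
REVISION = r"(?:-([0-9A-Za-z.+~]+))?"   # optional, must not contain hyphens

GREEDY = re.compile(rf"({UPSTREAM}+)" + REVISION)
MINIMAL = re.compile(rf"({UPSTREAM}+?)" + REVISION)

version = "1.2-3-4.5"
print(GREEDY.fullmatch(version).groups())   # ('1.2-3-4.5', None)
print(MINIMAL.fullmatch(version).groups())  # ('1.2-3', '4.5')
```

The greedy pattern swallows the whole string and leaves the optional group empty, while the minimal pattern stops growing as soon as the rest matches as a debian_revision.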
Written on September 21, 2021