After some playing around, it seems tracks are saved in T1 space, not diffusion space. As an interesting twist, one has to apply the inverse of the affine without the translation, i.e. the inverse of just the 3×3 part, not the full 4×4 affine.
Knowing this, obtaining voxel coords & label values is easy.
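For anyone else who runs into this, here is a rough sketch of what I mean, using plain NumPy. The function names (`points_to_voxels`, `label_values`) are just mine, and I'm assuming the track points come as an N×3 array, the T1 affine is a 4×4 matrix, and voxel centres sit at integer indices (so rounding to the nearest integer is the right thing to do). That last part is an assumption on my side, not something I found documented.

```python
import numpy as np

def points_to_voxels(points, affine):
    """Map track points to integer voxel indices.

    Only the upper-left 3x3 block of the 4x4 affine is inverted;
    the translation column is deliberately ignored.
    """
    linear = np.asarray(affine, dtype=float)[:3, :3]
    inv_linear = np.linalg.inv(linear)
    # Points are row vectors, so multiply by the transposed inverse.
    vox = np.asarray(points, dtype=float) @ inv_linear.T
    # Assumption: voxel centres are at integer indices, so round to nearest.
    return np.rint(vox).astype(int)

def label_values(points, affine, label_volume):
    """Look up the label under each track point (0 for points outside the volume)."""
    label_volume = np.asarray(label_volume)
    ijk = points_to_voxels(points, affine)
    shape = np.array(label_volume.shape[:3])
    # Keep only indices that fall inside the label volume.
    inside = np.all((ijk >= 0) & (ijk < shape), axis=1)
    values = np.zeros(len(ijk), dtype=label_volume.dtype)
    values[inside] = label_volume[tuple(ijk[inside].T)]
    return values
```

With a 2 mm isotropic affine, a point at (2, 4, 6) ends up in voxel (1, 2, 3) no matter what the translation column says. Whether rounding or flooring is correct depends on how the format defines voxel origins, so treat that line as a guess.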
It would be nice if this were documented somewhere in the file formats section.