Texels: A texel is a 4x4 pixel (16 total) block in the larger image that is the principle building block of DXTn compression. The DXT1 texel is represented in code as 64-bit block of data. For comparison a 16 pixel uncompressed image using R5G6B5 would be 256 bits. The DXT1 code is broken up into 2 sections. The first is 32 bits long and consists of 2 16-bit colors (R5G6B5). The second is a 32 bit long lookup table where every 2 bits represents a pixel color of value 0 through 3 for each of the 16 pixels. The pixel location in the texel is implicitly defined by the structure so that the first four 2-bit values are the four pixels left to right of the top row, the next four are the next row from left to right and so on. The 2-bit values in the look-up table indicate which of 4 possible colors the pixel can be. Those 4 colors consist of the 2 colors stored in the first 32 bits and 2 colors spaced evenly between the first two.
For example, say that color1 (the first defined color) is R:10, G:50, B:75 (a dark blue) and that color4 (the second defined color) is R:200, G:200, B:200 (a very light grey). Color2 would be R:73, G: 100, B: 142 (a light blueish grey color) and color3 would be R:135, G:150, B:158 (a light grey). Color2’s red channel was calculated using 2/3 the value of color1’s red and 1/3 the value of color4 (10*2/3 + 200*1/3 ≈ 73). Color3 did the same thing in reverse using 2/3 of color4’s values and 1/3 of color1’s values. In special cases, DXT1 may only use 3 colors with the fourth “color” representing transparency.
As can be seen, this compresses 16 possible colors into 4, but those 4 colors all line along the same gradient which is chosen to best fit those 16 colors. With what’s already been shown it’s quite possible to render a DXT1 compressed image, but compressing an image into DXT1 requires determining the best possible gradient. The most common technique used is Principle Component Analysis (PCA). PCA in its most general form is a way of rotating a coordinate system such that along the principle axis there is the greatest amount of variation in value with each subsequent axis having less. In most cases only the principle axis is used to transform a higher order data system to a one dimensional system. Since along the principle axis the data varies the most, it also represents the best possible straight line that fits the data. This makes PCA perfect for determining the color gradient in DXTn compression.
By treating all 16 colors as coordinates in 3-space using red, green and blue as the original axis we can use PCA to find one of the best possible gradients for compression. The first step for PCA is to create a covariant matrix of the system (3x3 for 3 dimensions). A covariant is a value which represents how much one dimensions value varies with respect to the variation of another variation. The covariant of X and Y aka cov(X,Y) is expressed by this equation:

A variance is simply the measure of how much a single dimension varies and is equal to cov(X,X) for whatever dimension the variance relates to. The square root of a variance is the standard deviation of that dimension which is often used during statistical analysis. The covariance matrix that we will use for image compression looks like this:

It is important to note that the covariance must be constructed from a data set that has a mean value of 0 for all dimensions. So before finding the covariance, we will need to subtract the mean value of each color channel. So now that we have our matrix that contains information about all possible changes in any dimension, we can use matrix math to find the axis along which there is the most variation. First we need to find the eigenvalues of the matrix. Since this a 3x3 real symmetrical matrix we know that there are 3 real eigenvalues and that the eigenvectors are real and orthogonal (which is important since we’ll be using the eigenvectors as our new axes). The eigenvector corresponding to highest eigenvalue is the principle axis with each successively lower eigenvalue corresponding to the next most significant axis.
Notes on matrix math:
Eigenvalues and eigenvectors are special properties of square matrices. An eigenvalue is any value that satisfies the following equation:

That is to say, for every invertible square matrix there exists a value that can represent the matrix when multiplied by a vector. The number of these eigenvalues is equal to the order of the matrix although they may be repeated or complex. An eigenvector is the vector that satisfies the above equation for a given eigenvalue (which may be complex). The eigenvectors do not have fixed values, but each element of the vector is directly proportional to the others.
So by finding the eigenvector of the highest eigenvalue we have a vector that represents the best fit gradient we will use for our compression. By finding the dot-product of each pixel color and the unit-length gradient (subtracting the mean value of each channel for those colors), we can determine each colors projection along the principle axis. The next important step in PCA/DXTn compression is to determine the endpoints of the gradient.
A simple process is to find the maximum and minimum projections along the principle axis and pick the points at 1/8 along that line and 7/8 along that line (with the two calculated points being 3/8 and 5/8). Then it is a simple matter of determining which projection value correlates to which of the four points on the line. We have now simplified the 16 possible colors into 4 points along a 0-mean gradient in color-space. All that remains is to convert the two endpoint colors into actual colors (by re-adding the channel mean values) and we have the two colors that will be stored in the texel data. With all of this it is now possible to compress an image into DXT1.
For more information:
DXTn, S3TC compression - Wikipedia
PCA - Wikipedia
PCA - squish library
Covariance - Wikipedia
Eigenvalues and eigenvectors in general - Wikipedia
PCA, covariance, eigenvalues and eigenvectors - mediafire:PDF