Find out the basic facts on which image compression is based.
Brief introduction of the principle and method of image compression

Because I enjoy using and learning Photoshop, I rented a world encyclopedia image gallery two days ago. After I copied it to my computer, only 4 GB was left on the D drive, which is far too little for someone like me who likes to hoard material. I also noticed that each picture in the gallery is about 1.5 MB, very different from the pictures I usually collect online. Those are usually around 100 KB; only photos taken with a digital camera reach about 0.5 MB, and none of the 100 KB pictures look unclear. So I set out to compress the encyclopedia gallery. Today's article is a brief introduction to the principles of image compression and some simple ways to do it.

First, I will introduce the two ways pictures are represented on a computer.

There are two image representation technologies: bitmap and vector. Vector graphics are mainly used for computer-drawn cartoons and for regular geometric figures in mathematics. Bitmaps are what we meet most often in daily life: digital photos and pictures scanned into the computer are all bitmaps.

Vector graphics do not need this kind of compression. A vector graphic is stored as drawing commands rather than as a dot matrix, so however much you enlarge or shrink it, the commands stay the same. Its format also cannot simply be changed: converting it to a bitmap format loses all the advantages of vector graphics. For that reason this article leaves vector graphics aside and concentrates on bitmap compression.

For bitmap compression, there are basically two methods:

The first method: compression by format conversion

This method uses a coding technology (JPEG is one) to re-encode the image. Image files come with many extensions, such as bmp, jpeg (jpg) and gif; you can look them up online if you want a full picture. Each format corresponds to a way of coding the image. Among all these codings, JPEG achieves a high compression ratio with little visible loss of quality. If the image's extension is .BMP, use this method to convert it to .JPG.

The operation is simple. Open the picture with the Paint program that comes with Windows XP, choose Save As, and pick jpg or jpeg as the format. On Windows 2000, Paint cannot save jpg files; instead choose Programs -> Accessories -> Imaging from the Start menu and proceed in the same way. You can also let QQ convert the picture for you: send it to someone, right-click the picture shown in the QQ window, and choose Save As. The saved picture is a compressed one. Many other programs can do this too. For batch processing in particular, I suggest using software such as Photoshop or ACDSee. I will describe the specific steps in a future article.

At present the most popular approach is to compress pictures with JPEG coding. Below I quote a professional photography website's explanation of the principles behind compression. Skip it if you are not interested:

The basic principle of compressing a file is to find the byte sequences that repeat within it and to build a "dictionary" in which each repeated sequence is represented by a short code. For example, if the phrase "People's Republic of China" appears in several places in a file, it is written once into the dictionary and replaced by a code everywhere else, so the file shrinks.
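The dictionary idea can be tried out in a few lines of Python. This is only a minimal sketch: the function names, the sample phrase and the choice of "\x01" as the code character are all assumptions made for illustration, and real compression software builds its dictionary automatically rather than being told which phrase to replace.

```python
def dict_compress(text, phrase, code="\x01"):
    """Replace every occurrence of `phrase` with a one-character code.

    Returns the dictionary and the shortened text. The code character is
    an arbitrary choice for this sketch and must not occur in the text.
    """
    if code in text:
        raise ValueError("code character already occurs in the text")
    return {code: phrase}, text.replace(phrase, code)


def dict_decompress(dictionary, body):
    """Put every dictionary phrase back in place of its code."""
    for code, phrase in dictionary.items():
        body = body.replace(code, phrase)
    return body


sample = ("image compression is useful; image compression saves space; "
          "image compression is everywhere")
dictionary, body = dict_compress(sample, "image compression")
print(len(sample), "characters before,", len(body), "after")
assert dict_decompress(dictionary, body) == sample
```

The saving grows with the length of the phrase and with the number of times it repeats, which is why repetitive files compress so well.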

Because the information a computer processes is expressed as binary numbers, compression software marks identical strings in the binary data with special codes to achieve compression. To picture this, imagine a photo of blue sky and white clouds. For thousands of monotonous blue pixels, instead of listing the colors one by one ("blue, blue, blue ..."), it is more concise to tell the computer to "store 117 blue pixels starting from this position", which saves a great deal of storage space. This is a very simple example of image compression. In the end, all computer files are stored as "1"s and "0"s. Just as with the blue pixels, a suitable mathematical formula can greatly reduce the size of a file and pack the data densely without loss.

Generally speaking, compression is divided into lossy and lossless compression. If losing some individual data has little effect, simply ignoring it is a good idea; this is lossy compression. Lossy compression is widely used for video, sound and image files, with mpeg, mp3 and jpg as typical examples. In many other cases the compressed data must be exact, so lossless compression formats were designed, such as the common zip and rar.

Compression software is the tool that applies these principles. The file produced by compression is called an archive, and it can be a fraction of the original size or even smaller. The archive is a different file format, so to use the data inside it you must first restore it with the compression software. This process is called decompression. Common compression programs include WinZip and WinRAR.
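The blue-sky example is usually called run-length encoding. A minimal Python sketch follows; the function names and the sample row of pixels are made up for illustration, and real image formats store pixel values as numbers rather than color names.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse each run of identical values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# One row of a "blue sky with a small white cloud" picture (hypothetical data).
row = ["blue"] * 117 + ["white"] * 3 + ["blue"] * 20
encoded = rle_encode(row)
print(encoded)  # [('blue', 117), ('white', 3), ('blue', 20)]
assert rle_decode(encoded) == row  # lossless: the row comes back exactly
```

Here 140 pixels are described by three pairs. On a row where neighboring pixels all differ, the same scheme would make the data larger, which is why it suits monotonous areas best.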

Computer data contains two kinds of repetition, and zip compresses both.

The first is repetition in the form of phrases, that is, repeated sequences of three or more bytes. For such a repetition, zip uses two numbers: 1. the distance from the current compression position back to the earlier occurrence; 2. the length of the repetition. If each of these two numbers occupies one byte, a repeated phrase of three or more bytes is replaced by two bytes, so the data has clearly been compressed.
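The (distance, length) idea can be sketched in Python as follows. This is a simplified illustration in the style of LZ77, not the actual zip implementation: the function names are invented, the search is a slow brute-force scan, and the 255 limits simply mirror the assumption above that each number fits in one byte.

```python
def lz_compress(data, window=255, min_len=3, max_len=255):
    """Greedy phrase compression.

    Returns a list of tokens: an int is a literal byte, a tuple is
    (distance back to the earlier occurrence, length of the repetition).
    """
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Try every earlier start position inside the window.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_len and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_len:
            tokens.append((best_dist, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lz_decompress(tokens):
    """Rebuild the bytes by copying `length` bytes from `distance` back."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            distance, length = token
            for _ in range(length):
                out.append(out[-distance])  # byte by byte, so overlaps work
        else:
            out.append(token)
    return bytes(out)


text = b"blue sky, blue sea, blue sky, blue sea"
tokens = lz_compress(text)
print(tokens)
assert lz_decompress(tokens) == text
```

Each tuple in the output stands for a phrase that already appeared earlier, so the longer and more frequent the repetitions, the fewer tokens are needed.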

One byte has 256 possible values (0 to 255), and three bytes have 256 × 256 × 256, more than 16 million, possible combinations. The number of possible values grows exponentially with the length of the phrase, so the probability of repetition would seem to be extremely low. In fact, all types of data tend to repeat. In a paper, a handful of terms appear again and again; in a novel, names of people and places recur; in a background picture with a top-to-bottom gradient, pixels repeat along the horizontal direction; in program source files, language keywords recur (and how often do we copy and paste code while writing a program?). In uncompressed data measured in tens of kilobytes, large numbers of phrase repetitions usually occur. After the compression described above, this tendency toward phrase repetition is largely destroyed, so running phrase compression a second time on the compressed result is generally useless.
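Both claims are easy to check with Python's standard zlib module, which implements the same family of algorithm that zip uses. The sketch below is only an experiment under assumed conditions: the word list and the random seed are arbitrary, and the exact byte counts will depend on the data.

```python
import random
import zlib

# Number of possible three-byte phrases.
print(256 ** 3)  # 16777216

# Build a repetitive "article" from a small, made-up vocabulary.
rng = random.Random(0)
vocabulary = ["image", "pixel", "compress", "format", "bitmap", "color",
              "the", "of", "and", "a", "file", "size"]
text = " ".join(rng.choice(vocabulary) for _ in range(5000)).encode("ascii")

once = zlib.compress(text)   # first pass removes the repeated phrases
twice = zlib.compress(once)  # second pass has almost nothing left to find

print("original:", len(text))
print("compressed once:", len(once))
print("compressed twice:", len(twice))
assert zlib.decompress(zlib.decompress(twice)) == text
```

The first pass should shrink the text considerably, while the second pass typically leaves the size about the same or slightly larger.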

The second kind of repetition is single-byte repetition. A byte has only 256 possible values, so this kind of repetition is inevitable. Some byte values occur more often and others less, so their distribution tends to be statistically uneven. This is easy to see: in an ASCII text file some symbols are rarely used while letters and digits are common, and the letters themselves differ in frequency (the letter e is said to be the most frequent); many pictures are predominantly dark or predominantly bright, so dark (or bright) pixels dominate. (Incidentally, the png picture format is lossless, and its core algorithm is the same deflate algorithm that zip uses. The main difference from a zip file is that, as a picture format, png stores information such as the image dimensions and the number of colors in the file header.) The output of the phrase compression above shows the same trend: repetitions tend to occur close to the current compression position, and repetition lengths tend to be short (within 20 bytes).

This unevenness can be exploited: the 256 byte values are re-encoded so that values that occur more often get shorter codes and values that occur less often get longer codes. As long as the bytes that become shorter outnumber the bytes that become longer, the total length of the file shrinks, and the more uneven the byte usage, the greater the compression ratio.
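Giving shorter codes to more frequent bytes is commonly done with Huffman coding, which is the method zip's deflate algorithm relies on for this step. The Python sketch below builds such a code table for a short piece of text. It is a simplified illustration: the function name and sample sentence are invented, and the codes are kept as strings of "0" and "1" instead of being packed into real bits.

```python
import heapq
from collections import Counter


def huffman_code(data):
    """Return {byte value: bit string}; frequent bytes get shorter codes."""
    freq = Counter(data)
    if len(freq) <= 1:
        # Zero or one distinct byte: a single one-bit code is enough.
        return {symbol: "0" for symbol in freq}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(weight, index, {symbol: ""})
            for index, (symbol, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two rarest groups, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


text = b"the letter e is seen even more often than the rest"
codes = huffman_code(text)
encoded_bits = sum(len(codes[byte]) for byte in text)
print(encoded_bits, "bits instead of", 8 * len(text))
```

In a complete compressor the code table (or the byte frequencies) must be stored alongside the data so that the decoder can rebuild it, which eats into the saving for very small files.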

It can be said that a jpg is already compressed very heavily, with relatively little loss, so it can hardly be compressed any further. Depending on the image quality you require, the compression ratio can vary greatly, but it is generally very high (this is the charm of the technology). Dedicated software may give you some compression options. When adjusting image quality, it is best not to go below 40%, because lower settings damage the image badly. These options are part of the JPEG technology itself; which to choose depends on how much compression you need.

The second method: resize the picture (some call this adjusting the resolution)

This method simply changes the dimensions of the picture. If a 3000*2000 photo is resized to 600*400, it holds 1/25 as many pixels, so its size becomes roughly 1/25 of the original. Following the principle of bitmap representation, we could also reduce the number of colors in the image, but this is not usually done.

Users of Windows XP can do this with the Paint program that comes with the system. The operation is simple: open the picture in Paint, choose Stretch/Skew from the Image menu (or press Ctrl+W), enter the percentage you want to scale to (adjust until the size looks right; if the result is too small, press Ctrl+Z to undo), and then save the picture. Users of Windows 2000 can use the imaging software that comes with the system: choose Programs -> Accessories -> Imaging from the Start menu, open the image, choose Properties -> Size from the Page menu to change the image size, and then save the picture. Users of ACDSee can open the picture they want to compress, select Edit in the toolbar, and then select Resize in the toolbar of the picture editor that pops up. For example, if the picture is changed from 1024×768 to 640×480, the file size naturally decreases. Of course, changing the size of the picture also affects how it looks to some extent.
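The effect of resizing on size can be worked out with simple arithmetic. The Python sketch below assumes an uncompressed 24-bit bitmap (3 bytes per pixel); that figure and the function names are assumptions for illustration, and a real jpg will not shrink by exactly the same factor. The small downscale function shows the crudest possible way to shrink a bitmap, by keeping only every n-th pixel.

```python
def raw_size_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size, assuming 24-bit color (3 bytes per pixel)."""
    return width * height * bytes_per_pixel


def downscale(pixels, factor):
    """Keep every `factor`-th pixel in both directions."""
    return [row[::factor] for row in pixels[::factor]]


before = raw_size_bytes(3000, 2000)
after = raw_size_bytes(600, 400)
print(before, "bytes ->", after, "bytes")
print(before // after)  # 25

# The second example from the text: 1024x768 reduced to 640x480.
print(raw_size_bytes(1024, 768) / raw_size_bytes(640, 480))

# A tiny 30x20 "picture" shrunk by a factor of 5 becomes 6x4.
picture = [[(x, y) for x in range(30)] for y in range(20)]
small = downscale(picture, 5)
print(len(small[0]), "x", len(small))
```

Image editors usually average neighboring pixels rather than simply dropping them, which looks smoother, but the saving in pixel count is the same.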

At present there are many specialized programs, developed by individuals, for reducing image file size, such as MyPhotoZip, Jpeg Imager and Image Optimizer. They let you compress images more precisely, but they are relatively troublesome to use. The principles they rely on are still the two methods described above. Some of them use JPEG 2000 coding, a more efficient technology than JPEG that works quite well. How to compress pictures with better coding technology is a deep subject. I have some material on it but have not studied it in depth myself. It is enough to know a little about these things and compress your pictures sensibly.

Flying Moon's reminder: both methods described above compress an image at the cost of its quality. Even if we may not notice it at all, converting to JPEG or resizing always loses some image quality.