?? lzwtut.htm
字號:
Also the code only uses a maximum of 16 bit compression and reads and writes bytes at a time. It is very possible to instead use up to 32 bit compression by replacing chars with ints and using a long as a buffer. However for the sake of clarity and keeping everything nice and easy to use I restricted it to a maximum of 16 bit, but feel free to like I said before to do so as an exercise.
</p>
<b>5c. C code notes and bit manipulation</b>
<p>
Yep another long section, however a very important one. One of the more confusing aspects of the source code is the write and read code routines, which use a lot of bit manipulation routines, and can therefore be very confusing if you are unsure of exactly how the code works.<br><br>
I am going to assume you know the basics of bit manipulation, if not then I suggest you read Joseph Farrell very informative and well written tutorial, that can be found over at Gamedev <a href = "http://www.gamedev.net/reference/articles/article1563.asp">Gamedev Bit Tutorial</a>. It will cover all the basics, give you some information on applying them to code, I will simply only go over exactly how the code works.<br><br>
The only thing I will briefly go over is the << and >> operators.<br><br>
<< is left shift, if you have the number 10 in binary here is an example:<br>
Number 10, in binary 0000 1010<br>
apply << operator by one, code looks like 10 << 1<br>
Now Number 20, in binary 0001 0100<br><br>
>> is right shift, if you have the number 10 again:<br>
Number 10, in binary 0000 1010<br>
apply >> operator, code looks like 10 >> 1<br>
Now Number 5, in binary 0000 0101<br><br>
It is very important to note that if you right shift the most significant bit is lost, and in left shifting the least significant is lost, this is important because you can use it to your advantage.<br><br>
1010 0000 << 1 = 0100 0000<br>
Notice how the 1 has disappeared? The 1 has been lost because it cannot be placed anywhere.<br>
The second thing I抎 like to emphasize before we start on dissecting the code is that you have to be very careful with signed data types. Once the signed bit is set, you'll find that the values you'll and how it reacts to bitwise operators change, and can affect your code!<br>
If you take a look at the gif_get_lzw code you抣l notice it differs from the version that was used. Why? Let try to use the one we found in the lzw.c/h code! So if you implement it, you'll find it doesn抰 work, why? Well let抯 draw up a quick table, along with code needed to implement :<br><br>
<b>LZW Code, not working correctly</b>
<pre>
int get_gif_lzw_code(FILE *fp,LZW *table, GIF *gif)
{
unsigned int code,looper;
code = 0;
while(table->buffer_bits <=((sizeof(int)-sizeof(char))*8))//Never go over than size - 8or we loose data
{
//No point reading anymore, files done :p
if(feof(fp))
{
break;
}
table->buffer |= ((unsigned char)fgetc(fp)) << (((sizeof(int)*8 - sizeof(char)*8)- table->buffer_bits));
table->buffer_bits+= 8;
table->to_nbuffer--;
if(table->to_nbuffer <= 0)
{
table->buffer_size = getc(fp);
table->to_nbuffer = table->buffer_size;
}
}
code = table->buffer >> (sizeof(int) * 8 - table->bits);
buffer <<= table->bits;
table->buffer_bits -= table->bits;
return code;
}
</pre>
In case you抮e wondering what the table is for, basically
I抳e taken a gif, and taken the first 5 Bytes of the Stream, to demonstrate where in the bit manipulation the new code faults.
This is an 8 bit GIF, and the first few bits of the stream; therefore the current codesize in bits is 9.<br><br>
<b>Tables of Values</b>
<pre>
------------------------------------------------|
GIF Stream(8 bits) |Correct Value(9 bits) |
------------------------------------------------|
Dec. Value|Binary |Dec. Value|Binary |
0 |0000 0000 |256 |1 0000 0000 |
183 |1011 0111 |217 |0 1101 1101 |
9 |0000 1001 |258 |1 0000 0010 |
28 |0001 1100 |259 |1 0000 0011 |
72 |0100 1000 | |
------------------------------------------------|
Table Continued
-------------------------
Received Value(9bits) |
------------------------|
Dec. Value|Binary |
1 |0 0000 0001 |
220 |0 1101 1100 |
72 |0 0100 1000 |
452 |1 1100 0010 |
| |
-------------------------
</pre>
Let抯 see just why it抯 not working correctly. Remember Gif Buffer streams go by bits right to left. Let抯 take a look at what the code does:<br>
<i> table->buffer |= ((unsigned char)fgetc(fp)) << (((sizeof(int)*8 - sizeof(char)*8)- table->buffer_bits));<br>
table->buffer_bits+= 8;</i> <br>
Basically it adds bits in the using the high bytes, and working its way to the low bytes, so the buffer would look like so in binary after storing the first 4 bytes:
<pre>
Byte 1 Byte 2 Byte 3 Byte 4
|-------|---------|----------|---------|
0000 0000 1011 0111 0000 01001 0001 1100
|---------|----------|-----------|
Code 1 Code 2 Code 3
</pre>
Okay, now we can see what happening, basically the values are being stored backwards to what the really should be! Now how do we fix that? Well what would happen if we stored Bytes in the reverse order?
<pre>
Byte 4 Byte 3 Byte 2 Byte 1
|--------|----------|---------|--------|
0001 1100 0000 01001 1011 0111 0000 0000
|---------|----------|---------|
Code 3 Code 2 Code 1
</pre>
Hey it works if we do it like so!And that method is the one used in the source code.
Now why don抰 we also use this method for the other LZW code? Well, first off this routine is not quite as optimized as the other, do you see the extra code = table->buffer << (sizeof(int) * 8 - table->bits);? This is to kick of all the extra bits(remember we talked about this) So as to only get the bits we want.
Also you'll notice that the code seems to be written to have a very low overhead, this is pretty pointless today as computers do have more than 1MB of memory, so feel free to modify the code so it loads everything in memory and then compresses for a massive speed increase, and upgrade the code so it can write up to 32 bit code sizes, and reads 16 bits instead of 8, (the code makes it very easy to do so). The code was written like this because I felt it would be easier to understand, not more efficient to use, as this is a tutorial not a Fully Built Ready to rock LZW library :)
</p>
<b>5d. Notes on Java Source</b>
<p>
Okay, I admit, the Java Code is more of a hack then anything, and it was added as an extra for those who prefer java over C. If something really bothers you about the code or you'd like to improve the code in the tutorial by all means drop me an email and ill see about updating the tutorial.
Java is a little different in that it does not directly support unsigned types, which can cause
problems as shifting works a little different with negative numbers. I was going to go into detail on why I did certain tricks in the code, but I already did that with the c code, and while searching for sources to use as reference for this section I happened to find a very useful link
that will explain it a lot better then I will, and would make a great link in your bookmarks:
<a href ="http://mindprod.com/jgloss/binary.html">Java Binary Explanation</a>
</p>
<a name = "ref"><b>References</b></a>
<hr><br>
<b>6a. Bibliography</b>
<p>
<a href = "http://dogma.net/markn/articles/lzw/lzw.htm"><b>LZW Data Compression, by Mark Nelson.</b></a><br>
This is THE tutorial on the source, without this tutorial, I doubt I would have been able to write the code as well as it came out and have been able to grasp the concept behind it. Well written, and very informative, and although it is showing its age, is still the best source you can find.
<br><br>
<b><a href ="http://www.arturocampos.com/ac_lzw_gif.html">LZW, gif decoding by Arturo San Emeterio Campos.</a></b><br>
This is a very informative and good read, however there are some really annoying spelling and grammar mistakes(although I've probably made just as many), and the level of writing could be a little higher, but despite its faults it was a helpful resource.
<br><br>
<b><a href = "http://www.cis.udel.edu/~amer/CISC651/lzw.and.gif.explained.html">LZW and gif Explained by Steve Blackstock</a><br></b>
This tutorial will seem a little cryptic and hard to understand, and its recommended you read the above two tuts before diving into this tutorial, but it does cover its topic well enough, just expect an explanation, not working code.(This is found on a lot of sites, so just google the author if the link doesn't work).<BR><BR>
<b><a href ="http://www.wotsit.org">Official GIF documentation 87a/89a at wotsit</a></b>
If you need to know more about GIF, you can go wrong with this handy paper, covers everything you'll ever need to know on the GIF format.<br><br>
<b><a href ="http://www.gamespp.com/cprogramming/displayingGIFGraphicFiles.html">Games++ GIF Decoder by Steven H. Don</a></b><br>
Without this written source code I can assure you it would have been harder to have written the GIF code, although it shows its age(don't fread/fwrite structs, its not very safe), it is very clearly written and was a great help.<br><br>
<b><a href ="http://www.allegro.cc/depot/project.php?_id=648"> Allegro GIF Animation Library</a><br></b>
A very well written library, and it was also a great help when I was writing the GIF loader
</p>
<b>6b. Other references</b>
<p>
<b><a href ="http://www.datacompression.info/">http://www.datacompression.info/</a> <br></b>
By Mark Nelson, so you know the site is worth checking out for any compression tutorials you may need<br><Br>
<b><a href = "http://members.aol.com/royalef/gifabout.htm">All about GIF 89a</a><br></b>
Lots of info on the GIF, Covers those extensions I talked about. GIF specifications can be found here, in plain text format.
</p>
<b>6c. The Patent issue</b>
<p>
The GIF patent has become somewhat an issue, and a subject of great controversy.
Here are some links you may wish to read:<br>
<b><a href = "http://slashdot.org/article.pl?sid=04/07/06/1717243">Slashdot</a></b><br>
A slashdot posts containing some interesting information on the topc.<br><br>
<b><a href= "http://www.unisys.com/about__unisys/lzw/"> Unisys</a></b><br>
Unisys, the Patent owner on LZW, their information on the licensing<br><Br>
<b><a href = "http://www.cloanto.com/users/mcb/19950127giflzw.html">http://www.cloanto.com/users/mcb/19950127giflzw.html</a></b><br>
A very interesting read on the topic, and well worth your time to read!<br><br>
<b><a href = "http://www.gnu.org/philosophy/gif.html"> GNU gif page</a></b><br>
Not directly related at GIF use, but has a lot of info on the topic<br><br>
<b><a href = "http://burnallgifs.org/">Burn All GIFs</a></b><Br>
Written in a very negative manner towards the GIF, but contains a wealth of information on the topic<br><br>
</p>
<b>6d. Libraries/Tools Used</b>
<p>
I almost forgot this section, and perhaps left many of you wondering where to find Allegro and some other tools! Well, here you go<br><br>
<b><a href = "http://alleg.sf.net">Allegro</a></b><br>
A Cross Platform Game programming Library, very easy to use and was used for the C/C++ GIF loading code. Precompiled dll's for MSVC can be found at <a href ="http://allegro.cc/files/">allegro.cc</a><br><br>
<b><a href = "http://java.sun.com">Java</a></b><br>
What you don't know what Java is? Well anyway the SDK at the website lets you write your own software, the JRE lets you run the software compiled with it.<br><br>
<b><a href = "http://mingw.org">Mingw</a> <a href = "http://bloodshed.net">Dev-C++</a></b><br>
Mingw is a free C/C++ compiler for windows, and is really worth using, while Dev-c++ is an IDE for Mingw and also rocks. Mingw has many other IDE's so Google for them if you don't like Dev-C++.
</p>
<b>6e. The End</b>
<p>
Okay the end of the tutorial! Well, hopefully it wasn't that hard to understand, I tried hard to make it a decent tutorial and easy to understand, and I also tried to make the code as easy to read as possible( lots of comments). If you find any bugs or mistakes in either the code or the tutorial please send them to me at <a href = "mailto:hard_rock_2@yahoo.com">hard_rock_2@yahoo.com</a> Also if you want to translate this tutorial into any other languages, or host the tutorial somewhere, that's great, but be sure to tell me you are doing so and leave the links to my website! Also tell me where you are going to be hosting it so I can send information about possible updates or fixes. You are of course free to link directly to this tutorial <a href = "http://starsdev.cjb.net/developer/lzw.php">http://starsdev.cjb.net/developer/lzw.php</a>,no need to tell me if you are going to do so.<br><br>
Also don't forget to check out <a href = "http://starsdev.cjb.net">The Stars Development Company Website</a>, and the <a href = "http://starsdev.cjb.net/developer/">developer section</a> for any other tutorials I'll write or have written and any of my libraries. That's all folks, happy coding!
</p>
</body>
</html>
?? 快捷鍵說明
復制代碼
Ctrl + C
搜索代碼
Ctrl + F
全屏模式
F11
切換主題
Ctrl + Shift + D
顯示快捷鍵
?
增大字號
Ctrl + =
減小字號
Ctrl + -