| 1 | <html> |
---|
| 2 | <body> |
---|
| 3 | <h2>File Formats for Punch Cards</h2> |
---|
| 4 | |
---|
| 5 | <p>Because punch cards have a rather special layout (usually 80 columns of 12 bits each), |
---|
| 6 | they cannot simply be saved as a plain binary data stream the way 8-bit paper tapes usually |
---|
| 7 | are on computers (at least with my programs). Few people seem to have dealt with this problem |
---|
| 8 | before: I could find only one (!) document that addresses it.</p> |
---|
| 9 | |
---|
| 10 | <h3>The Jones Punched Card File Format</h3> |
---|
| 11 | <p>The <a href="http://www.cs.uiowa.edu/~jones/cards/format.html">punched card emulation proposal</a> |
---|
| 12 | by Douglas W. Jones is the only document on the World Wide Web that addresses this problem. Jones |
---|
| 13 | has written some utilities in C to work with emulated punched card decks (I simply call them |
---|
| 14 | "punch card files"). It's a highly compact binary format with a 3-byte header (that holds some magic |
---|
| 15 | string like <tt>H80</tt> or <tt>H82</tt>) that precedes every card file. The file itself is simply a |
---|
| 16 | concatenation of punch cards. Every punch card consists of a 3-byte header (filled up with some |
---|
| 17 | flags that are supposed to describe the design of the punch card) and the 80 columns. Two 12-bit |
---|
| 18 | columns are packed into three octets in big-endian format.</p> |
---|
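<p>The packing of two 12-bit columns into three octets can be sketched as follows. This is only a
minimal illustration of the big-endian layout described above, not code taken from Jones's utilities;
the function names <tt>pack_columns</tt> and <tt>unpack_columns</tt> are made up for this example.</p>

```python
def pack_columns(columns):
    """Pack an even number of 12-bit column values, three bytes per pair."""
    if len(columns) % 2:
        raise ValueError("need an even number of columns")
    out = bytearray()
    for first, second in zip(columns[0::2], columns[1::2]):
        if not (0 <= first < 4096 and 0 <= second < 4096):
            raise ValueError("column values must fit in 12 bits")
        out.append(first >> 4)                              # upper 8 bits of first column
        out.append(((first & 0x0F) << 4) | (second >> 8))   # lower 4 of first, upper 4 of second
        out.append(second & 0xFF)                           # lower 8 bits of second column
    return bytes(out)


def unpack_columns(data):
    """Inverse of pack_columns: three bytes become two 12-bit values."""
    if len(data) % 3:
        raise ValueError("packed data must be a multiple of 3 bytes")
    columns = []
    for i in range(0, len(data), 3):
        b0, b1, b2 = data[i:i + 3]
        columns.append((b0 << 4) | (b1 >> 4))
        columns.append(((b1 & 0x0F) << 8) | b2)
    return columns
```

<p>With this layout, the 80 columns of one card occupy 120 bytes, which together with the 3-byte
card header gives the 123 bytes per card.</p>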
| 19 | |
---|
| 20 | <p>This model of a punch card stack has the following advantages and disadvantages:</p> |
---|
| 21 | |
---|
| 22 | <h4>Advantages</h4> |
---|
| 23 | <ul> |
---|
| 24 | <li>It is <b>very compact</b>: one card costs only 123 bytes (3 header bytes plus 120 bytes for the 80 columns).</li> |
---|
| 25 | <li>I/O is <b>very fast</b> due to cheap bit shifting operations that can be made easily |
---|
| 26 | in C/C++ programs</li> |
---|
| 27 | <li>It's a well-documented <b>standard</b>, there already exist some programs that |
---|
| 28 | can work with that format.</li> |
---|
| 29 | </ul> |
---|
| 30 | |
---|
| 31 | <h4>Disadvantages</h4> |
---|
| 32 | <ul> |
---|
| 33 | <li>No chance to edit with <b>text editors</b>; difficult hex editor handling is necessary</li> |
---|
| 34 | <li>I/O needs complex bit shifting operations, which is not intuitive for the programmer, |
---|
| 35 | especially if the files are processed by scripts, etc.</li> |
---|
| 36 | <li>No place for <b>meta data</b> like the label for a column, a translation table |
---|
| 37 | (column value to Unicode character) or additional comments</li> |
---|
| 38 | <li>Due to the 3-byte file header, files cannot be treated like a <b>stream of punch cards</b>, |
---|
| 39 | so the usual Unix programs like <tt>cat</tt> won't work</li> |
---|
| 40 | </ul> |
---|
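<p>To illustrate the last point, here is a rough sketch of how such a file could be split into
cards and why plain concatenation fails. It assumes 80-column decks and relies only on the layout
described above (3-byte file header, then 123-byte cards); <tt>split_deck</tt> and
<tt>concat_decks</tt> are hypothetical helper names, not part of Jones's tools.</p>

```python
FILE_HEADER_LEN = 3                               # e.g. b"H80"
CARD_HEADER_LEN = 3                               # per-card flag bytes
COLUMNS = 80                                      # assumption: 80-column decks only
CARD_LEN = CARD_HEADER_LEN + COLUMNS * 12 // 8    # 3 + 120 = 123 bytes


def split_deck(data):
    """Return (file_header, [(card_header, packed_columns), ...])."""
    if len(data) < FILE_HEADER_LEN:
        raise ValueError("file is shorter than the file header")
    header, body = data[:FILE_HEADER_LEN], data[FILE_HEADER_LEN:]
    if len(body) % CARD_LEN:
        raise ValueError("body is not a whole number of cards")
    cards = []
    for i in range(0, len(body), CARD_LEN):
        card = body[i:i + CARD_LEN]
        cards.append((card[:CARD_HEADER_LEN], card[CARD_HEADER_LEN:]))
    return header, cards


def concat_decks(first, second):
    """Join two deck files: keep the first file header, drop the second."""
    if first[:FILE_HEADER_LEN] != second[:FILE_HEADER_LEN]:
        raise ValueError("decks use different file headers")
    return first + second[FILE_HEADER_LEN:]
```

<p>A plain <tt>cat a.deck b.deck</tt> would leave the second file header in the middle of the
card data, so <tt>split_deck</tt> would reject the result; <tt>concat_decks</tt> removes it first.</p>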
| 41 | |
---|
| 42 | <h3>The Card Markup Language (XML)</h3> |
---|
| 43 | <p>Modeling a punch card with XML is a contemporary idea. Basically, I think such a file could |
---|
| 44 | look like this one:</p> |
---|
| 45 | |
---|
| 46 | <pre> |
---|
| 47 | <?xml version="1.0" encoding="UTF-8" ?> |
---|
| 48 | <card-deck xmlns="http://dev.technikum29.de/2009/punch-card-markup-language"> |
---|
| 49 | <card> |
---|
| 50 | <meta name="generator" value="Punch Card Reader Xyz" /> |
---|
| 51 | <meta name="scanned" value="15.08.09T12:35:42" /> |
---|
| 52 | |
---|
| 53 | <property key="Reader.Error" raw="-1235">Error:Foobar</property> |
---|
| 54 | |
---|
| 55 | <property key="Jones.Color" raw="0100">cream</property> |
---|
| 56 | <property key="Jones.Corner" raw="0">round</property> |
---|
| 57 | <property key="Jones.Cut" raw="1" /> |
---|
| 58 | <property key="Jones.Header" value="&#01;&#02;&#05;" /> |
---|
| 59 | <comment type="text/html"> |
---|
| 60 | <!-- easy to export thanks to the Qt rich text widget --> |
---|
| 61 | <html:p> |
---|
| 62 | Rich text content can be written here |
---|
| 63 | without any trouble. |
---|
| 64 | </html:p> |
---|
| 65 | </comment> |
---|
| 66 | <column value="101101101010"> |
---|
| 67 | <label>A</label> |
---|
| 68 | </column> |
---|
| 69 | <column value="100100100100"> |
---|
| 70 | <label>&specialfoo;</label> |
---|
| 71 | </column> |
---|
| 72 | <column value="100100100000" label="F" /> |
---|
| 73 | <column bit0="0" bit1="1" bit2="1" bit3="1" bit4="1" bit5="0" bit6="0" bit7="0" bit8="0" bit9="0" bit10="0" bit11="1"> |
---|
| 74 | </column> |
---|
| 75 | </card> |
---|
| 76 | |
---|
| 77 | ... |
---|
| 78 | </card-deck> |
---|
| 79 | </pre> |
---|
| 80 | |
---|
| 81 | <p>As you might notice, there are different possibilities for modeling a card. There is, for |
---|
| 82 | example, the most elaborate variant (<tt><column bit0="1" bit1="0" ...</tt>). Using that |
---|
| 83 | method, every punch card costs about 10&nbsp;kByte. There are cheaper methods like the |
---|
| 84 | "boolean string" <tt>0101...</tt>, but then it must be defined which string position corresponds to |
---|
| 85 | which row on the punch card.</p> |
---|
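<p>One possible convention for the "boolean string" is sketched below. Note that the row order is
purely an assumption made for this example (leftmost character = row 12 at the top edge, then 11,
then 0 to 9); the format itself does not define it yet, and <tt>ROW_ORDER</tt>,
<tt>punched_rows</tt> and <tt>column_to_int</tt> are hypothetical names.</p>

```python
# Assumed mapping: string position 0 is row 12, position 11 is row 9.
ROW_ORDER = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]


def punched_rows(value):
    """Return the names of the punched rows for a 12-character 0/1 string."""
    if len(value) != 12 or set(value) - {"0", "1"}:
        raise ValueError("expected exactly 12 characters, each 0 or 1")
    return [row for row, bit in zip(ROW_ORDER, value) if bit == "1"]


def column_to_int(value):
    """Interpret the string as a 12-bit number, leftmost character most significant."""
    if len(value) != 12 or set(value) - {"0", "1"}:
        raise ValueError("expected exactly 12 characters, each 0 or 1")
    return int(value, 2)
```

<p>Under this assumed convention, <tt>punched_rows("100100100000")</tt> yields the rows 12, 1 and 4.
Any other fixed order would work just as well, as long as the file format documents it.</p>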
| 86 | |
---|
| 87 | <h4>Advantages</h4> |
---|
| 88 | <ul> |
---|
| 89 | <li><b>Good text editor support</b></li> |
---|
| 90 | <li>Most <b>unambiguous model</b> of a punch card</li> |
---|
| 91 | <li>Very good <b>meta data</b> support in all respects</li> |
---|
| 92 | </ul> |
---|
| 93 | |
---|
| 94 | <h4>Disadvantages</h4> |
---|
| 95 | <ul> |
---|
| 96 | <li>A lot of overhead data, which blows up the file size enormously. This can be |
---|
| 97 | reduced to very small sizes via compression (like gzip on the fly)</li> |
---|
| 98 | </ul> |
---|
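<p>How compression on the fly might look is sketched below with Python's standard <tt>gzip</tt>
module. The helper <tt>build_deck_xml</tt> is hypothetical and writes only the compact
<tt>value="..."</tt> variant of the column element; the actual saving depends on the card contents,
so no concrete numbers are claimed here.</p>

```python
import gzip


def build_deck_xml(columns):
    """Serialize one card with the given 12-bit column values as XML bytes."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8" ?>',
        '<card-deck xmlns="http://dev.technikum29.de/2009/punch-card-markup-language">',
        "  <card>",
    ]
    for value in columns:
        # format each column as a 12-character string of 0s and 1s
        lines.append('    <column value="{:012b}" />'.format(value))
    lines += ["  </card>", "</card-deck>"]
    return "\n".join(lines).encode("utf-8")


raw = build_deck_xml([0] * 80)       # one blank 80-column card
packed = gzip.compress(raw)          # what would be written to disk
restored = gzip.decompress(packed)   # what a reader would parse again
```

<p>Since the markup is highly repetitive, <tt>packed</tt> is considerably smaller than
<tt>raw</tt>, while <tt>restored</tt> is identical to the original XML.</p>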
| 99 | |
---|
| 100 | <h3>Other, more exotic formats</h3> |
---|
| 101 | |
---|
| 102 | <ul> |
---|
| 103 | <li>Any <b>text formats</b> where the card is represented in a textual form like |
---|
| 104 | <pre> |
---|
| 105 | /12345679012 |
---|
| 106 | |00000100001 |
---|
| 107 | |00011000000 |
---|
| 108 | |... |
---|
| 109 | </pre> |
---|
| 110 | They are easy to edit with text editors and have much less overhead than |
---|
| 111 | the XML version. They would be ideal I/O formats for scripts |
---|
| 112 | (like perl scripts), but not for C code. Furthermore, this approach is not very clean.</li> |
---|
| 113 | <li><b>Bitmaps</b>, yes, pixel images. This is a fancy method that came to my mind |
---|
| 114 | some time ago. It would have been a perfect idea for paper tapes (for paper tape |
---|
| 115 | fonts, etc.), but "drawing" text via holes is not very common on punch cards, so |
---|
| 116 | there's no serious advantage of this format over any other binary format |
---|
| 117 | (except that it can be edited with a bitmap editor, hehe)</li> |
---|
| 118 | <li><b>CSV</b>, a classical export file format if someone wants to edit punch cards |
---|
| 119 | with Excel, but not necessarily a first-class storage format for punch cards</li> |
---|
| 120 | <li>Simple <b>80-column text</b>, likewise only for import/export to real punch card |
---|
| 121 | files</li> |
---|
| 122 | </ul> |
---|
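<p>A textual format along the lines of the first item could be read and written by a script roughly
like this. The exact layout is an assumption for the sake of the example: one line per column,
starting with <tt>|</tt> and followed by 12 characters (<tt>1</tt> = hole), with lines starting with
<tt>/</tt> treated as a header and ignored. <tt>card_to_text</tt> and <tt>text_to_card</tt> are
hypothetical names.</p>

```python
def card_to_text(columns):
    """Write one line per column: '|' followed by 12 characters, 1 = hole."""
    lines = ["/" + "-" * 12]                       # header line, ignored when reading
    for value in columns:
        if not 0 <= value < 4096:
            raise ValueError("column values must fit in 12 bits")
        lines.append("|" + format(value, "012b"))
    return "\n".join(lines)


def text_to_card(text):
    """Read the column lines back into a list of 12-bit integers."""
    columns = []
    for line in text.splitlines():
        if not line.startswith("|"):
            continue                               # skip header and blank lines
        bits = line[1:].strip()
        if len(bits) != 12 or set(bits) - {"0", "1"}:
            raise ValueError("bad column line: " + repr(line))
        columns.append(int(bits, 2))
    return columns
```

<p>Such a file stays editable in any text editor, and a card survives a round trip through
<tt>card_to_text</tt> and <tt>text_to_card</tt> unchanged.</p>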