File Formats for Punch Cards

Since punch cards have a rather special layout (usually 80 columns of 12 bits each), they cannot simply be saved as a binary byte stream the way 8-bit paper tapes are usually stored on computers (at least by my programs). Apparently not many people have run into this problem in the past: I could find only one (!) document that addresses it.

The Jones Punched Card File Format

The punched card emulation proposal by Douglas W. Jones is the only document I could find on the World Wide Web that deals with this problem. Jones has written some utilities in C to work with emulated punched card decks (I simply call them "punch card files"). It is a highly compact binary format with a 3-byte header (holding a magic string such as H80 or H82) that precedes every card file. The file itself is simply a concatenation of punch cards. Every punch card consists of a 3-byte header (filled with flags that are supposed to describe the design of the punch card) followed by the 80 columns. Two 12-bit columns are packed into three octets in big-endian order.
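The packing of two 12-bit columns into three octets can be illustrated with a short Python sketch. It is not taken from Jones' C utilities: the function names are made up for this example, and the exact nibble layout (first column in the upper 12 bits of the 24-bit group) is my reading of "big-endian" here, so check it against the original proposal before relying on it. The file header and the per-card flag bytes are deliberately left out.

```python
def pack_columns(columns):
    """Pack 12-bit column values into bytes, two columns per three octets.

    Assumed layout: the first column occupies the upper 12 bits of each
    24-bit group, the second column the lower 12 bits.
    """
    if len(columns) % 2:
        raise ValueError("need an even number of columns")
    out = bytearray()
    for a, b in zip(columns[0::2], columns[1::2]):
        if not (0 <= a < 4096 and 0 <= b < 4096):
            raise ValueError("column values must fit into 12 bits")
        out.append(a >> 4)                          # upper 8 bits of first column
        out.append(((a & 0x0F) << 4) | (b >> 8))    # lower 4 of first, upper 4 of second
        out.append(b & 0xFF)                        # lower 8 bits of second column
    return bytes(out)


def unpack_columns(data):
    """Inverse of pack_columns: three octets become two 12-bit values."""
    if len(data) % 3:
        raise ValueError("packed data length must be a multiple of 3")
    columns = []
    for i in range(0, len(data), 3):
        b0, b1, b2 = data[i:i + 3]
        columns.append((b0 << 4) | (b1 >> 4))
        columns.append(((b1 & 0x0F) << 8) | b2)
    return columns
```

With this layout, the 80 columns of one card take 120 bytes, so a card including its 3-byte header occupies 123 bytes.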

This model of a punch card stack has the following advantages and disadvantages:

Advantages

Disadvantages

The Card Markup Language (XML)

Modeling a punch card with XML is a contemporary idea. Basically, I think such a file could look like this:

<?xml version="1.0" encoding="UTF-8" ?>
<card-deck xmlns="http://dev.technikum29.de/2009/punch-card-markup-language">
	<card>
		<meta name="generator" value="Punch Card Reader Xyz" />
		<meta name="scanned" value="15.08.09T12:35:42" />
		
		<property key="Reader.Error" raw="-1235">Error:Foobar</property>
		
		<property key="Jones.Color" raw="0100">cream</property>
		<property key="Jones.Corner" raw="0">round</property>
		<property key="Jones.Cut" raw="1" />
		<property key="Jones.Header" value="&#01;&#02;&#05;" />
		<comment type="text/html">
			<!-- easily exportable thanks to the Qt rich-text widget -->
			<html:p>
				Rich-text content can be written here
				without any problems.
			</html:p>
		</comment>
		<column value="101101101010">
			<label>A</label>
		</column>
		<column value="100100100100">
			<label>&specialfoo;</label>
		</column>
		<column value="100100100000" label="F" />
		<column bit0="0" bit1="1" bit2="1" bit3="1" bit4="1" bit5="0" bit6="0" bit7="0" bit8="0" bit9="0" bit10="0" bit11="1">
		</column>
	</card>

	...
</card-deck>

As you might notice, there are different possibilities for modeling a card. There is, for example, the most elaborate variant (<column bit0="1" bit1="0" ...). Using that method, every punch card costs about 10 kByte. There are cheaper methods like the "boolean string" 0101..., but then it needs to be defined which position in the string corresponds to which row on the punch card.
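To make the ordering question concrete, here is a minimal Python sketch that reads the "boolean string" variant. The row order used (leftmost character is row 12, then 11, 0, 1 ... 9, i.e. top of the card to bottom) is purely an assumption for this example, since the format above does not define it yet. The helper names are hypothetical as well; only the namespace URI and the column element with its value attribute come from the sample file.

```python
import xml.etree.ElementTree as ET

NS = "http://dev.technikum29.de/2009/punch-card-markup-language"

# ASSUMPTION: leftmost character of the string is the top row of the card.
ROW_ORDER = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]


def punched_rows(value):
    """Return the names of the rows that are punched in a 12-character string."""
    if len(value) != 12 or set(value) - {"0", "1"}:
        raise ValueError("expected exactly 12 characters, each '0' or '1'")
    return [row for row, ch in zip(ROW_ORDER, value) if ch == "1"]


def column_to_int(value):
    """Interpret the string as a 12-bit number, leftmost character first."""
    punched_rows(value)  # reuse the validation
    return int(value, 2)


def read_columns(xml_text):
    """Collect the value strings of all <column> elements that have one."""
    root = ET.fromstring(xml_text)
    values = (col.get("value") for col in root.iter("{%s}column" % NS))
    return [v for v in values if v is not None]


SAMPLE = """<card-deck xmlns="http://dev.technikum29.de/2009/punch-card-markup-language">
  <card>
    <column value="101101101010"><label>A</label></column>
    <column value="100100100000" label="F" />
  </card>
</card-deck>"""

for value in read_columns(SAMPLE):
    print(value, punched_rows(value), column_to_int(value))
```

A different convention (for example, leftmost character is row 9) would only change ROW_ORDER, which is exactly why the format has to pin this down somewhere.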

Advantages

Disadvantages

Other, more exotic formats