<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.0//EN" "html.dtd">
|
|
<HTML>
|
|
<HEAD><TITLE>
|
|
Hercules: Compressed Dasd Emulation</TITLE>
|
|
<LINK REL=STYLESHEET TYPE="text/css" HREF="hercules.css">
|
|
</HEAD>
|
|
<BODY BGCOLOR="#ffffcc" TEXT="#000000" LINK="#0000A0"
|
|
VLINK="#008040" ALINK="#000000">
|
|
<h1>Compressed Dasd Emulation</h1>
|
|
<hr noshade>
|
|
<h2>Contents</h2>
|
|
<ul>
|
|
<li><a href="#introduction"> Introduction </a>
|
|
<li><a href="#shadowfiles"> Shadow Files </a>
|
|
<li><a href="#filestructure"> File Structure </a>
|
|
<li><a href="#howitworks"> How It Works </a>
|
|
<li><a href="#cckdcommand"> The CCKD Command </a>
|
|
<li><a href="#utilities"> Utilities </a>
|
|
<li><a href="#faq"> FAQ </a>
|
|
</ul>
|
|
|
|
<hr noshade>
|
|
<h3><a NAME="introduction">Introduction</a></h3>
|
|
Using compressed DASD files you can significantly reduce the file space
|
|
required for emulated DASD files and possibly gain a performance boost
|
|
because less physical I/O occurs.
|
|
|
|
Both <b>CKD</b> (Count-Key-Data) and <b>FBA</b> (Fixed-Block-Architecture)
|
|
emulation files can be compressed.
|
|
<p>
|
|
In regular (or uncompressed) files, each CKD track or FBA block occupies
|
|
a specific spot in the emulation file. The offset of the track or block
|
|
in the file can be directly calculated knowing the track or block number
|
|
and the maximum size of the track or block. In compressed files, each
|
|
track image or group of blocks may be compressed by
|
|
<a href="http://www.info-zip.org/pub/infozip/zlib/"><b>zlib</b></a> or
|
|
<a href="http://sourceware.cygnus.com/bzip2/"><b>bzip2</b></a>, and only
|
|
occupies the space necessary for the compressed image. The offset of a compressed
|
|
track or block is obtained by performing a two-table lookup. The lookup
|
|
tables themselves reside in the emulation file.
|
|
<p>
|
|
Because FBA blocks are 512 bytes in length, and that being a rather small
|
|
number, FBA blocks are grouped into <b>block groups</b>. Each block group
|
|
contains 120 FBA blocks (60K).
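<p>
As a small worked example of the grouping described above, the following
Python sketch maps an FBA block number to its block group and to the
position of the block within the uncompressed group data. The names
<code>locate_fba_block</code>, <code>FBA_BLOCK_SIZE</code> and
<code>BLOCKS_PER_GROUP</code> are illustrative only and are not taken from
the Hercules source.
<pre>
```python
FBA_BLOCK_SIZE = 512      # bytes per FBA block
BLOCKS_PER_GROUP = 120    # FBA blocks per block group

def locate_fba_block(block_number):
    """Return (block group number, byte offset of the block inside the
    uncompressed block group data)."""
    group, index = divmod(block_number, BLOCKS_PER_GROUP)
    return group, index * FBA_BLOCK_SIZE

# Size of one uncompressed block group: 120 * 512 = 61440 bytes (60K).
group_size = BLOCKS_PER_GROUP * FBA_BLOCK_SIZE
```
</pre>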
|
|
<p>
|
|
Whenever a track or block group is written to a compressed file,
|
|
it is written either to an existing free space within the file, or at
|
|
the end of the file, then the lookup tables are updated, and then the space the
|
|
track or block group previously occupied is freed. The location of a
|
|
track or block group in the file can change many times.
|
|
<p>
|
|
In the event of a catastrophic failure (for example, Hercules crash,
|
|
operating system crash, power failure), the compressed emulation file
|
|
on the host's physical disk may be out of sync if the host operating
|
|
system defers physical writes to the file system containing the emulation
|
|
file. A number of techniques have been provided to minimize emulation
|
|
file corruption in such an event.
|
|
<p>
|
|
A compressed file may occupy only 20% of the disk space required by an
|
|
uncompressed file. In other words, you may be able to have 5 times more
|
|
emulated volumes using compressed DASD files. However, compressed files
|
|
are more sensitive to failures and corruption may occur.
|
|
|
|
<p>
|
|
<hr noshade>
|
|
<p><h3><a NAME="shadowfiles">Shadow Files</a></h3>
|
|
|
|
|
|
A compressed CKD or FBA dasd can have more than one physical file. The
|
|
additional files are called <em>shadow files</em>.
|
|
The function is implemented as a kind of
|
|
<i>snapshot</i>, where a new shadow file can be created on demand.
|
|
An emulated dasd is represented by a <em>base</em> file and 0 or more
|
|
shadow files. All files are opened <em>read-only</em>
|
|
except for the <em>current</em> file, which is opened <em>read-write</em>.
|
|
<p>
|
|
Shadow files are specified by the <b>sf=</b><i>shadow-file-name</i> parameter
|
|
on the device statement for the compressed DASD device. The shadow file name
|
|
should have a spot where the shadow file number will be set. This is
|
|
either the character preceding the last period after the last slash or the
|
|
last character if there is no period. For example:<br><br>
|
|
<code>0100 3390 disks/linux1.dsk sf=shadows/linux1_*.dsk</code>
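<p>
The naming rule above can be sketched in Python as follows. This is only an
illustration of the rule as stated here; it assumes the shadow file number is
written as a single decimal digit in place of the character at the spot, and
the function name <code>shadow_file_name</code> is hypothetical.
<pre>
```python
def shadow_file_name(template, number):
    """Return the file name for shadow file `number` (1 to 8), given the
    name specified on the sf= parameter."""
    if number not in range(1, 9):
        raise ValueError("shadow file number must be between 1 and 8")
    slash = template.rfind("/")     # -1 when the name has no slash
    period = template.rfind(".")    # -1 when the name has no period
    if period > slash + 1:
        # Character preceding the last period after the last slash.
        spot = period - 1
    else:
        # No usable period: the last character is the spot.
        spot = len(template) - 1
    return template[:spot] + str(number) + template[spot + 1:]
```
</pre>
With the device statement shown above,
<code>shadow_file_name("shadows/linux1_*.dsk", 1)</code> gives
<code>shadows/linux1_1.dsk</code>.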
|
|
<p>
|
|
There can be up to 8 shadow files in use at any time for an
|
|
emulated dasd device. The base file is designated file<b>[0]</b> and
|
|
the shadow files are file<b>[1]</b> to file<b>[8]</b>.
|
|
The <em>highest</em> numbered file in use at a given time is the <em>current</em>
|
|
file, where all writes will occur. Track reads start with the <em>current</em>
|
|
file and proceed down until a file is found that actually contains the track
|
|
image.
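<p>
The read and write rules above can be modelled with a short Python sketch.
Each file is represented here by a dictionary that maps a track number to its
image; this model and the names <code>find_track</code> and
<code>write_track</code> are assumptions made for illustration, not the
Hercules implementation.
<pre>
```python
def find_track(files, track):
    """files[0] is the base file and files[-1] is the current file.
    Search from the current file down to the base file and return
    (file index, image) for the first file holding the track."""
    for index in range(len(files) - 1, -1, -1):
        if track in files[index]:
            return index, files[index][track]
    return None

def write_track(files, track, image):
    """All writes go to the current (highest numbered) file."""
    files[-1][track] = image

files = [{0: b"base0", 1: b"base1"}, {1: b"shadow1"}, {}]
write_track(files, 0, b"new0")      # lands in file[2], the current file
```
</pre>
After the write, track 0 is found in file[2] while the base file still holds
its original image. Dropping file[2] from the list without merging returns
track 0 to the base image, which is the snapshot behaviour described in this
section.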
|
|
<p>
|
|
A shadow file contains all the changes made to the emulated dasd
|
|
since it was created, until the next shadow file is created. The moment
|
|
of the shadow file's creation can be thought of as a <em>snapshot</em>
|
|
of the current emulated dasd at that time, because if the shadow file is
|
|
later removed, then the emulated dasd reverts back to the state it was at
|
|
when the <em>snapshot</em> was taken.
|
|
<p>
|
|
Using shadow files, you can keep the base file on a read-only device
|
|
such as cdrom, or change the base file attributes to read-only,
|
|
ensuring that this file can never be corrupted.
|
|
<p>
|
|
Hercules console commands are provided to add a new shadow file, remove
|
|
the current shadow file (with or without backward merge), compress the
|
|
current shadow file, and display the shadow file status and statistics:<br><br>
|
|
|
|
<table>
|
|
<tr><td align="left"><b>sf+</b></td>
|
|
<td align="left" colspan="2"><font size=-1>unit</font></td>
|
|
<td align="left">   Create a new shadow file</td>
|
|
<tr><td align="left"><b>sf-</b></td>
|
|
<td align="left" colspan="2"><font size=-1>unit</font></td>
|
|
<td align="left">   Remove a shadow file with backwards merge</td>
|
|
<tr><td align="left"><b>sf-</b></td>
|
|
<td align="left"> <font size=-1>unit</font></td>
|
|
<td><b>nomerge</b></td>
|
|
<td align="left">   Remove a shadow file without backwards merge</td>
|
|
<tr><td align="left"><b>sfc</b></td>
|
|
<td align="left" colspan="2"><font size=-1>unit</font></td>
|
|
<td align="left">   Compress the current file</td>
|
|
<tr><td align="left"><b>sfd</b></td>
|
|
<td align="left" colspan="2"><font size=-1>unit</font></td>
|
|
<td align="left">   Display shadow file status and statistics</td>
|
|
</table>
|
|
<br>
|
|
<b><font size=-1>Note</font></b>. You can use <b>*</b> in place of unit
|
|
address to apply the command to all compressed dasd.
|
|
|
|
<p>
|
|
<hr noshade>
|
|
<p><h3><a NAME="filestructure">Compressed DASD File Structure</a></h3>
|
|
|
|
A compressed DASD file has 6 types of spaces, a <em>device header</em>,
|
|
a <em>compressed device header</em>, a <em>primary lookup table</em>,
|
|
<em>secondary lookup tables</em>, track or block group <em>images</em>,
|
|
and <em>free spaces</em>. The first 3 types only occur once, at the
|
|
beginning of the file in order. The rest of the file is occupied by
|
|
the other 3 space types.
|
|
<p>
|
|
The first 512 bytes of a compressed DASD file contain a <b>device header</b>.
|
|
The device header contains an eye-catcher that identifies the file type
|
|
(CKD or FBA and base or shadow). The device type and file size are also
|
|
specified in this header. The header is identical to the header used
|
|
for uncompressed CKD files, except for the eye-catcher:
|
|
<p>
|
|
<table border=1>
|
|
<tr><td align="left" colspan="8"><font size=-1>devid</font></td>
|
|
<td align="left" colspan="4"><font size=-1>heads</font></td>
|
|
<td align="left" colspan="4"><font size=-1>trksize</font></td>
|
|
<tr><td align="left" colspan="1"><font size=-1>devt</font></td>
|
|
<td align="left" colspan="1"><font size=-1>seq</font></td>
|
|
<td align="left" colspan="2"><font size=-1>hicyl</font></td>
|
|
<td align="left" colspan="12"> </td>
|
|
<tr><td align="center" valign="middle" colspan="16">
|
|
<br><br><font size=-1>reserved</font><br><br><br></td>
|
|
</table>
|
|
<p>
|
|
|
|
The next 512 bytes contain the <b>compressed device header</b>.
|
|
This contains file usage information such as the amount of free
|
|
space in the file:
|
|
<p>
|
|
<table border=1>
|
|
<tr><td align="left" colspan="3"><font size=-1>vrm</font></td>
|
|
<td align="left" colspan="1"><font size=-1>opts</font></td>
|
|
<td align="left" colspan="4"><font size=-1>numl1</font></td>
|
|
<td align="left" colspan="4"><font size=-1>numl2</font></td>
|
|
<td align="left" colspan="4"><font size=-1>size</font></td>
|
|
<tr><td align="left" colspan="4"><font size=-1>used</font></td>
|
|
<td align="left" colspan="4"><font size=-1>->free</font></td>
|
|
<td align="left" colspan="4"><font size=-1>free</font></td>
|
|
<td align="left" colspan="4"><font size=-1>largest</font></td>
|
|
<tr><td align="left" colspan="4"><font size=-1>number</font></td>
|
|
<td align="left" colspan="4"><font size=-1> </font></td>
|
|
<td align="left" colspan="4"><font size=-1>cyls</font></td>
|
|
<td align="left" colspan="1"><font size=-1> </font></td>
|
|
<td align="left" colspan="1"><font size=-1>comp</font></td>
|
|
<td align="left" colspan="4"><font size=-1>parm</font></td>
|
|
<tr><td align="center" colspan="16">
|
|
<br><br><font size=-1>reserved</font><br><br><br></td>
|
|
</table>
|
|
<p>
|
|
After the compressed device header is the <b>primary lookup table</b>,
|
|
also called the <em>level 1 table</em> or <em>l1tab</em>. Each
|
|
4 byte unsigned entry in the l1tab contains the file offset of
|
|
a <em>secondary lookup table</em> or <em>level 2 table</em> or
|
|
<em>l2tab</em>. The track or block group number being accessed
|
|
divided by 256 gives the index into the l1tab. That is, each l1tab
|
|
entry represents 256 tracks or block groups. The number of entries
|
|
in the l1tab is dependent on the size of the emulated device:
|
|
<p>
|
|
<table border=1>
|
|
<tr><td align="left" colspan="4"><font size=-1>l2<sub>0</sub></font></td>
|
|
<td align="left" colspan="4"><font size=-1>l2<sub>1</sub></font></td>
|
|
<td align="left" colspan="4"><font size=-1>l2<sub>2</sub></font></td>
|
|
<td align="left" colspan="4"><font size=-1>l2<sub>3</sub></font></td>
|
|
<tr><td align="left" colspan="4"><font size=-1>l2<sub>4</sub></font></td>
|
|
<td align="left" colspan="4"><font size=-1>l2<sub>5</sub></font></td>
|
|
<td align="left" colspan="4"><font size=-1>l2<sub>6</sub></font></td>
|
|
<td align="left" colspan="4"><font size=-1>l2<sub>7</sub></font></td>
|
|
<tr><td align="left" colspan="16">
|
|
<br><br><center>.  .  .</center><br><br></td>
|
|
<tr><td align="left" colspan="4"><font size=-1>l2<sub>n-4</sub></font></td>
|
|
<td align="left" colspan="4"><font size=-1>l2<sub>n-3</sub></font></td>
|
|
<td align="left" colspan="4"><font size=-1>l2<sub>n-2</sub></font></td>
|
|
<td align="left" colspan="4"><font size=-1>l2<sub>n-1</sub></font></td>
|
|
</table>
|
|
<p>
|
|
Following the <em>l1tab</em>,
|
|
in no particular order, are <em>l2tabs</em>, track or block group
|
|
<em>images</em>, and <em>free spaces</em>.
|
|
<p>
|
|
Each <b>secondary lookup table</b> (or <em>l2tab</em>), contains 256 8-byte
|
|
entries. The entry is indexed
|
|
by the remainder of the track or block group number divided by 256. Each
|
|
entry contains an unsigned 4 byte offset and an unsigned 2 byte length of the
|
|
track or block group image:<p>
|
|
<table border=1>
|
|
<tr><td align="left" colspan="4"><font size=-1>
|
|
<sup>0</sup>  ->image
|
|
       </font></td>
|
|
<td align="left" colspan="2"><font size=-1>length</font></td>
|
|
<td align="left" colspan="2"><font size=-1>unused</font></td>
|
|
<tr><td align="left" colspan="4"><font size=-1>
|
|
<sup>1</sup>  ->image
|
|
       </font></td>
|
|
<td align="left" colspan="2"><font size=-1>length</font></td>
|
|
<td align="left" colspan="2"><font size=-1>unused</font></td>
|
|
<tr><td align="center" colspan="8"><font size=-1>
|
|
<br>.   .   .<br><br></td>
|
|
<tr><td align="left" colspan="4"><font size=-1>
|
|
<sup>255</sup>  ->image
|
|
       </font></td>
|
|
<td align="left" colspan="2"><font size=-1>length</font></td>
|
|
<td align="left" colspan="2"><font size=-1>unused</font></td>
|
|
</table>
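<p>
The two-table lookup can be sketched in Python as shown below. The sketch
works on an in-memory copy of the file and places the l1tab at offset 1024,
directly after the two 512-byte headers. Two points are assumptions that this
page does not state: the byte order of the table entries (little-endian is
used here purely for illustration) and the treatment of a zero l1tab entry as
"no l2tab". The function names are hypothetical.
<pre>
```python
L1TAB_OFFSET = 512 + 512      # l1tab follows the two 512-byte headers
L1_ENTRY_SIZE = 4             # one unsigned 4-byte file offset
L2_ENTRY_SIZE = 8             # 4-byte offset, 2-byte length, 2 bytes unused
ENTRIES_PER_L2TAB = 256
BYTE_ORDER = "little"         # assumption, not stated in this document

def read_uint(image, offset, size):
    return int.from_bytes(image[offset:offset + size], BYTE_ORDER)

def lookup(image, number):
    """Return (file offset, length) of the image for a track or block
    group number, or None when the l1tab entry is zero."""
    l1_index, l2_index = divmod(number, ENTRIES_PER_L2TAB)
    l2tab = read_uint(image, L1TAB_OFFSET + l1_index * L1_ENTRY_SIZE, 4)
    if l2tab == 0:
        return None           # assumed meaning: no l2tab for this range
    entry = l2tab + l2_index * L2_ENTRY_SIZE
    return read_uint(image, entry, 4), read_uint(image, entry + 4, 2)
```
</pre>
For track 300, the l1tab index is 1 (300 divided by 256) and the l2tab index
is 44 (the remainder).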
|
|
<p>
|
|
A track or block group <b>image</b> contains two fields, a 5-byte
|
|
<em>header</em> and a variable amount of data that may or may not be
|
|
compressed. The length in the l2tab entry includes the length of the
|
|
header and the data.
|
|
<p>
|
|
<table border=1>
|
|
<tr><td align="left"><font size=-1>hdr</font></td>
|
|
<td align="left"><font size=-1>track or block group data</font></td>
|
|
</table>
|
|
<p>
|
|
The 5 byte header contains a 1 byte flag field and 4 bytes that
|
|
identify the track or block group. The format of the identifier
|
|
depends on whether the emulated device is CKD or FBA:
|
|
<p>
|
|
CKD hdr
|
|
<table border=1>
|
|
<tr><td><font size=-1>flags</font></td>
|
|
<td align="left" colspan="2"><font size=-1><b>CC</b></font>  </td>
|
|
<td align="left" colspan="2"><font size=-1><b>HH</b></font>  </td>
|
|
</table>
|
|
<p>
|
|
The 2 byte CC is the cylinder number for the track image and the HH
|
|
is the head number. These numbers are stored in <em>big-endian</em>
|
|
byte order. When the flag byte is zeroed, the 5 byte header is identical
|
|
to the <em>Home Address</em> (or <em>HA</em>) for the track image.
|
|
The data, which may or may not be compressed, begins with the <em>R0</em>
|
|
count and ends with the <em>end-of-track</em> (or <em>eot</em>) marker,
|
|
which is a count field containing 8 0xff's. The <em>HA</em> plus the
|
|
uncompressed track data comprise the track image.
|
|
<p>
|
|
FBA hdr
|
|
<table border=1>
|
|
<tr><td><font size=-1>flags</font></td>
|
|
<td align="left" colspan="4"><font size=-1>nnnn</font>
|
|
       </td>
|
|
</table>
|
|
<p>
|
|
The 4 byte nnnn field is the FBA block group number in
|
|
<em>big-endian</em> byte order. The data contains 120 FBA blocks,
|
|
which may or may not be compressed. Uncompressed, the FBA block
|
|
group is 60K. The header for FBA, unlike CKD, is not used as part
|
|
of the uncompressed image.
|
|
<p>
|
|
The flags byte contains 8 bits in the format
|
|
<table border=1>
|
|
<tr><td><font size=-1>0   0   0   0  
|
|
0   0   <b>c</b>   <b>c</b>  
|
|
</font></td>
|
|
</table>
|
|
The first 6 bits are always zero but may be used in future releases.
|
|
The last two bits, <em>cc</em>, indicate the compression algorithm
|
|
for the data portion:
|
|
<table border="1">
|
|
<tr><td>0   0</td><td>    Data is uncompressed</td>
|
|
<tr><td>0   1</td><td>    Data is compressed using zlib</td>
|
|
<tr><td>1   0</td><td>    Data is compressed using bzip2</td>
|
|
<tr><td>1   1</td><td>    Not valid</td>
|
|
</table>
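<p>
A minimal Python sketch of decoding the 5-byte header described above
follows. It takes the compression type from the two low-order bits of the
flags byte and reads the identifier in big-endian order. The function name,
the returned dictionary keys and the choice to ignore the six reserved bits
are illustrative assumptions.
<pre>
```python
COMPRESSION = {0: "none", 1: "zlib", 2: "bzip2"}

def parse_image_header(header, is_ckd):
    """Decode a 5-byte track or block group image header."""
    if len(header) != 5:
        raise ValueError("header must be exactly 5 bytes")
    comp = header[0] % 4          # last two bits of the flags byte
    if comp not in COMPRESSION:
        raise ValueError("compression bits 1 1 are not valid")
    if is_ckd:
        ident = {
            "cyl": int.from_bytes(header[1:3], "big"),    # CC
            "head": int.from_bytes(header[3:5], "big"),   # HH
        }
    else:
        ident = {"block_group": int.from_bytes(header[1:5], "big")}
    return COMPRESSION[comp], ident
```
</pre>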
|
|
|
|
<p>
|
|
|
|
<b>Free space</b> contains a 4-byte <i>offset</i> to the next free space, a
|
|
4-byte <i>length</i> of the free space, and zero or more bytes of residual data:
|
|
<p>
|
|
<table border=1>
|
|
<tr><td><font size=-1>->next</font></td>
|
|
<td><font size=-1>length</font></td>
|
|
<td>   <font size=-1>residual</font>   </td>
|
|
</table>
|
|
<p>
|
|
The minimum length of a free space is 8 bytes.
|
|
The free space chain is ordered by file offset and no two free spaces are
|
|
adjacent. The <em>compressed device header</em> contains the offset to the
|
|
first free space. The chain is terminated when a free space has zero offset
|
|
to the next free space. The free space chain is read when the file is opened
|
|
for read-write and written when the file is closed; while the file is opened,
|
|
the free space chain is maintained in storage.
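<p>
Walking the free space chain on an in-memory copy of the file might look like
the following Python sketch. The byte order of the two 4-byte fields is not
given on this page, so little-endian is assumed for illustration only; the
function names are hypothetical.
<pre>
```python
BYTE_ORDER = "little"     # assumption, not stated in this document
MIN_FREE_SPACE = 8        # minimum length of a free space

def walk_free_chain(image, first_free):
    """Follow the chain from the offset held in the compressed device
    header and return a list of (offset, length) pairs. A zero offset
    terminates the chain."""
    spaces = []
    offset = first_free
    while offset != 0:
        next_offset = int.from_bytes(image[offset:offset + 4], BYTE_ORDER)
        length = int.from_bytes(image[offset + 4:offset + 8], BYTE_ORDER)
        spaces.append((offset, length))
        offset = next_offset
    return spaces

def chain_is_valid(spaces):
    """Check the stated properties: each space is at least 8 bytes, the
    chain is ordered by offset, and no two free spaces are adjacent."""
    previous_end = 0
    for offset, length in spaces:
        if MIN_FREE_SPACE > length or previous_end >= offset:
            return False
        previous_end = offset + length
    return True
```
</pre>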
|
|
<p>
|
|
|
|
<hr noshade>
|
|
<p><h3><a NAME="howitworks">How It Works</a></h3>
|
|
|
|
<b>Reading</b><br>
|
|
A track or block group image is read while executing a channel
|
|
program or by the <em>readahead</em> thread. An image has to
|
|
be read before it is updated or written to. An image may be <em>cached</em>.
|
|
If an image is cached, then the channel program may complete
|
|
<em>synchronously</em>. This means that if all the data a channel program
|
|
accesses is cached and Hercules does not have to perform physical I/O,
|
|
then the channel program runs synchronously within the SSCH or SIO
|
|
instruction in the <em>CPU</em> thread. All DASD channel programs are started
|
|
synchronously. If a CCW in the channel program requires physical I/O
|
|
then the channel program is interrupted and restarted at that CCW
|
|
<em>asynchronously</em> in a <em>device I/O</em> thread.
|
|
<p>
|
|
All compressed devices share a common cache; the devices can be a mixture
|
|
of FBA and/or CKD device types. Each cache entry contains a pointer to
|
|
a 64K buffer containing an uncompressed track or block group image.
|
|
If the track or block group image being read is not found in the cache,
|
|
then the oldest (or <em>least recently used</em> or <em>LRU</em>) entry that
|
|
is not <em>busy</em> is <em>stolen</em>. A cache entry is busy if it is
|
|
being read, or last accessed by an <em>active</em> channel program, or updated
|
|
but not yet written, or being written. If no cache entries are available then
|
|
the read must enter a <em>cache wait</em>. When sequential access to images
is detected, the readahead thread(s) may be signalled to read the
|
|
following sequential images.
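<p>
The selection of a cache entry to steal can be sketched in Python as below.
The cache is modelled as a list of dictionaries with a
<code>last_used</code> timestamp and a <code>busy</code> flag; this model and
the name <code>steal_entry</code> are assumptions for illustration, not the
Hercules data structures.
<pre>
```python
def steal_entry(cache):
    """Return the index of the least recently used entry that is not
    busy, or None when every entry is busy (the caller would then
    enter a cache wait)."""
    candidates = [i for i, entry in enumerate(cache) if not entry["busy"]]
    if not candidates:
        return None
    return min(candidates, key=lambda i: cache[i]["last_used"])

cache = [
    {"last_used": 5, "busy": False},
    {"last_used": 1, "busy": True},    # oldest, but busy: not eligible
    {"last_used": 3, "busy": False},
]
victim = steal_entry(cache)            # index 2
```
</pre>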
|
|
<p>
|
|
<b>Writing</b><br>
|
|
When a cache entry is updated or written to, a bit is turned on indicating
|
|
the cache entry has been updated. When a <em>cache wait</em> occurs, or
|
|
(more likely) during garbage collection, a cache <em>flush</em> is performed.
|
|
When the cache is flushed, if any entries have the updated bit on, then
|
|
the writer thread(s) are signalled. The writer thread selects the oldest
|
|
cache entry with the updated bit on, compresses the image, and writes it
|
|
to the file. The new image is written to a new space in the file and then
|
|
the space previously occupied by the image is freed. In certain circumstances,
|
|
the image may be written under <em>stress</em>. A stress write occurs when
|
|
a reading thread is in a <em>cache wait</em> or when a high percentage of
|
|
cache entries are pending write. In this circumstance, the compression
|
|
parameters are relaxed to reduce the CPU requirements. An image written
|
|
under stress is likely to take up more space than the same image written
|
|
not under stress. The writer thread(s) run 1 nicer than the CPU thread(s);
|
|
compression is a CPU intensive activity.
|
|
<p>
|
|
<b>Garbage Collection</b><br>
|
|
The primary function of the garbage collector is to keep the emulated
|
|
compressed DASD files as small as possible. After all, that is the reason
|
|
for using compressed DASD files in the first place. Another function
|
|
is to perform emulation file synchronization.
|
|
<p>
|
|
A single garbage collector thread runs for all compressed devices.
|
|
By default it wakes up at 5 second intervals. The garbage collector
|
|
performs <em>space recovery</em> for each compressed device in the order
|
|
that the device was defined or attached. After space recovery the garbage
|
|
collector flushes the cache to force all outstanding writes. Once all the
|
|
writes have been completed, a file synchronization (<em>fsync()</em>) may
|
|
optionally be performed, which commits any outstanding host I/O to the
|
|
physical disk. Finally free space is flushed (to be explained later).
|
|
<p>
|
|
With the fsync option enabled, the physical disk holds a coherent
emulation file at the end of each garbage collection cycle.
|
|
Space freed since the last garbage collection cycle completed is not
|
|
available for allocation until the current garbage collection cycle
|
|
completes. This free space is called <em>pending free space</em>.
|
|
That is, previous track or block group images are not overwritten
|
|
until the current garbage collection completes.
|
|
If a catastrophic error occurs, then the emulation file should be
|
|
recoverable at least up to the point of the last garbage collection cycle.
|
|
<p>
|
|
However, performing an fsync() may decrease performance. You can increase
|
|
the garbage collection interval to reduce the number of fsync()s, but this
may also increase the probability of a cache wait occurring. You can increase the
|
|
size of the cache to decrease this probability, but you may increase paging or
|
|
have to decrease the size of emulated memory.
|
|
<p>
|
|
Another possibility is to not enable the fsync option. This is the default.
|
|
In this circumstance, by default, freed space is not available until 2
|
|
garbage collection cycles complete. That is, <i>pending free space</i> is
|
|
not an attribute but a count. You have the option to explicitly set the
|
|
pending free space count. However, increasing the pending free space count
or the garbage collection interval may increase the size of
the emulation file.
|
|
<p>
|
|
At the very end of the garbage collection cycle, the free space is
|
|
<em>flushed</em>. This means that the pending free space count is decremented
|
|
for all free spaces with a non-zero count. If the count goes to zero and
|
|
the preceding space is a free space with a zero count then the spaces are
|
|
combined.
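<p>
The free space flush can be illustrated with the Python sketch below. Free
spaces are modelled as dictionaries ordered by file offset, and "preceding
space" is taken to mean a free space that ends exactly where the current one
begins. That reading, the field names and the function name are assumptions
made for the example.
<pre>
```python
def flush_free_space(spaces):
    """Decrement every non-zero pending count. A space whose count
    reaches zero is combined with the preceding free space when that
    space is adjacent and also has a zero count."""
    flushed = []
    for space in spaces:
        space = dict(space)                 # leave the input untouched
        reached_zero = False
        if space["pending"] > 0:
            space["pending"] -= 1
            reached_zero = space["pending"] == 0
        previous = flushed[-1] if flushed else None
        if (reached_zero and previous is not None
                and previous["pending"] == 0
                and previous["offset"] + previous["length"] == space["offset"]):
            previous["length"] += space["length"]
        else:
            flushed.append(space)
    return flushed
```
</pre>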
|
|
<P>
|
|
The space recovery process of the garbage collector simply attempts to move
|
|
some amount of used space towards the beginning of the file causing free
|
|
space to move towards the end of the file. When a free space reaches the
|
|
end of the file, the file is <em>truncated</em>, reducing its size. The
|
|
amount of used space moved depends on the ratio of free space to used space
|
|
and on the number of free spaces. The larger the numbers, the more space
|
|
the garbage collector attempts to move. That is, the garbage collector
|
|
attempts to decrease the ratio of free space to used space and to decrease
|
|
the number of free spaces. Within a cycle, the garbage collector might not
|
|
move the selected amount of used space if the moves are detected to be
|
|
counter-productive (i.e. the offset of the new space is greater than the
|
|
current offset).
|
|
|
|
<hr noshade>
|
|
<p><h3><a NAME="cckdcommand">The cckd command</a></h3>
|
|
|
|
The <b>cckd</b> command and initialization statement can be used to
|
|
affect cckd processing. Normally the defaults should suffice; however
|
|
the cache size may need to be adjusted depending upon the number of
|
|
emulated devices and the amount of physical memory you have.
|
|
<p>
|
|
<b>Syntax:</b>
|
|
<table>
|
|
<tr><td><b>cckd</b></td><td><b>help</b></td><td>Display cckd help</td>
|
|
<tr><td><b>cckd</b></td><td><b>stats</b></td>
|
|
<td>Display current cckd statistics</td>
|
|
<tr><td><b>cckd</b></td><td><b>opts</b></td><td>Display current cckd options</td>
|
|
<tr><td><b>cckd</b></td><td>opt=value</td><td>Set a cckd option</td>
|
|
<tr><td> </td><td> </td><td>Multiple options may be specified,
|
|
separated by a comma with no intervening
|
|
blanks.</td>
|
|
<tr><td> </td><td><b>cache=</b>n</td><td>Cache size in M</td>
|
|
<tr><td> </td><td><b>l2cache=</b>n</td><td>L2 cache size in K</td>
|
|
<tr><td> </td><td><b>ra=</b>n</td><td>Number of readahead threads</td>
|
|
<tr><td> </td><td><b>raq=</b>n</td><td>Readahead queue size</td>
|
|
<tr><td> </td><td><b>rat=</b>n</td><td>Number of tracks to readahead</td>
|
|
<tr><td> </td><td><b>wr=</b>n</td><td>Number of writer threads</td>
|
|
<tr><td> </td><td><b>gcint=</b>n</td><td>Garbage collection interval</td>
|
|
<tr><td> </td><td><b>gcparm=</b>n</td><td>Garbage collection parameter</td>
|
|
<tr><td> </td><td><b>nostress=</b>n</td><td>Turn stress writes on or off</td>
|
|
<tr><td> </td><td><b>freepend=</b>n</td><td>Set the free pending value</td>
|
|
<tr><td> </td><td><b>fsync=</b>n</td><td>Turn fsync on or off</td>
|
|
<tr><td> </td><td><b>ftruncwa=</b>n</td><td>Turn ftruncate bug workaround
|
|
on or off</td>
|
|
<tr><td> </td><td><b>trace=</b>n</td><td>Number of trace table entries</td>
|
|
</table>
|
|
<p>
|
|
<b>Options:</b>
|
|
<table>
|
|
<tr><td valign="top"><b>cache=</b>n</td>
|
|
<td>Size of the cache in megabytes. Each cache entry points
|
|
to a 64K buffer. Therefore each megabyte represents 16 cache entries.
|
|
<p>
|
|
The default is <b>8</b>, or 128 cache entries.
|
|
<p>
|
|
You can specify a number between <b>1</b> and <b>64</b> (16 to
|
|
1024 cache entries).
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>l2cache=</b>n </td>
|
|
<td>Size of the level 2 table cache in kilobytes.
|
|
Each cache entry points to a 2K l2tab. Therefore each 2K
|
|
represents a single cache entry.
|
|
<p>
|
|
The default is <b>512</b>, or 256 cache entries.
|
|
<p>
|
|
You can specify a number between <b>256</b> and <b>2048</b> (128 to
|
|
1024 cache entries).
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>ra=</b>n</td>
|
|
<td>Number of readahead threads. When sequential track or block group
|
|
access is detected, some number (<em>rat= </em>) of tracks or
|
|
block groups are queued (<em>raq= </em>) to be read by one of the
|
|
readahead threads.
|
|
<p>
|
|
The default is <b>2</b>.
|
|
<p>
|
|
You can specify a number between <b>1</b> and <b>9</b>.
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>raq=</b>n</td>
|
|
<td>Size of the readahead queue. When sequential track or block group
|
|
access is detected, some number (<em>rat= </em>) of tracks or
|
|
block groups are queued in the readahead queue.
|
|
<p>
|
|
The default is <b>4</b>.
|
|
<p>
|
|
You can specify a number between <b>0</b> and <b>16</b> (a value
|
|
of zero disables readahead).
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>rat=</b>n</td>
|
|
<td>Number of tracks or block groups to read ahead when sequential access
|
|
has been detected.
|
|
<p>
|
|
The default is <b>2</b>.
|
|
<p>
|
|
You can specify a number between <b>0</b> and <b>16</b> (a value
|
|
of zero disables readahead).
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>wr=</b>n</td>
|
|
<td>Number of writer threads. When the cache is <em>flushed</em> updated
|
|
cache entries are marked write pending and a writer thread is signalled.
|
|
The writer thread compresses the track or block group and writes the
|
|
compressed image to the emulation file. A writer thread is cpu-intensive
|
|
while compressing the track or block group and i/o-intensive while writing
|
|
the compressed image. The writer thread runs one <em>nicer</em> than
|
|
the CPU thread(s).
|
|
<p>
|
|
The default is <b>2</b>.
|
|
<p>
|
|
You can specify a number between <b>1</b> and <b>9</b>.
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>gcint=</b>n</td>
|
|
<td>Number of seconds the garbage collector thread waits during an interval.
|
|
At the end of an interval, the garbage collector performs space recovery,
|
|
flushes the cache, and optionally <em>fsync</em>s the emulation file.
|
|
(However, the file will not be <em>fsync</em>ed unless at least 5
|
|
seconds have elapsed since the last <em>fsync</em>).
|
|
<p>
|
|
The default is <b>5</b> seconds.
|
|
<p>
|
|
You can specify a number between <b>1</b> and <b>60</b>.
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>gcparm=</b>n</td>
|
|
<td>A value affecting the amount of data moved during the garbage collector's
|
|
space recovery routine. The garbage collector determines an amount of
|
|
space to move based on the ratio of free space to used space in an
|
|
emulation file, and on the number of free spaces in the file. (The
|
|
garbage collector wants to reduce the free space to used space ratio
|
|
and the number of free spaces). The value is logarithmic; a value
|
|
of 8 means moving 2<sup>8</sup> times the selected value while a negative
|
|
value similarly decreases the amount to be moved. Normally, 256K
|
|
will be moved for a file in an interval. Specifying a value of 8 can
|
|
increase the amount to 64M. At least 64K will be moved. Interestingly,
|
|
specifying a large value (such as 8) may not increase the garbage
|
|
collection efficiency correspondingly.
|
|
<p>
|
|
The default is <b>0</b>.
|
|
<p>
|
|
You can specify a number between <b>-8</b> and <b>8</b>.
|
|
<p>
</td>
|
|
<tr><td valign="top"><b>nostress=</b>n </td>
|
|
<td>Indicates whether <em>stress</em> writes will occur or not. A track
|
|
or block group may be written under stress when a high percentage of
|
|
the cache is pending write or when a device i/o thread is waiting for
|
|
a cache entry. When a stressed write occurs, the compression algorithm
|
|
and/or compression parm may be relaxed, resulting in faster compression
|
|
but usually a larger compressed image. If <em>nostress</em> is set
|
|
to one, then a stressed situation is ignored. You would typically
|
|
set this value to one when you want to create the smallest emulation file
|
|
possible in exchange for a possible performance degradation.
|
|
<p>
|
|
The default is <b>0</b>.
|
|
<p>
|
|
You can specify <b>0</b> (enable stressed writes) or <b>1</b>
|
|
(disable stressed writes).
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>freepend=</b>n </td>
|
|
<td>Specifies the <em>free pending</em> value for freed space. When a
|
|
track or block group image is written the space it previously occupied
|
|
is freed. This space will not be available for future
|
|
allocations until <em>n</em> garbage collection intervals have completed.
|
|
In the event of a catastrophic failure, previously written track or
|
|
block group images should be recoverable if the current image has
|
|
not yet been written to the physical disk. By default the value
|
|
is set to <b>-1</b>. This means that if <em>fsync</em> is specified
|
|
then the value is 1 otherwise it is 2. If 0 is specified then freed
|
|
space is immediately available for new allocations.
|
|
<p>
|
|
The default is <b>-1</b>.
|
|
<p>
|
|
You can specify a number between <b>-1</b> and <b>4</b>.
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>fsync=</b>n </td>
|
|
<td>Enables or disables <em>fsync</em>. When fsync is enabled, then
|
|
the disk emulation file is synchronized with the physical hard
|
|
disk at the end of a garbage collection interval (however, no more
|
|
often than 5 seconds). This means that, provided <em>freepend</em> is
non-zero, the emulated disks <em>should</em> be recoverable in a coherent state
after a catastrophic error. However, fsync may cause
|
|
performance degradation depending on the host operating system and/or
|
|
the host operating system level.
|
|
<p>
|
|
The default is <b>0</b> (fsync disabled).
|
|
<p>
|
|
You can specify <b>0</b> (disable fsync) or <b>1</b> (enable fsync).
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>ftruncwa=</b>n </td>
|
|
<td>Work-around for a linux kernel bug in 2.4.18 (shipped in at least RH7.3
|
|
and RH8.0). The symptom is an excessive amount of kernel cpu time and
|
|
non-responsiveness of the associated hercules emulated dasd file.
|
|
The problem may still occur with this option turned on, although
|
|
less frequently.<br>
|
|
The problem appears to be fixed in 2.4.19.
|
|
<p>
|
|
The default is <b>0</b>.
|
|
<p>
|
|
You can specify <b>0</b> or <b>1</b> (enable workaround).
|
|
<p>
|
|
</td>
|
|
<tr><td valign="top"><b>trace=</b>n </td>
|
|
<td>Number of cckd trace entries. You would normally specify a non-zero
|
|
value when debugging or capturing a problem in cckd code. When the
|
|
problem occurs, you should enter the <b>k</b> Hercules console command
|
|
which will print the trace table entries.
|
|
<p>
|
|
The default is <b>0</b>.
|
|
<p>
|
|
You can specify a number between <b>0</b> and <b>200000</b>.
|
|
Each entry represents 128 bytes. Normally, for debugging, I use
|
|
100000.
|
|
<p>
|
|
</td>
|
|
</table>
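<p>
The ranges and arithmetic in the table above can be collected into a small
Python helper that checks option values and builds a <b>cckd</b> command
string with comma-separated options. The helper names and the dictionary are
hypothetical conveniences written for this example; only the ranges, the
64K and 2K entry sizes, and the <em>freepend</em> default rule come from the
text above.
<pre>
```python
CCKD_RANGES = {
    "cache": (1, 64), "l2cache": (256, 2048), "ra": (1, 9),
    "raq": (0, 16), "rat": (0, 16), "wr": (1, 9), "gcint": (1, 60),
    "gcparm": (-8, 8), "nostress": (0, 1), "freepend": (-1, 4),
    "fsync": (0, 1), "ftruncwa": (0, 1), "trace": (0, 200000),
}

def cckd_command(**options):
    """Validate integer option values and return the command text."""
    parts = []
    for name, value in options.items():
        low, high = CCKD_RANGES[name]      # KeyError for unknown options
        if value not in range(low, high + 1):
            raise ValueError(f"{name} must be between {low} and {high}")
        parts.append(f"{name}={value}")
    return "cckd " + ",".join(parts)       # commas, no intervening blanks

def cache_entries(megabytes):
    return megabytes * 1024 // 64          # one 64K buffer per entry

def l2cache_entries(kilobytes):
    return kilobytes // 2                  # one 2K l2tab per entry

def effective_freepend(freepend, fsync):
    """A value of -1 means 1 when fsync is enabled, otherwise 2."""
    if freepend == -1:
        return 1 if fsync else 2
    return freepend
```
</pre>
For example, <code>cckd_command(cache=16, gcint=10, fsync=1)</code> returns
<code>cckd cache=16,gcint=10,fsync=1</code>.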
|
|
<h4>Notes</h4>
<ul>
<li>The size of the <em>cache</em> is a difficult number to determine.
The storage used by the cache could be used for emulated virtual
storage instead. You don't want to steal so much storage from your
emulated operating system that it starts paging heavily. You also don't
want your host operating system to page. However, you don't want
your cache to flush too often, because CPU cycles may have to be stolen
from the CPU thread to compress updated images.
<li>You need at least one l2cache entry per compressed device. Since the
maximum number of l2cache entries is 1024, this implies that no more
than 1024 compressed devices can be defined. Let me know if this is a
problem ;-) If you have a large number of devices, specify the
maximum value; otherwise the default should suffice.
<li><em>raq</em> should be at least as large as <em>ra</em>, because readahead
threads are scheduled from entries in the readahead queue. Likewise,
<em>rat</em> should not exceed <em>raq</em>, because only <em>raq</em>
tracks or block groups can be queued at any time.
<li>The number of writer threads (<em>wr</em>) should usually be 1 more
than the number of host processors. This is because one writer thread
could be CPU-bound (compressing a track or block-group image) while
another is I/O-bound (writing the compressed image).
<li>The garbage collection interval governs the maximum time in seconds
an updated track or block group image will reside in storage before
being written to the emulation file. A large value may mean more data
loss if a catastrophic error occurs. A small value may mean that
more CPU time is spent compressing images. For example, suppose that
a particular image is updated several times each second. If the interval
is changed from the default 5 seconds to 1 second, then that image will
be compressed and written 5 times as often. A large value may cause
more cache flushes within a garbage collection interval. These kinds
of flushes mean that a read will wait because there are no available
cache entries, slowing the emulated operating system. A large value
will also cause more pending free space to build up (since free space
is flushed each interval). This may mean that the garbage collector
space recovery routine will perform more work and that the emulation
file may be larger.
<li>Specify <em>fsync=1</em> and <em>gcint=5</em> if you are absolutely
paranoid about your data being lost due to a failure. <em>fsync</em>
will ensure your data on disk is coherent. However, fsync may cause
a noticeable performance degradation. Note that an fsync will not
be performed more often than every 5 seconds.
</ul>
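<p>
The rules of thumb in the notes above can be checked mechanically. The following Python sketch is only an illustration and is not part of Hercules: the function name <code>check_cckd_options</code> and the sample values are hypothetical (they are not the shipped defaults), and only the option names come from the notes.

```python
def check_cckd_options(opts, host_cpus, compressed_devices):
    """Return warnings for settings that go against the notes above.

    opts is a dict keyed by the cckd option names used in the notes
    (l2cache, ra, raq, rat, wr). This helper is hypothetical.
    """
    warnings = []
    # At least one l2cache entry is needed per compressed device.
    if opts["l2cache"] < compressed_devices:
        warnings.append("l2cache: need at least one entry per compressed device")
    # Readahead threads are scheduled from the readahead queue.
    if opts["raq"] < opts["ra"]:
        warnings.append("raq should be at least as large as ra")
    # Only raq tracks or block groups can be queued at any time.
    if opts["rat"] > opts["raq"]:
        warnings.append("rat should not exceed raq")
    # One writer may be CPU-bound while another is I/O-bound.
    if opts["wr"] != host_cpus + 1:
        warnings.append("wr is usually the number of host processors + 1")
    return warnings


# Sample values chosen for illustration only.
example = {"l2cache": 32, "ra": 2, "raq": 4, "rat": 2, "wr": 3}
print(check_cckd_options(example, host_cpus=2, compressed_devices=10))
```

An empty list means none of the rules of thumb were violated; it does not mean the settings are optimal.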
My advice is to use the default options and adjust them only if you have a very
good reason.

<hr noshade>
<p><h3><a NAME="utilities">Utilities</a></h3>

<a NAME="ckd2cckd"></a>
<li><b>ckd2cckd</b> <i>[options] source-file target-file</i>
<ul><li><small><b>Description</b></small> Copies a regular CKD Dasd emulation
file to a compressed CKD Dasd emulation file. The target
file must not already exist. If the emulated Dasd device
spans more than one file, specify the <em>first</em> file.
After the copy completes, the target file contains no
free space, imbedded or otherwise.
<li><small><b>Options</b></small>
<ul><li><b>-c</b>ompress <i>n</i><br>Compression Algorithm
<ul><li><b>0</b> don't compress
<li><b>1</b> compress using zlib
<li><b>2</b> compress using bzip2
</ul>
<li><b>-d</b>ontcompress <i>n</i><br>Same as <i>-compress 0</i>
<li><b>-m</b>axerrs <i>errs</i><br>Maximum number of errors
that can occur before the copy is terminated;
if 0 then errors are ignored. Default is 5.
<li><b>-n</b>ofudge<br>[deprecated]
<li><b>-q</b>uiet<br>Quiet mode; don't display status
<li><b>-z</b> <i>parm</i><br>Parameter passed to compression
<br>
<br>zlib compression level:
<br>0 = no compression
<br>1=fastest ... 9=best
<br>
<br>bzip2 blockSize100k value:
<br>1=fastest ... 9=best
</ul>
</ul>
</ul>
<a NAME="cckd2ckd"></a>
<li><b>cckd2ckd</b> <i>[options] source-file target-file</i>
<ul><li><small><b>Description</b></small> Copies a compressed CKD Dasd emulation
file to a regular CKD Dasd emulation file. The target
file must not already exist. More than one target file may
be created.
<li><small><b>Options</b></small>
<ul><li><b>-c</b>yls <i>n</i><br>Number of cylinders to copy
if the entire file isn't to be copied. If <b>0</b>
then only the cylinders in use are copied.
<li><b>-m</b>axerrs <i>errs</i><br>Maximum number of errors
that can occur before the copy is terminated;
if 0 then errors are ignored. Default is 5.
<li><b>-q</b>uiet<br>Quiet mode; don't display status
<li><b>-v</b>alidate<br>Validate track images [default]
<li><b>-n</b>ovalidate<br>Don't validate track images
</ul>
</ul>
</ul>
<a NAME="cckdcdsk"></a>
<li><b>cckdcdsk</b> <i>[-level] file-name</i>
<ul><li><small><b>Description</b></small> Performs integrity verification,
recovery and repair of a compressed or shadow CKD Dasd emulation file.
<li><small><b>Options</b></small>
<ul><li>-<i>level</i><br>A digit 0, 1 or 3 that specifies
the level of checking. The higher the level, the
longer the integrity check takes.
<ul><li><b>0</b> Minimal checking. Device headers are verified,
free space is verified, primary lookup table and secondary
lookup tables are verified.
<li><b>1</b> Same checks as level 0 plus all 5-byte track headers
are verified.
<li><b>3</b> Same checks as level 1 plus all track images are
read, uncompressed and verified.
</ul>
</ul>
</ul>
</ul>
<a NAME="cckdcomp"></a>
<li><b>cckdcomp</b> <i>[-level] file-name</i>
<ul><li><small><b>Description</b></small> Removes all free space from a compressed
or shadow CKD Dasd emulation file. (Compresses or compacts a cckd
file ... your choice!).
If <i>level</i> is specified, then <b>cckdcdsk</b> is called first
with the specified level; this is a shorthand way to call both
functions in one utility call.
<li><small><b>Options</b></small>
<ul><li>-<i>level</i><br>A digit 0, 1 or 3 that specifies
the level of checking. The higher the level, the
longer the integrity check takes.
<ul><li><b>0</b> Minimal checking. Device headers are verified,
free space is verified, primary lookup table and secondary
lookup tables are verified.
<li><b>1</b> Same checks as level 0 plus all 5-byte track headers
are verified.
<li><b>3</b> Same checks as level 1 plus all track images are
read, uncompressed and verified.
</ul>
</ul>
</ul>
</ul>
<a NAME="cckdfix"></a>
<li><b>cckdfix</b> <i>file-name</i>
<ul><li><small><b>Description</b></small> This is a skeleton program that is
not compiled during make. It can be edited to change/repair
the device headers.
<li><small><b>Compiling</b></small> Enter `<i>cc -o cckdfix -DARCH=390 cckdfix.c</i>'
to compile and link the edited program.
</ul>
<li><b>cckddump</b>
<ul><li><small><b>Description</b></small> This is an OS/390 HLASM (High Level
Assembler) program that will create a compressed CKD emulation file
from an actual CKD device. See <a href="#cckddump">below</a> for
a description of how to build and run this program.
</ul>
</ul>

<hr noshade>
<p><h3><a NAME="faq">FAQ</a></h3>
<table>
<tr><td valign="top"><b>Q.</b><td>
What devices are supported ?
<tr><td valign="top"><b>A.</b><td>
2311, 2314, 3330, 3340, 3350, 3375, 3380, 3390 and 9345.
<br><br>

<tr><td valign="top"><b>Q.</b><td>
Is a 3390 model 9 supported ?
<tr><td valign="top"><b>A.</b><td>
Yes, maybe. A 3390-9 is a little over 8G in size.
A cckd file cannot exceed 2G on a system that does
not support large files; otherwise it cannot exceed
4G. If the data on the 3390-9 compresses to below
these limits then the answer is Yes.
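<p>
As a rough guide, the limits above translate into a minimum overall compression ratio. The Python sketch below is an illustration only; it assumes "G" means 2<sup>30</sup> bytes and rounds the 3390-9 down to exactly 8G, so the real requirement is slightly higher. The function name is hypothetical.

```python
GIB = 1024 ** 3  # assumption: "G" in the text means 2**30 bytes


def min_compression_ratio(device_bytes, large_file_support):
    """Smallest overall ratio (uncompressed / compressed) that fits the limit."""
    limit = (4 if large_file_support else 2) * GIB
    return device_bytes / limit


device = 8 * GIB  # "a little over 8G", rounded down for illustration
print(min_compression_ratio(device, large_file_support=True))
print(min_compression_ratio(device, large_file_support=False))
```

Under these assumptions the data must compress to about half its size with large file support, and to about a quarter without it.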
<br><br>

<tr><td valign="top"><b>Q.</b><td>
How can I get rid of the free space in my files ?
<tr><td valign="top"><b>A.</b><td>
Once the total amount of free space falls below 6% of
the total file size, the garbage collector is not very
aggressive about eliminating free space. To remove
all free space from the file while Hercules is running,
use the <b>sfc</b> console command. See
<a href="#usingsfiles">Using Shadow Files</a> above.
Otherwise, you can use the <b>cckdcomp</b> utility.
See <a href="#utilities">Utilities</a> above.

<br><br>

<tr><td valign="top"><b>Q.</b><td>
How can I display the space statistics for a compressed
file ?
<tr><td valign="top"><b>A.</b><td>
The statistics are displayed when the compressed file
is opened. Currently, there is no supplied method to
display these statistics at any other time. However,
it shouldn't be too hard to write a shell script
(similar to <code>dasdlist</code>) to display these
statistics. The statistics are contained in the
<code>CCKDDASD_DEVHDR</code> which is at offset 512
in the compressed file; the header is mapped in
<code>hercules.h</code>.
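<p>
A first step toward such a script is simply fetching the raw header bytes. The sketch below uses Python rather than a shell script and deliberately does not interpret any fields, since the field layout must be taken from <code>hercules.h</code>. The function name and the 512-byte read length are assumptions; only the offset of 512 comes from the text above.

```python
def read_cckd_devhdr_bytes(path, offset=512, length=512):
    """Return the raw bytes at the CCKDDASD_DEVHDR position of a file.

    offset=512 is stated in the documentation; length=512 is an
    assumption made here and should be checked against hercules.h.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    if len(data) < length:
        raise ValueError("file is too short to contain the header")
    return data


# Hypothetical usage (the file name is a placeholder):
# hdr = read_cckd_devhdr_bytes("mydisk.cckd")
# print(hdr[:64].hex())
```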
<br><br>

<tr><td valign="top"><b>Q.</b><td>
What is a "null track" anyway ?
<tr><td valign="top"><b>A.</b><td>
The term "null track" is just something I made up. It is
what is returned when a zero offset is found in either the
primary or secondary lookup table for the track. It contains
the following fields:
<table>
<tr><td><code>0CCHH</code></td><td>Home address</td>
<tr><td><code>CCHH0008 00000000</code></td><td>standard R0</td>
<tr><td><code>CCHH1000</code></td><td>end-of-file marker</td>
<tr><td><code>ffffffff</code></td><td>end-of-track marker</td>
</table>
When a null track is written, space previously occupied by
the track is freed and the offset in the secondary lookup table
is set to zero. If all offsets in the secondary lookup table
are zero, then the secondary lookup table is freed and the
primary lookup table entry is zeroed.
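<p>
The field table above can be read as a byte layout. The Python sketch below builds such an image under two assumptions that are not stated in the text: each character in the table stands for one byte, and CC and HH are 2-byte big-endian values. The function name <code>build_null_track</code> is hypothetical and this is not the Hercules source.

```python
import struct


def build_null_track(cyl, head):
    """Build a null track image for the given cylinder and head.

    Assumes one byte per character in the field table and big-endian
    2-byte CC and HH values (assumptions, see the lead-in).
    """
    cchh = struct.pack(">HH", cyl, head)
    home_address = b"\x00" + cchh                # 0CCHH
    r0_count = cchh + b"\x00\x00" + b"\x00\x08"  # CCHH, R=0, KL=0, DL=8
    r0_data = bytes(8)                           # eight zero data bytes
    eof_record = cchh + b"\x01\x00" + b"\x00\x00"  # CCHH, R=1, KL=0, DL=0
    end_of_track = b"\xff" * 8
    return home_address + r0_count + r0_data + eof_record + end_of_track


image = build_null_track(1, 2)
print(len(image), image.hex())
```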
<br><br>

<tr><td valign="top"><b>Q.</b><td>
I want to try bzip2 but I'm getting compiler errors.
What am I doing wrong ?
<tr><td valign="top"><b>A.</b><td>
Probably bzip2 is not installed or is not installed
properly. You can obtain bzip2 from
<a href="http://sourceware.cygnus.com/bzip2/">here</a>.
If bzip2 is installed, then you need to find the directory
where <code>bzlib.h</code> is installed and the
directory where <code>libbz2.a</code> is installed.
You can then add "-I <i>bzlib.h-directory</i>" to the
CFLAGS in the makefile and add "-L <i>libbz2.a-directory</i>"
to the LFLAGS.
<br><br>

<tr><td valign="top"><b>Q.</b><td>
Which is better, zlib or bzip2 ?
<tr><td valign="top"><b>A.</b><td>
This is a religious question. I have no actual preference,
I just wanted to make a choice available.
<br><br>

<tr><td valign="top"><b>Q.</b><td>
Can other compression programs be used ?
<tr><td valign="top"><b>A.</b><td>
Yes. The program is structured so that other
compression algorithms can be added rather painlessly. This
will require, of course, an update to the source.
<br><br>

<tr><td valign="top"><b>Q.</b><td>
Can this compression scheme be used for FBA devices too ?
<tr><td valign="top"><b>A.</b><td>
I have not worked with FBA devices for over 20 years.
However, it seems to me that a similar program for FBA
devices should be simpler than this program for CKD devices
(none of those count/key/data fields mucking everything
up). Since an FBA block is 512 bytes, it might not
be efficient to have each block compressed individually;
it might be better to compress blocks in 32K or 64K chunks.
If someone asks very nicely, I may consider looking into it;-)
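<p>
The point about chunk size can be seen with a small experiment. This Python sketch uses the standard <code>zlib</code> module to compare compressing each 512-byte block separately against compressing the same data in 64K groups. The function names and the repetitive sample data are made up for illustration, and real disk contents will behave differently.

```python
import zlib

BLOCK_SIZE = 512  # FBA block size, as stated above


def compressed_size_per_block(data, level=6):
    """Total size when every 512-byte block is compressed on its own."""
    return sum(len(zlib.compress(data[i:i + BLOCK_SIZE], level))
               for i in range(0, len(data), BLOCK_SIZE))


def compressed_size_grouped(data, group_bytes=64 * 1024, level=6):
    """Total size when blocks are compressed together in larger groups."""
    return sum(len(zlib.compress(data[i:i + group_bytes], level))
               for i in range(0, len(data), group_bytes))


# Artificial, highly repetitive sample: exactly 64K of data.
sample = (b"HERCULES FBA BLOCK " * 4000)[:64 * 1024]
print(compressed_size_per_block(sample), compressed_size_grouped(sample))
```

Each separately compressed block carries its own stream overhead and cannot reuse repetition found in neighbouring blocks, which is why the grouped total comes out smaller for data like this.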
<br><br>

</table>

<hr noshade>
<p>
Greg Smith
<a href="mailto:gsmith@nc.rr.com"><em>gsmith</em>@<em>nc.rr.com</em></a>
<p><small>Last updated 17 Nov 2002</small>
</BODY>
</HTML>