{
{ CML$7165_2x_FAILURE_DATA
{
{
{  PURPOSE:
{     This statistic records the failure data captured by the system
{ when accessing a 7165 disk subsystem.
{
{  FREQUENCY: At each failure occurrence.
{
{  CONTENT:
{     The descriptive_data portion of the failure message has the
{ following form:
{
{   '<mf>.<iou>.<pp>.<ch>.<sd>.<unit>*<vsn>*<class>*..
{    <message>'
{
{      where <mf> is the identification of the mainframe in the form
{        $SYSTEM_mmmm_ssss, where 'mmmm' is the model number of
{        Central Processor zero (CP0), e.g. 0990, and 'ssss' is the
{        serial number of that processor, e.g. 0104.
{
{      where <iou> is the string 'IOUn', where n is 0 or 1.  This
{        identifies the IOU associated with the channel over
{        which the failure was reported.
{
{      where <pp> is either the string 'PPn' or the string 'CPPn'
{        and n is the decimal representation of the physical PP number
{        used to process the failing request.  Note that 'CPP' is the
{        designation given to the concurrent PPs in an I4 IOU.
{        The PP reported is the master PP, as only the master PP
{        performs error recovery.
{
{      where <ch> is either the string 'CHn' or the string
{        'CCHn'; n is the decimal representation of the channel
{        number through which the disk device was accessed.  Note
{        that 'CCH' is the designation given to the concurrent
{        channels in an I4 IOU.
{
{      where <sd> is the element name of the 7165_2x Storage Director
{        used in the failing request.
{
{      where <unit> is the element name of the disk storage
{        device used in the failing request.
{
{      where <vsn> is the recorded-vsn of the disk volume which was
{        the object of the failing request.
{
{      where <class> is the string 'UF' for unrecovered failure, 'RF'
{        for recovered failure, 'IF' for intermediate failure
{        log-entry, and 'IM' for an informative message.
{
{        The PP reports failure data as an intermediate failure
{        log-entry prior to retrying an i/o request.  An
{        intermediate failure log-entry will provide the
{        first-failure data captured by the PP during the initial
{        attempt at the request or during a subsequent request
{        retry.  This log-entry provides the initial and final
{        failure data for an intermediate, unsuccessful i/o request
{        retry.  At least one additional request retry will be
{        performed after this log-entry is made.
{
{        For unrecovered disk failures the counter values contain
{        the failure data corresponding to the last unsuccessful
{        retry of the i/o request.  This log-entry provides
{        the initial and final failure data for the final,
{        unsuccessful i/o request retry.
{
{        For failures corrected during sector-oriented (media)
{        recovery, the counter values contain the first-failure data
{        captured by the PP.  This log-entry is only made to
{        document successful sector-oriented recovery.
{
{        The informative messages reported are SOFT SECTORING UNIT
{        and UNIT SOFT SECTORED.
{
{      where <message> is a statement of failure isolation based on
{        status reported by the 7165 subsystem.  The text of each
{        possible symptom statement is identical to the upper case
{        text described under counter value 8 below.
{
{    The counter-value portion of this statistic contains:
{
{   1.  Physical PP number (bits 58 - 63).
{        Bit 57 = 1 implies that the PP is an I4 concurrent PP.
{        Bits 46 - 51 contain the IOU number.
{
{   2.  Channel Number (bits 58 - 63).
{        Bit 57 is set to 1 for an I4 concurrent channel.
{        Bits 46 - 51 contain the IOU number.
{
{   3.  Address of SD/HSC
{        Bits 60..62 contain the address of the Storage Director.
{        Bit 63 contains the address of the Head of String controller.
{
{   4.  Physical Unit Number
{
{   5.  Unit type (identifies the kind of unit, i.e.  product id)
{       5 - 895-2
{
{   6.  Logical Operation Code
{       1 - read
{       2 - write
{       4 - soft-sectoring the device
{
{   7.  Log-entry Class
{       0 - Recovered Failure Report
{       1 - Unrecovered Failure Report
{       2 - Intermediate Failure Report
{       3 - Informative Message
{
{   8.  Failure Analysis              Indicates  the  extent  to  which  the
{                                     subsystem and  the  PP  were  able  to
{                                     isolate   the   failure  when  it  was
{                                     detected.  The failure data is analyzed
{                                     to  generate  one  of  the  following
{                                     symptom codes.
{
{        1 - STORAGE DIRECTOR RETRY   General Status = 900(16)
{                                     EDS word 19 = 44A(16).  The storage
{                                     director  has  requested the CCC to
{                                     retry the command, but no sense bytes
{                                     are present.
{
{       Values 10 through 79 are only returned for errors where sense
{       bytes from the storage director are present.  Sense bytes are
{       present when general status equals A10(16).  They are also
{       present when general status equals 900(16) and EDS word 19
{       equals 402(16).  The upper hexadecimal digit of EDS byte 7
{       contains a format code and the lower hexadecimal digit of EDS
{       byte 7 contains a message code.
{
{       In general, the format code indicates the probable source of
{       the problem:
{         0      - CCC or storage director
{         1      - drive
{         2 or 3 - storage director
{         4 or 5 - media
{         7      - Director to Device Controller (DDC) interface
{         8      - Head of String Controller (HSC)
{       For example, EDS byte 7 = 1A(16) has format code 1 and message
{       code A and is reported as symptom 27, TRACK ADDRESS MISCOMPARE,
{       a probable drive problem.
{
{       For messages 10 through 38 and 40 through 79 a symptom code is
{       saved in counter word 10.  The symptom code is generated by the
{       storage director and comes from sense bytes 22 and 23.  The
{       symptom code can be looked up in the hardware maintenance manual
{       to find a list of repair actions.
{
{       10 - UNDOCUMENTED FORMAT x MESSAGE
{                                     x is the format code (0-5,7,8)
{
{       The following symptom statements are most likely caused by the
{       CCC or storage director.
{
{       11 - INVALID COMMAND          EDS byte 7 = 01
{
{       12 - INVALID COMMAND ISSUED TO 7165
{                                     EDS byte 7 = 02
{
{       13 - CCW COUNT TOO SMALL      EDS byte 7 = 03
{
{       14 - INVALID DATA ARGUMENT    EDS byte 7 = 04
{
{       16 - CHAINING NOT INDICATED   EDS byte 7 = 06
{
{       17 - COMMAND MISMATCH         EDS byte 7 = 07
{
{       18 - DEFECTIVE TRACK POINTER  EDS byte 7 = 0B
{
{       The following symptom statements are most likely caused by
{       the drive.
{
{       19 - DEVICE STATUS 1 NOT EXPECTED
{                                     EDS byte 7 = 11
{
{       20 - INDEX MISSING            EDS byte 7 = 13
{
{       21 - UNRESETTABLE INTERRUPT   EDS byte 7 = 14
{
{       22 - DEVICE DOES NOT RESPOND  EDS byte 7 = 15
{
{       23 - INCOMPLETE SET SECTOR    EDS byte 7 = 16
{
{       24 - HEAD ADDRESS MISCOMPARE  EDS byte 7 = 17
{
{       25 - INVALID DEVICE STATUS 1  EDS byte 7 = 18
{
{       26 - DEVICE NOT READY         EDS byte 7 = 19
{
{       27 - TRACK ADDRESS MISCOMPARE
{                                     EDS byte 7 = 1A
{
{       28 - DRIVE MOTOR OFF          EDS byte 7 = 1C
{
{       29 - SEEK INCOMPLETE          EDS byte 7 = 1D
{
{       30 - CYLINDER ADDRESS MISCOMPARE
{                                     EDS byte 7 = 1E
{
{       31 - UNRESETTABLE OFFSET ACTIVE
{                                     EDS byte 7 = 1F
{
{       The following symptom statements are most likely caused by
{       the storage director.
{
{       32 - SELECTIVE RESET WHILE SELECTED
{                                     EDS byte 7 = 29
{
{       33 - SYNC LATCH FAILURE       EDS byte 7 = 2A
{
{       34 - MICROCODE DETECTED CHECK
{                                     EDS byte 7 = 2F
{
{       35 - CLOCK STOPPED CHECK 1 (SD)
{                                     EDS byte 7 = 38
{
{       36 - ALTERNATE STORAGE DIRECTOR FAILURE
{                                     EDS byte 7 = 3A
{
{       The following symptom statements are most likely caused by
{       media defects.
{
{       37 - ERROR UNCORRECTABLE BY ECC
{                                     EDS byte 7 = 40,41,42,43,48,49,4A,4B
{
{       38 - DATA SYNCHRONIZATION UNSUCCESSFUL
{                                     EDS byte 7 = 44,45,46,47,4C,4D,4E,4F
{
{       39 - ERROR CORRECTABLE BY ECC EDS byte 7 = 50,51,52,53,58,59,5A,5B
{
{       The following symptom statements are most likely caused by the
{       storage director or an error in the storage director to head of
{       string controller path (DDC).
{
{       41 - RCC INITIATED BY CCA     EDS byte 7 = 70
{
{       42 - RCC1 NOT SUCCESSFUL      EDS byte 7 = 71
{
{       43 - RCC1 AND RCC2 NOT SUCCESSFUL
{                                     EDS byte 7 = 72
{
{       44 - INVALID DDC TAG SEQUENCE
{                                     EDS byte 7 = 73
{
{       45 - EXTRA RCC REQUIRED       EDS byte 7 = 74
{
{       46 - INVALID DDC SELECTION    EDS byte 7 = 75
{
{       47 - MISSING END OP           EDS byte 7 = 76,77
{
{       48 - INVALID TAG              EDS byte 7 = 78,79
{
{       49 - DESELECTION              EDS byte 7 = 7A
{
{       50 - NO CONTROLLER RESPONSE   EDS byte 7 = 7B
{
{       51 - CONTROLLER NOT AVAILABLE
{                                     EDS byte 7 = 7C,7D
{
{       The following symptom statements are most likely caused by
{       the HSC.
{
{       52 - ECC HARDWARE FAILURE     EDS byte 7 = 81
{
{       53 - UNEXPECTED END OP        EDS byte 7 = 83
{
{       54 - END OP ACTIVE            EDS byte 7 = 84,85
{
{       Values from 55 to 79 can only be generated if EDS byte 7
{       contains a hex value of 0, 10, 28, 6X, or 80.
{
{       55 - COMMAND REJECT           EDS byte 0, bit 0
{
{       56 - INTERVENTION REQUIRED    EDS byte 0, bit 1
{
{       57 - BUS OUT PARITY           EDS byte 0, bit 2
{
{       58 - EQUIPMENT CHECK          EDS byte 0, bit 3
{
{       59 - DATA CHECK               EDS byte 0, bit 4
{
{       60 - OVERRUN                  EDS byte 0, bit 5
{
{       61 - PERMANENT DEVICE ERROR   EDS byte 1, bit 0
{
{       62 - END OF CYLINDER          EDS byte 1, bit 2
{
{       63 - MESSAGE TO OPERATOR      EDS byte 1, bit 3
{
{       64 - NO RECORD FOUND          EDS byte 1, bit 4
{
{       65 - FILE PROTECTED           EDS byte 1, bit 5
{
{       67 - FIRST LOGGED ERROR       EDS byte 2, bit 2
{
{       68 - ENVIRONMENTAL DATA       EDS byte 2, bit 3
{
{       69 - PATH ERROR               EDS byte 4, bit 2
{
{       70 - INVALID TRACK FORMAT     EDS byte 1, bit 1
{
{       79 - UNDOCUMENTED STORAGE DIRECTOR RESPONSE
{                                     Sense bytes are present but are not
{                                     described by codes 55 - 78.
{
{       Values from 80 to 119 are only returned for errors where general
{       status equals A00(16).  These errors are detected by the Cyber
{       Channel Coupler.  Note that the bits are numbered with bit zero
{       to the right.  Values from 80 through 97 are normally CCC or
{       storage director problems.  Values 98, 99, 101, and 103 through
{       119 are normally PP or CCC problems.  Values 100 and 102 usually
{       indicate a CCC problem.
{
{       80 - REQUEST IN NOT RECEIVED DURING COMMAND RETRY
{                                     EDS word 19, bits 11 and 4
{
{       81 - ILLEGAL WRITE            EDS word 19, bits 11 and 3
{
{       82 - CCC-STORAGE DIRECTOR INTERFACE ERROR
{                                     EDS word 19, bits 11 and bits 2/1
{
{       83 - FULL/EMPTY COUNT INCORRECT
{                                     EDS word 19, bits 11 and 0
{
{       92 - ADDRESS MISCOMPARE ON SELECT SEQUENCE
{                                     EDS word 19, bits 9 and 2
{
{       93 - NO REQUEST IN ON POLLING SEQUENCE
{                                     EDS word 19, bits 9 and 1
{
{       94 - SELECT IN RECEIVED ON SELECT SEQUENCE
{                                     EDS word 19, bits 9 and 0
{
{       95 - BUS IN PARITY ERROR      EDS word 19, bits 8 and 3
{
{       96 - READ PATH PARITY ERROR   EDS word 19, bits 8 and 2
{
{       97 - WRITE PATH PARITY ERROR  EDS word 19, bits 8 and 0
{
{       98 - INCOMPLETE DATA TRANSFER EDS word 18, bits 7 and 2
{
{       99 - CHANNEL PARITY DURING PP OUTPUT
{                                     EDS word 18, bit 6
{
{       100- COUPLER MEMORY PARITY ERROR DURING PP INPUT
{                                     EDS word 18, bit 5
{
{       101- DEADMAN TIMEOUT STATUS   EDS word 18, bit 4
{
{       102- COUPLER MEMORY PARITY ERROR
{                                     EDS word 18, bit 3 -or-
{                                     EDS word 19, bits 8 and 1
{
{       103- EXCESS DATA TRANSFERRED  EDS word 18, bit 2
{
{       104- DATA PACKING FOR CHANNEL DID NOT COME OUT EVEN
{                                     EDS word 18, bit 1
{
{       105- NORMAL END NOT SET       EDS word 18, bit 7 not set
{
{       The remaining symptom messages occur only when detailed status
{       is not present.  Values 121, 126 through 132, and 134 indicate
{       a PP or CCC problem.  Values 122 and 123 are informative.
{
{       121- FUNCTION TIMEOUT         A function issued by the PP to
{                                     the CCC was not responded to
{                                     within a timeout.  The function
{                                     is in counter word 18.
{
{       122- SOFT SECTORING UNIT
{
{       123- UNIT SOFT SECTORED
{
{       126- INTERFACE ERROR          The PP found a value in a CM
{                                     table created by the CP to be
{                                     incorrect.  The PP halts after
{                                     reporting this error.
{
{       127- KZ BOARD ERROR           The channel error flag in the IOU
{                                     is set and bit 61 of the CIO error
{                                     status register is set.
{
{       128- KX BOARD ERROR           The channel error flag in the IOU
{                                     is set and bit 63 of the CIO error
{                                     status register is set.
{
{       129- CHANNEL ERROR            The channel error flag in the IOU
{                                     is set. For the CIO channel it means
{                                     that bits 61 and 63 of the error
{                                     status register are not set.  Counter
{                                     word 40 contains the CIO error status
{                                     register.
{
{       131- MEDIA FAILURE            The error has been isolated to media.
{                                     The error recovery algorithm successfully
{                                     wrote, read, and verified data on a
{                                     cylinder reserved for maintenance.
{                                     NOS/VE will automatically software flaw
{                                     the allocation unit containing the
{                                     failing address.
{
{       132- INCOMPLETE SECTOR TRANSFER  After a block input from the CCC
{                                     or a block output from the PP to the
{                                     CCC, the A register was not zero.
{                                     Also, unless the input was for
{                                     status, general status was zero.
{
{       133- CCC FAILURE              The autoload of CCC microcode failed.
{                                     The error code is in the right-most
{                                     two hex digits of general status.
{
{       134- PP-CCC DATA INTEGRITY    Data was transferred to the CCC buffer
{                                     and back to the PP memory.  No error
{                                     was reported, but the data did not compare.
{
{       135- PP-DRIVE DATA INTEGRITY  Data was transferred from PP memory to
{                                     the disk, and back to PP memory.  No
{                                     error was reported, but the data did not
{                                     compare.
{
{       136- SEEK COMMAND TIMEOUT     A  seek  command issued to the CCC did
{                                     not  complete  within a timeout of 10
{                                     seconds.
{
{       140- INDETERMINATE 895 ERROR  An error response was returned by  the
{                                     PP,  but no indication of an error was
{                                     found in the status.
{
{       141- UNCORRECTED CM ERROR     An  uncorrected  error  response was
{                                     received from CM on a request.  This
{                                     is  bit  50  of  the  error register
{                                     (counter word 40).
{
{       142- CM REJECT                A reject response was received from
{                                     CM.  This  is  bit  51 of the error
{                                     register (counter word 40).
{
{       143- INVALID CM RESPONSE      The  response  code from CM decoded
{                                     into an illegal value.  This is bit
{                                     52  of  the error register (counter
{                                     word 40).
{
{       144- CM RESPONSE CODE PARITY ERROR   The  response code from CM
{                                     had  a parity error.  This is bit
{                                     53 of the error register (counter
{                                     word 40).
{
{       145- CMI READ DATA PARITY ERROR  The CM interface logic detected
{                                     a parity error.  This is bit 54 of
{                                     the  error  register (counter word
{                                     40).
{
{       146- OVERFLOW ERROR           Data  was  received  after the DMA
{                                     channel's  input  buffer  was full.
{                                     This is bit 56 of the error register
{                                     (counter word 40).
{
{       147- JY BOARD ERROR           The JY board has detected an error.
{                                     This is bit 62 of the error register
{                                     (counter word 40).
{
{       148- IOU FAILURE - OPERATIONAL STATUS WRONG  After using test mode
{                                     to transfer data between the PP and
{                                     CM, operational status was incorrect.
{
{       149- IOU FAILURE - TEST MODE DATA MISCOMPARE  After using test mode
{                                     to  transfer data between the PP and
{                                     CM,  no  hardware error was detected,
{                                     but the data miscompared.
{
{       150- TRANSFER IN PROGRESS DID NOT CLEAR   A data  transfer between
{                                     CM  and disk did not complete within
{                                     a  timeout.  General Status from the
{                                     CCC showed no error and no IOU error
{                                     register bits were set.  This is bit
{                                     47 of counter word 40.
{
{       151- T PRIME REGISTER NOT EMPTY  A data transfer between CM and disk
{                                     did  not  complete  within a timeout.
{                                     General Status from the CCC showed no
{                                     error  and no IOU error register bits
{                                     were  set.  This is bit 46 of counter
{                                     word 40.
{
{   9.  Request Retry Count - The number of times the PP driver retried the
{       entire i/o request from the beginning.
{
{   10. Fault Symptom Code - This code, if present, is EDS bytes 22
{       and 23.  If the code is not present, the counter word will be
{       negative.  The code is right justified in this counter word.
{
{   11. Cylinder number of initial seek
{
{   12. Track number of initial seek
{
{   13. Sector number of initial seek
{
{   14. Cylinder number of failure - This is normally the cylinder
{       number in the disk request.  However, if the failure occurred
{       while loading CCC microcode or while running the confidence
{       test, the cylinder number will be 884.
{
{   15. Track number of failure - This is normally the track on disk
{       that was being read or written when the error was detected.
{       If the error occurred while loading CCC microcode or during
{       the interface test portion of the confidence test, the track
{       number will be set to zero.
{
{   16. Sector number of failure - This is normally the sector on disk
{       that was being read or written when the error was detected.
{       If the error occurred while loading CCC microcode or during
{       the interface test portion of the confidence test, the sector
{       number will be set to zero.
{
{   17. Residual byte count on incomplete channel transfer
{
{   18. Failing Function - On  a  function  timeout, the function
{       reported is the one which was outstanding when the CCC hung.
{
{   First-failure Data:
{   19. General Status of First Failure
{                   (right justified)
{   20 .. 39. Words 1..20 of Detailed Status
{                   (right justified)
{   40. Error Register bits (CIO channel only)
{                   (right justified)
{
{   The following failure data is provided only when the Log-entry
{   Class is unrecovered or intermediate.  The data represents the
{   subsystem status at the end of the intermediate or final request
{   retry.
{
{
{   Last-failure Data:
{   41. General Status of Last Failure
{                   (right justified)
{   42 .. 61. Words 1..20 of Detailed Status
{                   (right justified)
{   62. Error Register bits (CIO channel only)
{                   (right justified)
{

  CONST
    cml$7165_2x_failure_data = cmc$min_ecc + 4104;
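
{ Illustrative sketch only, not part of the original deck: the four
{ constants below give symbolic names to the Log-entry Class values held
{ in counter word 7, following the <class> strings 'RF', 'UF', 'IF' and
{ 'IM' described above.  The constant names are hypothetical and are an
{ assumption of this sketch; rename or drop them to suit local
{ conventions.

  CONST
    cmc$7165_2x_class_rf = 0; { Recovered Failure Report
  CONST
    cmc$7165_2x_class_uf = 1; { Unrecovered Failure Report
  CONST
    cmc$7165_2x_class_if = 2; { Intermediate Failure Report
  CONST
    cmc$7165_2x_class_im = 3; { Informative Message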

*copyc cmc$condition_limits


