Re: iSCSI ACA - Implementers Guide (long, sorry)



I only read about 3/4 of this, but there is one thing that I would like to comment on:

If the target is going to have to wait for acknowledgement of all responses outstanding on other connections, then wouldn't it have to use a NOP-In to solicit the next ExpStatSN (because it can't be guaranteed that the initiator will be sending another command)? I don't see any big deal in that, but just thought it would be worth mentioning.

I would think that the delay mentioned would not be much of a concern, because it would only arise when a CHECK CONDITION is followed by recovery action, in which case performance would be suffering anyway.

Also I think that the delay we introduce with this proposal is consistent with the sentence in SAM-2 you have mentioned.

Eddy

----- Original Message ----- From: <Black_David@emc.com>
To: <ips@ietf.org>
Sent: Saturday, January 14, 2006 7:38 PM
Subject:  iSCSI ACA - Implementers Guide (long, sorry)


An implementer working on ACA (Auto-Contingent Allegiance)
has pointed out some issues that could use some attention
in the implementers guide.  The SCSI reference for this is
Section 5.9.1 of SAM-2.  NOTE: This message is not written
in my role as WG Chair.

A quick introduction is that ACA is an alternative behavior
for error handling (CHECK CONDITION) to the more familiar
behavior of CA (Contingent Allegiance) that is immediately
exited via Autosense.  ACA is requested on a per command
basis via the NACA bit in the CDB.  The Autosense transmission
of sense data still occurs as a result of CHECK CONDITION
(Autosense is required by iSCSI), but unlike CA, the ACA
condition persists until it is cleared by a CLEAR ACA task
management function.  During ACA, the initiator uses commands
with the ACA attribute (cf Section 10.3 of RFC 3720) to
deal with the situation.  One ACA command at a time is
permitted, and non ACA commands are rejected with an "ACA
ACTIVE" status.
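
As an illustration only, the acceptance rule just described can be
sketched in Python.  The names TaskSetState and admit_during_aca are
hypothetical, and turning away a second ACA command with ACA ACTIVE while
one is still in progress is an assumption about how "one at a time" is
enforced.  0x30 is the SCSI status code for ACA ACTIVE.

```python
from dataclasses import dataclass
from typing import Optional

ACA_ACTIVE = 0x30  # SCSI status code for ACA ACTIVE


@dataclass
class TaskSetState:
    """Hypothetical per-task-set state kept by a target (illustration only)."""
    aca_in_effect: bool = False
    aca_task_in_progress: bool = False


def admit_during_aca(state: TaskSetState, has_aca_attribute: bool) -> Optional[int]:
    """Return a status to send immediately, or None if the command may run.

    Only the case where ACA is already in effect is modelled.
    """
    if not state.aca_in_effect:
        return None  # normal processing; not modelled here
    if not has_aca_attribute:
        return ACA_ACTIVE  # non ACA commands are rejected during ACA
    if state.aca_task_in_progress:
        # Assumption: a second ACA command is also turned away.
        return ACA_ACTIVE
    state.aca_task_in_progress = True
    return None
```

A real target would also reset aca_task_in_progress when the ACA command
completes and clear aca_in_effect on CLEAR ACA; both are left out here.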

The change in command processing behavior when entering ACA
raises a number of concurrency issues, as SAM-2 is specified
as if the initiator and target interact synchronously, which
is not the case for iSCSI.  SAM-2 contains the following
somewhat cryptic text in Section 5.9.1.2 whose consequences
the implementers guide could help explain:

   If the SCSI transport protocol does not enforce state
   synchronization as described in 4.6.2, there may be a
   time delay between the occurrence of the CA or ACA
   condition and the time at which the application client
   becomes aware of the condition.

iSCSI does not enforce state synchronization.  Further, in a
multi-connection session, iSCSI does not support status
sequencing (order of delivery of status/responses) across
connections.  So, for a multi-connection session, there are
(at least) 4 interesting potential concurrency conditions:

(1) The SCSI response indicating that CHECK CONDITION has
caused establishment of ACA could get ahead of a response
to a command completed prior to establishment of ACA.

(2) An ACA ACTIVE status response could get ahead of the SCSI
response that indicates establishment of ACA.

(3) A command issued prior to establishment of ACA could
arrive after ACA is established.

(4) The CLEAR ACA task management function could get ahead of
a command that is dealing with the ACA condition.

Here are some initial thoughts on how to deal with these
situations:

-- (1) The SCSI response indicating that CHECK CONDITION has
caused establishment of ACA could get ahead of a response
to a command completed prior to establishment of ACA.

This is only a concern in multi-connection sessions, as in a
single-connection session, the response indicating ACA
establishment will come after responses to all prior commands.
This behavior can be preserved for multi-connection sessions
by saying that the target MUST wait for acknowledgement of all
responses outstanding on other connections at the time of ACA
establishment before sending the response that indicates ACA
has been established.  This "MUST" would be an additional
requirement to RFC 3720 for multi-connection targets.
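
A minimal sketch of the target-side check this "MUST" implies, assuming
the target snapshots the last StatSN sent on each connection at the moment
ACA is established and tracks the latest ExpStatSN seen per connection.
The function and parameter names are hypothetical; the 32-bit wraparound
comparison follows the usual serial number arithmetic.

```python
from typing import Dict, List

MOD = 1 << 32  # StatSN is a 32-bit counter that wraps


def sn_lt(a: int, b: int) -> bool:
    """True if serial number a precedes b, allowing for 32-bit wraparound."""
    return a != b and ((b - a) % MOD) < (1 << 31)


def connections_awaiting_ack(
    last_stat_sn_sent: Dict[int, int],
    exp_stat_sn_seen: Dict[int, int],
    aca_connection: int,
) -> List[int]:
    """Connections (other than the ACA one) with unacknowledged responses.

    last_stat_sn_sent: per connection, StatSN of the last response sent
        before ACA was established (a snapshot taken at that moment).
    exp_stat_sn_seen: per connection, latest ExpStatSN from the initiator.
    A response counts as acknowledged once ExpStatSN has moved past it.
    """
    waiting = []
    for cid, stat_sn in sorted(last_stat_sn_sent.items()):
        if cid == aca_connection:
            continue
        # No ExpStatSN seen yet is treated as "not acknowledged".
        if not sn_lt(stat_sn, exp_stat_sn_seen.get(cid, stat_sn)):
            waiting.append(cid)
    return waiting
```

The target would hold the response that indicates ACA until this list is
empty.  Since the initiator may have nothing further to send on those
connections, the target may need a NOP-In there to solicit ExpStatSN.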

This seems like a useful thing to require, as one of the first
things an initiator will probably want to do on learning
that ACA has happened is figure out what commands completed
successfully prior to ACA (vs. are still at the target, and
hence have been blocked by the ACA), and only the target knows
what statuses are outstanding against the other connections.

OTOH, it causes delays, so the alternative is to leave the
initiator to sort this out, which may entail issuing NOPs
to determine command status on other connections.  That
would have the advantage of limiting the impact of ACA
support to multi-connection initiators that actually want
to use ACA, but the need to use the NOPs is unpleasant,
leading me to favor having the target get this right
(initiators who don't like the resulting delay in entering
ACA shouldn't use it).

I don't think ignoring the issue causes any SCSI specification
violations, but it may make ACA error handling behave badly
as command completions will be unexpectedly received during ACA.

-- (2) An ACA ACTIVE status response could get ahead of the SCSI
response that indicates establishment of ACA.

Again, this is only a concern in multi-connection sessions, as
in a single-connection session, ACA ACTIVE responses will come
after the response indicating that ACA has been entered.  The
approach to this should probably match (1) - if the target is
required to wait for outstanding responses on other connections
in (1), we should require that the target MUST receive
acknowledgement of the response indicating establishment of ACA
before issuing any ACA ACTIVE status responses on any connection
other than the one that response used.  This "MUST" would be an
additional requirement to RFC 3720 for multi-connection targets.

The resulting delay in issuing ACA ACTIVE responses seems to be
less of a concern, as the initiator can figure out what's
going to happen:
- Based on (1), the initiator has responses to the completed
  commands.
- Uncompleted commands prior to ExpCmdSN in the response that
  established ACA were delivered to SCSI and have been
  blocked by the ACA.
- All other commands will receive ACA ACTIVE status responses,
  as they were not delivered to SCSI when the ACA occurred.
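
The three cases above can be written down as a small classification
routine.  This is only a sketch: it assumes the wait in (1) is in force
(so every completed command already has its response), and the names Fate
and classify are hypothetical.

```python
from enum import Enum

MOD = 1 << 32  # CmdSN is a 32-bit counter that wraps


def sn_lt(a: int, b: int) -> bool:
    """True if serial number a precedes b, allowing for 32-bit wraparound."""
    return a != b and ((b - a) % MOD) < (1 << 31)


class Fate(Enum):
    COMPLETED = "response already received"
    BLOCKED = "delivered to SCSI, blocked by the ACA"
    WILL_GET_ACA_ACTIVE = "not delivered, expect ACA ACTIVE"


def classify(cmd_sn: int, response_received: bool, aca_exp_cmd_sn: int) -> Fate:
    """Classify one outstanding command when the ACA response arrives.

    aca_exp_cmd_sn is the ExpCmdSN carried in the response that
    established ACA.
    """
    if response_received:
        return Fate.COMPLETED
    if sn_lt(cmd_sn, aca_exp_cmd_sn):
        return Fate.BLOCKED  # target had already delivered it to SCSI
    return Fate.WILL_GET_ACA_ACTIVE
```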

If we don't require the wait in (1), initiator use of NOP on
other connections will flush out delayed ACA ACTIVE responses,
but still leave a multi-connection initiator who wants to use
ACA with the surprise that ACA ACTIVE responses can show up
before the response indicating that an ACA has occurred.  This
will result in a requirement on initiators to hold ACA ACTIVE
status responses until the SCSI response indicating that ACA
has occurred is delivered.  This is an additional requirement
to RFC 3720, and may require initiators to track NACA on all
outstanding commands to determine whether an Autosense response
caused ACA or not.  This is probably better left to SCSI.
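
For a sense of what that initiator-side requirement would involve, here is
a rough sketch.  The ResponseHolder class is hypothetical, commands are
keyed by Initiator Task Tag, and it assumes a single initiator whose only
source of ACA ACTIVE is its own ACA.  Resetting the state after CLEAR ACA
is not modelled.

```python
from typing import Dict, List, Tuple

CHECK_CONDITION = 0x02  # SCSI status code
ACA_ACTIVE = 0x30       # SCSI status code


class ResponseHolder:
    """Holds early ACA ACTIVE responses until the ACA response is seen."""

    def __init__(self) -> None:
        self.naca_by_itt: Dict[int, bool] = {}  # NACA bit per outstanding command
        self.aca_known = False
        self.held: List[Tuple[int, int]] = []   # (itt, status) not yet delivered

    def command_sent(self, itt: int, naca: bool) -> None:
        self.naca_by_itt[itt] = naca

    def response_received(self, itt: int, status: int) -> List[Tuple[int, int]]:
        """Return the responses to hand to SCSI now, in delivery order."""
        naca = self.naca_by_itt.pop(itt, False)
        if status == ACA_ACTIVE and not self.aca_known:
            self.held.append((itt, status))  # arrived ahead of the ACA response
            return []
        if status == CHECK_CONDITION and naca:
            # This response established ACA: deliver it first, then the held ones.
            self.aca_known = True
            released, self.held = self.held, []
            return [(itt, status)] + released
        return [(itt, status)]
```

The per-command NACA bookkeeping is exactly the extra tracking mentioned
above.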

IMHO, something needs to be done here, as delivery of an ACA
ACTIVE response to SCSI before the response indicating ACA
establishment is pretty clearly wrong from a SCSI perspective.

Between (1) and (2) it's probably a good idea to advise
initiator implementers that the synchronous nature of ACA
makes it a poor fit with the concurrency of multi-connection
iSCSI sessions.

-- (3) A command issued prior to establishment of ACA could
arrive after ACA is established.

For a command without the ACA attribute, this is not an issue -
such a command will be rejected with ACA ACTIVE status if ACA
is in effect when it is due to be delivered to SCSI at the
target.

For a command with the ACA attribute, the situation is less
clear, courtesy of the text quoted from SAM-2 above.  I talked
with at least one other SCSI expert at T10 last week, and the
two of us concur that this initiator behavior (initiator issues
a command with the ACA attribute without knowing that ACA is
in effect at the target) is questionable at best and probably
wrong (and hence if the target unexpectedly executes the
command because an ACA occurred, the initiator got what it
deserved).

Hence I suggest guidance (for both single and multi-connection
sessions) that an initiator SHOULD NOT issue commands with the
ACA attribute unless it knows that ACA is in effect at a target,
because targets will process all commands with the ACA attribute
if they are received while ACA is in effect, irrespective of
whether an initiator may have issued such a command prior to
ACA being established.  I believe this target behavior is
permitted by the text quoted from SAM-2 above.  This is a
general SCSI matter, and is not a change from RFC 3720.

One of the reasons for taking this approach is that determining
what ACA command to issue could well depend on what went wrong
to cause the CHECK CONDITION in the first place, and an
initiator that thinks it knows all the possible ways in which a
target could get into a CHECK CONDITION is probably kidding
itself in most cases.  It's better in general to wait for the
ACA, look at the sense data and figure out what to do rather
than blindly fire an ACA command into the dark based on a guess
about what will cause the ACA.

-- (4) The CLEAR ACA task management function could get ahead of
a command that is dealing with the ACA condition.

The corresponding race for exiting ACA - issue CLEAR ACA while a
command with the ACA attribute is outstanding - seems less of a
concern. An initiator who has done this clearly doesn't care
whether that command (and there can only be one) executes or not.

SAM-2 says that CLEAR ACA has the side effect of aborting a task
with the ACA attribute.  iSCSI's logical ordering of command (and
task management function) delivery to SCSI at the target ensures
that this abort side effect will take place, but it's probably
worth pointing out that the ordering responsibilities in Section
10.5.1 of RFC 3720 apply to this situation:
- The target MUST wait for all tasks prior to the CLEAR ACA
  function to arrive in order to issue the ACA ACTIVE response
  (non ACA task) or abort them (ACA task).
- The initiator MUST NOT deliver responses from affected
  tasks (all tasks prior to the CLEAR ACA function) to
  SCSI after the response to CLEAR ACA is delivered.

An initiator that wishes to deal with ACA as expeditiously as
possible should (lower case):
- Issue all commands with the ACA attribute as well as the
  CLEAR ACA task management function on the same connection
  as the SCSI response that indicated establishment of ACA.
- Not issue any commands without the ACA attribute prior to
  the CLEAR ACA task management function.
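
The target-side rule above (wait for every earlier task, then answer or
abort it) might be sketched as follows.  The function name and the way
arrivals are represented are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def resolve_clear_aca(
    pending_cmd_sns: List[int],
    arrived: Dict[int, bool],
) -> Optional[List[Tuple[int, str]]]:
    """Decide what happens to tasks that precede a CLEAR ACA function.

    pending_cmd_sns: CmdSNs earlier than the CLEAR ACA that have not yet
        been delivered to SCSI.
    arrived: CmdSN -> True if the arrived command has the ACA attribute,
        False if it does not; a missing key means "not arrived yet".
    Returns None while the target still has to wait, otherwise a list of
    (CmdSN, action) pairs.
    """
    if any(sn not in arrived for sn in pending_cmd_sns):
        return None  # MUST wait for all earlier tasks to arrive
    return [
        (sn, "abort" if arrived[sn] else "ACA ACTIVE")
        for sn in pending_cmd_sns
    ]
```

An initiator that follows the two "should" items keeps pending_cmd_sns
short, which is what makes the exit from ACA quick.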

Comments? (and sorry for the length)

Thanks,
--David
----------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
black_david@emc.com        Mobile: +1 (978) 394-7754
----------------------------------------------------


_______________________________________________
Ips mailing list
Ips@ietf.org
https://www1.ietf.org/mailman/listinfo/ips

