
RE: iSCSI ACA - Implementers Guide (long, sorry)



Consolidating responses to Eddy and Bill.  The most important piece
of this message is the last paragraph (repeated here):

  ACA and multi-connection iSCSI sessions are a lousy fit.  Would
  it be worth saying that ACA SHOULD NOT be used on multi-connection
  iSCSI sessions and initiators are responsible for dealing with
  the response ordering issues that arise when ACA is used on
  a multi-connection session?  That would be mostly consistent
  with RFC 3720 and impose no ACA implementation burden on
  targets.

Please comment.

--- Eddy Quicksall wrote: 

> If the target is going to have to wait for acknowledgement of all responses
> outstanding on other connections then wouldn't it have to use a NOP-In to
> solicit the next ExpStatSN?  (because it can't be guaranteed that the
> initiator will be sending another command). I don't see any big deal in the
> but just thought it would be worth mentioning.

Yes, that's definitely worth mentioning.

> I would think that the delay mentioned would not be much of a concern 
> because this would only happen if there is a check condition followed by 
> recovery action in which case performance would be suffering anyway.
> 
> Also I think that the delay we introduce with this proposal is consistent 
> with the sentence in SAM-2 you have mentioned.

I agree with both of these thoughts.

--- William Studenmund wrote:

> > -- (1) The SCSI response indicating that CHECK CONDITION has
> > 	caused establishment of ACA could get ahead of a response
> > 	to a command completed prior to establishment of ACA.

> Well, the first question that comes to my mind in these cases is what 
> does Fibre Channel do in these cases on a large SAN? Or what is FC 
> doing on SANs that are bridged over IP? :-)
> 
> We shouldn't beat ourselves up to do stuff the FC doesn't. :-)

Fibre Channel doesn't have multi-connection sessions, and hence these
ACA-related reordering-across-connections problems don't arise.

> I was going to suggest a bit of a tangent here by suggesting we pull in 
> a post-SAM-2 feature and add support for the Query Task Task Management 
> Function. However I think that that function only indicates if a task 
> is in the task set or not. Hmmm... Maybe that is enough.

That's a separable item.  At some point, someone will write the draft to
update iSCSI to SAM-3 (backwards compatible - any incompatible behavior
differences have to be negotiated), and QUERY TASK would be part of that.

> I really hate all of these delays we are building into the target. If 
> we are in a MC/S situation, shouldn't we expect the initiator 
> to carry some of the load?
> 
> Put another way, if the initiator can't handle ACA cleanup, maybe it 
> shouldn't issue commands with NACA set. :-)

The issue here is that multi-connection iSCSI sessions without
target synchronization of responses add out-of-order response cases
for ACA that a SCSI initiator isn't expecting.  RFC 3720 says ACA
is entirely a SCSI-layer issue.

> I think it'd be fine for the ACA-status go through immediately as it 
> would tell the initiator that we have entered an error condition and it 
> should, for instance, immediately stop issuing commands to the target. 
> :-) It should also take steps to see what the outstanding status events 
> are on other connections.

If the initiator SCSI code assumes the device is quiesced when the
CHECK CONDITION arrives, the arrival of the successful completion for
a command that the initiator thought was going to receive ACA ACTIVE
could be confusing.  The general risk here is that initiators may have
catch-all cases in their logic where they resort to just resetting
the misbehaving device, which is not a great idea for a tape device
in the middle of a long backup.  Does anyone have direct knowledge
of initiators that use ACA for which this is a real concern?  AIX
is the usual suspect here.
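
To make the alternative to a catch-all reset concrete, here is a minimal
sketch of how initiator code might sort responses on a multi-connection
session instead of treating an unexpected status as a misbehaving device.
The function name and the returned labels are hypothetical; only the SCSI
status codes (GOOD 00h, CHECK CONDITION 02h, ACA ACTIVE 30h) come from the
standards, and the classification assumes tasks already in the task set
stay blocked until the initiator clears ACA.

```python
# SCSI status codes
GOOD = 0x00
CHECK_CONDITION = 0x02
ACA_ACTIVE = 0x30

def classify_response(status, aca_check_condition_seen):
    """Label a SCSI response for the initiator's ACA bookkeeping.

    aca_check_condition_seen is True once this initiator has received
    the CHECK CONDITION that established ACA (hypothetical state flag).
    """
    if status == ACA_ACTIVE:
        if aca_check_condition_seen:
            return "blocked-by-aca"
        # ACA ACTIVE overtook the CHECK CONDITION, or another
        # initiator caused the ACA; do not assume either.
        return "aca-active-cause-unknown"
    if aca_check_condition_seen:
        # Assumption: anything else arriving before ACA is cleared
        # finished before ACA was established and was simply
        # delivered late on another connection.
        return "completed-before-aca"
    return "normal"
```

With this kind of bookkeeping, a late GOOD status after the CHECK CONDITION
is recorded as an ordinary completion rather than as grounds for a reset.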

> The problem is that there is no way to indicate target status across 
> the whole session; status is a per-connection thing (which is usually 
> great!).
> 
> The only other thing I can think of is to add a new async event 
> indicating we entered ACA and that all status messages with StatSN less 
> than this one were completed before the event.

And issue that on every connection in the session for ACA?  IMHO, it's
probably better to wait for the status acks, issuing NOPs as needed to
get them (as Eddy notes above).
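
A rough sketch of that wait-for-status-acks approach, for discussion only
(this is not text from RFC 3720, and the function names and the per-connection
tuple layout are made up for illustration): before sending the CHECK CONDITION
response that establishes ACA, the target compares each other connection's
ExpStatSN with the StatSN of the last response it sent there, and picks out
the connections that still need a NOP-In to solicit an acknowledgement.
Comparisons use 32-bit serial number arithmetic because StatSN wraps.

```python
MOD = 1 << 32  # StatSN and ExpStatSN are 32-bit counters that wrap

def sn_lt(a, b):
    """True if serial number a precedes b in 32-bit serial arithmetic."""
    return a != b and ((b - a) % MOD) < (1 << 31)

def connections_needing_nop(conns, aca_cid):
    """Return the connection IDs that still have unacknowledged status.

    conns maps a connection ID to (last_stat_sn_sent, exp_stat_sn),
    where last_stat_sn_sent is None if no status was sent there.
    aca_cid is the connection that will carry the CHECK CONDITION.
    A response counts as acknowledged once ExpStatSN has moved past
    its StatSN.
    """
    pending = []
    for cid, (last_sent, exp_stat_sn) in sorted(conns.items()):
        if cid == aca_cid or last_sent is None:
            continue
        if not sn_lt(last_sent, exp_stat_sn):
            pending.append(cid)  # target would send a NOP-In here
    return pending

def may_send_aca_check_condition(conns, aca_cid):
    """True once every other connection has acknowledged its status."""
    return not connections_needing_nop(conns, aca_cid)
```

The target would call connections_needing_nop() when the CHECK CONDITION is
ready, send a NOP-In on each returned connection, and re-check as ExpStatSN
updates arrive.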

> > -- (2) An ACA ACTIVE status response could get ahead of the SCSI
> > 	response that indicates establishment of ACA.

> Is this issue really a special worry?
> 
> As I understand SAM-2, this can always happen. In table 25, if QErr is 
> 00b and TST is 000b, some OTHER initiator triggering ACA can put our 
> port into ACA ("all enabled tasks from all SCSI initiator ports"). So 
> there may not even be a status event with the CHECK CONDITION on any of 
> our connections. Thus we can suddenly find ourselves in ACA. :-|

Right, but what if they're not?  Also, an initiator who reacts to this
situation by retrying the task (on the assumption that the other initiator
will get out of ACA eventually) is in for a surprise when the ACA turns
out to be its own fault.  I would hope the common case for ACA is single
initiator devices or TST = 001b (task set per initiator), but I don't know.

> > Between (1) and (2) it's probably a good idea to advise
> > initiator implementers that the synchronous nature of ACA
> > makes it a poor fit with the concurrency of multi-connection
> > iSCSI sessions.
> 
> Indeed. Consider the further wrinkle of a slow initiator (or slow 
> connections) in a QErr 00b/TST 000b scenario. Another initiator could 
> have cleared ACA and RETRIGGERED ACA before the one initiator could 
> have cleared it. :-)

Again, I think ACA and TST = 000b (single task set for all initiators)
are a lousy match for multi-initiator devices.  That would also be
worth saying in the implementers guide.

> > -- (3) A command issued prior to establishment of ACA could
> > 	arrive after ACA is established.

> > For a command with the ACA attribute, the situation is less
> > clear, courtesy of the text quoted from SAM-2 above.  I talked
> > with at least one other SCSI expert at T10 last week, and the
> > two of us concur that this initiator behavior (initiator issues
> > a command with the ACA attribute without knowing that ACA is
> > in effect at the target) is questionable at best and probably
> > wrong (and hence if the target unexpectedly executes the
> > command because an ACA occurred, the initiator got what it
> > deserved).
> 
> I agree it's wrong and the initiator deserves to get its command 
> killed. :-)

Good, although in this case, the ACA command is going to execute, and
the initiator is at risk for a surprise when the target enters ACA
for an unexpected reason, instead of the reason the initiator anticipated
when it issued the ACA command in advance.

> > Comments? (and sorry for the length)
> 
> I think this is a mess. For ACA to work well, it really needs the SCSI 
> world to be different than it is normally for iSCSI. :-(

I think it's slightly more specific as noted above - ACA and
multi-connection iSCSI sessions are a lousy fit.  Would it be
worth saying that ACA SHOULD NOT be used on multi-connection
iSCSI sessions and initiators are responsible for dealing with
the response ordering issues that arise when ACA is used on
a multi-connection session?  That would be mostly consistent
with RFC 3720 and impose no ACA implementation burden on
targets.
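
For illustration, an initiator could honor such a SHOULD NOT with a check
along these lines before a command leaves the iSCSI layer. This is a sketch
with made-up names; the only detail taken from the SCSI standards is that
NACA is bit 2 of the CDB CONTROL byte. Returning a warning rather than
rejecting the command or clearing the bit is an assumption, since a SHOULD
NOT leaves the final decision to the initiator.

```python
NACA_BIT = 0x04  # NACA is bit 2 of the SCSI CDB CONTROL byte

def naca_requested(control_byte):
    """True if the CDB CONTROL byte has NACA set."""
    return bool(control_byte & NACA_BIT)

def check_aca_policy(num_connections, control_byte):
    """Return a warning string if ACA would be used on a
    multi-connection session, else None.

    num_connections is the number of connections currently in the
    iSCSI session (hypothetical input supplied by the caller).
    """
    if naca_requested(control_byte) and num_connections > 1:
        return ("NACA set on a multi-connection session: the initiator "
                "must handle out-of-order responses around ACA")
    return None
```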

Thanks,
--David
----------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
black_david@emc.com        Mobile: +1 (978) 394-7754
----------------------------------------------------

_______________________________________________
Ips mailing list
Ips@ietf.org
https://www1.ietf.org/mailman/listinfo/ips
