David and all,

It took me a while to think and consult with others on this topic;
apologies for the delay (we are even: an almost equally long response
follows).

It concerns me that the iSCSI layer (initiator or target) should make
ordering decisions based on SCSI payload contents (e.g. ACA ACTIVE)
defined outside of the iSCSI spec. I am concerned that such an
approach leaves iSCSI having to continuously catch up with SCSI spec
extensions, not to mention the layer-crossing issue and the
unnecessary complexity in the iSCSI layer.

OTOH, I would like us to address the ACA issue cleanly for
multi-connection iSCSI sessions from an architecture perspective so we
don't have to keep coming back to this - essentially making it
possible for implementations to fully support the feature. I also
hope this can be minimally incremental in functionality terms relative
to RFC 3720 requirements.

In short, I propose the following:

The SCSI protocol layer informs the SCSI transport layer, at the time
of the "Send Command Complete" protocol data service (SAM-2, clause
5.4.2) and "Task Management Function Executed" (SAM-2, clause 6.9)
invocations, of a "Response Fence" associated with the status in
question. The Response Fence flag instructs the SCSI transport layer
that the following two conditions must be met:

(1) A response with Response Fence must chronologically be delivered
    after all the "preceding" responses on the I_T nexus, whenever
    the preceding responses are delivered.

(2) A response with Response Fence must chronologically be delivered
    before all the "following" responses on the I_T nexus, whenever
    the following responses are delivered.

Note that there is no new reliable delivery requirement here, nor is
there a requirement to flush the transport connection - just that
chronological delivery is mandatory for this status message.
And how this chronological sequencing is ensured is really
transport-specific - for a single channel FCP-based nexus, it is
merely an uninteresting hint, whereas for a multi-connection iSCSI
session, it triggers a flush of all connections.

Why does this matter to iSCSI? iSCSI could use Response Fence when
the Response Fence flag is set by the SCSI layer on the following SCSI
completion messages handed down to the transport layer:

(1) The first completion message carrying the UA after the multi-task
    abort on the issuing and third-party sessions
(2) The TMF Response carrying the Abort Task Set/Clear Task Set
    response on the issuing session
(3) The completion message indicating ACA establishment on the
    issuing session
(4) The first completion message carrying the ACA ACTIVE status after
    ACA establishment on issuing and third-party sessions
(5) The TMF Response carrying the Clear ACA response on the issuing
    session
(6) Any other TBD SCSI response that SCSI architecture documents
    define in future.

In all these cases with Response Fence, the target iSCSI layer does
the following:

a) If it is a single-connection session, no special processing is
   required. The standard SCSI Response PDU build process happens.

b) If it is a multi-connection session, the target iSCSI layer takes
   note of the last-sent and unacknowledged StatSN on each of the
   connections in the iSCSI session, and waits for acknowledgement
   (it may solicit acknowledgement by way of a Nop-In) of each such
   StatSN to clear the fence. All further status processing is
   resumed only after clearing the fence.

Note that step (b) above is already required for task set TMF
responses by RFC 3720. The "Response Fence" notion generalizes such
behavior for a class of messages flagged with the "Response Fence"
qualifier from the SCSI layer.

Comments are welcome.
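Steps (a) and (b) can be sketched roughly as below. This is only an illustrative model, not an iSCSI implementation: the names Connection, TargetSession, submit and ack are hypothetical, StatSN wraparound is ignored, and the sketch assumes the fenced response itself is sent as soon as the fence clears, with any responses queued behind it following in submission order. Whether the fenced response must itself be acknowledged before later responses go out on other connections is left open here.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Per-connection status numbering (hypothetical model)."""
    cid: int
    next_stat_sn: int = 1  # StatSN the next response on this connection carries
    exp_stat_sn: int = 1   # latest ExpStatSN seen: StatSNs below it are acked


class TargetSession:
    """Target-side handling of responses flagged with a Response Fence."""

    def __init__(self, cids):
        self.conns = {cid: Connection(cid) for cid in cids}
        self.fence = None   # {cid: ExpStatSN needed} while a fence is pending
        self.held = []      # responses queued behind the fence, in order
        self.wire = []      # (cid, StatSN, label) in the order actually sent
        self.nop_ins = []   # cids on which a Nop-In solicited an ack

    def _send(self, cid, label):
        conn = self.conns[cid]
        self.wire.append((cid, conn.next_stat_sn, label))
        conn.next_stat_sn += 1

    def submit(self, cid, label, fence=False):
        """SCSI layer hands down a response; fence=True marks a Response Fence."""
        if self.fence is not None:
            # Status processing is suspended until the fence clears.
            self.held.append((cid, label, fence))
            return
        if fence and len(self.conns) > 1:
            # Step (b): note the unacknowledged StatSNs on every connection.
            waiting = {c.cid: c.next_stat_sn for c in self.conns.values()
                       if c.exp_stat_sn < c.next_stat_sn}
            if waiting:
                self.fence = waiting
                # The fence for this response is satisfied once it clears,
                # so it is queued as an ordinary response.
                self.held.append((cid, label, False))
                self.nop_ins.extend(sorted(waiting))
                return
        # Step (a), or nothing outstanding: standard response processing.
        self._send(cid, label)

    def ack(self, cid, exp_stat_sn):
        """Record an ExpStatSN received from the initiator on connection cid."""
        conn = self.conns[cid]
        conn.exp_stat_sn = max(conn.exp_stat_sn, exp_stat_sn)
        if self.fence is None:
            return
        if all(self.conns[c].exp_stat_sn >= need
               for c, need in self.fence.items()):
            pending, self.held, self.fence = self.held, [], None
            for item in pending:
                self.submit(*item)  # a held fenced response may fence again


session = TargetSession([0, 1])
session.submit(0, "GOOD status, command A")
session.submit(1, "CHECK CONDITION, ACA established", fence=True)
session.submit(0, "ACA ACTIVE, command B")
session.ack(0, 2)  # initiator acknowledges StatSN 1 on connection 0
```

In this run the ACA establishment response is held until the earlier status on connection 0 is acknowledged, and the ACA ACTIVE status is held behind it.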
Mallikarjun

--- Black_David@emc.com wrote:

> An Implementer working on ACA (Auto-Contingent Allegiance)
> has pointed out some issues that could use some attention
> in the implementers guide. The SCSI reference for this is
> Section 5.9.1 of SAM-2. NOTE: This message is not written
> in my role as WG Chair.
>
> A quick introduction is that ACA is an alternative behavior
> for error handling (CHECK CONDITION) to the more familiar
> behavior of CA (Contingent Allegiance) that is immediately
> exited via Autosense. ACA is requested on a per command
> basis via the NACA bit in the CDB. The Autosense transmission
> of sense data still occurs as a result of CHECK CONDITION
> (Autosense is required by iSCSI), but unlike CA, the ACA
> condition persists until it is cleared by a CLEAR ACA task
> management function. During ACA, the initiator uses commands
> with the ACA attribute (cf Section 10.3 of RFC 3720) to
> deal with the situation. One ACA command at a time is
> permitted, and non ACA commands are rejected with an "ACA
> ACTIVE" status.
>
> The change in command processing behavior when entering ACA
> raises a number of concurrency issues, as SAM-2 is specified
> as if the initiator and target interact synchronously, which
> is not the case for iSCSI. SAM-2 contains the following
> somewhat cryptic text in Section 5.9.1.2 whose consequences
> the implementers guide could help explain:
>
>    If the SCSI transport protocol does not enforce state
>    synchronization as described in 4.6.2, there may be a
>    time delay between the occurrence of the CA or ACA
>    condition and the time at which the application client
>    becomes aware of the condition.
>
> iSCSI does not enforce state synchronization. Further, in a
> multi-connection session, iSCSI does not support status
> sequencing (order of delivery of status/responses) across
> connections.
> So, for a multi-connection session, there are
> (at least) 4 interesting potential concurrency conditions:
>
> (1) The SCSI response indicating that CHECK CONDITION has
>     caused establishment of ACA could get ahead of a response
>     to a command completed prior to establishment of ACA.
>
> (2) An ACA ACTIVE status response could get ahead of the SCSI
>     response that indicates establishment of ACA.
>
> (3) A command issued prior to establishment of ACA could
>     arrive after ACA is established.
>
> (4) The CLEAR ACA task management function could get ahead of
>     a command that is dealing with the ACA condition.
>
> Here are some initial thoughts on how to deal with these
> situations:
>
> -- (1) The SCSI response indicating that CHECK CONDITION has
>     caused establishment of ACA could get ahead of a response
>     to a command completed prior to establishment of ACA.
>
> This is only a concern in multi-connection sessions, as in a
> single-connection session, the response indicating ACA
> establishment will come after responses to all prior commands.
> This behavior can be preserved for multi-connection sessions
> by saying that the target MUST wait for acknowledgement of all
> responses outstanding on other connections at the time of ACA
> establishment before sending the response that indicates ACA
> has been established. This "MUST" would be an additional
> requirement to RFC 3720 for multi-connection targets.
>
> This seems like a useful thing to require, as one of the first
> things an initiator will probably want to do on learning
> that ACA has happened is figure out what commands completed
> successfully prior to ACA (vs. are still at the target, and
> hence have been blocked by the ACA), and only the target knows
> what statuses are outstanding against the other connections.
>
> OTOH, it causes delays, so the alternative is to leave the
> initiator to sort this out, which may entail issuing NOPs
> to determine command status on other connections. That
> would have the advantage of limiting the impact of ACA
> support to multi-connection initiators that actually want
> to use ACA, but the need to use the NOPs is unpleasant,
> leading me to favor having the target get this right
> (initiators who don't like the resulting delay in entering
> ACA shouldn't use it).
>
> I don't think ignoring the issue causes any SCSI specification
> violations, but it may make ACA error handling behave badly
> as command completions will be unexpectedly received during ACA.
>
> -- (2) An ACA ACTIVE status response could get ahead of the SCSI
>     response that indicates establishment of ACA.
>
> Again, this is only a concern in multi-connection sessions, as
> in a single-connection session, ACA ACTIVE responses will come
> after the response indicating that ACA has been entered. The
> approach to this should probably match (1) - if the target is
> required to wait for outstanding responses on other connections
> in (1), we should require that the target MUST receive
> acknowledgement of the response indicating establishment of ACA
> before issuing any ACA ACTIVE status responses on any connection
> other than the one that response used. This "MUST" would be an
> additional requirement to RFC 3720 for multi-connection targets.
>
> The resulting delay in issuing ACA ACTIVE responses seems to be
> less of a concern, as the initiator can figure out what's
> going to happen:
> - Based on (1), the initiator has responses to the completed
>   commands,
> - Uncompleted commands prior to ExpCmdSN in the response that
>   established ACA were delivered to SCSI and have been
>   blocked by the ACA.
> - All other commands will receive ACA ACTIVE status responses,
>   as they were not delivered to SCSI when the ACA occurred.
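The three-way split above could be computed on the initiator along the following lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, the target is assumed to follow the waits proposed in (1) and (2), the outstanding commands are assumed not to carry the ACA attribute, and CmdSN comparison uses 32-bit serial number arithmetic as RFC 3720 specifies for its sequence numbers.

```python
def sn_lt(a, b):
    """True if sequence number a precedes b in 32-bit serial arithmetic."""
    return a != b and ((b - a) & 0xFFFFFFFF) < 0x80000000


def classify_outstanding(issued, completed, aca_exp_cmd_sn):
    """Predict the fate of each issued CmdSN once ACA is reported.

    issued         -- CmdSNs sent by the initiator (non-ACA commands)
    completed      -- CmdSNs whose responses have already arrived
    aca_exp_cmd_sn -- ExpCmdSN carried by the response that established ACA
    """
    fate = {}
    for cmd_sn in issued:
        if cmd_sn in completed:
            fate[cmd_sn] = "completed"
        elif sn_lt(cmd_sn, aca_exp_cmd_sn):
            # Delivered to SCSI before the ACA, so now blocked by it.
            fate[cmd_sn] = "blocked by ACA"
        else:
            # Not yet delivered to SCSI when the ACA occurred.
            fate[cmd_sn] = "will get ACA ACTIVE"
    return fate


fate = classify_outstanding([10, 11, 12, 13], {10}, 12)
```

Here command 10 is already complete, command 11 is held at the target, and commands 12 and 13 should come back with ACA ACTIVE.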
>
> If we don't require the wait in (1), initiator use of NOP on
> other connections will flush out delayed ACA ACTIVE responses,
> but still leave a multi-connection initiator who wants to use
> ACA with the surprise that ACA ACTIVE responses can show up
> before the response indicating that an ACA has occurred. This
> will result in a requirement on initiators to hold ACA ACTIVE
> status responses until the SCSI response indicating that ACA
> has occurred is delivered. This is an additional requirement
> to RFC 3720, and may require initiators to track NACA on all
> outstanding commands to determine whether an Autosense response
> caused ACA or not. This is probably better left to SCSI.
>
> IMHO, something needs to be done here, as delivery of an ACA
> ACTIVE response to SCSI before the response indicating ACA
> establishment is pretty clearly wrong from a SCSI perspective.
>
> Between (1) and (2) it's probably a good idea to advise
> initiator implementers that the synchronous nature of ACA
> makes it a poor fit with the concurrency of multi-connection
> iSCSI sessions.
>
> -- (3) A command issued prior to establishment of ACA could
>     arrive after ACA is established.
>
> For a command without the ACA attribute, this is not an issue -
> such a command will be rejected with ACA ACTIVE status if ACA
> is in effect when it is due to be delivered to SCSI at the
> target.
>
> For a command with the ACA attribute, the situation is less
> clear, courtesy of the text quoted from SAM-2 above. I talked
> with at least one other SCSI expert at T10 last week, and the
> two of us concur that this initiator behavior (initiator issues
> a command with the ACA attribute without knowing that ACA is
> in effect at the target) is questionable at best and probably
> wrong (and hence if the target unexpectedly executes the
> command because an ACA occurred, the initiator got what it
> deserved).
>
> Hence I suggest guidance (for both single and multi-connection
> sessions) that an initiator SHOULD NOT issue commands with the
> ACA attribute unless it knows that ACA is in effect at a target,
> because targets will process all commands with the ACA attribute
> if they are received while ACA is in effect, irrespective of
> whether an initiator may have issued such a command prior to
> ACA being established. I believe this target behavior is
> permitted by the text quoted from SAM-2 above. This is a
> general SCSI matter, and is not a change from RFC 3720.
>
> One of the reasons for taking this approach is that determining
> what ACA command to issue could well depend on what went wrong
> to cause the CHECK CONDITION in the first place, and an
> initiator that thinks it knows all the possible ways in which a
> target could get into a CHECK CONDITION is probably kidding
> itself in most cases. It's better in general to wait for the
> ACA, look at the sense data and figure out what to do rather
> than blindly fire an ACA command into the dark based on a guess
> about what will cause the ACA.
>
> -- (4) The CLEAR ACA task management function could get ahead of
>     a command that is dealing with the ACA condition.
>
> The corresponding race for exiting ACA - issue CLEAR ACA while a
> command with the ACA attribute is outstanding - seems less of a
> concern. An initiator who has done this clearly doesn't care
> whether that command (and there can only be one) executes or not.
>
> SAM-2 says that CLEAR ACA has the side effect of aborting a task
> with the ACA attribute.
> iSCSI's logical ordering of command (and
> task management function) delivery to SCSI at the target ensures
> that this abort side effect will take place, but it's probably
> worth pointing out that the ordering responsibilities in Section
> 10.5.1 apply to this situation:
> - The target MUST wait for all tasks prior to the CLEAR ACA
>   function to arrive in order to issue the ACA ACTIVE response
>   (non ACA task) or abort them (ACA task).
> - The initiator MUST NOT deliver responses from affected
>   tasks (all tasks prior to the CLEAR ACA function) to
>   SCSI after the response to CLEAR ACA is delivered.
> An initiator that wishes to deal with ACA as expeditiously as
> possible should (lower case):
> - Issue all commands with the ACA attribute as well as the
>   CLEAR ACA task management function on the same connection
>   as the SCSI response that indicated establishment of ACA.
> - Not issue any commands without the ACA attribute prior to
>   the CLEAR ACA task management function.
>
> Comments? (and sorry for the length)
>
> Thanks,
> --David
> ----------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 176 South St., Hopkinton, MA 01748
> +1 (508) 293-7953 FAX: +1 (508) 293-7786
> black_david@emc.com Mobile: +1 (978) 394-7754
> ----------------------------------------------------
>
> _______________________________________________
> Ips mailing list
> Ips@ietf.org
> https://www1.ietf.org/mailman/listinfo/ips