PD 0128: Information in Test Results for Manual Tests
- From: "Observation Decisions Review Board" <faigin@aero.org>
- Date: Mon, 30 Oct 2006 08:27:45 -0800
The following PD has just been issued by the ODRB:
PD 0128
TITLE
Information in Test Results for Manual Tests
ISSUE
ATE_FUN.1-11 requires that "The evaluator shall check that the expected test
results in the test documentation are consistent with the actual test results
provided." The CEM paragraphs for that work unit make clear that the developer
is to produce raw test results that the evaluation team can use to verify that
the actual results are consistent with the expected result in the vendor's test
procedures.
This seems fairly straightforward for automated tests that include an automated
comparison between observed results and predicted results: both the expected
and observed results are in a form that can be interpreted by the automated
comparison software.
However, for tests that require a human to compare observed results and
predicted results, it is unclear what constitutes adequate expected results and
actual results, or the means of conveying the comparison of the two.
For example, suppose each test procedure consists of actions to perform and
verification steps that state what the tester should observe. Is it sufficient
for the vendor to provide a statement of "Pass" for a particular test, together
with a handwritten checkmark indicating that a particular test or action
produced actual results that matched the expected result, along with the
tester's initials or signature?
If the intent of ATE_FUN.1-11 is that the vendor provide evidence that they ran
the test suite and that the actual results matched the expected results, the
above approach by the vendor in combination with the evaluator's analysis meets
the intent of this work unit. ATE_FUN.1-11 is intended to support ATE_FUN.1.5C,
which is: "The test results from the developer execution of the tests shall
demonstrate that each tested security function behaved as specified." The
approach of the checks in the checkboxes in the context of the overall manual
test report and other test evidence (e.g. verification steps) is sufficient to
demonstrate this requirement.
If, on the other hand, the intent of the requirement for the vendor to provide
actual results is to allow the evaluator to independently verify that the
results of the vendor's execution of the test suite demonstrate that the TOE
was correctly stimulated (which is supported by the first paragraph of the CEM
guidance for this work unit, which states, "A comparison of the actual and
expected test results provided by the developer will reveal any inconsistencies
between the results."), it is not clear how the evaluator could independently
verify actual results (or identify any inconsistencies) by comparing expected
results to a "check". Although this work unit does support the CC content
element ATE_FUN.1.5C, the checkboxes do NOT demonstrate security function
behavior; instead, the checkboxes are simply the developer's assertion that
the tests ran correctly. They do not provide the evidence necessary for the
CCTL to independently verify that expected test results match actual test
results.
RESOLUTION
The intent of CEM 2.3 work units for ATE_FUN.1-10 through 12 is to have the
evaluation team confirm that the developer was able to successfully execute
their test suite by following their written test procedures. The evaluator must
perform verification to confirm that the expected results were achieved.
ATE_FUN.1.5C states that "the test results from the developer execution of the
tests shall demonstrate that each tested security function behaved as
specified." For cases where an automated test suite operates on the data and
provides test results, the analysis that must be performed by the evaluation
team may be limited to a simple comparison.
However, for manual tests where the tester must perform some action, observe
the outcome, and compare it to the expected results (e.g., test cases that
involve viewing a screenshot or observing a pop-up window), having the
evaluators merely view a list of hand-written checkboxes is not sufficient to
satisfy the intent of the requirements. The evaluation team must analyze
whatever actual test output is available and, most probably, the test
procedures themselves. The evaluator must determine how the tester's Pass
result was generated and must determine that the condition that generates the
Pass is consistent with the expected behavior of the TOE (even if this means
looking at the test source). The evaluator need not perform these tasks for
every step of every test but must provide some justification of the efficacy of
the analysis methods employed.
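
One way to picture the distinction drawn above is a test record that carries
the observed output alongside the expected output, rather than only a Pass
mark. The following Python sketch is illustrative only: the record layout, the
field names, and the evaluator_check function are hypothetical and do not come
from the CC or the CEM. The exact string comparison stands in for whatever
judgment the evaluator actually applies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ManualTestRecord:
    """One manual test step as recorded by the developer (hypothetical format)."""
    test_id: str
    expected: str             # expected result taken from the test procedure
    observed: Optional[str]   # what the tester actually saw, if it was recorded
    tester_verdict: str       # "Pass" or "Fail", as asserted by the tester
    tester_initials: str


def evaluator_check(record: ManualTestRecord) -> str:
    """Classify a record by what an evaluator can independently confirm."""
    if record.observed is None or not record.observed.strip():
        # Only a verdict and initials: an assertion, not evidence.
        return "assertion-only"
    # Assumption: exact text match; real manual results may need
    # data reduction or human judgment before they can be compared.
    if record.observed.strip() == record.expected.strip():
        return "consistent"
    return "inconsistent"
```

A record holding only "Pass" and the tester's initials is classified as
"assertion-only", which corresponds to the case the resolution treats as
insufficient on its own. A record that also holds the observed output can be
compared against the expected result by the evaluator.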
When selecting the set of tests to be rerun for ATE_IND, the evaluation team
should take manually verified cases into account. In particular, manual cases
that involve comparison of displays should be considered for rerun, to ensure
that the proper comparison was done and that the results are consistent with
the expected results.
RATIONALE
As noted in version 2.3 paragraphs 1511 and 1514 (et al), the purpose of the
ATE_FUN family is to help ensure that the test plan and procedures are adequate
from a procedural point of view to demonstrate that, if run, they will
successfully test the TOE. Judging the adequacy of these procedures requires a
clear expectation of what will be observed during the running of these tests.
CEM version 2.3 paragraph 795 (et al) says:
    It may be that a direct comparison of actual results cannot be
    made until some data reduction or synthesis has been first
    performed. In such cases, the developer's test documentation
    should describe the process to reduce or synthesize the actual
    data.
The ODRB believes the point raised by this paragraph is that there will always
be some kind of manual verification involved in the work units of this family.
Even the most automated of tests will produce some result whose success needs
to be manually verified.