According to Wohlin et al. “experimentation is not something simple, we need to prepare, conduct and analyze
experiments properly” [Wohlin:2012].
Therefore, the experimental protocol presented in this process is intended to assist in constructing the usability
evaluation protocol and in conducting a successful experiment.
The protocol follows the steps presented by Wohlin et al. and Jedlitschka et al. with the addition of some information
regarding the evaluation of usability.
1. Experimental Scoping
1.1. Experimental Goals
Study Object (Activity E4)
HCI Evaluation: describe the usability evaluation that was executed (Activity P5)
The scoping follows the framework by Wohlin:
The scope of the experiment is set by defining its goals. The purpose of a goal definition is to ensure that
important aspects of an experiment are defined before planning and execution take place. The goal of the
experiment is defined according to the template below [Wohlin:2012]:
Analyze <Object(s) of study>
for the purpose of <Purpose>
with respect to their <Quality focus>
from the point of view of the <Perspective>
in the context of <Context>
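As an illustration, the goal template above can be filled in programmatically. The following is a minimal sketch in
Python; the class name ExperimentGoal and all sample values are hypothetical and are not part of the Usa-DSL process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentGoal:
    """One field per slot of the goal template (field names are illustrative)."""
    object_of_study: str
    purpose: str
    quality_focus: str
    perspective: str
    context: str

    def render(self) -> str:
        # Reproduce the five template lines in their original order.
        return "\n".join([
            f"Analyze {self.object_of_study}",
            f"for the purpose of {self.purpose}",
            f"with respect to their {self.quality_focus}",
            f"from the point of view of the {self.perspective}",
            f"in the context of {self.context}",
        ])


# Hypothetical values, for illustration only.
goal = ExperimentGoal(
    object_of_study="a new DSL and its previous version",
    purpose="evaluation",
    quality_focus="usability",
    perspective="researcher",
    context="students modeling a use scenario",
)
print(goal.render())
```

Keeping one field per template slot makes it easy to check that no part of the goal has been left undefined before
planning starts.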
What do you want to measure? (Activity P6)
Questions about the research.
Which metrics do you want to analyze?
2. Experiment Planning
2.1. Context Selection (Activity P9)
Off-line vs. On-line
Student vs. Professional
Toy vs. Real Problems
Specific vs. General
2.2. Hypothesis Formulation (Activity E4)
The experiment definition is formalized into hypotheses. Two hypotheses have to be formulated:
Null Hypothesis - A null hypothesis is represented by H0. This hypothesis states that there are no real underlying
trends or patterns in the scenario of the experiment, and that the only reasons for the differences found in our
observations are coincidental.
The null hypothesis is the one that the experimenter wants to reject with as high a significance as possible. An
example is the hypothesis that a new DSL leads, on average, to the same number of failures as the DSL it is being
compared with, i.e.
$H_0: \mu_{N_{old}} = \mu_{N_{new}}$, where $\mu$ denotes the mean and $N$ is the number of
faults found [Wohlin:2012].
Alternative Hypothesis - An alternative hypothesis, represented by Ha, H1, etc., is the hypothesis that is accepted
when the null hypothesis is rejected. For example, when evaluating a new DSL, or a new version of a given DSL, if the
average number of failures found is lower than with the old DSL or the old version, the null hypothesis is rejected
and the alternative hypothesis is accepted, i.e. $H_1: \mu_{N_{new}} < \mu_{N_{old}}$.
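To make the two hypotheses concrete, the sketch below tests the one-sided alternative that the new DSL leads to fewer
failures on average than the old one. It uses an exact permutation test, which is only one possible choice and is not
prescribed by this protocol. The function name, the failure counts and the significance level are assumptions made
for illustration.

```python
from itertools import combinations


def mean(values):
    return sum(values) / len(values)


def one_sided_permutation_test(old, new):
    """Exact p-value for the alternative mean(new) < mean(old).

    Under H0 the group labels are exchangeable, so every split of the
    pooled observations into groups of the original sizes is equally
    likely. The p-value is the share of splits whose mean difference
    (old - new) is at least as large as the observed one.
    """
    pooled = list(old) + list(new)
    observed = mean(old) - mean(new)
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for old_idx in combinations(indices, len(old)):
        chosen = set(old_idx)
        group_old = [pooled[i] for i in chosen]
        group_new = [pooled[i] for i in indices if i not in chosen]
        if mean(group_old) - mean(group_new) >= observed:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical failure counts per participant (illustrative, not real data).
failures_old = [9, 8, 10, 7, 9, 8]
failures_new = [5, 6, 4, 6, 5, 7]

p_value = one_sided_permutation_test(failures_old, failures_new)
alpha = 0.05  # assumed significance level
print(f"p = {p_value:.4f}; reject H0: {p_value < alpha}")
```

Enumerating every split is feasible only for small groups such as these; larger samples would need a sampled
permutation test or a parametric test chosen to fit the data.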
2.3. Variables Selection (Activity E4)
Before any design can start we have to choose the dependent and independent variables [Wohlin:2012].
2.4. Selection of Subjects (Activity P1)
Determined by Activity P1 of the Usa-DSL Process.
2.5. Experimental Design (Activity E4)
2.5.1. General Design Principles
When designing the experiment, the general design principles should be considered. These principles are
randomization, blocking and balancing, as well as their combination.
Randomization - “The randomization applies on the allocation of the objects, subjects and in which order the tests
are performed. Randomization is also used to select subjects that are representative of the population of interest”
[Wohlin:2012].
Blocking - Blocking is used to systematically eliminate an undesired effect in the comparison among the treatments
(DSL). This technique increases the precision of the experiment. For example, when the participants of the
experiment differ in experience, experience should be used as a blocking factor, that is, to separate the
participants into groups. If some participants have experience with DSLs in a particular domain and some do not,
then, to minimize the effect of experience, the participants are grouped into two blocks (groups): one with DSL
experience and one without.
Balancing - A design is balanced when each treatment is assigned the same number of subjects. This happens when the
Process Executor distributes the total number of participants equally among the groups.
“Balancing is desirable because it both simplifies and strengthens the statistical analysis of the data, but is not
necessary” [Wohlin:2012].
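The three principles can be combined in a single assignment step. The sketch below is a minimal illustration in
Python: participants are grouped into blocks, shuffled within each block (randomization) and then dealt out to the
treatments in turn (balancing). The function name, participant identifiers and block labels are hypothetical.

```python
import random
from collections import defaultdict


def assign_treatments(participants, treatments, seed=None):
    """Randomly assign treatments within each block, as evenly as possible.

    participants: list of (participant_id, block_label) pairs.
    Returns a dict mapping participant_id -> treatment.
    """
    rng = random.Random(seed)

    # Blocking: group participants by their block label.
    blocks = defaultdict(list)
    for pid, block in participants:
        blocks[block].append(pid)

    assignment = {}
    for members in blocks.values():
        # Randomization: the order within each block is random.
        rng.shuffle(members)
        for position, pid in enumerate(members):
            # Balancing: cycle through the treatments so that group
            # sizes within a block differ by at most one.
            assignment[pid] = treatments[position % len(treatments)]
    return assignment


# Hypothetical participants, blocked by prior DSL experience.
participants = [
    ("p1", "experienced"), ("p2", "experienced"),
    ("p3", "experienced"), ("p4", "experienced"),
    ("p5", "novice"), ("p6", "novice"),
    ("p7", "novice"), ("p8", "novice"),
]
assignment = assign_treatments(participants, ["old DSL", "new DSL"], seed=42)
for pid, treatment in sorted(assignment.items()):
    print(pid, treatment)
```

Fixing the seed makes the allocation reproducible, which can be useful when the assignment has to be documented in
the experiment package.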
2.5.2. Standard Design Types (Activity E4)
One factor with two treatments - In this experiment, we want to compare the two treatments against each other. The
treatments are the new DSL and the old DSL; the factor is the DSL.
One factor with more than two treatments - In this experiment, we want to compare all the treatments against each
other, and the participants are assigned randomly to the treatments. The factor is the DSL.
2.6. Instrumentation
Instruments of the experiment (Activities P3, P7 and P8)
Profile data (Activities P1 and E1)
2.7. Validity Evaluation (Activity E4)
A fundamental question concerning the results of an experiment is how valid they are. Four types of threats to
validity are considered: conclusion, internal, construct and external.
Conclusion Validity: This validity concerns issues that affect the ability to draw a correct conclusion about the
relationship between the treatments (DSL) and the results of the study.
Internal Validity: Threats to internal validity are influences that can affect the factors under study. If a
relationship is observed between the treatment and the results, it is important to make sure that this relationship
is causal and that it is not the result of a factor that we do not control or have not measured.
Construct Validity: This validity concerns the relationship between theory and observation. That is, if there is a
causal relationship between cause and effect, we must ensure that the treatment (DSL) reflects the cause construct
and that the results of the study reflect the effect construct. This helps to minimize measurement bias.
External Validity: This validity relates to the generalization of the study, that is, the ability to generalize the
results of the experiment beyond the scenario in which it was conducted. This validity can be affected by choices
such as the design of the experiment, the selection of unsuitable participants, an unsuitable environment, or a
timing that affects the results.
3. Operation
The operation section presents the stages of execution of the experiment. It describes the preparation and
execution of the experiment, as well as the validation of the data obtained for analysis.
3.1. Preparation
In the preparation we must consider two important aspects: selecting and informing the participants, and preparing
the material (guides, questionnaires, etc.).
3.1.1. Commit Participants
Invite participants and convince them to participate (Activity P1)
Check the availability of Participants (Activity E5)
Confirm Evaluation Date and Time (Activity E5)
Confirm Receipt of Online Questionnaires (Activity E5)
3.1.2. Instrumentation
Select the Instruments Data Gathering (Activity P7)
Select the Instruments Training (Activity P8)
Select the Informed Consent Term (Activity P2)
Evaluation Place Definition (Activity P9)
Organize the Instruments and Equipment for Evaluation (Activity E5)
Choose the Data Storage (Activity P10)
Analyze the documentation to be used in the evaluation (Activity A11)
Review the Evaluation Study Protocol (Activity E4)
3.2. Execution
In this stage, the previously planned activities are carried out, together with those listed below.
3.2.1. Data Collection
Introduce Consent Term or Introduce Data Access Agreement (Activity E2)
Collect Signature of Subject (Activity E2)
Apply Profile Questionnaire (Activity E1)
Complete Pre-Evaluation Questionnaire (Activity E1)
Deliver the DSL Guide (Activity E8)
Deliver the Use Scenario (Activity E8)
Conduct DSL Training (Activity E8)
Perform Modeling of the Use Scenario (Activity E9)
Complete Post-Evaluation Questionnaire (Activity E9)
4. Analysis and Interpretation
In this section, all the data collected are analyzed: the profile of the participants, the tasks performed, the
post-evaluation questionnaire, among others. The data should be tabulated so that conclusions on the results of the
study can be drawn later and the knowledge of the participants can be correlated with the data collected.
The interpretation of the data starts with descriptive statistics, which characterize and visualize central
tendency, dispersion, etc. Next, the data set is reduced by checking for abnormal or false data points. Finally,
hypothesis testing is carried out to determine whether the null hypothesis can be rejected [Wohlin:2012].
Analyze the Profile of the Subjects (Activity A1)
Analyze the data collected during the evaluation (Activity A7)
Analyze the images and logs (Activity A7)
Analyze the developed models (Activity A9)
Verify the error rate of the participants (Activity A9)
Verify the incomplete tasks (Activity A9)
Perform data standardization (Activity A7)
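The first two analysis steps described in this section, descriptive statistics and the check for abnormal data
points, can be sketched as follows. This is a minimal illustration in Python using only the standard library. The
choice of Tukey's interquartile-range rule for flagging candidate outliers is an assumption, and the task times are
hypothetical.

```python
import statistics


def describe(values):
    """Central tendency and dispersion for one measured variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def flag_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical task completion times in minutes (illustrative, not real data).
task_times = [12, 14, 13, 15, 11, 14, 13, 45]

print(describe(task_times))
print("Candidate outliers:", flag_outliers(task_times))
```

Flagged points are only candidates: each one should be inspected against the logs and observations before deciding
whether it is a false data point to be removed or a genuine result to be kept.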
5. Presentation and Package
After completing the experiment, everything that was produced during the process must be stored, such as the
protocol, guidelines, artifacts and data, among others. The results should also be presented to different audiences,
for example as a conference or journal paper, as a report for decision-makers, or as educational material.
This process suggests the structure proposed by Jedlitschka and Pfahl for reporting experiments, which can be seen
in the Experiment Report.
Present the evaluation according to the scientific paper template (Activity R11)
Present the evaluation according to the report template (Activity R11)