L2 Vocabulary Learning and the Use of Reading Tasks: Manipulating the Involvement Load Index

Document Type: Research Paper

Authors

1 Assistant Professor, Islamic Azad University, Najafabad Branch

2 English Department, Najafabad Branch, Islamic Azad University

Abstract

As Schmidt (2008) states, deeper engagement with new vocabulary, as induced by tasks, clearly increases the chances of learning those words. This engagement is theoretically clarified by the involvement load hypothesis (ILH; Laufer & Hulstijn, 2001), on the basis of which the involvement load index (ILI) of each task can be measured. The present study was designed to test the ILH by evaluating the impact of four different reading-based tasks on vocabulary acquisition and retention. Investigating learners' beliefs about the use and effect of the tasks they performed was another concern of this study. To this end, 120 female and male English undergraduates attending Reading Comprehension 4 at the Islamic Azad University of Najaf Abad took part in the experiment. Four intact classes were randomly assigned four different tasks. The first class completed an input-oriented task (multiple-choice questions, MCQ) with an ILI of 2. The second class completed an output-oriented task (sentence making, SM) with an ILI of 3. The third class completed an input-oriented task (multiple-choice cloze test, MCT) with an ILI of 2, and the fourth class completed an output-oriented task (blank-filling, BF) with the same involvement load as that of the second class. The results showed that all tasks were conducive to vocabulary learning and retention. However, the SM task was found to be the most effective of all. Participants' views on task effectiveness, elicited through interviews with participants chosen randomly from each group, were in line with these results. In other words, participants agreed that reading-related tasks could facilitate L2 vocabulary acquisition and retention. As for task type, they found SM easier and more conducive to vocabulary learning and retention.
The findings of this research could encourage EFL and ESL teachers to use the same task types as class activities and could provide EFL and ESL students with an effective way of learning and retaining vocabulary.

Keywords


Nowadays, the importance of learning English as a second language is remarkably on the rise. While some EFL teachers prefer to teach vocabulary directly, others prefer indirect methods. The latter maintain that extensive reading is the best way to build sufficient vocabulary knowledge. Furthermore, they think that teaching vocabulary or grammar directly is not only ineffective but also an overwhelming task that takes a great deal of class time. As Hart and Risley (2003) stated, direct vocabulary acquisition means memorizing words one after another with their respective translations, synonyms, or definitions. It is quick, but it is also superficial. Learners encounter vocabulary in an isolated, often infinitive form and remain incapable of using it correctly in context. Moreover, vocabulary learned directly sinks faster into oblivion.

In indirect vocabulary acquisition, learners encounter words together with syntactic information through context in target-language reading. Vocabulary in context often appears repeatedly in different guises and hence becomes ingrained in the learners' minds. Moreover, as reading involves highly complex cognitive processing operations, a reading task makes use of authentic and challenging texts and provides learners with a rhetorical framework for processing and analyzing the text (Nunan, 2004). Thus, indirect vocabulary learning through reading can help learners comprehend, manipulate, process, or interact in the target language while their attention is focused on meaning rather than on manipulating forms.

The involvement load hypothesis (ILH; Laufer & Hulstijn, 2001) offers a useful framework for judging the different degrees of processing that unfamiliar vocabulary receives during reading. The hypothesis adopts three measurable, operational factors to define the involvement load of a task: need, search, and evaluation. The need component is the motivational, non-cognitive dimension of involvement. It is concerned with the need to achieve something. This notion is interpreted here not in its negative sense, based on fear of failure, but in its positive sense, based on a drive to comply with the task requirements, which can be either externally imposed or self-imposed. Need is moderate when it is imposed by an external agent (for example, the need to use a word in a sentence that the teacher has asked the learners to produce) and strong when it is imposed by the learners on themselves. Moderate and strong needs subsume different degrees of drive.

Search and evaluation are the two cognitive (information-processing) dimensions of involvement, contingent upon noticing and deliberately allocating attention to the form-meaning relationship (Schmidt, 2001). Search is the attempt to find the meaning of an unknown L2 word, or the L2 word form expressing a concept, by consulting a dictionary or another authority (e.g., a teacher). Evaluation entails a comparison of a given word with other words, of a specific meaning of a word with its other meanings, or of the word's combination with other words, in order to assess whether a word (i.e., a form-meaning pair) does or does not fit its context.

By operationalizing these factors, an involvement index can be calculated (Hulstijn & Laufer, 2001), in which the absence of a factor is marked 0, a moderate presence 1, and a strong presence 2. For example, consider two different tasks. In the first task, learners are asked to write original sentences with new words whose meanings are provided by the teacher. In this case, need is moderate (imposed by the teacher, hence given 1), there is no search (meanings are provided, hence given 0), and strong evaluation is required in that learners have to use the new words to produce their own sentences (given 2). In the second task, the learner is required to read a text with glosses of the new words and to answer comprehension questions. The task induces a moderate need (hence given 1), but neither search nor evaluation is present (given 0 each). In other words, according to the construct of task-induced involvement, the first task induces a greater involvement load than the second (3 vs. 1).
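The scoring scheme just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the class name `Task`, its field names, and the task labels are hypothetical and do not come from Laufer and Hulstijn's work.

```python
from dataclasses import dataclass

# Scale described in the text: 0 = absent, 1 = moderate, 2 = strong.
ALLOWED_LEVELS = {0, 1, 2}


@dataclass(frozen=True)
class Task:
    """A task rated on the three involvement components (hypothetical helper)."""
    label: str
    need: int
    search: int
    evaluation: int

    def involvement_index(self) -> int:
        # Reject ratings that fall outside the 0-2 scale.
        for level in (self.need, self.search, self.evaluation):
            if level not in ALLOWED_LEVELS:
                raise ValueError(f"component rating must be 0, 1 or 2, got {level}")
        # The index is the simple sum of the three component ratings.
        return self.need + self.search + self.evaluation


# The two example tasks from the paragraph above.
sentence_writing = Task("original sentences, meanings provided", need=1, search=0, evaluation=2)
glossed_reading = Task("glossed text with comprehension questions", need=1, search=0, evaluation=0)

print(sentence_writing.involvement_index())  # 3
print(glossed_reading.involvement_index())   # 1
```

The sum treats the three components as equally weighted, which is what the 3 vs. 1 comparison in the text implies.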

The concept of involvement can be submitted to empirical investigation by devising incidental-learning tasks with various degrees of need, search, and evaluation (Laufer & Hulstijn, 2001). For example, tasks with different involvement indexes can be presented to different groups of participants. After the participants have finished the tasks, the results can be analyzed and compared to see whether there is any relationship between task involvement load and word acquisition and retention.

Taken together, these three components form the involvement load of a task and can explain the effectiveness of a range of task properties. A task that elicits a higher involvement load is expected to be more effective than a task that elicits a lower one (Laufer & Hulstijn, 2001). The present study was designed on the theoretical basis of the ILH and used the measurable criteria of the three components to define four different reading tasks.

Laufer and Hulstijn (2001) noted that a great deal of the support for the ILH comes from studies that predate its formulation and were not designed to test it. Understandably, given its recent formulation, studies with a direct bearing on the hypothesis are still few and far between.

Hulstijn and Laufer (2001) conducted two parallel experiments in which advanced Dutch- and Hebrew-speaking participants (adult learners of English) were assigned to six intact groups. Retention of ten unfamiliar words in an incidental learning setting was investigated across three task types: Task 1 included reading comprehension with marginal glosses, Task 2 included comprehension plus filling in the target words, and Task 3 included composition writing with the target words. The tasks had different involvement loads, i.e., various combinations of need, search, and evaluation. The results indicated that Task 3 was more involving and led to better retention than Task 1 and Task 2, thus providing strong support for the ILH.

                                      

Background of the Study

 

Using eight nonsense words, Keating (2008) employed three tasks with different involvement loads to assess the predictive power of the ILH with low-proficiency learners: task 1, which consisted of a reading passage with marginal glosses; task 2, which included reading comprehension plus fill-in-the-blank texts; and task 3, which consisted of writing original sentences using the target words. In task 1, the participants had to read a passage and answer five true/false comprehension questions. To answer the questions correctly, participants had to attend to the words, which were highlighted in bold print and glossed in their L1. The involvement index was 1. Participants in group 2 had the same text, but the words were deleted from it, each appearing in the margin with a brief definition, an example sentence, and an L1 gloss. The participants were instructed to fill in the blanks with the words glossed in the margin. The involvement index was 2. Group 3 only had to write original sentences with the words. The index for this group was assessed to be 3. Based on the ILH, it was predicted that group 3 would outperform group 2, which in turn would do better than group 1. The results supported the hypothesis that the ILH can be generalized to low-proficiency learners, though no significant difference was found between the task 3 and task 2 groups regarding their passive knowledge of the target words.

Kim (2008) also provided empirical evidence for the ILH in a carefully designed study consisting of two experiments. The first experiment addressed the effectiveness of three vocabulary tasks with different involvement indexes. The second experiment examined whether tasks with equal involvement loads would bring about different results for learners with different levels of proficiency. In line with other studies, the results showed that a higher involvement load index (ILI) led to more effective initial and delayed vocabulary learning. Furthermore, Kim found that two tasks with identical involvement indexes yielded similar results at both L2 proficiency levels.

In order to test the ILH, Xu (2009) investigated the impact of three tasks with different ILIs on vocabulary acquisition. The tasks were multiple-choice (MC) comprehension, blank-filling (BF), and sentence-making (SM) questions. As far as the immediate task effects on vocabulary acquisition were concerned, the results partially supported the ILH. That is, the BF and SM tasks, which induce higher involvement loads than the MC task, yielded significantly higher acquisition in the overall immediate posttest as well as in the spelling and collocation measures. However, contrary to the prediction, the SM task did not produce significantly better acquisition than the BF task. In terms of the delayed task effects on vocabulary acquisition, the results supported the hypothesis only to a limited degree. As expected, the SM task had greater effects than the BF task, which in turn had greater effects than the MC task. The difference between the SM and MC tasks reached statistical significance. However, no significant difference was found between the SM and BF tasks, or between the BF and MC tasks.

Gorgi (2010) carried out a study on the ILH. Because he considered whether a task was input- or output-oriented alongside its involvement index, his study produced results contrary to the prediction of the ILH. Three tasks were included: an input-oriented task with an involvement index of three, the same type of task with an involvement index of two, and an output-oriented task with the same involvement load as the first. The comparison of the groups' performance in the immediate and delayed posttests revealed that, contrary to the prediction of the involvement load hypothesis, task 2, with an involvement index of two, was superior to task 1, which had a higher index. In addition, the participants who had completed the output-oriented task (task 3) outperformed those who did the input-oriented task (task 1), despite the equivalence of their indexes.

Kargozari and Ghaemi (2011) investigated the effect of the type of written exercise on L2 vocabulary retention. To this end, 53 Iranian EFL university students practiced ten previously unencountered lexical items in three types of written exercises: multiple choice (MC), fill-in-the-blank (FW), and sentence writing (SW). The participants were given a mini-dictionary designed to help them with both the meaning and the usage of the target words while doing the exercises. The findings revealed that the MC group significantly outperformed the other two groups. Although Hulstijn and Laufer (2001) found a significant difference between FW and SW exercises, the findings of this study did not support theirs. The depth of processing was nonetheless found to be greater, with more mental processes involved, in SW exercises than in the other two.

As shown, the ILH has been found to account for better learning and retention in some studies. However, more research is needed to see whether reading tasks with different involvement indexes differ in their effects, and whether tasks with similar indexes differ when they vary in other features, such as being input- or output-oriented. Furthermore, learners' feelings and attitudes towards reading tasks, both their effectiveness in general and the particular task types involved, have not received sufficient research attention. This study was designed to address these gaps.

The purpose of the current study is therefore to examine the immediate and delayed effects of reading-based tasks on vocabulary acquisition. The following questions were formulated to direct the study:

  1. Which of the reading tasks (MCQ, SM, MCT, and BF) have a significant effect on vocabulary acquisition?
  2. Which of the reading tasks (MCQ, SM, MCT, and BF) lead to better vocabulary retention?
  3. Do different reading-based tasks have different effects on vocabulary acquisition and retention?
  4. What are the learners' beliefs about the tasks and their effectiveness?

 

Methodology

 

Participants

The target population of this study was male and female English undergraduates who were attending Reading Comprehension 4 at Azad University of Najaf Abad. There were 4 intact classes (120 students). However, in order to ensure homogeneous language proficiency, an Oxford Placement Test (OPT; Edwards, 2007) was given to all of the students, and only those who scored within one standard deviation of the mean were included in this study. Table 1 presents the number of participants in each group.
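The selection rule described above (keep only students whose placement score lies within one standard deviation of the mean) can be sketched as follows. The student identifiers and scores are invented for illustration, and two details are assumptions because the text does not specify them: the sample standard deviation is used, and scores exactly on the boundary are kept.

```python
from statistics import mean, stdev


def select_homogeneous(scores):
    """Return the students whose score lies within one SD of the mean.

    `scores` maps a student identifier to a placement test score.
    Assumptions: sample SD (n - 1 denominator), inclusive boundaries.
    """
    values = list(scores.values())
    centre = mean(values)
    spread = stdev(values)
    lower, upper = centre - spread, centre + spread
    return {sid: s for sid, s in scores.items() if lower <= s <= upper}


# Hypothetical scores out of 50, not data from the study.
opt_scores = {"s01": 22, "s02": 31, "s03": 35, "s04": 38, "s05": 47, "s06": 33}
selected = select_homogeneous(opt_scores)
print(sorted(selected))
```

With these invented scores the lowest and highest scorers fall outside the band and are excluded, which mirrors how the original pool of 120 students was narrowed down.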

 

Table 1. Number of Participants

Group    Frequency    Percent    Valid Percent    Cumulative Percent
MCQ      20           24.7       24.7             24.7
SM       21           25.9       25.9             50.6
MCT      19           23.5       23.5             74.1
BF       21           25.9       25.9             100.0
Total    81           100.0      100.0

Instruments

The instruments used in this study were the OPT (Edwards, 2007), four different reading tasks in the form of two texts followed by twenty questions, an immediate and a delayed posttest, and an interview. The following explanations clarify how these materials were prepared and used.

OPT. This elementary-to-intermediate-level test consists of three parts: grammar (15 items), vocabulary (15 items), and reading comprehension (20 items), for a total of 50 items. One point was given for each correct item, and no points were deducted for incorrect responses. The test was given to the participants in the first session and took 50 minutes to administer.

 

Reading tasks. Four reading tasks with different involvement loads were used in the study to test their effects on vocabulary acquisition. Each task included reading passages selected from Reading Through Interaction (Farhady & Mirhassani, 2008), which was the English textbook taught for Reading Comprehension 3 at the time of this study at Azad University of Najaf Abad.

First of all, the participants' previous teachers, who had taught Reading Comprehension 3, were consulted to determine which units had not been covered the previous semester. From these units, four with interesting titles were selected to hold readers' attention and encourage better performance. The texts were given to the participants in the four classes three weeks before the treatment. The participants were asked to underline all the words in the four texts that were unfamiliar to them, but they were not told that they would later be tested on the meaning of those words. To ensure a pool of target words large enough to supply a sufficient number of items for each task, the two texts with the highest number of new words were chosen. The first passage, which consisted of 126 words, was about the risks that people take in their lives. Nine target words, all underlined by students, came from this passage. The second passage, which consisted of 160 words, was about the protection of wild animals. Eleven underlined target words came from this text. Twenty target words were therefore selected for use in the experiment. They comprised eight nouns, eight adjectives, and four verbs: dare, deaf, renowned, sentiments, attempt, marathon, gliding, feats, traits, nightmare, poisonous, hire, claim, hordes, arctic, savage, cautious, predictable, extermination, and campaign.

Inspired by Hulstijn and Laufer (2001), four tasks were selected: multiple-choice comprehension questions (MCQ), sentence making (SM), multiple-choice cloze test (MCT), and blank-filling (BF).

Task one (MCQ), given to the first group, consisted of the two texts with the target words in bold print to help the participants notice the words (Schmidt, 1994). The first text had 9 multiple-choice questions and the second had 11, each with four choices. For each item, the participants were to select the choice closest in meaning to the word in bold print. The involvement index of this task was 2 [+ (1) need, - (0) search, + (1) evaluation]. In other words, the need component was given 1, as it was induced by the task itself. Participants were not allowed to use a dictionary, so the search component was given 0. Likewise, the evaluation component was given 1 because the choices had to be evaluated against one another to determine their contextual appropriateness.

In task two (SM), given to the second group, the same texts with the target words in bold print were used. The participants in this group were asked to write the meaning of the target words in L1 or L2, along with a meaningful English sentence for each. While writing the sentences, however, they were not required to pay attention to grammar. In this task, the need component was moderate and was given 1 because it was externally induced by the task itself, and there was no search component (given 0) since students did not have to look up the words in a dictionary. However, the value of evaluation was high (2) because the words were to be used in sentences and the participants had to put more effort into creating them. Hence, the involvement load of the task was 3 [+ (1) need, - (0) search, ++ (2) evaluation].

In task three (MCT), the third group of participants was asked to read the same passages with the target words omitted and to answer multiple-choice questions. The number of questions was again 20. Four choices were provided for each blank, and the participants were asked to select the choice that best fitted the blank in each item. As for the involvement load, need and search were the same as in task two. In order to fill in the blanks with the correct words, the given words had to be evaluated against one another to determine their contextual appropriateness, so the task involved a moderate evaluation of 1. Based on the involvement load hypothesis, the involvement index of the task was 2 [+ (1) need, - (0) search, + (1) evaluation].

In the fourth task (BF), after reading the same texts with the target words missing, the fourth group of participants was asked to fill in the blanks with the target words, which were placed at the top of the page in random order. They were also asked to change the part of speech of the presented target words if necessary. Since the participants had to know the meaning of the target words to put them in the blanks, they were told to bring their own dictionaries to class and use them when necessary. It must be noted that all students had already been taught how to use a dictionary. As such, the search component for this task was given 1. Since the words around each blank would function as stimuli to activate the readers' mental mechanisms, the participants were expected to look for clues in the passages that might help them find the missing words. Because the need was induced by the task itself, the need component was moderate and was given 1. The task also called for a moderate amount of evaluation (1). Based on the involvement load hypothesis, the involvement index of the task was 3 [+ (1) need, + (1) search, + (1) evaluation].
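The component ratings assigned to the four tasks can be tabulated and summed as in the sketch below. The ratings and the input/output labels are those stated in this paper; the dictionary layout and function name are illustrative choices only.

```python
# Component ratings as assigned in the task descriptions above.
TASKS = {
    "MCQ": {"orientation": "input", "need": 1, "search": 0, "evaluation": 1},
    "SM": {"orientation": "output", "need": 1, "search": 0, "evaluation": 2},
    "MCT": {"orientation": "input", "need": 1, "search": 0, "evaluation": 1},
    "BF": {"orientation": "output", "need": 1, "search": 1, "evaluation": 1},
}


def involvement_index(task):
    """Sum the three component ratings of one task."""
    return task["need"] + task["search"] + task["evaluation"]


indices = {name: involvement_index(task) for name, task in TASKS.items()}

# Group the tasks by index to show which pairs share an involvement load.
by_index = {}
for name, index in indices.items():
    by_index.setdefault(index, []).append(name)

print(indices)
print(by_index)
```

Grouping by index makes the design visible: the two input-oriented tasks share an index of 2 and the two output-oriented tasks share an index of 3, but SM reaches 3 through strong evaluation while BF reaches it through an added search component.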

 

Posttests. To assess the learning of the target words, an immediate posttest was administered. To determine the retention of the target words, the same test was administered three weeks later. Because the study concerned incidental learning, no group was informed of the upcoming immediate or delayed posttests. The researchers constructed the posttests using The Best Test Preparation for the TOEFL (Penfield, 2007), Vocabulary and Tests for TOEFL (Farhady, 2005), and Reading Comprehension and Vocabulary Workbook (Davy & Davy, 2007), as well as by consulting experts and viewing the website www.Test preview .com. To ensure the validity and reliability of the tests, they were piloted with another group at the same proficiency level before administration in the main study. The two posttests, which were equal in all respects, consisted of a matching part (13 items) and a definition selection part (12 items), with the instructions in L2 at the top. It should be noted that no vocabulary pretest was given to the participants, to avoid generating any memory traces.

 

Interview. After the posttest was administered, five participants from each group were randomly selected as interviewees. They were asked to express, in L1 or L2, their ideas about the tasks, the tasks' effectiveness in learning the meaning of new words, the features of each specific task, and other matters such as class presentation and task administration. The interview was held immediately after the posttest so that the students would not forget the task effect and so that more precise information could be obtained from them.

 

Results

 

In order to determine whether there were any overall differences among the experimental groups in the immediate posttest, their descriptive statistics were calculated. Table 2 displays the results.

 

Table 2. Descriptive Statistics of Immediate Posttest Scores

Immediate test

Group    N     Mean     Std. Deviation    Std. Error    95% CI for Mean (Lower Bound)    95% CI for Mean (Upper Bound)    Minimum    Maximum
MCQ      20    12.40    3.747             .838          10.65                            14.15                            6          18
SM       21    17.95    5.045             1.101         15.66                            20.25                            7          25
MCT      19    13.89    4.841             1.111         11.56                            16.23                            5          23
BF       21    15.38    3.956             .863          13.58                            17.18                            7          24
Total    81    14.96    4.815             .535          13.90                            16.03                            5          25
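The standard error and confidence bounds in Table 2 follow from N, the mean, and the standard deviation. The sketch below recomputes them for the MCQ group as a check; the function name is illustrative, and the critical t value of 2.093 (two-tailed, alpha = .05, df = 19) is taken from a standard t table rather than from the paper.

```python
import math


def mean_confidence_interval(n, mean, sd, t_critical):
    """Return (standard error, lower bound, upper bound) for a group mean."""
    standard_error = sd / math.sqrt(n)
    margin = t_critical * standard_error
    return standard_error, mean - margin, mean + margin


# Assumption: critical t for df = n - 1 = 19, read from a standard t table.
T_CRITICAL_DF19 = 2.093

# MCQ row of Table 2: N = 20, mean = 12.40, SD = 3.747.
se, lower, upper = mean_confidence_interval(20, 12.40, 3.747, T_CRITICAL_DF19)
print(f"SE = {se:.3f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

The recomputed values agree with the MCQ row of Table 2 after rounding. Other rows need the critical t for their own degrees of freedom.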

 

 

The table shows that the highest and the lowest mean scores on the immediate posttest belong to the SM and MCQ groups, respectively. The results of a one-way ANOVA showed a significant difference among the groups, F(3, 77) = 5.852, p = .001 (p = .002, p = .046).
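As a rough check, the F ratio of a one-way ANOVA can be recomputed from the group sizes, means, and standard deviations in Table 2. The sketch below is not the authors' analysis (their ANOVA table is not reproduced here); because it starts from rounded summary statistics, its result can differ from the reported value in the second or third decimal place.

```python
def anova_from_summary(groups):
    """One-way ANOVA F ratio from (n, mean, sd) tuples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n

    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-groups sum of squares, recovered from each group's variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Immediate posttest rows of Table 2: MCQ, SM, MCT, BF.
immediate = [(20, 12.40, 3.747), (21, 17.95, 5.045), (19, 13.89, 4.841), (21, 15.38, 3.956)]
f_ratio, df1, df2 = anova_from_summary(immediate)
print(f"F({df1}, {df2}) = {f_ratio:.2f}")
```

The degrees of freedom come out as 3 and 77, and the F ratio lands close to the reported 5.852.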

 

Table 3. Results of One-way ANOVA on Immediate Posttest among Groups

 

Table 4 shows the descriptive statistics of the delayed posttest scores.

 

 

Table 4. Descriptive Statistics of Delayed Posttest

Delayed test

Group    N     Mean     Std. Deviation    Std. Error    95% CI for Mean (Lower Bound)    95% CI for Mean (Upper Bound)    Minimum    Maximum
MCQ      20    16.30    5.592             1.250         13.68                            18.92                            7          24
SM       21    22.19    2.421             .528          21.09                            23.29                            16         25
MCT      19    15.89    4.483             1.029         13.73                            18.06                            8          24
BF       21    17.95    4.555             .994          15.88                            20.03                            10         25
Total    81    18.16    4.996             .555          17.06                            19.27                            7          25

Table 4 shows that the SM group scored the highest on the delayed posttest while the MCT group scored the lowest. In order to see whether the differences were statistically significant, a one-way ANOVA was run. The results of the ANOVA showed that the difference among the groups was significant, F(3, 77) = 8.774, p < .001. Table 5 shows the results.

 

Table 5. Results of One-way ANOVA among Groups on Delayed Posttest

 

The results of the post hoc test confirmed that the SM group showed significantly better retention than the MCQ, MCT, and BF groups (p = .001, p = .046, p = .026).

 

Table 6. Results of Scheffe Test on Mean Differences of the Delayed Posttest

Multiple Comparisons
Dependent Variable: delayed test
Scheffe

(I) group    (J) group    Mean Difference (I-J)    Std. Error    Sig.    95% CI Lower Bound    95% CI Upper Bound
MCQ          SM           -5.890*                  1.374         .001    -9.82                 -1.96
MCQ          MCT          .405                     1.408         .994    -3.62                 4.43
MCQ          BF           -1.652                   1.374         .695    -5.58                 2.27
SM           MCQ          5.890*                   1.374         .001    1.96                  9.82
SM           MCT          6.296*                   1.392         .000    2.32                  10.27
SM           BF           4.238*                   1.357         .026    .36                   8.12
MCT          MCQ          -.405                    1.408         .994    -4.43                 3.62
MCT          SM           -6.296*                  1.392         .000    -10.27                -2.32
MCT          BF           -2.058                   1.392         .538    -6.04                 1.92
BF           MCQ          1.652                    1.374         .695    -2.27                 5.58
BF           SM           -4.238*                  1.357         .026    -8.12                 -.36
BF           MCT          2.058                    1.392         .538    -1.92                 6.04

* The mean difference is significant at the .05 level.
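The mean differences and standard errors in Table 6 can be recovered from the delayed posttest summary statistics in Table 4. The sketch below assumes the usual pooled within-groups mean square as the error term for each pairwise comparison; it does not compute the Scheffe significance values, which would additionally require the F distribution. Function names are illustrative.

```python
import math

# Delayed posttest rows of Table 4: group -> (N, mean, SD).
DELAYED = {
    "MCQ": (20, 16.30, 5.592),
    "SM": (21, 22.19, 2.421),
    "MCT": (19, 15.89, 4.483),
    "BF": (21, 17.95, 4.555),
}


def pooled_mean_square(groups):
    """Within-groups mean square pooled over all groups."""
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
    df_within = sum(n for n, _, _ in groups.values()) - len(groups)
    return ss_within / df_within


def pairwise_comparison(groups, i, j):
    """Return (mean difference I-J, standard error of the difference)."""
    ms_within = pooled_mean_square(groups)
    n_i, mean_i, _ = groups[i]
    n_j, mean_j, _ = groups[j]
    standard_error = math.sqrt(ms_within * (1 / n_i + 1 / n_j))
    return mean_i - mean_j, standard_error


for i, j in [("MCQ", "SM"), ("SM", "MCT"), ("SM", "BF")]:
    diff, se = pairwise_comparison(DELAYED, i, j)
    print(f"{i} vs {j}: difference = {diff:.2f}, SE = {se:.3f}")
```

Because the inputs are rounded, the recomputed figures match the table only to about two or three decimal places.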

Nowadays, the importance of learning English as a second language is remarkably on the rise. Despite some EFL teachers' preference to teach vocabulary directly, some others prefer indirect methods. They maintain that extensive reading is the best way to construct a sufficient knowledge of vocabulary. Furthermore, they think that teaching vocabulary or grammar directly is not only an ineffective way of teaching but also  an overwhelming task that takes a great deal of class time. As Hart and Risley (2003) stated, direct vocabulary acquisition is memorizing the words term after term with their respective translations or their synonyms and definitions. It is quick, but it is also superficial. Learners encounter vocabulary in an isolated, often infinitive form and remain incapable of using it correctly in context. Moreover learning vocabulary directly sinks faster into oblivion.

 In indirect vocabulary acquisition learners encounter words together with syntactic information through context in target language reading. Vocabulary in context often appears repeatedly under different aspects and hence engrains in the learners' minds. On the other hand, as reading involves highly complex cognitive processing operations, the reading task makes use of authentic and challenging texts and provides learners with a rhetorical framework for processing and analyzing the text (Nunaun, 2004).Thus, indirect vocabulary learning through reading can help learners in comprehending, manipulating, processing or interacting in the target language while their attention is focused on meaning rather than manipulating forms.

For judging the different degrees of processing unfamiliar vocabulary through reading, the involvement load hypothesis (ILH) is a good option (Laufer &Hulstijn, 2001). This hypothesis adopts three measurable and operational factors to define the involvement loads which each task may have. They are need, search, and evaluation. The need component is the motivational, non- cognitive dimension of involvement. It is concerned with the need to achieve something. This notion here is not interpreted in its negative sense, based on fear of failure, but in its positive sense based on a drive to comply with the task requirements, which could be either externally imposed or self-imposed. Need is moderate when it is imposed by an external agent, for example, the need to use a word in a sentence which the teacher has asked the learners to produce, and need is strong when imposed on the learners by themselves. Moderate and strong needs subsume different degrees of drive.

Search and evaluation are the two cognitive (information processing) dimensions of involvement, contingent upon noticing and deliberately allocating attention to the form-meaning relationship (Schmidt, 2001). Search is the attempt to find the meaning of an unknown L2 word or the concept by consulting a dictionary or another authority (e.g., a teacher). Evaluation entails a comparison of a given word with other words, a specific meaning of a word with its other meanings, or combination of the word with other words in order to assess whether a word (i.e., a form-meaning pair) does or does not fit its context.

By operationalzing the above issues the involvement index (Hustijn and Laufer, 2001), in which the absence of a factor is marked 0, a moderate presence of a factor equals 1, and a strong presence of a factor equals 2 can be measured. For example, consider two different tasks. In the first task, learners are asked to write original sentences with new words whose meanings are provided by the teacher. In this case the needis moderate (imposed by the teacher, hence given 1), there is no search(meanings are provided, hence given 0), and strong evaluationis required in that learners have to use the new words to produce their own sentences (given 2). In the second task, the learner is required to read a text with glosses of the new words and to answer comprehension questions. The task induces a moderate need (hence given 1), but neither search nor evaluation is present (given 0 to each). To be more clear the first task is thought (according the construct of task-induced involvement) to induce a greater involvement load than the second task 3 vs. 1.

The concept of involvement can be submitted to empirical investigation by devising incidental-learning tasks with various degrees of need, search and evaluation (Laufer & Hulstijn, 2001). For example, tasks with different involvement indexes can be presented to different groups of participants. After they have finished the tasks the results can be analyzed and compared to see if there is any relationship between the task involvement load and word acquisition and retention.

Taken together, these three components constitute the involvement load of a task and can account for the effectiveness of a range of task properties. It is expected that a task which elicits a higher involvement load is more effective than a task which elicits a lower one (Laufer & Hulstijn, 2001). The present study was designed on the theoretical basis of the ILH and used the measurable criteria of the three components to define four different reading tasks.

Laufer and Hulstijn (2001) noted that a great deal of the support for the ILH comes from studies that predate its formulation and were not run to test the hypothesis. Understandably, studies with a direct bearing on the hypothesis are still few and far between because of its recent formulation.

Hulstijn and Laufer (2001) conducted two parallel experiments in which advanced Dutch- and Hebrew-speaking participants (adult learners of English) were assigned to six intact groups. Retention of ten unfamiliar words in an incidental learning setting was investigated across three task types: Task 1 was reading comprehension with marginal glosses, Task 2 was comprehension plus filling in the target words, and Task 3 was composition writing with the target words. The tasks had different involvement loads, i.e., different combinations of need, search, and evaluation. The results indicated that Task 3 was more involving and led to better retention than Task 1 and Task 2, thus providing strong support for the ILH.

                                      

Background of the Study

 

Using eight nonsense words, Keating (2008) used three tasks with different involvement loads to assess the predictive power of the ILH with low-proficiency learners: task 1, a reading passage with marginal glosses; task 2, reading comprehension plus fill-in-the-blank texts; and task 3, writing original sentences using the target words. In task 1 the participants read a passage followed by five true/false comprehension questions. To answer the questions correctly, participants had to attend to the target words, which were highlighted in bold print and glossed in their L1. The involvement index was 1. Participants in group 2 had the same text, but the target words were deleted from it and appeared in the margin, each with a brief definition, an example sentence, and an L1 gloss. The participants were instructed to fill in the blanks with the words glossed in the margin. The involvement index was 2. Group 3 only had to write original sentences with the words. The index for this group was assessed to be 3. Based on the ILH, it was predicted that group 3 would outperform group 2, which in turn would do better than group 1. The results supported the hypothesis and suggested that the ILH can be generalized to low-proficiency learners, though no significant difference was found between groups 3 and 2 in their passive knowledge of the target words.

Kim (2008) also provided empirical evidence for the ILH in a carefully designed study consisting of two experiments. The first experiment addressed the effectiveness of three vocabulary tasks with different involvement indexes. The second experiment examined whether tasks with equal involvement loads would bring about different results for learners with different levels of proficiency. In line with other studies, the results showed that a higher ILI led to more effective initial and delayed vocabulary learning. Furthermore, Kim found that two tasks with identical involvement indexes yielded similar results at both L2 proficiency levels.

In order to test the ILH, Xu (2009) investigated the impact of three tasks with different ILIs on vocabulary acquisition. The tasks were MC comprehension, BF, and SM questions. As far as the immediate task effects on vocabulary acquisition were concerned, the results partially supported the ILH. That is, the BF and SM tasks, which induce higher involvement loads than the MC task, yielded significantly higher acquisition on the overall immediate posttest as well as on the spelling and collocation measures. However, the SM task did not produce significantly better acquisition than the BF task, as had been predicted. In terms of the delayed task effects on vocabulary acquisition, the results supported the hypothesis only to a limited degree. As expected, the SM task had greater effects than the BF task, which in turn had greater effects than the MC task. The difference between the SM and MC tasks reached statistical significance. However, no measurable difference existed between the SM and BF tasks, or between the BF and MC tasks.

Gorgi (2010) carried out a study on the ILH. Because he considered whether a task was input- or output-oriented and how this related to the involvement index, his study produced results contrary to the prediction of the ILH. Three tasks were included: an input-oriented task with an involvement index of three, the same type of task with an involvement index of two, and an output-oriented task with the same involvement load as the first. The comparison of the groups' performance on the immediate and delayed posttests revealed that, contrary to the prediction of the involvement load hypothesis, task 2 with an involvement index of two was superior to task 1, which had a higher index. In addition, the participants who had completed the output-oriented task (task 3) outperformed those who did the input-oriented task (task 1), despite the equivalence of their indexes.

Kargozari and Ghaemi (2011) investigated the effect of the type of written exercise on L2 vocabulary retention. To this end, 53 Iranian EFL university students practiced ten previously unencountered lexical items in three types of written exercises: multiple choice (MC), fill-in-the-blank (FW), and sentence writing (SW). The participants were given a mini-dictionary designed to help them with both the meaning and the usage of the target words while doing the exercises. The findings revealed that the MC group significantly outperformed the other two groups. Although Hulstijn and Laufer (2001) found a significant difference between SF and SW exercises, the findings of this study did not support theirs. The depth of processing was found to be greater, and more mental processes were used, in SW exercises than in the other two.

As shown, the ILH has been found to account for better learning and retention in some studies. However, more research is necessary to see whether there is any difference between reading tasks with different involvement indexes and tasks with similar indexes that differ in other features, such as being input- or output-oriented. Furthermore, learners' feelings and attitudes towards the use and effectiveness of reading tasks in general, and towards the type of task involved in particular, have not received sufficient research attention. This study was designed to address these gaps.

The purpose of the current study is therefore to find the immediate and delayed effects of reading-based tasks on vocabulary acquisition. The following questions were formulated to direct the study:

  1. Which of the following reading tasks, MCQ, SM, MCT, and BF, has a significant effect on vocabulary acquisition?
  2. Which of the following reading tasks, MCQ, SM, MCT, and BF, can lead to better vocabulary retention?
  3. Do different reading-based tasks have different effects on vocabulary acquisition and retention?
  4. What are the learners' beliefs about the tasks and their effectiveness?

 

Methodology

 

Participants

The target population of this study was male and female English undergraduates who were attending Reading Comprehension 4 at Azad University of Najaf Abad. There were 4 intact classes (120 students). However, in order to ensure homogeneous language proficiency, an Oxford Placement Test (OPT; Edwards, 2007) was given to all of the students, and only those who scored within one standard deviation above or below the mean were included in the study. Table 1 presents the number of participants in each group.

 

Table 1. Number of Participants

Group    Frequency    Percent    Valid Percent    Cumulative Percent
MCQ      20           24.7       24.7             24.7
SM       21           25.9       25.9             50.6
MCT      19           23.5       23.5             74.1
BF       21           25.9       25.9             100.0
Total    81           100.0      100.0
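The inclusion rule (keep only students whose OPT score lies within one standard deviation of the mean) can be sketched as follows. The scores and student labels below are invented for illustration, and the use of the sample standard deviation is an assumption, since the paper does not say which form was used.

```python
from statistics import mean, stdev


def within_one_sd(scores):
    """Keep students whose score lies within one SD of the group mean.

    `scores` maps a student label to a placement-test score. The sample
    standard deviation is used here (an assumption, not stated in the paper).
    """
    values = list(scores.values())
    centre = mean(values)
    spread = stdev(values)
    return {
        student: score
        for student, score in scores.items()
        if centre - spread <= score <= centre + spread
    }


# Hypothetical OPT scores out of 50, not the study's data.
opt_scores = {"s01": 22, "s02": 31, "s03": 35, "s04": 36, "s05": 38, "s06": 47}
selected = within_one_sd(opt_scores)
print(sorted(selected))
```

With these invented scores the lowest and highest scorers fall outside the band and are excluded, which mirrors how the original pool of 120 students was reduced to the 81 participants in Table 1.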

 

Instruments

The instruments used in this study were the OPT (Edwards, 2007), four different reading tasks in the form of two texts followed by twenty questions, an immediate and a delayed posttest, and an interview. The following explanations clarify how these materials were prepared and used.

OPT. This elementary to intermediate level test consists of three parts: grammar (15 items), vocabulary (15 items), and reading comprehension (20 items), for a total of 50 items. One point was given for each correct item, and no points were deducted for incorrect responses. The test was given to the participants in the first session and took 50 minutes to administer.

 

Reading tasks. Four reading tasks with different involvement loads were used in the study to test their effects on vocabulary acquisition. Each task included reading passages selected from Reading Through Interaction (Farhady & Mirhassani, 2008), which was the English textbook taught for Reading Comprehension 3 at Azad University of Najaf Abad at the time of this study.

First, the participants' previous teachers, who had taught Reading Comprehension 3, were consulted to determine which units had not been covered in the previous semester. Of these units, four with interesting titles were selected to attract the readers' attention and encourage better performance. The texts were given to the participants in the four classes three weeks before the treatment. The participants were asked to underline all the words in the four texts that were unfamiliar to them, but they were not told that they would later be tested on the meaning of those words. To ensure a pool of target words large enough to supply a sufficient number of items for each task, the two texts with the highest number of new words were chosen. The first passage, which consisted of 126 words, was about the risks that people take in their lives; students underlined nine target words in it. The second passage, which consisted of 160 words, was about the protection of wild animals; eleven target words were underlined in this text. Twenty target words were therefore selected for use in the experiment. They comprised eight nouns, eight adjectives, and four verbs: dare, deaf, renowned, sentiments, attempt, marathon, gliding, feats, traits, nightmare, poisonous, hire, claim, hordes, arctic, savage, cautious, predictable, extermination, and campaign.

Inspired by Hulstijn and Laufer (2001), four tasks were selected: multiple-choice comprehension questions (MCQ), sentence-making (SM), multiple-choice cloze test (MCT), and blank-filling (BF).

Task one (MCQ), given to the first group, consisted of the two texts with the target words in bold print to help the participants notice the words (Schmidt, 1994). The first text had 9 multiple-choice questions with four choices each, while the second had 11 items with four choices. For each item the participants were to select the choice closest in meaning to the word in bold print. The involvement index of this task was measured as 2 [+ (1) need, - (0) search, + (1) evaluation]. In other words, the need component was given one, as it was induced by the task itself. Participants were not allowed to use a dictionary, so the search component was given zero. Similarly, the evaluation component was given one because students had to evaluate the choices against one another to determine their contextual appropriateness.

In task two (SM), given to the second group, the same texts with the target words in bold print were used. The participants in this group were asked to write the meaning of the underlined words in L1 or L2, along with a meaningful English sentence for each. While writing the sentences, however, they were not required to pay attention to grammar. In this task the need component was moderate and was given one because it was externally induced by the task itself, and there was no search component (i.e., it was given 0) since students did not have to look up the words in a dictionary. However, the value of evaluation was high (i.e., 2) because the words were to be used in sentences and the participants had to put more effort into creating them. Hence, the involvement load of the task was 3 [+ (1) need, - (0) search, ++ (2) evaluation].

In task three (MCT) the third group of the participants was asked to read the same reading passages with the target words omitted and to answer multiple choice questions. The number of questions was again 20. Four choices were provided for each blank and the participants were asked to select one from among the choices that best fitted the blank in each item. As for the involvement load, need and search were the same as those in task two. In order to fill in the blanks with the correct words, the given words had to be evaluated against one another to determine their contextual appropriateness. The task included a moderate evaluation of 1. Based on the involvement load hypothesis, the involvement index of the task was 2 [+ (1) need, - (0) search, + (1) evaluation].

In the fourth task (BF), after reading the same texts with the target words missing, the fourth group of participants was asked to fill in the blanks with the target words, which were placed at the top of the page in random order. They were also asked to change the part of speech of the presented target words if necessary. Since the participants had to know the meaning of the target words to put them in the blanks, they were told to bring their own dictionaries to class and use them when necessary. It must be noted that all students had already been taught how to use a dictionary. As such, the search component for this task was given 1. Since the words around a blank would function as stimuli to activate the readers' mental mechanisms, the participants were expected to look for clues in the passages that might help them find the missing words. Because the need was induced by the task itself, the need component was moderate and was given 1. The task also required a moderate amount of evaluation (1). Based on the involvement load hypothesis, the involvement index of the task was 3 [+ (1) need, + (1) search, + (1) evaluation].
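The four task descriptions above fix the comparisons the ILH predicts. As a rough sketch, the following Python snippet takes the component scores reported for each task and lists, for every pair, which task the hypothesis expects to do better. The names `TASKS` and `ilh_predictions` are hypothetical, and the snippet encodes only the ILH prediction, not the study's findings.

```python
from itertools import combinations

# (need, search, evaluation) scores as reported for the four tasks above.
TASKS = {
    "MCQ": (1, 0, 1),
    "SM": (1, 0, 2),
    "MCT": (1, 0, 1),
    "BF": (1, 1, 1),
}


def involvement_index(components):
    """The ILI is the plain sum of the three component scores."""
    return sum(components)


def ilh_predictions(tasks):
    """For each pair of tasks, state what the ILH predicts from the ILI alone."""
    predictions = {}
    for first, second in combinations(tasks, 2):
        ili_first = involvement_index(tasks[first])
        ili_second = involvement_index(tasks[second])
        if ili_first == ili_second:
            predictions[(first, second)] = "no difference"
        else:
            winner = first if ili_first > ili_second else second
            predictions[(first, second)] = f"{winner} better"
    return predictions


for pair, outcome in ilh_predictions(TASKS).items():
    print(pair, "->", outcome)
```

Two of the six pairs (MCQ vs. MCT and SM vs. BF) share an index, so any observed difference within those pairs cannot be attributed to involvement load as quantified here.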

 

Posttests. To assess the learning of the target words, an immediate posttest was administered. To determine the retention of the target words, the same test was administered three weeks later. Because the study concerned incidental learning, the participants in all groups were not informed of the upcoming immediate or delayed posttests. Using The Best Test Preparation for the TOEFL (Penfield, 2007), Vocabulary and Tests for TOEFL (Farhady, 2005), and Reading Comprehension and Vocabulary Workbook (Davy & Davy, 2007), and after consulting experts and the website www.Test preview .com, the researchers constructed the posttests. To ensure the validity and reliability of the tests, they were piloted with another group at the same proficiency level before administration in the main study. The two posttests, which were equal in all respects, consisted of a matching part (13 items) and a definition-selection part (12 items), with the instructions given in L2 at the top. It should be noted that no vocabulary pretest was given to the participants, to avoid generating any memory traces.

 

Interview. After the posttest was administered, five participants from each group were randomly selected as interviewees. They were asked to express, in L1 or L2, their ideas about the tasks, the tasks' effectiveness in learning the meaning of new words, the features of each specific task, and other matters such as class presentation and task administration. The interview was held immediately after the posttest so that the students would not forget the effect of the task and could give more precise information.

 

Results

 

In order to determine whether there were any overall differences among the experimental groups in the immediate posttest, their descriptive statistics were calculated. Table 2 displays the results.

 

Table 2. Descriptive Statistics of Immediate Posttest Scores

Immediate test

Group    N     Mean     Std. Deviation    Std. Error    95% CI Lower Bound    95% CI Upper Bound    Minimum    Maximum
MCQ      20    12.40    3.747             .838          10.65                 14.15                 6          18
SM       21    17.95    5.045             1.101         15.66                 20.25                 7          25
MCT      19    13.89    4.841             1.111         11.56                 16.23                 5          23
BF       21    15.38    3.956             .863          13.58                 17.18                 7          24
Total    81    14.96    4.815             .535          13.90                 16.03                 5          25

 

 

The table shows that the highest and the lowest mean scores on the immediate posttest belong to the SM and MCQ groups, respectively. The results of a one-way ANOVA showed a significant difference among groups, F(3, 77) = 5.852, p = .001, with further reported values of p = .002 and p = .046.
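As a consistency check, the F ratio of a one-way ANOVA can be recomputed from group sizes, means, and standard deviations alone. The sketch below does this for the immediate-posttest values in Table 2; the function name `one_way_anova_from_summary` is hypothetical, and because the published means and standard deviations are rounded, the result can be expected to match the reported F only approximately.

```python
def one_way_anova_from_summary(groups):
    """Compute the one-way ANOVA F ratio from (n, mean, sd) triples.

    Returns (F, df_between, df_within). Uses the standard decomposition:
    between-group sum of squares from the group means, within-group sum
    of squares from the group standard deviations.
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# (n, mean, sd) for MCQ, SM, MCT, and BF on the immediate posttest (Table 2).
immediate = [
    (20, 12.40, 3.747),
    (21, 17.95, 5.045),
    (19, 13.89, 4.841),
    (21, 15.38, 3.956),
]

f_ratio, df_b, df_w = one_way_anova_from_summary(immediate)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The recomputed value is about 5.85 on 3 and 77 degrees of freedom, in agreement with the reported F(3, 77) = 5.852.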

 

Table 3. Results of One-way ANOVA on Immediate Posttest among Groups

 

Table 4 shows the results of the descriptive statistics of the delayed post test scores.

 

 

Table 4. Descriptive Statistics of Delayed Posttest

Delayed test

Group    N     Mean     Std. Deviation    Std. Error    95% CI Lower Bound    95% CI Upper Bound    Minimum    Maximum
MCQ      20    16.30    5.592             1.250         13.68                 18.92                 7          24
SM       21    22.19    2.421             .528          21.09                 23.29                 16         25
MCT      19    15.89    4.483             1.029         13.73                 18.06                 8          24
BF       21    17.95    4.555             .994          15.88                 20.03                 10         25
Total    81    18.16    4.996             .555          17.06                 19.27                 7          25

Table 4 shows that the SM group scored the highest on the delayed posttest, while the MCT group scored the lowest. In order to see whether these differences were statistically significant, a one-way ANOVA was run. The results showed that the difference among groups was significant, F(3, 77) = 8.774, p < .001. Table 5 shows the results.

 

Table 5. Results of One-way ANOVA among Groups on Delayed Posttest

 

The results of the post hoc test confirmed that the SM group showed significantly better retention than the MCQ, MCT, and BF groups (p = .001, p = .046, p = .026).

 

Table 6. Results of Scheffe Test on Mean Differences of the Delayed Posttest

Multiple Comparisons. Dependent Variable: delayed test. Scheffe.

(I) group    (J) group    Mean Difference (I-J)    Std. Error    Sig.    95% CI Lower Bound    95% CI Upper Bound
MCQ          SM           -5.890*                  1.374         .001    -9.82                 -1.96
MCQ          MCT          .405                     1.408         .994    -3.62                 4.43
MCQ          BF           -1.652                   1.374         .695    -5.58                 2.27
SM           MCQ          5.890*                   1.374         .001    1.96                  9.82
SM           MCT          6.296*                   1.392         .000    2.32                  10.27
SM           BF           4.238*                   1.357         .026    .36                   8.12
MCT          MCQ          -.405                    1.408         .994    -4.43                 3.62
MCT          SM           -6.296*                  1.392         .000    -10.27                -2.32
MCT          BF           -2.058                   1.392         .538    -6.04                 1.92
BF           MCQ          1.652                    1.374         .695    -2.27                 5.58
BF           SM           -4.238*                  1.357         .026    -8.12                 -.36
BF           MCT          2.058                    1.392         .538    -1.92                 6.04

*. The mean difference is significant at the .05 level.
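The mean differences and standard errors in Table 6 follow from the delayed-posttest descriptive statistics (Table 4): each standard error is the square root of the pooled within-group variance multiplied by the sum of the reciprocal group sizes. The sketch below reproduces these two columns only; the names `DELAYED`, `pooled_within_variance`, and `pairwise_difference` are hypothetical, and the significance levels and confidence bounds, which require the F distribution, are not recomputed here.

```python
from math import sqrt

# (n, mean, sd) per group, taken from the delayed-posttest statistics (Table 4).
DELAYED = {
    "MCQ": (20, 16.30, 5.592),
    "SM": (21, 22.19, 2.421),
    "MCT": (19, 15.89, 4.483),
    "BF": (21, 17.95, 4.555),
}


def pooled_within_variance(groups):
    """Mean square within groups: pooled variance across all groups."""
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
    df_within = sum(n for n, _, _ in groups.values()) - len(groups)
    return ss_within / df_within


def pairwise_difference(groups, first, second):
    """Return (mean difference, standard error) for one pairwise comparison."""
    ms_within = pooled_within_variance(groups)
    n_first, mean_first, _ = groups[first]
    n_second, mean_second, _ = groups[second]
    difference = mean_first - mean_second
    std_error = sqrt(ms_within * (1 / n_first + 1 / n_second))
    return difference, std_error


difference, std_error = pairwise_difference(DELAYED, "SM", "BF")
print(f"SM - BF: difference = {difference:.2f}, SE = {std_error:.3f}")
```

Small discrepancies in the third decimal place of the mean differences are expected, since the table was presumably computed from unrounded means.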

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

In order to compare the performance of participants on the immediate and delayed posttests, a series of t-tests was run. The purpose was to see whether all tasks helped learners retain the meaning of the target words in memory. Tables 7, 8, 9, and 10 show the results.

 

Table 7. Comparison between Immediate and Delayed Posttest Scores in MCQ Group

 

 

 

Table 8. Comparison between Immediate and Delayed Posttest Scores in SM Group

 

 

 

Table 9. Comparison between Immediate and Delayed Posttest Scores in MCT Group

 

 

 

Table 10. Comparison between Immediate and Delayed Posttest Scores in BF Group

 

 

The results showed a significant difference between the mean scores of the immediate and delayed tests in the MCQ, SM, MCT, and BF groups: t(19) = 4.699, p < .001; t(20) = 4.797, p < .001; t(18) = 2.200, p = .041; and t(20) = 2.853, p = .010, respectively. This means that all four tasks led to the retention of vocabulary meaning over three weeks. All in all, the results reveal that the SM task was more effective than the other tasks for both learning and retention, although every task led to significant retention from the immediate to the delayed posttest.

The Results of the Analysis of Interview Questions

The participants who took part in the SM task were comfortable making meaningful sentences with the target words. They believed that writing a sentence was easier than choosing the correct answer from among four choices, as in multiple-choice questions.

They also expressed interest in having their own teacher use the same task in class as an activity to enhance their vocabulary learning. The following are excerpts from the interviews with participants in the SM group. Since the answers were more or less the same, only some of them are given here.

a. It's a good means for learning vocabulary.

b. Writing sentences is effective in vocabulary learning.

c. The task is helpful. So I agree to have the same task in my regular classes.

d. It can help the students learn new vocabulary items. Our teachers can plan for it.

e. It was clear. I knew exactly what to do.

The BF and MCQ groups had almost the same opinions about their own tasks. Learners in the BF group stated that although the task seemed difficult at first, it became easier later and proved more conducive to vocabulary acquisition and retention. Another positive point they mentioned was that dictionaries were permitted during the task. Learners in the BF and MCQ groups agreed to have the same tasks as an occasional class activity to acquire more vocabulary. The following are excerpts from the BF group.

a. The exam is conducive to vocabulary learning.

b. It is a bit difficult, but useful.

c. I agree to have the same task, but not very often.

d. Although it is difficult, it gives me the power of thinking and I learn new vocabulary items. I like to have the same task every week.

e. It is good to have an extra activity at any convenient time.

The MCQ group expressed the following.

a. It is difficult and needs more attention.

b. I like to use the dictionary while taking a task.

c. I like to have the same task. It's useful.

d. Sure, I agree to have it. I learn new vocabulary.

e. A bit difficult, but the students can learn if they are required to do it.

The MCT group found their task a bit difficult. They had difficulty recognizing and selecting the correct answer from among the choices. They also complained that the task was administered at an unsuitable time, at the end of their class. They believed that the task was effective for vocabulary acquisition and retention, but they showed no interest in having the same task as an activity in their regular classes, finding it a little boring.

a. It is a difficult task.

b. When I saw the questions in the form of multiple cloze, I got embarrassed.

c. It is not interesting. It is boring.

d. Having the same task can be helpful.

e. The task is boring. It is better to have it seldom.

 

Discussion and Conclusion

 

Vocabulary acquisition is crucial to students' language skills: speaking, reading, writing, and listening. Without enough vocabulary, listening, reading comprehension, and writing are inefficient. In search of a better method for teaching vocabulary, this study aimed to find the effect of different reading tasks with various involvement load components on vocabulary acquisition and retention. As mentioned before, four reading tasks were given to four different groups to see whether they had any effect on vocabulary learning and retention and whether the tasks differed significantly in their effectiveness. The results of the study showed that SM, with an ILI of 3, was better than MCQ, with an ILI of 2, for word learning on the immediate posttest. Contrary to the prediction of the ILH, however, BF, with an ILI of 3, did not show any superiority to MCT, with an ILI of 2. As expected, no difference was observed between MCQ and MCT, both with an ILI of 2. Concerning meaning retention, all four tasks led to vocabulary retention, but the SM task showed better retention than the others on the delayed posttest.

Research question one in this study asked whether any of the reading tasks, MCQ, SM, MCT, and BF, had significant effects on vocabulary acquisition. This question can be answered by examining the results of the immediate post test which was administered to assess learning of target words. The results show that the SM task was better than the others in vocabulary acquisition.

The results partly confirm the predictions of Laufer and Hulstijn's (2001) ILH, which states that tasks with a higher involvement load will be more effective than those with a lower involvement load: the ILI of the SM task is 3, compared with 2 for MCQ. However, BF, also with an ILI of 3, did not show the advantage over MCT (ILI of 2) that the ILH would predict. The results further confirm those of Xu (2009), but run counter to those of Gorgi (2010), who showed that the ILI did not affect vocabulary acquisition. As he stated, one possible explanation is that the numerical values given to the motivational and cognitive elements of the tasks, which in turn yield the ILI, may not carry the same weight or may have been only roughly quantified. Keating (2008) likewise reported an advantage of SM over another reading-based task, namely reading comprehension with L1 glosses accompanied by true/false questions (ILI of 1). Also in line with the present findings, no difference was found there between SM and BF.

The second research question asked whether any of the presented tasks, MCQ, SM, MCT, and BF, would lead to better vocabulary retention. This question can be answered by examining the results of the delayed posttest, which was administered to assess retention of the meaning of the target words. The results indicated that the SM task, with an ILI of 3, led to significantly better retention than the other three tasks. However, as mentioned before, the predictions of the ILH were not met by all tasks: all of them showed retention of the meaning of new words to some degree, though some had ILIs lower than 3. Again, this finding is partly in line with Laufer and Hulstijn (2001). The results ran counter to those of Gorgi (2010), who showed that the effect of the ILI faded in delayed word retention.

The reason why the SM task was more efficient than the others might be related to its being output-oriented. As mentioned earlier, the SM task requires the participants to make meaningful sentences with each target word, so it can be considered output-oriented, with the act of production demanding deeper cognitive effort. MCT and MCQ, by contrast, are considered input-oriented tasks because they involve the act of recognition, which leads to less depth of processing. Following this line of reasoning, one might expect BF, which is output-oriented, to be as effective as SM. However, BF seems to be less output-oriented than SM, as it did not require the participants to produce new sentences. It only required them to look for the right word for each blank and make any necessary changes to the form of the word. In a nutshell, the depth of processing involved in SM seems to be much greater than that in the other tasks.

The same discussion can be put forward by referring to a feature of the word called generation. Joe (1995) showed that words with a higher degree of generation were better retained than words with a lower degree of generation, or no generation at all. This means that when words are used by learners to generate sentences, they would be retained better than when they are just memorized or are recognized in different tasks.

According to Hulstijn and Laufer (2001, p. 552), “the ILH does not predict that any output task will lead to better results than any input task. It predicts that higher involvement induced by the task will result in better retention, regardless of whether it is an input or an output task.” Furthermore, studies (Kim, 2008; Keating, 2008) had already shown that tasks of equal load lead to similar results irrespective of proficiency level. The results of this study, however, suggest that despite the equivalent ILIs of the SM and BF tasks, word retention differed between them, a finding that is not in line with the prediction of the ILH. The reason might have to do with task type. As the SM task had learners write meaningful sentences with the new words, the act of production itself, which demands deeper cognitive effort (Swain, 1985, 1995), might have contributed more to word retention than merely reading the text and selecting an appropriate word from the list of target words. In other words, word retention is not merely a product of deliberately manipulating the variable elements of a task (need, search, and evaluation) irrespective of its type; rather, other elements such as task type may be equally important. Another explanation might be related to the learners themselves, who seemed to be stronger and more experienced in production than in recognition.

The third question asked whether different reading-based tasks have different effects on vocabulary acquisition and retention. The answer lies in the comparison between the immediate and delayed posttests, which shows that all four reading tasks, with their different ILIs and depths of processing, led to vocabulary retention. This finding is in line with others such as Laufer (2001), Gorgi (2010), and Xu (2009).

The facilitative effect of SM can be attributed to the deeper levels of processing involved in this task. As Anderson (1995) and Baddeley (1997) stated, processing new lexical information more elaborately, by paying careful attention to the word's pronunciation, orthography, grammatical category, meaning, and semantic relations to other words, leads to higher retention than processing it less elaborately, by paying attention to only one or two of these dimensions. Elaborate processing is more likely when learners are engaged in sentence writing than in the other tasks in this study. Based on Anderson's and Baddeley's viewpoints, the time spent on processing may be directly related to the retention of the meaning of the word. The same view is supported by Craik and Lockhart (1972), who stated that memory trace persistence is related to depth of analysis, so that deeper levels of analysis lead to more elaborate, longer lasting, and stronger traces.

To summarize, based on the above discussion, the following findings emerge from the present study:

1) In this study the ILH was partly supported as far as the immediate task effect was concerned in vocabulary acquisition. That is, SM task with higher involvement load and with higher depth of processing was found to be the most effective. However, it did not happen for BF with the same ILI.                                                                                              

2) In terms of the delayed task effect on vocabulary retention, the results partly supported the ILH. The SM task resulted in better vocabulary retention than the other three tasks as predicted but no difference was observed among the other tasks with different ILIs.           

3) All four tasks with different involvement loads and depth of processing led to some vocabulary retention, which is not in line with predictions of ILH.                                            

4) As the result of the interview showed, the participants in SM class found this task a more efficient way for learning new words. Then MCT and BF groups agreed to have the same tasks as extra activities in their classes to help them learn vocabulary better. Also, although MCQ group found this task a good way of learning and retention, they considered it somehow boring and difficult.

In light of the findings of the present study, some useful pedagogical implications can be drawn for vocabulary teaching and learning in Iran. First, the results suggest that teachers should design different reading-based tasks that draw learners’ attention to the target words in order to develop their vocabulary knowledge. Second, teachers could design or select tasks varying in involvement load for different words, depending on the goals they pursue in their classes. Third, due to the significant time effect on vocabulary acquisition, teachers need to provide opportunities for students to practice the vocabulary they have learnt so as to help them learn better. However, it seems that the ILI cannot be the only predictor of the usefulness of a task. In other words, other factors, such as whether the task is input or output oriented, should be considered as well. Furthermore, how much the task involves the learners in the generation of language can be a significant factor too.

Further research may replicate the current study with tasks that differ in ILI and orientation, for learning words and other components of language such as grammar and collocations. In relation to vocabulary research, the investigation could focus on different word classes (noun, verb, adjective, and adverb) to determine whether this factor moderates vocabulary learning and retention through different tasks. Last but not least, teachers' beliefs, based on their experience, about the effectiveness of the tasks and the possibility of implementing them in language classes in different courses could also be considered.

Nowadays, the importance of learning English as a second language is remarkably on the rise. While some EFL teachers prefer to teach vocabulary directly, others prefer indirect methods. They maintain that extensive reading is the best way to build a sufficient knowledge of vocabulary. Furthermore, they think that teaching vocabulary or grammar directly is not only an ineffective way of teaching but also an overwhelming task that takes a great deal of class time. As Hart and Risley (2003) stated, direct vocabulary acquisition is memorizing the words term after term with their respective translations or their synonyms and definitions. It is quick, but it is also superficial. Learners encounter vocabulary in an isolated, often infinitive form and remain incapable of using it correctly in context. Moreover, vocabulary learned directly sinks into oblivion faster.

In indirect vocabulary acquisition, learners encounter words together with syntactic information through context in target language reading. Vocabulary in context often appears repeatedly under different aspects and hence becomes ingrained in the learners' minds. Moreover, as reading involves highly complex cognitive processing operations, the reading task makes use of authentic and challenging texts and provides learners with a rhetorical framework for processing and analyzing the text (Nunan, 2004). Thus, indirect vocabulary learning through reading can help learners in comprehending, manipulating, processing, or interacting in the target language while their attention is focused on meaning rather than on manipulating forms.

For judging the different degrees of processing unfamiliar vocabulary through reading, the involvement load hypothesis (ILH) is a good option (Laufer & Hulstijn, 2001). This hypothesis adopts three measurable and operational factors to define the involvement load that each task may have: need, search, and evaluation. The need component is the motivational, non-cognitive dimension of involvement. It is concerned with the need to achieve something. This notion is not interpreted here in its negative sense, based on fear of failure, but in its positive sense, based on a drive to comply with the task requirements, which could be either externally imposed or self-imposed. Need is moderate when it is imposed by an external agent (for example, the need to use a word in a sentence which the teacher has asked the learners to produce), and need is strong when it is imposed on the learners by themselves. Moderate and strong needs subsume different degrees of drive.

Search and evaluation are the two cognitive (information processing) dimensions of involvement, contingent upon noticing and deliberately allocating attention to the form-meaning relationship (Schmidt, 2001). Search is the attempt to find the meaning of an unknown L2 word or the concept by consulting a dictionary or another authority (e.g., a teacher). Evaluation entails a comparison of a given word with other words, of a specific meaning of a word with its other meanings, or of a combination of the word with other words in order to assess whether a word (i.e., a form-meaning pair) does or does not fit its context.

By operationalizing these components, the involvement index (Hulstijn and Laufer, 2001) can be measured: the absence of a factor is marked 0, a moderate presence of a factor equals 1, and a strong presence of a factor equals 2. For example, consider two different tasks. In the first task, learners are asked to write original sentences with new words whose meanings are provided by the teacher. In this case, the need is moderate (imposed by the teacher, hence given 1), there is no search (meanings are provided, hence given 0), and strong evaluation is required in that learners have to use the new words to produce their own sentences (given 2). In the second task, the learner is required to read a text with glosses of the new words and to answer comprehension questions. The task induces a moderate need (hence given 1), but neither search nor evaluation is present (0 given to each). Thus, according to the construct of task-induced involvement, the first task induces a greater involvement load than the second task: 3 (1 + 0 + 2) vs. 1 (1 + 0 + 0).
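The scoring scheme in the preceding paragraph can be illustrated with a minimal Python sketch. It assumes, as the worked example suggests, that the involvement index is the simple sum of the three component scores; the class name `Task` and the method `involvement_index` are hypothetical names introduced only for this illustration and do not come from Hulstijn and Laufer (2001).

```python
from dataclasses import dataclass

# Component scores as described in the text:
# 0 = factor absent, 1 = moderate presence, 2 = strong presence.
ABSENT, MODERATE, STRONG = 0, 1, 2


@dataclass(frozen=True)
class Task:
    """A learning task scored on the three involvement components (hypothetical helper)."""
    name: str
    need: int
    search: int
    evaluation: int

    def __post_init__(self) -> None:
        # Reject scores outside the 0-2 scale used in the text.
        for label in ("need", "search", "evaluation"):
            if getattr(self, label) not in (ABSENT, MODERATE, STRONG):
                raise ValueError(f"{label} must be 0, 1, or 2")

    def involvement_index(self) -> int:
        # Assumption: the index is the sum of the three component scores.
        return self.need + self.search + self.evaluation


# The two example tasks from the paragraph above.
sentence_writing = Task("sentence writing", need=MODERATE, search=ABSENT, evaluation=STRONG)
glossed_reading = Task("reading with glosses", need=MODERATE, search=ABSENT, evaluation=ABSENT)

print(sentence_writing.involvement_index())  # 3
print(glossed_reading.involvement_index())   # 1
```

Under these assumptions, the sentence-writing task scores 3 and the glossed-reading task scores 1, matching the comparison given in the text.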

The concept of involvement can be submitted to empirical investigation by devising incidental-learning tasks with various degrees of need, search and evaluation (Laufer & Hulstijn, 2001). For example, tasks with different involvement indexes can be presented to different groups of participants. After they have finished the tasks the results can be analyzed and compared to see if there is any relationship between the task involvement load and word acquisition and retention.

Taken together, these three components form the involvement load of a task and can explain the effectiveness of a range of task properties. It is expected that a task which elicits a higher involvement load is more effective than a task which elicits a lower one (Laufer & Hulstijn, 2001). The present study was designed on the theoretical basis of the ILH and used the measurable criteria of the three components to define four different reading tasks.

Laufer and Hulstijn (2001) noted that a great deal of the support for the ILH predates its formulation and comes from studies that were not run to test the hypothesis. Understandably, studies having a direct bearing on the hypothesis are still few and far between due to its recent formulation.

Hulstijn and Laufer (2001) conducted two parallel experiments in which their advanced Dutch- and Hebrew-speaking participants (adult English learners) were assigned to six intact groups. Retention of ten unfamiliar words in an incidental learning setting was investigated across three task types: Task 1 included reading comprehension with marginal glosses, Task 2 included comprehension plus filling in the target words, and Task 3 included composition writing with the target words. The tasks had different involvement loads, i.e., various combinations of need, search, and evaluation. The results indicated that Task 3 was more involving and led to better retention than Task 1 and Task 2, thus providing strong support for the ILH.

                                      

Background of the Study

 

Using eight nonsense words, Keating (2008) employed three tasks with different involvement loads to assess the predictive nature of the ILH when low-proficiency learners were involved: task 1, which consisted of a reading passage with marginal glosses; task 2, which included reading comprehension plus fill-in-the-blank texts; and task 3, which consisted of writing original sentences using the target words. In task 1, the participants had to read a passage with five true/false comprehension questions. To answer the questions correctly, participants had to attend to the words, which were highlighted in bold print and glossed in their L1. The involvement index was 1. Participants in group 2 had the same text, but the words were deleted from it, each appearing in the margin with a brief definition, an example sentence, and an L1 gloss. The participants were instructed to fill in the blanks with the glossed words in the margin. The involvement index was 2. Group 3 only had to write original sentences with the words. The index for this group was assessed to be 3. Based on the ILH, it was predicted that group 3 would outperform group 2, which in turn would do better than group 1. The results supported the hypothesis that the ILH can be generalized to low-proficiency learners, though no significant difference was found between the groups on task 3 and task 2 regarding their passive knowledge of the target words.

Kim (2008) also provided empirical evidence for the ILH in a carefully designed study consisting of two experiments. The first experiment addressed the effectiveness of three vocabulary tasks with different involvement indexes. The second experiment examined whether tasks with equal involvement loads would bring about different results for learners with different levels of proficiency. In line with other studies, the results showed that a higher ILI led to more effective initial and delayed vocabulary learning. Furthermore, Kim found that two tasks with identical involvement indexes yielded similar results for the two L2 proficiency levels.

In order to test the ILH, Xu (2009) investigated the impact of three tasks with different ILIs on vocabulary acquisition. The tasks were MC comprehension, BF, and SM questions. As far as the immediate task effects on vocabulary acquisition were concerned, the results partially supported the ILH. That is, the BF and SM tasks, which induce higher involvement loads than the MC task, yielded significantly higher acquisition in the overall immediate posttest as well as in the spelling and collocation measures. However, the SM task did not produce significantly better acquisition than the BF task, as had been predicted. In terms of the delayed task effects on vocabulary acquisition, the results supported the hypothesis only to a limited degree. As expected, the SM task had greater effects than the BF task, which in turn had superior effects as compared to the MC task. The difference between the SM and MC tasks reached a statistically significant level. However, no measurable difference existed between the SM and BF tasks, or between the BF and MC tasks.

Gorgi (2010) carried out a study on the ILH. As he considered whether the task was input or output oriented and how this related to the involvement index, his study revealed results contrary to the prediction of the ILH. Three tasks were included: an input-oriented task with an involvement index of three, the same type of task but with an involvement index of two, and an output-oriented task with the same involvement load as that of the first task. The comparison of the performance of the groups in the immediate and delayed posttests revealed that, contrary to the prediction of the involvement load hypothesis, task 2, with an involvement index of two, was superior to task 1, which had a higher index. Besides, the participants who had completed the output-oriented task (task 3) outperformed those who did the input-oriented task (task 1), despite their index equivalence.

Kargozari and Ghaemi (2011) investigated the effectiveness of the type of written exercise on L2 vocabulary retention. To this end, 53 Iranian EFL university students practiced ten previously unencountered lexical items in three types of written exercises: multiple choice (MC), fill-in-the-blank (FW), and sentence writing (SW). The participants were given a mini-dictionary designed to help them with both the meaning and the usage of the target words while doing the exercises. The findings revealed that the MC group significantly outperformed the other two groups. Although Hulstijn and Laufer (2001) found a significant difference between SF and SW exercises, the findings of this study did not support theirs. Processing was found to be deeper, and more mental processes were used, in SW exercises than in the other two.

As shown, the ILH has been found to account for better learning and retention in some studies. However, more research is necessary to see if there is any difference between reading tasks with different involvement indexes and those with similar indexes that differ in other features, such as being input or output oriented. Furthermore, the feelings and attitudes that learners have towards the use of reading tasks and their effectiveness in general, and the type of tasks involved in particular, have not been the focus of sufficient research. To address these objectives, this study was designed. The following questions were formulated to direct the present study.

The purpose of the current study is therefore to find the immediate and delayed effects of reading-based tasks on vocabulary acquisition as follows:

  1. Which of the following reading tasks (MCQ, SM, MCT, and BF) have a significant effect on vocabulary acquisition?
  2. Which of the following reading tasks (MCQ, SM, MCT, and BF) can lead to better vocabulary retention?
  3. Do different reading-based tasks have different effects on vocabulary acquisition and retention?
  4. What are the learners' beliefs about the tasks and their effectiveness?

 

Methodology

 

Participants

The target population of this study was male and female English undergraduates who were attending Reading Comprehension 4 at Azad University of Najaf Abad. There were 4 intact classes (120 students). However, in order to ensure homogeneous language proficiency, an Oxford Placement Test (OPT, Edwards, 2007) was given to all of the students, and those who scored within one standard deviation above or below the mean were included in this study. The following table (Table 1) presents the number of participants in each group.

 

Table 1.  Number of Participants

Group    Frequency    Percent    Valid Percent    Cumulative Percent
MCQ      20           24.7       24.7             24.7
SM       21           25.9       25.9             50.6
MCT      19           23.5       23.5             74.1
BF       21           25.9       25.9             100.0
Total    81           100.0      100.0

 

Instruments

The instruments used in this study were the OPT (Edwards, 2007), four different reading tasks in the form of two texts followed by twenty questions, an immediate and a delayed posttest, and an interview. The following explanations clarify how these materials were prepared and used.

OPT. This elementary to intermediate level test consists of three parts: grammar (15 items), vocabulary (15 items), and reading comprehension (20 items), for a total of 50 items. One point was given for each correct item, and no points were deducted for incorrect responses. The test was given to the participants in the first session and took 50 minutes to administer.

 

Reading tasks. Four reading tasks with different involvement loads were used in the study to test their effects on vocabulary acquisition. Each task included reading passages that were selected from Reading Through Interaction (Farhady & Mirhassani, 2008), which was the English textbook taught for Reading Comprehension 3 at the time of this study at Azad University of Najaf Abad.

First of all, the participants' previous teachers, who taught Reading Comprehension 3, were consulted to determine which units had not been covered the previous semester. Out of these units, four with interesting titles were selected to attract the readers' attention and encourage better performance. The texts were given to the participants in the four classes three weeks before the treatment. The participants were asked to underline all the words in the four texts that were not familiar to them, but they were not told that they would be tested afterwards on the meaning of those words. To ensure that there were enough target words from which to select a sufficient number of items for each task, the two texts out of the four that had the highest number of new words were chosen. The first passage, which consisted of 126 words, was about the risks that people take in their lives. It contained nine target words, which had been underlined by the students in the reading passage. The second passage, which consisted of 160 words, was about the protection of wild animals. The underlined tar