The Effectiveness of Quizlet in Improving EFL Learners’ Receptive Vocabulary Acquisition

This study compares the efficacy of a digital app, Quizlet, with that of traditional paper flashcards (PFs) in second language (L2) vocabulary acquisition. These learning tools were examined in terms of L2 learners’ receptive vocabulary development, linguistic environments, and perceptions. The study employed a pretest-posttest, quasi-experimental design whereby 121 English vocabulary items were taught to an intact class of 39 high school students in Vietnam over four weeks. The students were assigned to two groups: for the first two weeks, group A used PFs while group B used Quizlet. They then swapped the learning tools for the following two weeks. The data consist of their test scores, questionnaire responses, and audio-visual recordings of six randomly selected participants’ individual learning activities during the interventions. Results suggest that both Quizlet and PFs enhanced L2 vocabulary learning; however, Quizlet did so more effectively than PFs. The findings can be explained by Moreno and Mayer’s Cognitive-Affective Theory of Learning with Media, the different linguistic environments created by the instruments, and the participants’ perceptions of the tools.


Introduction
Lexical learning is central to second language (L2) learning. L2 learners need to develop a rich L2 vocabulary to attain high proficiency because, according to Levelt's lexical hypothesis, words play the central role in generating utterances (17). Therefore, a number of vocabulary learning strategies have been developed to facilitate the memorisation of L2 vocabulary. Among them, paper flashcards (PFs) have traditionally been used in language classrooms due to their usability and effectiveness in increasing vocabulary size (Elgort and Nation 101). On the other hand, rapid advances in information communication technologies (ICT) in recent years have provided alternatives to traditional learning methods. One of them is Quizlet, a popular flashcard app with 40 million users every month (Dizon 45). Pedagogically sound digital tools should, however, incorporate learning principles supported by current research in education and cognitive science. Moreover, given the importance of receptive vocabulary for the comprehension of natural texts, this study investigates the efficacy of Quizlet as compared with traditional PFs in the learning of L2 receptive vocabulary.
The theoretical framework of the research is the Cognitive Affective Theory of Learning with Media (CATLM, Moreno and Mayer 313). According to CATLM, "humans have separate channels for processing different information modalities" (313), and the channels operate on limited working memory capacity. The theory also suggests that affective factors such as attitude and motivation can determine the amount of cognitive effort devoted to a learning task (Moreno and Mayer 313). Thus, CATLM is used to interpret the efficacy of Quizlet and PFs in the current study. Additionally, the study followed Miyamoto's digital project evaluation framework, which evaluates learning tools from three viewpoints: (i) learners' linguistic development, (ii) the linguistic environments the tools create, and (iii) learners' perceptions of the tools (qtd. in Kawaguchi 441). Many digital tools have become available in recent years, and educators try to incorporate such tools in their teaching. The adoption of a specific digital tool tends to depend on the tool's availability and innovativeness rather than its educational effectiveness. It is, however, essential to evaluate the educational value of the tool. Quizlet has become popular for vocabulary learning, but is Quizlet truly better than traditional PFs from all three viewpoints above? In order to investigate the effectiveness of Quizlet, the following research questions (RQ) guide this research: RQ1: Do Vietnamese high school students achieve significant vocabulary gains with Quizlet and PFs? RQ2: Is there any significant difference in the learners' vocabulary gains between these two tools? RQ3: Are there any critical differences between the multimodal linguistic environments created by Quizlet and PFs? RQ4: What are Vietnamese high school students' perceptions of these two tools?

L2 vocabulary acquisition and form-meaning connections
Vocabulary knowledge includes knowledge of receptive and productive vocabulary. Receptive vocabulary refers to the words that language learners can comprehend when they listen to or read them, while productive vocabulary refers to the words that the learners use when speaking or writing (Webb 79). Laufer's study suggests that L2 learners with receptive knowledge of the most frequent 3,000 word families are able to comprehend most authentic reading texts (131). Additionally, if an L2 learner has a receptive vocabulary size of 6,000 to 7,000 word families, they can understand 98% of the words in spoken texts (Nation, "How" 77). Thus, L2 learners need to increase their L2 receptive vocabulary size to develop their receptive skills and overall proficiency in L2.
Every vocabulary item has several aspects that L2 learners need to acquire. Among them, the written and spoken forms, and their meaning, are basic knowledge that people usually acquire in the early stages of vocabulary acquisition. These can be "stored, manipulated and learned separately" and "a form can be recognised, but not linked to a fully elaborated meaning and vice versa". However, whether a language learner can make form-meaning connections will "determine how readily the learner can retrieve the meaning when seeing or hearing the word form, and retrieve the word form when wishing to express the meaning" (Nation, Learning 73). Therefore, it is worth investigating vocabulary learning methods that facilitate the establishment of form-meaning connections.

Information communication technology (ICT) and multimodality in second language acquisition (SLA)
The digital era has seen a marked increase in the use of ICT in teaching and learning second languages. This is mainly due to the substantial resources provided by these technologies, which can be used to develop L2 proficiency (Levy 777). As stated by Kenning, thanks to satellite television and the Internet, "exposure to, and communication in, a foreign language no longer entail travelling to the extent that they used to do" (159). Thus, advances in ICT can offer solutions to the lack of L2 input, interaction, and output, all of which are essential to L2 acquisition (Krashen; Long; Swain). Several studies have suggested the efficacy of ICT in L2 acquisition (Awada et al.; Bower and Kawaguchi; Fukui and Kawaguchi; Ngo and Lee; Nicolas and El-aly; Qian and McCormick; Smith; Thang et al.; Yanagisawa et al.). In addition, ICT has enabled L2 learners to learn the target language via various modalities. For example, when watching videos with L2 subtitles, they practise their listening skills and acquire new words in the language. According to Kress and Leeuwen, multimodality is "the use of several semiotic modes in the design of a semiotic product or event, together with the particular way in which these modes are combined" (20). Other studies have shown that a multimodal learning environment benefits L2 vocabulary acquisition (Mohsen; Khezrlou et al.).
The Cognitive Affective Theory of Learning with Media (CATLM, Moreno and Mayer 313) provides several explanations for the effects of multimodality in education. According to CATLM, humans process auditory and verbal information within the auditory channel, and visual and pictorial information in the visual channel. Another assumption of the theory is that each channel has a limited capacity for cognitive processing (Moreno and Mayer 313). Thus, presenting information via both auditory and visual modalities enables it to be processed within two channels, preventing cognitive overload. The multimodal presentation, in other words, allows the learner to take advantage of both channels' cognitive processing capacity. Additionally, according to CATLM, affective factors can influence learning (Moreno and Mayer 313). For instance, if a student is more cognitively engaged in a lesson because it relates to their interests, this would, in turn, promote better learning outcomes. Therefore, the theory is well suited to investigating the effects of multimodality in SLA.

Paper flashcards (PFs) and Quizlet as vocabulary learning tools
Paper flashcards (PFs) are a popular, traditional tool for deliberate vocabulary learning. Typically, they are "doubled-sided cards" which learners can use to "practise form-to-meaning and meaning-to-form recall in repeated retrieval of L2 words, by flipping the front and backsides of the cards" (Hung 107). According to Elgort, deliberate vocabulary learning activities, e.g., with PFs, can result in the "establishment of formal-lexical and lexical-semantic representations of L2 vocabulary items" (395). Furthermore, since the tool "triggers the acquisition of functional aspects of vocabulary knowledge," L2 learners can automatically access and use the vocabulary in communication (397).
Learners' deliberate attention to word form and meaning connections, triggered by PFs, may then speed up vocabulary acquisition. Additionally, the tool allows retrieval practice. The effort users make when they retrieve the form and meaning of a word can help them memorise and retain it (Barcroft 37). Moreover, L2 learners can use PFs to practise spaced repetition, given that recalling "spaced items" can give them some challenges, and, according to Nation, "successful but difficult retrievals are better for memory than successful but easy retrievals" (Learning 454).
With the advent of ICT, L2 learners can learn vocabulary with not only paper flashcards (PFs) but also digital flashcards (DFs). Currently, one of the most popular digital flashcard apps is Quizlet, with 40 million users every month (Dizon 45). DFs on Quizlet are similar to PFs as they include pictures, forms, and meanings of words on two sides. The notable difference between them is that DFs, but not PFs, enable learners to listen to the pronunciation of the word thanks to text-to-speech technology. Figure 1 illustrates the home page of Quizlet. Study modes include the Flashcards, Learn, Write, Spell, and Test functions. In addition to using DFs in Flashcards, users can answer questions about written forms and meanings of words in Learn, Write, and Test. In Spell, they must type the written forms of the words that they hear. Play modes (i.e., games), on the other hand, include Match, Gravity, and Live. When playing Match, users need to match words with their meanings. In Gravity, they must type correct answers to questions about written forms and meanings of words to prevent asteroids from falling. Both of these games are for individual use. In contrast, Live is a group game. Users are required to work in groups and answer multiple-choice questions about written forms and meanings of words.
A number of studies have compared the effectiveness of PFs with that of DFs in facilitating L2 vocabulary acquisition (Ashcroft et al.; Azabdaftari and Mozaheb; Başoǧlu and Akdemir; Kiliçkaya and Krajka; Lees; Nikoopour and Kazemi; Sage et al.). According to Azabdaftari and Mozaheb, Başoǧlu and Akdemir, and Kiliçkaya and Krajka, DFs were more effective than PFs in developing L2 vocabulary. On the other hand, the other studies suggested that there was no significant difference between the efficacy of DFs and PFs in L2 vocabulary acquisition. Possible reasons for this result are limited Internet access and learners' preferences for PFs (Ashcroft et al.; Lees; Nikoopour and Kazemi; Sage et al.).
Despite the mixed findings, all the previous studies are similar in several ways. Firstly, all of them examined only group values (i.e., one group using PFs and another DFs) but did not investigate individual performance. However, "individual analysis" is crucial to research on ICT-assisted learning, since individual learners' considerable control over digital learning activities can determine the effectiveness of the activities (Kawaguchi 440). Secondly, in those studies, the linguistic environments created by DFs and PFs were not examined, although input, output, and interaction are key factors contributing to language acquisition (Krashen; Long; Swain). The literature review thus reveals a clear research gap concerning the relationship between L2 learners' linguistic development (i.e., vocabulary gains), the different linguistic environments provided by PFs and DFs, and the learners' perceptions of the tools. The current study examines these three aspects concerning PFs and Quizlet in order to identify possible reasons for the tools' effectiveness.

Methodology
Participants
This study involved an intact class of thirty-nine grade ten students (thirty-six female and three male) in a public high school in Vietnam. All of the participants are Vietnamese and have been living in Vietnam since birth. They have been learning English for more than seven years and currently attend three compulsory 45-minute English lessons weekly. All of them have smartphones and computers connected to the Internet, so they would not encounter any problems with accessing Quizlet. They participated in the study voluntarily, and signed consent forms were obtained from them before the commencement of the research.

Learning materials and tools
Four reading texts were selected from Tiếng Anh 11, English textbooks for grade 11 students in Vietnam. These four texts were used as learning materials for participants during the experiment. One hundred twenty-one vocabulary items selected from the passages were identified as the target vocabulary (see Appendix A). The students used PFs and Quizlet to acquire the target vocabulary. All the learning tools were prepared by the project's researchers.
Quizlet: The researchers created 121 DFs on the app. Each flashcard contained, on the front side, a target word, its word category, an example sentence including the target word, and a speaker icon that participants could click to listen to the target word. On the back of the flashcard, there was the L1 (Vietnamese) translation of the target word. The L1 translation makes learners focus fully on the word itself, which facilitates memorisation of new words more effectively than L2 definitions (Laufer and Shmueli 103). Additionally, when the word could be represented by an image, the flashcard included such an image on the back since, according to Nation, pictures "may result in a deeper type of processing" (Learning 449). Apart from DFs, participants could use the other available Study and Play modes to memorise the target vocabulary.
PFs: The researchers created 39 sets of 121 PFs (4.25 x 5.5 cm) for participants and 121 PFs (21 x 29.7 cm) for teaching. The PF sets all contained the same 121 target words, each flashcard with its L1 (Vietnamese) translation, an example sentence and a pictorial representation, just like the DFs on Quizlet. However, because the cards are made of paper, learners were not able to listen to the target word. Instead, the front of each PF contained the phonetic transcription of the target word. Participants could use it to revise the words' pronunciation because they had already learned the International Phonetic Alphabet in class (Hoàng 10). Regarding individual learning activities, participants could use PFs to retrieve the forms and meanings of target words.

Intervention procedure, recordings of individual learning activities and questionnaire
At the beginning of the study, participants were required to take the Vietnamese bilingual version of the English Vocabulary Size Test (Nation and Beglar; Nguyen and Nation) as the baseline test, which lasted one hour. This test measured only their vocabulary size: since all the test item choices were written in Vietnamese (the participants' L1), neither their English grammar nor their reading comprehension was tested (Nguyen and Nation 90). Participants were divided into two equivalent groups based on the test results: group A had 20 participants, coded A1 to A20 to protect anonymity and confidentiality, and group B had 19 participants (coded B1 to B19). The English vocabulary size of participants in each group ranged from 1,000 to 2,900 words. On the same day as the vocabulary size test (VST), all participants were instructed on how to use PFs and Quizlet for vocabulary learning.
After that, they were required to participate in a quasi-experiment (see Figure 2) with pre-tests and post-tests over approximately two months. The experiment was conducted at their school during after-school hours so that it did not interfere with their ordinary lessons. During the experiment, the participants were taught 50 target words in intervention 1 (over two weeks) and 71 target words in intervention 2 (over two weeks). Group A used PFs and group B used Quizlet in intervention 1. They swapped the learning tools in intervention 2. Therefore, both groups were given equal opportunities to learn vocabulary with Quizlet and PFs. This method was adopted to counterbalance the order effect.
During the experimental period, each group attended two 60-minute teaching sessions each week over four weeks. The lessons were delivered in both Vietnamese and English to ensure that the students could understand the teacher's instructions adequately. Each lesson followed a set routine. Firstly, the teacher used, depending on the group, either Quizlet or PFs to teach the target vocabulary. Next, the participants were required to read the reading passage within seven minutes. Following that, they had to identify which sentences contained target words. Then, the teacher used these sentences to explain collocations and parts of speech of the words to the participants. After this, they used either Quizlet or PFs to learn the target words individually for ten minutes. The last activity was group-based. In the lesson with PFs, participants in small groups had to match the cards containing the target words with the ones containing their L1 translations. In the lesson with Quizlet, the group activity was Live.
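The crossover schedule described above can be written down as a small lookup table, which makes it easy to check that each group meets each tool exactly once. This is only an illustrative sketch: the names `SCHEDULE`, `tool_for`, `total_target_words`, and `is_counterbalanced` are hypothetical and not part of the study's materials; the group labels, tools, and word counts are taken from the text.

```python
# Hypothetical encoding of the counterbalanced (crossover) design.
# Keys are intervention numbers; values come from the description above.
SCHEDULE = {
    1: {"A": "PFs", "B": "Quizlet", "target_words": 50},
    2: {"A": "Quizlet", "B": "PFs", "target_words": 71},
}


def tool_for(group, intervention):
    """Return the tool a group used in a given intervention."""
    return SCHEDULE[intervention][group]


def total_target_words(schedule=SCHEDULE):
    """Sum the target words taught across all interventions."""
    return sum(entry["target_words"] for entry in schedule.values())


def is_counterbalanced(schedule=SCHEDULE, groups=("A", "B")):
    """Check that every group uses both tools once and that the two
    groups never use the same tool within one intervention."""
    for group in groups:
        tools_used = [schedule[i][group] for i in sorted(schedule)]
        if sorted(tools_used) != ["PFs", "Quizlet"]:
            return False
    return all(schedule[i]["A"] != schedule[i]["B"] for i in schedule)


if __name__ == "__main__":
    print(tool_for("A", 1), tool_for("B", 1))
    print(total_target_words())
    print(is_counterbalanced())
```

Under this encoding the word counts of the two interventions add up to the 121 target items, and the order of tools is mirrored across the two groups.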

Pre-tests, immediate post-tests and delayed post-tests
The current study aimed to compare the efficacy of Quizlet versus PFs for L2 receptive vocabulary acquisition. Thus, participants took two pre-tests, two immediate post-tests, and two delayed post-tests (see Appendix A), so that their vocabulary gains could be measured individually and as a group. The tests were all paper-based, and each of them included a listening section and a multiple-choice section. The listening section aimed to quantify participants' word-form gains, while the multiple-choice section was used to quantify the learners' word-meaning gains. Participants' vocabulary gains refer to both word-form and word-meaning gains.
In the listening section, test takers listened to one target word at a time and spelled it out. The multiple-choice section was modelled on the Vocabulary Size Test (Nation and Beglar). Each of its questions included a target word, a simple non-defining sentence including the word, and four choices. All the choices in the tests were written in participants' L1 (Vietnamese), so the vocabulary tests assessed their knowledge of the target vocabulary, not English grammar and reading skills (Nguyen and Nation 90). Immediate post-test 1 and delayed post-test 1 were the same as pre-test 1. Each of the tests consisted of a total of one hundred questions about the target words taught during intervention 1, and lasted forty-five minutes. Likewise, immediate post-test 2 and delayed post-test 2 were similar to pre-test 2. Each of these had a total of 142 questions about the vocabulary items taught in intervention 2 and lasted one hour.
The pre-test was taken one day before the first lesson of each intervention in order to investigate whether participants already knew the target words taught in that intervention. The immediate post-test was used to test how many words participants had learned after the intervention and was administered one day after the last teaching session of the intervention. The delayed post-test was taken two weeks after the intervention and aimed to examine whether participants retained the target words after a certain period of time (see Figure 2 for the testing schedule).

The multimodal linguistic environments created by PFs and Quizlet: Recordings of individual learning activities
Another aim of the current study was to investigate the multimodal linguistic environments created by PFs and Quizlet. Therefore, three participants in each group (A1, A10, A17, B6, B14 and B34) were randomly selected, and each of them was recorded twice: while individually learning target vocabulary with PFs (video recording) and while doing so on Quizlet (screen-capture recording). Each of the video recordings lasted approximately 10 minutes.

Questionnaire about students' perceptions of Quizlet and PFs
All participants completed an online questionnaire through SurveyMonkey (see Appendix B) one week after intervention 2. The questionnaire was written in Vietnamese and included eight questions. Questions one and two asked participants to report how often they learned target vocabulary with either PFs or Quizlet outside the classroom. The next two questions were about the length of each self-study session. Questions five and six were in the Likert format. They included two question items about the enjoyment of learning vocabulary with the tools. Additionally, according to Davis's technology acceptance model, a person's "behavioral intention" of using a technology can be predicted by his or her "perceived usefulness" and "ease of use" of the technology (333). Thus, other question items in questions five and six asked about participants' perceptions of the effectiveness of Quizlet and PFs in developing vocabulary and the ease of using them.

Data analysis
In the pre- and post-tests, each question was worth one point. The researchers marked all of the tests manually, and then analysed participants' scores with t-tests. The dependent t-test was used to identify the statistical significance of participants' gains after learning target words with PFs and Quizlet in each intervention. The independent t-test was employed to compare the vocabulary gains of groups A and B. As for the multimodal linguistic environments created by PFs and Quizlet, the data collected from video recordings and screen captures were analysed to identify the input, output-producing opportunities, and feedback provided by the tools. Regarding students' perceptions of the two tools, participants' responses to the questionnaire were analysed using descriptive statistics. Additionally, keyword analysis was performed using KWIC Concordance software (Tsukamoto) to evaluate the students' open comments in the questionnaire.
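As an illustration of the dependent (paired) t-test mentioned above, the following minimal sketch computes the t statistic, the degrees of freedom, and a standardised effect size for paired scores. It is not the researchers' analysis script: the function name `paired_t` and the example scores are invented for illustration, the effect size shown is the mean difference divided by the standard deviation of the differences (one common definition of Cohen's d for paired data), and no p-value is computed because that would require a t-distribution routine outside the standard library.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df, d) for paired scores.

    t  : paired-samples t statistic
    df : degrees of freedom (n - 1)
    d  : mean difference / SD of differences (equals t / sqrt(n))
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    d = mean_diff / sd_diff
    return t, n - 1, d


if __name__ == "__main__":
    # Made-up scores for five hypothetical learners, not study data.
    pre = [10, 12, 9, 11, 13]
    post = [14, 15, 13, 16, 15]
    t, df, d = paired_t(pre, post)
    print(f"t = {t:.4f}, df = {df}, d = {d:.3f}")
```

Because this effect size equals t divided by the square root of n, a reader can cross-check reported values: for example, t = 25.4488 with n = 20 gives roughly 5.69.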

Results and Discussion
Vocabulary gains with either PFs or Quizlet
One research question of the current study is whether participants achieved substantial vocabulary gains using either PFs or Quizlet. Vocabulary gains refer to word-form gains and word-meaning gains, measured respectively through the listening and multiple-choice sections of pre-tests, immediate post-tests, and delayed post-tests. Paired t-tests were used to compare participants' scores in the tests with the two tools. The Bonferroni correction was applied to offset the chances of a Type I error for the analyses, so the level of statistical significance was set at p < 0.0167 (0.05/3).
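The corrected threshold can be reproduced with a few lines of code. This sketch assumes a family of three comparisons per analysis, as implied by the divisor in 0.05/3; the names `bonferroni_threshold` and `is_significant` are hypothetical, and the p-values in the usage example are ones reported in this section.

```python
ALPHA = 0.05        # nominal significance level
N_COMPARISONS = 3   # comparisons per family, as implied by 0.05/3


def bonferroni_threshold(alpha=ALPHA, m=N_COMPARISONS):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / m


def is_significant(p, alpha=ALPHA, m=N_COMPARISONS):
    """True if p falls below the Bonferroni-corrected threshold."""
    return p < bonferroni_threshold(alpha, m)


if __name__ == "__main__":
    print(round(bonferroni_threshold(), 4))
    for p in (0.0041, 0.0176, 0.0275):
        print(p, is_significant(p))
```

Under this threshold, a result such as p = 0.0275 would pass the conventional 0.05 level but not the corrected one, which is why it is treated as non-significant.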
Word-form gains: Tables 1 and 2 present the results of dependent t-tests on group A and group B scores in the listening sections, respectively. Each table includes total scores (N), mean scores (Mean), percentages of the target words that the learners knew or remembered on average, range values (Range), and standard deviations (SD). [...] were markedly better than those in their listening section of pre-test 2. However, group A showed significantly lower scores in the listening section of delayed post-test 1 than in the listening section of immediate post-test 1 (t = 25.4488, p < 0.0001, Cohen's d = 5.689, df = 19). Similarly, their scores in the listening section of delayed post-test 2 were significantly lower than those in their listening section of immediate post-test 2 (t = 4.12, p = 0.0006, Cohen's d = 0.923, df = 19). Group B performed significantly better in the listening sections of immediate post-test 1 (t = 35.3384, p < 0.0001, Cohen's d = 8.105, df = 18) and delayed post-test 1 (t = 14.4999, p < 0.0001, Cohen's d = 3.327, df = 18) than in the listening section of pre-test 1. Similarly, they obtained significantly higher scores in the listening sections of immediate post-test 2 (t = 4.2895, p = 0.0004, Cohen's d = 0.984, df = 18) and delayed post-test 2 (t = 3.2892, p = 0.0041, Cohen's d = 0.754, df = 18) than in the listening section of pre-test 2. However, they obtained significantly lower scores in the listening section of delayed post-test 1 than in the listening section of immediate post-test 1 (t = 26.5683, p < 0.0001, Cohen's d = 6.095, df = 18). Additionally, their scores in the listening section of delayed post-test 2 were significantly lower than those in their listening section of immediate post-test 2 (t = 3.9506, p = 0.0009, Cohen's d = 0.907, df = 18).
Word-meaning gains: Tables 3 and 4 present the results of dependent t-tests on group A and group B scores in the multiple-choice sections, respectively. Group A performed significantly better in the multiple-choice section of immediate post-test 1 (t = 6.3600, p < 0.0001, Cohen's d = 1.422, df = 19) than in the multiple-choice section of pre-test 1. In addition, their scores in the multiple-choice sections of immediate post-test 1 and delayed post-test 1 were not significantly different (t = 1.7855, p = 0.0902, Cohen's d = 0.399, df = 19). Moreover, their scores in the multiple-choice section of delayed post-test 1 (t = 2.3877, p = 0.0275, Cohen's d = 0.534, df = 19) were not significantly higher than those in their multiple-choice section of pre-test 1. Their scores in the multiple-choice sections of immediate post-test 2 (t = 9.9276, p < 0.0001, Cohen's d = 1.633, df = 19) and delayed post-test 2 (t = 3.6527, p = 0.0017, Cohen's d = 0.817, df = 19) were significantly higher than those in their multiple-choice section of pre-test 2. However, the group obtained significantly lower scores in the multiple-choice section of delayed post-test 2 than in the multiple-choice section of immediate post-test 2 (t = 6.2064, p < 0.0001, Cohen's d = 1.387, df = 19). Similarly, group B scores in the multiple-choice sections of immediate post-test 1 (t = 10.4887, p < 0.0001, Cohen's d = 2.407, df = 18) and delayed post-test 1 (t = 6.5544, p < 0.0001, Cohen's d = 1.504, df = 18) were significantly better than those in their multiple-choice section of pre-test 1. Also, the group obtained higher scores in the multiple-choice sections of immediate post-test 2 (t = 12.2252, p < 0.0001, Cohen's d = 2.615, df = 18) and delayed post-test 2 (t = 4.3500, p = 0.0004, Cohen's d = 0.998, df = 18) than in the multiple-choice section of pre-test 2.
However, the group showed significantly lower scores in the multiple-choice section of delayed post-test 1 (t = 8.261, p < 0.0001, Cohen's d = 1.896, df = 18) than in the multiple-choice section of immediate post-test 1. Also, their scores in the multiple-choice section of delayed post-test 2 (t = 8.7159, p < 0.0001, Cohen's d = 1.999, df = 18) were significantly lower than those in their multiple-choice section of immediate post-test 2.
To summarise, since groups A and B obtained higher scores in immediate post-tests and delayed post-tests than in pre-tests, they made considerable vocabulary gains after learning target vocabulary with both PFs and Quizlet. Thus, both Quizlet and PFs are effective in developing L2 vocabulary. The finding is in line with previous research into these tools (Ashcroft et al.; Azabdaftari and Mozaheb; Başoǧlu and Akdemir; Kiliçkaya and Krajka; Lees; Nikoopour and Kazemi; Sage et al.). However, participants in both groups experienced attrition of the target vocabulary, as shown in the delayed post-tests. This result indicates that newly learned vocabulary items should be revised regularly for vocabulary retention.

Vocabulary gains comparison between groups using PFs and those using Quizlet
Another research question concerns the efficacy of Quizlet versus PFs. The two groups' word-form gains (i.e., scores in the listening sections of pre-tests, immediate post-tests and delayed post-tests) and word-meaning gains (i.e., scores in the multiple-choice sections of pre-tests, immediate post-tests, and delayed post-tests) were compared using independent t-tests. The Bonferroni correction was applied to offset the chances of a Type I error for the analyses, so the level of statistical significance was set at p < 0.0167 (0.05/3).
Word-form gains: In the listening section of pre-test 1, group A performed significantly better than group B (t = 2.9140, p = 0.0060, Hedges' g = 0.93, df = 37). However, the actual advantage of group A students' word knowledge at pre-test 1 was only 1.04 words out of fifty. Therefore, before intervention 1, the two groups' knowledge of the forms of target words taught during the intervention was roughly equivalent. In the listening section of immediate post-test 1, group B (i.e., the group using Quizlet) performed significantly better than group A (i.e., the group using PFs) (t = 17.3180, p < 0.0001, Hedges' g = 5.55, df = 37). Similarly, group B scores in the listening section of delayed post-test 1 were significantly higher than those of group A (t = 9.4916, p < 0.0001, Hedges' g = 3.04, df = 37). An independent t-test (t = 0.4151, p = 0.6804, df = 37) indicated that group A's and group B's knowledge of the forms of target words taught in intervention 2 did not differ statistically before the intervention began. Similarly, although group A on average obtained higher scores than group B in the listening section of immediate post-test 2, the gap was not significant (t = 0.9485, p = 0.349, df = 37). Nevertheless, group A performed significantly better than group B in the listening section of delayed post-test 2 (t = 2.7861, p = 0.0084, Hedges' g = 0.89, df = 37) (see Appendix D).
Word-meaning gains: Participants in groups A and B had approximately the same knowledge of the meanings of target words taught in intervention 1 before it commenced (t = 0.5789, p = 0.5662, df = 37). However, group B (i.e., the group using Quizlet) performed significantly better than group A (i.e., the group using PFs) in the multiple-choice section of immediate post-test 1 (t = 8.7902, p < 0.0001, Hedges' g = 2.82, df = 37). In contrast, group B scores in the multiple-choice section of delayed post-test 1 were not significantly higher than group A scores (t = 2.4853, p = 0.0176 > 0.0167, df = 37). Group A and group B scores in the multiple-choice section of pre-test 2 were not significantly different (t = 0.9847, p = 0.3312, df = 37). Also, their scores in the multiple-choice section of immediate post-test 2 were not significantly different (t = 0.3112, p = 0.7574, df = 37), though group A (i.e., the group using Quizlet) had a higher mean score than group B (i.e., the group using PFs). On the other hand, group A performed significantly better than group B in the multiple-choice section of delayed post-test 2 (t = 3.5383, p = 0.0011, Hedges' g = 1.13, df = 37) (see Appendix D).
All things considered, scores of the groups using Quizlet were considerably higher than those of the groups using PFs. In other words, when the groups used Quizlet, their vocabulary gains were higher than when they used PFs. In general, Quizlet seems to help L2 learners acquire vocabulary more effectively than PFs. This is consistent with previous studies into the two tools (Başoǧlu and Akdemir; Kiliçkaya and Krajka; Azabdaftari and Mozaheb). The findings can be understood in terms of the different cognitive loads imposed by learning vocabulary with PFs and Quizlet. With PFs, participants must process all aspects of the word (i.e., written form, spoken form, meaning, and syntactic category) within the visual channel. The learning load might easily exceed the limited working memory capacity of the channel, which leads to cognitive overload and impedes meaningful learning (Mayer and Moreno 45). In contrast, when using Quizlet, the students can process the spoken form of the word in the auditory channel and the visual aspects in the visual channel. Thus, neither channel is as likely to be overloaded by having to process all of the required information. Our findings, therefore, support the CATLM assumptions about two separate information processing channels (i.e., auditory and visual channels), each with limited cognitive processing capacity, sharing the cognitive load in parallel (Moreno and Mayer 313).
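For readers who wish to cross-check the between-group effect sizes, a standardised mean difference can be recovered from an independent-samples t statistic and the two group sizes (20 and 19 here). The sketch below is a reader's check rather than the authors' procedure; the function names are hypothetical. It shows both the plain conversion and the version with the usual small-sample correction factor associated with Hedges' g. The text does not state which variant was used; the reported values appear to match the plain conversion.

```python
import math


def d_from_t(t, n1, n2):
    """Standardised mean difference from an independent-samples t."""
    return t * math.sqrt(1 / n1 + 1 / n2)


def hedges_correction(n1, n2):
    """Approximate small-sample correction factor J = 1 - 3 / (4*df - 1)."""
    df = n1 + n2 - 2
    return 1 - 3 / (4 * df - 1)


def hedges_g_from_t(t, n1, n2):
    """Small-sample corrected effect size (d multiplied by J)."""
    return hedges_correction(n1, n2) * d_from_t(t, n1, n2)


if __name__ == "__main__":
    n_a, n_b = 20, 19  # group sizes given in the Methodology
    for t in (2.9140, 17.3180, 8.7902):
        print(t,
              round(d_from_t(t, n_a, n_b), 2),
              round(hedges_g_from_t(t, n_a, n_b), 2))
```

With 37 degrees of freedom the correction factor is close to 1, so the two variants differ only slightly and the interpretation of the effects does not change.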

Linguistic environments created by Quizlet and PFs
We observed six randomly selected participants through video recordings (PFs) and screen captures (Quizlet) while they carried out their individual learning activities. Because PFs are paper-based, learners received only visual input (i.e., written forms, meanings, syntactic categories, example sentences and phonetic transcriptions of target words, and pictures) from learning activities with the tool. On the other hand, when using Quizlet, almost all learners received both visual and audio input. Curiously, one learner (B34) used only the modes that provide visual input alone (i.e., Test, Match and Gravity); consequently, this participant, unlike the other Quizlet users, did not receive auditory input from the app.
Regarding spoken output, three learners (i.e., A10, A17 and B14) pronounced target words when looking at the words' phonetic transcriptions on PFs, although producing spoken output was not required and audio input was not provided by the tool. In contrast, despite the large amount of audio input provided by Quizlet, only B14 produced spoken output when using the tool. In terms of written output, all participants except A17 produced it on Quizlet, as almost all of the app's modes required it. Three of them also produced written output when doing individual learning activities with PFs, though it was not compulsory.
Concerning interaction, PFs are relatively limited in providing feedback. According to the video recordings, participants had to flip the cards manually to check whether they had correctly remembered the form or meaning of a word. On the other hand, thanks to its digital nature, Quizlet provides immediate corrective feedback. For example, the recording of A1's performance on the app showed that when the student mistyped the word illiterate, its correct written form was given immediately, and A1 had to type it again.
In conclusion, Quizlet provided participants with more substantial input, feedback and output opportunities than PFs. The findings show that when participants used Quizlet they attained, on average, higher scores in all immediate and delayed post-tests than when they learned their vocabulary with PFs. In other words, Quizlet is more effective than PFs in developing L2 vocabulary because the linguistic environment created by the app offers greater advantages for vocabulary acquisition. However, it is worth noting that, despite the lack of spoken input, PFs encouraged the students to produce spoken output more successfully than Quizlet. The difference in participants' behaviours might result from the different characteristics of the learning activities on Quizlet and PFs. With Quizlet, learners must follow the guidance and complete the tasks set by the app, which makes them more focused on memorising the written and aural forms and the meanings of the target words, but the app does not require them to speak the target words. By contrast, the learners had full control of their learning activities with PFs. Additionally, the phonetic transcriptions of target words available on the cards might have prompted the students to pronounce them. However, the pronunciations were not guaranteed to be target-like.
Listening to and repeating new words is a recommended strategy for remembering their pronunciations. The reason is that learners, especially those at an early stage of learning a second language who have limited L2 vocabulary knowledge, rely mainly on the phonological representations of new words stored in working memory to remember their spoken forms (Gathercole et al. 403). Thus, the repetition of new vocabulary items extends the period during which their phonological forms remain in working memory, which leads to their retention in long-term memory (Baddeley et al. 158; Ellis and Beaton 535). Therefore, a Quizlet module that requires users to pronounce vocabulary items and gives feedback on their pronunciation would be beneficial for L2 learners.

Learners' perceptions of PFs and Quizlet
All participants stated that learning vocabulary on Quizlet was enjoyable, while 82% agreed that learning with PFs was enjoyable. Also, no one disagreed about the usability of Quizlet, whereas three participants did not think that PFs were user-friendly. Additionally, the number of learners who agreed that Quizlet increased their vocabulary learning speed (87.2%) was slightly higher than the number of those reporting that PFs enabled them to acquire vocabulary quickly (82.1%). On the other hand, slightly more participants thought that PFs helped them improve and retain vocabulary than thought so of Quizlet. However, according to their self-reports, the number of participants learning vocabulary on Quizlet very frequently (i.e., at least four times a week; 43.6%) was nearly double the number of those doing so with PFs (23.1%; see Appendix C).
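To make these percentages easier to compare, they can be converted back into approximate head counts. The sketch below assumes that all 39 students in the class answered every questionnaire item, which the text does not confirm; the labels and the function name are illustrative.

```python
# Assumption: every one of the 39 students answered each questionnaire item.
N_PARTICIPANTS = 39

# Percentages as reported in the text (labels are illustrative).
REPORTED_PERCENTAGES = {
    "Quizlet increased learning speed": 87.2,
    "PFs increased learning speed": 82.1,
    "Quizlet used very frequently": 43.6,
    "PFs used very frequently": 23.1,
}


def implied_count(percentage, n=N_PARTICIPANTS):
    """Return the whole number of respondents closest to the percentage."""
    return round(percentage / 100 * n)


for label, percentage in REPORTED_PERCENTAGES.items():
    count = implied_count(percentage)
    print(f"{label}: {percentage}% is about {count} of {N_PARTICIPANTS} students")

# Ratio behind the "nearly double" comparison of very frequent use.
frequent_use_ratio = (
    implied_count(REPORTED_PERCENTAGES["Quizlet used very frequently"])
    / implied_count(REPORTED_PERCENTAGES["PFs used very frequently"])
)
print(f"Very frequent use, Quizlet to PFs: {frequent_use_ratio:.2f}")
```

Under this assumption, roughly 17 students reported very frequent use of Quizlet against roughly 9 for PFs, which is where the "nearly double" comparison comes from.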
Participants' responses to the online survey revealed, moreover, that the majority (56.41%) preferred Quizlet, while the remainder (43.59%) favoured PFs. KWIC Concordance software (Tsukamoto) was used to analyse the key words appearing in their reasons for these preferences. The word remember was mentioned thirteen times by the students preferring Quizlet, and nine times by those preferring PFs. Participants stated that the tools helped them remember new words faster. The word interesting appeared ten times in the reasons for preferring Quizlet and six times in those favouring PFs. Participants also stated that they favoured the tools for their usability and convenience. The words user-friendly and convenient were mentioned five times each by participants preferring Quizlet, and six and seven times, respectively, by those favouring PFs.
All things considered, most of the participants had positive perceptions of both Quizlet and PFs, and used the tools to learn vocabulary regularly. Moreover, more learners agreed about the usability and enjoyment of Quizlet and its positive effect on their vocabulary learning speed. These findings on participants' perceptions help explain the efficacy of Quizlet relative to PFs for vocabulary acquisition. They also support CATLM's assumptions concerning the influence of affective factors and motivation on learning (Moreno and Mayer 313).

Conclusion
The study investigated the effectiveness of Quizlet and PFs for vocabulary acquisition within the theoretical framework of CATLM (Moreno and Mayer 313) and Miyamoto's evaluation framework of digital learning tools (qtd. in Kawaguchi 441). The study posed four research questions. The first asked whether Vietnamese high school students achieved significant vocabulary gains with Quizlet and PFs. According to the statistical analyses, students made significant vocabulary gains regardless of which tool they used in each intervention. Thus, our research suggests that both Quizlet and PFs should be utilised in classroom settings, as they have positive effects on the acquisition of L2 vocabulary. The second question asked whether there are any significant differences in vocabulary gains between the two tools. Our analyses suggested that when the groups used Quizlet they made greater vocabulary gains from pre-tests to immediate post-tests and from pre-tests to delayed post-tests. Therefore, Quizlet promotes vocabulary acquisition more effectively than PFs.
Statistical tests suggest that Quizlet, which includes both auditory and visual input, has greater potential to develop L2 vocabulary than PFs, which include only visual input. This is in line with the assumptions of CATLM (Moreno and Mayer 313) and is supported by our findings concerning the linguistic environments created by Quizlet and PFs, as well as learners' perceptions of the tools. Thus, teachers should consider the advantages offered by ICT to facilitate L2 vocabulary acquisition and engage students in the classroom. The third question relates to the differences between the linguistic environments created by the two tools. Data analysis revealed that the multimodal linguistic environment created by Quizlet offers more input, learning activities, output opportunities, and detailed feedback than the one created by PFs. The last question concerns Vietnamese high school students' perceptions of these two tools. According to their responses to the questionnaire, they were cognitively, behaviourally, and emotionally engaged in vocabulary learning activities with both tools. However, Quizlet encouraged them to engage in vocabulary learning more frequently than PFs did. They also expressed higher emotional engagement in using Quizlet than PFs, perceived Quizlet as more user-friendly, and stated that the app helped them to acquire vocabulary faster than PFs. On the other hand, PFs were considered to be more effective for vocabulary memorisation and development.
On completion of this study, we have a couple of recommendations for Quizlet to strengthen its effectiveness. As listening to and repeating new words reinforces memorisation of their spoken forms, a mode should be added to Quizlet that encourages users to practise pronouncing words. This mode should include diagnostic feedback on pronunciation. We believe that current digital technologies have this capacity and that such an addition would benefit EFL learners enormously, particularly in countries like Vietnam where there are few opportunities to receive input from native speakers of English.
The present study has several limitations. First, we investigated only one high school in Hai Duong, Vietnam. It would be important to confirm (or otherwise) our results with other schools in other provinces in Vietnam, or in other countries. Another limitation is that the sample size is relatively small. This study involved only one intact class at high school level. Future studies would need to examine a larger sample to support the generalisability of the current research findings. Also, the duration of the experimental study was relatively short, and may not have been long enough to observe whether participants were able to retain the target words in memory. A

Table 4: Percentage of target words that participants knew or remembered on average
Table: Group A and group B scores in listening sections of pre-test 2 and immediate post-test 2 (%)