Comparing Two Approaches for Measuring English as a Second Language

This paper aims to investigate how proficiency rating scales, such as the Common European Framework of Reference for Languages (CEFR; Council of Europe), measure English as a second language (L2). While the CEFR has played an important role as a reference tool in second/foreign language teaching, learning, and assessment worldwide, few empirical studies have explored how L2 learners at each proficiency level of the CEFR perform in various linguistic situations. In the present study, data were collected from eighty-eight Japanese native (L1) speakers learning English as an L2, each of whom performed two tasks, namely spoken and written narratives. The participants’ English L2 performance was assessed using a proficiency rating scale, i.e., the CEFR, and their grammatical development was analysed through Pienemann’s Processability Theory (PT). The results of the analyses demonstrated that there was a statistically significant correlation between the CEFR levels and PT stages, but only in the spoken production. The Japanese learners of English at the higher developmental stages as found in the PT analysis were not necessarily regarded as ‘independent’ or ‘proficient’ L2 users according to the CEFR rating. Further, discrepancies between the two approaches were evident, particularly in the L2 written production.


Introduction
The aim of the present study is to compare two approaches for measuring English L2. In this study, the participants' L2 proficiency is measured by the Common European Framework of Reference for Languages (CEFR), a proficiency rating scale. Their L2 grammatical development is analysed based on a theory of second language acquisition (SLA), namely Processability Theory (PT; Pienemann; Bettoni and Di Biase), which provides a direct measure of the learner's L2 morphosyntax. The CEFR has been adopted as a useful reference tool in language teaching, learning, and assessment in various regions since its publication in Europe. However, it has received some criticism from SLA scholars. For instance, Wisniewski argues that little is known about how the CEFR scale system is related to empirical learner language (233). Since the CEFR scale system has been widely used in foreign language education in Asian countries as well, it is important to conduct empirical research on the CEFR levels with Asian L2 learner corpora.
This paper first presents a brief sketch of the CEFR proficiency scale system and then discusses the issues raised in SLA literature. Next, PT, the theoretical framework used for the analysis of L2 morphosyntactic development in the present study, is described. After a review of previous studies, the research question and methodology are explained. In the latter part of the paper, the results of the analyses are presented and discussed.

The Common European Framework of Reference for Languages (CEFR)
The CEFR offers a comprehensive description of "what language learners have to do in order to use a language for communication and what knowledge and skills they have to develop so as to be able to act effectively" (Council of Europe 1). In the CEFR scale system, learners' communicative language proficiency is assessed at six levels: A1, A2, B1, B2, C1, and C2. Those at the A1/A2 level are regarded as Basic Users, those at the B1/B2 level as Independent Users, and those at the C1/C2 level as Proficient Users (Council of Europe 23). The CEFR describes what learners at each CEFR level can do in their oral and written production, as shown in Tables 1 and 2, respectively.

C2: Can write clear, smoothly flowing, complex texts in an appropriate and effective style and a logical structure which helps the reader to find significant points.

C1: Can write clear, well-structured texts of complex subjects, underlining the relevant salient issues, expanding and supporting points of view at some length with subsidiary points, reasons and relevant examples, and rounding off with an appropriate conclusion.

B2: Can write clear, detailed texts on a variety of subjects related to their field of interest, synthesising and evaluating information and arguments from a number of sources.

B1: Can write straightforward connected texts on a range of familiar subjects within their field of interest, by linking a series of shorter discrete elements into a linear sequence.

A2: Can write a series of simple phrases and sentences linked with simple connectors like 'and', 'but' and 'because'.

A1: Can write simple isolated phrases and sentences.

The CEFR scale system has been widely used as a common reference tool in recent second/foreign language education, in particular for syllabus construction, curriculum coordination, and the preparation of teaching materials and examinations. Thus, it can be claimed that the CEFR should play a crucial role in education for language learners in the digital age. However, it has been pointed out that little empirical research on the CEFR levels with L2 learner corpus data has been conducted. For instance, Hulstijn claims that "[a]ny association between CEFR levels of L2P [L2 Proficiency] and L2 development as studied in the second language acquisition (SLA) literature would be completely misplaced […], unless empirical studies show evidence in its support" (241). In particular, studies with learner corpora for L2 spoken production and for lower CEFR levels are limited (Wisniewski 246). Also, few studies have addressed the association between L2 proficiency measured by the CEFR rating and L2 development analysed based on SLA theories. The next section provides a brief introduction to Processability Theory (PT) and one of its hypotheses, used in this study for the analyses of L2 development.

Processability Theory (PT)
PT assumes that a universal hierarchy of L2 development exists. Based on Levelt's Speech Model and Lexical Functional Grammar (LFG) (Bresnan; Dalrymple), PT hypothesises the learners' developmental stages regarding the acquisition of grammatical structures, including morphology and syntax. In 2005, PT proposed new hypotheses concerning the acquisition of syntactic structures (Pienemann, et al.) in accordance with the development of LFG. The current study uses one of the hypotheses proposed in this PT extension, the Lexical Mapping Hypothesis (LMH; Pienemann, Di Biase, and Kawaguchi), in order to analyse the participants' L2 development.
The LMH focuses on the development of argument mapping between thematic roles and grammatical functions in sentence construction. It assumes that L2 learners start using "default mapping" when they become able to produce utterances of more than one word. In "default mapping", the highest available role in the thematic hierarchy, the Agent, is mapped onto the Subject (SUBJ) grammatical function. The sample sentence in (1), "Mike washed the car," shows a typical "default mapping" with the transitive verb "wash", which requires two arguments. In this sentence, the most prominent role, the Agent "Mike", is mapped onto the SUBJ, and the less prominent role, the Patient "the car", is mapped onto the Object (OBJ), as shown in Figure 1. Many scholars, such as Pinker, have also argued that beginning language learners map the most prominent thematic role onto the SUBJ. In the LMH, L2 learners are assumed to gradually learn how to direct the listener's attention to a particular thematic role lower in the hierarchy by promoting it to the SUBJ and de-focusing the highest role by mapping it onto a grammatical function other than the SUBJ, or by suppressing it. A typical case of non-default mapping is the Passive. In the sample sentence in (2), "The car was washed by Mike," the Patient "the car" is mapped onto the most prominent grammatical function, SUBJ, while the highest thematic role, the Agent, is suppressed and appears as an Adjunct, "by Mike," as represented in Figure 2.
Since a much higher processing cost is required for "non-default mapping," L2 learners are hypothesised to become able to produce it only after "default mapping" is in place. The developmental stages of English syntactic structures hypothesised in the LMH are summarised in Table 3. The LMH predicts that each argument mapping is acquired from the bottom up, from stage 1 to stage 4, as shown in the table. To be more specific, "default mapping" appears at stage 2, and the additional argument can be produced after "default mapping" at stage 3. Then, "non-default mapping" occurs at stage 4. The developmental stages hypothesised in the LMH have been tested in various L2 contexts (e.g., Bettoni, et al.; Di Biase, et al.; Kawaguchi; Keatinge and Keßler; Wang).

Stage 2: Default Mapping. Canonical word order, e.g., agent-event-patient. Example: Mike washed the car.

Stage 1: Lemma Access. Single words, formulas. Examples: Listen. Thank you.

Previous studies
The relationship between L2 proficiency and L2 development has not yet been investigated extensively with L2 learner data (e.g., Hulstijn 241; Wisniewski 232). However, positive connections between the CEFR levels and the developmental stages predicted in PT have been reported in some recent research (e.g., Granfeldt and Agren; Hagenfeld; Yamaguchi). In Granfeldt and Agren, the development of morphosyntax was analysed based on PT, and communicative proficiency was measured by two CEFR raters, using written data produced by thirty-eight Swedish speakers learning French as a third language (L3). The results indicated a strong connection between the CEFR rating and the PT analysis, although a dispersion was shown to exist at more advanced stages. Based on these findings, Granfeldt and Agren claim that learners' communicative proficiency up to the CEFR B1 level and morphosyntactic development up to PT's stage 3 seem to develop at the same rate.
Hagenfeld examined speech samples of nine learners of English at different CEFR levels and found possible interfaces between CEFR levels and PT stages. Based on the findings, Hagenfeld argues that linguistic profiling within the PT framework may be able to compensate for the drawbacks of proficiency rating scales based on the CEFR, which rely on raters' opinions and judgments and may therefore lack objectivity and consistency.
While these studies have demonstrated some relationship between L2 proficiency and L2 development for morphosyntactic structures, Yamaguchi's earlier research on Japanese learners of English showed some connection between the CEFR levels and PT stages for English syntax. However, studies on the relationship between CEFR levels and PT stages using both spoken and written data from a larger number of L2 learners, in particular Asian L2 learners, are still limited. Since the CEFR scale system has been extensively used in recent foreign language education in Asian countries, as discussed earlier, more research on the CEFR levels with learner corpora of various Asian learners would be beneficial.

Research Question
This study explores the following research question.
-Are L2 proficiency levels as measured by the CEFR rating related to the developmental stages of L2 grammar as found in PT analysis in both spoken and written production by Japanese learners of English?

Methodology
The participants were eighty-eight Japanese native (L1) speakers learning English. They were between the ages of eighteen and thirty and had studied English as a school subject in Japan for at least six years before participating in the current study.
For data collection, each participant was asked to perform both spoken and written narratives using a picture book called Frog, where are you? (Mayer). This picture book, which contains twenty-four wordless pictures, has been shown to elicit various linguistic features in a number of language acquisition studies (e.g., Berman and Slobin; Lee; Minami). Half of the participants (i.e., forty-four participants) were asked to start with speaking, and the other half to start with writing, to minimise ordering effects. Their speech production was audio-recorded and transcribed, and their written production was recorded with pen and paper by the participants. Neither task was timed, but the participants were asked to continue narrating the story until they had described all twenty-four pictures.
The participants' English L2 proficiency levels were measured by two trained CEFR raters with a compiled scale consisting of "can-do statements" as shown in Tables 1 and 2 in the previous section. The raters assessed each learner's speaking and writing and assigned each narrative a CEFR level ranging from A1 to C2.
Regarding L2 development, the participants' developmental stages for English grammar, focusing on the acquisition of English syntactic structures, were analysed based on the LMH in PT (Pienemann, et al.). In the PT analysis for English syntax, the sentences constructed with default mapping, as in (3) and (4), are coded as Stage 2 structures. Then, the syntactic structures with default mapping and additional arguments, as in (5) and (6), are coded as Stage 3 structures. As for non-default mapping, passives, as in (7), and causatives, as in (8) and (9), are coded as Stage 4 structures.
(3) bees chased Tim
(4) the boy and dog found the frog
(5) they named their new frog friend Froggie
(6) the dog put his head into the jar
(7) the dog was chased by many bees
(8) the bees made the dog run
(9) the dog let the bees' nest fall off on the floor
While most previous SLA studies examined L2 development based on accuracy, PT applies the emergence criterion. PT argues that "emergence can be understood as the point in time at which certain skills have, in principle, been attained or at which certain operation can, in principle, be carried out" (Pienemann 138). According to PT, using a grammatical structure at a high level of accuracy, even 80% to 90%, does not guarantee that the learner will be able to continue producing that structure at the same or higher level of accuracy in the future. In this study, the emergence criterion in PT is applied to determine if the participant has acquired a target grammatical structure. In other words, to identify whether a stage has emerged in their English production, this study assesses whether each participant uses a grammatical structure systematically and productively regardless of the accuracy of the structure produced. While structural and lexical variation should be examined to exclude the formulaic uses of morphological structures, one sample can be regarded as evidence of the emergence of a syntactic structure. For instance, a learner was considered to have reached PT stage 4 when one sentence with non-default mapping, that is, passive or causative construction, appeared.
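The stage assignment described above can be illustrated with a minimal sketch. It assumes that each sentence in a narrative has already been hand-coded with a structure label; the label names, the `pt_stage` function, and the fallback to stage 1 when no sentence-level structure is found are illustrative assumptions, not part of the study's procedure. Under the emergence criterion for syntax, a single occurrence of a structure is enough, so the learner's stage is simply the highest stage attested.

```python
# Hypothetical structure labels mapped to the LMH stages described in the text.
STAGE_OF_STRUCTURE = {
    "default_mapping": 2,                # e.g., "bees chased Tim"
    "default_mapping_plus_argument": 3,  # e.g., "the dog put his head into the jar"
    "passive": 4,                        # e.g., "the dog was chased by many bees"
    "causative": 4,                      # e.g., "the bees made the dog run"
}


def pt_stage(coded_sentences):
    """Return the highest PT stage attested at least once in a narrative.

    `coded_sentences` is a list of structure labels, one per coded sentence.
    Accuracy is ignored: one occurrence counts as emergence.
    Falling back to stage 1 (lemma access) is an assumption of this sketch.
    """
    stages = [
        STAGE_OF_STRUCTURE[label]
        for label in coded_sentences
        if label in STAGE_OF_STRUCTURE
    ]
    return max(stages, default=1)


# A narrative with many default-mapping sentences and a single passive
# is placed at stage 4.
example = ["default_mapping"] * 10 + ["passive"]
print(pt_stage(example))
```

The sketch does not check for formulaic use; the text notes that lexical and structural variation matter for morphology, and that check would have to be done by the analyst before coding.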

L2 proficiency levels as found in the CEFR rating
Figure 3 presents the results of the CEFR rating for L2 proficiency levels found in English speaking and writing by eighty-eight Japanese L1 speakers. The CEFR levels for the participants were found to range from A1 to B2 for both speaking and writing.
As for the spoken production, ten participants were rated as A1 level, while sixty of them were rated as A2 level. In other words, seventy Japanese L1 speakers in this study were regarded as Basic Users in the English spoken narratives. On the other hand, fifteen learners were rated as B1 level, while three learners were rated as B2 level. This shows that eighteen Japanese L1 speakers were regarded as Independent Users in speaking.
Regarding the written production, only one learner was rated as A1 level, while fifty-two participants were rated as A2 level. This means that fifty-three Japanese L1 speakers in this study were regarded as Basic Users in the English written narratives. On the other hand, thirty-one participants were rated as B1 level, and four of them were rated as B2 level. This shows that thirty-five Japanese L1 speakers in this study were regarded as Independent Users in writing. That is, more Japanese L1 speakers were rated as Independent Users of English in writing than in speaking. This suggests that Japanese L1 speakers tended to perform better in the written narratives than in the spoken narratives. In particular, only one participant was found to be at CEFR A1 level in the written task, while ten participants were at CEFR A1 level in the spoken task.
According to SLA literature (e.g., Foster and Skehan 321), L2 learners should be able to use their linguistic knowledge more efficiently when more time for planning and monitoring is available in communicative situations. Also, it has been reported that L2 learners' performance becomes more accurate and complex in the written task, which allows them to spend more time planning and monitoring, than in the spoken task (Kormos 208). Thus, it can be argued that the participants in the present study were also able to use their linguistic resources more efficiently in the written narratives than in the spoken narratives.

L2 developmental stages as found in the PT analysis
As shown above, the participants' English proficiency levels as found in the CEFR rating ranged from A1 to B2. Figures 4 and 5, respectively, present the distribution of the developmental stages of L2 syntax found in the PT analysis of the English spoken and written production by eighty-eight Japanese L1 speakers at four different CEFR levels.

PT stages in the spoken production
As shown in Figure 4, the developmental stages for English syntax found in the PT analysis of the participants' spoken production ranged from stage 2 to stage 4. That is, all the participants were found to be able to use default mapping in the sentence formation in the spoken task.
Of the ten CEFR A1 level participants, five were regarded to be at PT stage 2 since they used only default mapping when they constructed sentences. The other five participants produced sentences with default mapping and additional arguments, as in (11), but did not use non-default mapping.
(11) deer threw Tom and Tim from.into the water
Regarding sixty CEFR A2 level participants, thirty-one of them were found to have acquired PT stage 3 structures. The rest of the CEFR A2 participants (i.e., twenty-nine participants), who used non-default mapping in the sentence construction, as in (12) and (13), were regarded to have reached PT stage 4. This suggests that nearly half (i.e., 48.3%) of the CEFR A2 participants, all of whom were rated as Basic Users in the CEFR rating, were considered to have acquired syntactic structures belonging to the highest developmental stage of English L2 in the processability hierarchy.
(12) #8 the boy was chased by the owl and he was . he .he. he reached the the rock
(13) #59 he fell in the tree and his dog was followed by bees
Out of fifteen CEFR B1 level participants in the spoken task, four were regarded to be at PT stage 3, while eleven were found to have reached PT stage 4. As for the three CEFR B2 level participants, all of them were found to have reached PT stage 4. Thus, nearly 80% (i.e., 77.8%) of the participants who were rated as Independent Users in the CEFR rating were found to have acquired advanced English syntactic structures belonging to the highest PT stage (i.e., stage 4). Also, it should be noted that those Independent Users produced causatives, as in (14), in addition to passives, more frequently than the Basic Users who were regarded to be at PT stage 4 did.
(14) #18 all of the sudden a deer pops out of the bush where the boy was standing near the rock which makes the boy fall onto the deer's head blinding the deer

PT stages in the written production
Figure 5 shows that the developmental stages for English syntax found in the PT analysis of eighty-eight participants' written production ranged from stage 2 to stage 4. As shown in the figure, the participant who was rated as CEFR A1 level was regarded to be at PT stage 2 since only default mapping was used in the sentence formation.
Regarding fifty-two CEFR A2 level participants, one participant was regarded to be at PT stage 2, and twenty-three participants were considered to be at PT stage 3 since they produced sentences with default mapping and additional arguments, as in (15), but did not use non-default mapping. On the other hand, twenty-eight participants used non-default mapping, as in (16) and (17), so they were regarded to have reached PT stage 4.
(15) And the deer put him on the head of deer and took him to somewhere
(16) #36 A boy was carried by a deer. A deer ran fast
(17) #64 A boy was attacked by the snake from the hole in the ground
This indicates that more than half (53.8%) of the CEFR A2 participants, who were rated as Basic Users in the CEFR rating, were considered to have acquired the advanced grammatical structures belonging to the highest developmental stage for English syntax in the processability hierarchy. As for thirty-one CEFR B1 level participants, eleven were regarded to be at PT stage 3, while twenty were considered to have reached PT stage 4. Out of four CEFR B2 level participants, one was regarded to be at PT stage 3 and three were found to be at PT stage 4. This suggests that about 66% (i.e., 65.7%) of the participants who were rated as Independent Users in the CEFR rating produced the advanced English syntactic structures belonging to the highest PT stage (i.e., stage 4) in their written performance. As found in the spoken task, causatives, as in (16), as well as passives appeared more often in the written performances by the Independent Users than in those by the Basic Users who were regarded to have reached PT stage 4.
(16) #21 Suddenly an owl came out of the tree, causing the boy to fall off the tree
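The stage 4 shares quoted for the two tasks can be re-derived from the participant counts reported in the text. The sketch below is only a worked check of that arithmetic; the dictionary layout and the `stage4_share` helper are illustrative names introduced here, not part of the original analysis.

```python
# Participants per CEFR level and PT stage, as reported in the text.
SPOKEN = {
    "A1": {2: 5, 3: 5},
    "A2": {3: 31, 4: 29},
    "B1": {3: 4, 4: 11},
    "B2": {4: 3},
}
WRITTEN = {
    "A1": {2: 1},
    "A2": {2: 1, 3: 23, 4: 28},
    "B1": {3: 11, 4: 20},
    "B2": {3: 1, 4: 3},
}


def stage4_share(table, levels):
    """Percentage (one decimal) of participants at `levels` who reached PT stage 4."""
    total = sum(sum(table[level].values()) for level in levels)
    at_stage4 = sum(table[level].get(4, 0) for level in levels)
    return round(100 * at_stage4 / total, 1)


for task, table in (("spoken", SPOKEN), ("written", WRITTEN)):
    print(
        task,
        "A2:", stage4_share(table, ["A2"]),
        "B1+B2:", stage4_share(table, ["B1", "B2"]),
    )
```

With these counts, the A2 shares come out at 48.3% (spoken) and 53.8% (written), and the B1/B2 shares at 77.8% and 65.7%. If the A1 participants are included in the denominator, so that the share refers to all Basic Users, the figures drop to 41.4% and 52.8%.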

Relationship between L2 proficiency and L2 development
The scatterplots in Figures 6 and 7 present the correlation between L2 proficiency and L2 development found in the participants' English spoken and written production, respectively. In these scatterplots, the size of each circle varies according to the number of participants. According to the scatterplot in Figure 6, there seems to be a linear correlation between the CEFR levels and PT stages found in the English spoken production of the participants. On the other hand, as shown in the scatterplot in Figure 7, it is not clear whether there is a correlation between the CEFR levels and PT stages in the participants' written production. In other words, there seems to be more dispersion between the two approaches in writing than in speaking.
In order to answer the research question of whether L2 proficiency levels as measured by the CEFR rating are related to L2 development as found in PT analysis in both spoken and written production by Japanese learners of English, the strength of the association between the CEFR levels and PT stages was statistically assessed using a Spearman's rank-order correlation analysis. As for the spoken production, the results demonstrate that the correlation between the proficiency levels shown in the CEFR rating and the developmental stages for English syntax found in the PT analysis is statistically significant at the 0.01 level (rs[88] = .486, p < 0.01), although the association between the CEFR levels and PT stages was not shown to be strong.
In contrast, the correlation between the CEFR levels and PT stages in the participants' written production was not found to be statistically significant. Although Granfeldt and Agren's study found a strong correlation between the CEFR levels and PT stages in French written production by Swedish L1 speakers, a statistically significant correlation between the two approaches was shown only for the participants' spoken production in this study. While both previous studies (Granfeldt and Agren; Hagenfeld) examined the acquisition of morphosyntactic structures by European learners, this study focused on the development of English syntax by Asian learners. Further research should be conducted with various L2 learner data to explore whether the methodological differences (the differences in target grammatical structures, participants' language backgrounds, tasks, and so on) caused this discrepancy in the results between the present and previous studies.
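The rank-order correlation reported for the spoken data can be re-derived from the participant counts given in the section on PT stages in the spoken production. The following is a minimal sketch using only the standard library: it expands the counts into one (CEFR level, PT stage) pair per participant, assigns average ranks to ties, and computes Pearson's correlation on the ranks, which is the usual definition of Spearman's coefficient with ties. The function names and the numeric coding of the CEFR levels (A1 = 1 to B2 = 4) are assumptions of this sketch, and it does not compute the p-value.

```python
from math import sqrt

# Spoken task: (CEFR level, PT stage) -> number of participants, from the text.
SPOKEN_COUNTS = {
    ("A1", 2): 5, ("A1", 3): 5,
    ("A2", 3): 31, ("A2", 4): 29,
    ("B1", 3): 4, ("B1", 4): 11,
    ("B2", 4): 3,
}
CEFR_ORDER = {"A1": 1, "A2": 2, "B1": 3, "B2": 4}  # ordinal coding (assumed)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho as Pearson's r computed on average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


def expand(counts):
    """Turn a table of counts into one (level, stage) observation per participant."""
    xs, ys = [], []
    for (level, stage), n in counts.items():
        xs.extend([CEFR_ORDER[level]] * n)
        ys.extend([stage] * n)
    return xs, ys


levels, stages = expand(SPOKEN_COUNTS)
rho_spoken = spearman_rho(levels, stages)
print(f"n = {len(levels)}, rho = {rho_spoken:.3f}")
```

Run on the spoken counts, the sketch gives a coefficient of about .486 for the 88 participants, in line with the reported value. The same functions can be applied to the written counts by replacing the table.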
While the results in this study were not consistent with the previous findings in terms of L2 writing, the dispersion was shown to increase in L2 production by the participants at higher developmental stages in both speaking and writing, as demonstrated in Granfeldt and Agren. Since PT applies the emergence criterion, the participants in PT studies are considered to have reached a certain developmental stage as long as a target grammatical structure belonging to the stage appears productively and systematically, even if their L2 production is not accurate. According to the descriptors in Grammatical accuracy for early CEFR levels (i.e., A1, A2), as shown in Table 4, L2 learners who produce some advanced grammatical structures can be rated as 'Basic Users' if they make some basic mistakes. This may explain why the discrepancies between the two approaches occur in particular for the participants at higher developmental stages, and why roughly half of the CEFR A2 participants (48.3% in speaking and 53.8% in writing) were regarded to have achieved the highest developmental stage (i.e., PT stage 4) for English syntax in the PT analysis in the present study. Thus, it can be suggested that the descriptor items of "early CEFR levels in terms of the complexity of operations beginning learners are assumed to manage may require reconsideration" (Hagenfeld 135).
In addition, it should be noted that more participants showed L2 development ahead of their rated L2 proficiency in writing than in speaking. As mentioned previously, it has been argued that L2 learners are able to spend more time searching their linguistic resources in writing than in speaking (e.g., Foster and Skehan 321) and that the lack of time pressure and availability of monitoring should have beneficial effects on L2 written production (Kormos 208). Thus, it can be considered that the participants in this study were able to produce various linguistic features, including advanced syntactic structures, more readily in the written narratives than in the spoken narratives regardless of the proficiency level.

Conclusion
This paper investigated the possible relationship between two approaches for measuring English L2 through a learner corpus of English production by Japanese L1 speakers. In order to address the research question of whether L2 proficiency levels as measured by the CEFR rating are related to the developmental stages of L2 grammar as found in PT analysis in both spoken and written production by Japanese learners of English, eighty-eight participants' English L2 proficiency levels and grammatical development were analysed based on the CEFR and Processability Theory, respectively. The results of the analyses demonstrated that there was some connection between the levels of L2 proficiency and the developmental stages of L2 grammar. However, a statistically significant correlation between the CEFR levels and PT stages was shown only in the participants' spoken production. In addition, the participants at the higher developmental stages as found in the PT analysis were not necessarily regarded as 'independent' or 'proficient' L2 users in the CEFR rating. In other words, the dispersion between the two approaches was found to increase at higher developmental stages, as found in previous research, in both spoken and written tasks. In particular, more participants showed L2 development ahead of their rated L2 proficiency in writing than in speaking. This suggests that the Japanese L1 speakers at lower proficiency levels attempted to use more advanced grammatical structures in the written tasks than in the spoken tasks, even though accurate written production had not yet been achieved.
The current study contributes to a better understanding of the possible relationship between L2 proficiency levels and L2 grammatical development with additional empirical evidence. However, there was a discrepancy between the results of the statistical analyses in this study and those in previous research. The present study used only English spoken and written narratives as elicitation tasks and focused on the acquisition of English syntax by Japanese L1 speakers. Thus, further empirical research should be conducted with more diverse learner data in order to investigate the generalisability of the findings in this study. Although there may be some issues in the comparability between the CEFR levels and PT stages, empirical research on the possible association between L2 proficiency as measured by the CEFR rating and the developmental stages as found in the PT analysis could be a plausible approach to exploring the theoretical validity of the CEFR descriptors. A detailed understanding of how L2 learners at each proficiency level perform in various communicative situations would lead to a more efficient application of the CEFR scale system to foreign language education in the digital age.

Figure 1. Default mapping: Mike washed the car

Figure 2. Non-default mapping: the car was washed by Mike

Figure 3. L2 proficiency levels found in the CEFR rating of English production by 88 Japanese L1 speakers

Figure 4. PT stages of English syntax found in the spoken production by 88 Japanese L1 speakers at four different CEFR levels

Figure 6. Correlation between English L2 proficiency and L2 development found in the spoken production by 88 Japanese L1 speakers

Table 2
Overall Written Production (After Council of Europe 61)

Table 3
Developmental Stages for English Syntax Based on the Lexical Mapping Hypothesis (After Pienemann, et al. 246)