Investigating the Use of Academic Words in MA Students’ Trial Assignment

Jie Liang (Jessica)


This small-scale, corpus-based study explores the use of academic vocabulary in students’ academic writing at tertiary level. A number of studies have noted that insufficient knowledge of academic vocabulary is a challenge for students who learn English for academic or specific purposes (EAP or ESP), especially when reading and writing academic texts (Chen and Ge, 2007; Mozaffari and Moini, 2014; Vongpumivitch et al., 2009). To support students’ academic vocabulary learning, Coxhead (2000) proposed the Academic Word List (AWL), a valuable resource for enriching EAP students’ knowledge of academic words. This study collected 15 trial assignments written by MA ELT students and examined which AWL words were used most frequently and how these high-frequency AWL words were used in the assignments. The results show that participants already know the majority of the academic words included in Coxhead’s (2000) AWL.

 Keywords: academic vocabulary, academic word list, EAP, corpus analysis


Learners’ proficiency in English is directly connected with their knowledge of vocabulary (Lewis, 2002). Limited vocabulary knowledge has been widely acknowledged as one of the predominant constraints on learners of English as a Foreign Language (EFL) when they are exposed to academic discourse, especially academic reading and writing (Chen and Ge, 2007; Mozaffari and Moini, 2014; Vongpumivitch et al., 2009). With regard to reading, Chanasattru and Tangkiengsirisin (2016) found that Thai EFL learners’ familiarity with academic words leads to better reading comprehension. Similarly, Nicole and Cheryl (2015) noted that accurate vocabulary conveys more precise meanings, and that learners who are able to use an academic or specific register of words can produce more effective writing. However, Coxhead (2000) pointed out that the most challenging aspect of vocabulary teaching is deciding which words are worth teaching in class. To address this problem, Nation and Meara (2001) suggested that both EFL teachers and learners pay more attention to words that occur with high frequency in the contexts they encounter. Considering the needs of students who learn English for academic purposes (EAP), they also recommended the Academic Word List (AWL) designed by Coxhead (2000) as a valuable resource for learning academic vocabulary, because this list specifically targets the academic context (Nation and Meara, 2001).

In this assignment, therefore, I conduct a frequency analysis of AWL words in MA students’ academic writing. I collected 15 trial assignments, all academic essays in English written by MA English Language Teaching (ELT) students from the Centre for Applied Linguistics (CAL) at the University of Warwick, and compiled them into a small corpus called MATA (MA Trial Assignment). The corpus contains 26,148 running words. The assignments were written mainly by Chinese students (13 students), plus one student from Korea and one from Indonesia. The trial assignment asked students to discuss the role of English in their home country and people’s attitudes towards English in that context. It is hard to define a specific register for these trial assignments, because students’ essays may touch on different topics such as education, culture, politics and the media. However, the trial assignment is still worth analyzing, since it was the first opportunity for MA students to practice writing academic essays. Although MA students have already been admitted as advanced learners of English and are conscious of the need to use formal words in academic writing, they may not be sure whether the words they chose fit the context or whether they were academic words at all. Exploring which academic words were used most frequently in the assignments, and how these high-frequency words were used, is therefore an interesting and worthwhile topic.

This essay first counts the frequency of AWL words in the MATA corpus using Sketch Engine. It then uses Compleat Lexical Tutor, a website that helps to identify AWL words (Cobb, n.d.), to find out how these high-frequency AWL words were used by students in the corpus. Compleat Lexical Tutor can also identify non-AWL words, which it reports as K-1 and K-2 words (the first and second 1000 high-frequency words) and off-list words (i.e. words that belong to neither the high-frequency lists nor the AWL). With the help of this function, I also pay attention to such non-academic vocabulary, focusing on which off-list words are used most frequently in the trial assignments.

Based on these explorations, the study is expected to generate pedagogical implications that not only help MA ELT students learn academic vocabulary but also help pre-sessional and in-sessional EAP tutors teach academic words to MA students more effectively, so that students become more familiar with the words that occur less commonly in their academic writing.

Literature review

What is academic vocabulary?

Before investigating academic words, it is necessary to define academic vocabulary and describe its characteristics. Baumann and Graves (2010) suggested that academic words are words that recur in academic texts but are not so common in other kinds of texts. Townsend and Kiernan (2015) likewise stated that academic words “are used across content areas, have abstract definitions, and are a challenge to master”. To make the definition of academic vocabulary more distinct, theorists often compare academic vocabulary with two other kinds of English vocabulary: high-frequency words and technical words.

In 1953, Michael West compiled a collection of high-frequency English words called the General Service List of English Words (GSL) (Coxhead and Nation, 2001). The GSL consists of 2000 word families that play a fundamental role in all language use (Coxhead and Nation, 2001). Coxhead and Nation (2001) also noted that GSL words normally cover about 80% of the running words in an academic text.

Technical words, also known as “domain-specific academic vocabulary”, are words that are more likely to occur in a specific register or domain with a specific contextual meaning (Baumann and Graves, 2010). According to Coxhead and Nation (2001), this group of words often makes up about 5% of the running words in an academic text.

In contrast to these two kinds of words, Coxhead (2000) suggested that it is highly important for EFL learners with academic purposes to learn AWL words, because academic vocabulary plays a common but vital role in the academic register and a list of such words allows language teachers to plan their vocabulary syllabus more effectively.

How have academic word lists developed?

The search for a word list that can meet learners’ needs when learning English for academic or specific purposes (EAP or ESP) has developed continuously over the years. In the early phase, researchers tended to rely on university textbooks, listing the high-frequency words that occurred across various textbooks as a reference word list for EAP learners. For example, in 1972 Praninskas (cited in Silva, 2016) produced the American University Word List by examining the high-frequency words in ten textbooks from different disciplines. However, these early word lists had weaknesses, such as the small scale of the text collections and the lack of texts covering a range of subjects and disciplines (Coxhead, 2000). In response, Coxhead (2000) built a large corpus covering diverse types of texts across 28 subjects and introduced a new Academic Word List (AWL), which contains 570 word families and excludes words in the GSL. Note that a word family is different from a word in the general sense: a word family consists of a lemma plus its prefixed and suffixed forms (Hirsh and Nation, 1992). For example, the lemma of the group “communicate, communicates, communicating, communicated, communication(s), communicative, communicator(s), etc.” is “communicate”, and each occurrence of any of these items in Coxhead’s (2000) corpus counts as one occurrence of the “communicate” word family. Moreover, the words included in Coxhead’s (2000) AWL have three features in common: they occurred more than 100 times in the corpus, they occurred at least 10 times in each disciplinary sub-corpus, and they are not in the GSL.

Because Coxhead (2000) designed the list by examining a large number of written academic texts in different registers and subjects, the AWL has contributed significantly to EAP courses, setting basic vocabulary learning goals and providing a reference word list for students writing for academic purposes (Mozaffari and Moini, 2014).
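
The idea of counting by word family rather than by individual word form can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Coxhead’s actual procedure: the `FAMILIES` table holds just the “communicate” family quoted above, and the sample sentence and the function name `count_families` are invented for the example.

```python
import re
from collections import Counter

# Hypothetical mini word-family table: headword -> member forms.
# Only the "communicate" family mentioned in the text is included.
FAMILIES = {
    "communicate": {
        "communicate", "communicates", "communicating", "communicated",
        "communication", "communications", "communicative",
        "communicator", "communicators",
    },
}

# Reverse lookup: each member form points to its family headword.
FORM_TO_FAMILY = {form: head for head, forms in FAMILIES.items() for form in forms}


def count_families(text):
    """Count occurrences per word family; every member form adds one."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(FORM_TO_FAMILY[t] for t in tokens if t in FORM_TO_FAMILY)


# Invented sample sentence, not taken from any corpus.
sample = "Teachers communicate daily; good communication and communicative tasks matter."
print(count_families(sample))
```

Here “communicate”, “communication” and “communicative” are three different word forms, but all three are credited to the single “communicate” family.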

Previous studies on the analysis of AWL words

The purpose of the current study is to explore the frequency of AWL words in students’ academic writing. Many previous studies have already explored the use of the AWL in texts from specific fields such as applied linguistics, medicine, agriculture and engineering. For example, in their study of AWL words in medical research articles, Chen and Ge (2007) showed that academic vocabulary has high text coverage (10.07%). Similarly, Vongpumivitch et al. (2009) conducted a frequency analysis of AWL and non-AWL words in applied linguistics research articles. They reported that AWL words accounted for 11.7% of their collection of applied linguistics research papers, which is even higher than the proportion of AWL words in the medical texts examined by Chen and Ge (2007). These findings confirm the important role that AWL words play in written texts across different professional fields.

Building on such frequency analyses for a specific domain, researchers have also attempted to create academic word lists for that domain. For instance, Vongpumivitch et al. (2009) provided a list of 475 AWL words that were frequently used in applied linguistics. Similarly, Chanasattru and Tangkiengsirisin (2016) developed a list of 394 word families that appeared frequently in the Social Science domain. These newly developed word lists are valuable for both learners and teachers of English for specific purposes. A word list for a particular discipline can positively influence learners’ performance, especially in reading and writing (Vongpumivitch et al., 2009). EAP and ESP teachers can also draw on such a word list to select more effective materials for learners with a specific purpose in learning English (Chanasattru and Tangkiengsirisin, 2016).

Moreover, previous analyses of AWL words in different disciplines share a common outcome: they justify the value of learning academic words for students from different majors and provide pedagogical implications for EAP or ESP courses. For instance, Chen and Ge (2007) stated that teaching and learning AWL words helps students write and construct more effective arguments in medical research articles. However, Hyland and Tse (2007) argued that paying more attention to teaching academic vocabulary may leave students with too little exposure to the specific vocabulary they need. Although Vongpumivitch et al. (2009) also created a list of 128 non-AWL words that were frequently used in their corpus, it relates only to applied linguistics, the field on which their research was based. Further studies could therefore usefully explore high-frequency non-AWL words in the writing of different disciplines, which might better cater for students’ needs by extending their knowledge of the technical terms of a specific register or discipline.


As counted by Compleat Lexical Tutor and shown in Figure 1, K-1 words (the first 1000 words in the GSL) make up by far the highest proportion of tokens (74.46%) in students’ trial assignments, because K-1 words are the most common words in English. Figure 1 also indicates that AWL words account for 9.03% of the total words in the trial assignments, which is quite close to the 10% of running words that academic words typically cover in academic texts (Coxhead and Nation, 2001). This suggests that MA ELT students at the University of Warwick are aware of the need to use academic words to make their writing more formal, although the proportion of AWL words in their essays is lower than in previous studies, such as the 10.07% coverage in Chen and Ge’s (2007) corpus of medical research articles and the 11.7% coverage in Vongpumivitch et al.’s (2009) corpus of applied linguistics texts. Of the 570 word families in Coxhead’s (2000) AWL, 344 occur in the MATA corpus. This indicates that students are capable of using most of the academic words; on the other hand, their knowledge of academic vocabulary still needs expanding, especially with regard to the AWL words that are missing from or seldom used in their assignments.


Freq. Level                        Families (%)   Types (%)      Tokens (%)      Cumul. token %
K-1 Words                          644 (53.22)    1259 (41.33)   19470 (74.46)   74.46
K-2 Words                          222 (18.35)    334 (10.97)    1223 (4.68)     79.14
AWL Words [570 fams; TOT 2,570]    344 (28.43)    598 (19.63)    2360 (9.03)     88.17
Off-List                           ??             860 (28.23)    3095 (11.84)    100
Total (unrounded)                  1210+?         3046 (100)     26148 (100)     ≈100.00

(Figure 1: Frequency of AWL words in the MATA corpus, counted by Compleat Lexical Tutor)
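
The token percentages in Figure 1 are simply each band’s share of the 26,148 running words. The following minimal sketch recomputes them from the token counts in the figure; the dictionary name `TOKENS` and the function name `coverage` are chosen for illustration and are not part of Compleat Lexical Tutor.

```python
# Token counts per frequency band, copied from Figure 1.
TOKENS = {"K-1": 19470, "K-2": 1223, "AWL": 2360, "Off-List": 3095}


def coverage(tokens_by_band):
    """Return each band's share of all running words, as a percentage."""
    total = sum(tokens_by_band.values())
    return {band: 100 * n / total for band, n in tokens_by_band.items()}


for band, pct in coverage(TOKENS).items():
    print(f"{band}: {pct:.2f}%")
```

For the AWL band this is 2360 / 26148, which rounds to the 9.03% reported in the figure.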

Since one of the major objectives of the present study is to find the AWL words used most frequently in MA students’ trial assignments, Figure 2 lists the AWL word families with the highest frequency (at least 30 occurrences) in the MATA corpus.


Word Families Frequency
Status 114
Role (roles) 94
Attitude (attitudes) 90
Policy (policies) 70
Communicate (communicates, communicating, communication, communications, communicative) 67
Culture (cultural) 55
Media 43
Identify (identifying, identification, identity, identities) 38
Significant (significantly, significance) 38
Globe (global, globalization, globalisation) 33
Promote (promoted, promoting, promotion) 33
Emphasis (emphasize, emphasise, emphasizes, emphasized) 31
Economic (economically, economy) 30

(Figure 2: Frequently used AWL words in MATA corpus listed by Compleat Lexical Tutor)

In the collection of trial assignments, the most commonly used AWL word is “status”, which occurs 114 times, followed by “role(s)” with 94 occurrences. Other AWL word families such as “attitude”, “policy”, “communicate” and “culture” also appear very frequently. To investigate how students used these words in the assignments, I used Sketch Engine to view the concordance lines of each word listed in Figure 2. Interestingly, the two most frequent words, “status” and “role”, often seem to be used as a pair of synonyms, as the concordance results show (Figure 3 and Figure 4). A search for the lexical chunk “status and role(s)” returned 13 results in the corpus (see Figure 5), which suggests two possible interpretations: (a) the title of the trial assignment had a powerful influence, or (b) students regarded this pair of words as synonyms. According to the trial assignment description, students were required to “describe some aspects of the status and roles of English and other languages in your home country”. Although I excluded the assignment title from students’ assignments when compiling the MATA corpus, the core focus of their writing was still the status and roles of English in their countries. It is therefore possible that students tended to copy this phrase when writing, which led to the high frequency of the chunk in the corpus. The second interpretation is that MA students considered these two words to have the same meaning when writing their trial assignments. The Oxford Advanced Learner’s Dictionary (n.d.) defines “status” as “the level of importance that is given to something” and “role” as “the degree to which somebody/something is involved in a situation or an activity and the effect that they have on it”.
The meanings of these two words overlap to a certain degree, but the “role” of something can be characterized as important or weak, whereas “status” only concerns the “importance” of something. The use of this lexical chunk therefore suggests that students may not be entirely clear about the precise meaning of each item.


(Figure 3: Concordance of “status” in MATA corpus)


(Figure 4: Concordance of “role(s)” in MATA corpus)


(Figure 5: Concordance of lexical chunk “status and role(s)” in MATA corpus)
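
A chunk search of this kind can be approximated outside Sketch Engine with a regular expression. The sketch below is an illustration only: the sample text is invented rather than taken from the MATA corpus, and the function name `concordance` and the context width are assumptions made for the example.

```python
import re

# Matches "status and role" or "status and roles", ignoring case.
CHUNK = re.compile(r"\bstatus and roles?\b", re.IGNORECASE)


def concordance(text, width=30):
    """Return (left context, match, right context) for every hit."""
    lines = []
    for m in CHUNK.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(), right))
    return lines


# Invented sample text, not taken from the MATA corpus.
sample = (
    "This essay describes the status and roles of English in my country. "
    "The status and role of local dialects is also discussed."
)
for left, hit, right in concordance(sample):
    print(f"{left:>30} [{hit}] {right}")
```

Each hit is printed with a little context on either side, in the style of a keyword-in-context concordance line.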

I was also curious about why the words listed in Figure 2 have such a high frequency. As mentioned previously, the register of the trial assignment is hard to define precisely, because students are likely to bring in and elaborate on various topics when explaining the role of English and people’s attitudes towards English in their culture. In my view, this topic belongs to the Social Science domain. To verify this, I consulted the Social Science Word List (SSWL) designed by Chanasattru and Tangkiengsirisin (2016). All the words with at least 30 occurrences in my corpus can be found in the SSWL, except for “emphasis”. This finding suggests that students’ trial assignments are closely connected with the Social Science domain, and that students can refer to Chanasattru and Tangkiengsirisin’s (2016) SSWL when doing academic writing such as the trial assignment.

Apart from the AWL words, it is also worth examining which off-list words appear in my corpus, because some non-AWL technical words in the off-list category may reflect important concepts in the domain that are useful for learners. Figure 6 shows the 10 most frequently used word families in the off-list category.


Rank Word Families Frequency
1 China (Chinese) 740
2 Reform (reformation, reformed, reforming, reforms) 60
3 Dialect (dialects) 45
4 Province (provinces) 38
5 Gaokao 31
5 Korea (Korean, Koreans) 31
7 Compulsory 30
7 Score (scores, scoring) 30
9 Curriculum (curriculums) 24
10 Proficiency (proficient, proficiently) 21

(Figure 6: Top 10 frequently used off-list word families in MATA corpus, counted by Compleat Lexical Tutor)


It is not surprising that “China” is the most commonly used non-AWL word (740 occurrences) in the MATA corpus, because most of the assignments were written by Chinese students and mainly reflect the situation of English in China. Similarly, the appearance of “Province”, “Gaokao” and “Korea” results from the contexts that students wrote about. The rest of the words in the list, on the other hand, have pedagogical value because they can be regarded as key technical items in the corpus. According to Vongpumivitch et al. (2009), words that are commonly used but excluded from Coxhead’s (2000) AWL provide learners “a window to the specialized content areas in the field as well as an important concept in academic research”.


Based on the findings, it can be concluded that MA ELT students are aware of the need to use academic words in their trial assignments and already know the majority of the academic words included in Coxhead’s (2000) AWL. However, the absence of some AWL words from the MATA corpus probably also indicates that students are not equally familiar with all AWL words, and that further classroom practice or drills are needed on the words they are unfamiliar with. Meanwhile, as the example of “status and role(s)” demonstrates, EAP teachers should select materials carefully when teaching academic vocabulary, presenting vocabulary in context and demonstrating the precise use of each individual word. Finally, it is also important for teachers and learners to pay attention to non-AWL words, since they include key technical terms related to the specialized register and discipline.

However, this study has some limitations. The frequency analysis is based on a very small corpus of only 15 trial assignments, so the findings may not be representative enough to establish a problem shared by all MA ELT students. In addition, this study focuses only on MA groups from CAL, leaving out other groups such as MSc students in the department and postgraduate students in other departments such as Engineering, Humanities, Business and Finance.

Building on the findings of the present study, further studies could use a corpus-based approach to explore the words in Coxhead’s (2000) AWL that MA ELT students are unfamiliar with, and develop effective pedagogical strategies for teaching these less common AWL words in EAP classes. It would also be possible to investigate whether MA ELT students are able to distinguish academic from non-academic words through qualitative research, such as interviews or a case study of a specific group of students.



Baumann, J. and Graves, M. (2010) What is academic vocabulary? Journal of Adolescent & Adult Literacy, Vol. 54, No. 1, pp. 4-12.

Chanasattru, S. and Tangkiengsirisin, S. (2016) Developing of a high-frequency word list in Social Sciences. Journal of English Studies, Vol. 11, pp. 41-87.

Chen, Q. and Ge, G. (2007) A corpus-based lexical study on frequency and distribution of Coxhead’s AWL word families in medical research articles (RAs). English for Specific Purposes, Vol. 26, pp. 502-514.

Cobb, T. (n.d.) Why and how to use frequency lists to learn words. Retrieved from: [30 March 2017]

Coxhead, A. (2000) A new academic word list. TESOL Quarterly, Vol. 34, No. 2, pp. 213-238.

Coxhead, A. and Nation, P. (2001) The specialized vocabulary of English for academic purposes. In J. Flowerdew and M. Peacock (Eds.), Research perspectives on English for academic purposes (pp. 252-267). Cambridge: Cambridge University Press.

Hirsh, D. and Nation, P. (1992) What vocabulary size is needed to read unsimplified texts for pleasure? Reading in a Foreign Language, Vol. 8, No. 2, pp. 689-696.

Hyland, K. and Tse, P. (2007) Is there an “academic vocabulary”? TESOL Quarterly, Vol. 41, No. 2, pp. 235-253.

Lewis, M. (2002) The Lexical Approach: The State of ELT and a Way Forward. Chapter 7, pp. 115-132. London: Thomson Corporation.

Mozaffari, A. and Moini, R. (2014) Academic words in education research articles: A corpus study. Procedia-Social and Behavioral Sciences, Vol. 98, pp. 1290-1296.

Nation, P. and Meara, P. (2001) Vocabulary. In N. Schmitt (Ed.), An Introduction to Applied Linguistics. Chapter 3, pp. 34-52. London: Hodder & Stoughton Ltd.

Nicole, B. and Cheryl, B. (2015) Fostering academic vocabulary use in writing. The CATESOL Journal, Vol. 27, No. 1, pp. 131-148.

Oxford Advanced Learner’s Dictionary (n.d.) Retrieved from: [27 March 2017]

Silva, L. (2016) Academic vocabulary: a corpus linguistic study on how Brazilian students write academic English. Unpublished Ph.D. Dissertation. Centre for Applied Linguistics, University of Warwick.

Townsend, D. and Kiernan, D. (2015) Selecting academic vocabulary words worth teaching. The Reading Teacher, Vol. 69, No. 1, pp. 113-118.

Vongpumivitch, V., Huang, J., and Chang, Y. (2009) Frequency analysis of the words in the Academic Word List (AWL) and non-AWL content words in applied linguistics research papers. English for Specific Purposes, Vol. 28, pp. 33-41.


Jie Liang is currently an MA student in English Language Teaching at the University of Warwick. Before entering Warwick, she obtained her BA degree from a Sino-British university in China in 2016. She is interested in written discourse analysis and teaching English for academic purposes.


