Mapping the Linguistic Context of Citations

Scientific papers are routinely structured in sections for introduction, methods, results and discussion, a standard since the 1970s. Citations serve different purposes depending on the section in which they appear, so classifying them by position sheds light on why an author cites a given work. The words surrounding citations also differ from section to section, which provides a basis for lexical and semantic analysis of citation contexts. Approximately 50,000 scientific papers from seven PLOS journals published between 2009 and 2012 were analyzed for citation use within the identifiable document structure and for the verbs used in citation contexts. Verb frequencies across the four section types show that certain verbs predominate in particular sections. Introduction sections showed a greater variety of verbs, while Methods sections showed a more limited range. The lexical distribution process may be applied to other contexts that support text processing based on XML format.
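The counting step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes JATS-style element names (`sec`, `title`, `p`, and `xref` with `ref-type="bibr"`), a naive sentence splitter, and a small hypothetical verb list with no lemmatization or part-of-speech tagging. The names `VERBS`, `SAMPLE` and `verb_counts_by_section` are illustrative only.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical verb forms to look for; a real study would use a POS tagger.
VERBS = {"show", "shown", "use", "used", "describe", "described",
         "report", "reported", "suggest", "suggested"}

SECTION_TYPES = ("introduction", "methods", "results", "discussion")
MARKER = "__CIT__"  # placeholder substituted for each in-text citation


def section_type(title):
    """Map a section title to one of the four section types, or None."""
    lowered = (title or "").lower()
    for name in SECTION_TYPES:
        if name in lowered:
            return name
    return None


def citation_sentences(paragraph):
    """Return the sentences of a <p> element that contain a citation."""
    parts = [paragraph.text or ""]
    for child in paragraph:
        if child.tag == "xref" and child.get("ref-type") == "bibr":
            parts.append(MARKER)
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    text = "".join(parts)
    # Naive split on sentence-final punctuation; abbreviations will break it.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if MARKER in s]


def verb_counts_by_section(xml_text):
    """Count listed verbs in citation sentences, grouped by section type."""
    root = ET.fromstring(xml_text)
    counts = defaultdict(Counter)
    # Only top-level sections, so subsections are not counted twice.
    for sec in root.findall(".//body/sec"):
        kind = section_type(sec.findtext("title"))
        if kind is None:
            continue
        for p in sec.iter("p"):
            for sentence in citation_sentences(p):
                for token in re.findall(r"[a-z]+", sentence.lower()):
                    if token in VERBS:
                        counts[kind][token] += 1
    return counts


# Invented two-section article used only to exercise the functions.
SAMPLE = """<article><body>
<sec><title>Introduction</title>
<p>Earlier work has shown this effect <xref ref-type="bibr" rid="r1">[1]</xref>. We extend it here.</p>
</sec>
<sec><title>Materials and Methods</title>
<p>Samples were prepared as described previously <xref ref-type="bibr" rid="r2">[2]</xref>.</p>
</sec>
</body></article>"""

if __name__ == "__main__":
    for kind, counter in verb_counts_by_section(SAMPLE).items():
        print(kind, dict(counter))
```

Restricting the count to sentences that contain a citation marker is one simple way to define a citation context; a wider window of neighboring sentences would be an equally reasonable choice.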

