Examining Anti-Plagiarism Software:
Choosing the Right Tool

The CMC Anti-Plagiarism Software Survey


Cynthia Humes, Associate Dean of Faculty for Academic Computing
Jason Stiffler, Interim Director of the Writing Center
Micheal Malsed, Assistant Director of Student Computing

 


Abstract: Claremont McKenna College has conducted a survey of faculty, students, and relevant support staff to determine satisfaction levels with several anti-plagiarism tools. Students were asked to compare learning products from Glatt, Turnitin.com, and Prentice Hall, as well as high-profile plagiarism education websites at Indiana University, Purdue, and the University of Michigan. Faculty and support staff were asked to compare plagiarism detection services from Turnitin.com, Canexus (Eve2), and the University of Virginia (WCopyFind). Students reported a statistically significant preference for the learning tools offered by Prentice Hall, Purdue, Michigan, and Turnitin.com over those offered by Glatt and Indiana. With less statistical significance, students preferred the tools available from Prentice Hall overall. Faculty reported a significant preference for the detection services of Turnitin.com over those offered by the University of Virginia software and by the Eve2 program.

Parallel to the collection of survey data regarding the software, instructors in the Literature Department have been testing a combination of Turnitin.com's detection services and Prentice Hall's educational software. Results of this program are still preliminary, but they indicate a pattern of support for the survey findings. After one semester of use, the program does appear to be helping faculty to detect student plagiarism, and students seem comfortable both with the detection program and with efforts to better teach them the rules of academic honesty.



Introduction

For the past year, Claremont McKenna College has been involved in a systematic program to improve its response to the increasing national problem of online plagiarism. Essay mills have been proliferating for several years now, offering students downloadable papers on a very wide range of topics. It can be extremely difficult for college faculty to keep pace with the electronic cheating options available to their students, and the faculty at CMC are no exception. In the wake of several online plagiarism incidents at the College, the CMC administration assigned the Educational Technology Services department to investigate anti-plagiarism software, and to work with faculty through an existing IT Fluency grant to run a pilot study of preferred tools.

To begin this process of exploration, a committee was convened in late 2002, consisting of several students, the Associate Dean for Academic Computing, the Chair of the Literature Department, the Assistant Director of Student Computing, and an Educational Technology Specialist. This committee examined a wide range of software packages in an informal process. The committee concluded that an effective anti-plagiarism program should include both an educational/preventative element (to help students better understand the nature and consequences of academic dishonesty) and a detection element (to assist faculty in identifying acts of plagiarism when they occur). The committee recommended that a pilot implementation of educational software from Prentice Hall and detection software from Turnitin.com begin immediately. It further recommended that a more formal process of software review be undertaken, to more accurately assess student and faculty opinions of various packages relevant to the problem of academic dishonesty. That software review process has been undertaken during 2003 by a team which grew out of the original committee, composed of the Associate Dean, the Chair of Literature, the Director of the Writing Center, the Assistant Director of Student Computing, a faculty representative from the Psychology Department specializing in learning and cognition, an Educational Technology Specialist, and a Staff Tutor from the Writing Center.

This report presents the results of the committee's efforts to provide a quantitative assessment of faculty and student satisfaction with a broad range of anti-plagiarism software. It also discusses the first semester of CMC's adoption of Turnitin.com and the Prentice Hall plagiarism teaching module. The report is divided into individual sections concerning plagiarism education tools, plagiarism detection tools, and the current pilot program at CMC.


Technology-Based Approaches to Plagiarism Instruction

The committee surveyed available options for training students about plagiarism and academic honesty, and compiled a list of third-party tools and academic websites that appeared to be providing leadership in the field. This list included the following:

GPTeach, by Glatt Plagiarism Services
MITT: The Multimedia Integrity Teaching Tool
The Prentice Hall Companion Website "Understanding Plagiarism"
The "Avoiding Plagiarism" page at Purdue's Online Writing Lab
The "Plagiarism" page at Indiana University
The University of Michigan's anti-plagiarism site
Plagiarism.org - the education arm of Turnitin.com

A brief discussion of each of these tools is available in Appendix VI.

Six of the tools on this list (all but MITT) were placed before a team of roughly 60 student test subjects, who were asked to use the tools to teach themselves about plagiarism and then to complete a survey indicating their confidence in the instruction they were receiving. Not all students completed the survey for all software packages; in general, there were roughly 20 responses per program. The committee had originally intended to include MITT in the student survey - indeed, MITT had been the focus of more testing than any other package (with unsatisfactory results) during the informal testing process employed prior to setting up the pilot study in 2002. During that early testing, the committee was impressed by the scope of MITT's training, but found the interface to be problematic. However, the committee was unable to obtain an evaluation copy of the most recent version of the software in time for inclusion in this study, and we felt that it would be unfair to the program's developers at Ball State University - and potentially misleading to our report - to judge their teaching tool based upon an outdated version of the software, which we expect has been updated significantly since we reviewed it a year ago.

The survey students were selected from among student employees of various divisions of the Educational Technology Services department. They range across all four class years and represent a range of skill levels in both technology and writing; the survey sample included a significant number of new student hires who have not yet been trained in technology and who, in most cases, are currently enrolled in their first college writing course. There is a slight male-leaning gender bias in the sample, as well as a bias in favor of underclassmen. This second bias likely works in favor of the survey, since the target audience of these instructional tools is mostly composed of freshmen and transfer sophomores.


Table 1: Overall Survey Results


The students who took part in the survey process were asked to indicate their level of satisfaction on a 5-point Likert scale (1=strong dissatisfaction, 5=strong satisfaction) in the following areas:

1) Did the interface seem intuitive?
2) How well did the software teach you about ethical issues in academic honesty?
3) Did the software provide you with useful information about plagiarism?
4) Did the software help you to understand the harm done to the college community by plagiarism?
5) Did the software do a good job of teaching you about other areas of academic honesty besides plagiarism?
6) Would this program be useful to you, as a student?
7) Was the time commitment posed by this program reasonable?

(A listing of the exact questions posed on the survey is available in Appendix V.) Students were also asked an eighth question concerning the usability of the programs' documentation. However, most students found all of the programs easy enough to use that they made little reference to the documentation; as a result, only a few students were able to provide feedback on its usability. For this reason, the documentation question was removed from the overall study. A table itemizing the responses we received to that eighth question is available in the appendices.

Trimmed means were taken for each of the seven questions on the survey; the individual results are available in Appendix I. To provide a measure of overall student software preference, an evenly weighted average was taken of the mean scores each software package received on the seven questions. This overall mean score is presented above, in Table 1. Overall, the "Understanding Plagiarism" package from Prentice Hall scored best, earning an overall mean rating of 3.79. By contrast, the GPTeach package from Glatt earned the lowest marks, with a mean score of 3.06. This indicates that all six programs are seen in reasonably similar terms by the students, with the overall scores for all six falling within one point of each other. While the students did not report any statistically significant preference among the mid-range packages (Plagiarism.org and the websites at the University of Michigan and at Purdue), they did indicate a statistically significant preference for the Prentice Hall learning tool as compared with the majority of its competitors. Similarly, the students indicated a statistically significant preference for the mid-range tools over the website at Indiana University, and an even stronger preference for all other tools ahead of GPTeach. T-test figures for the statistical significance of all survey results can be found in Appendix II.
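As an illustration of the scoring arithmetic described above, the following sketch computes per-question trimmed means and an evenly weighted overall score for a single package. The response values, the number of respondents, and the 10% trimming proportion are all assumptions made for illustration; they are not the actual survey data, which appear in Appendix I.

    # Sketch of the Table 1 scoring arithmetic, using hypothetical Likert responses.
    # The trimming proportion (10% from each tail) is an assumption; the report does
    # not specify how its trimmed means were computed.
    import numpy as np
    from scipy import stats

    # Hypothetical 1-5 ratings for one package, keyed by survey question.
    responses = {
        "Q1_interface": [4, 5, 3, 4, 4, 2, 5, 4, 3, 4],
        "Q2_ethics":    [3, 4, 4, 3, 5, 3, 4, 4, 2, 4],
        # ... questions 3 through 7 would follow the same pattern ...
    }

    # Trimmed mean for each question (cuts 10% of responses from each tail).
    question_means = {q: stats.trim_mean(r, 0.10) for q, r in responses.items()}

    # Evenly weighted average of the per-question means gives the overall rating.
    overall_score = float(np.mean(list(question_means.values())))
    print(f"Overall mean rating: {overall_score:.2f}")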

Table 2: Overall student preference, organized by type


Discussions with students involved in the testing indicated (anecdotally) that they perceived a qualitative difference between those learning tools which simply presented them with information (the university websites and Turnitin.com) and those which actively quizzed them to reinforce knowledge (GPTeach and Prentice Hall). While this difference would likely be invisible in the classroom (as faculty develop and implement their own assessment programs with which to measure student comprehension), it does point to a possible selection effect in the testing process. Thus, Table 2 presents the same data, color-coded to show overall student preferences for programs with a testing component (purple base) and programs without one (blue base). Following the pattern in Table 1, high-scoring tools in each category are capped with a green gradient, and low-scoring tools in each category are capped with a red gradient. While Purdue's website scores slightly higher than those at Michigan and at Turnitin, there was no statistically measurable difference in student perception of the utility of these tools.

Conclusions

Based upon the results of this survey, two primary conclusions are apparent. First, given that students judged no teaching tool as more than marginally acceptable as a means of learning about plagiarism, it seems clear that there is room for significant development in online and technology-based academic honesty teaching tools. While it is possible that the latest version of MITT (which was unavailable for inclusion in the survey) may help to fill this need, the general opinion of the review committee is that the plagiarism training field as a whole would benefit from new offerings, or from improved versions of some of the existing software.

Second, the results do indicate that, while no product in the field is perfect, there are measurable differences in student satisfaction with the six programs tested. These results indicate that the best available program at present is the one offered by Prentice Hall, which tests as either measurably preferable, or tending toward preferable, in comparison with all competitors. Similarly, the survey argues against the adoption of Glatt's GPTeach program. Particularly given that Prentice Hall offers its learning module for free, while Glatt charges a significant licensing fee for GPTeach, the Glatt package seems a poor choice, even given the limited success of the field. Finally, if a faculty member is looking for a website to provide reference materials for students, the only statistically measurable advice this survey offers is that the materials at Indiana University may not be as well received as those at Michigan, Purdue, or Turnitin.com.

Electronic Plagiarism Detection

The committee reviewed major plagiarism detection services, and compiled a listing of major automated packages and methodologies for performing wide-net online searches for plagiarized material. The committee identified the following services and programs:

The Essay Verification Engine, v2 (Eve2) by Canexus
Turnitin.com, by iParadigms
The University of Virginia's WCopyFind, in conjunction with Google
YAP, from the University of Sydney
MOSS, from UC Berkeley
SCAM, from Stanford University
Glatt Plagiarism Screening Program
IntegriGuard
JCIS Plagiarism Detection Service

However, many of these programs and services were immediately eliminated as unsuitable to the full college environment. YAP and MOSS each focus exclusively on plagiarism of software code, and were thus eliminated from consideration. SCAM, from Stanford, appears to have been a student project, and is no longer available or supported. The website for the freely testable version of Glatt's software indicates that the program is only suitable for advisory purposes, and will not provide truly accurate reports. IntegriGuard has gone out of business. The JCIS Plagiarism Detection Service is only available to institutions in Great Britain, and is actually just a re-branding of Turnitin.com.

Elimination of these services left the committee with three methods to choose from: Eve2, Turnitin.com, and web searches (we used Google) enhanced by the report-generating capabilities of WCopyFind. Details of each of these detection methodologies may be found in Appendix VII.

The faculty and support staff testers were asked to complete a 17-question survey rating their satisfaction with the software on a 5-point Likert scale (1=strong dissatisfaction, 5=strong satisfaction). The specific questions from this survey may be found in Appendix V. Additionally, Appendix III provides graphical breakdowns of responses to each of the 17 questions, for reference. Overall, the team expressed a strong, consistent, and statistically significant preference for Turnitin.com, followed by WCopyFind, and then Eve2. This pattern emerges not only in the overall scores for the programs, but in 16 of the 17 individual questions. The only question for which this pattern did not hold concerned the level of technical ability (seen as a negative) which the software required of the user. In this one area, Eve2 slightly out-scored WCopyFind on average, though Turnitin.com still led the field. Statistical measures of confidence in the difference between means for these tools exceeded 99.9% in all comparisons in the overall study. Statistical notes on the results from individual questions may be found in Appendix IV.
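To make the confidence figures cited above concrete, the sketch below runs a two-sample t-test of the kind used to compare mean ratings between tools. The score vectors are placeholders, not the actual survey responses, and the use of Welch's unequal-variance test is an assumption, since the report does not state which t-test variant was employed.

    # Sketch of a between-tool significance test, using hypothetical ratings.
    from scipy import stats

    turnitin_scores  = [5, 4, 5, 5, 4, 5, 4, 5]   # placeholder Likert ratings
    wcopyfind_scores = [3, 4, 3, 3, 4, 2, 3, 3]   # placeholder Likert ratings

    # Welch's t-test (unequal variances assumed) on the difference between mean ratings.
    t_stat, p_value = stats.ttest_ind(turnitin_scores, wcopyfind_scores, equal_var=False)
    print(f"t = {t_stat:.2f}, confidence in a difference of means = {(1 - p_value) * 100:.1f}%")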

Table 3: Overall Faculty/Staff Preference in Detection Software

In addition to the 17 quantitative survey questions, each participant was also asked to provide anecdotal responses to two prompts: "How was the 'look and feel' of the software?" and "Please provide any comments." The comments provided in response to these two prompts were surprisingly consistent, and offered useful insight into the team's reasons for preferring Turnitin.com. The team indicated that Eve2 suffered in comparison to the other products in that it generated a less usable plagiarism report, and sometimes appeared to generate false positives that were then difficult to track down precisely because the report was hard to use. By comparison, the team indicated that Turnitin.com provided a very usable report, and one containing results which appeared to be highly reliable. In the mid-range, team members reported satisfaction with the reports generated by WCopyFind, but rated the University of Virginia method lower than Turnitin.com overall, as it does not include an automated searching tool. Generally, the team members were comfortable using Google to seek out plagiarism on the internet, but found it to be a needlessly time-consuming process compared to the ease of Turnitin's automated search.

A final interesting note concerns the readiness of faculty to adopt these software packages in their courses, or to recommend them to others. The data on these two points are presented in Tables 4 and 5, below. Overall, the team members would not be comfortable recommending any tool except Turnitin.com to their peers, nor would they be comfortable adopting WCopyFind themselves. By comparison, the team indicated that they would be extremely comfortable both adopting and recommending Turnitin.com, offering unanimous strong support for that software. This combination seems to suggest that Turnitin.com is the only tool among those tested that would be likely to succeed in earning adoption at the College without a significant incentive program.


Table 4: Would Recommend To Others

Table 5: Would Adopt in Your Own Class


Conclusions

The results of the survey indicate both that faculty and support staff at CMC are highly satisfied with Turnitin.com as a method of detecting online plagiarism, and that they strongly prefer it to the other two major means of detecting plagiarism online. In the anecdotal section, several of the respondents did indicate that they felt the service would be of more use to faculty if it were offered in conjunction with short workshops or other assistance to help users "get started" with the software. In general, the opinion of the team members was that the service is quite simple to use once the basic interface has been learned.

Similarly, the team indicated quite strongly that Eve2 is not a sufficient tool for the CMC community. The poor report structure, coupled with the need to make close use of that report to check the consistency of the search results the software generates, serves as a strong barrier to success, particularly for users less familiar with the intricacies of plagiarism searches.

CMC's Pilot Program

Since December 2002, CMC has been conducting a pilot implementation of Prentice Hall's teaching software and Turnitin.com's detection software, to test the feasibility of implementing an academic honesty module as a component of CMC's "Fluency in Information Technology" (FITness) program, which attempts to incorporate technology skills across the curriculum at the College. The FITness program has listed instruction in Digital Ethics as a key component of the technology skills curriculum, and the effort to fight online plagiarism seems to work well with this goal.

During Spring Semester 2003, the Prentice Hall software was employed in three sections of Lit 10 - Composition and Literary Analysis. All three instructors using this tool indicated satisfaction with it as a source of information on plagiarism. The instructors were particularly pleased that the Prentice Hall website sends email copies of the comprehension tests to the course instructor, allowing him or her to track student understanding of academic honesty issues. For the purposes of this study, supplemental quizzes were constructed using the College's courseware package (WebCT), thus avoiding the problem posed by the several sections of the Prentice Hall module for which no self-test is included. Each of the three instructors also made use of Turnitin.com for spot checks to detect plagiarism. One plagiarist was caught among the three courses.

Students in one course section were subsequently polled to determine their satisfaction with the instruction offered by the software. Students were asked their opinion of the utility of the Prentice Hall system as a learning tool, measured in three areas (quality of the module, effectiveness of the self-tests, and helpfulness of the reference materials) on a 5-point Likert scale (1=strong dissatisfaction; 5=strong satisfaction). Overall, the students were ambivalent about the quality of the learning software as a whole, felt only marginally positive about the self-tests, and felt only marginally negative about the reference materials (see Table 6).


Table 6: Student Perception of Prentice Hall in pilot testing
It is interesting to note that when these results are tested for a significant difference from an expected outcome of perfect neutrality (a score of 3), we can only say with a marginal 80% confidence that there is any statistically significant difference at all. Thus, the students cannot be said to hold any strong opinion, either way, regarding the software.

Students were also asked to indicate their comfort level with the amount of time required to complete the learning module. Students indicated their sentiment on a 5-point Likert scale, where 1=too little time, 3=the right amount of time, and 5=too much time required to complete the module. The overall mean score of 3.53 indicates that the module may be slightly longer than some students would prefer, but is basically of a reasonable length. These results are presented in the histogram below.


Table 7: Student Perception of assignment length for Prentice Hall in pilot testing

Running a t-test to compare the mean student response of 3.53 against an expected score of "just right" (a score of 3), we can say with 98% confidence that the students did find the assignment to be slightly longer than they might have preferred.
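For reference, a one-sample test of this kind can be reproduced as in the sketch below. The response vector is a placeholder constructed to have a mean near the reported 3.53; it is not the actual class data, which are summarized in Table 7.

    # Sketch of the one-sample t-test against the neutral score of 3,
    # using placeholder ratings with a mean near the reported 3.53.
    from scipy import stats

    length_ratings = [4, 3, 4, 3, 5, 3, 4, 3, 4, 2, 4, 3, 4, 3, 4]  # hypothetical

    t_stat, p_value = stats.ttest_1samp(length_ratings, popmean=3)
    mean_rating = sum(length_ratings) / len(length_ratings)
    print(f"mean = {mean_rating:.2f}, t = {t_stat:.2f}, "
          f"confidence that the mean differs from 3 = {(1 - p_value) * 100:.1f}%")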

Finally, students were asked to comment on their comfort level with the professors running spot checks through Turnitin.com on papers submitted in the class. Only qualitative information was collected on this point. Table 8 provides a histogram denoting the frequency of various responses (the full question text - along with the full text of all questions asked - is available in Appendix IX). Students were provided with six "comments" to choose from, and were allowed to select as many options as they felt applied to their opinion.


Table 8: Student comments on Turnitin.com spot checks during pilot study

Obviously, any information we can draw from this table will be purely anecdotal, though it is interesting to note that as many students felt that their peers needed policing as felt that their privacy had been invaded. There was also a nearly even split on whether the students at the College are sufficiently dishonest to justify plagiarism checking. It is encouraging that at least one student was discouraged from plagiarizing as a result of the efforts in the pilot study.

Conclusions

The results of the pilot study are useful for comparison with the results of the more recent survey of student, faculty, and staff opinions regarding anti-plagiarism software. Perhaps most interestingly, whereas students felt ambivalent about the usefulness of the Prentice Hall software in the pilot study, they reported a statistically measurable (if still lukewarm) positive reaction to the software when it was placed in comparison with competitor products - with a greater than 99.9% measure of confidence. This indicates that the ambivalent reaction of the students to the Prentice Hall software in the pilot study might have been significantly worse had a competitor product such as Glatt's been used in its place.

The insights this pilot study provides into student perceptions of plagiarism detection software are also revealing. The significant number of students indicating that they felt privacy concerns with the spot checks indicates that further work exploring these concerns might be useful. Specifically, it would be interesting to test student reactions to Turnitin's policy of archiving past submissions for future anti-plagiarism checks. This policy has raised some alarms in the academic community regarding intellectual property rights, and might well turn out to be a source of student concern, as well.

In summary, the pilot study seems to confirm that students do not particularly dislike the Prentice Hall software, and are as likely to appreciate spot checks through Turnitin.com as they are to resent them. This indicates that campus-wide implementation of these two software packages is feasible, should the faculty determine that such a program is in keeping with the mission of the College. The results of the survey of academic honesty teaching software do seem to indicate that further development in the field might be appropriate before any such adoption is likely to occur.


Overall Recommendations

Overall, the results of this study suggest that, while there is room for improvement in the academic honesty teaching field, post-secondary institutions interested in adopting such software may wish to consider the free "Understanding Plagiarism" module from Prentice Hall ahead of competing products in the field. While it tests only marginally better than competing free resources available online through Purdue and iParadigms, student reaction during the pilot study in the CMC Literature Department suggests that the slight preference students grant it in comparison to other products may be important in preventing a negative student reaction to the program as it is implemented. Since the pilot program suggests that positive response to the software declines in the absence of comparison, it seems reasonable to suggest that any margin of preferability is worth seeking. In any case, the report strongly suggests that free teaching modules like the Prentice Hall package are preferable to lower-ranking, fee-based services like Glatt's GPTeach.

In the area of plagiarism detection, Turnitin.com is a clear winner, as the only test subject throughout the survey to elicit a strongly positive response from the survey takers. In implementing Turnitin's services, anecdotal evidence suggests that some form of formal support mechanism may be helpful, and that the privacy concerns of the student body should be addressed.










Appendices published under separate cover