Publication of evidence to UK Parliament

The submission to Parliament’s inquiry on the reproducibility crisis was accepted and can be accessed as:

RRE0020 - Reproducibility and research integrity

https://committees.parliament.uk/writtenevidence/39610/html/

Other pieces of evidence are available here:

https://committees.parliament.uk/work/1433/reproducibility-and-research-integrity/publications/

The submission stated the following:

1. Introduction

With this submission I want to draw attention to the systemic nature of the reproducibility crisis and the underlying shift in academic values. Professionally, I work as a research impact officer at the University of Hull, which involves preparation for the Research Excellence Framework and, more generally, aiding researchers across the entire academic spectrum in their impact activities. Nevertheless, I am making this submission as a visiting scholar of the University of Eastern Finland and as the Reading Group Organiser of the Bacchus Institute of Science. My expertise on the subject matter stems not only from having written my PhD thesis on a related topic (Brauer, 2018), but also from having actively continued to research and participate in the scholarly debate about issues concerning academic integrity (e.g. Brauer et al., 2021). This submission will touch upon the extent of the reproducibility crisis (2), briefly investigate its causes (3), identify different actors’ motivations as well as how their actions contribute to the crisis (4) and conclude by suggesting ways to address the issue at hand (5).

2. Scope of the reproducibility crisis

The reproducibility crisis is described within the primary research literature along the following lines:

“70% of papers [within the Strategic Management Journal] did not provide sufficient data to reproduce the findings, while for cases where replication was possible, about one-third had statistically significant hypothesis tests that could not be reproduced.” (Biagioli et al., 2019:402)

Journalists then comment upon and sensationalise this dynamic, stating:

“Scientific research findings that are probably wrong gain far more attention than robust results, according to academics who suspect that the bar for publication may be lower for papers with grabbier conclusions.” (Sample, The Guardian, 2021-05-21)

In its totality, this then shapes the thinking and tone around what the reproducibility crisis is, as exemplified in the Wikipedia entry defining it as:

“The replication crisis (also called the replicability crisis and the reproducibility crisis) is an ongoing methodological crisis in which it has been found that many scientific studies are difficult or impossible to replicate or reproduce.” (Wikipedia)

As stated, the emphasis is on the methodological aspects, and not on the underlying values of the researchers. Nevertheless, the reason why such findings represent a ‘crisis’ in the first place is that such conduct contradicts some of the traditional tenets of scientific research, like the CUDOS norms articulated by the famous sociologist of science Robert K. Merton (Ziman, 2001). The acronym stands for:

  • Communalism – sees research as a social and collaborative endeavour;

  • Universalism – demands that anybody, regardless of identity, ought to be able to make claims and repeat the results;

  • Disinterestedness – holds that personal gain ought not to be the goal, but rather the advancement of scientific knowledge;

  • Originality – postulates that contributions ought to be new; and

  • Scepticism – requires that new claims are critically scrutinised before being accepted.

As we can see, the circumstances created by the reproducibility crisis gnaw at the very heart of scientific integrity, as replicability represents a key tenet of claims to universality. Hence, the scope of the crisis encompasses the entire research enterprise, as it violates taken-for-granted scientific values.

3. The causes of the crisis

To be fair, questions about the reproducibility of published results have always been a fruitful area of debate within scientific scholarship, as showcased by the differences between qualitative and quantitative research in terms of replicability. For example, Frederick H. Gareau (1987) proposed that, when it comes to the social sciences, transparency, plausibility and coherence constitute better hallmarks of trustworthiness. The reason is that, where qualitative data is concerned, the complex interaction between researcher and research subject encumbers replicability from the outset. Such difficulties are not only the domain of the social sciences; such qualitative, human issues persist throughout the entirety of the scientific endeavour. For example, when the first TEA lasers were being built, no amount of technical instructions and correspondence could get a laser to function outside of the lab where it originated. Only after a member of the original lab physically travelled to and assisted other teams was it possible to replicate the experiment (Collins & Harrison, 1975). Other similar issues could be mentioned in terms of the complexities of lab research (Latour & Woolgar, 2013), military-industry-research collaborations (MacKenzie, 1993) or even the intertwined development of early modern political theory and physics (Shapin & Schaffer, 2011), just to showcase how difficult the transmission of new knowledge is and what is practically required to repeat any scientific result in the first place.

The implication for the reproducibility crisis ought to be a shift in perspective: namely, not that research conduct in the past was free of biases, or inherently more robust and trustworthy, but rather that the preconditions which allowed for such trust are no longer present to the same degree. Let us take the remaining CUDOS norms to illustrate this change in culture. Universities that used to be communities of scholars and teachers have transformed into mass educational factories and research enterprises that ruthlessly pursue income from whomever is willing to provide it (Anderson, 2010). Alongside came an increase in size, a lowering of standards and a weaponising of identity, which in their totality undermine any form of genuine community and the trust that flows from it (Hermanowicz, 2021). Any pretence of disinterestedness is gone with the rise of the ‘impact culture’, and the fact that this generates mutually exclusive (impact) agendas is treated as a minor issue by some academics (e.g. Reed & Fazey, 2021). Furthermore, with the impact culture, originality becomes relegated to secondary importance compared to making a lasting influence upon society. This brings us to scepticism, for which the concerns around the reproducibility crisis represent only the tip of the iceberg. Scepticism becomes difficult for the individual academic when the entire academic system is geared towards a dynamic of hyper-performativity (Macfarlane, 2021). Every publication, every grant application, every impact and so on, no matter how small, is logged and transformed into key performance indicators in order to bolster the institution’s bottom line. Instances of resistance to such a state of affairs are few and far between, because even activities that occur outside of this system are commodified and incorporated into its machinery (Edwards, 2020).

4. The role of different actors

As implied in the previous section, to understand the dynamics of the crisis, a systemic perspective on the entire research ecosystem is necessary. To touch only briefly upon some of the involved actors, the following paragraphs will describe how researchers themselves (4.1), funding bodies (4.2), the UK government (4.3), academic publishers (4.4) and the interaction with the public in general (4.5) all contribute to the crisis.

4.1. Researchers

Academic researchers operate in meritocratic hierarchies, where performance gets rewarded in terms of prestige and career advancement. What has happened over the past decades is that the competition between researchers has steadily increased. Where it was previously sufficient to have a few papers published, it now takes many more to be eligible for the same academic position (Warren, 2019). Likewise, pressures to testify to and account for impact outside of academia are steadily increasing as well (Boland et al., 2020). On top of that, every activity of the researcher is logged and monitored in order to ensure transparency and to score well within research evaluations (Ten Holter, 2020). In this hypercompetitive environment, it is not surprising that researchers feel the pressure to perform on a very visceral level; nevertheless, how they respond depends upon the individual. For example, Professor Stefan Grimm took his own life as he could not cope with grant income targets (Parr, Times Higher Education, 2014-12-03). Meanwhile, the identified academic fraud Diederik Stapel admits that the prestige and recognition he received from his conduct were part of the motivation for why he fabricated data (Wichgers, Mare, 2018-04-18). The conduct that leads to the reproducibility crisis lies somewhere between these two extremes, like that of Professor Dan Ariely, who massaged data to get a publishable and interesting result, but whose manipulation was not so egregious that he could be accused of data fabrication outright (Bey & Boyd, The Chronicle, 2021-08-19).

4.2. Funding bodies

The pressure that academics feel to perform, as aforementioned, stems from the all-encompassing assessment of their work by systems like the Research Excellence Framework administered by the UK funding councils. Nevertheless, whilst this is the root cause, the specific pressure that academic researchers feel depends on how these evaluation metrics are interpreted by their particular institution (Wissenburg, WonkHE, 2021-06-02). Universities are incentivised to submit only their best outputs, impacts and descriptions of their research environment. It is this competitive element, translated into performance reviews and associated career advancement, that then results in the visceral pressures researchers feel (e.g. Watermeyer & Tomlinson, 2021). Therefore, whilst the intention of research evaluation may be to promote research quality and transparency and to provide rationales for the allocation of research funds, the reality is that its unintended consequences create unhealthy incentives for researchers which are detrimental to their own mental health and to the vitality of the wider research community (Strathern, 2000). After all, the Research Excellence Framework and systems like it are meant to promote excellence, as stated explicitly in their remit. However, how much real academic excellence top-down government research evaluation produces is an open question at best (e.g. Good et al., 2015; Checchi et al., 2020; Stockhammer et al., 2021).

4.3. The UK government

On top of this, there is the notion of ‘evidence-based policy’, whereby elected government officials ought to use scientific findings to justify their conduct. Whilst intrinsically there is nothing wrong with this dynamic, the competition amongst public officials and the need for ever-new evidence to justify policy aims pushes the available knowledge into murkier and murkier areas where there is as yet no academic consensus. The sociologist of science Peter Weingart explains the ultimate consequences of such a dynamic in the following terms:

“It is evident that these interacting mechanisms, which constitute what I term the coupling of science and politics, do not come to a stable state. Rather, it is foreseeable that there will be no end to the continued production of knowledge which, to capture public attention and support, will be sold on promises and threats, a strategy that could be self-defeating in the long run.” (Weingart, 1999:160).

The latest formalisation of this unhealthy relationship between science and politics manifests itself in the evaluation of research based on its societal impact. In its most basic form, this represented a politically motivated rationale to justify science expenditure in terms of societal benefit (Williams & Grant, 2018). That the unintended consequences of such conduct restrict the boundaries of academic autonomy (Smith et al., 2011) and create epistemic corruption (Kidd et al., 2021), which manifests in the reproducibility crisis for the aforementioned reasons, is usually conveniently ignored.

4.4. Publishers and open access 

Another trend worth mentioning is the increase in open access publishing. Akin to the aforementioned aspects, once again we can observe a case of unintended consequences. Intrinsically, open access requirements reaffirm the aforementioned communal ethos of science, in that they promote the accessibility of research for its users. Nevertheless, as publishing is a commercial enterprise, a variety of counter-productive issues arise (MacLeavy et al., 2020). For example, open access charges, library subscriptions and access fees represent a hidden cost of research, and journals are incentivised to publish more as this maximises profit (Buranyi, The Guardian, 2017-06-27). As good-quality research takes time, and getting published in the top journals is difficult, a questionable alternative for any enterprising researcher is to get their work published in less reputable journals with less stringent peer-review requirements (Grant et al., 2018). Sure, we can label journals that go too far as predatory, as is done by the likes of Beall’s list.[1] Nevertheless, the dynamic that fuels the crisis occupies a grey zone, where such journals are technically not predatory but still contribute to the same dynamic due to their overarching business model.

4.5. The public and supercomplexity

Due to the rise of the internet, the complexity of public life has increased sharply. No longer are just mass media, libraries and public events adding to the complexity of public life; social media, internet databases and on-demand video streaming services are also part and parcel of an increasingly supercomplex social fabric. Mass education aims to train the public to cope with this dynamic; nevertheless, due to the specialisation required, not everyone can be an expert on everything (Barnett, 2000). Whilst stating this contingency in reference to the public in general is almost redundant, it also applies in reverse to researchers themselves when they interface with the public they are meant to influence. Researchers may well be experts within a certain domain of knowledge; their long track record, acquired expertise and peer esteem give them legitimate grounds to make claims that fall within the purview of that domain (Collins, 2004). Nevertheless, this expertise – like all forms of scientific expertise – has limits. Furthermore, in situations of knowledge uncertainty, like certain dimensions of the coronavirus pandemic, the way the public utilises analysis and argues for its viewpoints sometimes rivals – if not supersedes – academic sophistication. To then turn around and dismiss such insights and criticisms of the scientific status quo just because the individuals making them do not have official academic accreditation (e.g. Lee et al., 2021) directly feeds the unquestioning reverence for scientific expertise and the newly emergent values of the research culture that enable the reproducibility crisis. Figure 1 represents an internet meme that mocks this new research culture and the associated contempt for the universality ethos central to the scientific enterprise.

5. Conclusions and suggestions

To summarise, the shift to metric-driven, impact-focused, hypercompetitive academic conduct, within a market economy of publication and against the backdrop of a supercomplex society, represents the breeding ground of the reproducibility crisis. Put differently, the crisis is a symptom of a research sector which is changing its norms to adapt to this new landscape, where individuals try to eke out a competitive advantage and in the process violate scientific ideals. Granted, the norms mentioned above have always clashed with the reality of researchers’ day-to-day lives (Mitroff, 1974). What has changed is the scale of the onslaught upon scientific ideals. They may have been ideals, and researchers may have always fallen short of them; nevertheless, the attempt to live up to these ideals was what kept the scientific enterprise trustworthy. Today, there is no longer even an attempt to live up to them, as the metrics of research (e.g. citations, grant targets, impact factor scores, impact case studies) count for more than the actual content of the research. Such a devaluation of scientific ideals is counter-productive to the validity of research claims, and the predictable outcome is consequences like the reproducibility crisis.

One of the accepted scientific norms is to state the limitations of one’s research, which is paramount in letting the reader assess the quality of the claims and recommendations made by the author. The aim of this submission was to provide a roadmap to the systemic nature of the underlying dynamics which create the reproducibility crisis as a symptom. A necessary limitation, then, is that the individual components of the argument sketched here are not fully elaborated. Here, the use of references and my willingness to expand on any of the points raised in subsequent correspondence ought to be viewed as an intention to provide further specificity wherever required. The bullet points below are a short list of suggestions, each of which needs further elaboration in turn, but they represent a starting point for approaching the crisis and reinstating the value base that enabled trust in scientific claims:

  • There is a need for a mandate that researchers always actively acknowledge the limitations of scientific knowledge and expertise within every communication.

  • Cases of academic misconduct need to be viewed as serious offences, requiring additional mandates to investigate the underlying systemic issues and to change these where deemed necessary.

  • Alongside investments into the public literacy of science, there also needs to be an emphasis on the limitations of scientific knowledge claims, in the sense that public communication of science should showcase not only the successes of science (e.g. impact case studies) but also its failures, its tedious procedures and the assumptions that need to be made within the process.

  • There is a need to legislate that scientific publishers be run as non-profit organisations, and to ease off on open access requirements. Academic publications need to be linked to a body of scholarship, with conferences, societies and the like, and not merely serve as venues of publication.

  • In terms of research evaluation, the requirement to assess research impact needs to be removed. The impetus this creates for ‘pragmatic’ actions conducive to the creation of impact is antithetical to the robustness necessary for reliable scientific knowledge claims.

As is clear from the above list, these are very difficult issues for which there is no easy solution, and they hence require additional research and dialogue in their implementation. Otherwise, to borrow Weingart’s words, the shift in scientific values might ultimately be self-defeating for the trust the public has in science.[2] Nevertheless, this shortlist ought to suffice as a roadmap of which issues to investigate further in order to approach such a wide-reaching systemic issue.

6. Footnotes

[1] https://beallslist.net/, last accessed: 2021-09-21

[2] Bertrand Russell wrote in his book “The Impact of Science on Society” that: “[t]he pragmatic theory of truth [I wrote in 1907] is inherently connected with the appeal to force. If there is a non-human truth, which one man may know while another does not, there is a standard outside the disputants, to which, we may urge, the dispute ought to be submitted; hence a pacific and judicial settlement of disputes is at least theoretically possible. If, on the contrary, the only way of discovering which of the disputants is in the right is to wait and see which of them is successful, there is no longer any principle except force by which the issue can be decided.” (Russell, [1952] 2016:81-82)

7. References

Anderson, R. (2010). The ‘Idea of a University’ today. History & Policy, 1, 22-26.

Barnett, R. (2000). University knowledge in an age of supercomplexity. Higher Education, 40(4), 409-422.

Bey, N. & Boyd, L. (2021). Researchers raise concerns of fraud and ambiguity in two studies authored by Dan Ariely, renowned Duke researcher and professor. The Chronicle, published 2021-08-19. Available at: https://www.dukechronicle.com/article/2021/08/duke-university-dan-ariely-fraudulent-data-colada-research-2012-2004-economics-psychology-statistics, archived link:  https://archive.is/JepxF

Biagioli, M., Kenney, M., Martin, B., & Walsh, J. P. (2019). Academic misconduct, misrepresentation and gaming: A reassessment. Research Policy, 48(2), 401-413.

Boland, L., Brosseau, L., Caspar, S., Graham, I. D., Hutchinson, A. M., Kothari, A., ... & Stacey, D. (2020). Reporting health research translation and impact in the curriculum vitae: a survey. Implementation Science Communications, 1(1), 1-11.

Brauer, R. (2018). What research impact? Tourism and the changing UK research ecosystem. Doctoral dissertation, University of Surrey.

Brauer, R., Dymitrow, M., & Tribe, J. (2021). A wider research culture in peril: A reply to Thomas. Annals of Tourism Research, 86(1).

Buranyi, S. (2017). Is the staggeringly profitable business of scientific publishing bad for science?. The Guardian, published 2017-06-27. Available at: https://www.theguardian.com/science/2017/jun/27/profitable-business-scientific-publishing-bad-for-science, archived link: https://archive.is/IoAxa 

Checchi, D., Mazzotta, I., Momigliano, S., & Olivanti, F. (2020). Convergence or polarisation? The impact of research assessment exercises in the Italian case. Scientometrics, 124, 1439-1455.

Collins, H. (2004). Interactional expertise as a third kind of knowledge. Phenomenology and the Cognitive Sciences, 3(2), 125-143.

Collins, H. M., & Harrison, R. G. (1975). Building a TEA laser: the caprices of communication. Social Studies of Science, 5(4), 441-450.

Edwards, R. (2020). Why do academics do unfunded research? Resistance, compliance and identity in the UK neo-liberal university. Studies in Higher Education, 1-11.

Gareau, F. H. (1987). Expansion and Increasing Diversification of the Universe of Social Science. International Social Science Journal, 39(4), 595-606.

Good, B., Vermeulen, N., Tiefenthaler, B., & Arnold, E. (2015). Counting quality? The Czech performance-based research funding system. Research Evaluation, 24(2), 91-105.

Grant, D. B., Kovács, G., & Spens, K. (2018). Questionable research practices in academia: Antecedents and consequences. European Business Review.

Hermanowicz, J. C. (2021). Honest Evaluation in the Academy. Minerva, 1-19.

Kidd, I. J., Chubb, J., & Forstenzer, J. (2021). Epistemic corruption and the research impact agenda. Theory and Research in Education, 19(2), 148-167.

Latour, B., & Woolgar, S. (2013). Laboratory life. Princeton University Press.

Lee, C., Yang, T., Inchoco, G. D., Jones, G. M., & Satyanarayan, A. (2021, May). Viral Visualizations: How Coronavirus Skeptics Use Orthodox Data Practices to Promote Unorthodox Science Online. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-18).

Macfarlane, B. (2021). The neoliberal academic: Illustrating shifting academic norms in an age of hyper-performativity. Educational Philosophy and Theory, 53(5), 459-468.

MacKenzie, D. A. (1993). Inventing accuracy: A historical sociology of nuclear missile guidance. MIT Press.

MacLeavy, J., Harris, R., & Johnston, R. (2020). The unintended consequences of Open Access publishing – and possible futures. Geoforum, 112, 9-12.

Mitroff, I. I. (1974). Norms and counter-norms in a select group of the Apollo moon scientists: A case study of the ambivalence of scientists. American Sociological Review, 579-595.

Parr, C. (2014). Imperial College professor Stefan Grimm ‘was given grant income target’. Times Higher Education, published: 2014-12-03. Available online: https://www.timeshighereducation.com/news/imperial-college-professor-stefan-grimm-was-given-grant-income-target/2017369.article, archived link: https://archive.is/Zv7FD 

Reed, M. S., & Fazey, I. (2021). Impact Culture: Transforming How Universities Tackle Twenty First Century Challenges. Frontiers in Sustainability, 21.

Russell, B. [1952] (2016). The impact of science on society. Routledge.

Sample, I. (2021). Research findings that are probably wrong cited far more than robust ones, study finds. The Guardian, published: 2021-05-21. Available online: https://www.theguardian.com/science/2021/may/21/research-findings-that-are-probably-wrong-cited-far-more-than-robust-ones-study-finds, archived link: https://archive.is/OV4rE

Shapin, S., & Schaffer, S. (2011). Leviathan and the air-pump. Princeton University Press.

Smith, S., Ward, V., & House, A. (2011). ‘Impact’ in the proposals for the UK's Research Excellence Framework: Shifting the boundaries of academic autonomy. Research Policy, 40(10), 1369-1379.

Stockhammer, E., Dammerer, Q., & Kapur, S. (2021). The Research Excellence Framework 2014, journal ratings and the marginalisation of heterodox economics. Cambridge Journal of Economics, 45(2), 243-269.

Strathern, M. (2000). The tyranny of transparency. British Educational Research Journal, 26(3), 309-321.

Ten Holter, C. (2020). The repository, the researcher, and the REF: “It's just compliance, compliance, compliance”. The Journal of Academic Librarianship, 46(1), 102079.

Warren, J. R. (2019). How much do you have to publish to get a job in a top sociology department? Or to get tenure? Trends over a generation. Sociological Science, 6, 172-196.

Watermeyer, R., & Tomlinson, M. (2021). Competitive accountability and the dispossession of academic identity: Haunted by an impact phantom. Educational Philosophy and Theory, 1-15.

Weingart, P. (1999). Scientific expertise and political accountability: paradoxes of science in politics. Science and Public Policy, 26(3), 151-161.

Wichgers, S. (2018). Everyone’s scared of me: Diederik Stapel was not looking for truth, just reassurance. Mare, published: 2018-04-18. Available at: https://www.mareonline.nl/assets/Uploads/Documenten/d77bcd817a/Mare-26-41.pdf

Wikipedia. Replication crisis. Available at: https://en.wikipedia.org/wiki/Replication_crisis

Williams, K., & Grant, J. (2018). A comparative review of how the policy and procedures to assess research impact evolved in Australia and the UK. Research Evaluation, 27(2), 93-105.

Wissenburg, A. (2021). How the REF is used determines how the burden is felt. WonkHE, published: 2021-06-02. Available at: https://wonkhe.com/blogs/how-the-ref-is-used-determines-how-the-burden-is-felt/, archived link: https://archive.is/62kPv.

Ziman, J. (2001). Real science: What it is, and what it means. Cambridge University Press.

Submitted in September 2021
