Give us your search strategies!


This post is by Library Research Support Advisor, Kirstine McDermid

We’ve had a few researchers through the doors of the Research Hub recently who have told us that their publishers are requesting search strategies in order to publish them alongside papers.

It is encouraging to see that search strategies are becoming more of a prerequisite for some publications – especially for systematic reviews. However, we were concerned to hear that one publisher appeared to be suggesting that the author retrospectively add keywords that had not been searched for to the published version of a search strategy. If the publisher thought the search was inadequate, it would be reasonable to ask the authors to revise it and take account of any new results, but it is clearly bad practice to publish a false search strategy.

Like all good science, search results need to be transparent, open and reproducible. In systematic reviews we need to see how the study came by its evidence and what search methods and databases were used, so that the reviews can be quality assessed, understood and validated, and can also be easily updated in the future. We’ve recently made a search strategies resource to facilitate making search strategies open access. Openly accessible search strategies can also be used by other information specialists and researchers to develop searches that go on to inform future research projects. It is important to get the search right from the outset and essential that all searches conducted for the project are recorded accurately – not fabricated upon publication.
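One way to keep an accurate record as you go is to log every search at the moment it is run. The sketch below is only an illustration, not a Library tool: the `SearchRecord` fields and the `to_csv` helper are hypothetical names chosen for this example, and the sample entry is made up.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass(frozen=True)
class SearchRecord:
    """One search exactly as it was run (field names are illustrative)."""
    database: str   # database that was searched
    platform: str   # interface or host used to search it
    date_run: date  # date the search was actually carried out
    query: str      # the search string, copied verbatim
    results: int    # number of hits returned on that date


def to_csv(records):
    """Serialise the log to CSV text so it can be shared with a paper."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(SearchRecord)])
    writer.writeheader()
    for record in records:
        row = asdict(record)
        row["date_run"] = record.date_run.isoformat()
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up entry for illustration only.
    log = [
        SearchRecord(
            database="Example Database",
            platform="Example Platform",
            date_run=date(2017, 10, 24),
            query='"open access" AND librar*',
            results=123,
        )
    ]
    print(to_csv(log))
```

Because each record is written once and never edited (the dataclass is frozen), the published strategy reflects what was searched rather than what might have been searched.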

Quite rightly, the researcher in question submitted only the search terms that were actually used and did not follow the publisher’s poor advice to improve the search strategy after the event; this would only be appropriate if the review were redone to incorporate the new results that the extra search terms brought up.

The Library’s Research Support Team work with researchers to get their searches right from the outset and can advise on how to record all searching activity. If your project is funded we can use our expertise to do the searches on your behalf and provide neat search strategies documentation and search methods text in preparation for publication.

Contact Lucid if you require literature searching support for your next research project.


Seven ways to increase the visibility of your research

This post is by Library Research Support Advisor, Sally Dalton

So, you’ve published your research and you’re now hoping to sit back, relax and get ready for all those citations to roll in?

Unfortunately the hard work doesn’t stop here!

Now you need to promote your research to make sure it reaches the widest possible audience; this is part of the job of being a researcher. By making your research more visible you could open up future collaboration, job and publication opportunities, increase citations to your work and increase the number of people finding, reading and building on your work.


1. Promote your research at conferences

Conferences are a great opportunity to promote yourself and your research. Even if you aren’t presenting your work you can use the conference as an opportunity to meet other researchers and start to develop your research network. Keep an eye out for names of researchers you would like to meet and practise introducing yourself and your research. You may only have a few minutes so make sure you’re prepared!

2. Carefully consider which journals you are going to publish in

Choosing where to publish is an academic matter, but there are certain questions you may want to ask yourself before deciding. Are the articles in the journal easily discoverable? Are they indexed in services such as Web of Science or Scopus? Does the journal have suitable open access options? Have you and your colleagues heard of the journal? The answers to these questions will determine how visible your article will be to other researchers. Think Check Submit provides a simple checklist to make sure you choose trusted journals for your research.

3. Sign up for an ORCID

Having an ORCID can help to make your research more visible. ORCID is a digital identifier that helps to distinguish you from other researchers. You can link all your research outputs to your ORCID and you can keep it throughout your career. It is particularly useful for researchers with common names, who change their name during their career or who change institutions. No matter what changes are made you will always have the same ORCID, so other people can easily see details of your research outputs. More details on how to sign up for a free ORCID can be found here.

4. Make your research open access

Open access publishing makes scholarly works available online, free for anyone to find and read. The potential readership of open access articles is far greater than that for articles where the full text is restricted to subscribers. Making your research open access will make it more visible. There are two ways to make your research outputs open access: by self-archiving in an open access repository or by publishing in an open access journal. More information on open access can be found on our open access pages.

5. Share your research data where appropriate

There is growing evidence that sharing data can increase the visibility of research. Sharing your data could allow other researchers to validate your work, build upon it and could potentially help to open up future collaboration opportunities. Learn more about managing and sharing your data on our Research Data Management pages.

6. Promote your research online

Promoting your research online will help you reach your potential audience, connect with other researchers and start developing a network of online colleagues. There are a number of different social media tools, such as Twitter, Instagram, blogs and LinkedIn. Whichever tool(s) you use, it is important to identify who your audience is, to engage with them by asking questions and speaking up about issues that interest you, and to use eye-catching images, videos or visualisations. You don’t need to spend a long time keeping your social media accounts up to date but you do need to be willing to write and check your account(s) regularly.

7. Track when your research is being used

Keeping up to date with who is discussing, citing or sharing your research is important. You can use this type of information on CVs and when applying for funding or jobs. To check who is citing your work you can look at your articles on sites such as Web of Science, Scopus or Google Scholar. If you are an early career researcher it may be more appropriate to use Altmetrics. Altmetrics looks at who is talking about and sharing your research in places such as social media, news outlets and course syllabi. For more information on Altmetrics have a look at our Altmetrics pages.

The Research Support team run regular workshops on increasing the visibility of your research, focused on different faculties; book online here. (N.B. These are currently for postgraduate research students only. Let us know if you would be interested in similar sessions for research staff.)



20,000 Open Access papers made available from White Rose Research Online

This post is by Beccy Shipman, Research Support Advisor from the Research Support Team based in the Research Hub on Level 13 of the Edward Boyle Library.

This significant milestone was reached earlier this year when “How Sharing Can Contribute to More Sustainable Cities” by Dr Milena Büchs (School of Earth and Environment) and colleagues was uploaded to White Rose Research Online via Symplectic.  This reflects the accelerating pace of deposit – numbers have doubled since December 2015 and could hit 25,000 by the end of 2017.

“I think open access research repository systems like Symplectic are very useful tools in today’s research world” says Dr Büchs. “The process of uploading your paper is very easy and just involves a few clicks, and it means everyone with internet access can search for and read our work. This is important because academic research should benefit everyone, not just those who have privileged access to expensive journals”.

Papers uploaded to Symplectic are available online to read, download and re-use via White Rose Research Online as soon as any publisher embargo expires and subject to licence terms. They are also searchable via Google Scholar, CORE and other search engines, which helps to increase their visibility, reaching wider audiences and creating more opportunities for people to see, engage with and build upon the research. Submitting your full text papers to Symplectic as soon as possible after acceptance for publication will also help you meet funder requirements and preserve your eligibility for REF 2021.

Staff in both the Library and Research Innovation Service are working with school and faculty-based contacts in a variety of different ways to support academics, helping them ensure their work is open access and REF-ready.

Professor Nick Plant, Dean of Research Quality and Impact, said, “It is very encouraging to see the rapid increase in deposit that has taken place since 2015. Open access helps our research reach the widest audience possible, increasing its impact, as well as helping us to meet our obligations to funders and supporting our REF 2021 submissions”.

Please contact the Research Support team in the Library should you have any open access queries.

Local support is also available to help you with HEFCE open access compliance.

Shut up and Write: tomatoes, biscuits, peace and quiet

Over the summer and during the autumn term we have been piloting Shut Up and Write sessions for researchers up on Level 13 of the Edward Boyle Library. Similar sessions have been running successfully for some time for undergraduates, but we weren’t sure what the interest would be from RPGs and staff.

It was considerable!

Sessions booked up quickly which led us to schedule weekly slots – alternating morning and afternoon – for the whole of the autumn term. We continue to monitor progress.

Il pomodoro

The sessions utilise the ‘pomodoro technique’, named after the tomato-shaped timer used by Francesco Cirillo, who developed the technique 30 years ago. Rather than a tomato we tend to use an Apple (iPhone), more sophisticated if less characterful, but the principle is the same: over 2 hours of focused writing time split into 25-minute ‘sprints’. The idea is that this structure enables you to concentrate and not become over-tired. After each sprint there is a short break to grab a brew and a biscuit or chat. The full process is outlined on our handout (word.docx), which includes links to useful resources as well as tips on running your own sessions.
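If you want to plan a session of your own, the timings can be sketched in a few lines of Python. This is only a rough illustration: the 25-minute sprint comes from the description above, but the 5-minute break and the choice of five sprints are assumptions made for this example, and `build_schedule` is a hypothetical helper rather than anything we use in the sessions.

```python
from datetime import datetime, timedelta

SPRINT_MINUTES = 25  # sprint length described in the post
BREAK_MINUTES = 5    # assumption: the post only says "a short break"


def build_schedule(start, sprints, sprint_minutes=SPRINT_MINUTES,
                   break_minutes=BREAK_MINUTES):
    """Return (label, start, end) tuples: sprints with a break between each."""
    schedule = []
    current = start
    for number in range(1, sprints + 1):
        end = current + timedelta(minutes=sprint_minutes)
        schedule.append((f"Sprint {number}", current, end))
        current = end
        if number < sprints:  # no break needed after the final sprint
            end = current + timedelta(minutes=break_minutes)
            schedule.append(("Break", current, end))
            current = end
    return schedule


if __name__ == "__main__":
    # Assumption: five sprints, which gives just over 2 hours of writing time.
    for label, begin, finish in build_schedule(datetime(2017, 1, 1, 10, 0), 5):
        print(f"{begin:%H:%M}-{finish:%H:%M}  {label}")
```

Changing `sprints` or `break_minutes` is enough to fit the plan to a shorter slot or a longer tea break.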


So why do researchers who may have their own workspace want to come and sit in a structured, silent session in the Library? Why did we have good sign up over the summer when there are lots of free spaces to study in all the University Libraries?

Here are some of the reasons people find the session useful:

1. For PhD candidates in particular, writing can be a lonely pastime. It’s easy to feel isolated. In SUAW, the individual is part of a group and has opportunities to chat to others and feel part of a community.

2. For academic staff, it can be difficult to carve out protected writing time. If you’re in your office, there are the regular distractions of emails, knocks at the door and all the other work you need to be getting on with. Shut up and Write is in your calendar; it’s protected, quiet time.

3. Getting out of your usual space can be stimulating and lead to greater productivity or new ideas. The same old four walls may not always be doing you a favour.

4. One PhD candidate noted that the regular writing slot is helpful psychologically and also in terms of ensuring there is new written work to discuss in supervision sessions. Put simply, if you know you’ve got time to write, you don’t have to worry about not writing the rest of the time.

5. Free tea and biscuits!

Turn up and Talk

As a counterpoint to Shut up and Write we’re hoping to pilot a series of sessions to facilitate conversation among researchers.

The Research Hub provides a great space for informal events and we would like academics from across the campus to use it to present their research and to develop cross-disciplinary networks.

Some ideas might be:

  • Speed networking – facilitated networking via timed one-on-one conversation
  • Data conversations – come and talk about your quantitative or qualitative datasets and associated issues
  • Conference clinic – come and practise your presentation in a supportive environment

Let us know what you think and get in touch with any ideas of your own.


Leeds Postgraduate Researchers access White Rose Libraries


Postgraduate researchers from the University of Leeds, University of Sheffield and the University of York can become White Rose Libraries members.

Leeds postgraduate researchers can register as White Rose Libraries members at the University of Sheffield and the University of York, providing them with the same borrowing rights and access to the physical collections that are afforded to local postgraduate researchers. Access to Library electronic resources is limited, however, due to licensing restrictions.

Becoming a White Rose Libraries member

Visit a Library enquiry desk at any of the University of Leeds libraries and show your Library card to receive a confirmation letter. Present this letter, along with your University of Leeds Library card, on arrival at the University of Sheffield Library or the University of York Library to be registered as a White Rose Libraries member.



Open in order to…democratise knowledge: from Glasnost to the commodification of information

“Knowledge is power. Information is liberating. Education is the premise of progress, in every society, in every family.”

Kofi Annan, 22 June 1997.1

Freedom of information is a condition of a free society. Access to information is a prerequisite for education and learning, development and progress, but many people find access impeded by commercialisation in the supply chain. This inequality is particularly pertinent in low-income countries, where information can transform practices in every field of human endeavour, including health, agriculture and environmental management, and underpin sustainable development1.


Open access (OA) to scholarly information removes the affordability and access barriers and allows a broader audience to benefit from the knowledge in research papers, including those outside academia, such as business owners, educators and third sector organisations. The serials crisis, in which journal prices have risen rapidly, and the need to reform scholarly communications have given impetus to the OA movement. There was, however, a different agenda in Central and Eastern European countries after 1989.

In the last in our series of posts for International Open Access Week we focus on the importance of information in a free society. This account of the role of libraries during the democratic and economic transition in post-communist Central and Eastern Europe (CEE) will draw examples from library assistance programs and initiatives to distribute academic journals to illustrate the importance of free and unrestricted access to information to support democratisation and other reforms. The signing of the Budapest Open Access Initiative in 2002 followed a decade of work by the Open Society Institute (OSI) to instigate a sustainable model to supply depleted libraries with valuable journal content. OA would have been ideal to support scholars who are not privileged with journal subscriptions.

In a keynote speech, Russell Bowden of the Library Association outlined the need for information to sustain and assist in the development of emerging democracies. In the communist regimes of CEE, the Communist Party nomenklatura understood the importance of information and, specifically, the need for effective information control to retain their power. Newspapers were censored. Book production was centralised and publishing decisions were closely controlled, which is why illegal samizdat literature was distributed by dissidents2.

Behind the iron curtain, vast networks of libraries functioned primarily as propaganda agencies. Librarianship involved political duties and collection development was constrained by strict ideological adherence3. Book collections were usually closed with access to certain parts being strictly controlled2.

These measures were utilised, primarily, to restrict access to information and were discontinued after the collapse of communism that began with the revolutions of 1989.2

The fall of communism and a tremendous thirst for knowledge

Soon after the communist regimes were removed, concerned librarians arrived in the former Eastern bloc to assess the state of libraries and librarianship after years of deprivation. Accounts of the impoverishment they encountered abound in the professional literature during the early 1990s. Ulla Højsgaard represented the Danish National Library Authority in a Danish-Swedish team that assessed Bucharest’s libraries in March 1990. Højsgaard reminds us:

It is necessary to try to understand how much the libraries have suffered, how much harm has been done through the total isolation from the international library community, the financial starvation, and the constant political control4.

The Scandinavian librarians concluded that the scarcity of photocopiers, lack of automation and methods of interlibrary cooperation resembled Western Europe in the early 1950s. Gheorghe Buluta, Bucharest Municipal Library Director, concurred, describing the situation as a ‘slip in time’ that resulted from segregation5.

Tanja Lorkovic, curator of Slavic and East European collections at Yale University Library, toured CEE in May 1990 to investigate the impact of the political and economic upheaval on library systems. She saw ‘evidence of economic devastation reflected directly in the status of the libraries… [which were] near the bottom of the list of priorities for reform’6. The use of library resources was hindered by: deteriorating facilities; staff shortages; preservation problems; a lack of photocopiers; and, on occasions, exorbitant fees7. Furthermore, access to collections was frustrated by closed stacks in most libraries8.

A hunger for books was prevalent in the fragile new democracies of CEE. However, the ideological constraints on collection development were replaced by limitations resulting from a scarcity of funding and lacunae remained plentiful in the book and journal holdings of major libraries9. In Romania, ‘a tremendous thirst for knowledge of Western culture’ was observed10 and the ‘hunger for Western business information [was] almost physical’11.

Economic reforms prompted an increase in demand for business information8, but libraries in CEE had little or no experience of providing information services to democratic participants or the business community12. It was soon recognised that, as gate-keepers of and gateways to information, libraries had an important role to play in the post-communist transition13. But library systems required an overhaul to meet the emerging information needs14.

Wealthy benefactors

From 1990, libraries had to justify their existence and compete for scarce funding15. Philanthropic foundations, including the Open Society Foundations (OSF) and A. W. Mellon Foundation, funded projects which brought major changes to libraries in CEE15 16. Simultaneously, European Union and United States Information Agency initiatives sought to raise awareness of the ‘importance of information in advancing democracy’17 18, improve the outdated information infrastructures, and instigate new practices in the neglected service sector17.

The changes to library administration, budgeting, education, technology, and collection policies emulated Western practices14. Library reforms prioritised access to information. The availability of technology, expertise and funding produced an efficient exchange of information in the libraries of the Czech Republic, Hungary, Poland and Slovakia which were quickly integrated with Europe17 19. Indeed, connecting the library networks of CEE to a global information network was a key element of the integration with the world community, based on freedom and democracy20.

Although exogenous funding made library development easier, Caidi questions whether it also imposed ‘the dominating discourse of development and modernity’, and, consequently, blocked alternative endogenous possibilities for more participatory processes15. This point is echoed by Robinson who states that ‘the motives for Western support… are, not surprisingly, an amalgam of idealism and self-interest’21. In the same vein, Pateman claims that genuine philanthropy was mixed with projects that sought to exploit a new market that rewarded information consultants handsomely22.

Deprivation of scholarly information

The deprivation of foreign scholarly publications in research libraries during the communist era had lasting, detrimental consequences for teaching and research at universities in CEE. Initially, international book and journal donations supplied much needed new texts but they also delivered masses of unwanted material. Acquisition grants helped a select group of research libraries update their book holdings and take five-year subscriptions to scientific journals. Informal networks of local and international academics were established quickly to ensure that partner libraries were receiving appropriate material23.

Professor William Hunt established the St Lawrence Solidarity Project to improve the holdings of research libraries in Poland, Hungary and Czech Republic and underpin the intellectual integration of CEE. He reported that ‘energetic and competent scholars in Eastern Europe are simply unaware of the existence of important western works in their field,’ and suggested that providing multiyear journal subscriptions would be the most efficient use of resources23. Single year subscriptions were a concern due to the implicit pressure to fund renewals24.

In 1990, Arien Mack founded the Journal Donation Project (JDP) to develop the research and teaching capacities of higher education institutions throughout CEE. By providing research libraries with subscriptions to high-quality English language titles and backfile collections, the JDP aimed to build journal archives in the countries of CEE that had, for the previous 45 years, been unable to acquire these titles. The JDP was reliant upon subscription donations from publishers until 1995. From 1996, however, a reduced-cost subscription program was introduced with discounts of up to 50% available on over 5,000 journals25. Quandt suggests that it was a necessary resort because the expansion of the JDP eliminated any potential to provide all partner libraries with free access to requested titles. Through partnerships with major publishers, the JDP continues to offer libraries valuable assistance to acquire stellar journal titles.

A report by the Civic Education Project in 1994 assessed the effectiveness of Western assistance projects in fulfilling information needs in CEE. It criticised a general trend for prioritising quantity over quality and, consequently, supplying libraries with material of limited utility. It also highlighted the difficulties encountered when librarians, who were often unfamiliar with market realities, were required to make selection decisions24.

Open societies in a digital age

The Open Society Foundations (previously the Open Society Institute) is a philanthropic network founded by George Soros to support the transition to democracy in the countries of post-communist CEE. Last week, on 17 October 2017, Soros transferred around $18 billion (£13.7bn) to the OSF, which became the third largest foundation in the world, with only the Bill and Melinda Gates Foundation and Wellcome Trust being better resourced.

Soros recognised the importance of information in a free society and funded library reforms, journal donation programs and electronic information initiatives to facilitate access to published research. In 1992, the International Science Foundation (ISF) was launched in the former Soviet Union (fSU) to help scientists and encourage new approaches to funding and managing research. The ISF’s Library Assistance Program was established in 1993 to supply major libraries with academic journals. Over 100 titles were distributed to almost 400 libraries in 1994.26

In 1995, the Library Assistance Program was extended into CEE and provided libraries with complete 1994 and 1995 volume sets. From 1996, the ISF continued as the Science Journals Donation Program, which supplied hard copy journals costing approximately $2 million per year. A planned switch to e-journal supply was hindered by limited internet access that was a consequence of the dilapidated communication infrastructure in some regions26. But the potential to increase access to information in the digital age did not go unnoticed.

EIFL (Electronic Information for Libraries) began as an Open Society Institute (OSI) initiative in 1999. Its mission is to enable ‘access to knowledge through libraries in developing and transition countries to support sustainable development.’ EIFL negotiated an e-journal license with EBSCO for full-text access to 3,500 journals in five databases; 1.4 million articles were downloaded in 2000.23

For almost a decade, the Soros Foundations funded journal acquisitions and library reform projects to facilitate access to scholarly information and reinvigorate research institutions in the post-communist countries of Central and Eastern Europe. Digital information rapidly increased in importance during the same period but exploitative business practices had curtailed any possibility of an egalitarian turn in academic publishing. Establishing EIFL and merging the Library Network Program and Internet Program with its Center for Publishing Development to form a new Information Program put the OSI in a strong position to utilise digital information. Open access to online information was an immediate focus for the Information Program27.

The Budapest Open Access Initiative (BOAI) was the outcome of a meeting convened by the Open Society Institute in Budapest on 1 and 2 December 2001. The meeting facilitated lively discussions between sixteen participants but often featured divergent analysis and critique of the dysfunctionalities of an outdated model of scholarly communications. It ended without an agreement and a position paper was crafted through online collaboration; convergence was reified in the document, bearing sixteen signatures, that appeared on 14 February 2002. The opening paragraph portrays a scholarly community in which knowledge is disseminated freely as a public good, thus removing the inequalities that exist when an expensive subscription is required to access commodified information28. This is not an unachievable idyll; it is the definition of open access:

An old tradition and a new technology have converged to make possible an unprecedented public good. The old tradition is the willingness of scientists and scholars to publish the fruits of their research in scholarly journals without payment, for the sake of inquiry and knowledge. The new technology is the internet. The public good they make possible is the world-wide electronic distribution of the peer-reviewed journal literature and completely free and unrestricted access to it by all scientists, scholars, teachers, students, and other curious minds. Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge.

Budapest Open Access Initiative, 2002.28

In 2002, shortly after the BOAI was agreed, the OSI Information Program committed at least $3 million to promote OA during a three-year transition. By April 2005, it had provided grants to a total value of $1,766,632 for OA projects and realised that the transition to OA would take far longer than three years28.

It is fifteen years since the BOAI first defined OA and outlined the dual strategies of self-archiving and OA journals for implementing an OA model of scholarly communications. These strategies are also known as green and gold OA.

Open access had achieved a 22% share of papers published by 2014; immediate OA accounted for 17% and embargo periods delayed OA to the other 5%29.

The damage to the library networks and research infrastructures in CEE that resulted from the information control utilised by authoritarian rulers during the communist period is, in many instances, still being repaired. For example, the Bill & Melinda Gates Foundation Global Libraries initiative modernised the public library network in Romania between 2010 and 2016 by providing IT and internet connectivity. Similar initiatives to improve public libraries have been completed in Bulgaria, Lithuania, Latvia, Ukraine and Poland.

Libraries around the world now have internet connectivity and facilitate access to information. Access can, nevertheless, be impeded by financial barriers when information is commodified. Open access can provide readers with free access to the knowledge contained in research outputs. Depositing research outputs in an institutional repository can facilitate open access to current information that might otherwise be unaffordable and thus unavailable to fulfil the information requirements of projects that support sustainable development.


1 Annan, K. (1997) ‘If information and knowledge are central to democracy, they are conditions for development, says Secretary-General’. United Nations Meetings Coverage & Press Releases. Press Release SG/SM/6268, 22 June 1997. Available at: (accessed 24 October 2017).

2 Bowden, R. (1995) ‘Emerging democracies and freedom of information: keynote address’. In Emerging democracies and freedom of information. Proceedings of a conference of the International Group of the Library Association (IGLA), Oxford, September 1994, edited by Barbara Turfan, 3-9. London: Library Association Publishing.

3 Smith, I. A. (1995) ‘Developments in library information services and access to information in the Baltic States since renewal of independence’. In Emerging democracies and freedom of information. Proceedings of a conference of the International Group of the Library Association (IGLA), Oxford, September 1994, edited by Barbara Turfan, 55-65. London: Library Association Publishing.

4 Højsgaard, U. (1990) Assistance to Romanian libraries: Results from a Danish-Swedish visit to Bucharest: Dan Shafran representing the Royal Library Stockholm, Ulla Højsgaard representing the Danish National Library Authority, March 11-18, 1990, and suggestions for action. Copenhagen: IDE, Danish Institute for International Exchange of Publications, Danish National Library Authority.

5 Mowat, I. (1990) ‘Romanian library development: Past, present and future’, Library Review, 39 (4), 41-45.

6 Lorkovic, T. (1990) ‘News special: service, collections in disarray: Revolution not over for Eastern European libraries’, American Libraries, 21 (8), 712-713. Available at: (accessed 24 October 2017).

7 Barr, T. (1993) ‘Three approaches to a brave new world: SEES’s special program’, College and Research Libraries News, 54 (9, October), 517-518.

8 Smith, E. (1995) ‘Facing the challenge of democratization’, College and Research Libraries News, 56 (5), 324-325.

9 Sigal, L. V. (1990) ‘The editorial notebook; Starved, for books’, New York Times, 21 May, 20. Available at: (accessed 24 October 2017).

10 Heald, T. (1990) ‘Books for Rumania’, The Spectator, 2 June, 264 (8447), 26. Available at: (accessed 24 October 2017).

11 Mudrock, T. (1992) ‘Business librarian reaches out to Romania’, Library Directions: A Newsletter of the University of Washington Libraries, 2 (3). Available at: (accessed 25 October 2017).

12 Mowat, I. (1993) ‘Eastern European libraries: the worst and best of times’. In The Bowker annual: Library and book trade almanac, 38th edn, edited by Catherine Barr, 106-112. New Providence, NJ: R. R. Bowker.

13 CORDIS (1999) Library cooperation with Central and Eastern Europe. Available at: (accessed 24 October 2017).

14 Raymond, B. and Adams, K. (1993) ‘Former Eastern Bloc librarianship in transition: Hungary, the Czech Republic and Slovakia’, Canadian Journal of Information and Library Science, 18 (3), 36-50.

15 Caidi, N. (2003) ‘Cooperation in context: library developments in Central and Eastern Europe’, Libri, 53, 103-117. doi:10.1515/LIBR.2003.103.

16 Caidi, N. (2006) ‘Building “civilisational competence”: a new role for libraries?’ Journal of Documentation, 62 (2), 194-212. doi:10.1108/00220410610653299.

17 Caidi, N. (2004) ‘National information infrastructures in Central and Eastern Europe: Perspectives from the library community’, Information Society, 20 (1), 25-38. doi:10.1080/01972240490269979.

18 Hausrath, D. C. (1990) ‘United States Information Agency Bureau of Educational and Cultural Affairs: The Eastern European challenge’. In The Bowker annual: Library and book trade almanac, 35th edn., edited by Filomena Simora, 118-128. New York: R. R. Bowker.

19 Lass, A. and Quandt, R. E. (2000) Library automation in transitional societies: lessons from Eastern Europe. New York: Oxford University Press.

20 Stoyanova, N. (1995) ‘Conference reports: Development of information and library networks in the countries of Central and Eastern Europe as a part of the global exchange of information 5-9 May 1995, Sofia, Bulgaria’, Electronic Library, 13 (4), 407-409.

21 Robinson, W. H. (1992) ‘Library has role to play in developing democracies’, Library of Congress Information Bulletin, 51 (January 27), 35-38.

22 Pateman, J. (1995) ‘Libraries under communism and capitalism’, Focus on International and Comparative Librarianship, 26 (1), 3-16.

23 Quandt, R. E. (2002) The changing landscape in Eastern Europe: a personal perspective on philanthropic and technology transfer (Europe in Transition Series). Oxford: Oxford University Press.

24 Civic Education Project (1994) Assessing the effectiveness of book and journal donations to Eastern Europe. New Haven: Civic Education Project. Available at: (accessed 24 October 2017).

25 New School for Social Research (2013) Journal Donation Program. Available at: (accessed 24 October 2017).

26 Hagemann, M. (2017) The Role of the Soros Foundation in disseminating scientific information in the former Soviet Union. Available at: (accessed 24 October 2017).

27 Soros Foundations Network (2002) Soros Foundations Network 2001 Annual Report. Available at: (accessed 24 October 2017).

28 Budapest Open Access Initiative (2005) Grants: Open Access Projects supported by the OSI Information Program as of April 2005. Available at: (accessed 25 October 2017).

29 Butler, D. (2016) ‘Dutch lead European push to flip journals to open access’, Nature, 529 (7584), 13.

Open in order to…discover buried connections: Text and Data Mining

By “open access” to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.

Budapest Open Access Initiative (2002)

Text and data mining (TDM) is defined by the UK Intellectual Property Office as “the use of automated analytical techniques to analyse text and data for patterns, trends and other useful information”. In our penultimate post of International Open Access Week we consider how TDM will benefit from full open access and look at some of the initiatives, services and tools in this exciting area.

TDM and Copyright

One of the main impediments to TDM cited in a 2012 report, Value and benefits of text mining, was copyright restrictions. In June 2014, the UK Government introduced reforms to enable “researchers to make copies of any copyright material for the purpose of computational analysis…if they have ‘lawful access’ to the work” (section 29A of the Copyright, Designs and Patents Act 1988 (CDPA)). However, well over three years later, TDM is still far from straightforward, in large part due to restrictions associated with subscription content.

Needles in haystacks

Imagine that there is a cure for cancer already out there in the scientific literature. All you need to do to win that Nobel Prize is read and synthesise tens of thousands of research papers and datasets, to find the needles of insight in the scientific haystacks.

Even the most assiduous academic, or the most well-funded team of researchers, can’t hope to excavate the mountains of information at their fingertips, mountains that continue to accrete at an exponential rate. But a machine can, specifically a universal Turing machine, the digital computer.

Except it can’t because, still in 2017, over a quarter of a century since the invention of the web, vast swathes of the scientific literature are out of bounds, locked behind paywalls and controlled by corporations like Elsevier, Wiley-Blackwell, Springer, and Taylor & Francis.

The UK copyright exception explicitly states that “researchers will still have to buy subscriptions to access material” and while Elsevier have developed an API to enable those with a subscription to access full text content as XML for the purpose of TDM, and will even consider requests for access from non-subscribers on a “case by case basis”, access is still very much on their terms, as demonstrated by this post by Chris Hartgerink from November 2015 – Elsevier stopped me doing my research.

It’s a far cry from the Budapest Open Access Initiative of 2002.

One initiative that is working to leverage the broad corpus of open access content is the CORE aggregation service from the Open University.

CORE – aggregating the world’s open access research papers

CORE harvests open access content that meets the BOAI definition and works with a range of stakeholders to exploit a vast corpus of nearly 80 million open access articles. Metadata and enriched full text content are made available both for human discovery, with a Google-style search box, and via an API.

Two examples of the potential for TDM are their recommender service for repositories and ‘semantometrics’ – the first of these, the CORE Recommender, can be seen in action right now on just about any WRRO record whereas semantometrics is more experimental.

CORE Recommender

Recommendation systems are de rigueur for web-based services, typically based on user behaviour tracked by cookies. Think Amazon.

The plugin from CORE, however, uses an algorithm to discover ‘semantic relatedness’ between articles by representing text documents as ‘vectors’.

While the mathematics is one thing*, the crucial point is that the similar documents offered to you for this paper about using text mining to analyse patients’ experiences of colorectal cancer care really are similar, based on a semantic analysis of millions of articles.

* For more information see the ‘vector space model’.


N.B. Admittedly the first result is the same paper from another repository which should probably be filtered out. At least semantic analysis works!
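To give a feel for what representing text documents as ‘vectors’ means, here is a minimal sketch in Python. It is not CORE’s actual algorithm, which is not described here: it assumes the simplest possible vector space model, in which each document becomes a vector of word counts and relatedness is the cosine of the angle between two vectors. The function names and the three sample “papers” are hypothetical, made up purely for illustration.

```python
import math
import re
from collections import Counter


def to_vector(text):
    """Represent a document as a sparse vector of word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is related to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, corpus):
    """Return (document id, score) pairs, most similar first."""
    query_vector = to_vector(query)
    scores = [
        (doc_id, cosine_similarity(query_vector, to_vector(text)))
        for doc_id, text in corpus.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, for illustration only.
corpus = {
    "paper_a": "text mining of patient experiences of cancer care",
    "paper_b": "patient experiences of colorectal cancer care",
    "paper_c": "library automation in eastern europe",
}
query = "using text mining to analyse patient experiences of colorectal cancer care"

for doc_id, score in rank_by_similarity(query, corpus):
    print(doc_id, round(score, 3))
```

A real recommender would weight terms (so that common words like “of” count for less) and work across millions of documents, but the principle is the same: documents that share vocabulary point in similar directions, while documents with nothing in common score zero.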


Semantometrics is not easily summarised and interested readers are referred to the full report. Essentially what it says is that computer analysis of an article’s semantic content and comparison with the broader research corpus can provide insight into the quality of research practices, whereas traditional bibliometrics, or indeed alternative metrics or ‘altmetrics’, are quantitative and provide only a proxy for quality.

Might an evolved metric based on this technology provide a viable and scalable alternative to peer review?

As part of the EU-funded OpenMinTeD project, the CORE team led a workshop at the Open Repositories conference (OR2016) in Dublin last year covering the technical requirements that can enable the text mining of repositories – see the OpenMinTeD blog for discussion of the workshop, including presentation slides.

The UK Scholarly Communication Licence (UK-SCL)

There’s an irony in that established scholarly business models provide publishers with vast quantities of data they can mine to inform and develop yet more products and services to sell back to the academy. Full open access to the literature and underlying data with appropriate Creative Commons licensing will enable us to develop more effective tools and services of our own without being beholden to the commercial gatekeepers.

The UK Scholarly Communication Licence is an open access policy mechanism which ensures researchers can retain re-use rights in their own work. It is a response both to the ongoing transition to open access and to concerns around growing requirements for researchers to assign copyright to a publisher at the point of acceptance. It provides a standard set of licence terms (CC-BY-NC) which permit text and data mining, and re-use of all or parts of the work by the academic in ways other than as part of the original publication.

For more information about UK-SCL see the website –

Other tools for TDM

There are an increasing number of tools available for TDM, many free to use:

  • VOSviewer, developed at the University of Leiden, is a powerful tool to analyse text and data. It utilises natural language processing techniques to create term co-occurrence networks based on textual data and features advanced layout and clustering techniques. It can also be used, for example, to visualise different types of bibliographic network. See YouTube for an excellent video tutorial.
  • Voyant Tools is an open-source, web-based application for performing text analysis. It supports scholarly reading and interpretation of texts or corpus, particularly by scholars in the digital humanities, but also by students and the general public. It can be used to analyze online texts or ones uploaded by users [Wikipedia]
  • Medline Ranker is dedicated to scientists interested to rank the biomedical literature according to a selected topic. The query page allows to search for any biomedical topic. The web server is fast enough to process thousands of scientific abstracts from the PubMed database in few seconds [David Rothman]
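The ‘term co-occurrence networks’ mentioned for VOSviewer can be illustrated with a short Python sketch. This is not how VOSviewer itself is implemented; it assumes the simplest reading of the idea, in which two terms are linked whenever they appear in the same document and the link gets stronger each time that happens. The function name, the stopword list and the sample titles are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, stopwords=frozenset()):
    """Count, for every pair of terms, how many documents contain both."""
    counts = Counter()
    for text in documents:
        # Each distinct term counts once per document; sorting gives
        # every pair a single canonical order, e.g. ("access", "open").
        terms = sorted(set(re.findall(r"[a-z]+", text.lower())) - stopwords)
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical sample titles, for illustration only.
titles = [
    "Open access and the repository",
    "Open access policy",
    "Repository software",
]
links = cooccurrence_counts(titles, stopwords=frozenset({"and", "the"}))

for (term_a, term_b), weight in links.most_common(3):
    print(term_a, term_b, weight)
```

Each pair of terms is an edge in the network and its count is the edge weight; a visualisation tool then lays out and clusters the terms so that strongly linked ones sit together.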

An actual cure for cancer buried in the literature may well be hyperbole, yet the principle stands: a computer has the capacity to identify connections and patterns at a volume and speed that a human reader cannot hope to match, and full open access is crucial to get the most from this technology and from the scientific literature.