Skopher: Supporting the Research Process

I’ve recently revisited several tools for managing academic references and reference material and find myself disappointed at what’s on offer. Over the years I’ve tried combinations of JabRef, BibDesk, Yojimbo, Yep, Papers, iPapers, Sente, DEVONthink, and CiteULike, but have never been happy with their features. I’m continually looking for something to better support and integrate with my workflow — particularly when collaborating with others.

At the moment, I manage most of my reference material by hand.

[Image: Skopher Features]

A couple of days ago I was pointed in the direction of Mendeley, a new entry in the market combining web-based reference management with a desktop interface. It has the beginnings of support for collaboration, but is missing functions that would tempt me to become an early adopter. I tweeted as much and, shortly after, received a reply from Victor at mendeley.com asking what I thought was missing. It turns out that some of my desired features are in the works, which I look forward to seeing.

So what does any of this have to do with Skopher, you may well ask? Well, a couple of years ago (February 2007 to be precise), frustrated by the quality of tool support for reference management, Ric Glassey and I sat down and designed our ideal tool for carrying out research. The result was Skopher (or Skophr if you prefer – it was still fashionable to drop vowels in your company name back then).

Now, it’s likely you won’t have heard of Skopher, because we never found the time to build it. So — having recently admitted to ourselves that we probably never will — I wanted to take some time to document the idea publicly. It describes exactly what I’m looking for in reference management software. And, if somebody stumbles across it and agrees, then maybe I’ll get what I’m looking for somewhere down the road.

Skopher was to be a community-based website to support researchers and the research process in five specific tasks: 1) management of references and reference material; 2) end-to-end integration of reference material with the writing process; 3) navigation of key literature in a topic; 4) keeping up-to-date with newly published material; and, 5) finding relevant publication venues.

The core of Skopher was the modelling of research papers. Each paper was to be uniquely identifiable, with an associated webpage displaying general information and reference details (user-submitted, extracted from PDFs, scraped from the web, etc.), with links to digital copies of the material (similar to CiteSeer).

[Image: Skopher's reference management engine]

The key feature we planned for reference management was collaborative improvement of references through a wisdom-of-crowds approach: the greater the number of data sources and contributors to the website, the more accurate the references would become (this, excitingly, is on the cards for Mendeley).

There’d be sections for discussing each paper, although we hadn’t decided on an implementation – the comment on our design doc was “something cool and innovative here would be nice”. Users would be able to upvote or downvote a paper as recommended reading (à la Reddit or Digg), and this would factor into an overall rating for the paper that considered publication venue, amount of discussion generated, number of citations, recency, etc. Finally, papers were to have ‘I’ve read this’ and ‘I want to read this’ buttons to help users manage their paper libraries.
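To make the rating idea concrete, here is a minimal sketch of how those signals might be combined. The formula, the weights, and every name in it (`Paper`, `overall_rating`, `venue_score`) are illustrative assumptions, not something taken from the Skopher design.

```python
import math
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    upvotes: int
    downvotes: int
    venue_score: float  # assumed to be precomputed and scaled to [0, 1]
    comments: int
    citations: int
    age_years: float


def overall_rating(p: Paper) -> float:
    """Blend votes, venue, discussion, citations and recency into [0, 1]."""
    votes = p.upvotes + p.downvotes
    # Smoothed share of upvotes, so a paper with no votes sits at 0.5.
    vote_share = (p.upvotes + 1) / (votes + 2)
    # Log-damped counts squashed into [0, 1): the 100th comment or
    # citation matters far less than the first.
    discussion = 1 - 1 / (1 + math.log1p(p.comments))
    citation = 1 - 1 / (1 + math.log1p(p.citations))
    # Exponential decay with an arbitrary five-year time constant.
    recency = math.exp(-p.age_years / 5)
    # Arbitrary weights that sum to 1.
    return (0.35 * vote_share + 0.20 * p.venue_score
            + 0.10 * discussion + 0.25 * citation + 0.10 * recency)
```

Every component is scaled to the same range before weighting, so no single signal (a heavily cited but widely downvoted paper, say) can dominate simply because its raw numbers are larger.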

In addition to cross-referencing papers that cite and are cited by each paper (Google Scholar, ACM Digital Library, CiteSeer), we wanted to exploit interesting semantic relationships between papers, authors, and venues. For example:

  • Inferring the impact of a conference/journal from the ratings of the papers it contains.
  • Papers in agreement [papers that support a given argument]
  • Papers in disagreement [different positions on the same issue]
  • Chains of reasoning [specific arguments from first principles]
  • Citation semantics [e.g., aims to improve upon work by, is the seminal paper in field at time of writing, is the position paper that this work relates to, is related work by the same author/group]
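One way to picture these relationships is as typed edges in a citation graph. The sketch below is only an illustration: the relation names, the `CitationGraph` class, the `venue_impact` helper, and the paper keys in the usage note are all hypothetical, and averaging is just the simplest possible way to infer a venue score.

```python
from collections import defaultdict
from enum import Enum


class Relation(Enum):
    SUPPORTS = "supports"
    DISPUTES = "disputes"
    IMPROVES_ON = "aims to improve upon"
    SAME_GROUP = "related work by the same author/group"


class CitationGraph:
    def __init__(self):
        # citing paper -> list of (relation, cited paper)
        self._edges = defaultdict(list)

    def cite(self, citing: str, relation: Relation, cited: str) -> None:
        self._edges[citing].append((relation, cited))

    def related(self, cited: str, relation: Relation) -> list[str]:
        """Papers that cite `cited` with the given relation, sorted by key."""
        return sorted(
            citing
            for citing, edges in self._edges.items()
            for rel, target in edges
            if rel is relation and target == cited
        )


def venue_impact(paper_ratings: list[float]) -> float:
    """Assumed rule: a venue's impact is the mean rating of its papers."""
    return sum(paper_ratings) / len(paper_ratings) if paper_ratings else 0.0
```

With edges such as `g.cite("lee2007", Relation.DISPUTES, "jones2004")`, a query like `g.related("jones2004", Relation.DISPUTES)` lists the papers in disagreement with a given paper.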

Another main ambition was the integration of Skopher with the activity of paper writing. As we work almost exclusively with LaTeX, we envisioned a replacement for BibTeX that would grab references from the web to resolve citation keys in the document. Fully automated, and combined with the wisdom-of-crowds approach to automatic reference correction, Skopher would become an incredibly powerful tool for reference management that integrates seamlessly with the document preparation process.
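The key-resolution step could look something like the sketch below. It is a rough illustration only: the regular expression covers common `\cite`-style commands rather than every LaTeX citation package, and the `lookup` callable stands in for a web service that was never built.

```python
import re
from typing import Callable, Optional

# Matches \cite, \citep, \citet*, ... with optional [..] arguments.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def citation_keys(tex: str) -> list[str]:
    """Return the distinct citation keys in `tex`, in order of first use."""
    keys: list[str] = []
    for group in CITE_RE.findall(tex):
        for key in group.split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


def resolve(tex: str, lookup: Callable[[str], Optional[str]]):
    """Split the document's keys into resolved entries and missing keys.

    `lookup` is a hypothetical stand-in for a remote reference service:
    it returns a reference entry for a key, or None if the key is unknown.
    """
    resolved: dict[str, str] = {}
    missing: list[str] = []
    for key in citation_keys(tex):
        entry = lookup(key)
        if entry is None:
            missing.append(key)
        else:
            resolved[key] = entry
    return resolved, missing
```

For a quick local test, `lookup` can simply be the `get` method of a dictionary of entries. The list of missing keys is where a crowd-sourced service would earn its keep, by suggesting candidate references for keys it cannot match exactly.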

As with papers, users of the site would be modelled, with profiles including affiliations and affiliation history (LinkedIn, academia.edu), a personal bibliography (Mendeley), and research interests (academia.edu). Each user would be able to create and join groups of researchers (friends/colleagues, research group, etc.), and create and share collections of papers with individual peers and groups (Mendeley). Publication venues too would be modelled. Past events or publications would link to the papers they contained (and be scored based on the individual scores of those papers), while future events would have call-for-papers information, organiser and venue information, etc. (eventseer.net).

Finally, all papers, publication venues, and research interests would be tagged with topic keywords. The keywords would be modelled as a hierarchy or lattice denoting topic -> subtopic relations and would be used to drive the delivery of content. For example:

  • suggested reading for users based on research interests and paper scores
  • possible publication targets
  • reading guides for topics based on a combination of citation counts and paper scores
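As a small illustration of how a topic hierarchy might drive suggested reading, the sketch below expands a user's interests to include all subtopics and ranks the matching papers by score. The data layout (a parent-to-children mapping and a dictionary of papers) and the function names are assumptions made for the example.

```python
def descendants(children: dict[str, list[str]], topic: str) -> set[str]:
    """Return `topic` plus every subtopic reachable from it.

    Works for a lattice as well as a tree, since a topic reached
    through several parents is only visited once.
    """
    seen: set[str] = set()
    stack = [topic]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def suggest(children, papers, interests, limit=5):
    """Rank papers tagged with any interest (or subtopic) by score.

    `papers` maps a title to a (topic keywords, score) pair.
    """
    wanted: set[str] = set()
    for topic in interests:
        wanted |= descendants(children, topic)
    hits = [
        (score, title)
        for title, (topics, score) in papers.items()
        if wanted & set(topics)
    ]
    # Highest score first; ties broken alphabetically by title.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [title for _, title in hits[:limit]]
```

The same expansion would serve the other two cases: matching venues instead of papers gives possible publication targets, and sorting on a blend of citation counts and scores gives a reading guide for a topic.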

In addition to working with topics, users would be able to set up notifications for individual authors, research groups, institutions, etc. The key was making information access as easy and open as possible, encouraging the development of third-party tools to interact with the website and its data. As a first step, we planned to expose information via customisable RSS feeds, with an RDF-based interface somewhere down the road.

So that’s Skopher. I’m disappointed we never got around to building it, but glad to see some of the ideas at its core have taken root elsewhere. There is great potential for tools to support researchers and the research process, and current offerings barely scratch the surface.

[Image: Skopher TODO List]

