SciSurfer: real-time search on journal articles

Imagine a world where real-time search is the norm. You get just the information you seek, landing in your lap the minute it becomes available, without having to search for it explicitly. Will this change the way you do science? SciSurfer thinks it will.

The release cycle of scientific knowledge is slow. It may take up to 2 years for a paper to get accepted in a journal. The publishing process itself adds a buffer of a few months (arguably because of the time cost of producing a paper edition, even though most people will never use it). So, for some of us, it doesn’t feel like we are missing much if we do not get the latest updates in our field the very minute they are published. Going to conferences once a year feels like more than enough. But there is a portion of academia that needs constant updates on its field, as close to real time as possible. If you are in the life sciences, getting the latest paper about a molecule or a gene you work on before your competitor does may make or break your career.

For those academics, sciSurfer may be a very valuable tool. The basic idea of sciSurfer is to integrate all journal feeds and search over them. Note that it does not archive the RSS feeds, so only the latest articles are available. This is a different way to think about search, closer to Twitter’s than to Google’s.
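
As a rough illustration of that model (not sciSurfer’s actual implementation), the sketch below pools the current items from several journal feeds and runs a keyword query over them. The `FeedItem` type, the `search_latest` function and the sample entries are all hypothetical; a real system would pull the items from live RSS feeds.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FeedItem:
    """One entry taken from a journal feed (hypothetical structure)."""
    journal: str
    title: str
    published: datetime


def search_latest(feeds, query):
    """Return items whose title contains every query term, newest first.

    Only what the feeds currently expose is searched; nothing is archived.
    """
    terms = query.lower().split()
    # Merge the current items of every feed into a single pool.
    pool = [item for items in feeds.values() for item in items]
    hits = [item for item in pool
            if all(term in item.title.lower() for term in terms)]
    return sorted(hits, key=lambda item: item.published, reverse=True)


# Made-up sample data standing in for live journal feeds.
feeds = {
    "Journal A": [
        FeedItem("Journal A", "Structure of the p53 tumour suppressor",
                 datetime(2009, 11, 2)),
        FeedItem("Journal A", "Protein folding kinetics",
                 datetime(2009, 11, 1)),
    ],
    "Journal B": [
        FeedItem("Journal B", "p53 mutations in cell lines",
                 datetime(2009, 11, 3)),
    ],
}

results = search_latest(feeds, "p53")
for item in results:
    print(item.published.date(), item.journal, item.title)
```

Because the pool is rebuilt from whatever the feeds expose right now, an article that has dropped out of its feed simply stops being searchable, which is the Twitter-like behaviour described above.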


[Read more...]

Introducing citeproc-js

Citation copy-editing is one of those deceptively small burdens that have a way of taking over the working day. If left untended, the task of tidying up casually scribbled references can snowball to crisis proportions as a submission deadline approaches. Similarly, when a submission to one publisher is unsuccessful, significant effort may be required to recast its citations in the format required by another. Collaboration outside of one’s own field can bring with it an unwelcome tangle of fresh style-guide quandaries to ponder and fight through. These are things that the machines, if they want to make themselves useful, should be doing for us.

There is plenty of collective experience in this line, and as fate would have it, there are also plenty of collective solutions. In the TeX/LaTeX world, authors and their editors can today choose between BibTeX and BibLaTeX, both of them excellent utilities, with the several variants of the former supported by no fewer than four separate versions of the BibTeX program. [1] Users of WYSIWYG word processors can look to the bibliographic support built into Word or OpenOffice, or they can turn to an external solution such as EndNote™, ProCite™, Reference Manager™, or more recently Zotero or Mendeley. Migrating data between these environments is a process fraught with uncertainty, but it is sometimes unavoidable when you need this kind of output, and it can only be produced on that kind of system …

[Read more...]

LaTeXSearch: 1M snippets in a searchable database

Last week Springer announced the launch of LaTeXSearch.com, a free online service that lets users search a huge database of LaTeX snippets from Springer journals and publications. This follows the launch a few months ago of a similar service exposing Springer’s database of scientific images (which suggests a precise strategy for building Web services on top of the content in their publication database).

LaTeXSearch does what it promises, using similarity algorithms “to normalize and compare LaTeX strings so that, if similar equations are written slightly differently, the outputs are normalized and matched, granting you the broadest possible results set”. The only glitch is that snippets are not cached but generated on the fly, with the annoying result that it can take quite some time to display the rendered version of LaTeX formulas in search results.
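
Springer does not say how the normalization works, so the following is only a toy sketch of the general idea, not LaTeXSearch’s algorithm. The hypothetical `normalize_latex` function strips a few purely cosmetic differences (math delimiters, `\left`/`\right`, spacing commands, braces around single-character exponents and subscripts, whitespace), and `similarity` compares the normalized strings with the standard library’s `difflib`.

```python
import re
from difflib import SequenceMatcher


def normalize_latex(snippet):
    """Reduce a LaTeX string to a crude canonical form (illustrative only)."""
    s = snippet.strip()
    # Drop surrounding math delimiters such as $...$ or $$...$$.
    s = s.strip("$")
    # Remove sizing commands, but leave commands like \leftarrow alone.
    s = re.sub(r"\\(left|right)(?![A-Za-z])", "", s)
    # Remove the explicit spacing commands \, \; \: and \!
    s = re.sub(r"\\[,;:!]", "", s)
    # Unwrap single-character exponents and subscripts: x^{2} -> x^2
    s = re.sub(r"([\^_])\{(\w)\}", r"\1\2", s)
    # Whitespace is ignored when comparing formulas in this sketch.
    s = re.sub(r"\s+", "", s)
    return s


def similarity(a, b):
    """Similarity in [0, 1] between two snippets after normalization."""
    return SequenceMatcher(None, normalize_latex(a), normalize_latex(b)).ratio()


print(normalize_latex(r"$E = mc^{2}$"))
print(similarity(r"\left( x + y \right)^{2}", r"(x+y)^2"))
```

Two snippets that differ only in spacing or bracing collapse to the same string, so they score 1.0, while genuinely different equations score lower. A production system would need far more rules than these.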

Review of Google Wave as a scholarly HTML editor


Peter Sefton wrote a series of posts on Wave. He has published on Scholarly HTML, so I read attentively what he has to say. What follows are some highlights of his posts, and my thinking about where things are going. There are at least four things that bother me about Wave as it is today:

1- It’s not really HTML

I thought that waves being XML documents would be a good thing because it would separate content from formatting. But it seems that they made some strange decisions about how to represent formatting, with a “very tenuous relationship to HTML”. For example:

While there is talk of ‘XML documents’ in the whitepapers etc, a wave document in the current implementation is apparently a series of lines of text. All formatting and what you might think of as structure, such as whether something is a heading or not, is considered an annotation.
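
To make the quote concrete, here is a minimal sketch of what “plain text plus annotations” could look like. It is not Wave’s actual data model: the `Annotation` and `Document` classes, and the key names used, are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A key/value pair attached to a character range (hypothetical)."""
    start: int  # index of the first character covered
    end: int    # index one past the last character covered
    key: str
    value: str


@dataclass
class Document:
    """Lines of plain text; all formatting lives in the annotations."""
    text: str
    annotations: list = field(default_factory=list)

    def annotate(self, start, end, key, value):
        self.annotations.append(Annotation(start, end, key, value))

    def annotations_at(self, offset):
        """Return the annotations covering the character at `offset`."""
        return {a.key: a.value for a in self.annotations
                if a.start <= offset < a.end}


doc = Document("Introduction\nWaves are lines of text.")
doc.annotate(0, 12, "heading", "h1")   # the first line is "a heading"
doc.annotate(13, 18, "bold", "true")   # the word "Waves" is bold

print(doc.annotations_at(0))
print(doc.annotations_at(14))
print(doc.annotations_at(20))
```

Note that nothing in the text itself says the first line is a heading. Remove the annotation and the structure is gone, which is why the relationship to HTML is so tenuous.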

[Read more...]

The Changing Dynamics of Scientific Collaborations

Call for participation for a workshop at CSCW 2010
[submission deadline: November 20, 2009]

The confluence of two major trends in scientific research is leading to an upheaval in standard scientific practice and collaborative technologies. A new generation of scientists, working in large-scale collaborations, is repurposing social software for use in collaborative science. Existing social tools such as chat, IM, and FriendFeed are being adopted and modified for use as group problem-solving facilities. At the same time, exponentially greater and more complex datasets are being generated at a rate that is challenging the limits of current hardware, software, and human cognitive capability. A concerted effort to create software that will support new scientific practices and handle this data tsunami is redefining the collaboratory and represents a new frontier for computer supported cooperative work.

This follow-on event to a similarly themed workshop at CHI 2009 is intended to foster community among researchers and practitioners from multiple disciplines interested in the changing dynamics of scientific collaborations.
[Read more...]