Gobbledygook


In the March 25 edition of Nature, Julia Lane, Program Director of the Science of Science and Innovation Policy Program at the National Science Foundation, wrote an interesting opinion piece about the assessment of scientific performance. She argues that the current systems of measurement are inadequate, as they have several inherent problems and do not capture the full spectrum of scientific activities. Good scientific metrics are difficult to get right, but without them we risk making the wrong decisions about funding and academic positions.

Julia Lane suggests that we develop and use standard identifiers both for researchers and their scientific output (examples given include the DOI for publications and ORCID as a unique author identifier), that we develop standards for reporting scientific achievements (e.g. using the Research Performance Progress Report format), and that we open up and connect the various tools and databases that collect scientific output. She cites the Lattes database for Brazilian researchers as a successful example of systematically collecting scientific output. Another example given is the ongoing STAR METRICS project, which measures the impact of federally funded research on economic, scientific and social outcomes.

The article emphasizes that it is not enough to think about how best to collect and report scientific output; it is equally important to understand what these data mean and how to use them, and this may differ from field to field. Knowledge creation is complex, and measuring it cannot be reduced to counting scientific papers and the number of times they are cited. Social scientists and economists should be involved in this step. Julia Lane suggests an international platform, supported by funding agencies, in which ideas and potential solutions for science metrics can be discussed.

Flickr photo by jepoirrier.

The article contains a lot of food for thought and has already collected some insightful comments. With perfect timing, Nature this week not only made Nature News available without a subscription, but also added commenting to all their articles. I would like to add some thoughts on topics that were not covered in the article, presumably because of space constraints, and offer a different perspective.

What are the standard identifiers for research output?
Using standard identifiers for research output is an essential first step, and the standard identifier for scientific papers is the DOI. So why is it that PubMed (the most important database for biomedical articles, published by the U.S. National Institutes of Health) still uses its own PMID and doesn't display the DOI in its abstract and summary views? And where is the DOI in abstracts, fulltext HTML or PDF of articles published in the New England Journal of Medicine, to take just one popular medical journal as an example? Both PubMed and the NEJM obviously use the DOI, but why do they make it so difficult for others?
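
One of the attractions of the DOI is that it is actionable: the doi.org resolver redirects every DOI to the current location of the paper. A minimal sketch in Python (assuming the requests library is installed), using the DOI of the Nature article discussed in this post:

```python
import requests

def resolve_doi(doi):
    """Ask the doi.org resolver to redirect us to the publisher's page."""
    response = requests.get("https://doi.org/" + doi, allow_redirects=True)
    response.raise_for_status()
    return response.url  # the final URL after all redirects

# The DOI of the Nature opinion piece discussed in this post
print(resolve_doi("10.1038/464488a"))
```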

The unique author identifier ORCID was mentioned in the article (disclaimer: I am a member of the ORCID technical working group). There are many other initiatives for uniquely identifying researchers, most of them older than ORCID, which was started in November 2009. But it is very important that we agree on a single author identifier that is supported by researchers, institutions, journals and funding organizations. ORCID already has the support of a growing list of members and is our best chance for a widely supported and open unique author identifier. But this list of ORCID members is very short on funding organizations (with notable exceptions such as the Wellcome Trust and EMBO). What is holding them back, and that includes the National Science Foundation (where Julia Lane works) and the U.S. National Institutes of Health (NIH)?

Persistent identifiers are also essential to attribute, cite and share primary research datasets. We have a long tradition of this with sequence data, and there is growing demand in other research areas, especially where huge amounts of data are collected (one example is PANGAEA for earth system research). DataCite is a new initiative that aims to improve the scholarly infrastructure around datasets, and to increase acceptance of research data as legitimate, citable contributions to the scientific record.

With the focus on research papers, we forget that we do not have standard identifiers for many other aspects of scientific activity, including:

  • research grants
  • principal investigator in clinical trials
  • scientific prizes and awards
  • invited lectures
  • curation of scientific databases
  • mentoring of students

How do we measure scientific output?
Citations are the traditional way to measure the impact of a scientific paper. Some of the problems with this approach are well known and were highlighted, for example, in a 2007 editorial in the Journal of Cell Biology (Show me the data). We need a metric that is open rather than proprietary, and that measures the citations of an individual paper and not the journal as a whole. We should also not forget that citation counts can't be compared between different fields.

A 2009 analysis by the MESUR project indicates that the scientific impact of a paper cannot be measured by any single indicator (A Principal Component Analysis of 39 Scientific Impact Measures). Alternatives to citations are usage statistics such as HTML page views and PDF downloads, popularity in social bookmarking sites, coverage in blog posts, and comments on articles. The PLoS article-level metrics introduced in September 2009 combine these different metrics, and make the data openly available.
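
The core idea behind the MESUR analysis can be illustrated in a few lines of Python. This is a toy sketch with made-up numbers, using scikit-learn's PCA rather than the MESUR methodology in full: if a single indicator were enough, one principal component would explain almost all of the variance across metrics.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical toy data: rows are papers, columns are per-article metrics
# (citations, HTML views, PDF downloads, bookmarks, blog mentions).
metrics = np.array([
    [120, 5400, 1300, 45, 3],
    [  8,  900,  150,  2, 0],
    [ 60, 3100,  700, 20, 1],
    [  2,  400,   60,  1, 0],
    [ 95, 4800, 1100, 38, 2],
], dtype=float)

# Standardize each metric so no single scale dominates the analysis.
standardized = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0)

pca = PCA()
pca.fit(standardized)

# If one component explained nearly all the variance, a single indicator
# would suffice; MESUR found that for real data it does not.
print(pca.explained_variance_ratio_)
```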

How best to measure the other aspects of scientific output is largely unknown. It is possible to count the number of research grants or the total amount of money awarded, but should we simply count the number of submitted research datasets, invited lectures, science blog posts, etc., or do we need some quality indicator similar to citations?

Why do we need all this?
Julia Lane emphasizes that we need science metrics to make the right decisions about funding and academic positions. And I fully agree with her that we need more research by social scientists and economists to better understand what these data mean and how best to use them. There is a lot of anecdotal evidence suggesting that science metrics alone may be poor indicators of future scientific achievements, simply because there are too many confounding factors. Maybe we also need to find a better term, as "metric" implies that scientific output can be reduced to one or more numbers.

Another important motivation for improving science metrics, not mentioned in the article, is to reduce the burden on researchers and administrators in evaluating research. The proportion of time spent doing research vs. time spent applying for funding, submitting manuscripts, filling out evaluation forms, doing peer review, etc. has become ridiculous for many active scientists. Initiatives such as the standardized Research Performance Progress Report format mentioned in the paper, or automated tools to create a publication list or CV, can reduce this burden. Funding organizations are also trying to reduce the burden of evaluating research, e.g. by increasing the funding period from 3 to 5 years, reducing the number of papers that can be listed in grant applications (German Research Foundation says that numbers aren't everything), or funding investigators rather than projects.

Science metrics are not only important for evaluating scientific output, they are also great discovery tools, and this may indeed be their more important use. Traditional ways of discovering science (e.g. keyword searches in bibliographic databases) are increasingly superseded by non-traditional approaches that use social networking tools for awareness, evaluations and popularity measurements of research findings.

Lane, J. (2010). Let's make science metrics more scientific. Nature, 464(7288), 488-489. DOI: 10.1038/464488a



Last Tuesday the Archives of Internal Medicine released a study that analyzed the news reporting about cancer in 8 large-readership newspapers and 5 national magazines in the United States. The authors identified 2,228 cancer-focused articles published between 2005 and 2007 and did a more detailed analysis of a random sample of 436 (20%) articles.

20% of articles discussed cancer in general, 35% focused on breast cancer, and 15% focused on prostate cancer. 32% of the articles focused on survival and 8% focused on death and dying. 57% of articles discussed aggressive treatments, but only two articles exclusively discussed end-of-life palliative care. Only 13% of articles reported that aggressive treatment might fail to cure or extend life, and only 30% of articles mentioned that cancer treatments can result in (sometimes serious) adverse events.

Flickr image by Andréia.

Cancer is the second most common cause of death in the United States, and cancer news coverage is therefore relevant to many people. One important finding of the study is the relative under-reporting of death and dying and palliative care, despite the well-documented benefits of palliative care for patients and their families. The Pallimed blog discusses this in more detail. The article was also discussed at Scientific Blogging and at syracuse.com.

I am not surprised by these findings, as they seem to reflect the expectations of most cancer patients and their families towards treatment. In my personal experience as a doctor treating cancer patients, most patients, relatives and their treating physicians (including myself) are overly optimistic about the potential benefits of an aggressive cancer treatment (especially if part of a clinical trial), and talk much less about the possibility of the treatment not working, side effects, or death and dying. The scientific literature supports this personal experience.

The study raises a number of additional questions:

  • What scientific information was used as background information for the news reports? Conference reports vs. published papers, case reports vs. large randomized trials, research in animal models vs. clinical research? Was a source for the research provided in the news reports?
  • What is the cancer news coverage by science/medical bloggers? Is there a similar bias towards aggressive treatment approaches and an under-reporting of treatment failures and adverse events?
  • Are there geographical differences (U.S. vs. Europe, urban vs. rural areas) in cancer news reporting and changes over time?
  • How are other areas of science covered in the media, e.g. other common diseases such as Alzheimer's disease or malaria, climate research or other research areas with large public interest, or basic science research?

Thanks to Ivan Oransky and his Embargo Watch blog for alerting me to this paper.

Fishman, J., Ten Have, T., & Casarett, D. (2010). Cancer and the Media: How Does the News Report on Treatment and Outcomes? Arch Intern Med. DOI: 10.1001/archinternmed.2010.11


Last month (shortly after ScienceOnline2010) David Crotty wrote in a blog post Science and Web 2.0: Talking About Science vs. Doing Science:

Nearly all of the more visible attempts (of science and Web 2.0) so far have focused on talking about science, rather than tools for actually doing science.

Flickr picture from Ivan Walsh.

The blog post is required reading for everybody interested in science and Web 2.0 and has attracted a lot of thoughtful comments (on the blog and on FriendFeed). In another discussion, Thomas Söderquist from the Medical Museion in Copenhagen reminded me that there are limitations to what can be done online. My blog focuses on talking about science rather than doing science, but this post is about doing science online. My research focus is clinical cancer research, and in this field the advantages and limitations of doing science online are different from other subject areas (bioinformatics, for example, looks very different). It probably makes sense to ask yourself the following questions:

  • Are your research data collected in (or easily converted into) digital form?
  • Are standard data formats and standard tools (preferably as open source software) available?
  • Do you regularly collaborate with scientists in other locations?
  • Is information about ongoing research projects publicly available?
  • Do journals have policies regarding the publication of your primary research data?
  • Are there objections to making the research data freely available?

Are your research data collected in (or easily converted into) digital form?
Electronic medical records (EMR) have the potential to improve patient care and reduce costs. But for now, often only some clinical information (particularly lab and radiology results) is available electronically, and paper-based patient records are still commonly used. And both electronic and paper-based records have to be adapted to be useful for clinical research, e.g. by allowing detailed documentation of adverse events.

The raw clinical data of a patient in a trial (called source data in clinical research) are entered into a case-report form (CRF). The purpose of this two-step process is to make sure that all required data are collected and that they are entered correctly. Many clinical trials now use electronic CRFs or electronic data capture (EDC). But these tools are still surprisingly difficult to use and more expensive than paper-based solutions, so that many trials stick to paper CRFs and enter the data into a computer at a later stage. It also doesn't help that the EDC market is very fragmented, so that institutions have to learn to use several different tools.

Are standard data formats and standard tools (preferably as open source software) available?
The Clinical Data Interchange Standards Consortium (CDISC) develops the standard formats for clinical research data. OpenClinica and the caBIG Clinical Trials Management System are two examples of Open Source tools for clinical research.
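
To give a flavor of what a standard format buys you, here is a deliberately simplified Python sketch of a CDISC ODM-style record. The element names follow the general shape of ODM (ClinicalData, SubjectData, FormData, ItemGroupData, ItemData), but this is not schema-valid ODM, and all OIDs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A simplified, NOT schema-valid sketch of a CDISC ODM-style record;
# real ODM requires many more attributes and metadata elements.
# All OIDs below are hypothetical.
odm = ET.Element("ODM")
clinical = ET.SubElement(odm, "ClinicalData", StudyOID="TRIAL-001")
subject = ET.SubElement(clinical, "SubjectData", SubjectKey="PAT-042")
form = ET.SubElement(subject, "FormData", FormOID="ADVERSE_EVENTS")
group = ET.SubElement(form, "ItemGroupData", ItemGroupOID="AE")
ET.SubElement(group, "ItemData", ItemOID="AE.TERM", Value="Nausea")
ET.SubElement(group, "ItemData", ItemOID="AE.GRADE", Value="2")

print(ET.tostring(odm, encoding="unicode"))
```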

Do you regularly collaborate with scientists in other locations?
Most clinical trials are multicenter trials that are conducted in different locations, often even in different countries or continents. The coordination of the different trial locations uses email and web conferencing, but often relies more on human resources than on modern Web 2.0 tools.

Is information about ongoing research projects publicly available?
Clinical research is one of only a few research areas where information about (almost) all ongoing research projects is publicly available. Clinical trial registries serve two purposes. They make it much easier for patients and their treating physicians to find relevant clinical trials. And they allow clinical researchers to understand what clinical research is going on in their field, and to avoid publication bias. ClinicalTrials.gov at the US National Institutes of Health is the largest clinical trial registry. For reasons that are difficult to understand, the European Clinical Trials database (EudraCT) is not available to the public, but work is in progress to change that.

Do journals have policies regarding the publication of your primary research data?
An article in the BMJ last month by Iain Hrynaszkiewicz and colleagues1 provides guidance on how to prepare raw clinical data for publication. The main focus of the paper is patient privacy. Publication of raw clinical data, either as a dataset or as part of a research paper, is still very uncommon. The meta-analysis of individual patient data2 requires the raw clinical data of several clinical trials, and is probably underused because of the effort required.

Are there objections to making the research data freely available?
Patient privacy is a major concern when publishing raw clinical data, and it's therefore critical to remove all identifying information from the dataset. This not only includes direct identifiers such as patient names, birthdates, unique identifying numbers or facial photographs, but also indirect identifiers such as place of treatment, rare disease or treatment, occupation or place of work, etc. It is the consensus of the authors of the BMJ paper that datasets with three or more indirect identifiers should be evaluated for the risk that individuals might be identifiable before they are made available.
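
That consensus rule is simple enough to automate as a first-pass check. A minimal sketch (the field names are hypothetical; a real check would work from the dataset's codebook):

```python
# Hypothetical field names; a real check would depend on the dataset's codebook.
INDIRECT_IDENTIFIERS = {
    "place_of_treatment", "rare_disease", "rare_treatment",
    "occupation", "place_of_work",
}

def needs_disclosure_review(dataset_fields):
    """Flag a dataset for re-identification review if it contains three or
    more indirect identifiers, per the BMJ consensus discussed above."""
    present = INDIRECT_IDENTIFIERS.intersection(dataset_fields)
    return len(present) >= 3, sorted(present)

flagged, found = needs_disclosure_review(
    ["age", "place_of_treatment", "occupation", "rare_disease"])
print(flagged, found)  # True, with the three indirect identifiers found
```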

In contrast to Open Notebook Science, it is impossible to make the results of a clinical trial publicly available before the trial is completed. The statistical design of the clinical trial is based on the number of patients needed to show a significant difference - looking at interim data could influence patient recruitment. Double-blind designs (where neither patient nor treating physician knows which treatment arm the patient is in) are based on the same principle.

In clinical research there is often more at stake than the well-being of patients and the careers of the scientists involved. Pfizer and Roche last week each lost $1 billion in stock market value after they announced negative results of large phase III cancer trials. Drug companies therefore have a great interest in whether and when research findings are published, and that includes the raw clinical data. The selective publication of positive research findings is called publication bias, and the mandatory reporting of clinical trial results in ClinicalTrials.gov was introduced by the FDA Amendments Act of 2007 to reduce publication bias.

Summary
Online tools can help with doing clinical research, and there is probably a lot of untapped potential. From the perspective of an individual researcher or a small research group, it probably makes the most sense to develop and/or use tools that solve specific problems. I use the online project management tool Basecamp to coordinate one clinical research project. And I have created a web-based clinical trials registry for our university hospital. The internet version of this registry helps patients and referring physicians find clinical trials at our institution. The intranet version helps us manage our clinical trials, e.g. by keeping all required documents in one place, keeping track of the patients registered in clinical trials, and reporting serious adverse events. And I don't see why clinical researchers can't adopt the Panton Principles - which endorse that data related to published science should be explicitly placed in the public domain - whenever possible.

1 Hrynaszkiewicz et al. Preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers. BMJ 2010 doi:10.1136/bmj.c181

2 Simmonds et al. Meta-analysis of individual patient data from randomized trials: a review of methods used in practice. Clin Trials 2005 doi:10.1191/1740774505cn087oa



Last Monday I was listening to a very interesting presentation by Ian Rowlands, reader in scholarly communication in the Department of Information Studies at University College London. He and his colleagues are interested in how researchers find and use information, and how this has changed with the internet, especially for the Google Generation (people born after 1993). If you want to be part of this research (and have some fun), you can take the BBC Web Behaviour Test. The test will help you discover which species of web animal you are (I'm a fox).

Flickr photo by Craig Anderson.

In another project, funded by the Research Information Network (RIN), Ian and his colleagues are studying how researchers are using electronic journals. The findings of the first part of the project were presented and discussed in a workshop last July. The presentations are available as PDF download, and as podcast with interviews of the speakers. The findings were summarized in a paper also published last July: Online use and information seeking behaviour: institutional and subject comparisons of UK researchers.

In the paper, the use of Oxford Journals by 10 major UK research institutions was analyzed in the fields of life sciences, economics and history, using the server logs for the full year 2007. Some of the key findings of the study include:

One third of users access Oxford Journals outside business hours
9.7% of use happened on a Saturday or Sunday and 30.1% between 6 PM and 9 AM. This means that about one third of users accessed Oxford Journals outside typical business hours, either working late or from home (the study didn't distinguish between the two). These numbers indicate that remote access (from home, but probably also when traveling) is important for many users. This is obviously not an issue for Open Access journals, but institutions need to provide practical solutions (VPN, etc.) for subscription journals. From personal experience, this remote access is still overly complicated. And these numbers also mean that librarians will not be available for support questions one third of the time.
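
This kind of figure is straightforward to derive from server logs. A minimal Python sketch with made-up session timestamps (the study's actual log processing was of course more involved):

```python
from datetime import datetime

def outside_business_hours(timestamp):
    """True if a session started on a weekend or between 6 PM and 9 AM."""
    if timestamp.weekday() >= 5:  # Saturday or Sunday
        return True
    return timestamp.hour >= 18 or timestamp.hour < 9

# Hypothetical session start times, as parsed from a server log
sessions = [
    datetime(2007, 3, 12, 22, 15),  # Monday evening
    datetime(2007, 3, 14, 11, 5),   # Wednesday morning
    datetime(2007, 3, 17, 9, 30),   # Saturday
]
share = sum(outside_business_hours(s) for s in sessions) / len(sessions)
print(f"{share:.0%} of sessions outside business hours")
```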

Around 40% of sessions originated from a Google Search
In 2004 Oxford Journals opened up to Google for indexing. I didn't expect Google to play such an important role in finding scholarly papers, and I would be very interested in feedback from blog readers. Only 4% of sessions originated from Google Scholar (22% in economics). These results probably explain why Google Scholar hasn't seen that much development since it was launched. The search function at Oxford Journals was rarely used.

43% of users of history journals, but only 16% of users of life sciences journals, used navigational tools (table of contents, etc.) provided by the journal. This statistic obviously doesn't capture users getting the table of contents via email or RSS, but it again shows that access via search is now probably more common than access via browsing.1

Most users spend little time on journal webpages, but return often
The average number of articles viewed per session was 1.1, and the average session time was just over 4 minutes. Instead of lingering, users return often, usually via a search. These numbers indicate that journal webpages are not a place where users spend a lot of time. Unless journals change this (e.g. by more active involvement of users via comments and other social networking features, etc.), they probably can't expect to generate significant revenue from online advertising. The internet has dramatically changed the role not only of libraries but also of journals, as users are mostly interested in single articles rather than the journal as a whole.

The median age of articles was 48 months (life sciences), 73 months (economics), and 90 months (history)
In the life sciences only 25% of the articles were no more than 16 months old, but another 25% were over 104 months old. I would have expected that the median age of articles would be much lower in the life sciences (it was two years in a similar study with ScienceDirect2). It seems as if most papers are not accessed when they are published (in the first few months after publication), but rather as the result of a search strategy, e.g. when writing a paper.

Life sciences users rarely read abstracts on publisher platforms
This should not come as a surprise, as life sciences users typically read abstracts in specialized databases, particularly PubMed. But maybe journal publishers should stop displaying papers in an abstract view, saving users and themselves some effort. PLoS journals don't have an abstract view, but the BioMed Central journals (which are also Open Access) do. Subscription journals (including Nature) typically display the abstract instead of the fulltext to users without subscription access, so there is also no need for a separate abstract view for them.

The number of PDF views was higher than the number of fulltext HTML views (178,152 vs. 106,582). This difference was much more pronounced in economics and history journals, probably indicating that here most papers were printed out and not read on the computer.

Nicholas, D., Clark, D., Rowlands, I., & Jamali, H. (2009). Online use and information seeking behaviour: institutional and subject comparisons of UK researchers. Journal of Information Science, 35(6), 660-676. DOI: 10.1177/0165551509338341

1 My July 2008 blog post Do online journals narrow science and scholarship? discussed potential consequences.

2 CIBER, Evaluating the usage and impact of e-journals in the UK. Working paper 5. Available at http://www.ucl.ac.uk/infostudies/research/ciber/


This post will be rather boring for regular Movable Type or Wordpress users. But we Nature Network bloggers are new to this stuff.

Create and edit blog posts from standalone applications

  • Windows: BlogJet
  • Macintosh: MarsEdit
  • iPhone: BlogPress

I have successfully tested MarsEdit and BlogPress.

Store pictures on blogs.nature.com

Hint: a scientific paper is hidden in here.

You can also store other files with Movable Type, e.g. PDF files.

Trackbacks
Trackbacks were originally invented for Movable Type. They show who has linked to your blog post. I haven't really used them much on other blogs.

Edit comments
We can now edit the comments on our blogs. This can sometimes be helpful, but it is also dangerous. Linking to comments is now much easier (click on the date).

Blog URL
The new blog URL is so much nicer (and easier to remember): http://blogs.nature.com/mfenner.

Plugins
No plugins enabled yet. What are recommended plugins for Movable Type?


Last Tuesday the German Research Foundation (DFG) announced changes to the grant application process, going into effect in July. Researchers are no longer allowed to list all their publications in their grant proposals. The number of publications is limited to five per researcher and to two per year of planned funding (e.g. 6 papers for a 3-year grant). Publications submitted but not yet accepted for publication will no longer be allowed.

Flickr image by CarbonNYC.

Some of the reasoning behind this change was explained in the press conference where the policy change was announced. The DFG wants to put more emphasis on quality instead of quantity, in other words counteract the trend to publish several small pieces of incremental research findings (the least publishable unit or LPU). The DFG didn't say so, but this might also reduce the practice of "honorary coauthorship", with some researchers being coauthors of 20 or even 50 papers per year. The DFG is also not happy with the increasing use of the Journal Impact Factor and other metrics as a token measure for the quality of research output. And as a reaction to problems with publication lists in Göttingen, they want to stop the practice of including unpublished work in reference lists for grant applications.

These changes will decrease the administrative workload of the applicant, the reviewer and the DFG. With much shorter reference lists in grant applications, reviewers will find it much easier to take a closer look at the research output of the applicant, instead of relying on an unfortunate proxy such as the Journal Impact Factor. Researchers seeking funding from the DFG will now probably be more likely to write fewer but more substantial papers. And research that doesn't have the potential for a substantial paper, but is nevertheless worth publishing, can be quickly published in a reasonable journal instead of going through several rounds of submissions to a number of journals.

But how do you select your five best publications (assuming you have written more than five)? Choices include:

  • publication date, e.g. a list of the five most recent publications
  • Journal Impact Factor
  • citation counts, page views, downloads or other article-level metrics
  • personal preference

Using my personal preference (and not too much thought), I picked four papers and one correspondence:

  • Shioda T, Fenner MH, Isselbacher KJ Msg1, a novel melanocyte-specific gene, encodes a nuclear protein and is associated with pigmentation. PNAS 1996 PubMed Central
    The first paper from my postdoctoral research project. We identified and cloned a new gene thought to be involved in cancer metastasis, using a technology called differential display to compare the gene expression profile of two melanoma cell lines. This was before the mouse and human genomes were sequenced, and before microarrays became available. What took us two years of work 15 years ago can now probably be done in a few weeks.
  • Sado T, Fenner MH, Tan SS, Tam P, Shioda T, Li E. X Inactivation in the Mouse Embryo Deficient for Dnmt1: Distinct Effect of Hypomethylation on Imprinted and Random X Inactivation. Dev Biol 2000 doi:10.1006/dbio.2000.9823
    I spent most of my time as a post-doc generating a knockout mouse for the gene identified in the previous paper. As the knockout mouse had no obvious phenotype, it took another post-doc (the first author) to finish the project.
  • Krege S et al. European consensus conference on diagnosis and treatment of germ cell cancer: a report of the second meeting of the European Germ Cell Cancer Consensus group (EGCCCG): part I. Eur Urol 2008 doi:10.1016/j.eururo.2007.12.024
    This paper summarizes the conclusions of a consensus conference on the diagnosis and treatment of testicular cancer, and is the best review on the subject. I am one of over 80 coauthors, something I haven't done before or since. The journal published this as two papers because of length. This would have been a perfect paper for an Open Access journal; I hope I can convince the coauthors to choose one when we update it in 2011.
  • Fenner MH, Beutel G, Gruenwald V. Targeted therapies for patients with germ cell tumors. Expert Opin Investig Drugs 2008 doi:10.1517/13543784.17.4.511
    Testicular cancer is one of the few chemotherapy success stories, as most patients with advanced metastatic disease can be cured. Targeted therapies have become important treatment options in many cancers. This is the first review to look at the evidence for the use of targeted therapies in testicular cancer.
  • Fenner MH. Duplication: stop favouring applicant with longest list. Nature 2008 doi:10.1038/452029a
    This is a Nature correspondence, included here only to show that comments made in a Nature Network forum can end up in Nature. And because it is relevant to this blog post, as I suggested asking applicants to select their best three, five or ten papers instead of giving grants or jobs to those with the longest publication list.

The Wellcome Trust last year announced a different change to their grant application process. Starting later this year, they will stop accepting proposals for project grants, and will instead evaluate the research output of the scientists asking for funding (Investigator Awards). They argue that researchers who have already shown excellence in the past shouldn't be burdened with the administrative overhead and restrictions of writing a detailed project proposal every three years.

It will be interesting to see how institutions and other research funders in Germany (e.g. Helmholtz or Leibniz) or elsewhere react to this DFG policy change. I would be happy if this is a step towards more reasonable publication policies. And I hope that the upcoming unique author identifier ORCID will not be used for even more complicated bibliometric calculations, but rather as a tool for researchers to showcase their most interesting work.


Last week Lambert Heller and I gave a two-day workshop, Reference Management in Times of Web 2.0, for a group of German librarians. We introduced and tested the following five programs:

The goal of the workshop was to introduce the participants to the Web 2.0 aspects of these reference managers, not to pick the best one. We briefly talked about Papers and Citavi, but neither of them offers any Web 2.0 functionality. With the exception of CiteULike (which is more of a social bookmarking service and can't be used to directly put references into manuscripts), all of them are probably good choices for most users. For some of the minor differences, please check my reference manager chart, which I have updated for the workshop (PDF here).

We had used FriendFeed for the slides, links and comments in a similar workshop last July. This time we picked ScienceFeed, both because ScienceFeed can be used for reference management, and to test a service that had launched just three days earlier. The ScienceFeed group can be found here, but it is in German. FriendFeed and ScienceFeed are not only great for conference microblogging, but are also excellent teaching tools, especially in a workshop where every participant has an internet-connected computer. We also had a few people listening in and putting up comments.

The workshop did help me understand what could become one of the most important features of reference managers (I would exclude Endnote, because it doesn't allow public groups or sharing of fulltext files). Libraries used to be places where you could find, store and read literature. A library would hold a subset of all the available literature, but still far more texts than an individual could keep at home. A library serves as an intermediary that helps the user get access to the literature he is interested in.

A reference manager that stores all references and the associated fulltext PDF files in an accessible (public or password-protected) place can fulfill exactly the same role. It is not necessary for an individual user to store every reference and fulltext paper on his own computer. And he doesn't have to find all references himself. Librarians could help with this, e.g. by not only handling a user's search request, but also filing the associated PDF files in a group folder. Other group folders could hold the tables of contents of your favorite journals (e.g. CiteULike Journals). We used to go to the library for exactly these things. And now we do this all on our own, often not asking for help from our local library.

Flickr photo by haydnseek.

In the last session I talked about non-traditional ways to find scientific literature. Traditional would mean one of the following search strategies, summarized by Duncan Hull et al.1:

  1. Search - Search bibliographic databases
  2. Browse - Scan tables of contents
  3. Recommend - Recommendations by colleagues

Twitter is just a modern tool for strategies #2 (check the Twitter list @mfenner/science-journals for some science journals using Twitter to announce interesting articles) and #3 (papers recommended by friends you talk to via Twitter).

The non-traditional approach basically lets other people do the work for you. Some examples include:

  • Experts pick noteworthy papers in your field - Faculty of 1000 and Research Blogging.
  • You follow what people with similar interests are reading - CiteULike and Mendeley
  • Recommendations based on what is in your library - CiteULike recommendations
  • Most popular articles in your research field of interest - CiteULike and Mendeley. The PLoS article-level metrics have the potential to do the same.

1 Hull D, Pettifer SR, Kell DB. Defrosting the digital library: bibliographic tools for the next generation web. PLoS Computational Biology. 2008 doi:10.1371/journal.pcbi.1000204


Microblogging is blogging of short text messages, photos or other media and is best exemplified by Twitter. Twitter use has grown tremendously in 2009, and this includes many scientists.1 FriendFeed is another microblogging tool that not only allows sending short text messages, but connects them together in groups and discussion threads, similar to what you can do in online forums. FriendFeed, and especially The Life Scientists group, has been a popular place for many scientists for the last 18 months or so. Cameron Neylon wrote a good introduction to the service back in June 2008: FriendFeed for scientists: what, why, and how?. FriendFeed is a great tool for conference blogging, and the ISMB 2008 conference was probably the first scientific conference where it was used extensively, resulting in a PLoS Computational Biology paper.2 FriendFeed is also often used to comment on blog posts, and here it is competing for attention with comments that are put directly on a blog. FriendFeed is a good example of a generic Web 2.0 tool that is much more useful to scientists than many Web 2.0 tools targeted specifically at scientists (the Facebooks for scientists).

FriendFeed was acquired by Facebook in August 2009, and users started to worry about the long-term future of FriendFeed. An Open Source version of the FriendFeed web server was recently released as Tornado (source code on GitHub). Although other services (including Facebook) offer similar functionality, no service has (yet) emerged as an alternative popular with scientists. FriendFeed use seemed to be declining at the recent Science Online London 2009 and ScienceOnline2010 conferences, as more and more people were using Twitter.
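
As an aside for the technically curious: Tornado is an ordinary Python web framework, and the canonical minimal application (in the style of the Tornado documentation of that era) fits in a dozen lines:

```python
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

application = tornado.web.Application([
    (r"/", MainHandler),  # route the root URL to MainHandler
])

if __name__ == "__main__":
    application.listen(8888)
    tornado.ioloop.IOLoop.instance().start()
```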

Last Tuesday Google Buzz was released. Buzz is also a microblogging service, tightly integrated with Google Mail and Google Reader. It offers many of the same features as FriendFeed, and because it integrates with Google Mail, it has a large number of potential users from the start - including a large number of people involved in the science blogosphere. Buzz will certainly get some of the features that are still missing, e.g. an easy way to import content from other sources, including a bookmarklet. And Buzz works very well on iPhone and Android phones, where it also uses location information - e.g. all the Buzz discussions near you. But at the moment many people wonder how best to integrate Buzz with FriendFeed and Twitter, and all the other online tools they use - it doesn't make sense to read the same content again and again on all these services.

In this context it is very interesting to see ScienceFeed launching as a new microblogging service this week. ScienceFeed in many ways is similar to FriendFeed, but tries to add features of particular interest to scientists. I spoke with Ijad Madisch about ScienceFeed.

1. What is ScienceFeed?
ScienceFeed is science as it happens, communicated through a microblogging platform. Conceptualized and designed by scientists, it is a bridge between online scientific networking platforms, scientific databases, and the wider online science community. The ScienceFeed platform allows users to post microblogs, sometimes just a few sentences, on scientific headlines, new findings, controversy, conferences and ideas related to science. Community members can follow the feeds of fellow members and comment on topics in which they are interested, allowing real-time communication and transfer of ideas.

ScienceFeed is an interactive and dynamic platform - like science itself. Here, scientists, journalists, librarians, students, and those with an interest in science, will be able to communicate in a way that has no borders. Individuals from all over the world are able to participate and observe, helping to make science accessible to all. Integral to the concept of Science 2.0 is having online resources that are archivable and searchable - ScienceFeed will do just this. Science is not limited to the laboratory: it happens through interactions of communities. ScienceFeed is excited to build such a community.

2. How is ScienceFeed different from FriendFeed?
In basic functionality, ScienceFeed isn't much different from FriendFeed. However, I think with the help of the community we will develop and add applications to the platform that could make it very efficient for scientific communication. There are two differences from FriendFeed which we have already implemented: 1) Specific scientific publications can easily be searched for and then entered as a linked-in reference within a feed, and 2) Groups can be marked as an event (e.g. a conference). My vision is to have event streams in ScienceFeed, which then can be visualized and presented in a much better way. However, the most important part is that we listen to the feedback of the community and develop specific applications based on their ideas and feedback.

3. What special features does ScienceFeed provide for conference microblogging?
An important feature of ScienceFeed is that groups can be marked as a specific event, such as a conference. Administrators of these groups will be able to import hashtags from Twitter, so all tweets will be aggregated and displayed within this group. A possibility for future growth is the integration of an entire conference program (sessions, panels, etc.) into the group, which then can be commented on by group members. It is important to us that the scientific community has an input into the development of this feature so that we can build a stronger, more efficient platform based on the needs of our users.

4. Can you import references into ScienceFeed only via your reference database, or also via CiteULike or other bookmarking service?
Martin, thank you so much for this great idea. Based on your feedback we worked hard to make this happen before launch. Yes, now ScienceFeed can import from other bookmarking services such as CiteULike or Connotea. Furthermore, ScienceFeed supports COinS, which automatically detects whether bibliographic data is embedded in a given web page.
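
COinS works by embedding an OpenURL ContextObject in the title attribute of an HTML span with class="Z3988", which tools can then parse. A minimal sketch of such a parser in Python (the embedded record below is a made-up example):

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs

class CoinsParser(HTMLParser):
    """Collect bibliographic data from COinS spans (class="Z3988")."""
    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "Z3988":
            # The title attribute holds URL-encoded key-value pairs.
            self.records.append(parse_qs(attrs.get("title", "")))

# Hypothetical COinS span, as a reference manager might find it in a web page
html = ('<span class="Z3988" title="ctx_ver=Z39.88-2004'
        '&rft.atitle=Let%27s+make+science+metrics+more+scientific'
        '&rft.jtitle=Nature&rft_id=info:doi/10.1038/464488a"></span>')
parser = CoinsParser()
parser.feed(html)
print(parser.records[0]["rft.jtitle"])  # ['Nature']
```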

5. How is ScienceFeed different from Twitter?
There are several differences, but the largest are that in ScienceFeed there is no character limitation and groups can be tagged as a specific event - facilitating real-time, online communication about the event.

6. What is the advantage of having a social networking tool specifically for scientists?
I think the most important part is the non-dilution of information in an environment where the platform and focus is specifically on science. Consider the following: You can find a biomedical scientific paper by searching in Google, but you could also use PubMed, which has a high probability of faster and better results. It is the same as within ResearchGATE: You have large groups (Methods, Immunology, Neuroscience, Philosophy, etc.) with a very focused population, which again makes your search more directed and efficient with better results.

7. What is the relationship between ScienceFeed and ResearchGATE?
ScienceFeed will be a scientific microblogging platform completely autonomous from ResearchGATE, because I think they target the same group, but with different usage patterns.

The publication reference tool used for inserting papers into ScienceFeed accesses the custom-built database of ResearchGATE. This database now has a public API which makes it possible for everyone to connect to the ResearchGATE literature database. I think that microarticles, which are pretty successful in ResearchGATE (published in our ResearchBLOG), could be a part of ScienceFeed as well. I see ScienceFeed as a platform which will be useful to various scientific platforms such as Mendeley, Academia, ResearchGATE, etc. It could be a platform that helps connect all these different platforms.

8. Will there be a publicly available API for ScienceFeed?
Yes, there will be an API.

9. What are your responsibilities in ScienceFeed?
I am, as in ResearchGATE, one of the co-founders and a kind of CEO. I want to build a team of innovative and forward-thinking individuals to help develop ideas and work conceptually on the future directions of ScienceFeed.

10. What did you do before working on ScienceFeed?
I am a co-founder and CEO of ResearchGATE and I am also working at Massachusetts General Hospital, Harvard Medical School, in Boston as a researcher. Before ResearchGATE I studied Medicine and Computer Science and completed my doctoral thesis in Virology, while working for some time in Gastroenterology as a medical doctor.

11. Could you provide contact information for people that have further questions about ScienceFeed?
I can be contacted anytime at: ijad.madisch@sciencefeed.com.

1 Bonetta L. Should You Be Tweeting? Cell 2009 doi:10.1016/j.cell.2009.10.017

2 Saunders N et al. Microblogging the ISMB: A New Approach to Conference Reporting. PLoS Comput Biol 2009 doi:10.1371/journal.pcbi.1000263


Nature Network turns 3 years old today, and it has been a very interesting ride. I wasn't around when Nature Network started, but posted my first Gobbledygook blog post (the blog had a different name back then) in August 2007. We passed the 50,000 comments milestone just a few weeks ago. And we were told that big changes to the underlying blogging platform are imminent.

Flickr image by Graham Steel.

I have had many, many positive experiences in these 2 1/2 years. I learned a lot about science publishing and met a large number of very nice and very clever people both online and offline. I wrote about 160 blog posts and an uncounted number of comments during that time, and writing blog posts is still a lot of fun and something I like doing on a regular basis (I decided a while ago to aim for one blog post per week). I am also excited about the upcoming Science Online London 2010 meeting, although the exact date and location have not yet been set.

Happy birthday.


Just four weeks ago I wrote a blog post titled How do you read papers? 2010 will be different. Not only have we since seen the announcement of the Apple iPad, but last Monday the free Nature.com iPhone app was launched. The application gives access to the full text of all Nature and Nature News content (through 30 April 2010; how access will be handled afterwards hasn't been announced yet). A version for the Android platform was promised for April, and the app will work with the just-announced iPad. I have included a few screenshots for those without an iPhone or iPod Touch. A free Nature.com personal account is needed to use the app.

The iPhone app doesn't use HTML or PDF but rather the ePub format. The Nature.com website will soon offer downloads in ePub format (an example article is here). Adobe Digital Editions and Stanza are examples of ePub readers. In contrast to PDF, ePub adapts to the screen size and is therefore a much better format for the iPhone.
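
Under the hood, an ePub file is just a zip archive containing a mimetype declaration, a META-INF/container.xml that points to the package file, and reflowable XHTML content; that reflowability is what makes the format work so well on small screens. A quick way to peek inside one with Python (the filename is a placeholder):

```python
import zipfile

# "article.epub" is a placeholder for any ePub download.
with zipfile.ZipFile("article.epub") as epub:
    # The first entry declares the media type of the whole archive.
    print(epub.read("mimetype").decode())  # application/epub+zip
    # The rest is ordinary zipped content: container.xml, the package
    # file (.opf), and the reflowable XHTML chapters.
    for name in epub.namelist():
        print(name)
```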

References are links in the text, clicking on them opens a new window.

Figures are also links in the text that open in a new window. The figures can be saved to the iPhone Photos application.

The iPhone app also gives access to the full text of Nature News. In contrast to Nature papers, images are rendered within the text.

Nature and Nature News content, as well as PubMed search results, can be saved for later reading. This content (or rather the DOI) is also available from the new Nature.com mobile apps page. Because the Nature.com mobile apps page stores only the DOI, a Nature subscription is required to access the fulltext article from there. From the Nature.com mobile apps page you can also export the citation in RIS format.
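
RIS is a simple tagged plain-text format that virtually every reference manager can import. A sketch of what such an export might look like, written out with Python (the record uses the Lane article from an earlier post as a hypothetical example):

```python
# A minimal RIS record; the bibliographic details come from the Lane
# article discussed earlier in this blog, used here purely as an example.
ris_record = """TY  - JOUR
AU  - Lane, Julia
TI  - Let's make science metrics more scientific
JO  - Nature
PY  - 2010
VL  - 464
SP  - 488
EP  - 489
DO  - 10.1038/464488a
ER  - 
"""

# Write the record to a file that a reference manager could import.
with open("citation.ris", "w") as f:
    f.write(ris_record)
```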

The Nature.com iPhone app also searches both Nature.com and PubMed. Regular searches can be saved.

PubMed searches will retrieve abstracts, with a link to the fulltext article via the DOI.

Papers for iPhone is another app that allows PubMed searches. You can also search for the latest Nature content, but the fulltext content is available only with a subscription and only as HTML or PDF.

Support for ePub is the most exciting feature for me, as it opens the door for many interesting mobile applications. I hope that more scientific journals will start to use the format (Hindawi was one of the first publishers to support ePub), and that we will then start to see mobile apps covering more than a single journal.

Bug reports, suggestions and feature requests can be sent to mobile@nature.com. Or add your comments to Henry Gee's Nature On Your iPhone post.
