University Post
University of Copenhagen
Independent of management


Most cited Danish article was about using statistics on rocks and fossils

Citations: The most cited scientific article of all time from a Danish university is a description of a software package developed and authored by a British paleontologist working at the University of Copenhagen. He has helped a generation of geology researchers and students analyse rocks and fossils.

»I am happy to talk about this unusual publication!«

This was the reply from the paleontologist Professor David Harper when the University Post emailed him a request for an interview about his article from 2001. Among all articles by scientists affiliated with Danish universities, his has the highest number of scholarly citations of all time.

Later, when the University Post reached him by phone, he said that he was aware that the paper was highly cited, but not that it was the most cited of all ‘Danish’ papers.

Software set students off on new research path

To understand what the paper is about, you have to go back to the 1990s, when new software arrived on floppy disks.

Most of us now take for granted that you download software via the internet. But it was not always this way, and a floppy disk-based piece of software called ‘PAST’, which Professor David Harper and a colleague developed in the 1990s, has left a lasting mark on how scientists now do geology.

The 2001 article, which describes the PAST software’s purpose and function, is the most cited of all time by a scientist affiliated with a Danish university. According to Professor David Harper, the article helped push the use of statistics and numerical methods in the study of rocks, fossils and the Earth.

»In the 19th and early 20th centuries geology was largely done subjectively, through observing specimens with little reference to numerical and statistical methods. Nowadays, by contrast, we use big samples, measure everything and base our analyses on this. When we are describing fossils, for example, and the distribution of fossils through time and space, we use numbers and statistical methods as a matter of course. Our software package helped students start their own research programmes using numerical and statistical methods in geology, so I think it has had lasting value since the time of its first release,« he says.

He himself was at the University of Copenhagen in the period 1998-2011. And his own fieldwork offered an example of how the software could be applied.

»When I was working in the north of Greenland and listing the fauna found in the rocks, we could build up a list of all the species and compare it numerically to lists from the same geological time period in other areas across the globe. We could then calculate the diversity of the set of species and map their global geographic distribution,« he explains.
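The kind of comparison Harper describes can be sketched in a few lines of Python. The article does not say which measures he used, so this is only an illustration under stated assumptions: it uses the Shannon index as one common measure of diversity and the Jaccard coefficient as one common way to compare two species lists. The function names, taxon names and specimen counts are all invented for the example, and none of it is taken from the PAST software itself.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with a nonzero count.

    `counts` maps a taxon name to the number of specimens found.
    """
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log(n / total)
        for n in counts.values()
        if n > 0
    )


def jaccard_similarity(taxa_a, taxa_b):
    """Share of taxa common to both lists: |A and B| / |A or B|."""
    a, b = set(taxa_a), set(taxa_b)
    if not a and not b:
        return 0.0  # Assumption: two empty lists are treated as sharing nothing.
    return len(a & b) / len(a | b)


# Hypothetical specimen counts for two localities of the same age.
north_greenland = {"taxon_a": 12, "taxon_b": 5, "taxon_c": 3}
other_region = {"taxon_b": 8, "taxon_c": 2, "taxon_d": 10}

print("Diversity, North Greenland:", shannon_diversity(north_greenland))
print("Diversity, other region:", shannon_diversity(other_region))
print("Shared fauna (Jaccard):", jaccard_similarity(north_greenland, other_region))
```

A higher Shannon value means specimens are spread more evenly over more taxa, and a Jaccard value closer to 1 means the two localities share more of their fauna.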

Nowadays scientists typically analyse their data via multi-purpose programming languages like ‘Python’ and ‘R’. Both the data and the software to analyse it can be hosted on servers elsewhere, or in the ‘cloud’.

But not back then.

»When I had my first teaching job in Galway in Ireland, a lot of the students did not have any background in statistics and numerical methods. All statistical work was done on mainframe computers, if at all. But a colleague, Paul Ryan, and I developed software for microcomputers, something called the ‘PALSTAT’ package, for a large-format floppy disc. It was cumbersome, but the students could experiment, and the interface was quite easy to use. Still, this was just a standalone operation in the geology department. Then in the 1990s we started to migrate it to the MS-DOS platform, and people started using it on Amstrad computers.«

Free, but asked people to cite the paper

It was when he moved to Copenhagen in 1998 that things really took off for David Harper’s programme.

»I met Øyvind Hammer, a mathematician and postdoc who was researching palaeontology, on a visit to the Palaeontological Museum in Oslo. I suggested that we migrate it to the Windows platform. All of a sudden it morphed into a real monster package. It was really exciting!«

»But what should we do with it? Should we sell the software? In the end our decision was to make it free, but we wanted people to cite our paper. So we published it for free, at the same time asking people to cite the paper if they used the package in their work,« David Harper explains.

So this is where most of the 18,786 citations come from.

The original idea for the package was his, but his colleague Øyvind Hammer was »the brains behind expanding the scope of the software and its programming, and together we developed the textbook Paleontological Data Analysis (2006),« he says.

As a geologist, David Harper has focussed on fossil brachiopods, animals with shells on their upper and lower surfaces that lived attached to the seafloor. They are abundant as fossils and were very common during the Paleozoic era, 545-248 million years ago.

Lasting legacy

But brachiopods aside, David Harper still sees the software package as one of his most lasting legacies.

»It is strange in a way, because I have written many books, textbooks and papers, but some people are rather dismissive and say: It is only a software package and not fundamental science. But it has helped students and colleagues start and evolve their own research programmes using numerical and statistical methods in the field, so I think it has been of real value.«

David A. T. Harper is emeritus professor at the University of Durham in England and chair of the International Commission on Stratigraphy, which has the last say in defining the Earth’s geological time periods.

He now lives just across the border in Scotland.