Saturday, March 17th, 2012 by hinrich

Modern biological technologies

No, I am not talking about the financial crisis. Rather, I would like to point to the increasing importance of default settings in data processing and data analysis procedures. We are continuously creating novel technologies that share a common characteristic: they generate ever more data per analyzed sample.

At the same time, it is becoming apparent that we should not look at only one data type at a time, but rather at the various data linked to a given project in order to identify relevant biological themes. Some people refer to this as data integration or systems biology.

As these approaches become more common, there will be pressure to increase throughput by reducing the time spent on preparing the data. This is likely to result in cutting corners by relying on the defaults proposed by technology providers or software vendors.

We have seen in the past that this can be tricky, as technology providers often have their core expertise in the technology itself and not necessarily in data preprocessing or data normalization.

Therefore I find it important to underline that we need to spend time defining suitable default parameter settings. People who develop algorithms need to reserve time to look at how their algorithms are applied and to propose the settings that ought to be used for a given application.

In other words: yes, we do need research into new algorithms and new technologies, and yes, we do need to properly document our algorithms. But we also need time to research the right default settings for an application. Scientists will come to rely on them. If we do not provide good ones, we may unintentionally underestimate the power of a technology simply because people cannot make proper use of it.

