This page describes aspects of the FARMS algorithm from a user's perspective. I do not discuss the underlying algorithms, but focus on some appealing characteristics of this technique.

The paper describing FARMS has been published in Bioinformatics:

FARMS as a summarization technique

A summarization technique typically averages the individual probes in a probe set in some way to generate a single intensity value per probe set. The idea is that some probes might not have performed well, so it is desirable to calculate a median or otherwise discard the measurements of such "misbehaving" probes. As FARMS contains a calculation step that focuses on the signal, it appears to be very good at removing noise from the data. The following graphs show data for an example gene across multiple cell lines measured in triplicate, summarized once with GC-RMA and once with FARMS. The replicates seem much more reproducible after FARMS summarization than after GC-RMA.

[Figure: Summarization for a given probe set using GC-RMA]

[Figure: Summarization for a given probe set using FARMS]
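
To make the general idea of summarization concrete, here is a minimal Python sketch of the simplest approach mentioned above: taking the median across the probes of a probe set. This is not the FARMS algorithm (nor GC-RMA); the function name and the intensity values are made up for illustration.

```python
from statistics import median

def summarize_probe_set(probe_intensities):
    """Collapse per-probe log2 intensities into one value per sample.

    probe_intensities: list of probes, each a list with one value per sample.
    Returns a list with the median across probes for every sample.
    """
    n_samples = len(probe_intensities[0])
    return [
        median(probe[s] for probe in probe_intensities)
        for s in range(n_samples)
    ]

# Hypothetical probe set: 5 probes measured in 3 samples.
probe_set = [
    [7.1, 7.0, 9.2],
    [7.3, 7.2, 9.0],
    [6.9, 7.1, 9.1],
    [7.0, 6.8, 9.3],
    [12.5, 3.1, 4.0],  # a "misbehaving" probe
]

summary = summarize_probe_set(probe_set)
print(summary)  # [7.1, 7.0, 9.1]
```

The median ignores the misbehaving probe in this toy case, but it treats all remaining probes alike and has no model of what is signal and what is noise.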


A major problem in microarray data analysis is that a large number of genes is measured while the experiment typically contains only very few samples. This often results in a lot of "interesting" genes, an unknown proportion of which show this "interesting" behavior just by chance (false positive findings). When searching for biomarkers, one data analysis step that is often performed is called feature extraction. This step attempts to retain the genes / features that are most likely to be useful in constructing a set of marker candidates for classifying new samples.
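
The scale of the false positive problem can be illustrated with a small simulation. The sketch below generates pure noise for many genes in two groups of three samples and counts how many genes nevertheless pass a simple difference cutoff. All numbers (10,000 genes, noise standard deviation of 1 on the log2 scale, cutoff of 1 log2 unit) are assumptions chosen for illustration, not values from a real experiment.

```python
import random

def count_chance_hits(n_genes=10000, n_per_group=3, min_diff=1.0, seed=0):
    """Count genes whose group means differ by at least min_diff,
    although every value is drawn from the same noise distribution."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        diff = sum(group_b) / n_per_group - sum(group_a) / n_per_group
        if abs(diff) >= min_diff:
            hits += 1
    return hits

hits = count_chance_hits()
print(f"{hits} of 10000 pure-noise genes look 'interesting'")
```

Under these assumptions roughly one gene in five passes the cutoff although no gene is truly changed, which is why a further filtering step is needed.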

An advantage of the Affymetrix platform is that multiple probes with different sequences measure a given transcript. In our paper we use this feature to identify the genes that show the "interesting" alteration in gene expression consistently across all of their probes, thereby providing the researcher with an objective tool to eliminate findings that have appeared just by chance. We call this filtering technique "I/NI-calls", which stands for making a call whether a gene is "informative" (it truly shows a potentially "interesting" behavior) or "non-informative" for a given experiment.
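
The intuition of requiring agreement between probes can be sketched with a deliberately simplified stand-in. The code below is not the I/NI-call computation, which is model-based and implemented in the FARMS package. It only checks whether every probe of a probe set shifts in the same direction between two sample groups. The function name, the cutoff and the data are hypothetical.

```python
from statistics import mean

def probes_agree(probe_values, group_a, group_b, min_diff=1.0):
    """Return True if every probe changes by at least min_diff
    in the same direction between the two sample groups.

    probe_values: list of probes, each a list with one value per sample.
    group_a, group_b: lists of sample indices.
    """
    diffs = [
        mean(p[i] for i in group_b) - mean(p[i] for i in group_a)
        for p in probe_values
    ]
    return all(d >= min_diff for d in diffs) or all(d <= -min_diff for d in diffs)

group_a, group_b = [0, 1, 2], [3, 4, 5]

# All three probes rise by about 2 log2 units: consistent change.
consistent_gene = [
    [7.0, 7.0, 7.0, 9.0, 9.0, 9.0],
    [6.0, 6.5, 6.5, 8.5, 8.5, 8.0],
    [8.0, 8.0, 8.0, 10.0, 10.5, 10.0],
]

# Only one probe rises, the others stay flat: likely a chance finding.
chance_gene = [
    [7.0, 7.0, 7.0, 9.0, 9.0, 9.0],
    [6.0, 6.5, 6.5, 6.5, 6.0, 6.5],
    [8.0, 8.0, 8.0, 8.0, 8.0, 8.0],
]

print(probes_agree(consistent_gene, group_a, group_b))  # True
print(probes_agree(chance_gene, group_a, group_b))      # False
```

Unlike this toy check, the I/NI-call described in the paper does not need the sample groups to be specified; see the FARMS package for the actual function.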

The paper describing this approach has been published in Bioinformatics:

The function for calculating I/NI-calls is implemented in the FARMS package. Please visit the FARMS page at the Institute for Bioinformatics of the Johannes Kepler University Linz.