ColorSpace and quality data to FASTQ translator [cq2ip33fq]
and
ColorSpace to BaseSpace translator [colorQ2baseQ]
Although some free tools on the web claim to translate the 'ColorSpace' and 'Quality' files of the next-generation sequencing machines from Thermo Fisher Scientific / Life Technologies / ABI (5500xl and derivatives) into the established and free 'FASTQ' data format, all existing tools fail in one aspect or another (e.g. solid2fastq.pl was introducing errors in 2015). We therefore developed two GPL2-licensed C++ tools.
Please cite this website if you use the tools or publish results obtained with them.
-
cq2ip33fq
[by Mallela & Korsching] - takes a colorspace FASTA file and a colorspace quality FASTA file. Both files need matching read names in the same order. The color calls are converted into basespace and the quality values into Phred33-conforming quality scores. The result should be a true Phred33 (Illumina) conforming FASTQ file.
-
colorQ2baseQ
[by Korsching] - takes a regular FASTQ file in which the second line of each record is in colorspace, while the fourth line already holds a true Phred33-conforming quality string. The colorspace is converted into basespace, and the (first) anchor base of the colorspace read and its corresponding quality score are removed, so each read becomes one position shorter. Here too, the result should be a true Phred33 (Illumina) conforming FASTQ file.
The C++ Boost libraries need to be installed on the system for this program to work (Debian: sudo apt install libboost-all-dev). The program was tested on Linux (Debian / Ubuntu 14.04 to 20.04) and should also work on newer LTS versions.
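To make the two conversions described above more concrete, here is a minimal Python sketch. It is not the code of cq2ip33fq or colorQ2baseQ, and all function names are hypothetical. It assumes the usual SOLiD two-base encoding (with A=0, C=1, G=2, T=3, the next base is the XOR of the previous base and the color), reports every base after an uncalled color ('.') as N, and clamps quality values to the range 0 to 93 before adding the Phred33 offset. How the actual tools treat those edge cases is not stated here, so treat these choices as assumptions.

```python
# Illustrative sketch only; NOT the implementation of cq2ip33fq / colorQ2baseQ.
BASES = "ACGT"


def decode_colors(anchor, colors):
    """Translate SOLiD color calls into bases, starting from the anchor base."""
    current = BASES.index(anchor)
    decoded = []
    for pos, color in enumerate(colors):
        if color not in "0123":
            # Assumption: after an uncalled color the remaining bases cannot
            # be decoded, so they are all reported as N.
            decoded.append("N" * (len(colors) - pos))
            break
        # With A=0, C=1, G=2, T=3 the next base is (current base XOR color).
        current ^= int(color)
        decoded.append(BASES[current])
    return "".join(decoded)


def phred33(scores):
    """Encode integer quality values as a Phred33 ASCII string."""
    # Assumption: values outside 0..93 (e.g. -1 for uncalled positions)
    # are clamped so that every character stays printable.
    return "".join(chr(min(max(q, 0), 93) + 33) for q in scores)


def csfasta_to_fastq(name, cs_read, qual_line):
    """Combine one colorspace read and its numeric qualities into FASTQ text."""
    seq = decode_colors(cs_read[0], cs_read[1:])
    qual = phred33(int(q) for q in qual_line.split())
    return f"@{name}\n{seq}\n+\n{qual}\n"


def colorspace_fastq_to_basespace(record):
    """Convert a 4-line colorspace FASTQ record to basespace.

    The anchor base and its quality score are dropped, so sequence and
    quality are each one position shorter than in the input.
    """
    header, cs_read, plus, qual = record
    seq = decode_colors(cs_read[0], cs_read[1:])
    return [header, seq, plus, qual[1:]]
```

For example, `csfasta_to_fastq("r1", "T0123", "0 10 40 -1")` yields a record with the sequence `TGAT` and the quality string `!+I!`, and `colorspace_fastq_to_basespace(["@r1", "T0123", "+", "!IIII"])` returns a record whose sequence and quality line are both four characters long.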
cq2ip33fq : the readme file [02.10.2015] ,
the binary [30.09.2020] ,
source code [30.09.2020].
Download an Eclipse workspace [30.09.2020].
Some test files
(in-house tests were run at a 0.5 TB scale).
colorQ2baseQ : the binary [30.09.2020] ,
source code [30.09.2020].
Download an Eclipse workspace [30.09.2020].
A minimal test file.
CGH data parser
The data parser transforms the exported ASCII raw data from an Applied Imaging system v.3.01 (from 1995)
into a vector-like format that can easily be imported into e.g. R or LibreOffice.
Optionally, the centromere and non-evaluable regions can be blanked during the conversion process.
Details / Download [14.10.2003]