CGH

Description

The parser converts ASCII raw data exported from the Applied Imaging system v.3.01 (from 1995) into a vector-like format that can be imported easily into, e.g., R or LibreOffice.
Optionally, the centromere and non-evaluable regions can be blanked during the conversion.

See the ODF table: measurements and their mapping to the chromosomal regions (24 KB)

Some features of the program:
-creating one output file for all experiments (or one file per experiment)
-averaging up to two measurements from one 'slide'
-replacing the decimal point with a comma (European format)
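
The averaging and decimal-comma features can be illustrated with a minimal Python sketch. It is not the program's VB.net code: the names `mean_of_measurements` and `to_european` are hypothetical, and the six decimal places are assumed from the sample output. How the program treats a chromosome with only one measurement is not stated; the sketch simply passes it through.

```python
def mean_of_measurements(first, second=None):
    """Average two equally long measurement lists element-wise.

    If only one measurement exists, it is returned unchanged (an assumption).
    """
    if second is None:
        return list(first)
    if len(first) != len(second):
        raise ValueError("measurements differ in length")
    return [(a + b) / 2.0 for a, b in zip(first, second)]


def to_european(value, digits=6):
    """Format a number with a decimal comma instead of a decimal point."""
    return f"{value:.{digits}f}".replace(".", ",")


# Two short measurement fragments taken from the input sample.
averaged = mean_of_measurements([0.928207, 1.012871], [0.892525, 0.986438])
print([to_european(v) for v in averaged])
```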

Install a MONO runtime on Debian/Ubuntu

In short:

$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF
$ echo "deb https://download.mono-project.com/repo/debian wheezy main" | sudo tee /etc/apt/sources.list.d/mono-xamarin.list
$ sudo apt-get update
$ sudo apt-get install mono-complete

For development you additionally need:

$ sudo apt-get install monodevelop mono-vbnc

If problems arise, consult the MONO web site.

 

Formats

Input

CGH ratio data in blocks, eight values per row; some chromosomes may be missing.
The data come from a folder such as: /.../3651/slide0/cell9/cgh.rats

2 182
0.928207 0.928207 0.928207 1.012871 1.012871 1.040374 1.040374 1.040374
.....
1.073986 1.036183 1.036183 1.123358 1.123358 1.431219

2 182
0.892525 0.892525 0.892525 0.986438 0.986438 0.994899 0.994899 0.960951
.....
1.049657 1.049657 1.200850 1.200850 1.389061 1.000000

3 154
1.023162 1.023162 1.023162 0.993290 0.993290 0.993290 1.001561 1.001561
.....
0.979005 0.979005 0.916921 0.916921 0.832465 0.832465 0.832465 0.781722
0.781722 0.967822

3 154
0.904337 0.904337 0.904337 1.000224 1.000224 0.990892 0.990892 0.946762
.....
1.074098 1.074098 1.078660 1.078660 1.231315 1.231315 1.231315 1.304888
1.304888 1.406623

4 144
1.558542 1.558542 1.558542 1.463850 1.463850 1.463850 1.260583 1.260583
.....
0.731537
.....
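
Judging from the sample, each block starts with a header line holding the chromosome number and the number of ratio values that follow (the 182 values of chromosome 2 fill 22 full rows plus one row of six). Under that assumption, reading a `cgh.rats` file could be sketched in Python as follows. `parse_rats` is a hypothetical name, not the program's own routine, and the chromosome label is kept as text because the sample does not show how X and Y are encoded.

```python
def parse_rats(text):
    """Return a list of (chromosome_label, values) blocks in file order.

    Assumes each block is: <label> <count>, followed by <count> numbers.
    The row layout (eight per line) does not matter to this reader.
    """
    tokens = text.split()
    blocks = []
    pos = 0
    while pos < len(tokens):
        if pos + 1 >= len(tokens):
            raise ValueError("incomplete block header")
        label = tokens[pos]
        count = int(tokens[pos + 1])
        pos += 2
        values = [float(t) for t in tokens[pos:pos + count]]
        if len(values) != count:
            raise ValueError(
                f"block {label}: expected {count} values, got {len(values)}"
            )
        blocks.append((label, values))
        pos += count
    return blocks


# Shortened, made-up input; real blocks hold far more values per chromosome.
sample = """\
2 3
0.928207 0.928207 1.012871

2 3
0.892525 0.892525 0.986438
"""
for label, values in parse_rats(sample):
    print(label, len(values), values[0])
```

Two consecutive blocks with the same label then correspond to the two measurements of one chromosome.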

Output

With averaging: rows 1..2099, containing the chromosomes 1, 2, 3, ... through X and Y in sequence.
e.g.:
1,166037;
1,166037;
.....
1,000000;
1,000000;

Chromosomes that were left out during data generation are filled with '0's in the vector.
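
The zero filling can be sketched in Python as follows. The number of values per chromosome must be known in advance; the sketch takes it as a parameter, and the example uses only the three lengths visible in the input sample (182, 154 and 144). `build_vector` and `lengths` are hypothetical names, not part of the program.

```python
def build_vector(measured, lengths):
    """Concatenate per-chromosome values in the order given by `lengths`.

    measured: dict mapping chromosome label -> list of ratio values
    lengths:  list of (chromosome label, expected number of values)
    Chromosomes absent from `measured` contribute zeros.
    """
    vector = []
    for label, expected in lengths:
        values = measured.get(label)
        if values is None:
            vector.extend([0.0] * expected)
        elif len(values) != expected:
            raise ValueError(f"chromosome {label}: expected {expected} values")
        else:
            vector.extend(values)
    return vector


# Lengths taken from the input sample; a full table would cover 1..22, X and Y.
lengths = [("2", 182), ("3", 154), ("4", 144)]
vector = build_vector({"2": [1.0] * 182, "4": [1.2] * 144}, lengths)
print(len(vector))   # 480
print(vector[182])   # 0.0, first position of the missing chromosome 3
```

Because every chromosome always occupies the same positions, vectors from different experiments stay aligned row by row.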

Without averaging: rows 1..4198, containing the chromosomes 1,1,2,2,3,3, ... through X,X and Y,Y.

Bulk files will have a header row (therefore 1..4199 rows).
e.g.:
c3651x08;c3651x09;
1,166037;0;
1,166037;0;
.....
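
For readers who want to load a bulk file outside R or LibreOffice, a minimal Python sketch follows. It assumes only what the example shows: semicolon-separated columns with a trailing semicolon, a header row of experiment names, and decimal commas. `read_bulk` is a hypothetical helper.

```python
def read_bulk(text):
    """Return {column name: list of floats} from bulk-file text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {}
    # The trailing ';' yields an empty last field, which is dropped here.
    names = [n for n in lines[0].split(";") if n]
    columns = {name: [] for name in names}
    for ln in lines[1:]:
        cells = ln.split(";")[:len(names)]
        if len(cells) != len(names):
            raise ValueError(f"unexpected number of cells in row: {ln!r}")
        for name, cell in zip(names, cells):
            # Convert the decimal comma back to a decimal point.
            columns[name].append(float(cell.replace(",", ".")))
    return columns


sample = "c3651x08;c3651x09;\n1,166037;0;\n1,166037;0;\n"
data = read_bulk(sample)
print(data["c3651x08"])   # [1.166037, 1.166037]
print(data["c3651x09"])   # [0.0, 0.0]
```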

 

Prerequisites

A MONO or dotnet runtime environment on LINUX, or a Windows OS with the dotnet runtime environment 1.1 or later.

 

Mode of operation

The program needs a source directory containing all the source data files (and nothing else) and
a target directory, which will hold the result and log files.
The system has been tested with up to 480 files in the 'master file' mode (see features). The 'single file' mode should be limited only by your hardware resources.

On a LINUX system start the binary in a terminal:

$ mono ./TransformCGH3.exe

and the program window will pop up.

 

Download

binary: TransformCGH3 (VB.net exe)

source code archive: ZIP (VB.net code)