Comparative gene expression analysis shows remarkable differences in Ewing Sarcoma derived mesenchymal stem cells based on single cell data and bulk data
Daniela Wüller, [MSc] 2022
Abstract
The main objective of this work was to compare gene expression results from single-cell samples with those from bulk samples
of the malignant bone tumor Ewing's sarcoma (ES), in order to estimate how the technological difference affects the results.
The comparisons were based on a multitude of gene expression analyses and on the intersection of their respective result gene sets.
The downstream analysis attempted to categorize the biological relevance of these gene sets for Ewing's sarcoma
in order to derive a good estimate of the differences between the two technologies.
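The intersection step described above can be sketched as follows. This is only a minimal illustration, assuming each analysis yields a plain list of significant gene identifiers. The function name `compare_gene_sets`, the Jaccard index as an overlap measure, and the placeholder gene names are assumptions for illustration and are not taken from the thesis.

```python
def compare_gene_sets(single_cell_genes, bulk_genes):
    """Return shared genes, technology-specific genes and the Jaccard index."""
    sc = set(single_cell_genes)
    bulk = set(bulk_genes)
    shared = sc & bulk
    union = sc | bulk
    # Jaccard index: size of the intersection divided by size of the union.
    jaccard = len(shared) / len(union) if union else 0.0
    return {
        "shared": sorted(shared),
        "single_cell_only": sorted(sc - bulk),
        "bulk_only": sorted(bulk - sc),
        "jaccard": jaccard,
    }


# Hypothetical placeholder identifiers, not results from the thesis.
result = compare_gene_sets(
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_B", "GENE_C", "GENE_D"],
)
print(result["shared"], result["jaccard"])
```

Reporting the technology-specific sets alongside the shared set makes it visible where the two techniques disagree, not only where they agree.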
Among the most significant results of the different analyses, there were some genes
and pathways in both comparisons that could be associated with ES, among others with
osteoclastic differentiation. These genes were considered in detail. There were also
many significant cancer genes, which indicates altered cell behaviour in
the cells considered.
The comparison between the single-cell and bulk samples, obtained by
intersecting the gene expression analyses, showed several points of agreement. However, this
comparison also made clear that there is a remarkable difference between
samples derived by the two techniques. Notably, the single-cell analysis alone yielded
more findings regarding the formation of Ewing sarcoma among the most significant genes and pathways.
Overall, it can be concluded that the comparison of single-cell
samples and bulk-derived samples showed some similarities but also strong differences,
especially for well-known cancer genes and their pathways known to be associated with Ewing sarcoma.
For a better and less biased comparison it would be advisable to compare samples that are
derived from the same cell types to reduce the differences between the samples. This
future task would further improve the insight into the differences between single-cell and bulk
measurements, especially in the context of Ewing sarcoma biology.
Supervisors: Prof.Dr.E.Korsching, Prof.Dr.M.Neuhäuser (Dean, University of Applied Sciences Koblenz-Rhein Ahr Campus)
Fusion genes and breakpoints of Ewing Sarcoma - comparing tumor driving Ewing Sarcoma cells with MSC and huFib as external controls
Nadine Urbanek, [BSc] 2018
Abstract
Ewing sarcoma (ES) is a malignant tumour of the bone and soft tissue, mostly
occurring in childhood and adolescence. ES is generally diagnosed at a very late
stage of progression, which complicates successful treatment. Personalized therapy options are
becoming increasingly important, because not every patient responds equally well to the few
available therapeutic options. ES is characterized mainly by a translocation between a member of
the TET family (TLS/FUS, EWSR1, TAF15) and a member of the ETS (E26 transformation-specific)
transcription factor family, most often resulting in the fusion EWSR1-FLI1 or EWSR1-ERG. A
more in-depth analysis of the molecular reaction networks and the genomic organisation around
detected fusion genes often reveals a patient-specific situation.
The goal of this study is to characterize the genomic structure of the ES cell line CADO-ES1
in more detail on the mRNA and genomic DNA level and to analyse whether the transcriptomic products
reflect genomic alterations. The central hypothesis of this work is that there might be more
genomic alterations beyond the known fusion gene.
The experimental design of this study includes several work packages, all based on
the model ES cell line CADO-ES1 containing the fusion gene EWSR1-ERG, which was
confirmed by PCR studies in the local laboratory. First, the quality of the sequencing data
was analysed by determining the read size and the quantity and quality of the reads. The second
step was to gather suitable analysis tools, mainly to search for further fusion genes and
comparable genomic alterations. The individual tools were tested first with each program's own
test data files, then with randomly sampled files of the dataset, then with an entire library
and finally with the whole dataset. When errors were encountered, troubleshooting was carried
out to solve the problems where possible. In the third step, the quality of the results of the
whole-dataset runs was analysed.
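The first work package (read size, quantity and quality of the reads) can be sketched in a few lines. This is a minimal illustration and not the pipeline used in the thesis. It assumes unwrapped four-line FASTQ records and Phred+33 quality encoding, and the function name `fastq_stats` and the two example reads are hypothetical.

```python
import io


def fastq_stats(handle):
    """Summarise a FASTQ stream (assumes 4-line records, Phred+33 qualities)."""
    n_reads = n_bases = n_undefined = qual_sum = 0
    while True:
        header = handle.readline()
        if not header:           # end of file
            break
        if not header.strip():   # skip stray blank lines
            continue
        seq = handle.readline().strip()
        handle.readline()        # '+' separator line
        qual = handle.readline().strip()
        n_reads += 1
        n_bases += len(seq)
        n_undefined += seq.upper().count("N")        # undefined base calls
        qual_sum += sum(ord(c) - 33 for c in qual)   # Phred+33 decoding
    return {
        "reads": n_reads,
        "mean_read_length": n_bases / n_reads if n_reads else 0.0,
        "fraction_undefined": n_undefined / n_bases if n_bases else 0.0,
        "mean_quality": qual_sum / n_bases if n_bases else 0.0,
    }


# Two made-up reads, used only to exercise the function.
example = io.StringIO("@read1\nACGTN\n+\nIIIII\n@read2\nACG\n+\nIII\n")
stats = fastq_stats(example)
print(stats)
```

For real files, the same function can be applied to an opened file handle, for example one obtained from `gzip.open(path, "rt")` for compressed FASTQ.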
The statistical analysis reveals that the data are of good quality and contain on
average only 2.5% undefinable bases. For most sites the average coverage was sufficient.
EricScript and STAR-Fusion can analyse the entire mRNA dataset. A comparison of the results
of both tools revealed STAR-Fusion as the more promising tool. STAR-Fusion also finds more
than one fusion gene in a sample: EWSR1-ERG and FUS-FEV. This finding could be
evidence for the hypothesis of this work, but needs to be reconstructed and verified. deFuse,
which worked successfully in the past, and FACTERA, the tool for genomic DNA, are still
problematic and require further efforts.
Supervisors: PD.Dr.E.Korsching, Prof.Dr.S.Perrey (University of Applied Sciences Westfälische Hochschule-Recklinghausen)
Dependency structures in protein expression data of invasive breast cancer
Florian Boecker, [BSc] 2012
Abstract
In women, breast cancer is the most common cancer
and the leading cause of cancer death. Cancer is a pathogenic
process with a high level of complexity. The application of high-throughput
technologies to match this complexity has led to the recognition
of several molecular subtypes of breast cancer with different clinical
implications. Biomarkers are reliably measurable biological features
that allow the distinction of these subtypes. Good biomarkers allow a
reproducible diagnosis of particular cases and the prediction of their
outcome and treatment response, while remaining economically
competitive in terms of their particular measurement methods. With the
aid of high-throughput technologies, a plethora of potential molecular
biomarkers has already been identified, but as of today only
a few have reached clinical practice. There is still a need for
further validation of existing biomarkers and for the discovery of novel,
clinically applicable ones.
In this project we use a combinatorial approach to search
for interrelations in protein expression profiles in two sets of invasive
breast cancer tissue samples. These profiles of several established and
upcoming biomarkers were measured with tissue microarrays. To
avoid assumptions, the algorithm analyses all possible combinations
of two biomarker expression profile partitions. The combinations are
evaluated by performing linear regressions of similarity measures
between the expression profiles. Because of the combinatorial nature of
the procedure, the number of biomarkers that can be incorporated
simultaneously is limited by the available computing power.
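Why computing power becomes the limiting factor can be illustrated with a small sketch. It rests on an assumption that the abstract does not confirm, namely that a combination is a split of the biomarker set into two non-empty groups. The function names and the marker labels `M1` to `M4` are hypothetical, and the regression-based evaluation of each split is not shown.

```python
from itertools import combinations


def bipartitions(markers):
    """Yield every split of `markers` into two non-empty, unordered groups."""
    markers = list(markers)
    if len(markers) < 2:
        return
    first, rest = markers[0], markers[1:]
    # Keeping the first marker in group A avoids yielding each split twice.
    for size in range(len(rest) + 1):
        for chosen in combinations(rest, size):
            group_a = (first, *chosen)
            group_b = tuple(m for m in rest if m not in chosen)
            if group_b:  # skip the split where group B would be empty
                yield group_a, group_b


def count_bipartitions(n):
    """Closed-form number of such splits for n distinct markers."""
    return 2 ** (n - 1) - 1 if n >= 2 else 0


splits = list(bipartitions(["M1", "M2", "M3", "M4"]))
print(len(splits), count_bipartitions(4))
for n in (10, 20, 30):
    print(n, count_bipartitions(n))
```

Under this assumption the number of splits roughly doubles with every additional biomarker, which is consistent with the runtime concerns described here.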
To use existing computing power more efficiently and to make more
of it available, we migrated a previous implementation to Fortran and
made use of parallel computing. Additionally, we present here a
partially new approach that omits parts of the combinatorial space by
following a gradient towards the result.
Aided by a substantial runtime reduction, we were able to find two groups of biomarkers in both data sets:
one group of biomarkers that is linked in the literature to a mild type
and one that is linked to an aggressive form of breast cancer.
This agreement with the literature and the comparison to
resampled data validate the approach, and the similarity of the
results in both data sets shows the robustness of our method. The new
approach to the procedure was able to reduce the runtime further and
gave insight into the extremes of the combinatorial space. It serves as a
good starting point for further testing and optimisation.
Supervisors: PD.Dr.E.Korsching, Prof.Dr.A.Zielesny, Prof.Dr.S.Perrey (University of Applied Sciences Westfälische Hochschule-Recklinghausen)
Thesis is published in
CANCER INFORMATICS. 2016;15:143-149.