MSc-BSc theses

Comparative gene expression analysis shows remarkable differences in Ewing Sarcoma derived mesenchymal stem cells based on single cell data and bulk data

Daniela Wüller, [MSc] 2022

Abstract
The main objective of this work was to compare gene expression results from single-cell samples with those from bulk samples of the malignant bone tumor Ewing's sarcoma, in order to estimate how the technological difference affects the results. The comparisons were based on a multitude of gene expression analyses and on the intersection of their respective result gene sets. The downstream analysis attempted to categorize the biological relevance of these gene sets for Ewing's sarcoma in order to derive a good estimate of the differences between the two technologies.
Among the most significant results of the different analyses, both comparisons yielded genes and pathways that could be associated with Ewing sarcoma (ES), among other things with osteoclastic differentiation. These genes were considered in detail. There were also many significant cancer genes, which indicates changed cell behaviour in the cells considered.
The comparison between the single-cell and bulk samples, obtained by intersecting the gene expression analyses, showed several agreements. However, this comparison also made clear that there is a remarkable difference between samples derived by the different techniques. Importantly, the single-cell analysis alone yielded more findings related to the formation of Ewing sarcoma among the most significant genes and pathways.
Overall, it can be concluded that the comparison of single-cell samples and bulk-derived samples showed some similarities but also strong differences, especially for well-known cancer genes and their pathways known to be associated with Ewing sarcoma. For a better and less biased comparison, it would be advisable to compare samples derived from the same cell types to reduce the differences between the samples. This future task would further improve insight into the differences between single-cell and bulk measurements, especially in the context of Ewing sarcoma biology.
Supervisors: Prof.Dr.E.Korsching, Prof.Dr.M.Neuhäuser (Dean, University of Applied Sciences Koblenz-Rhein Ahr Campus)
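
The intersection of result gene sets described in the abstract above can be illustrated with a minimal Python sketch. It is not the thesis code: the function name `compare_gene_sets`, the use of the Jaccard index as an overlap measure, and the placeholder gene names are all assumptions made for illustration.

```python
def compare_gene_sets(single_cell, bulk):
    """Split two result gene lists into shared and technology-specific genes."""
    sc, bk = set(single_cell), set(bulk)
    shared = sc & bk
    union = sc | bk
    # Jaccard index: shared genes relative to all genes found by either technology.
    jaccard = len(shared) / len(union) if union else 0.0
    return {
        "shared": sorted(shared),
        "single_cell_only": sorted(sc - bk),
        "bulk_only": sorted(bk - sc),
        "jaccard": jaccard,
    }


# Hypothetical placeholder gene lists, not results from the thesis.
sc_hits = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
bulk_hits = ["GENE_C", "GENE_D", "GENE_E"]

result = compare_gene_sets(sc_hits, bulk_hits)
print(result["shared"], result["single_cell_only"], result["bulk_only"])
print(round(result["jaccard"], 2))
```

A low Jaccard index together with long technology-specific lists would correspond to the "strong differences" the abstract reports. The shared list corresponds to the agreements.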


Fusion genes and breakpoints of Ewing Sarcoma - comparing tumor driving Ewing Sarcoma cells with MSC and huFib as external controls

Nadine Urbanek, [BSc] 2018

Abstract
Ewing sarcoma (ES) is a malignant tumour of the bone and soft tissue, occurring mostly in childhood and adolescence. In general, ES cases are diagnosed at a very late stage of progression, which complicates successful treatment. Personalized therapy options are becoming increasingly important, because not every patient responds equally well to the few available therapeutic options. ES is characterized mainly by a translocation between a member of the TET family (TLS/FUS, EWSR1, TAF15) and a member of the ETS (E26 transformation-specific) transcription factor family, most often resulting in the fusion EWSR1-FLI1 or EWSR1-ERG. A more in-depth analysis of the molecular reaction networks and the genomic organisation around detected fusion genes often exposes a patient-specific situation. The goal of this study is to characterize the genomic structure of the ES cell line CADO-ES1 in more detail at the mRNA and genomic DNA level and to analyse whether the transcriptomic products reflect genomic alterations. The central hypothesis of this work is that there might be more genomic alterations beyond the known fusion gene.
The experimental design of this study includes several work packages, all based on the model ES cell line CADO-ES1 containing the fusion gene EWSR1-ERG, which was confirmed by PCR studies in the local laboratory. First, the quality of the sequencing data was analysed by determining the read size and the quantity and quality of the reads. The second step was to gather suitable analysis tools, mainly to search for further fusion genes and comparable genomic alterations. The individual tools were tested first with each program's own test data files, then with randomly sampled files of the dataset, then with an entire library, and finally with the whole dataset. When errors were encountered, troubleshooting was carried out to solve the problems where possible. In the third step, the quality of the results of the whole-dataset runs was analysed.
The statistical analysis reveals that the data are of good quality and contain on average only 2.5 % undefinable bases. For most sites the average coverage was sufficient. EricScript and STAR-Fusion can analyse the entire mRNA dataset. A comparison of the results of both tools revealed STAR-Fusion to be the more promising tool. STAR-Fusion also finds more than one fusion gene in a sample: EWSR1-ERG and FUS-FEV. This finding could be evidence for the hypothesis of this work, but it needs to be reconstructed and verified. deFuse, which worked successfully in the past, and FACTERA, the tool intended for DNA data, are still problematic and require further effort.
Supervisors: PD.Dr.E.Korsching, Prof.Dr.S.Perrey (University of Applied Sciences Westfälische Hochschule-Recklinghausen)
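
The first work package in the abstract above (read size, read quantity, read quality, share of undefinable bases) can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the thesis: the function name `fastq_stats` is hypothetical, and it assumes simple four-line FASTQ records with Phred+33 quality encoding. The two demo reads are invented.

```python
import io


def fastq_stats(handle):
    """Summarise a FASTQ stream: read count, mean length, N fraction, mean Phred score."""
    n_reads = n_bases = n_undefined = qual_sum = 0
    while True:
        header = handle.readline()
        if not header.strip():
            break  # end of file (or trailing blank line)
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        qual = handle.readline().strip()
        n_reads += 1
        n_bases += len(seq)
        n_undefined += seq.upper().count("N")  # 'N' marks an undefinable base
        qual_sum += sum(ord(c) - 33 for c in qual)  # assumes Phred+33 encoding
    return {
        "reads": n_reads,
        "mean_length": n_bases / n_reads if n_reads else 0.0,
        "n_fraction": n_undefined / n_bases if n_bases else 0.0,
        "mean_quality": qual_sum / n_bases if n_bases else 0.0,
    }


# Two invented reads for demonstration; real data would be opened from a file.
demo = io.StringIO("@read1\nACGTN\n+\nIIII#\n@read2\nACGTA\n+\nIIIII\n")
stats = fastq_stats(demo)
print(stats)
```

In this sketch the `n_fraction` value is the quantity comparable to the percentage of undefinable bases reported in the abstract.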


Dependency structures in protein expression data of invasive breast cancer

Florian Boecker, [BSc] 2012

Abstract
In women, breast cancer is the most common cancer and the leading cause of cancer death. Cancer is a pathogenic process with a high level of complexity. The application of high-throughput technologies to match this complexity has led to the recognition of several molecular subtypes of breast cancer with different clinical implications. Biomarkers are reliably measurable biological features that allow the distinction of these subtypes. Good biomarkers allow a reproducible diagnosis of particular cases and the prediction of their outcome and treatment response, while their measurement methods remain economically competitive. With the aid of high-throughput technologies, a plethora of potential molecular biomarkers has already been identified, but as of today only a few have reached clinical practice. There is still a need for further validation of existing biomarkers and for the discovery of novel, clinically applicable ones.
In this project we use a combinatorial approach to search for interrelations in protein expression profiles in two sets of invasive breast cancer tissue samples. These profiles of several established and upcoming biomarkers were measured with tissue microarrays. To avoid assumptions, the algorithm analyses all possible combinations of two biomarker expression profile partitions. The combinations are evaluated by performing linear regressions of similarity measures between the expression profiles. Because of the combinatorial nature of the procedure, the number of biomarkers that can be incorporated simultaneously is limited by the available computing power. To use existing computing power more efficiently and to make more of it available, we migrated a previous implementation to Fortran and made use of parallel computing. Additionally, we present here a partially new approach that omits parts of the combinatorial space by following a gradient towards the result.
Aided by a substantial runtime reduction, we were able to find two groups of biomarkers in both data sets: one group that the literature links to a mild type of breast cancer and one that it links to an aggressive form. This agreement with the literature and the comparison with resampled data validate the approach, and the similarity of the results in both data sets shows the robustness of our method. The new approach was able to reduce the runtime further and gave insight into the extremes of the combinatorial space. It serves as a good starting point for further testing and optimisation.
Supervisors: PD.Dr.E.Korsching, Prof.Dr.A.Zielesny, Prof.Dr.S.Perrey (University of Applied Sciences Westfälische Hochschule-Recklinghausen)
Thesis is published in CANCER INFORMATICS. 2016;15:143-149.
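
The combinatorial search described in the abstract above can be illustrated with a small Python sketch. It is one possible reading of the method, not the thesis implementation (which was written in Fortran and parallelised). The sketch assumes that a "combination" is a split of the biomarkers into two disjoint groups, that the similarity measure is the Euclidean distance between samples, and that a split is scored by the R² of a simple linear regression between the two sets of distances. All function names and the marker values are hypothetical.

```python
from itertools import combinations
from math import dist


def bipartitions(markers):
    """Yield every split of `markers` into two non-empty, disjoint groups, each split once."""
    first, rest = markers[0], markers[1:]
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            group_a = (first,) + extra
            group_b = tuple(m for m in rest if m not in extra)
            if group_b:  # skip the split that leaves the second group empty
                yield group_a, group_b


def pairwise_distances(profiles, group):
    """Euclidean distance between every pair of samples, restricted to the markers in `group`."""
    points = [[sample[m] for m in group] for sample in profiles]
    return [dist(p, q) for p, q in combinations(points, 2)]


def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return 0.0 if sxx == 0 or syy == 0 else sxy ** 2 / (sxx * syy)


def rank_partitions(profiles, markers):
    """Score every bipartition and return them sorted from strongest to weakest relation."""
    scored = []
    for a, b in bipartitions(markers):
        score = r_squared(pairwise_distances(profiles, a), pairwise_distances(profiles, b))
        scored.append((score, a, b))
    return sorted(scored, reverse=True)


# Invented expression values for four samples and four hypothetical markers.
markers = ["M1", "M2", "M3", "M4"]
profiles = [
    {"M1": 1.0, "M2": 2.0, "M3": 3.0, "M4": 6.0},
    {"M1": 2.0, "M2": 4.0, "M3": 1.0, "M4": 2.0},
    {"M1": 4.0, "M2": 8.0, "M3": 5.0, "M4": 10.0},
    {"M1": 7.0, "M2": 14.0, "M3": 2.0, "M4": 4.0},
]
for score, a, b in rank_partitions(profiles, markers)[:3]:
    print(round(score, 3), a, b)
```

Under this reading, n biomarkers give 2^(n-1) - 1 splits, so each additional biomarker roughly doubles the work. That growth is the limit on simultaneously incorporated biomarkers that the abstract attributes to the available computing power.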