Mitochondria are often introduced as the ATP-producing “powerhouses” of eukaryotic (nucleus-bearing) cells, but they fulfill essential roles in a number of other cell processes, including biosyntheses, programmed cell death, and the assembly of iron-sulfur clusters, to name just a few. Mitochondria are always surrounded by two membranes, and most mitochondria, but not all, contain their own DNA, which is an evolutionarily reduced bacterial chromosome. Since the early 1900s, mitochondria have been suspected to have arisen through endosymbiosis, in which one cell comes to live within another. By the 1970s, the existence of DNA in mitochondria and the overall similarity between mitochondrial ATP-producing biochemistry and that of free-living bacteria provided strong evidence in favor of that view. There is no longer any doubt that mitochondria arose through endosymbiosis, but there is currently a plurality of ideas about the kind of bacterium the ancestral mitochondrial endosymbiont was, the nature of the host that acquired the endosymbiont, and the nature of the initial symbiotic interactions that associated the host and the endosymbiont in their fateful encounter.
The cells in your body are like computer software: they're "programmed" to carry out specific functions at specific times. If we can better understand this process, we could unlock the ability to reprogram cells ourselves, says computational biologist Sara-Jane Dunn. In a talk from the cutting edge of science, she explains how her team is studying embryonic stem cells to gain a new understanding of the biological programs that power life, and to develop "living software" that could transform medicine, agriculture and energy.
We present a systematic analysis of the effects of synchronizing a large-scale, deeply characterized, multi-omic dataset to the current human reference genome, using updated software, pipelines, and annotations. For each of 5 molecular data platforms in The Cancer Genome Atlas (TCGA) - mRNA and miRNA …
Sequence alignment data stored in BAM files is often ordered by coordinate (the ID of the reference sequence plus the position on that sequence where the fragment was mapped), as this simplifies the extraction of variants between the mapped data and the reference, or of variants within the mapped data. In this order, paired reads are usually separated in the file, which complicates other applications, such as duplicate marking or conversion to the FastQ format, that require access to the full information of each pair. In this paper we introduce biobambam, a set of tools based on the efficient collation of alignments in BAM files by read name. The collation algorithm avoids time- and space-consuming sorting of alignments by read name wherever this is possible without using more than a specified amount of main memory. Using this algorithm, tasks such as duplicate marking in BAM files and conversion of BAM files to the FastQ format can be performed very efficiently with limited resources. We also make the collation algorithm available as an API for other projects; this API is part of the libmaus package. Compared with previous approaches to problems involving the collation of alignments by read name, such as existing BAM to FastQ or duplicate marking utilities, our approach can often perform an equivalent task more efficiently in terms of required main memory and run time. Our BAM to FastQ conversion is faster than all widely known alternatives, including Picard and bamUtil. Our duplicate marking is about as fast as the closest competitor, bamUtil, for small data sets and faster than all known alternatives on large and complex data sets.
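The general idea of collating by read name without a full sort can be illustrated with a small sketch. This is not the biobambam or libmaus implementation, whose details the abstract does not give; it is a simplified illustration under stated assumptions. The `Read` record, the `collate_pairs` function, and the `max_pending` limit are hypothetical names chosen for this example, and the "spill" area is an in-memory list standing in for what a real tool would presumably keep on disk.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple


class Read(NamedTuple):
    """Hypothetical minimal alignment record (not a real BAM record)."""
    name: str  # read name shared by both mates of a pair
    mate: int  # 1 for the first mate, 2 for the second
    ref: str   # reference sequence ID
    pos: int   # mapping position on the reference


def collate_pairs(
    reads: Iterable[Read], max_pending: int = 1000
) -> Iterator[Tuple[Read, Read]]:
    """Yield (mate 1, mate 2) pairs from reads given in any order.

    Unmatched reads wait in a hash table keyed by read name. Only when
    the table would exceed max_pending entries are reads spilled and
    later sorted by name, so sorting is a fallback rather than the rule.
    """
    pending: Dict[str, Read] = {}
    spilled: List[Read] = []  # assumption: stands in for on-disk storage

    for read in reads:
        partner = pending.pop(read.name, None)
        if partner is not None:
            # Both mates have been seen: emit the pair immediately.
            first, second = sorted((partner, read), key=lambda r: r.mate)
            yield first, second
            continue
        if len(pending) >= max_pending:
            # Memory limit reached: move waiting reads to the spill area.
            spilled.extend(pending.values())
            pending.clear()
        pending[read.name] = read

    # Fallback: sort whatever never met its mate in the table and pair
    # neighbours with the same name; reads without a mate are dropped.
    spilled.extend(pending.values())
    spilled.sort(key=lambda r: (r.name, r.mate))
    i = 0
    while i + 1 < len(spilled):
        if spilled[i].name == spilled[i + 1].name:
            yield spilled[i], spilled[i + 1]
            i += 2
        else:
            i += 1
```

With coordinate-sorted input in which mates lie close together, most pairs meet in the hash table and the final sort touches only a small remainder. Lowering `max_pending` trades memory for more fallback sorting. How a real tool handles unpaired reads, secondary alignments, or on-disk spilling is outside the scope of this sketch.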