Bioinformatics | Vince Grolmusz

Lecture: Friday 10:15-11:45, South Building (Déli épület) 3-114

First class: February 14, 2025.


Practice session: http://pitgroup.org/bioinfogyak/

Lecturers: Grolmusz Vince (grolmusz@pitgroup.org), Varga Bálint

Recommended books (not required for passing the exam):

Metagenomics: The New Science of Metagenomics: Revealing the Secrets of Our Microbial Planet (The National Academies Press, 2007, free)

Bioinformatics: Lengauer: Bioinformatics – From Genomes to Therapies, Vol. 1-3 (Wiley, 2007)

Isaev A.: Introduction to Mathematical Methods in Bioinformatics (Springer, 2006)

Lecture materials (username and password required)

Syllabus:

What is covered in this course, and why? Brief introduction. Genome sequencing techniques: Sanger, 454/Roche, Illumina/Solexa. The FASTA and FASTQ formats. Introduction to metagenomics. Microbial diversity.
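As a small illustration of the FASTQ format, here is a minimal parsing sketch. It is not course material: the function name `parse_fastq` is hypothetical, and the code assumes four-line records (sequence not wrapped across lines) with Phred+33 quality encoding.

```python
from typing import Iterator, List, Tuple


def parse_fastq(text: str) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, phred_scores) for each four-line FASTQ record.

    Assumptions: one sequence line per record and Phred+33 quality encoding.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality strings differ in length")
        # Each quality character encodes a Phred score as ASCII code minus 33.
        yield header[1:], seq, [ord(c) - 33 for c in qual]


example = "@r1 sample\nACGT\n+\nII5!\n"
for read_id, seq, scores in parse_fastq(example):
    print(read_id, seq, scores)
```

A FASTA record is simpler: a `>` header line followed by one or more sequence lines, with no quality information.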

Genome sequencing

Levels of sequence assembly: chromosomes, scaffolds and contigs, SRA or trace. Re-sequencing vs. de novo sequencing. Assembly of short reads.

Strategies: Hashing. Hashing in parallel (example with the element distinctness problem).

The Burrows-Wheeler transform. Easy substring search, computing its inverse. Sequence assembly with graphs. The Hamiltonian cycle reduction. The Eulerian path reduction. De Bruijn graphs.
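The Burrows-Wheeler transform and its inverse can be sketched in a few lines. This is a naive illustration that sorts all rotations, not the efficient suffix-array construction used in practice; the function names and the `$` sentinel are assumptions made for the example.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (naive, for illustration)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The original text is the row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


print(bwt("banana"))               # annb$aa
print(inverse_bwt(bwt("banana")))  # banana
```

The sentinel must sort before every other character, which holds for `$` against ASCII letters.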

Sequence analysis

Strings, basic problems related to string search: pattern matching, alignments.

Two approaches for pattern matching: preprocessing the text or the searched pattern. Distance functions on strings: Hamming distance, Levenshtein distance, Levenshtein distance with different costs.
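The distance functions named above can be illustrated with a short sketch. The function names and the parameters `sub_cost` and `indel_cost` are hypothetical, chosen here to show how different operation costs enter the recurrence.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions; defined only for equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str, sub_cost: int = 1, indel_cost: int = 1) -> int:
    """Edit distance by dynamic programming, keeping one table row at a time."""
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,                       # delete x
                curr[j - 1] + indel_cost,                   # insert y
                prev[j - 1] + (0 if x == y else sub_cost),  # match or substitute
            ))
        prev = curr
    return prev[-1]


print(hamming("GATTACA", "GACTATA"))               # 2
print(levenshtein("kitten", "sitting"))            # 3
print(levenshtein("kitten", "sitting", sub_cost=2))  # 5
```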

Dynamic programming: main ideas and applicability to sequence comparisons. “Scoring functions” and their motivations. Brief introduction to scoring matrices (PAM, BLOSUM). Databases of amino acid sequences: UniProt = SwissProt ∪ TrEMBL; the difference between SwissProt and TrEMBL; RefSeq (also nucleotide sequences); corresponding websites; download hints.

Pattern matching using pattern preprocessing: the Boyer-Moore algorithm. Pattern matching using text preprocessing: suffix trees. Quickly solvable tasks using suffix trees.

Sequence alignment algorithms

The concept of local, global alignments. Gap penalty types. Alignment with different overlap requirements / different gap penalty types.
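The difference between global and local alignment can be shown with a minimal scoring sketch. It assumes a linear gap penalty and simple match/mismatch scores rather than a PAM or BLOSUM matrix; the function names and default values are illustrative only, and the code returns just the optimal score, not the alignment itself.

```python
def global_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch style score: both strings must be aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * gap]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Smith-Waterman style score: best-scoring pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Clamping at 0 lets an alignment start anywhere.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


print(global_score("ACGT", "AGT"))  # 1
print(local_score("ACGT", "AGT"))   # 2
```

In the usage example the global alignment must pay for one gap, while the local alignment simply reports the shared substring `GT`.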

Basic idea: BLAST (in detail)

Phylogeny, evolution trees. The NCBI taxonomy tree. Different methods for constructing phylogenetic trees.

From genes to proteins: finding protein-coding genes (GV). Transcription and translation. CDS, ORF, gene finding.

Structure prediction

Molecular structure primer; Molecular structure prediction

Drug-protein docking

Interaction networks

Molecular networks: metabolic and physical interaction networks

Source of information: online databases

Molecular networks

Protein function prediction and similarity

Brain informatics

MRI data processing, brain graphs, brain graph analysis

professor of mathematics at the Institute of Mathematics of Eötvös University, Budapest, Hungary