Bioinformatics

Lecture: Friday 10:15-11:45, Déli épület (South Building) 3-114

First class: February 14, 2025.

 


Practice sessions: http://pitgroup.org/bioinfogyak/

Recommended books (not required for passing the exam):

Metagenomics: The New Science of Metagenomics: Revealing the Secrets of Our Microbial Planet (The National Academies Press, 2007, free)

Bioinformatics: Lengauer: Bioinformatics - From Genomes to Therapies, Vol. 1-3 (Wiley, 2007)

Isaev A. Introduction to mathematical methods in bioinformatics (Springer, 2006)

 

Lecture materials (username and password required)

Syllabus:

What is covered in this course, and why? Brief introduction. Genome sequencing techniques: Sanger, 454/Roche, Illumina/Solexa. The FASTA and FASTQ formats. Introduction to metagenomics. Microbial diversity.
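As a small illustration of the two file formats named above, here is a minimal Python sketch of readers for them. It assumes unwrapped four-line FASTQ records and Phred+33 quality encoding; the function names and the example records are hypothetical and not taken from the course materials.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def _identifier(header: str) -> str:
    """Return the first word after the leading marker character ('>' or '@')."""
    fields = header[1:].split()
    return fields[0] if fields else ""


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect FASTA sequences, keyed by record identifier."""
    records: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: start a new record.
            current = _identifier(line)
            records[current] = []
        elif current is not None:
            # Sequence line: may be one of several for the same record.
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (identifier, sequence, quality scores) from four-line records."""
    rows = [line.strip() for line in lines if line.strip()]
    if len(rows) % 4 != 0:
        raise ValueError("expected a multiple of four non-empty lines")
    for i in range(0, len(rows), 4):
        header, seq, plus, qual = rows[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        # Assumption: Phred+33 encoding, so score = ASCII code - 33.
        yield _identifier(header), seq, [ord(c) - 33 for c in qual]


# Hypothetical example records.
fasta_text = ">seq1 hypothetical example record\nACGTAC\nGGT\n>seq2\nTTGA\n"
fastq_text = "@read1\nACGT\n+\nII5!\n"

fasta_records = parse_fasta(fasta_text.splitlines())
fastq_records = list(parse_fastq(fastq_text.splitlines()))
```

In practice, real FASTQ files can be very large, so a streaming reader would be preferable to the list built here.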

Genome sequencing

Levels of sequence assembly: chromosomes, scaffolds and contigs, SRA or trace. Re-sequencing vs. de novo sequencing. Assembly of short reads.

Strategies: Hashing. Hashing in parallel (example with the element distinctness problem).
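The element distinctness problem asks whether a list contains any repeated item. The Python sketch below shows one possible reading of the topic, not necessarily the approach taken in the lectures: a hash set answers the question in expected linear time, and splitting the items into buckets by hash value yields independent subproblems that could be handed to separate workers. The function names and the bucket count are assumptions, and the buckets are processed sequentially here.

```python
from collections import defaultdict
from typing import Hashable, Iterable, List


def has_duplicate(items: Iterable[Hashable]) -> bool:
    """Element distinctness with a hash set: expected linear time."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


def has_duplicate_bucketed(items: Iterable[Hashable], n_buckets: int = 4) -> bool:
    """Split items by hash value, then check each bucket on its own.

    Equal items always land in the same bucket, so the buckets are
    independent and could be checked by separate workers.
    """
    buckets = defaultdict(list)
    for item in items:
        buckets[hash(item) % n_buckets].append(item)
    return any(has_duplicate(bucket) for bucket in buckets.values())


def kmers(seq: str, k: int) -> List[str]:
    """All length-k substrings of seq, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Does any 3-mer repeat within a read?
repeat_free = has_duplicate_bucketed(kmers("ACGTAC", 3))
repeated = has_duplicate_bucketed(kmers("ACGACG", 3))
```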

The Burrows-Wheeler transform. Easy substring search, computing its inverse. Sequence assembly with graphs. The Hamiltonian cycle reduction. The Eulerian path reduction. De Bruijn graphs.
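To make the Burrows-Wheeler transform concrete, here is a deliberately naive Python sketch of the transform, its inverse, and substring counting by backward search. It assumes a terminator character `$` that does not occur in the text and sorts before every other symbol. Production tools use suffix arrays and precomputed rank tables instead of the quadratic steps shown here.

```python
def bwt(text: str, terminator: str = "$") -> str:
    """Last column of the sorted rotations of text + terminator."""
    s = text + terminator
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last: str, terminator: str = "$") -> str:
    """Rebuild the text by repeatedly prepending the last column and sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The row that ends with the terminator is the original string.
    row = next(r for r in table if r.endswith(terminator))
    return row[:-1]


def count_occurrences(bwt_str: str, pattern: str) -> int:
    """Count occurrences of pattern using backward search on the BWT."""
    lo, hi = 0, len(bwt_str)
    for c in reversed(pattern):
        # Number of symbols in the text that sort strictly before c.
        smaller = sum(1 for x in bwt_str if x < c)
        # Rank queries, done naively here with str.count.
        lo = smaller + bwt_str[:lo].count(c)
        hi = smaller + bwt_str[:hi].count(c)
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("banana")
restored = inverse_bwt(transformed)
hits = count_occurrences(transformed, "ana")
```

For `"banana"` the transform is `"annb$aa"`, and the pattern `"ana"` is found twice, since the two occurrences overlap.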

Sequence analysis

Strings, basic problems related to string search: pattern matching, alignments.

Two approaches for pattern matching: preprocessing the text or the searched pattern. Distance functions on strings: Hamming distance, Levenshtein distance, Levenshtein distance with different costs.
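The distance functions listed above can be sketched in a few lines of Python. The cost parameters (`sub_cost`, `ins_cost`, `del_cost`) are hypothetical names chosen for this illustration; setting all three to 1 gives the ordinary Levenshtein distance.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions; defined only for equal lengths."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str, sub_cost: int = 1,
                ins_cost: int = 1, del_cost: int = 1) -> int:
    """Minimum total cost of edits turning a into b (row-by-row DP)."""
    # prev[j] = cost of turning the processed prefix of a into b[:j].
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        curr = [i * del_cost]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + del_cost,                       # delete x
                curr[j - 1] + ins_cost,                   # insert y
                prev[j - 1] + (0 if x == y else sub_cost) # match or substitute
            ))
        prev = curr
    return prev[-1]


d_hamming = hamming("GATTACA", "GACTATA")
d_unit = levenshtein("kitten", "sitting")
d_weighted = levenshtein("kitten", "sitting", sub_cost=2)
```

With unit costs the distance between `"kitten"` and `"sitting"` is 3. Raising the substitution cost to 2 makes a substitution no cheaper than a deletion plus an insertion, and the distance becomes 5.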

Dynamic programming: main ideas and applicability to sequence comparisons. “Scoring functions” and their motivations. Brief introduction to scoring matrices (PAM, BLOSUM). Databases of amino acid sequences: UniProt = SwissProt ∪ TrEMBL; the difference between SwissProt and TrEMBL; RefSeq (also nucleotide sequences); corresponding websites; download hints.

Pattern matching using pattern preprocessing: the Boyer-Moore algorithm. Pattern matching using text preprocessing: suffix trees. Tasks that can be solved quickly with suffix trees.
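As a taste of pattern preprocessing, the Python sketch below implements the Horspool simplification of Boyer-Moore, which keeps only a bad-character shift table. It is not the full Boyer-Moore algorithm (the good-suffix rule is omitted), and the function name is an assumption made for this example.

```python
from typing import List


def horspool_search(text: str, pattern: str) -> List[int]:
    """Return all start positions of pattern in text (Horspool variant)."""
    m, n = len(pattern), len(text)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # Preprocess the pattern: for each character except the last one,
    # the distance from its rightmost occurrence to the pattern's end.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    hits = []
    i = 0
    while i + m <= n:
        # Compare the window with the pattern from right to left.
        j = m - 1
        while j >= 0 and text[i + j] == pattern[j]:
            j -= 1
        if j < 0:
            hits.append(i)
        # Shift by the table entry for the window's last character;
        # characters absent from the table allow a full-length jump.
        i += shift.get(text[i + m - 1], m)
    return hits


positions = horspool_search("GATTACATTAC", "TTAC")
```

Characters that never occur in the pattern let the window skip `m` positions at once, which is why this family of algorithms can inspect fewer characters than the text contains.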

Sequence alignment algorithms

The concepts of local and global alignment. Gap penalty types. Alignment with different overlap requirements and different gap penalty types.
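The difference between global and local alignment shows up as a one-line change in the dynamic programming recurrence. The Python sketch below computes scores only (no traceback), using a linear gap penalty in the style of Needleman-Wunsch and Smith-Waterman. The scoring values `match=1`, `mismatch=-1`, `gap=-2` are arbitrary assumptions, not values from the course.

```python
def global_score(a: str, b: str, match: int = 1,
                 mismatch: int = -1, gap: int = -2) -> int:
    """Best score of aligning all of a with all of b (linear gap penalty)."""
    # Aligning an empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        curr = [i * gap]
        for j, y in enumerate(b, 1):
            s = match if x == y else mismatch
            curr.append(max(prev[j - 1] + s,    # align x with y
                            prev[j] + gap,      # x against a gap
                            curr[j - 1] + gap)) # y against a gap
        prev = curr
    return prev[-1]


def local_score(a: str, b: str, match: int = 1,
                mismatch: int = -1, gap: int = -2) -> int:
    """Best score over all pairs of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            s = match if x == y else mismatch
            # The extra 0 lets an alignment restart anywhere.
            cell = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


g_same = global_score("ACGT", "ACGT")
g_gap = global_score("ACGT", "AGT")
l_core = local_score("TTTACGTTT", "GGACGGG")
```

Other gap penalty types, such as affine penalties with separate opening and extension costs, need additional DP tables and are not covered by this sketch.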

Basic idea: BLAST (in detail)

Phylogeny, evolution trees. The NCBI taxonomy tree. Different methods for constructing phylogenetic trees.

From genes to proteins: finding protein-coding genes (GV). Transcription and translation. CDS, ORF, gene finding.
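A very simplified view of ORF finding can be sketched in Python as follows. The sketch scans only the forward strand, treats `ATG` as the sole start codon and `TAA`, `TAG`, `TGA` as stop codons, and uses a hypothetical `min_codons` threshold. Real gene finders also consider the reverse strand and use statistical models.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of forward-strand ORFs.

    Each interval runs from an ATG to the first in-frame stop codon,
    stop included. min_codons counts the codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                # First start codon seen since the previous stop.
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


# Hypothetical toy sequence with one ORF in reading frame 2.
toy = "CCATGAAATAGGG"
orfs = find_orfs(toy)
orf_sequences = [toy[s:e] for s, e in orfs]
```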

Structure prediction

Molecular structure primer; Molecular structure prediction

Drug-protein docking

Interaction networks

Molecular networks: metabolic and physical interaction networks

Source of information: online databases

Molecular networks

Protein function prediction and similarity

Brain informatics

MRI data processing, brain graphs, brain graph analysis