Most of the statistical methods you are likely to have encountered will have specified fixed functional forms for the relationships between covariates and the response, either implicitly or explicitly. These might be linear effects or involve polynomials, such as x + x² + x³. Generalised additive models (GAMs) are different; they build upon the generalised linear model by allowing the shapes of the relationships between response and covariates to be learned from the data using splines. Modern GAMs, it turns out, are a very general framework for data analysis, encompassing many models as special cases, including GLMs and GLMMs, and the variety of types of splines available to users allows GAMs to be used in a surprisingly large number of situations. In this course we’ll show you how to leverage the power and flexibility of splines to go beyond parametric modelling techniques like GLMs.
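As a rough sketch of the idea in standard notation (the symbols g, f_j, b_jk, λ_j and S_j are introduced here for illustration and are not taken from the course materials), a GAM relates the expected response to a sum of smooth functions, each written as a weighted sum of spline basis functions. Overfitting is usually controlled by subtracting a wiggliness penalty from the log-likelihood, with smoothing parameters λ_j setting how strongly each smooth is penalised:

```latex
% Model: link-transformed mean is a sum of smooth functions of the covariates
g\bigl(\mathbb{E}[y_i]\bigr) = \beta_0 + \sum_{j} f_j(x_{ij}),
\qquad
f_j(x) = \sum_{k=1}^{K_j} \beta_{jk}\, b_{jk}(x)

% Fitting: maximise the penalised log-likelihood
\ell_p(\beta) = \ell(\beta) - \frac{1}{2} \sum_{j} \lambda_j\, \beta^{\top} S_j\, \beta
```

When every λ_j is very large, a smooth is pushed towards the functions its penalty leaves unpenalised (a straight line for the usual second-derivative penalty); when λ_j is close to zero, the spline is free to follow the data closely.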
The course is aimed at graduate students and researchers with some statistical knowledge; ideally you'll know something about generalised linear models, likelihood, and AIC. However, we’ll recap what GLMs are, so if you’re a little rusty or not everything mentioned in the GLM course makes sense, we have you covered. From running the course previously, we have found that knowing the difference between "fixed" and "random" effects, and what the terms "random intercepts" and "random slopes" mean, is helpful for the Hierarchical GAM topic, but we don't expect you to be an expert in mixed effects or hierarchical models to take this course.
Participants should be familiar with RStudio and have some fluency in programming R code, including being able to import, manipulate (e.g. modify variables) and visualise data. There will be a mix of lectures, in-class discussion, and hands-on practical exercises throughout the course.
1. Understand how GAMs work from a practical viewpoint to learn relationships between covariates and response from the data
2. Be able to fit GAMs in R using the mgcv and brms packages
3. Know the differences between the types of splines and when to use them in your models
4. Know how to visualise fitted GAMs and to check the assumptions of the model
Sessions run from 14:00 to 20:00 (Monday to Thursday) and from 14:00 to 19:00 on Friday (Berlin time). From Tuesday to Friday, the first hour will be dedicated to Q&A and working through practical exercises or students’ own analyses over Slack and Zoom. Sessions will interweave lectures, in-class discussion/Q&A, and practical exercises.
Monday– Classes from 2-8 PM Berlin time
Brief overview of R and the Tidyverse packages we’ll encounter throughout the course
Recap generalised linear models
Fitting your first GAM
Tuesday– Classes from 2-8 PM Berlin time
How do GAMs work? What are splines? How do GAMs learn from data without overfitting? We’ll dig under the hood a bit to understand how GAMs work at a practical level and how to use the mgcv and gratia packages to estimate GAMs and visualise them.
Wednesday– Classes from 2-8 PM Berlin time
Model checking, selection, and visualisation. How do we do inference with GAMs? Go beyond simple GAMs to include smooth interactions and models with multiple smooths.
Thursday– Classes from 2-8 PM Berlin time
Hierarchical GAMs; introducing random smooths and how to model data with both group and individual smooth effects.
Doing more with your models; introducing posterior simulation
Friday– Classes from 2-7 PM Berlin time
Fitting GAMs with brms
Going beyond the mean; fitting distributional models and quantile GAMs
COURSE OVERVIEW & PROGRAMME
Transposable elements (TEs) can be major components of eukaryotic genomes. Such repeated sequences, which can make up from about 50% of mammalian genomes to more than 80% of some plant genomes, can promote various types of mutations, from gene interruption and expression alteration to large-scale chromosomal rearrangements. They can also promote the formation of new genes. Despite their deleterious effects, TEs are currently considered major actors in genome evolution due to the genetic and epigenetic diversity they can generate.
Although they have a fundamental biological role, the detection and analysis of TE sequences are still technologically challenging. The length and quality of sequenced reads make their detection and annotation difficult (40% detection error). Moreover, the presence of TEs in a genome can also lead to important assembly errors due to rearrangements and the merging of repeats, and to difficulties in identifying splicing events and estimating gene expression in transcriptomic analyses. It is thus important to be able to identify these sequences in genomic and transcriptomic data.
For several years, a large number of bioinformatic tools have been developed to allow better identification of TEs in genomes. New tools are released regularly, both to keep pace with the progress of sequencing technologies and to answer particular biological questions, ranging from TE annotation in assembled or unassembled genomes to the detection of insertion polymorphisms in natural populations. The result is a particularly large choice for users, which makes it difficult to determine the best tool(s) for a given case.
In this course, we aim to introduce selected bioinformatic tools for the detection and analysis of TEs in genomic data (RepeatMasker, DnaPipeTE, T-lex).
About this event
Integrate Middle East is the premier forum and sourcing platform for the global professional AV & Media Technology community, connecting technology leaders with integrated solution buyers from the intersecting worlds of Education, Media, Entertainment, Hospitality, Retail and Communication.
Integrate Middle East is where you can boost your business, find the most innovative solutions from global leaders in broadcast, AV and Media technology giving you a competitive advantage and get the latest deep-dives into international best practices and lessons learned.
About the event
"Bridging Excellence in the Field of Cancer Science and Radiology"
The 4th International Conference on Cancer Science and Radiology, “Cancer Summit 2023”, will be held on March 15-16, 2023 in Dubai, UAE. It will bring together a unique and international mix of experts, scientists, researchers, and students to exchange and share their experiences and research outcomes on all elements of Cancer Science and Research.
About Event
The conference provides a platform for professionals involved in Bioinformatics, Biomedicine, Biotechnology and Computational Biology to exchange knowledge and gain an insight into the state of the art in the current technology, techniques and solutions in Biotechnology as they have been developed and applied in different countries. Participants include a wide variety of stakeholders from research and academia, to industrial sectors as well as government organizations.
6– 10 March 2023
To foster international participation, this course will be held online
Long-read genome sequencing technologies offer the opportunity to assemble highly contiguous genome sequences for a huge spectrum of organisms, including telomere-to-telomere human chromosomes, fully closed bacterial chromosomes and plasmids, and even entire viral genomes in single reads. A wide variety of tools exist to process long-read sequencing data, from basecalling through to genome assembly, polishing and quality control. Each of these has been developed to optimise the accuracy of the resulting genome assemblies, either in hybrid with paired short-read data, or increasingly using long reads alone.
This course will introduce participants to a range of methods to complete the steps required to process raw Oxford Nanopore Technologies sequencing data into a fully assembled, polished and quality controlled genome assembly, both with and without accompanying short reads, and with and without a reference genome. Over five days, we will combine theoretical background and practical application using model viral and bacterial datasets, concluding with a full run-through of the assembly, polishing and quality control pipeline at each participant’s own pace.
This course is intended for researchers interested in learning the background and practical techniques involved in genome assembly using Oxford Nanopore Technologies data. Both beginners and more advanced users are welcome; however, a basic understanding of how to use the command line is recommended. The practical sessions will be split into two groups, one for those confident on the command line and one for those who are less confident.
The instructors are both hybrid wet lab/dry lab scientists and can offer advice on both the practical aspects of Nanopore sequencing and the bioinformatics, although the course will be primarily focussed on bioinformatics.
● Learn the advantages and disadvantages of long-read sequencing
● Understand the steps involved in genome assembly using long read data
● Gain practical experience in choosing and using the optimal tools for a variety of dataset types, including microbiome, bacterial, viral and mammalian
Monday. 2-8 pm Berlin time
Session 1: Introduction, basecalling and demultiplexing
We will start with a general introduction to sequencing and assembly using Oxford Nanopore Technologies sequencers. Participants will be introduced to the theory of how Nanopore sequencing works and how raw Nanopore sequencing data is formatted.
Practical sessions will include using current gold standard basecalling tools to transform raw Nanopore signal data into DNA sequence (including an example of multiplexing/demultiplexing in Nanopore sequencing), and learning how to quality control and filter your basecalled sequencing reads.
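As an illustration of the read filtering step mentioned above, here is a minimal sketch in Python. It assumes reads are already parsed into (name, sequence, quality string) tuples with Phred+33 encoded qualities; the function names and the default thresholds are hypothetical and do not correspond to any specific tool used in the course. The mean quality is averaged on the error-probability scale, a common convention for long reads.

```python
import math


def mean_quality(quality_string, offset=33):
    """Mean Phred quality of one read, averaged over error probabilities.

    Assumes Phred+33 encoding (an assumption of this sketch).
    """
    if not quality_string:
        raise ValueError("quality string is empty")
    # Convert each quality character to its error probability.
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    # Average the probabilities, then convert back to the Phred scale.
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(records, min_length=1000, min_quality=10.0):
    """Keep (name, sequence, quality) records that pass both thresholds."""
    return [
        (name, seq, qual)
        for name, seq, qual in records
        if len(seq) >= min_length and mean_quality(qual) >= min_quality
    ]


# Toy records, shortened for readability (hypothetical data).
records = [
    ("read_a", "ACGT" * 3, "I" * 12),  # long enough, high quality
    ("read_b", "ACGT" * 3, "#" * 12),  # long enough, low quality
    ("read_c", "ACG", "III"),          # too short
]
kept = filter_reads(records, min_length=10, min_quality=10.0)
print([name for name, _, _ in kept])
```

In this toy example only `read_a` passes both thresholds. Real datasets would be read from FASTQ files and filtered with dedicated tools.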
This course will introduce biologists and bioinformaticians to the concepts of de novo genome assembly and annotation, providing a theoretical framework and practical examples. A variety of sequencing technologies and their applications to generate haplotype-phased, high-quality reference genomes will be presented and discussed. They include Illumina short reads (for both assembly and gene annotation), PacBio HiFi (‘High Fidelity’) and CLR (‘Continuous Long Read’) reads, Oxford Nanopore long and ultralong reads, as well as scaffolding technologies including optical mapping and proximity ligation (Hi-C). Special attention will be given to quality control throughout the assembly process (e.g. tools such as Genomescope, Merqury, Pretext) as well as to consensus, structural error mitigation and manual curation. The concept of Telomere-to-telomere (T2T) genome assembly, and the means to achieve it, will also be introduced. Annotation tools using Illumina RNA-Seq and Pacbio IsoSeq data will be introduced. By the end of the course the students will be able to understand what is needed to generate an annotated and curated reference genome of high-quality.
The course is aimed at researchers interested in learning more about genome assembly and annotation. It will include information useful to both beginners and more advanced users. We will start by introducing general assembly and annotation concepts and algorithms, providing a historical context. We will then describe all major components of a typical genome assembly workflow, using the Vertebrate Genomes Project assembly pipeline as an example. We will further analyse the multiple ways a genome can be annotated to maximize its utility for downstream analyses. There will be a mix of lectures and hands-on practical exercises, using graphical interfaces (https://assembly.usegalaxy.eu/) and the basic command line. Prior experience with Linux is welcome but not required. No prior background in DNA sequencing is required.
- Understand the concepts related to de novo genome assembly and annotation for genomes of all sizes, from viruses to mammals
- Learn the strengths and weaknesses of different sequencing technologies, including Illumina short read sequencing, Pacific Biosciences and Oxford Nanopore long read sequencing, as well as scaffolding technologies including optical mapping and proximity ligation (Hi-C), for de novo genome assembly and annotation.
- Gain hands-on experience with common tools for de novo genome assembly, assembly quality evaluation, assembly visualization and manual curation
- Gain hands-on experience with feature annotation (e.g. genes, repeats)
Monday– Classes from 2-8 pm Berlin time
Session 1: Introduction
In this kick-off session students will be introduced to genomics. This introduction will include a historical background on the DNA sequencing and genome assembly techniques developed since the discovery of nucleic acids. It will range from early strategies to popular high-throughput approaches, including a general overview of so-called ‘third-generation’ approaches (long-read sequencing) and scaffolding approaches (optical mapping, proximity ligation Hi-C). This introduction to DNA sequencing will be accompanied by an overview of the evolution of bioinformatics throughout the 20th Century and the early 21st Century. Special attention will be paid to the evolution of genome assembly approaches, including those that will be employed in the rest of the course. We will discuss some of the latest developments in the field. State-of-the-art applications, targeted to ecological and evolutionary studies, will also be introduced.
Session 2: Principles of bioinformatics in genome assembly algorithms
In this module we will look in more depth at the bioinformatic concepts that are pivotal to understanding the different steps of a genome assembly pipeline. We will dive into past and current assembly algorithms (greedy, de Bruijn graphs, OLC, string graphs) and we will also present the most common methods and file formats used. The importance of raw data evaluation (contamination, adapter trimming, read correction) and reference tools to assess genome properties (e.g. Genomescope2) will be presented. We will also present and discuss popular QC tools to evaluate consensus and structural accuracy (e.g. Merqury, BUSCO), and review key concepts in genome assembly and evaluation (e.g. N* metrics, consensus accuracy, haplotype switches). This section will be interactive, with hands-on questions and cases to be addressed by the students. We will test some of these algorithms with small test datasets and analyse the outputs. Assembly graphs will be visualized and discussed with tools such as Bandage.
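Two of the concepts named above can be sketched in a few lines of Python. The example below is illustrative only and is not taken from the course materials; the function names `de_bruijn_graph` and `n_metric` are hypothetical. The first function builds a de Bruijn graph in which each k-mer in the reads becomes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix. The second computes an N* metric such as N50: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches the chosen percentage of the assembly size.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the list of (k-1)-mer suffixes it leads to."""
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def n_metric(lengths, percent=50):
    """Return the N<percent> of a collection of contig lengths (N50 by default)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until the target share is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= total * percent:
            return length


# Toy inputs (hypothetical data).
print(de_bruijn_graph(["ACGT"], 3))  # {'AC': ['CG'], 'CG': ['GT']}
contigs = [100, 80, 60, 40, 20]
print(n_metric(contigs))             # 80
print(n_metric(contigs, 90))         # 40
```

With contigs totalling 300 bases, the two longest contigs (100 + 80) are the first to cover at least half of the assembly, so the N50 is 80. Real assemblers add error handling, reverse complements and graph simplification on top of this basic structure.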
This topic is of global relevance and is particularly important where marine ecosystem services play a significant role in socio-economic development, the aquaculture industry and seafood security. The conference program will be structured to include papers dealing with overcoming the challenges through innovative solutions.
The conference provides a platform for professionals involved in Nanomaterials and Biomaterials to exchange knowledge and gain an insight into the state of the art in the current technology, techniques and solutions in Nanomaterials and Biomaterials as they have been developed and applied in different countries. Participants include a wide variety of stakeholders from research and academia, to industrial sectors as well as government organizations.
About the Event
Systems biology aims to identify and understand the mechanisms that transform genomic information into cellular and organismal behavior using a holistic perspective. In their own research programs, the organizers of this symposium apply cutting-edge genomic technology and systems-level approaches to areas including Environmental Adaptation and Sustainability, and Biomedicine and Health. The internationally distinguished speakers invited for this year's symposium reflect these diverse research areas. This is the eleventh NYUAD Center for Genomics and Systems Biology's Symposium and one of the region's leading symposia on modern biology.
Email nyuad.cgsb.conferenceXI@nyu.edu for more details.
Organized by
Stéphane Boissinot, Professor of Biology; Global Network Professor of Biology, Faculty of Arts and Science, NYU and Director of the Center for Genomics and Systems Biology
Kristin Gunsalus, Professor of Biology, Center for Genomics and Systems Biology, NYU New York; Faculty Director of Bioinformatics, and co-director of Center for Genomics and Systems Biology, NYU Abu Dhabi
Claude Desplan, Silver Professor; Professor of Biology and Neural Science, NYU New York, and co-director of Center for Genomics and Systems Biology, NYU Abu Dhabi
Enas Qudeimat, Head of CGSB Administration and Outreach, NYU Abu Dhabi