Thursday, Jan. 23 -- DAILY AGENDA

SEQUENTIAL DATA WORKSHOPS

Workshops will run on Colab using TensorFlow 2.0 with Keras.

***In advance of the workshop, please prepare a single text file (.txt). The file should be a large plain-text document of words you've written. You can copy and paste from past papers, reports, text messages, or emails.

8:30 - 9:00 Registration
9:00 - 10:15 Morning Lecture
Chris Tanner, PhD
Lecturer, IACS

As our lives become more intertwined with technology, it's increasingly necessary for technology to understand the various modalities of our data, much of which is inherently sequential (e.g., human language and videos). For example, making sense of a word requires understanding its context, especially the preceding words. A class of deep learning models named sequence2sequence (seq2seq) yields state-of-the-art results on a myriad of NLP tasks (e.g., machine translation, natural language understanding, language modeling). In this lecture, we will start with a gentle introduction and then dive into the intricacies, addressing the underlying mechanics, strengths, and weaknesses of the models. Topics include attention, self-attention, and transformers.

10:15 - 10:30 Coffee Break
10:30 - 12:30 Morning Workshop: "Modeling Sequential Data with ELMo, BERT, Transformers, and other Childhood Heroes"

Many of the recent advances in natural language processing are due to the use of large pre-trained language models. In this workshop, you will be introduced to the main models and algorithms in this area. We will then apply these tools in the context of text classification.

Attendees will need a fluent understanding of Python and TensorFlow, and some experience with neural networks.

lab #1: Language Models from N-grams to GPT2: Workshop Link

Josh Feldman
S.M. Data Science ’20
Will Claybaugh
IACS 2019, Data Scientist at REX
1:30 - 2:30 Morning Workshop, cont.: "Modeling Sequential Data with ELMo, BERT, Transformers, and other Childhood Heroes"
2:45 - 3:45 Afternoon Workshop sponsored by Microsoft Azure: "Machine Learning for Time Series Forecasting with Python"

Time series data is an invaluable source of information for future strategy and planning in fields ranging from finance to education and healthcare. In the past few decades, forecasting based on machine learning models has also become popular in both private- and public-sector decision-making. In this hands-on 1-hour workshop, Francesca walks you through the core steps for building, training, and deploying your time series forecasting models. You will then apply these models to a real-world scenario, using machine learning components available in open source Python packages.

lab #2: Text Classification with BERT: Workshop Link

PREREQUISITES: Workshops assume fluency in Python and basic machine learning at the level of Harvard's CS 109a or a beginner data science/ML course.

3:45 - 4:45 Afternoon Tech Talk sponsored by Amazon Alexa: "The Science Behind Alexa's Entity Resolution"
Han Wang
Applied Scientist
Yue Liu
Software Development Manager

Click on the "submit your information to Amazon" link in the upper right-hand corner. (Must be present to win.)

5:00 - 6:00 Networking Reception

ComputeFest participants are invited to enjoy cocktails and appetizers while networking with workshop presenters, industry sponsors, and event attendees.

IACS.SEAS.HARVARD.EDU 

HARVARD INSTITUTE FOR APPLIED COMPUTATIONAL SCIENCE 

33 OXFORD ST. CAMBRIDGE, MA 02138

LRAY@SEAS.HARVARD.EDU 
