Thursday, Jan. 23 -- DAILY AGENDA
SEQUENTIAL DATA WORKSHOPS
Workshops will be on Colab using TensorFlow 2.0 - Keras
In advance of the workshop, please prepare a single plain-text file (.txt) containing a large amount of text that you have written yourself. The text can be copied and pasted from past papers, reports, text messages, or emails.
8:30 - 9:00 Registration
9:00 - 10:15 Morning Lecture
As our lives become more intertwined with technology, it is increasingly necessary for technology to understand the various modalities of our data, much of which is inherently sequential (e.g., human language and video). For example, making sense of a word depends on understanding its context, especially the words that precede it. A class of deep learning models named sequence2sequence (seq2seq) yields state-of-the-art results on a myriad of NLP tasks (e.g., machine translation, natural language understanding, language modeling). For this workshop, we will start with a gentle introduction and then dive into the intricacies, addressing the underlying mechanics, strengths, and weaknesses of the models. Topics include attention, self-attention, and transformers.
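For attendees who would like a preview of self-attention before the lecture, here is a minimal sketch in plain Python. It is an illustration rather than the lecture's own material: the three token vectors are made-up values, and the learned query, key, and value projections used in real transformers are deliberately left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values.

    Real models first apply learned projections to obtain separate
    queries, keys, and values; those are omitted here for brevity.
    """
    scale = math.sqrt(len(vectors[0]))
    outputs, all_weights = [], []
    for query in vectors:
        # How strongly this token attends to every token in the sequence.
        weights = softmax([dot(query, key) / scale for key in vectors])
        # The output is the attention-weighted average of all vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(len(query))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Three made-up 2-dimensional "word vectors" (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how one token distributes its attention over the whole sequence; every row sums to 1.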
10:15 - 10:30 Coffee Break
10:30 - 12:30 Morning Workshop: "Modeling Sequential Data with ELMo, BERT, Transformers, and other Childhood Heroes"
Many of the recent advances in natural language processing are due to the use of large pre-trained language models. In this workshop, you will be introduced to the main models and algorithms in this area. We will then apply these tools in the context of text classification.
Attendees will need a fluent understanding of Python and TensorFlow, and some experience with neural networks.
lab #1: Language Models from N-grams to GPT2: Workshop Link
S.M. Data Science ’20
IACS 2019, Data Scientist at REX
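As a taste of where lab #1 starts, the sketch below builds a bigram (2-gram) language model in plain Python. It is not the lab's actual notebook: the function names and the sample sentence are hypothetical, chosen only for illustration. The .txt file you prepare in advance could be read in and used in place of the sample string.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def next_word_probabilities(model, word):
    """Turn the raw follower counts for `word` into probabilities."""
    counts = model.get(word, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def generate(model, start, length=10, seed=0):
    """Sample a short continuation, one word at a time."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length):
        probs = next_word_probabilities(model, output[-1])
        if not probs:  # dead end: this word never had a successor
            break
        choices, weights = zip(*probs.items())
        output.append(rng.choices(choices, weights=weights)[0])
    return " ".join(output)


# Hypothetical sample text; swap in the contents of your own .txt file.
sample = "the cat sat on the mat and the cat slept"
model = train_bigram_model(sample)
print(next_word_probabilities(model, "the"))
print(generate(model, "the", length=5))
```

A bigram model only looks one word back, which is why its generated text drifts quickly. Neural language models such as GPT2 condition on much longer contexts.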
1:30 - 2:30 Morning Workshop, cont.: "Modeling Sequential Data with ELMo, BERT, Transformers, and other Childhood Heroes"
2:45 - 3:45 Afternoon Workshop sponsored by Microsoft Azure: "Machine Learning for Time Series Forecasting with Python"
Time series data is an invaluable source of information for strategy and planning in fields ranging from finance to education and healthcare. Over the past few decades, forecasting based on machine learning models has also become popular in both private and public decision-making. In this hands-on one-hour workshop, Francesca walks you through the core steps for building, training, and deploying time series forecasting models. You will then apply these models to a real-world scenario, using machine learning components available in open source Python packages.
Experience coding in Python
A basic understanding of machine learning and deep learning topics and terminology as well as the mathematics used for machine learning
Links to resources –
Azure Notebooks: https://aka.ms/AzureNB
Python Microsoft: https://aka.ms/PythonMS
Automated Machine Learning Documentation: https://aka.ms/AutomatedMLDocs
What is Automated Machine Learning? https://aka.ms/AutomatedML
Azure Machine Learning Service: https://aka.ms/AzureMLService
Azure Data Science Virtual Machine: https://aka.ms/AzureDSVM
lab #2: Text Classification with BERT: Workshop Link
PREREQUISITES: Workshops assume fluency in Python and basic machine learning to the level of Harvard's CS 109a or a beginner data science/ML course.
3:45 - 4:45 Afternoon Tech Talk sponsored by Amazon Alexa: "The Science Behind Alexa's Entity Resolution"
Software Development Manager
Click on the "submit your information to Amazon" link in the upper right-hand corner. (Must be present to win.)
5:00 - 6:00 Networking Reception
ComputeFest participants are invited to enjoy cocktails and appetizers while networking with workshop presenters, industry sponsors, and event attendees.