Tackling the Challenges of Big Data

Learn how to leverage big data to benefit your organization.

Start Date: Jun 21, 2016
Duration: 6 Weeks
Price: $545

Course Description

This Digital Programs course will survey state-of-the-art topics in Big Data, looking at data collection (smartphones, sensors, the Web), data storage and processing (scalable relational databases, Hadoop, Spark, etc.), extracting structured data from unstructured data, systems issues (exploiting multicore, security), analytics (machine learning, data compression, efficient algorithms), visualization, and a range of applications.

Each module will introduce broad concepts as well as provide the most recent developments in research.

The course is taught by a team of world experts in each of these areas from MIT and the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).

CSAIL is the largest research laboratory at MIT and one of the world’s most important centers of information technology research. CSAIL and its members have played a key role in the computer revolution. The lab’s researchers have been key movers in developments like time-sharing, massively parallel computers, public key encryption, the mass commercialization of robots, and much of the technology underlying the ARPANet, Internet, and the World Wide Web.

CSAIL members (former and current) have launched more than 100 companies, including RSA Data Security, Akamai, iRobot, Meraki, ITA Software, and Vertica. The Lab is home to the World Wide Web Consortium (W3C).

With backgrounds in data, programming, finance, multicore technology, database systems, robotics, transportation, hardware, and operating systems, each MIT Tackling the Challenges of Big Data professor brings their own unique experience and expertise to the course.

What You'll Learn

  • Distinguish what Big Data is (volume, velocity, variety), where it comes from, and what its key challenges are
  • Determine how and where Big Data challenges arise in a number of domains, including social media, transportation, finance, and medicine
  • Investigate multicore challenges and how to engineer around them
  • Explore the relational model, SQL, and capabilities of new relational systems in terms of scalability and performance
  • Understand NoSQL systems, their capabilities and pitfalls, and how the NewSQL movement addresses these issues
  • Learn how to get the most out of the MapReduce programming model: its benefits, how it compares to relational systems, and new developments that improve its performance and robustness
  • Learn why building secure Big Data systems is so hard and survey recent techniques that help, including direct processing on encrypted data, information flow control, auditing, and replay
  • Discover user interfaces for Big Data and what makes building them difficult
  • Measure the need for and understand how to create sublinear time algorithms
  • Manage the development of data compression algorithms
  • Formulate the “data integration problem” (semantic and schematic heterogeneity), and discuss recent breakthroughs in solving this problem
  • Understand the benefits and challenges of open-linked data
  • Comprehend machine learning and algorithms for data analytics

Want to purchase this course for a group?

You can purchase enrollment codes for this course to distribute to your team

Instructors

Daniela Rus, Co-Director: Professor, Electrical Engineering and Computer Science

Sam Madden, Co-Director: Professor, Electrical Engineering and Computer Science

Regina Barzilay: Associate Professor, Electrical Engineering and Computer Science

John Guttag: Professor, Electrical Engineering and Computer Science

Piotr Indyk: Professor, Electrical Engineering and Computer Science

Tommi Jaakkola: Professor, Electrical Engineering and Computer Science

David Karger: Professor, Electrical Engineering and Computer Science

Andrew Lo: Professor

Ronitt Rubinfeld: Professor, Electrical Engineering and Computer Science

Michael Stonebraker: Adjunct Professor, Electrical Engineering and Computer Science

Matei Zaharia: Assistant Professor, Electrical Engineering and Computer Science

Nickolai Zeldovich: Associate Professor, Electrical Engineering and Computer Science

Course Overview

The course is held over six weeks and will provide the following:

  • Five modules covering 18 topic areas, with 20 hours of video
  • Five assessments to reinforce key learning concepts of each module
  • Case studies
  • Discussion forums where participants discuss thought-provoking questions in medicine, social media, finance, and transportation posed by the MIT faculty teaching the course, and share, engage, and ideate with other participants
  • Community Wiki for sharing additional resources, suggested readings, and related links

Participants will also take away:

  • Program materials: PDFs of faculty PowerPoint presentations, and resources presented in the course Wiki
  • 90-day access to the archived course (includes videos, discussion boards, content, and Wiki)

Time Requirement/Commitment

Because participants are in various time zones, this course is self-paced and accessible online 24/7. Lectures are prerecorded, and you can follow along when convenient, as long as you finish by the course end date. You may complete all assignments before the course end date; however, you may find it more beneficial to adhere to a weekly schedule so you can stay up to date with the discussion forums. There are approximately three hours of video every week. Most participants will spend about five hours a week on course-related activities.

Please note that the edX platform uses Coordinated Universal Time (UTC), which is 5 hours ahead of Eastern Standard Time (EST) and 4 hours ahead of Eastern Daylight Time (EDT). To convert times to your local time zone, please use the following tool: http://www.timeanddate.com/worldclock/converter.html
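If you prefer to check a conversion yourself, the offsets stated above can be applied with a few lines of Python. This is only a minimal sketch: the function name utc_to_eastern is a hypothetical helper, the fixed offsets come straight from the note above, and you must decide yourself whether daylight saving time applies on the date in question. The sample timestamp matches the assignment deadline quoted in the FAQs (11:30 p.m. UTC, which is 7:30 p.m. EDT).

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets taken from the note above: EST is UTC-5, EDT is UTC-4.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")


def utc_to_eastern(utc_time: datetime, daylight_saving: bool) -> datetime:
    """Convert a timezone-aware UTC datetime to Eastern time (hypothetical helper)."""
    return utc_time.astimezone(EDT if daylight_saving else EST)


# Sample timestamp: 11:30 p.m. UTC on August 1, 2016 (a summer date, so EDT applies).
deadline_utc = datetime(2016, 8, 1, 23, 30, tzinfo=timezone.utc)
deadline_edt = utc_to_eastern(deadline_utc, daylight_saving=True)
print(deadline_edt.strftime("%Y-%m-%d %H:%M %Z"))  # 2016-08-01 19:30 EDT
```

For other time zones, the same approach works with a different offset, or you can use the converter linked above.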

Browser/Technical Requirements

In order to access our courses, you must have a connection to the Internet. Videos are only available via online streaming - you will not be able to download videos for viewing offline. Please take note of your company's restrictions for viewing content and/or firewall settings.

Our courseware works best with current versions of Chrome, Firefox, or Safari, or with Internet Explorer version 10 and above. For the best possible experience, we recommend switching to an up-to-date version of Google Chrome. If you do not have Chrome installed you can get it for free here: http://www.google.com/chrome/browser/

We are unable to fully support access with mobile devices at this time. While many components of your courses will function on a mobile device, some may not.

Key Benefits

  • Position yourself in your organization as an adept practitioner regarding major technologies and applications in your industry that are driving the Big Data revolution, and position your company to propel forward and stay competitive
  • Engage confidently with management on opportunities and Big Data challenges faced by your industry; analyze emerging technologies and how those technologies can be applied effectively to address real business problems while unlocking the value of data and its potential use for company growth
  • Learn and assess the issues of scalability to make your work more productive and save time and money
  • Gain valuable insights from globally-renowned MIT Faculty and lecturers, and access to CSAIL research that will differentiate how you and your company break down Big Data to save time and money, while making work more efficient
  • Convenient, flexible schedule with platform access 24 hours a day, from anywhere in the world
  • Earn a Certificate of Completion and 2.0 CEUs from MIT Professional Education, and access a private professional alumni group of like-minded professionals and lifelong learners

MIT Professional Education Alumni Benefits

After completing this course, participants will become alumni of MIT Professional Education and will receive all the associated benefits and courtesies listed below.

  • Receive exclusive discounts on all future Short Programs and Digital Programs courses
  • Access will be provided to our restricted MIT Professional Education alumni group on LinkedIn; this includes invites to join all MIT Professional Education social media platforms
  • Networking opportunities with other individuals from around the globe working in a variety of industries interested in technology, computer science, entrepreneurship, science, research, and Big Data, among many others
  • Email distribution of our MIT Professional Education newsletter
  • Finally, participants will join the MIT Professional Education alumni mailing list, where they will receive advance notice of special announcements on upcoming courses, programs, and events

Earn a Certificate of Completion and CEUs

Students will be required to complete a mandatory entrance survey before access is granted to the platform, videos and other course materials. Upon successful completion of the course and all assessments a Certificate of Completion will be awarded by MIT Professional Education.

To earn a Certificate of Completion in this course, participants should watch all the videos and complete all assessments by the due date, with an average success rate of 80 percent.

The Certificate of Completion will be awarded by MIT Professional Education after the course has ended.

Grading: Letter grades are not awarded for this course.

Participants of this course who successfully complete all course requirements in order to earn a Certificate of Completion are eligible to receive 2.0 Continuing Education Units (2.0 CEUs).

CEUs are a nationally recognized means of recording noncredit/non-degree study. They are accepted by many employers, licensing agencies, and professional associations as evidence of a participant’s serious commitment to the development of a professional competence.

Acceptance of CEUs depends on the organization to which one is submitting them. If your employer requires any additional information, MIT Professional Education can answer questions and provide information, but we cannot guarantee that any particular organization will accept our CEUs.

CEUs are based on hours of instruction. For example: One CEU = 10 hours of instruction.

CEUs may not be applied toward any MIT undergraduate or graduate level course.

Who Should Participate

Prerequisite(s): This course is designed to be suitable for anyone with a bachelor’s level education in computer science or equivalent work experience, such as working hands-on with IT / technology systems (programming, database administration, data analysis, actuarial work, etc.). No programming experience or knowledge of programming languages is required.

Tackling the Challenges of Big Data is designed to be valuable to both individuals and companies because it provides a platform for discussion from numerous technical perspectives. The concepts delivered through this course can spark idea generation among team members, and the knowledge gained can be applied to their company’s approach to Big Data problems and shape the way businesses operate today.

The application of the course is broad and can apply to both early career professionals as well as senior technical managers.

Participants will benefit the most from the concepts taught in this course if they have at least three years of work experience.

Participants may include:

  • Engineers who need to understand the new Big Data technologies and concepts to apply in their work
  • Technical managers who want to familiarize themselves with these emerging technologies
  • Entrepreneurs who would like to gain perspective on trends and future capabilities of Big Data technology

Participants live and work around the world. See a list of the countries and companies represented by professionals who participated in the first offering of Tackling the Challenges of Big Data.

Course Outline

Modules, Topics, and Faculty


Module One: Introduction and Use Cases

The introductory module aims to give a broad survey of Big Data challenges and opportunities and highlights applications as case studies.

Introduction: Big Data Challenges (Sam Madden)

  • Identify and understand the application of existing tools and new technologies needed to solve next generation data challenges
  • Challenges posed by the ability to scale and the constraints of today's computing platforms and algorithms
  • Addressing the universal issue of Big Data and how to use the data to align with a company’s mission and goals

Case Study: Transportation (Daniela Rus)

  • Data-driven models for transportation
  • Coresets for Global Positioning System (GPS) data streams
  • Congestion-aware planning

Case Study: Visualizing Twitter (Sam Madden)

  • Understand the power of geocoded Twitter data
  • Learn how Graphics Processing Units (GPUs) can be used for extremely high throughput data processing
  • Utilize MapD, a new GPU-based database system for visualizing Twitter in action

Module Two: Big Data Collection

The data capture module surveys approaches to data collection, cleaning, and integration.

Data Cleaning and Integration (Michael Stonebraker)

  • Available tools and protocols for performing data integration
  • Curation issues (cleaning, transforming, and consolidating data)

Hosted Data Platforms and the Cloud (Matei Zaharia)

  • How performance, scalability, and cost models are impacted by hosted data platforms in the cloud
  • Internal and external platforms to store data

Module Three: Big Data Storage

The module on Big Data storage describes modern approaches to databases and computing platforms.

Modern Databases (Michael Stonebraker)

  • Survey data management solutions in today’s marketplace, including traditional RDBMS, NoSQL, NewSQL, and Hadoop
  • Strategic aspects of database management

Distributed Computing Platforms (Matei Zaharia)

  • Parallel computing systems that enable distributed data processing on clusters, including MapReduce, Dryad, Spark
  • Programming models for batch, interactive, and streaming applications
  • Tradeoffs between programming models

NoSQL, NewSQL (Sam Madden)

  • Survey of new emerging database and storage systems for Big Data
  • Tradeoffs between reduced consistency, performance, and availability
  • Understanding how rethinking the design of database systems can lead to order-of-magnitude performance improvements

Module Four: Big Data Systems

The systems module discusses solutions to creating and deploying working Big Data systems and applications.

Security (Nickolai Zeldovich)

  • Protecting confidential data in a large database using encryption
  • Techniques for executing database queries over encrypted data without decryption

Multicore Scalability (Nickolai Zeldovich)

  • Understanding what affects the scalability of concurrent programs on multicore systems
  • Lock-free synchronization for data structures in cache-coherent shared memory

User Interfaces for Data (David Karger)

  • Principles of and tools for data visualization and exploratory data analysis
  • Research in data-oriented user interfaces

Module Five: Big Data Analytics

The analytics module covers state-of-the-art algorithms for very large data sets and streaming computation.

Fast Algorithms I (Ronitt Rubinfeld)

  • Efficiency in data analysis

Fast Algorithms II (Piotr Indyk)

  • Advanced applications of efficient algorithms
  • Scale-up properties

Data Compression (Daniela Rus)

  • Reducing the size of Big Data files and the impact on storage and transmission capacity
  • Design of data compression schemes, such as coresets, to apply to Big Data sets

Machine Learning Tools (Tommi Jaakkola)

  • Computational capabilities of the latest advances in machine learning
  • Advanced machine learning algorithms and techniques for application to large data sets

Case Study: Information Summarization (Regina Barzilay)

Applications: Medicine (John Guttag)

  • Utilize data to improve operational efficiency and reduce costs
  • Analytics and tools to improve patient care and control risks
  • Using Big Data to improve hospital performance and equipment management

Applications: Finance (Andrew Lo)

  • Learn how big data and machine learning can be applied to financial forecasting and risk management
  • Analyze the dynamics of the consumer credit card business of a major commercial bank
  • Recognize and acquire intuition for business cases where big data is useful and where it isn't

Participants' Comments

“As a CTO, I really appreciated being brought up to speed on the many aspects of a fast-moving tech area. The in-depth discussions of the typical use cases, differentiators, and pros & cons of each technology were very valuable and more objective and insightful than all the buzzy, best-foot-forward marketing hype that seems to surround every product.” Mark Paquette, CTO, thedatabank, inc., UNITED STATES


“The course takes you through the vastness of Big Data technologies, processes, algorithms and architectural approaches and provides you with the building blocks of a Big Data strategy for your project/company. The greatest professors of MIT join their forces in order to demystify what Big Data really is, from advanced GPU clusters to data cleaning processes. The course is bold, straight to the point, detailed, and lives up to the reputation of what is probably the greatest engineering university in the world.” Vlad Marin, Big Data Architect, Airbus S.A.S., FRANCE


“I left the course with a big toolbox to handle data strategies which have made a huge impact on our small startup company. The knowledge I gained from this course has saved us hundreds of hours of work.” Tommy Otzen, CEO, Networker.net, DENMARK


“The course provides an end-to-end view of what disciplines and specialties are involved in Big Data solutions, and stimulates participants to explore the most recent research on the subject.” Alexandre Lima, Technical Delivery Manager, Hewlett Packard, BRAZIL


“I have taken many technical courses, and this course has given me a much broader view of the possibilities for projects with Big Data.” Cesar Siqueira, Advisory IT Specialist, IBM of Brazil, BRAZIL


“The MIT course on Big Data has proven to be a very complete course. It offers not only the opportunity to delve into the different components of the Big Data ecosystem, but also to gain significant insights through exchanges with fellow students. A must do!” Jurgen Jannssens, Senior Consultant, TETRADE Consulting, BELGIUM


“I thought the course positively impacted me. Having the information condensed and delivered in a comprehensive and intelligent way was a huge asset. It helped me understand the power and complexities in the world of Big Data.” Mimi Slaughter, COO, Tower 3 Ventures, UNITED STATES


“I was working with Big Data previously, with my team of graduate interns, testing Big Data use cases, but I was missing some new developments and structured information since I left university 9 years back. Having attended this course, I am now able to remove the gaps, become aware of what is going on in research and academics, and I have better insight into the problems with Big Data. With this certificate, people across departments now recognize me as an SME.” Hemant Kumar, Associate Architect in Advance Analytics and Big Data, IBM Global Services, SINGAPORE


“This course helped me to obtain a better and wider vision of the issues related to the world of Big Data. Now, thanks to this acquired knowledge, I have a whole new perspective on the steps that should be applied to Big Data projects, and I can make better decisions in all my business tasks.” Adrià López, Project Manager, e-laCaixa, SPAIN

FAQs

Who can register for this course?
Unfortunately, US sanctions do not permit us to offer this course to learners in or ordinarily residing in Iran, Cuba, Sudan, and the Crimean region of Ukraine. MIT Professional Education truly regrets that US sanctions prevent us from offering all of our courses to everyone, no matter where they live.

What do I need to do to register for the course?
Go to mitprofessionalx.mit.edu, click on the course you would like to register for, and click “Add to Cart.” You may be prompted to first register for an mitprofessionalx account if you do not have one already. Complete this process, then continue with checkout and pay for the course. Once you are given access to the course, the first assignment will be to complete the mandatory entrance survey before you can gain access to the videos and other course materials.

How do I register a group of participants?
There are two ways to register multiple individuals at once.

  1. Once the course is added to your cart, you can select the number of enrollments you would like to purchase. You can then pay using a valid credit card.
  2. For a group of 5 or more individuals, you can pay via invoice. To be invoiced, please email mitprofessionalx@mit.edu with the number of individuals in your group, and instructions to register will be provided. Please note that our payment terms are net zero, and all invoices must be paid prior to the course start date. Failure to remit payment before the course begins will result in removal from the course. No extensions or exceptions will be granted.

How should I pay?
Individual registrants must complete registrations and pay online with a valid credit card at the time of registration. MIT Professional Education accepts globally recognized major credit or debit cards that have a Visa, MasterCard, Discover, American Express or Diner's Club logo. Payment must be received in full; payment plans are not available.

Invoices will not be generated for individuals, or for groups of fewer than 5 people. However, all participants will receive a payment receipt.

How long is the course?
The course is held over six weeks and is entirely asynchronous. Lectures are prerecorded, and you can follow along when convenient, as long as you finish all required assignments by August 1, 2016, at 7:30 p.m. Eastern Daylight Time (11:30 p.m. UTC). You may complete all assignments before the due date; however, you may find it more beneficial to adhere to a weekly schedule so you can stay up to date with the discussion forums.

How long will the course material be available online?
The materials will be available to registered and paid participants for 90 days after the course ends, through November 1, 2016. No extensions will be granted.

When will I get access to the course site?
Instructions for accessing the course site will be sent to all paid registrants via email by the course launch date. In order to receive these instructions, please add mitprofessionalx@mit.edu to your “trusted senders” list. If you have not received these instructions by the course start date, please email mitprofessionalx@mit.edu.

Participants are required to provide some personal information via a short mandatory course entrance survey. You will be able to access the survey on the course start date, June 21, 2016. Please be advised that a failure to provide said information will mean that participants will be unable to access course material.

Please see our Terms of Service page for our detailed policies, including terms and conditions of use.

How many hours per week will I have class or homework?
There are approximately three hours of video every week. You will spend additional time on multiple choice assessments, readings, and discussion forums. Most participants will spend about five hours a week on course-related activities.

Please note that the edX platform uses Coordinated Universal Time (UTC), which is 5 hours ahead of Eastern Standard Time (EST) and 4 hours ahead of Eastern Daylight Time (EDT). To convert times to your local time zone, please use the following tool: http://www.timeanddate.com/worldclock/converter.html

How do I know if this course is right for me?
Carefully review the course description page, which includes a description of course content, objectives, and target audience, and any required prerequisites.

Are there prerequisites or advance reading materials?
MIT Professional Education strongly recommends a bachelor’s degree in computer science and three years’ minimum work experience, but the course is open to any interested participant. No advance reading is required.

Who will be participating in this course?
Professionals with diverse personal, business, and academic backgrounds from the U.S. and around the world will participate. They include scientists, engineers, technicians, managers, consultants, and others, and they come from industry, government, military, non-profit, and academia.

I have never taken a course on the edX platform before. What can I do to prepare?
Prior to the first day of class, participants can take a demonstration course on edx.org that was built specifically to help students become more familiar with taking a course on the edX platform.

What reference materials will be available at the end of the course?
Participants will have 90-day access to the archived course (includes videos, discussion boards, content, and Wiki).

What materials will participants keep at the end of the course?
Participants will take away program materials: PDFs of faculty PowerPoint presentations, and resources presented in the course Wiki.

Will I receive an MIT Professional Education Certificate?
Participants who successfully complete the course and all assessments will receive a Certificate of Completion. This course does not carry MIT credits or grades; however, an 80% pass rate is required in order to receive a Certificate and CEUs.

Will I receive MIT credits?
This course does not carry MIT credits. MIT Professional Education offers non-credit/non-degree professional programs for a global audience. Participants may not imply or state in any manner, written or oral, that MIT or MIT Professional Education is granting academic credit for enrollment in this professional course. None of our Digital courses or programs award academic credit or degrees.

Will I earn Continuing Education Units (CEUs)?
Course participants who successfully complete all course requirements are eligible to receive 2.0 Continuing Education Units (CEUs) from MIT.

CEUs are a nationally recognized means of recording non-credit/non-degree study. They are accepted by many employers, licensing agencies, and professional associations as evidence of a participant’s serious commitment to the development of a professional competence.

CEUs are based on hours of instruction. For example: One CEU = 10 hours of instruction.

CEUs may not be applied toward any MIT undergraduate or graduate level course.

After I complete this course, will I be an MIT alum?
MIT alumni status is not granted. Instead, participants receive MIT Professional Education alumni status, which is among the benefits MIT Professional Education offers.

Are video captions available?
Each video for this course has been transcribed and the text can be found on the right side of the video when the captions function is turned on. Synchronized transcripts allow students to follow along with the video and navigate to a specific section of the video by clicking the transcript text. Students can use transcripts of media-based learning materials for study and review.

I need to cancel my registration. Are there any fees?
Cancellation requests must be submitted to mitprofessionalx@mit.edu. Cancellation requests received after June 7, 2016 will not be eligible for a refund.

To submit your request, please include your full name and order number in your email request. Refunds will be credited to the credit card used when you registered and may take up to two billing cycles to process. MIT Professional Education Digital Programs and edX have no obligation to issue a refund after June 7, 2016, but if you believe a refund is warranted, please email us at mitprofessionalx@mit.edu.

Can I transfer/defer my registration for another session or course?
Admission and fees paid cannot be deferred to a subsequent session; however, you may cancel your registration and reapply at a later date.

Can someone else attend in my place?
We cannot accommodate any substitution requests at this time. Please review the time commitment section and course schedule above to ensure you are able to participate in the course before you register.

What is the registration deadline?
Individual registrations must be completed by June 28, 2016. For group sales, purchases can take place up until June 20, 2016. Please note that once registration has closed, no late registrations or cancellations will be granted.
