ITL Applied and Computational Mathematics Division
ACMD Seminar Series


Machine Learning on Massive Data Sets

Alexander Gray
College of Computing, Georgia Institute of Technology

Tuesday, April 26, 2011, 15:00-16:00
Building 101, Lecture Room F
Gaithersburg

Tuesday, April 26, 2011, 13:00-14:00
Room 4552
Boulder

Abstract:

This talk will discuss the new statistical and computational foundations demanded by next-generation challenges in data analysis. Two challenges that keep growing in importance and ubiquity are challenges of scale: massive datasets and various curses of dimensionality. The talk will highlight new learning methods and new general algorithmic strategies for the fundamental "inner-loop" computations at the root of large classes of statistics and machine learning methods, both classical and modern. The work is general enough that it impacts other areas of scientific computing, such as physical simulation and linear algebra. Applications in a wide variety of areas will be given, as well as an overview of our unique open-source machine learning library.

Speaker Bio: Alexander Gray received Bachelor's degrees in Applied Mathematics and Computer Science from UC Berkeley and a PhD in Computer Science from Carnegie Mellon University, and worked in the Machine Learning Systems Group of NASA's Jet Propulsion Laboratory for six years. He currently directs the FASTlab (Fundamental Algorithmic and Statistical Tools Laboratory) at Georgia Tech, a group of about 20 people, including 12 PhD students, that works on the problem of how to perform machine learning, data mining, and statistics on massive datasets, and on related problems in scientific computing and applied mathematics. Employing a multi-disciplinary array of technical ideas (from discrete algorithms and data structures, computational geometry, computational physics, Monte Carlo methods, convex optimization, linear algebra, and distributed computing), the lab has developed the current fastest algorithms for several fundamental statistical methods, and also develops new statistical machine learning methods for difficult aspects of real-world data, such as in astrophysics and biology. This work has enabled high-profile scientific results that have been featured in Science and Nature, and has received a National Science Foundation CAREER award, three best paper awards, and three best paper award nominations. He has given tutorials and invited talks on efficient algorithms for machine learning at venues including ICML, NIPS, and SIAM Data Mining, and is a member of the National Academies Committee on the Analysis of Massive Data. He is a frequent invited speaker in the emerging area of astrostatistics/astroinformatics.


Presentation Slides: PPT


Contact: J. E. Terrill

Note: Visitors from outside NIST must contact Robin Bickel, (301) 975-3668, at least 24 hours in advance.



NIST is an agency of the U.S. Commerce Department.
Last updated: 2011-04-27.