You are currently browsing the Markus Breitenbach weblog archives for the day July 11, 2010 8:56 pm.
- Advertising (1)
- Artificial Intelligence (AI) (13)
- Classification (3)
- Clustering (1)
- Coding / Programming (8)
- Cryptography (1)
- Data Mining (22)
- Economy / Investing (1)
- ewrt linux (2)
- Fixing Stuff (8)
- Machine Learning (32)
- Math (2)
- Politics (3)
- Predictive Modeling (5)
- Psychology (3)
- Ramblings (26)
- Random (9)
- Security (16)
- Society (13)
- Sociology (4)
- spam (3)
- Statistics (20)
- January 28, 2012 4:56 pm: Will 2012 be the year of Big Data?
- August 14, 2011 10:41 pm: UK plans to exempt data mining from copyright laws
- June 21, 2011 3:26 am: Risk Assessment of Rare Events in adversarial Scenarios
- March 26, 2011 7:57 pm: How Kinect body tracking works and how Machine Learning helped
- March 1, 2011 11:58 am: European Court of Justice ruling (indirectly) on what cannot be used in Insurance Risk Models
- December 11, 2010 8:35 pm: Mining of Massive Datasets
- December 4, 2010 2:28 pm: Ideas on communicating risks and probabilities to the general public
- October 17, 2010 5:48 pm: Birthday Paradox
- August 5, 2010 1:06 am: Elo Scores and Rating Contestants
- July 11, 2010 8:56 pm: GraphLab & Parallel Machine Learning
Blogroll
Uncategorized
Useful Links
- January 2012
- August 2011
- June 2011
- March 2011
- December 2010
- October 2010
- August 2010
- July 2010
- June 2010
- February 2010
- January 2010
- November 2009
- July 2009
- June 2009
- May 2009
- April 2009
- March 2009
- February 2009
- January 2009
- December 2008
- November 2008
- October 2008
- September 2008
- August 2008
- July 2008
- June 2008
- May 2008
- April 2008
- March 2008
- February 2008
- January 2008
- December 2007
- November 2007
- October 2007
- September 2007
- August 2007
- July 2007
- June 2007
- May 2007
- April 2007
- March 2007
- February 2007
- January 2007
- December 2006
- November 2006
- October 2006
- September 2006
- August 2006
Archive for July 11, 2010 8:56 pm
GraphLab & Parallel Machine Learning
July 11, 2010 8:56 pm by Markus.
Interesting article: GraphLab: A New Framework for Parallel Machine Learning
From the abstract:
Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and Pthreads leave ML experts repeatedly solving the same design challenges. By targeting common patterns in ML, we developed GraphLab, which improves upon abstractions like MapReduce by compactly expressing asynchronous iterative algorithms with sparse computational dependencies while ensuring data consistency and achieving a high degree of parallel performance. We demonstrate the expressiveness of the GraphLab framework by designing and implementing parallel versions of belief propagation, Gibbs sampling, Co-EM, Lasso and Compressed Sensing. We show that using GraphLab we can achieve excellent parallel performance on large scale real-world problems.
Given all the talk about Map-Reduce, Hadoop etc. this seems like a logical next step to make scaling data mining to large data sets a lot easier.
Posted in Data Mining, Machine Learning | Print | 1 Comment »