
US West

Original author: 
Todd Hoff

This is a guest post by Yelp's Jim Blomo. Jim manages a growing data mining team that uses Hadoop, mrjob, and oddjob to process TBs of data. Before Yelp, he built infrastructure for startups and Amazon. Check out his upcoming talk at OSCON 2013 on Building a Cloud Culture at Yelp.

In Q1 2013, Yelp had 102 million unique visitors (source: Google Analytics), including approximately 10 million unique mobile devices using the Yelp app on a monthly average basis. Yelpers have written more than 39 million rich, local reviews, making Yelp the leading local guide on everything from boutiques and mechanics to restaurants and dentists. With respect to data, one of the most distinctive things about Yelp is its variety: reviews, user profiles, business descriptions, menus, check-ins, food photos... the list goes on. We have many ways to deal with data, but today I’ll focus on how we handle offline data processing and analytics.

In late 2009, Yelp investigated using Amazon’s Elastic MapReduce (EMR) as an alternative to an in-house cluster built from spare computers. By mid-2010, we had moved production processing completely to EMR and turned off our Hadoop cluster. Today we run over 500 jobs a day, from integration tests to advertising metrics. We’ve learned a few lessons along the way that can hopefully benefit you as well.

Job Flow Pooling



Update Your Thinking About Communication: 4 Iron Laws: Dr. David Weber: TEDxHampstead

David Weber has worked in the field of organizational development since the late 1970s. He earned an MS degree in instructional design at the University of Southern California and a Ph.D. in organizational communication at the University of Denver. Career assignments and professional projects have enabled David to live and work around the world. During his career, clients have included Xerox, Disney, Frito-Lay, US Marine Corps, General Motors, US West, Hoechst, Waste Industries, Petro Viet Nam (the national oil company of Viet Nam) and many other large and small firms, in both the private and public sectors. He spent several years as an expatriate executive in organizational development in Iran, Indonesia, and Japan. An award-winning teacher and researcher at the university level since 1996, David is currently a professor at the University of North Carolina Wilmington, in the Department of Communication Studies, where he teaches courses in applied organizational communication, consulting skills and organizational research methods. He has published in the business press and in professional and academic media.

In the spirit of ideas worth spreading, TEDx is a program of local, self-organized events that bring people together to share a TED-like experience. At a TEDx event, TEDTalks video and live speakers combine to spark deep discussion and connection in a small group. These local, self-organized events are branded TEDx, where x = independently organized TED event. The TED ...
From: TEDxTalks
Views: 83
Ratings: 5
Time: 14:29
More in: Education




Maybe you're a Dropbox devotee. Or perhaps you really like streaming Sherlock on Netflix. For that, you can thank the cloud.

In fact, it's safe to say that Amazon Web Services (AWS) has become synonymous with cloud computing; it's the platform on which some of the Internet's most popular sites and services are built. But just as cloud computing is used as a simplistic catchall term for a variety of online services, the same can be said for AWS—there's a lot more going on behind the scenes than you might think.

If you've ever wanted to drop terms like EC2 and S3 into casual conversation (and really, who doesn't?) we're going to demystify the most important parts of AWS and show you how Amazon's cloud really works.

