Can Google's QUIC be faster than Mega Man's nemesis, Quick Man?
Google, as is its wont, is always trying to make the World Wide Web go faster. To that end, Google in 2009 unveiled SPDY, a networking protocol that reduces latency and is now being built into HTTP 2.0. SPDY is now supported by Chrome, Firefox, Opera, and the upcoming Internet Explorer 11.
But SPDY isn't enough. Yesterday, Google released a boatload of information about its next protocol, one that could reshape how the Web routes traffic. QUIC—standing for Quick UDP Internet Connections—was created to reduce the number of round trips data makes as it traverses the Internet in order to load stuff into your browser.
Although it is still in its early stages, Google is going to start testing the protocol on a "small percentage" of Chrome users who use the development or canary versions of the browser—the experimental versions that often contain features not stable enough for everyone. QUIC has been built into these test versions of Chrome and into Google's servers. The client and server implementations are open source, just as Chromium is.
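QUIC's headline goal is cutting the round trips spent before any data flows. A back-of-the-envelope model makes the saving concrete; the RTT counts below are a simplification (TCP's handshake costs one round trip, a full TLS negotiation roughly two more, while QUIC targets one round trip on first contact and zero on repeat connections), not a protocol specification:

```python
def rtts_before_data(protocol: str, repeat: bool = False) -> int:
    """Round trips spent before application data can flow, in a
    deliberately simplified model of each handshake."""
    if protocol == "tcp+tls":
        # TCP 3-way handshake (1 RTT) + TLS negotiation
        # (2 RTTs full, 1 RTT when resuming a session).
        return 1 + (1 if repeat else 2)
    if protocol == "quic":
        # QUIC aims for 1 RTT on first contact, 0 on repeat
        # connections to a known server.
        return 0 if repeat else 1
    raise ValueError(f"unknown protocol: {protocol}")

# A repeat HTTPS fetch: 2 round trips over TCP+TLS vs 0 over QUIC.
saved = rtts_before_data("tcp+tls", repeat=True) - rtts_before_data("quic", repeat=True)
```

On a 100 ms transcontinental link, those two saved round trips are 200 ms shaved off every reconnection before the first byte of the page even arrives.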
angry tapir writes "Researchers at Microsoft Research have produced a prototype software system that can be used on smartphones to infer a user's mood. The 'MoodScope' system uses smartphone usage patterns to determine whether someone is happy, calm, excited, bored or stressed, and could potentially add a new dimension to mobile apps (as well as, as the researchers note, open up a Pandora's box of privacy issues). The researchers created a low-power background service for iPhones and Android handsets that (with training) can offer reasonable mood detection, along with an API that app developers could hook into."
Read more of this story at Slashdot.
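The idea of inferring mood from usage patterns can be sketched as a tiny trained classifier. Everything here is invented for illustration (the feature set, the mood centroids, the numbers); MoodScope's actual features and model are described in the researchers' paper, not here:

```python
import math

# Hypothetical per-day usage features: (calls, app launches, messages).
# These centroids stand in for the per-user "training" the article
# mentions; the values are made up for the sketch.
CENTROIDS = {
    "happy":    (12.0, 40.0, 25.0),
    "stressed": (25.0, 15.0, 60.0),
    "bored":    (3.0,  70.0, 5.0),
}

def classify(features: tuple[float, float, float]) -> str:
    """Label a day's usage with the nearest trained mood centroid."""
    return min(CENTROIDS, key=lambda mood: math.dist(features, CENTROIDS[mood]))
```

An app hooking into such an API would simply call `classify` on the background service's latest feature vector and adapt its content to the returned label.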
This is the first in a two-part series exploring Butterfly Labs and its lineup of dedicated Bitcoin-mining hardware. In part one, we look at the company and the experiences customers have had with it. In part two, to be published on June 29, we share our experiences running a Bitcoin miner for a couple weeks. Spoiler alert: we made money.
The more I dig into Bitcoin, the stranger it gets. There’s gray-market online gambling and Russian-operated futures markets—to say nothing of the virtual currency’s wild ride over the last several months. It’s full of characters with names like “artforz” and “Tycho,” supposedly two of the largest Bitcoin holders out there. Of course, like most things Bitcoin, it’s nearly impossible to know for sure.
While reporting on a Bitcoin-based gambling story earlier this year, I interviewed Bryan Micon, who works with a Bitcoin-based poker site called Seals With Clubs. (To continue the lack of information, Micon won’t say who owns the site.) Micon has taken it upon himself to investigate what he believes are Bitcoin-related scams—such as the ill-fated Bitcoin Savings and Trust online bank—and he makes public pronouncements about them.
Erasure codes are one of those seemingly magical mathematical creations, and with the developments described in the paper XORing Elephants: Novel Erasure Codes for Big Data, they are set to replace triple replication as the data storage protection mechanism of choice.
The result, as Robin Harris (StorageMojo) puts it in an excellent article, Facebook’s advanced erasure codes: "WebCos will be able to store massive amounts of data more efficiently than ever before. Bad news: so will anyone else."
Robin says that with cheap disks, triple replication made sense and was economical. With ever-bigger BigData, the overhead has become costly. But erasure codes have always suffered from unacceptably long repair times. This paper describes new Locally Repairable Codes (LRCs) that are efficiently repairable in terms of both disk I/O and bandwidth:
These systems are now designed to survive the loss of up to four storage elements – disks, servers, nodes or even entire data centers – without losing any data. What is even more remarkable is that, as this paper demonstrates, these codes achieve this reliability with a capacity overhead of only 60%.
They examined a large Facebook analytics Hadoop cluster of 3000 nodes with about 45 PB of raw capacity. On average about 22 nodes a day fail, but some days failures could spike to more than 100.
LRC testing produced several key results.
- Disk I/O and network traffic were reduced by half compared to RS codes.
- The LRC required 14% more storage than RS, which is information-theoretically optimal for the obtained locality.
- Repair times were much lower thanks to the local repair codes.
- Much greater reliability thanks to fast repairs.
- Reduced network traffic makes them suitable for geographic distribution.
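The locality idea behind these codes can be sketched with a single XOR parity group. This toy (one parity per group, byte-wise XOR) is far simpler than the paper's actual LRC construction, but it shows why a repair only has to touch the few disks in the lost block's local group rather than the whole stripe:

```python
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# A local group: three data blocks plus one XOR parity block.
data = [b"block-one!", b"block-two!", b"blockthree"]
parity = xor_blocks(data)

# Lose any single member of the group...
lost = data[1]
survivors = [data[0], data[2], parity]

# ...and rebuild it by XORing only the other members of the SAME group.
repaired = xor_blocks(survivors)
```

In a replicated system the repair would read a full copy from another node; here it reads only the small local group, which is where the halved disk I/O and network traffic come from.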
I wonder if we'll see a change in NoSQL database systems as well?
- Erasure Coding vs. Replication: A Quantitative Comparison
- Ceph - a distributed object store.
One of the biggest personal data collectors around is getting ready to open its vaults to the public. According to Forbes, you'll soon be able to request your personal files from Acxiom, a marketing company that holds a database on the interests and details of over 700 million people. That database reportedly holds information on consumers' occupations, phone numbers, religions, shopping habits, and health issues, to name a few. That data has traditionally been given only to marketers — for a fee, of course — but Acxiom has decided to let consumers peer into its database as well. Whether individuals will have to pay too is still up for debate, but it's been decided that a person can only view their own file.
This is a guest post by Yelp's Jim Blomo. Jim manages a growing data mining team that uses Hadoop, mrjob, and oddjob to process TBs of data. Before Yelp, he built infrastructure for startups and Amazon. Check out his upcoming talk at OSCON 2013 on Building a Cloud Culture at Yelp.
In Q1 2013, Yelp had 102 million unique visitors (source: Google Analytics), including approximately 10 million unique mobile devices using the Yelp app on a monthly average basis. Yelpers have written more than 39 million rich, local reviews, making Yelp the leading local guide on everything from boutiques and mechanics to restaurants and dentists. With respect to data, one of the most distinctive things about Yelp is the variety of data: reviews, user profiles, business descriptions, menus, check-ins, food photos... the list goes on. We have many ways to deal with data, but today I’ll focus on how we handle offline data processing and analytics.
In late 2009, Yelp investigated using Amazon’s Elastic MapReduce (EMR) as an alternative to an in-house cluster built from spare computers. By mid 2010, we had moved production processing completely to EMR and turned off our Hadoop cluster. Today we run over 500 jobs a day, from integration tests to advertising metrics. We’ve learned a few lessons along the way that can hopefully benefit you as well.
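The jobs mentioned above follow the classic mapper/reducer shape that mrjob (Yelp's open-source library) submits to EMR. As a sketch, here is that shape simulated in plain Python, so the grouping step normally done by Hadoop's shuffle is visible; the word-count task and input lines are illustrative, not one of Yelp's actual jobs:

```python
from collections import defaultdict
from typing import Iterable, Iterator

def mapper(line: str) -> Iterator[tuple[str, int]]:
    # Emit (key, value) pairs for each input record.
    for word in line.split():
        yield word.lower(), 1

def reducer(key: str, values: list[int]) -> Iterator[tuple[str, int]]:
    # Combine all values emitted for one key.
    yield key, sum(values)

def run_job(lines: Iterable[str]) -> dict[str, int]:
    # Stand-in for Hadoop's shuffle: group mapper output by key,
    # then feed each group to the reducer.
    groups: dict[str, list[int]] = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(kv for key, values in sorted(groups.items())
                for kv in reducer(key, values))

counts = run_job(["good tacos", "good dentists good reviews"])
```

With mrjob, the same mapper and reducer would live on a job class, and the library handles packaging, submission, and log retrieval against an EMR cluster.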
Job Flow Pooling
Tumblr Creative Director Peter Vidani
New York City noise blares right outside Tumblr’s office in the Flatiron District in Manhattan. Once inside, the headquarters hum with a quiet intensity. I am surrounded by four dogs that employees have brought to the workspace today. Apparently, there are even more dogs lurking somewhere behind the perpendicular rows of desks. What makes the whole thing even spookier is that these dogs don’t bark or growl. It’s like someone’s told them that there are developers and designers at work, and somehow they’ve taken the cue.
I’m here to see Tumblr’s Creative Director Peter Vidani who is going to pull the curtain back on the design process and user experience at Tumblr. And when I say design process, I don’t just mean color schemes or typefaces. I am here to see the process of interaction design: how the team at Tumblr comes up with ideas for the user interface on its website and its mobile apps. I want to find out how those ideas are shaped into a final product by their engineering team.
Back in May, Yahoo announced it was acquiring Tumblr for $1.1 billion. Yahoo indicated that Tumblr would continue to operate independently, though we will probably see a lot of content crossover between the millions of blog posts hosted by Tumblr and Yahoo’s search engine technology. It’s a little-known fact that Yahoo has provided some useful tools for UX professionals and developers over the years through its Design Pattern Library, which shares some of Yahoo’s most successful and time-tested UI touches and interactions with Web developers. It’s probably too early to tell if Tumblr’s UI elements will filter back into these libraries. In the meantime, I talked to Vidani about how Tumblr UI features come to life.
Often I write small methods (maybe 10 to 15 lines of code) that need to be reused across two projects that can't reference each other. The method might be something to do with networking / strings / MVVM etc. and is a generally useful method not specific to the project it originally sits in.
So how should you track shared snippets across projects, so that you know where your canonical code resides and where it's running in production when a bug needs to be fixed?
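One pragmatic answer is a convention rather than tooling: stamp each copied snippet with a header comment naming its canonical home, then scan each project for those markers to build an inventory. The `# canonical:` marker and the paths below are hypothetical, purely to illustrate the approach:

```python
import re

# Match a header comment like: "# canonical: shared-snippets/retry.py"
MARKER = re.compile(r"#\s*canonical:\s*(\S+)")

def find_snippets(files: list[tuple[str, str]]) -> dict[str, str]:
    """Given (path, file-text) pairs, map each file containing a
    canonical-source marker to the canonical path it names."""
    found = {}
    for path, text in files:
        match = MARKER.search(text)
        if match:
            found[path] = match.group(1)
    return found

# Two hypothetical project files: one copied snippet, one ordinary module.
demo = find_snippets([
    ("projA/net/retry.py", "# canonical: shared-snippets/retry.py\ndef retry(): ..."),
    ("projA/app.py", "from net import retry"),
])
```

When a bug is fixed in the canonical copy, the inventory tells you every production location that needs the patch; the heavier-weight alternatives (a shared package, or git subtree/submodule) trade this simplicity for automatic propagation.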
Smári McCarthy, in his Twitter bio, describes himself as an "Information freedom activist. Executive Director of IMMI. Pirate."
On Friday, two Icelandic activists with previous connections to WikiLeaks announced that they received newly unsealed court orders from Google. Google sent the orders earlier in the week, revealing that the company searched and seized data from their Gmail accounts—likely as a result of a grand jury investigation into the rogue whistleblower group.
Google was forbidden under American law from disclosing these orders to the men until the court lifted this restriction in early May 2013. (A Google spokesperson referred Ars to its Transparency Report for an explanation of its policies.)
On June 21, 2013, well-known Irish-Icelandic developer Smári McCarthy published his recently unsealed court order dating back to July 14, 2011. Google sent him the order, which included McCarthy's Gmail account metadata, the night before. The government cited the Stored Communications Act (SCA), specifically a 2703(d) order, as grounds for the order.