Original author: Todd Hoff

Erasure codes are one of those seemingly magical mathematical creations that, with the developments described in the paper XORing Elephants: Novel Erasure Codes for Big Data, are set to replace triple replication as the data storage protection mechanism of choice.

The result, says Robin Harris (StorageMojo) in an excellent article, Facebook's advanced erasure codes: "WebCos will be able to store massive amounts of data more efficiently than ever before. Bad news: so will anyone else."

Robin says that with cheap disks, triple replication made sense and was economical. With ever-bigger big data, the overhead has become costly. But erasure codes have always suffered from unacceptably long repair times. This paper describes new Locally Repairable Codes (LRCs) that are efficient to repair in both disk I/O and bandwidth requirements:

These systems are now designed to survive the loss of up to four storage elements – disks, servers, nodes or even entire data centers – without losing any data. What is even more remarkable is that, as this paper demonstrates, these codes achieve this reliability with a capacity overhead of only 60%.

They examined a large Facebook analytics Hadoop cluster of 3000 nodes with about 45 PB of raw capacity. On average about 22 nodes failed per day, but on some days failures spiked to more than 100.

The LRC tests yielded several key results:

  • Disk I/O and network traffic were reduced by half compared to Reed-Solomon (RS) codes.
  • The LRC required 14% more storage than RS, which is information-theoretically optimal for the obtained locality.
  • Repair times were much lower thanks to the local repair codes (a toy sketch of the idea follows this list).
  • Reliability was much greater thanks to fast repairs.
  • Reduced network traffic makes the codes suitable for geographic distribution.
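To make the local-repair idea concrete, here is a minimal Python sketch. It is not the paper's actual LRC construction (which layers local parities on top of Reed-Solomon coding); it only illustrates why grouping data blocks under a local XOR parity lets a single lost block be rebuilt by reading a handful of blocks instead of an entire stripe. All names and block contents are invented for illustration.

    # Toy illustration of local repair: each group of data blocks gets
    # a local XOR parity, so one lost block is rebuilt from its small
    # group rather than from the whole stripe.

    def xor_blocks(blocks):
        """XOR equal-length byte blocks together."""
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                out[i] ^= byte
        return bytes(out)

    def make_local_parity(group):
        """Local parity is the XOR of every data block in the group."""
        return xor_blocks(group)

    def repair(surviving_blocks, parity):
        """Rebuild the one missing block: XOR of survivors and parity."""
        return xor_blocks(surviving_blocks + [parity])

    group = [b"ABCD", b"EFGH", b"IJKL"]   # three data blocks in one group
    parity = make_local_parity(group)     # one extra local parity block

    # Simulate losing the middle block; repair reads only 3 blocks,
    # far fewer than a classic full-stripe Reed-Solomon repair.
    survivors = [group[0], group[2]]
    assert repair(survivors, parity) == b"EFGH"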

I wonder if we'll see a change in NoSQL database systems as well.

Original author: Carl Franzen


It's not quite a quantum internet — yet. But researchers at Los Alamos National Laboratory in New Mexico have developed a new, ultra-secure computer network capable of transmitting data, including video files, encrypted using quantum cryptography. The network, which currently consists of a main server and three client machines, has been running continuously in Los Alamos for the past two and a half years, the researchers reported in a paper released earlier this month. During that time, they have also successfully tested sending critical information used by power companies on the status of the electrical grid. Eventually they hope to use it to test offline quantum communication capabilities on smartphones and tablets.



The advent of Salesforce Marketing Cloud and Adobe Marketing Cloud demonstrates the need for enterprises to develop new ways of harnessing the vast potential of big data. Yet these marketing clouds raise the question of who will help marketers, the front line of businesses, maximize marketing spend and ROI and help their brands win in the end. Simply moving software from onsite to hosted servers does not change the capabilities marketers require; real competitive advantage stems from intelligent use of big data.

Marc Benioff, who is famous for declaring that “Software Is Dead,” may face a similar fate with his recent bets on Buddy Media and Radian6. These applications provide data to people who must then analyze, prioritize and act — often at a pace much slower than the digital world. Data, content and platform insights are too massive for mere mortals to handle without costing a fortune. Solutions that leverage big data are poised to win — freeing up people to do the strategy and content creation that is best done by humans, not machines.

Big data is too big for humans to work with, at least in the all-important analytical construct of responding to opportunities in real time — formulating efficient and timely responses to opportunities generated from your marketing cloud, or pursuing the never-ending quest for perfecting search engine optimization (SEO) and search engine marketing (SEM). The volume, velocity and veracity of raw, unstructured data is overwhelming. Big data pioneers such as Facebook and eBay have moved to massive Hadoop clusters to process their petabytes of information.

In recent years, we’ve gone from analyzing megabytes of data to working with gigabytes, then terabytes, then petabytes and exabytes, and beyond. Two years ago, James Rogers wrote in The Street: “It’s estimated that 1 Petabyte is equal to 20 million four-door filing cabinets full of text.” We’ve become jaded to seeing such figures. But 20 million filing cabinets? If those filing cabinets were a standard 15 inches wide, you could line them up, side by side, all the way from Seattle to New York — and back again. One would need a lot of coffee to peruse so much information, one cabinet at a time. And a lot of marketing staff.
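The filing-cabinet claim checks out with quick back-of-the-envelope arithmetic, sketched here in Python (the 2,400-mile Seattle-to-New-York figure is an approximate air distance):

    # 20 million cabinets, 15 inches wide, lined up side by side
    cabinets = 20_000_000
    width_inches = 15
    miles = cabinets * width_inches / 12 / 5280   # inches -> feet -> miles
    print(f"{miles:,.0f} miles")                  # about 4,735 miles
    # Seattle to New York is roughly 2,400 miles one way, so
    # ~4,700 miles is indeed about there and back.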

Of course, we have computers that do the perusing for us, but as big data gets bigger, and as analysts, marketers and others seek to do more with the massive intelligence that can be pulled from big data, we risk running into a human bottleneck. Just how much can one person — or a cubicle farm of persons — accomplish in a timely manner from the dashboard of their marketing cloud? While marketing clouds do a fine job of gathering data, it still comes down to expecting analysts and marketers to interpret and act on it — often with data that has gone out of date by the time they work with it.

Hence the rise of big data solutions that leverage machine learning, language models and prediction: self-learning systems that go from using algorithms to harvest information from big data to using algorithms to initiate actions based on that data.
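What might such a self-acting loop look like? A minimal sketch, assuming a made-up scenario with two ad creatives and invented click rates; this is a simple epsilon-greedy bandit, not any vendor's actual system:

    import random

    # Epsilon-greedy loop: automatically shift impressions toward the
    # creative with the best observed click-through rate, while
    # reserving a little traffic to keep exploring.
    TRUE_RATES = {"creative_a": 0.02, "creative_b": 0.05}  # hypothetical
    counts = {name: 0 for name in TRUE_RATES}
    clicks = {name: 0 for name in TRUE_RATES}
    EPSILON = 0.1

    def observed_ctr(name):
        return clicks[name] / counts[name] if counts[name] else 0.0

    def choose_creative():
        if random.random() < EPSILON:
            return random.choice(list(TRUE_RATES))   # explore
        return max(TRUE_RATES, key=observed_ctr)     # exploit

    for _ in range(10_000):                          # one impression each
        name = choose_creative()
        counts[name] += 1
        clicks[name] += random.random() < TRUE_RATES[name]  # simulated click

    for name in TRUE_RATES:
        print(name, counts[name], "impressions,",
              f"{observed_ctr(name):.3f} observed CTR")

After a few thousand impressions the loop has moved most traffic to the better-performing creative without a human ever looking at a dashboard, which is the "algorithms initiate actions" point in miniature.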

Yes, this may sound a bit frightful: Removing the human from the loop. Marketers indeed need to automate some decision-making. But the human touch will still be there, doing what only people can do — creating great content that evokes emotions from consumers — and then monitoring and fine-tuning the overall performance of a system designed to take actions on the basis of big data.

This isn’t a radical idea. Programmed trading algorithms already drive significant activity across stock markets. And, of course, Amazon, eBay and Facebook have become generators of — and consummate users of — big data. Others are jumping on the bandwagon as well. RocketFuel uses big data about consumers, sites, ads and prior ad performance to optimize display advertising. Turn.com uses big data from consumer Web behavior, on-site behaviors and publisher content to create, optimize and buy advertising across the Web for display advertisers.

The big data revolution is just beginning as it moves beyond analytics. If we were building CRM again, we wouldn’t just track sales-force productivity; we’d recommend how you’re doing versus your competitors based on data across the industry. If we were building marketing automation software, we wouldn’t just capture and nurture leads generated by our clients, we’d find and attract more leads for them from across the Web. If we were building a financial application, it wouldn’t just track the financials of your company, it would compare them to public filings in your category so you could benchmark yourself and act on best practices.

Benioff is correct that marketing budgets today are undeniably shifting toward social and mobile. The ability to manage social, mobile and Web analysis for better marketing has quickly become a real focus, and a big data marketing cloud is needed to do it. However, the real value and ROI come from the ability to turn big data analysis into action, automatically. There's clearly big value in big data, but it's not cost-effective for any company to interpret and act on it manually before the trend changes or is over. Some reports find that 70 percent of marketers are concerned with making sense of the data and more than 91 percent are concerned with extracting marketing ROI from it. Incorporating big data technologies that create action means that your organization's marketing can get smarter even while you sleep.

Raj De Datta founded BloomReach with 10 years of enterprise and entrepreneurial experience behind him. Most recently, he was an Entrepreneur-In-Residence at Mohr-Davidow Ventures. Previously, he was a Director of Product Marketing at Cisco. Raj also worked in technology investment banking at Lazard Freres. He holds a BSE in Electrical Engineering from Princeton and an MBA from Harvard Business School.



TEDxOxford - Kevin Warwick - Cyborg Interfaces

In this talk from TEDxOxford on 26th September 2011, Kevin Warwick, professor of Cybernetics at Reading University, presents ideas on restoring sight to the blind, allowing humans to see with sonar, and communicating by thought alone by combining artificial components with humans. TEDxOxford is a conference run entirely by students of Oxford University for young people. To find out more about TEDxOxford, see www.tedxoxford.com

About TEDx: In the spirit of ideas worth spreading, TEDx is a program of local, self-organized events that bring people together to share a TED-like experience. At a TEDx event, TEDTalks video and live speakers combine to spark deep discussion and connection in a small group. These local, self-organized events are branded TEDx, where x = independently organized TED event. The TED Conference provides general guidance for the TEDx program, but individual TEDx events are self-organized.* (*Subject to certain rules and regulations)
From: TEDxTalks | Views: 522 | Ratings: 10 | Time: 19:46 | More in: Education



TEDxVienna - Robert Trappl - Are we sheep when we dream of electric androids?

Robert Trappl is professor and head of the Institute of Medical Cybernetics and Artificial Intelligence, Center for Brain Research, Medical University of Vienna, Austria. He is head of the Austrian Research Institute for Artificial Intelligence in Vienna, which was founded in 1984. He holds a PhD in psychology (minor: astronomy) and a diploma in sociology (Institute for Advanced Studies, Vienna), and trained as an electrical engineer. Mr. Trappl has published more than 180 articles; he is co-author, editor or co-editor of 34 books and is Editor-in-Chief of "Applied Artificial Intelligence: An International Journal" and "Cybernetics and Systems: An International Journal", both published by Taylor & Francis, USA. He has given lectures and courses and worked as a consultant for national and international companies and organizations (OECD, UNIDO, WHO). More information: www.tedxvienna.at and www.facebook.com
From: TEDxTalks | Views: 155 | Ratings: 9 | Time: 20:36 | More in: Science & Technology


In this abridgement of the first chapter of his new book Imaginary Games, game designer, philosopher, and writer Chris Bateman, best known for the game Discworld Noir, examines the game-as-art debate from an interesting new angle.
