Link roundup for February 8, 2013

Hadoop Optimization: With DMExpress, organizations can realize the full potential of Hadoop to capitalize on the big opportunities that come with Big Data. - http://www.syncsort.com/Solutions/HadoopOptimization.aspx

Joyent Offers Hadoop Solution for Big Data Challenges: Joyent announced a new Apache Hadoop-based solution, built using the Hortonworks Data Platform (HDP), that allows companies to run enterprise-class Hadoop on the high-performance Joyent Cloud. - by John Rath - http://www.datacenterknowledge.com/archives/2013/01/24/joyent-enters-big-data-hadoop-solution/

Iteratees in Big Data at Klout: At Klout we calculate social influence of users across several social networks and we must be able to collect and aggregate data from these networks scalably and within a meaningful amount of time. We then deliver these metrics to our users visually. - by Naveen Gattu - http://engineering.klout.com/2013/01/iteratees-in-big-data-at-klout/

Arc Diagrams in R: Les Miserables: An arc diagram is a graphical display to visualize graphs or networks in a one-dimensional layout. The main idea is to display nodes along a single axis, while representing the edges or connections between nodes with arcs. - by Gaston Sanchez - http://www.r-bloggers.com/arc-diagrams-in-r-les-miserables/

Facebook Scales Servers with Retooled Chef: Facebook has adopted Opscode’s Private Chef to help manage its fast-growing infrastructure, like this data hall in the company’s North Carolina data center. The new version of Chef has been rewritten to enhance its scalability. - by Rich Miller - http://www.datacenterknowledge.com/archives/2013/02/04/facebook-uses-retooled-chef-to-manage-infrastructure/

Hadoop Cluster using Whirr – BYON in AWS VPC – Part 1: A key challenge in setting up a Hadoop cluster through Whirr – BYON over Amazon AWS VPC is that each instance in the cluster must have a hostname that resolves through both forward and reverse DNS lookup within its network. - by cloudkinetics - http://cloudkinetics.wordpress.com/2013/02/04/hadoop-cluster-using-whirr-byon-in-aws-vpc/

JP Morgan, Citi, Bank of America, and Wells Fargo Hope Big Data Leads to Better Earnings: The CIO Report: JPMorgan Chase & Co., the largest commercial bank in the U.S., generates a vast amount of credit card information and other transactional data about U.S. consumers. Several months ago, it began to combine that database, which includes 1. - by Michael Hickins - http://mobile.blogs.wsj.com/cio/2013/02/06/banks-using-big-data-to-discover-new-silk-roads/

From Zero to Impala in Minutes: This post was originally published by U.C. Berkeley AMPLab developer (and former Clouderan) Matt Massie on his personal blog. Matt has graciously permitted us to re-publish it here for your convenience. - by Matt Massie - http://blog.cloudera.com/blog/2013/02/from-zero-to-impala-in-minutes/

Statistical exploration of business data: a point of view: Michael ALBO: I will take a short detour through recent history to answer you. When I started my career in the 1990s, I remember how hard it was to find detailed, reliable information for my market analyses. - http://www.decideo.fr/Exploration-statistique-des-donnees-d-affaires-point-de-vue_a5838.html

Big Data storms the shops: Big Data will certainly revolutionize the retail sector. And it may well go far beyond the "simple" analysis of till receipts. - by admin - http://www.legrandbi.com/2013/01/big-data-retail/

Six Types Of Analyses Every Data Scientist Should Know: Jeffrey Leek, Assistant Professor of Biostatistics at Johns Hopkins Bloomberg School of Public Health, has identified six (6) archetypical analyses. As presented, they range from the least to most complex, in terms of knowledge, costs, and time. In summary, 1. - by Dr. J - http://datascientistinsights.com/2013/01/29/six-types-of-analyses-every-data-scientist-should-know/

Top 5 Reasons to Get Your Graph On: The NOSQL world is crowded with data stores that offer variations on key-value storage. These so-called aggregate-oriented stores often have beneficial operational characteristics when compared to relational databases but little modeling expressiveness. - by Guest Author - http://siliconangle.com/blog/2013/01/22/top-5-reasons-to-get-your-graph-on/

Hortonworks Unveils Desktop Version of Hadoop, Expedites Automation: Hortonworks launched a new trial version of Hadoop that enables users to run the analytics engine on a desktop with minimal setup. It’s infinitely more accessible than downloading and deploying the Apache versions of these projects. - by Maria Deutscher - http://siliconangle.com/blog/2013/01/22/hortonworks-unveils-desktop-version-of-hadoop-expedites-automation/