Google data archives not compatible with Crystal Reports … oh noz!

MapReduce (the algorithm that Google uses for massively parallel computation) … is:

1. A giant step backward in the programming paradigm for large-scale data intensive applications

2. A sub-optimal implementation, in that it uses brute force instead of indexing

3. Not novel at all — it represents a specific implementation of well known techniques developed nearly 25 years ago

4. Missing most of the features that are routinely included in current DBMS

5. Incompatible with all of the tools DBMS users have come to depend on

In related news, cars that run off of nothing but sunlight and air are:

1. A giant step backward

2. A sub-optimal implementation, in that they don’t use gasoline

3. Not novel at all — we had solar powered and compressed air powered cars 25 years ago

4. Missing most of the features that are routinely included in current gasoline powered cars

5. Incompatible with all of the tools that gasoline-engine mechanics use

Holy !@#$, these guys are dense.

The database community has learned the following three lessons from the 40 years that have unfolded since IBM first released IMS in 1968.

* Schemas are good.

* Separation of the schema from the application is good.

* High-level access languages are good.

MapReduce has learned none of these lessons and represents a throw back to the 1960s, before modern DBMSs were invented.

Look, I get their points. I like relational databases myself (or, rather, SQL-style databases, which a true database theorist will point out are not “true” relational databases).

…but arguing with success is kind of hard. I assert that it is objective truth that no relational database can possibly do the things that MapReduce does.

To say that MapReduce stinks because it “learned none of these lessons” is bunk. The Google guys are not dimwits. They clearly made a decision to trade off some features for others.

The feature of winning the search engine wars and making people into billionaires is a pretty good one, IMO.

The MapReduce community seems to feel that they have discovered an entirely new paradigm for processing large data sets. In actuality, the techniques employed by MapReduce are more than 20 years old.

Huh? They do?

MapReduce obviously has tons of ancestors, including vector processing.

Who says that it’s a new concept?
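
To make the ancestry point concrete: map and reduce have been staples of functional programming for decades. Here is a minimal single-process sketch in Python of the map, group, and reduce steps applied to word counting. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are my own illustrative assumptions. This is not Google's implementation, and it leaves out everything that makes the real thing interesting (distribution, fault tolerance, scheduling).

```python
from collections import defaultdict
from functools import reduce


def map_phase(documents):
    """Emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)


def shuffle(pairs):
    """Group emitted values by key, standing in for the step between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Fold each key's list of values into a single count."""
    return {
        key: reduce(lambda a, b: a + b, values)
        for key, values in groups.items()
    }


def word_count(documents):
    """Chain the three steps: map, group by key, reduce."""
    return reduce_phase(shuffle(map_phase(documents)))
```

Note that every document gets scanned in full and there is no index anywhere, which is exactly the "brute force" the article complains about and exactly what makes the work easy to split across machines.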

4. MapReduce is missing features

All of the following features are routinely provided by modern DBMSs, and all are missing from MapReduce:

* Bulk loader — to transform input data in files into a desired format and load it into a DBMS

* Indexing — as noted above

* Updates — to change the data in the data base

* Transactions — to support parallel update and recovery from failures during update

* Integrity constraints — to help keep garbage out of the data base

* Referential integrity — again, to help keep garbage out of the data base

* Views — so the schema can change without having to rewrite the application program

In summary, MapReduce provides only a sliver of the functionality found in modern DBMSs.

Oh noz!

A clever five-year-old (?) tool is less polished and complete than some dusty, hidebound, thirty-year-old alternative concept.

In related news, few of the kids getting admitted to MIT and Caltech this year have 401(k)s as well funded as those of typical fifty-year-old engineers.

5. MapReduce is incompatible with the DBMS tools

A modern SQL DBMS has available all of the following classes of tools:

* Report writers (e.g., Crystal reports) to prepare reports for human visualization

* Business intelligence tools (e.g., Business Objects or Cognos) to enable ad-hoc querying of large data warehouses

* Data mining tools (e.g., Oracle Data Mining or IBM DB2 Intelligent Miner) to allow a user to discover structure in large data sets

* Replication tools (e.g., Golden Gate) to allow a user to replicate data from one DBMS to another

* Database design tools (e.g., Embarcadero) to assist the user in constructing a data base.

True.

On the other hand, modern SQL DBMSs are incompatible with Google, Google Maps, Orkut, etc.

I’m sure that the Google execs are ** so ** upset that Crystal Reports doesn’t run on their data.

An “interesting” article by David J. DeWitt and Michael Stonebraker.

If you wondered what getting put out to pasture by a bunch of Young Turks sounds like, this is it.
