|
Last edited by hci on 2013-12-3 18:38
"Just data mining" makes it sound as if this were easy. If it were easy, there wouldn't be such a huge big data hype right now.
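Going after individuals, catching spies: every correct catch is a win, and a wrong catch costs very little. That's easy, because it answers "given entity A, test if it has property X", or at most "given a list of entities, find the subset that have property X". The initial set is known.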
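Building a list that covers everyone, saying who owns which guns, is hard. The reason is that it has to answer "what are the entities that have property X": there is no initial set, or the initial set includes everyone in the US. Very expensive to do. Otherwise they wouldn't want everyone to register; the whole point of registration is to reduce the difficulty and reduce the cost. And here the cost of getting it wrong is very high.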
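To make the difference concrete, here is a rough Python sketch, not anything from a real system. The population, the `owners` set, the `watch_list`, and the `has_x` check are all made-up toy data, and it assumes every property check costs the same. The same filter runs in both cases; what changes is the size of the initial set it has to go through.

```python
from typing import Callable, Iterable, List, Tuple


def find_with_property(
    candidates: Iterable[str], has_property: Callable[[str], bool]
) -> Tuple[List[str], int]:
    """Return the candidates that have property X, plus how many checks were made."""
    hits: List[str] = []
    checks = 0
    for entity in candidates:
        checks += 1  # each test of one entity counts as one unit of cost
        if has_property(entity):
            hits.append(entity)
    return hits, checks


# Hypothetical toy data: a population of 100,000 with three true positives.
population = [f"person_{i}" for i in range(100_000)]
owners = {"person_7", "person_4242", "person_99999"}


def has_x(entity: str) -> bool:
    # Stand-in for whatever expensive test decides "has property X".
    return entity in owners


# Case 1: the initial set is known (a short list of suspects).
watch_list = ["person_7", "person_8", "person_9"]
targeted_hits, targeted_checks = find_with_property(watch_list, has_x)

# Case 2: no initial set, so the initial set is everyone.
full_hits, full_checks = find_with_property(population, has_x)

print(targeted_hits, targeted_checks)
print(full_hits, full_checks)
```

With a known list the cost grows with the length of the list. Without one it grows with the whole population, and so does the number of chances to get someone wrong.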
|
|