Text mining market scoping
I vigorously resist estimating market sizes, due to multiple levels of definitional problems -- what products are in the market, which revenue dollars should be associated with which product, etc. But I've been talking so much about text mining recently, in the aftermath of an excellent text mining conference, that questions on the subject keep getting posed to me.
So here are a few thoughts and data points.
- I estimate that SPSS and SAS have several hundred customers each for text data mining, narrowly construed.
- In addition, SPSS has many hundreds more customers for text mining as specifically applied to opinion surveys, and a bunch more text mining customers that don't fit neatly into either of the first two groups I cited. Based on this, they have a compelling claim to be the text mining market leader.
- As a wild guess, I estimate that Oracle's text mining customers number in the dozens in total, not counting text mining done by other vendors against data in Oracle databases.
- The leaders among the specialist text mining vendors seem to have a few dozen customers each. Inxight is a special case, because they OEM technology to lots of other search and text mining vendors, including SAS.
- As noted in the post linked above, medical-discovery text mining is around a $10 million market, which isn't a lot given the large amount of smart and important work being done in the area.
I think these numbers will get a lot bigger soon. Text mining is a very hot area.



