Hey there Global Readers,
Today we are going to introduce you to a great search technology known as Solr (built on top of Apache Lucene, a free search library). Solr is freely available at http://lucene.apache.org/solr/ where you can download it and host it on your own servers once its requirements are met.
Solr is a powerful search server that lets you scan and index large quantities of data, whether that be web pages, databases, text files, XML data, Word and PDF files, or a plethora of other formats. It is written in Java, a language that is quite mature and enterprise ready.
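To give a taste of what talking to a search server looks like: Solr answers searches over plain HTTP, so a query is really just a URL. Here is a minimal Python sketch that builds one. Treat the details as assumptions on our part: the host and port (localhost:8983, the port Solr's example setup usually listens on), the core name "articles" (made up for illustration), and the /solr/<core>/select path, which is how recent releases lay things out and may differ on older installs. The q, rows, and wt parameters are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, text, rows=10):
    """Build the URL for a simple keyword search against one Solr core.

    base_url and core are placeholders for your own setup; nothing here
    contacts a server, it only assembles the request URL.
    """
    params = {
        "q": text,     # what to search for
        "rows": rows,  # maximum number of results to return
        "wt": "json",  # ask for the response in JSON
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical local install with a core named "articles".
url = build_solr_query_url("http://localhost:8983", "articles", "open source search")
print(url)
```

Paste the printed URL into a browser (or fetch it from code) against a running Solr instance and you should get matching documents back as JSON.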
Solr can be complemented with another piece of software called Nutch (also Java based), which includes a web crawler so that you can feed your search engine new content the way Google, Bing, Yahoo, and the other major search players do. Both tools are free, and many additional plugins are freely available as well. Free software like this is labeled Open Source, which means the code that makes it up can be viewed and edited by the developers and media firms that host, program, and manage it (although open source licenses vary in what you can and cannot do).
Many firms offer enterprise-ready versions with commercial support packages for larger businesses that need Solr to search over major websites, databases, and files, indexing them properly and returning results quickly and reliably. One such amazing firm is Lucid Imagination ( http://www.lucidimagination.com/ ), with whom I have spoken many times about someday starting a project in this field for several clients, partners, and internal needs.
Now a dose of reality… Google reportedly runs on more than 75,000 servers, according to several websites we found while researching the question via Google itself (a bit ironic, using Google to explain to others how to build their own Google). The world’s data is growing at an unprecedented rate, and accurately scanning, searching, and returning results over that much data is a major problem, one that demands massive budgets only large players such as Microsoft and Google can truly afford at present. Although anyone with enough background knowledge can set up their own search server, it is difficult and costly to build out very far without human capital, financial backing, business infrastructure, and long-term goals. We would just like to share our knowledge with the world and with other entrepreneurs who may be interested in this topic, so that perhaps they can invent new search features that will change the world and even bring new competition to the search engines. Go get ’em, tiger!
---
Who uses Solr/Lucene?
http://wiki.apache.org/lucene-java/PoweredBy
The page above has a long list, though by no means an exhaustive one, of the major firms using it today (many keep it quiet as a competitive secret, but we are the Global Good Group and like giving out information). A few examples:
AOL
Amazon
Apple
Disney
Hi5
IBM
and even the official Bob Dylan website: http://www.bobdylan.com/