YaCy
Original author(s) | |
---|---|
Developer(s) | YaCy community |
Initial release | 2003[1] |
Stable release | 1.922 / 14 October 2019 |
Repository | github |
Written in | Java |
Operating system | Cross-platform |
Type | Overlay network, Search engine |
License | GPL-2.0-or-later |
Website | yacy |
YaCy (pronounced "ya see") is a free distributed search engine built on the principles of peer-to-peer (P2P) networks.[2][3] Its core is a computer program written in Java that, as of September 2006, was distributed across several hundred computers, known as YaCy-peers. Each YaCy-peer independently crawls the Internet, analyzes and indexes the web pages it finds, and stores the indexing results in a common database (the so-called index), which is shared with other YaCy-peers using P2P principles. Anyone can use it to build a search portal for an intranet or to help search the public internet.
Compared to semi-distributed search engines, the YaCy-network has a decentralized architecture. All YaCy-peers are equal and no central server exists. YaCy can be run either in crawling mode or as a local proxy server that indexes the web pages visited by the person running YaCy on their computer; several mechanisms are provided to protect the user's privacy. The search functions are accessed through a locally running web server, which provides a search box for entering search terms and returns results in a format similar to that of other popular search engines.
YaCy was created in 2003 by Michael Christen.[4]
System components
The YaCy search engine is based on four elements:[5]
- Crawler
  - A search robot that traverses web pages and analyzes their content.[6]
- Indexer
  - Creates a reverse word index (RWI), in which each word has a list of relevant URLs and ranking information. Words are stored in the form of word hashes.[7]
- Search and administration interface
  - A web interface provided by a local HTTP servlet with a servlet engine.[8]
- Data storage
  - Stores the reverse word index database using a distributed hash table.
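The indexer and data storage elements can be illustrated with a minimal sketch. It is not YaCy's implementation: the class name `ReverseWordIndexSketch`, the hash format (a truncated Base64 SHA-256 digest), the use of term frequency as ranking information, and the modulo-based peer selection are all assumptions chosen for illustration. It shows only the general idea of a reverse word index keyed by word hashes, and of using the hash to decide which peer would hold an entry.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; names and hash format are hypothetical, not YaCy's.
class ReverseWordIndexSketch {
    // word hash -> (URL -> number of occurrences, a stand-in for ranking information)
    private final Map<String, Map<String, Integer>> index = new TreeMap<>();

    // Assumed hash format: first 12 characters of the URL-safe Base64 SHA-256
    // digest of the lower-cased word.
    static String wordHash(String word) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(
                    word.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding()
                    .encodeToString(digest).substring(0, 12);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Splits the text on non-word characters (ASCII-oriented) and records each word.
    void addDocument(String url, String text) {
        for (String token : text.split("\\W+")) {
            if (token.isEmpty()) {
                continue;
            }
            index.computeIfAbsent(wordHash(token), k -> new TreeMap<>())
                 .merge(url, 1, Integer::sum);
        }
    }

    // Returns the URLs (with occurrence counts) recorded for a word.
    Map<String, Integer> lookup(String word) {
        Map<String, Integer> postings = index.get(wordHash(word));
        return postings == null
                ? Collections.<String, Integer>emptyMap()
                : Collections.unmodifiableMap(postings);
    }

    // Toy stand-in for distributed hash table placement: the word hash
    // deterministically selects one peer from a fixed list.
    static String responsiblePeer(String wordHash, List<String> peers) {
        return peers.get(Math.floorMod(wordHash.hashCode(), peers.size()));
    }

    public static void main(String[] args) {
        ReverseWordIndexSketch idx = new ReverseWordIndexSketch();
        idx.addDocument("http://a.example/", "peer to peer search");
        idx.addDocument("http://b.example/", "search engine");
        System.out.println(idx.lookup("search").keySet());
        System.out.println(idx.lookup("peer"));
    }
}
```

Because lookups go through the same hash function as indexing, a query word never has to be stored in clear text alongside its URL list. A real distributed hash table would also have to handle peers joining and leaving, which the fixed peer list here ignores.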
Search-engine technology
- YaCy is a complete search appliance with user interface, index, administration and monitoring.
- YaCy harvests web pages with a web crawler. Documents are then parsed and indexed, and the search index is stored locally. If the peer is part of a peer network, its local search index is also merged into the shared index for that network.
- When a search is started, the local index contributes results together with the global search index from peers in the YaCy search network.
- The YaCy Grid is a second-generation implementation of the YaCy peer-to-peer search. A YaCy Grid installation consists of microservices that communicate using the MCP (Master Connect Program).
- The YaCy Parser is a microservice that can be deployed using Docker. When the Parser component is started, it searches for an MCP and connects to it. By default it looks for an MCP on the local host, but a different one can be configured.
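The way a search combines the local index with results from other peers can be sketched as a simple merge. This is a hedged illustration, not YaCy's ranking code: the class `ResultMergeSketch`, the `Hit` type, the numeric `score` field and the rule of keeping the best score per URL are all assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; types and scoring are hypothetical, not YaCy's.
class ResultMergeSketch {
    // One search result: a URL with an assumed relevance score.
    static final class Hit {
        final String url;
        final double score;

        Hit(String url, double score) {
            this.url = url;
            this.score = score;
        }
    }

    // Combines local hits with the hit lists returned by remote peers.
    // Duplicate URLs are collapsed to the highest score seen; the result is
    // ordered by descending score (ties broken by URL) and cut to `limit`.
    static List<Hit> merge(List<Hit> local, List<List<Hit>> remote, int limit) {
        List<Hit> all = new ArrayList<>(local);
        for (List<Hit> peerHits : remote) {
            all.addAll(peerHits);
        }
        Map<String, Hit> bestByUrl = new HashMap<>();
        for (Hit hit : all) {
            bestByUrl.merge(hit.url, hit, (a, b) -> a.score >= b.score ? a : b);
        }
        List<Hit> merged = new ArrayList<>(bestByUrl.values());
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score)
                .reversed()
                .thenComparing(h -> h.url));
        return new ArrayList<>(merged.subList(0, Math.min(limit, merged.size())));
    }

    public static void main(String[] args) {
        List<Hit> local = new ArrayList<>();
        local.add(new Hit("http://a.example/", 0.9));
        local.add(new Hit("http://b.example/", 0.5));
        List<Hit> peer = new ArrayList<>();
        peer.add(new Hit("http://b.example/", 0.7));
        peer.add(new Hit("http://c.example/", 0.6));
        List<List<Hit>> remote = new ArrayList<>();
        remote.add(peer);
        for (Hit h : merge(local, remote, 3)) {
            System.out.println(h.score + " " + h.url);
        }
    }
}
```

Collapsing duplicates by URL matters in a peer-to-peer setting, because several peers may have indexed the same page independently.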
YaCy platform architecture
YaCy combines several techniques for networking, administration, and maintenance of the search index, including blacklisting, moderation, and communication with the community. These operations are organized as follows:
- Community components
  - Web forum[9]
  - Statistics
  - XML API
- Maintenance
  - Web Server
  - Indexing
  - Crawler with Balancer
  - Peer-to-Peer Server Communication
- Content organization
  - Blacklisting and Filtering
  - Search interface
  - Bookmarks
  - Monitoring search results
Distribution
YaCy is available in packages for Linux, Windows, and Macintosh, and also as a Docker image. YaCy can also be installed on any other operating system, either by compiling it manually or by using a tarball.[10] YaCy requires Java 8; OpenJDK 8 is recommended.
The Debian package can be installed from a repository available at a subdomain of the project's website.[11][12][13] The package is not yet maintained in the official Debian package repository.[14][15][16]
See also
- Dooble – an open-source web browser with an integrated YaCy Search Engine Tool Widget
References
- ^ "Ich entwickle eine P2P-basierende Suchmaschine. Wer macht mit?". Heise Online (in German). 2003-12-15. Retrieved 2018-05-09.
- ^ "YaCy takes on Google with open source search engine". The Register. 2011-11-29. Retrieved 2012-04-16.
- ^ "YaCy: It's About Freedom, Not Beating Google". PC World. 2011-12-03. Retrieved 2012-04-16.
- ^ "Ich entwickle eine P2P-basierende Suchmaschine. Wer macht mit?". Heise Online (in German). 2003-12-15. Retrieved 2018-05-09.
- ^ "YaCy Technology Architecture". YaCy.net. Retrieved 2012-02-14.
- ^ GitHub: YaCy Grid Crawler, YaCy Search Engine, 2021-02-28, pp. yacy / yacy_grid_crawler, retrieved 2021-03-11
- ^ GitHub: YaCy Grid Parser, YaCy Search Engine, 2021-02-28, pp. The YaCy Grid is the second-generation implementation of YaCy, retrieved 2021-03-11
- ^ GitHub: YaCY Search, YaCy Search Engine, 2021-02-28, pp. yacy / yacy-search forked from cream/yacy-search, retrieved 2021-03-11
- ^ "forum.yacy.de". Retrieved 6 June 2017.
- ^ "Download - YaCy". yacy.net. Retrieved 2021-07-27.
- ^ "En:DebianInstall". YaCyWiki. Retrieved 6 October 2019.
- ^ D, Emery. "Basic guide to search engines". pwd. yacy -- distributed web crawler and search engine. pwd. YaCy. Retrieved 2021-07-27.
- ^ "Dev:TaskSharing". YaCyWiki. Retrieved 6 October 2019.
- ^ "#452422 - RFP: yacy -- distributed web crawler and search engine". Debian Bug report logs. Retrieved 2 May 2020.
- ^ Azizi Search Engine Script
- ^ "digital agency". pwd. YaCy: Distributed Search Engine. 2021-07-21. YaCy. Retrieved 2021-07-27 – via YaCy.