To many people, "geek" and "nerd" are synonyms, but in fact they are a little different. Consider the phrase "sports geek" — an occasional substitute for "jock" and perhaps the arch-rival of a "nerd" in high-school folklore. If "geek" and "nerd" are synonyms, then "sports geek" might be an oxymoron. (Furthermore, "sports nerd" either doesn't…
Building and operating large-scale information retrieval systems used by hundreds of millions of people around the world provides a number of interesting challenges. Designing such systems requires making complex design tradeoffs in a number of dimensions, including (a) the number of user queries that must be handled per second and the response latency to these requests, (b) the number and size of various corpora that are searched, (c) the latency and frequency with which documents are updated or added to the corpora, and (d) the quality and cost of the ranking algorithms that are used for retrieval. In this talk I'll discuss the evolution of Google's hardware infrastructure and information retrieval systems and some of the design challenges that arise from ever-increasing demands in all of these dimensions. I'll also describe how we use various pieces of distributed systems infrastructure when building these retrieval systems. Finally, I'll describe some future challenges and open research problems in this area.
A. Hadgu, A. Aregawi, and A. Beaudoin. (2021). arXiv:2112.08191. Comment: 4 pages, 2 figures, 35th Conference on Neural Information Processing Systems (NeurIPS 2021) demonstrations track.
P. Karisani and E. Agichtein. Proceedings of the 2018 World Wide Web Conference on World Wide Web, page 137--146. International World Wide Web Conferences Steering Committee, (2018)
A. Hadgu, S. Abualhaija, and C. Niederée. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, page 1305--1308. ACM, (2018)
B. Gambäck, F. Olsson, A. Argaw, and L. Asker. Proceedings of the First Workshop on Language Technologies for African Languages, page 104--111. Stroudsburg, PA, USA, Association for Computational Linguistics, (2009)
H. Liu, E. Milios, and J. Janssen. Proceedings of the 6th annual ACM international workshop on Web information and data management, page 16--22. ACM, (2004)
E. Aramaki, S. Maskawa, and M. Morita. Proceedings of the Conference on Empirical Methods in Natural Language Processing, page 1568--1576. Stroudsburg, PA, USA, Association for Computational Linguistics, (2011)
S. Chandra, L. Khan, and F. Muhaya. Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), 2011 IEEE Third International Conference on, page 838--843. (October 2011)
E. Bakshy, J. Hofman, W. Mason, and D. Watts. Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, page 65--74. New York, NY, USA, ACM, (2011)
J. Bergmann, A. Hadgu, and R. Jäschke. Proceedings of the Workshop on Natural Language Processing and Computational Social Science, Hannover, Germany, (May 2016)
A. Hadgu, N. Lotze, and R. Jäschke. Proceedings of the Workshop on Natural Language Processing and Computational Social Science, Hannover, Germany, (May 2016)
X. Wang, L. Tokarchuk, F. Cuadrado, and S. Poslad. Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, page 311--315. New York, NY, USA, ACM, (2013)
X. Wang, L. Tokarchuk, and S. Poslad. Advances in Social Networks Analysis and Mining (ASONAM), 2014 IEEE/ACM International Conference on, page 395--398. (August 2014)
J. Zou, F. Fekri, and S. McLaughlin. Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, page 1586--1589. New York, NY, USA, ACM, (2015)
Y. Zhang, J. Tang, Z. Yang, J. Pei, and P. Yu. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, page 1485--1494. New York, NY, USA, ACM, (2015)
O. Tsur and A. Rappoport. Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, page 643--652. New York, NY, USA, ACM, (2012)
N. Diakopoulos and D. Shamma. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, page 1195--1198. New York, NY, USA, ACM, (2010)
X. Wang, F. Wei, X. Liu, M. Zhou, and M. Zhang. Proceedings of the 20th ACM International Conference on Information and Knowledge Management, page 1031--1040. New York, NY, USA, ACM, (2011)
I. Weber, V. Garimella, and A. Batayneh. Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, page 290--297. New York, NY, USA, ACM, (2013)
B. Hecht, L. Hong, B. Suh, and E. Chi. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, page 237--246. New York, NY, USA, ACM, (2011)
S. Vieweg, A. Hughes, K. Starbird, and L. Palen. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, page 1079--1088. New York, NY, USA, ACM, (2010)
T. Sakaki, M. Okazaki, and Y. Matsuo. Proceedings of the 19th International Conference on World Wide Web, page 851--860. New York, NY, USA, ACM, (2010)
A. Olteanu, S. Vieweg, and C. Castillo. Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, page 994--1009. New York, NY, USA, ACM, (2015)